Rework Bytes Chunk type and tweak comparison #6112

Merged: pchiusano merged 3 commits into trunk from topic/bytes-chunk, Jan 14, 2026

dolio commented Jan 14, 2026

`Chunk` is now a wrapped `ByteArray`, similar to vector, but it takes advantage of some low-level possibilities instead of implementing various operations via stream fusion. Stream fusion seems to perform poorly in some cases, such as comparison.

Comparison of `Rope` optimistically looks for pairs of `One` values to try a faster path.
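As a rough illustration of that fast path, here is a minimal sketch. It assumes a toy rope with `Empty`, `One`, and `Two` constructors, a list-backed `Chunk`, and lexicographic ordering; all of these are hypothetical stand-ins and not the runtime's actual definitions.

```haskell
import Data.Word (Word8)

-- Hypothetical stand-in for the chunk type. The real Chunk wraps a
-- ByteArray; a list keeps this sketch dependency-free.
newtype Chunk = Chunk [Word8]
  deriving (Eq, Show)

-- Lexicographic comparison of two chunks. A ByteArray-backed chunk could
-- do this with one bulk comparison instead of walking a list.
compareChunk :: Chunk -> Chunk -> Ordering
compareChunk (Chunk a) (Chunk b) = compare a b

-- Toy rope: empty, a single chunk, or the concatenation of two ropes.
data Rope = Empty | One Chunk | Two Rope Rope
  deriving (Show)

-- Slow, general path: flatten the rope into its bytes.
toBytes :: Rope -> [Word8]
toBytes Empty = []
toBytes (One (Chunk bs)) = bs
toBytes (Two l r) = toBytes l ++ toBytes r

-- Try the single-chunk case first and only flatten when it does not apply.
compareRope :: Rope -> Rope -> Ordering
compareRope (One a) (One b) = compareChunk a b
compareRope l r = compare (toBytes l) (toBytes r)
```

Short byte strings such as UUID-sized keys would plausibly be a single chunk, which is why checking for a pair of `One` values first can pay off.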

Avoid parameterizing `universalEq/Compare` by a comparison function. We're never going to dynamically expand the set of equalities at runtime, which was the idea behind that, and it likely just costs performance.
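To make the trade-off concrete, here is a small sketch of the two styles. The `Val` type, the `tag` helper, and the choice of the bytes case as the one handled by the callback are all assumptions for illustration; the PR does not say which case the real code delegated.

```haskell
import Data.Word (Word64, Word8)

-- Hypothetical value type, far smaller than the runtime's real one.
data Val
  = NatV Word64
  | BytesV [Word8]
  | ListV [Val]

-- Constructor rank, used only to order values of different shapes.
tag :: Val -> Int
tag NatV{} = 0
tag BytesV{} = 1
tag ListV{} = 2

-- Parameterized style: a callback is threaded through every recursive
-- call, so the hot path always goes through an unknown function.
compareWith :: ([Word8] -> [Word8] -> Ordering) -> Val -> Val -> Ordering
compareWith _ (NatV a) (NatV b) = compare a b
compareWith cmpB (BytesV a) (BytesV b) = cmpB a b
compareWith cmpB (ListV xs0) (ListV ys0) = go xs0 ys0
  where
    go [] [] = EQ
    go [] _ = LT
    go _ [] = GT
    go (x : xs) (y : ys) = compareWith cmpB x y <> go xs ys
compareWith _ a b = compare (tag a) (tag b)

-- Direct style: the set of cases is fixed, so every comparison is a
-- known call that the compiler can specialize.
compareVal :: Val -> Val -> Ordering
compareVal (NatV a) (NatV b) = compare a b
compareVal (BytesV a) (BytesV b) = compare a b
compareVal (ListV xs0) (ListV ys0) = go xs0 ys0
  where
    go [] [] = EQ
    go [] _ = LT
    go _ [] = GT
    go (x : xs) (y : ys) = compareVal x y <> go xs ys
compareVal a b = compare (tag a) (tag b)
```

Both functions give the same answers when the callback is plain `compare`; the direct version simply drops an extension point that is not expected to be used.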

I took the opportunity to implement encodeNat... using the endian tests and a bulk array write. I wanted to do this back when I was fiddling with the bytes read replacements, but it wasn't as convenient when using `Vector Word8`.
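The general technique (check the host byte order once, swap if needed, then write the whole word in one store) can be sketched as follows. The names `toBigEndian` and `encodeNat64beSketch` are hypothetical, and this version uses a temporary buffer from `Foreign` rather than the runtime's `ByteArray` code, so treat it as an illustration of the idea only.

```haskell
import Data.Word (Word64, Word8, byteSwap64)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (peekArray)
import Foreign.Ptr (Ptr, castPtr)
import Foreign.Storable (poke)
import GHC.ByteOrder (ByteOrder (..), targetByteOrder)

-- Reorder the word so that a native-endian store lays the bytes out in
-- big-endian order. On a big-endian host this is the identity.
toBigEndian :: Word64 -> Word64
toBigEndian w = case targetByteOrder of
  BigEndian -> w
  LittleEndian -> byteSwap64 w

-- Encode a 64-bit value as 8 big-endian bytes using a single 64-bit
-- store instead of eight separate shift-and-mask byte writes.
encodeNat64beSketch :: Word64 -> IO [Word8]
encodeNat64beSketch w = allocaBytes 8 $ \buf -> do
  poke (castPtr buf :: Ptr Word64) (toBigEndian w)
  peekArray 8 (buf :: Ptr Word8)
```

For example, `encodeNat64beSketch 0x0001020304050607` yields the bytes `[0,1,2,3,4,5,6,7]` on either kind of host.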

Base tests and transcripts pass. If @ceedubs could do some cloud testing, that'd be appreciated, since I think they probably make more thorough use of this stuff.

pchiusano left a comment

Nice!

ceedubs commented Jan 14, 2026

I've confirmed that the Unison Cloud integration tests pass with these changes.

I know that benchmarks can be deceiving, but since these changes were (presumably) made in the name of performance, are there any benchmark results showing the impact?

pchiusano merged commit 401050e into trunk Jan 14, 2026
32 checks passed
pchiusano deleted the topic/bytes-chunk branch January 14, 2026 16:39

dolio commented Jan 14, 2026

@ceedubs Sure. I was running benchmarks like these:

looper = do
  printTime
    "Bytes comparison" 1 let
      bs0 = 0xs00010203040506070809101112131415
      bs1 = 0xs00010203040506070809101112131415

      go = cases
        0 -> ()
        n ->
          _ = Universal.compare bs0 bs1
          go (n-1)
      go

That compares a pair of UUID-sized bytes. I also tried what would happen if you used Nat instead:

loopen = do
  printTime
    "Nat comparison" 1 let
      n00 = 0x0001020304050607
      n01 = 0x0809101112131415
      n10 = 0x0001020304050607
      n11 = 0x080910111213140f

      go = cases
        0 -> ()
        n ->
          _ = n00 Nat.< n01
          _ = n10 Nat.< n11
          go (n-1)
      go

This doesn't measure the absolute performance of the operations, because the loop itself adds non-trivial overhead. But I thought that was more representative of a scenario they'd actually be used in (like map insertion).

Anyhow...

  • The Nat version is ~95ns (this measures cost/iteration)
  • With the old chunk type, comparing equal values takes ~780ns
  • The best case scenario—where the values differ on the first byte—takes ~250ns with the old chunk type.
  • With the new chunk type (and the single chunk short circuit), the cost is ~150ns regardless (and I have some plans to bring that down a bit more, maybe to ~110ns)

ceedubs commented Jan 14, 2026

Awesome. Thanks @dolio!

pchiusano commented

I also have some benchmarks showing a roughly 3-5x speed improvement for Map insertion and lookup when using Bytes keys.
