MsgPackReader: properly increment index in `ext` #370

harpocrates · 2021-11-07T15:08:57Z

The parseExt helper in MsgPackReader was not properly updating the
position in the data source.

Added two regression tests. These compare the serialized and re-serialized
bytes (rather than the original and round-tripped values), since structurally
identical Ext messages won't compare equal (their Array[Byte] field uses
reference equality).
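
To illustrate the equality issue described above, here is a minimal sketch. `ExtLike` is a hypothetical stand-in and not upack's actual `Ext` class; it only assumes a case class with an `Array[Byte]` field.

```scala
// Hypothetical stand-in for an Ext-like message (assumption, not upack's class).
final case class ExtLike(tag: Byte, data: Array[Byte])

object ArrayEqualityDemo {
  val a = ExtLike(1, Array[Byte](1, 2, 3))
  val b = ExtLike(1, Array[Byte](1, 2, 3))

  // Case-class equality compares the array fields with Array.equals,
  // which is reference equality on the JVM.
  val structurallyEqual: Boolean = a == b // false

  // Comparing the contents explicitly gives the expected answer.
  val sameContents: Boolean =
    a.tag == b.tag && java.util.Arrays.equals(a.data, b.data) // true

  def main(args: Array[String]): Unit = {
    println(s"a == b: $structurallyEqual")
    println(s"same contents: $sameContents")
  }
}
```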

Fixes #369

The `parseExt` helper in `MsgPackReader` was not properly updating the position in the data source.

Also override `Ext.equals` to compare the data arrays by contents instead of identity (which is consistent with the other subclasses of `Msg`).

Added two regression tests. Note that these don't work without the `equals` changes, since structurally identical `Ext` messages won't compare equal unless they also happen to be the same object.

Fixes com-lihaoyi#369

htmldoug · 2021-11-12T21:20:32Z

upack/src/upack/Msg.scala

+  }
+
+  override def hashCode: Int =
+    MurmurHash3.bytesHash(b, "Ext".hashCode)


tag should be represented in the hash somehow. We could probably just use it as the seed.
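
A minimal sketch of the suggestion above, using the tag as the hash seed. `ExtSketch` and its field names `tag` and `data` are assumptions for illustration and may not match upack's `Ext`; `MurmurHash3.bytesHash(data, seed)` is from the Scala standard library.

```scala
import scala.util.hashing.MurmurHash3

// Hypothetical Ext-like class; the names `tag` and `data` are assumptions.
final class ExtSketch(val tag: Byte, val data: Array[Byte]) {
  // Compare the tag and the array contents rather than array identity.
  override def equals(other: Any): Boolean = other match {
    case that: ExtSketch =>
      tag == that.tag && java.util.Arrays.equals(data, that.data)
    case _ => false
  }

  // Seed the byte hash with the tag so the tag participates in the hash.
  override def hashCode: Int = MurmurHash3.bytesHash(data, tag.toInt)
}

object ExtHashDemo {
  val x = new ExtSketch(1, Array[Byte](1, 2, 3))
  val y = new ExtSketch(1, Array[Byte](1, 2, 3))
  val z = new ExtSketch(2, Array[Byte](1, 2, 3))

  def main(args: Array[String]): Unit = {
    println(x == y)                   // equal contents and tag
    println(x.hashCode == y.hashCode) // equal objects share a hash
    println(x == z)                   // different tag
  }
}
```

Equal values must hash equally, which this satisfies; values with different tags are unequal and will usually, though not necessarily, hash differently.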

The more I think about this, the less comfortable I am about updating equals and hashCode here. That could lead to some end-user-visible changes, and it isn't consistent with some of the other types (e.g. Bytes). Maybe this is still a desirable change, but probably not in a PR about fixing ext parsing.

I'm going to switch the round-tripping test I added to assert equivalence of the serialized and serialized-deserialized-reserialized payloads (that way we don't need to call equals on a Msg).
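
The testing approach described here could look roughly like the sketch below. It deliberately uses a toy wire format (`[tag, length, payload...]`) and hypothetical `encode`/`decode` helpers rather than upack's real reader and writer, so only the shape of the assertion carries over.

```scala
object RoundTripSketch {
  // Hypothetical Ext-like value (assumption, not upack's class).
  final case class ExtLike(tag: Byte, data: Array[Byte])

  // Toy wire format, not MessagePack: [tag, length, payload...].
  def encode(e: ExtLike): Array[Byte] = {
    require(e.data.length <= 127, "toy format: length must fit in one byte")
    Array(e.tag, e.data.length.toByte) ++ e.data
  }

  def decode(bytes: Array[Byte]): ExtLike = {
    val len = bytes(1).toInt
    ExtLike(bytes(0), bytes.slice(2, 2 + len))
  }

  val original = ExtLike(5, Array[Byte](10, 20, 30))
  val serialized: Array[Byte] = encode(original)
  val reserialized: Array[Byte] = encode(decode(serialized))

  // Compare the payload contents; ExtLike.equals is never consulted.
  val payloadsMatch: Boolean =
    java.util.Arrays.equals(serialized, reserialized)
}
```

Comparing `decode(serialized)` with `original` directly would fail in this sketch, because the decoded value holds a different array instance.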

👍 for smaller, well-scoped PRs.

Re: equals/hashCode, it's hard to imagine that any users are depending on the behavior being broken other than to work around it.

Good point about case class Binary(value: Array[Byte]) extends Msg being affected, too. A fix for one should probably fix both.

The rabbit hole goes deeper. Given that 0.0d == 0L, should Float64(0.0) == Int64(0L)? Probably so! This is especially relevant since round-tripping Int64(0L) through messagepack will come back as Int32(0). I think you're wise to push equals/hashCode to a separate PR. Making assertions against the Array[Byte] sounds good for now.

> Given that 0.0d == 0L, should Float64(0.0) == Int64(0L)?

Personally, I don't think this should be the case - the "int format family" and "float format family" are separate. I further wish that 0.0d == 0L would fail to typecheck instead of returning true (that's another rabbit hole though, and Scala is unlikely to ever change that).
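
For reference, a small sketch of the two behaviors being contrasted. `Int64Like` and `Float64Like` are hypothetical stand-ins with default case-class equality; upack's own classes may define equality differently.

```scala
object NumericEqualityDemo {
  // Hypothetical stand-ins for the Msg subclasses under discussion.
  sealed trait Num
  final case class Int64Like(value: Long) extends Num
  final case class Float64Like(value: Double) extends Num

  // Primitive comparison widens the Long to a Double first.
  val primitivesEqual: Boolean = 0.0d == 0L // true

  // Default case-class equality also checks the class, so these differ.
  val wrappedEqual: Boolean =
    (Float64Like(0.0): Num) == (Int64Like(0L): Num) // false
}
```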

> This is especially relevant since round-tripping Int64(0L) through messagepack will come back as Int32(0).

I agree Int64(0L) should roundtrip, but on the other hand: is there really a need for the distinction? Why not just a single case class Int(value: Long) extends Msg (and UInt for unsigned)? We don't distinguish Int16 or the fixint cases, so why distinguish the 32 and 64 bit cases?

htmldoug · 2021-11-12T21:43:36Z

upack/src/upack/MsgPackReader.scala

@@ -98,7 +98,9 @@ abstract class BaseMsgPackReader extends upickle.core.BufferingByteParser{
  }
  def parseExt[T](n: Int, visitor: Visitor[_, T]) = {
    val (arr, i, j) = sliceArr(index + 1, n)
-    visitor.visitExt(getByteSafe(index), arr, i, j, index)
+    val res = visitor.visitExt(getByteSafe(index), arr, i, j, index)
+    index += n + 1


This looks important!

If I'm reading correctly, looks like n is the array length, and the + 1 is the tag/ext-type byte. Seems like this fix should cover the fixext types as well. 👍
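
To make the effect of the missing increment concrete, here is a toy sketch. `ToyReader` is hypothetical and much simpler than `BaseMsgPackReader`: it assumes the header and length bytes have already been consumed, so each record is just one tag byte followed by `n` payload bytes.

```scala
object ExtIndexSketch {
  // Toy reader: each record is [tag, payload (n bytes)].
  // `advance` toggles the fix so both behaviors can be compared.
  final class ToyReader(buf: Array[Byte], advance: Boolean) {
    var index: Int = 0

    def parseExt(n: Int): (Byte, Array[Byte]) = {
      val tag = buf(index)
      val payload = buf.slice(index + 1, index + 1 + n)
      if (advance) index += n + 1 // 1 tag byte + n payload bytes
      (tag, payload)
    }
  }

  // Two back-to-back records with 2-byte payloads: tags 7 and 8.
  val input: Array[Byte] = Array[Byte](7, 1, 2, 8, 3, 4)

  // Returns the two tags read and the final index.
  def readTwoTags(advance: Boolean): (Byte, Byte, Int) = {
    val r = new ToyReader(input, advance)
    val (t1, _) = r.parseExt(2)
    val (t2, _) = r.parseExt(2)
    (t1, t2, r.index)
  }
}
```

With `advance = true` the second read sees tag 8 and the index ends at 6; with `advance = false` the reader stays at index 0 and returns tag 7 twice.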

htmldoug

After hash fix, lgtm. thanks!

This rolls back the changes to `equals` and `hashCode`. Those changes may be worth making, but not in this PR. The test now compares serialized and re-serialized values.

harpocrates · 2021-11-13T13:23:44Z

I've switched back to Array[Byte] for the tests.

If you are planning on squashing and merging, maybe use that as the commit message instead of the default concatenation of intermediate messages?

htmldoug reviewed Nov 11, 2021

update hashcode too

bfbd2fa

htmldoug reviewed Nov 12, 2021

htmldoug approved these changes Nov 12, 2021

htmldoug mentioned this pull request Nov 12, 2021

Bugfix: MsgPackReader: properly increment index in ext rallyhealth/weePickle#96

Merged

Compare serialized and re-serialized values in test

385e4cd

htmldoug approved these changes Nov 15, 2021

htmldoug merged commit 106a706 into com-lihaoyi:master Nov 15, 2021
