-
-
Notifications
You must be signed in to change notification settings - Fork 171
MsgPackReader: properly increment index in ext
#370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The `parseExt` helper in `MsgPackReader` was not properly updating the position in the data source. Also override `Ext.equals` to compare the data arrays by contents instead of identity (which is consistent with the other subclasses of `Msg`). Added two regression tests. Note that these don't work without the `equals` changes, since structurally identical `Ext` messages won't compare equal unless they also happen to be the same object. Fixes com-lihaoyi#369
upack/src/upack/Msg.scala
Outdated
} | ||
|
||
override def hashCode: Int = | ||
MurmurHash3.bytesHash(b, "Ext".hashCode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tag
should be represented in the hash somehow. We could probably just use it as the seed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The more I think about this, the less comfortable I am about updating equals
and hashCode
here. That could lead to some end-user changes and it isn't consistent with some of the other types (eg. Bytes
). Maybe this is still desirable change, but probably not in a PR about fixing ext parsing.
I'm going to switch the round-tripping test I added to assert equivalence of the serialized and serialized-deserialized-reserialized payloads (that way we don't need to call equals
on a Msg
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 for smaller, well-scoped PRs.
Re: equals
/hashCode
, it's hard to imagine that any users are depending on the behavior being broken other than to work around it.
Good point about case class Binary(value: Array[Byte]) extends Msg
being affected, too. A fix for one should probably fix both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rabbit hole goes deeper. Given that 0.0d == 0L
, should Float64(0.0) == Int64(0L)
? Probably so! This is especially relevant since round-tripping Int64(0L)
through messagepack will come back as Int32(0)
. I think you're wise to push equals
/hashCode
to a separate PR. Making assertions against the Array[Byte]
sounds good for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that
0.0d == 0L
, shouldFloat64(0.0) == Int64(0L)
?
Personally, I don't think this should be the case - the "int format family" and "float format family" are separate. I further wish that 0.0d == 0L
would fail to typecheck instead of returning true (that's another rabbit hole though, and Scala is unlikely to ever change that).
This is especially relevant since round-tripping
Int64(0L)
through messagepack will come back asInt32(0)
.
I agree Int64(0L)
should roundtrip, but on the other hand: is there really a need for the distinction? Why not just a single case class Int(value: Long) extends Msg
(and UInt
for unsigned)? We don't distinguish Int16
or the fixint cases, so why distinguish the 32 and 64 bit cases?
@@ -98,7 +98,9 @@ abstract class BaseMsgPackReader extends upickle.core.BufferingByteParser{ | |||
} | |||
def parseExt[T](n: Int, visitor: Visitor[_, T]) = { | |||
val (arr, i, j) = sliceArr(index + 1, n) | |||
visitor.visitExt(getByteSafe(index), arr, i, j, index) | |||
val res = visitor.visitExt(getByteSafe(index), arr, i, j, index) | |||
index += n + 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks important!
If I'm reading correctly, looks like n
is the array length, and the + 1
is the tag/ext-type byte. Seems like this fix should cover the fixext types as well. 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After hash fix, lgtm. thanks!
This rolls back changing `equals` and `hashCode`. Those changes may be worth it, but not in this PR. The test compares serialized and re-serialized values
I've switched back to If you are planning on squashing and merging, maybe use that as the commit message instead of the default concatenation of intermediate messages? |
The
parseExt
helper inMsgPackReader
was not properly updating theposition in the data source.
Added two regression tests. These compare serialized and re-serialized
values (instead of just initial and serialized) since structurally identical
Ext
messages won't compare identical (due to the
Array[Byte]
field whoseequality is determined by identity).
Fixes #369