Add WebSocket::into_inner_with_read_buffer to recover buffered bytes #556

Open
jeremyjpj0916 wants to merge 2 commits into snapview:master from jeremyjpj0916:into-inner-with-read-buffer
@jeremyjpj0916

Fixes #555.

Problem

WebSocket::into_inner() returns only the underlying stream and discards the frame codec's internal read buffer. When a caller takes over the raw stream after the handshake (e.g. to relay raw bytes / drop to a plain byte copy), bytes the peer coalesced into the same read as the handshake response — or an unread frame tail — are silently lost. The codec's in_buffer is unreachable: it sits behind the private WebSocketContext.frame field inside the private WebSocket.context, and only FrameSocket (not the WebSocket path) exposes into_inner() -> (Stream, BytesMut).
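The same failure mode can be reproduced with the standard library alone, which may help when reasoning about it. The sketch below uses `std::io::BufReader` as a stand-in for the frame codec: it is an analogy under that assumption, not tungstenite's code, and the function name `take_over_after_first_line` is hypothetical. The "handshake" is modelled as a single line, and whatever arrived in the same read is left sitting in the reader's buffer.

```rust
use std::io::{BufRead, BufReader, Cursor, Read};

/// Reads one line through a `BufReader`, then takes back the raw stream.
/// Returns (the line, bytes still buffered at takeover, bytes the raw
/// stream yields afterwards).
fn take_over_after_first_line(input: &[u8]) -> (String, Vec<u8>, Vec<u8>) {
    let mut reader = BufReader::new(Cursor::new(input.to_vec()));

    let mut line = String::new();
    reader.read_line(&mut line).expect("in-memory read cannot fail");

    // Bytes that were pulled off the stream in the same read as the line
    // but have not been consumed yet.
    let buffered = reader.buffer().to_vec();

    // `into_inner` drops the internal buffer; only the stream survives.
    let mut raw = reader.into_inner();
    let mut rest = Vec::new();
    raw.read_to_end(&mut rest).expect("in-memory read cannot fail");

    (line, buffered, rest)
}
```

For a short in-memory input, the bytes after the first line end up in `buffered` and the raw stream has nothing left to give, so a caller that keeps only the stream never sees them.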

Change

Add a consuming accessor that returns the stream together with its not-yet-consumed read buffer:

  • WebSocket::into_inner_with_read_buffer(self) -> (Stream, Bytes)
  • WebSocketContext::into_read_buffer(self) -> Bytes (counterpart for callers driving WebSocketContext over their own stream)
  • FrameCodec::into_read_buffer(self) -> BytesMut (internal, pub(super))

Bytes is already re-exported by the crate (pub use bytes::Bytes, used by Message), and BytesMut::freeze() is zero-copy, so this adds no new public dependency and no allocation. into_inner() is unchanged.
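One way a caller might consume the returned pair is to replay the recovered buffer ahead of the stream, so the rest of the relay sees a single contiguous byte stream. The helper below is a minimal sketch using only the standard library: `replay_then_stream` is a hypothetical name, and a `Vec<u8>` stands in for the `Bytes` value that the new accessor returns.

```rust
use std::io::{Chain, Cursor, Read};

/// Prepends the recovered read buffer to the raw stream so downstream code
/// reads the buffered bytes first and the live stream afterwards.
/// `buffered` stands in for the buffer returned next to the stream.
fn replay_then_stream<S: Read>(buffered: Vec<u8>, stream: S) -> Chain<Cursor<Vec<u8>>, S> {
    Cursor::new(buffered).chain(stream)
}
```

This covers the read side only. Writes would still go to the underlying stream directly, so a real relay would need to keep access to it for the outgoing direction.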

Tests

  • into_inner_with_read_buffer_recovers_coalesced_bytes — frame bytes seeded via from_partially_read (modelling a coalesced post-handshake segment) are returned, not dropped.
  • into_inner_with_read_buffer_returns_unread_tail — after reading the first of two buffered frames, the second remains and is recovered.

Full lib test suite passes (50/50); cargo fmt and cargo clippy --lib are clean. CHANGELOG updated under UNRELEASED.

Motivation

This is the upstream blocker for a lossless raw-stream takeover after a WebSocket handshake — e.g. a proxy/gateway tunnel mode that completes the handshake with tungstenite/tokio-tungstenite then bypasses framing to relay raw bytes. A companion PR on snapview/tokio-tungstenite surfaces this on WebSocketStream.

WebSocket::into_inner() returns only the underlying stream and discards
the frame codec's internal read buffer. When a caller takes over the raw
stream after the handshake (e.g. to relay raw bytes / drop to a plain
byte copy), any bytes the peer coalesced into the same read as the
handshake response — or any unread frame tail — are silently lost, with
no public accessor to recover them.

Add WebSocket::into_inner_with_read_buffer() -> (Stream, Bytes) and the
WebSocketContext::into_read_buffer() -> Bytes counterpart, returning the
codec's not-yet-consumed read buffer alongside the stream. Bytes is
already re-exported by the crate, and BytesMut::freeze() is zero-copy.

Tests cover recovering coalesced post-handshake bytes and an unread frame
tail after a partial read.
@jeremyjpj0916

Author

Companion PR that surfaces this on tokio-tungstenite's WebSocketStream: snapview/tokio-tungstenite#380 (draft, depends on this landing + a release). Tracking issue: snapview/tokio-tungstenite#379.

The read-buffer accessors are lossless only at a frame boundary. If a
frame is partially read (header parsed, payload incomplete — e.g. after a
mid-frame WouldBlock), read_frame has already advanced the header bytes
out of the codec's in_buffer, so into_read_buffer returns only the
partial payload and silently drops the parsed header.

- Document the precondition on WebSocket::into_inner_with_read_buffer and
  WebSocketContext::into_read_buffer.
- debug_assert in FrameCodec::into_read_buffer to catch mid-frame misuse
  in debug builds (zero cost in release).
- Add a regression test driving the mid-frame state via a WouldBlock
  stream, gated on debug_assertions.

The motivating use (raw takeover right after the handshake) is always at
a frame boundary and is unaffected.
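The frame-boundary precondition can be illustrated with a toy model. Everything below is hypothetical and greatly simplified: `ToyCodec` uses a one-byte length header instead of real WebSocket framing, and it is not tungstenite's `FrameCodec`. It only shows why a buffer taken mid-frame is missing the header that was already parsed, and how a `debug_assert!` can flag that misuse.

```rust
/// Toy codec: each frame is a 1-byte length header followed by the payload.
/// Hypothetical model for illustration; not tungstenite's FrameCodec.
struct ToyCodec {
    in_buffer: Vec<u8>,
    /// Length from a header that was parsed while its payload is incomplete.
    pending_len: Option<usize>,
}

impl ToyCodec {
    fn new(bytes: &[u8]) -> Self {
        ToyCodec { in_buffer: bytes.to_vec(), pending_len: None }
    }

    /// Returns a complete payload, or `None` when more bytes are needed
    /// (the analogue of a `WouldBlock`).
    fn read_frame(&mut self) -> Option<Vec<u8>> {
        if self.pending_len.is_none() {
            if self.in_buffer.is_empty() {
                return None;
            }
            // Parsing the header advances it out of the buffer.
            self.pending_len = Some(self.in_buffer.remove(0) as usize);
        }
        let len = self.pending_len?;
        if self.in_buffer.len() < len {
            return None; // mid-frame: header gone, payload incomplete
        }
        self.pending_len = None;
        Some(self.in_buffer.drain(..len).collect())
    }

    /// True when no header has been parsed without its payload.
    fn at_frame_boundary(&self) -> bool {
        self.pending_len.is_none()
    }

    /// Consumes the codec and returns the unread bytes. Lossless only at a
    /// frame boundary, so the precondition is checked in debug builds.
    fn into_read_buffer(self) -> Vec<u8> {
        debug_assert!(self.at_frame_boundary(), "called mid-frame");
        self.in_buffer
    }
}
```

After one complete frame has been read, the remaining buffer still starts with the next frame's header and can be replayed as is. After a partial read, the buffer holds only payload bytes, which a raw consumer cannot interpret without the header that was dropped.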
Development

Successfully merging this pull request may close these issues.

No way to recover the codec read buffer when taking over the raw stream after the handshake