[hybi] Re: Questions about RFC6455 (implementing websockets)
"A. Rothman" <amichai2@amichais.net> Sun, 15 September 2024 13:50 UTC
Message-ID: <01e20cbd-cd30-4235-bcf9-05a7bc448ab3@amichais.net>
Date: Sun, 15 Sep 2024 16:50:49 +0300
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
To: Adam Rice <ricea@google.com>
References: <9c213fb1-8f31-4c9a-a8a8-562693a3f7b7@amichais.net> <CAHixhFp1+bgvfFrro-a-SaqBHkVUG3PX8Fwfv9+uu_pApWVSVA@mail.gmail.com>
Content-Language: en-US
From: "A. Rothman" <amichai2@amichais.net>
In-Reply-To: <CAHixhFp1+bgvfFrro-a-SaqBHkVUG3PX8Fwfv9+uu_pApWVSVA@mail.gmail.com>
CC: hybi@ietf.org
Subject: [hybi] Re: Questions about RFC6455 (implementing websockets)
List-Id: Server-Initiated HTTP <hybi.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/hybi/luK0U6KmGqASHZAwoi0uzsCMS48>
List-Archive: <https://mailarchive.ietf.org/arch/browse/hybi>

Thanks for responding, Adam! This list does seem like a bit of a wasteland... I appreciate you going over the points in detail anyway.

I'm not sure which of the points I mentioned counts as errata or if they should be reported at all. They are unclear sections and not specific typos or well-defined bugs. I thought there might be a place to discuss them with whomever may write the next update of the RFC so it can be improved, but now I understand there's likely no such person or plan, and unfortunately it'll probably just rot here in the mailing list's archives :-/

So if I understand correctly you implemented the websocket client in chrome, and the pywebsocket experimental test server on the other side? Do you know what other clients are tested against it? In any case thanks for the pointer - I'll have a look at pywebsocket to see how it handles some of the edge cases, assuming it is itself RFC compliant.

In the meanwhile I also came across the Autobahn test suite, are you familiar with it? It seems quite old and unmaintained, but is still usable, seems to have quite a nice battery of tests, and at least at some point had an impressive number of projects and implementations tested against it. My implementation passes all tests there, so that's encouraging, but I did have to fix two failures that I think are more opinionated than something actually required by the RFC, but I'll have a closer look to see if there is good reason behind them.

One is that when receiving a close frame with no body, it must respond with a 1000 normal close - I would think it more intuitive to echo the empty body rather than decide the reason was 'normal', especially considering that if I understand correctly the application will be passed 1005 anyway on receiving the empty close, so why reply with another arbitrary code? And in any case, maybe this should have been marked as implementation-dependent rather than a fail. The second iirc was how to react to receiving a code like 999 or 1005.
I don't remember anything specific to how this should be handled in the RFC, but will look again.

While going down this rabbit hole I also found an endless loop bug in the JDK's deflate handling, so at least something good came out of it :-) I also think I found a bug in the deflate extension RFC 7692 and reported the errata, which ironically seems to be posted back to this mailing list with no answer yet, but I guess someone may get to it eventually.

Regarding the base64 validation, if I understand correctly you do both a regex validation and an actual base64 decode? If you do the latter then what is the former for? The base64 decoding is likely more efficient than a regex engine, and the regex doesn't add any validation on top of that... also the length check doesn't require a full decode, only the encoded string length and the padding bytes. I understand performance is not an issue in your case, but just didn't get why you would need both and not just one (or the other).

Regarding handling a state change in the middle of sending, just to clarify the question, the state changes to CLOSING when receiving a close frame, which should be replied to, and not just when the tcp connection disappears, so it's unclear what happens to the currently sent frame/frames/message, when it MUST be aborted in the middle, and in whatever state the framing protocol is (e.g. broken frame, broken continuation between frames, etc.) where does the reply close frame fit in. So the common case is not the moot one but the not-well-defined one.

The rest of your answers are helpful, I'll consider them in deciding what to do. But again it's unfortunate this discussion is between two sometimes confused implementers and not anyone who considered these things when writing the spec or could do something about them in the next one.

Thanks again,

Amichai

On 9/12/24 11:27, Adam Rice wrote:
> Good analysis.
> You can see existing errata at
> https://www.rfc-editor.org/errata_search.php?rfc=6455 and report new
> errata using the link at the bottom of the page.
>
> You should be aware that since the IETF hybi working group was closed,
> it is unclear who is responsible for updates to RFC6455, and in
> practice updates haven't been happening.
>
> I was not involved in the standardisation of RFC6455, but I've been
> working with WebSockets for a long time and I'm one of the maintainers
> of pywebsocket, a server that is used in browser testing. I will
> comment from the point-of-view of what pywebsocket does.
>
> On Thu, 5 Sept 2024 at 02:31, A. Rothman <amichai2@amichais.net> wrote:
>
> > 1: 1.8
> >
> > "At the time of writing of this specification, it should be noted
> > that connections on ports 80 and 443 have significantly different
> > success rates, with connections on port 443 being significantly
> > more likely to succeed, though this may change with time."
> >
> > I could neither understand what this is trying to say (Is this a
> > general statistic about port-scanning? Or a fun fact specific to
> > websocket deployments?) nor why "it should be noted" - how is it
> > relevant to the websocket spec? Would removing this sentence
> > entirely make any difference in understanding, implementing or
> > using websockets? I'm not sure if I'm missing an important point
> > or this is really redundant.
>
> Safe to ignore.
>
> > 2: 4.2.1.3
> >
> > "An |Upgrade| header field containing the value "websocket""
> >
> > Does "containing" imply it can have other elements as well (as the
> > Upgrade field is normally defined), or does it mean the entire
> > value must exactly equal "websocket"? The former makes more sense,
> > being standard, but I've seen interpretations both ways, and
> > mention of clients that reject anything but the single exact
> > value. Perhaps this can be clarified in the spec.
>
> pywebsocket requires an exact case-insensitive match for "websocket",
> so if you only have browser clients it's safe to do the same.
>
> > 4: 4.2.2.4 (regarding /key/)
> >
> > "It is not necessary for the server to base64-decode the
> > |Sec-WebSocket-Key| value."
> >
> > Although technically correct, this is somewhere between a
> > half-truth and misleading: section 4.2.1 states that
> >
> > "If the server, while reading the handshake, finds that the client
> > did not send a handshake that matches the description below [...]
> > the server MUST stop processing the client's handshake and return
> > an HTTP response with an appropriate error code"
> >
> > and the "description below" of the key field is:
> >
> > "A |Sec-WebSocket-Key| header field with a base64-encoded (see
> > Section 4 of [RFC4648]) value that, when decoded, is 16 bytes in
> > length."
>
> In practice, pywebsocket checks that the string looks like base64
> using the regular expression /^[+/0-9A-Za-z]{21}[AQgw]==$/ and then
> decodes it to check the length. I think the first step is probably
> sufficient by itself, but performance isn't a priority for pywebsocket.
>
> > 6: 5.1
> >
> > "a client MUST mask all frames that it sends to the server" and
> > "A server MUST NOT mask any frames that it sends to the client"
> >
> > However in 5.2, referring to the Mask field definition, it only
> > says:
> >
> > "All frames sent from client to server have this bit set to 1."
> >
> > I think it should also say "All frames sent from server to client
> > have this bit set to 0". It's strange to mention the requirement
> > in one direction and not the other here, even though both are
> > MUST (NOT) earlier. It would make sense to mention either both
> > directions or none, not just one.
>
> My understanding is that there was an open question of whether masking
> would be required in both directions, so the standard left it easy to
> change.
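
To make the options for the Sec-WebSocket-Key check concrete, here is a minimal Python sketch of the three variants discussed in this thread. It is not pywebsocket's actual code: the function names are made up for illustration, and only the regular expression is taken from the message above. It assumes the header value has already been extracted and stripped of surrounding whitespace.

```python
import base64
import binascii
import re

# Regex quoted in the thread: 21 alphabet chars, one of AQgw, then "==".
# fullmatch() is used instead of ^...$ so a trailing newline is rejected too.
KEY_RE = re.compile(r"[+/0-9A-Za-z]{21}[AQgw]==")


def key_ok_regex(key: str) -> bool:
    """Shape check only: looks like the base64 encoding of 16 bytes."""
    return KEY_RE.fullmatch(key) is not None


def key_ok_decode(key: str) -> bool:
    """Decode strictly and check that the result is 16 bytes long."""
    try:
        return len(base64.b64decode(key, validate=True)) == 16
    except (binascii.Error, ValueError):
        return False


def key_ok_length(key: str) -> bool:
    """Length-only check: 24 characters with exactly two '=' of padding.

    This does NOT verify that the other characters are in the base64
    alphabet, so on its own it is weaker than the two checks above.
    """
    return len(key) == 24 and key.endswith("==") and key[21] != "="
```

The sample key from RFC 6455, "dGhlIHNhbXBsZSBub25jZQ==", passes all three checks. A 24-character string with a character outside the base64 alphabet passes only the length-only check, which is one reason to prefer either the regex or the strict decode.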
>
> > 7.1.5
> >
> > "_The WebSocket Connection Close Code_ is defined as the status
> > code (Section 7.4
> > <https://datatracker.ietf.org/doc/html/rfc6455#section-7.4>)
> > contained in the first Close control frame received by the
> > application implementing this protocol."
> >
> > "the first Close control frame received" implies there can be more
> > than one (also in 7.1.6)? What scenario is this? Why not
> > explicitly disallow sending more than one? How is the receiver
> > supposed to react to the second one?
>
> My interpretation is that this is just protection against a
> badly-behaved peer that sends multiple close codes.
>
> Chrome ignores an unexpected Close frame. pywebsocket always closes
> the connection immediately after receiving a Close frame, so the issue
> cannot arise.
>
> > 9: 6.1.1
> >
> > "If at any point the state of the WebSocket connection changes,
> > the endpoint MUST abort the following steps."
> >
> > It isn't entirely clear what "at any point" means here - in
> > between the listed steps? within a step? more specifically, in
> > step 7 it transmits potentially multiple frames - is that an
> > atomic operation? or MUST it abort also in between
> > frames/fragments? or MUST it abort in the middle of a single
> > (possibly very long) frame? This has significant implications to
> > what happens next (e.g. if aborted in the middle of a frame, or
> > even in between frames in a multi-frame message, the framing
> > protocol is then broken and no close frame can be sent, etc.,
> > whereas if step 7 is all-or-nothing with regards to the MUST
> > abort, then the closing can continue gracefully and no data is
> > lost). This should really be clarified, as this is a MUST that can
> > be interpreted in significantly different ways.
>
> My philosophy is that frames are atomic, but messages are not. In
> practice, many implementations avoid the question completely by
> sending every message in a single frame.
> Obviously if the underlying
> connection goes down in the middle of sending a frame we can't
> continue anyway, so the question is moot.
>
> Chrome's implementation may be slightly wrong here, as we have an
> internal queue of frames to be written to the OS's socket buffer, and
> we don't have a way for a Close frame to jump the queue.
>
> > 2 - More generally, it's not entirely clear why it would be
> > required to close the socket in any case if the handshake fails,
> > whether according to this section or other sections. To my
> > understanding the handshake is still using the HTTP protocol, so
> > sending an error status code should be enough, and the client
> > should be free to reuse the connection to try again, or to send
> > unrelated non-websocket requests for other HTTP resources on the
> > same TCP connection. Specifically, section 4.4 describes such a
> > scenario for protocol version negotiations - why should it be
> > necessary to close the connection in the middle before the second
> > request with the corrected version number is sent? (section 4.2.2,
> > under /version/, requires aborting the handshake).
>
> As a practical matter, in order for NTLM or Negotiate HTTP
> authentication to work it's necessary for the client to be able to
> receive a 401 response and send another request with updated
> credentials on the same connection. I was never sure whether RFC6455
> permits this, but I implemented it for Chrome anyway as it was
> required for interoperability.
>
> Thanks,
> Adam Rice
> Chromium Networking
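
Coming back to the 6.1.1 question, the "frames are atomic, but messages are not" reading can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not how Chrome or pywebsocket do it: `State`, `encode_frame`, `send_text` and the `get_state`/`write` callbacks are hypothetical names, and the encoder only handles unmasked (server-to-client) frames with payloads of at most 125 bytes. The connection state is checked only on frame boundaries, so an abort never leaves a half-written frame on the wire.

```python
from enum import Enum
from typing import Callable


class State(Enum):
    OPEN = "open"
    CLOSING = "closing"


OP_CONT, OP_TEXT, OP_CLOSE = 0x0, 0x1, 0x8


def encode_frame(fin: bool, opcode: int, payload: bytes) -> bytes:
    """Encode an unmasked frame; sketch only supports 7-bit lengths."""
    if len(payload) > 125:
        raise ValueError("sketch only handles payloads up to 125 bytes")
    return bytes([(0x80 if fin else 0x00) | opcode, len(payload)]) + payload


def send_text(
    payload: bytes,
    chunk: int,
    get_state: Callable[[], State],
    write: Callable[[bytes], None],
) -> bool:
    """Send payload as fragments of at most `chunk` bytes.

    The state is checked before each frame, never inside one: a frame
    that has been started is always written whole. Returns True only
    if every fragment of the message was written.
    """
    pieces = [payload[i:i + chunk] for i in range(0, len(payload), chunk)] or [b""]
    for index, piece in enumerate(pieces):
        if get_state() is not State.OPEN:
            return False  # abort between frames; framing is still intact
        opcode = OP_TEXT if index == 0 else OP_CONT
        is_last = index == len(pieces) - 1
        write(encode_frame(is_last, opcode, piece))
    return True
```

If `send_text` returns False, the caller can still write `encode_frame(True, OP_CLOSE, b"")` as the reply: RFC 6455 section 5.4 allows control frames to be injected in the middle of a fragmented message, so the Close frame is valid framing even though the data message is left incomplete.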