Re: [hybi] Several questions/proposals about WebSocket Close Status Codes

Andy Green <andy@warmcat.com> Sun, 30 January 2022 07:04 UTC

Message-ID: <575a1e62-4058-1b56-8c09-62814dec3473@warmcat.com>
Date: Sun, 30 Jan 2022 07:04:21 +0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.4.0
Content-Language: en-US
To: David Jarry <hybi=40melnofil.fr@dmarc.ietf.org>, hybi@ietf.org
References: <f9ca533c-7cfb-9079-26c1-6f99eec529a2@melnofil.fr>
From: Andy Green <andy@warmcat.com>
In-Reply-To: <f9ca533c-7cfb-9079-26c1-6f99eec529a2@melnofil.fr>
Content-Type: text/plain; charset="UTF-8"; format="flowed"
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/hybi/Os6Uo8sdnx5qUWFVFTb5HQIP7OM>
Subject: Re: [hybi] Several questions/proposals about WebSocket Close Status Codes


On 1/30/22 06:07, David Jarry wrote:

>> RFC : 1003 indicates that an endpoint is terminating the connection 
>> because it has received a type of data it cannot accept (e.g., an 
>> endpoint that understands only text data MAY send this if it receives 
>> a binary message).
>> IANA : Unsupported Data.
>>
>> RFC : 1007 indicates that an endpoint is terminating the connection 
>> because it has received data within a message that was not consistent 
>> with the type of the message (e.g., non-UTF-8 data within a text message).
>> IANA : Invalid frame payload data.
> 
> I don't understand how to choose between these two codes: How does a 
> server expecting UTF-8 text know the difference between binary input and 
> non-UTF-8 input?

In ws, every message is explicitly marked as UTF-8 TEXT (its first frame
has opcode 1) or BINARY (opcode 2).  So "receives a binary message"
isn't ambiguous in ws: the server always knows whether it got the type
it wasn't expecting.
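
A minimal sketch of that check, assuming you already have the first byte
of the first frame of a message in hand. The function names are made up
for illustration; only the opcode values (1 = text, 2 = binary) and the
position of the opcode in the low four bits come from RFC 6455.

```python
OPCODE_TEXT = 0x1    # RFC 6455: text frame
OPCODE_BINARY = 0x2  # RFC 6455: binary frame


def frame_opcode(first_byte: int) -> int:
    """Return the 4-bit opcode held in the low nibble of a frame's first byte."""
    return first_byte & 0x0F


def message_kind(first_byte: int) -> str:
    """Classify the first frame of a message as 'text', 'binary' or 'other'."""
    opcode = frame_opcode(first_byte)
    if opcode == OPCODE_TEXT:
        return "text"
    if opcode == OPCODE_BINARY:
        return "binary"
    # Continuation and control frames (close, ping, pong) land here.
    return "other"
```

Note that only the first frame of a fragmented message carries the TEXT
or BINARY opcode; the continuation frames that follow use opcode 0.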

> It would be much simpler if one code corresponded to "I expect XML and I 
> received JSON" (type error, encoding error, parsing error…) and if the 
> other code corresponded to "I expect a JSON object ({}) and I received a 
> JSON array ([])" (value error, precondition fail on data, wrong content…).

It is simple, since the protocol tells you explicitly whether the
message is BINARY or TEXT.  Ws doesn't attempt to define payload
semantics, so it's not the right layer to define what's wrong with them.
The server can send a short diagnostic reason (up to 123 bytes of UTF-8,
following the status code) in the CLOSE frame to help with analysis.
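
As a rough illustration of the CLOSE frame extra data: the body is a
2-byte big-endian status code followed by a UTF-8 reason, and the whole
thing has to fit the 125-byte control frame limit. The helper names
below are hypothetical, and this sketch simply rejects bodies shorter
than 2 bytes rather than handling the "no status code" case.

```python
import struct
from typing import Tuple

MAX_CONTROL_PAYLOAD = 125  # RFC 6455 limit for control frame payloads


def build_close_payload(code: int, reason: str = "") -> bytes:
    """Pack a close status code and a UTF-8 reason into a CLOSE frame body."""
    payload = struct.pack("!H", code) + reason.encode("utf-8")
    if len(payload) > MAX_CONTROL_PAYLOAD:
        raise ValueError("close reason too long for a control frame")
    return payload


def parse_close_payload(payload: bytes) -> Tuple[int, str]:
    """Split a CLOSE frame body back into (status code, reason)."""
    if len(payload) < 2:
        raise ValueError("this sketch expects a status code to be present")
    (code,) = struct.unpack("!H", payload[:2])
    return code, payload[2:].decode("utf-8")
```

So an application-level complaint such as "expected a JSON object" can
travel in the reason text while the status code stays at the ws level.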

> So if a server expects UTF-8 text:
> 
>   * The first code is returned if UTF-8 decoding fails (it doesn't
>     matter if the input is binary or a nearly UTF-8 text).
>   * The second code is returned if the text uses Egyptian hieroglyphs
>     when an English text was expected.

The first code is for "data it cannot accept" in general; the example
given is the case where it was sent the wrong one of a BINARY or TEXT
message.

The second one is for when you see a TEXT message that violates the
specific requirement that it must contain valid UTF-8.  It's written
generally, but IIRC TEXT + UTF-8 is the only case where ws talks about
judging the payload contents themselves.
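
Put as code, the choice for an endpoint that only accepts text might
look like the sketch below. The function name is invented for this
example, and it assumes the whole message payload has already been
reassembled; the codes 1003 and 1007 are the ones quoted above.

```python
from typing import Optional

OPCODE_TEXT = 0x1
OPCODE_BINARY = 0x2
CLOSE_UNSUPPORTED_DATA = 1003       # wrong message type for this endpoint
CLOSE_INVALID_PAYLOAD_DATA = 1007   # TEXT message that is not valid UTF-8


def close_code_for_text_only_endpoint(opcode: int, payload: bytes) -> Optional[int]:
    """Return the close code to send, or None if the message is acceptable."""
    if opcode == OPCODE_BINARY:
        # The type is known from the opcode alone; the bytes are never inspected.
        return CLOSE_UNSUPPORTED_DATA
    if opcode == OPCODE_TEXT:
        try:
            payload.decode("utf-8")
        except UnicodeDecodeError:
            return CLOSE_INVALID_PAYLOAD_DATA
    return None
```

A BINARY message gets 1003 even if its bytes happen to be valid UTF-8,
and a TEXT message gets 1007 only when the UTF-8 check itself fails.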

> II/
> 
>>  IANA 1014 : The server was acting as a gateway or proxy and received 
>> an invalid response from the upstream server. This is similar to 502 
>> HTTP Status Code.
>> RFC : Status codes in the range 4000-4999 are reserved for private use 
>> and thus can't be registered. Such codes can be used by prior 
>> agreements between WebSocket applications. The interpretation of these 
>> codes is undefined by this protocol. IANA : Reserved for Private Use.
> 
> Since the code 1014 admits the existence of gateway and proxy, shouldn't 
> it be specified somewhere that the intermediate servers MUST pass 
> without modifications any code in the range 4000-4999 that it does not 
> understand, so that the agreement can be directly established between 
> the end server and the end client (despite the gateway and proxy)?

It doesn't say that they shouldn't, so there is no conflict with doing 
so.  It would indeed be better if it did say they MUST.
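
For what it's worth, the pass-through rule being proposed is easy to
state in code. This is only a sketch of a hypothetical intermediary: the
function name and the idea of a local rewrite table are assumptions for
illustration, and nothing in RFC 6455 defines this behaviour today.

```python
from typing import Dict

PRIVATE_USE_CODES = range(4000, 5000)  # 4000-4999, reserved for private use


def relay_close_code(code: int, local_rewrites: Dict[int, int]) -> int:
    """Return the close code a (hypothetical) intermediary forwards downstream."""
    if code in PRIVATE_USE_CODES:
        # Private-use codes belong to the end applications: never touch them.
        return code
    # Other codes may be rewritten by local policy, or passed through unchanged.
    return local_rewrites.get(code, code)
```

For example, with a policy of {1011: 1014} the intermediary would report
an upstream 1011 as 1014, but a 4001 from upstream would reach the
client as 4001 whatever the table says.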

> III/
> 
> Unlike HTTP, communication is two-way, but many codes assume that there 
> is a service, a server, and a client. For certain uses, it may be 
> interesting to use the WebSocket protocol between two servers (peer to 
> peer). For example if a server on a local network wants to communicate 
> with an external server through a Firewall which only accepts port 80 in 
> both directions, then WebSocket makes it possible to create sockets in 
> both directions even through an HTTPd. Everything should be done to 
> reduce the difference between a server and a client (once the connection 
> is established, think of it like peers).

I don't disagree with the observation, but it's like that because ws was 
defined as an upgrade protocol from http, which has this schism 
thoroughly baked in.

>> IANA 1014 : The server was acting as a gateway or proxy and received 
>> an invalid response from the upstream server. This is similar to 502 
>> HTTP Status Code.
> 
> Must be "Gateway or Proxy Error"! Please authorize a client to send this 
> code. I don't understand why a client is forbidden to be behind a 
> gateway or a proxy.

A client isn't "forbidden to be behind a gateway or proxy"; it just
isn't given an explicit code to say it is closing because of shenanigans
caused by one.

>> IANA 1012 : Service Restart
> 
> Must be "Restart"! Please authorize a client to send this code. Maybe 
> the server will decide to keep certain things cached (unlike code 1000 
> where the server can destroy all its caches).

Well, I guess it could be useful.

> IV/
> 
> Proposals (codes for timeouts and version checks):
> 
> Gateway or Proxy Timeout:
> The server was acting as a gateway or proxy and did not receive a timely 
> response from the upstream server. This is similar to 504 HTTP Status Code.

I guess if you have 1014, this is also useful.

> Timeout:
> A generic status code that can be returned when a timeout has occurred, 
> meaning that the purpose for which the connection was established has
> not been fulfilled (something took too long).

Timeouts don't necessarily need to surface at the ws layer; e.g.,
whatever timed out might be retryable inside the existing ws connection
without terminating and re-establishing it.  So I think this is of
limited use.

> Deprecated:
> Please update. Even a client can send this code and try another server, 
> a warning can be written in the server log.

The idea of the subprotocol naming is to contain the necessary 
versioning cleanly.

https://datatracker.ietf.org/doc/html/rfc6455#section-1.9

> Not Implemented:
> I'm not updated. A client can reboot with backwards compatibility enabled
> or connect to another server. A server can treat this request as a 1012
> code, except that the last request was not understood (e.g., the
> client browser should reload the page, then the server sends back the
> last message)!

Unless I'm missing your point, subprotocols are again designed to absorb
this kind of problem (and they allow the client to propose multiple
supported versions in one step).
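
A sketch of how that negotiation can carry versioning, assuming the
server sees the raw Sec-WebSocket-Protocol header value from the
handshake. The function name is hypothetical; the two subprotocol names
are the examples used in RFC 6455 section 1.9.

```python
from typing import Collection, Optional


def select_subprotocol(header_value: str, supported: Collection[str]) -> Optional[str]:
    """Pick the first client-offered subprotocol that the server supports."""
    # The client lists its subprotocols comma-separated, in order of preference.
    offered = [token.strip() for token in header_value.split(",") if token.strip()]
    for name in offered:
        if name in supported:
            return name
    # No overlap: the server answers the handshake without a subprotocol.
    return None
```

A client that prefers "v2.bookings.example.net" but can fall back to
"bookings.example.net" offers both, and an older server that only knows
the latter simply selects it, with no close code involved.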

-Andy