Re: support for non-ASCII in strings, was: signatures vs sf-date

Mark Nottingham <mnot@mnot.net> Sat, 03 December 2022 08:09 UTC

From: Mark Nottingham <mnot@mnot.net>
Date: Sat, 03 Dec 2022 19:08:08 +1100
To: "Julian F. Reschke" <julian.reschke@gmx.de>
Cc: ietf-http-wg@w3.org
Archived-At: <https://www.w3.org/mid/FCC95D8F-0F64-4245-98E5-5760AD63E8FA@mnot.net>


> On 3 Dec 2022, at 6:47 pm, Julian Reschke <julian.reschke@gmx.de> wrote:

>> 2) I added %-encoded strings to Problem because the other encoding didn't fit cleanly into SF-land. However, we should _not_ add non-ASCII strings to SF because they're a footgun that, in _most_ cases, will cause more trouble than they're worth.
>> 
>> In the protocol (not content), most strings are intended for machines, not people, and ASCII strings can be processed fairly unambiguously; that's not true when you open things up to full Unicode.
> 
> We're discussing this for header fields that *do* carry
> human-presentable content. Content-Disposition, Link, and now Problem.

Yes.

> If you're serious about human presentable text not belonging here, why
> do we add that to "Problem" right now?

Please re-read what I wrote. Discussing them does not obviate what I said.

>> There are some cases where non-ASCII strings are needed in header fields; mostly, when you're presenting something to a human from the fields. Those cases are not as common. However, there's a catch to adding them: if full Unicode strings were available in the protocol, many designers would understandably use them because it's been drilled into all our heads that Unicode is what you use for strings.
>> 
>> Hence, footgun.
> 
> I would appreciate it if you would explain why there is a problem we need
> to prevent, and what exactly that problem is. Do you have an example?

As you've pointed out, the scope for this bis document was tightly defined. The onus isn't on me to prove what shouldn't go into it...

>> By leaving full Unicode support out of the spec and forcing designers to take positive steps to support it, the (relatively small) barrier to adoption makes them stop and think about whether they need it. I think that's a good thing. I also know that will make some i18n folks unhappy, and I'm sorry for that; unfortunately we're working in an area where protocol artefacts intended for humans and machines are mixed, and so it gets difficult.
> 
> I continue to disagree. By not supporting non-ASCII in the base
> definition, we force people to come up with ad hoc definitions which in
> general will be worse than a common extension we can define here.
> 
>> All of that said, once the algorithms are stable (as Julian has pointed out, they contain some errors), I wouldn't object to including the %-encoding text as an appendix in sf-bis with appropriate warnings, if other folks are amenable.
> 
> That would be a good step in the right direction. I still think we
> need an on-the-wire signal that the encoding is in place, for the same
> reasons why we're doing this revision in the first place (tooling
> support for special-casing integers that happen to represent dates).

I disagree, and you should have brought that up in the scoping discussion.

Cheers,


--
Mark Nottingham   https://www.mnot.net/