Re: [I18ndir] Working Group Last Call: Structured Headers for HTTP
John C Klensin <john-ietf@jck.com> Tue, 04 February 2020 16:06 UTC
Date: Tue, 04 Feb 2020 11:06:20 -0500
From: John C Klensin <john-ietf@jck.com>
To: Patrik Fältström <patrik@frobbit.se>
cc: John R Levine <johnl@taugh.com>, i18ndir@ietf.org
Message-ID: <E5B773EBE912789255643EEF@PSB>
In-Reply-To: <7D31FE0A-D4EC-4096-83FE-97D2BF4908F5@frobbit.se>
References: <20200203173404.88EE813AA055@ary.qy> <E2361F8BA970A15043416C2D@PSB> <alpine.OSX.2.21.99999.374.2002031653540.31381@ary.qy> <D03AE38116EF15538E10CFAF@PSB> <7D31FE0A-D4EC-4096-83FE-97D2BF4908F5@frobbit.se>
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/n3-29BpdG4LtJguncehSZsf2Kis>
Subject: Re: [I18ndir] Working Group Last Call: Structured Headers for HTTP

--On Tuesday, February 4, 2020 13:07 +0100 Patrik Fältström
<patrik@frobbit.se> wrote:

> On 4 Feb 2020, at 2:30, John C Klensin wrote:
>
>> --On Monday, February 3, 2020 17:05 -0500 John R Levine
>> <johnl@taugh.com> wrote:
>>
>>> ...
>>> In section 3.3.3 it says that non-ASCII text is to be encoded
>>> as base64, which seems reasonable, "along with a character
>>> encoding (preferably UTF-8)" but it doesn't have any
>>> convention for where the encoding goes. Experience with MIME
>>> suggests that life is simpler if there's a standard form for
>>> an encoded text blob so libraries can decode and if need be
>>> transliterate in one go.
>>
>> And omission of a clear model for specifying what is really
>> what we call a "charset" elsewhere (not just an encoding) is
>> the one clear defect in that particular discussion in the
>> document.
>
> Don't forget normalization or otherwise how comparison is to
> be done of the UTF-8 encoded characters in the string. Nothing
> about that either.

The omission was deliberate, so I should explain.

Let me take advantage of John Levine's observation about email
headers, with which we have much more experience. Very loosely
and informally, there are two types of header field data - the
stuff that is highly structured like addresses (e.g., "From:",
"To:", and "Cc:") and maybe even "Date:" and the stuff, like
"Subject:" fields, that is basically free text. For the latter,
in most cases one doesn't care whether it can be compared to
other strings or not -- the important thing is that it can be
read by the user and not be confusing. Of course, someone might
want to sort or select on unstructured fields, but that may not
impose as strong a requirement.

For HTTP headers that contain structured material, strict
normalization and comparison rules might be less important -- as
I suggested to John, one place where the analogy breaks down is
that many or most HTTP headers are closer to envelope
information in email than they are to user presentation
information. What is more important is that, as The Unicode
Standard has been suggesting all along and more recent W3C work
addresses in a more refined way, in many or most cases (and even
for more structured fields), the right answer is to normalize
(and apply other rules as needed) only at comparison time,
normalizing both (all) strings to be compared, and not even
bothering to store the normalized strings after the comparison
is completed. I still believe that we got IDN labels right by
requiring that both the stored form and any lookup keys be
normalized and structured early, but they are, for several
reasons, an unusual case.

So, out of concern that anything that looked like "one size fits
all" might turn out to be closer to "...fits none", I decided to
avoid touching those issues in the context of this particular
document. Maybe that is wrong but, if it is, I'd be inclined to
suggest that any document that specifies a new HTTP header field
type and that allows characters in the value/data that are not
natively ASCII MUST include an Internationalization
Considerations section that explicitly addresses these issues.

Does that make sense? Do we think that sort of requirement is
needed and would be helpful?

    john
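
For readers who want to see what "normalize only at comparison
time" could look like in practice, here is a minimal Python
sketch. It is an illustration only, not something proposed in
the message or in the Structured Headers draft: the function
name equivalent() and the choice of NFC as the default form are
assumptions, and a real header field definition might call for a
different form or for additional rules such as case folding.

```python
import unicodedata


def equivalent(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two strings by normalizing both at comparison time.

    The normalized copies exist only inside this call; the callers'
    strings are left exactly as they were received or stored.
    `form` is an assumption here (NFC); pick what the field requires.
    """
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# Two spellings of the same name, as they might arrive on the wire.
decomposed = "Fa\u0308ltstro\u0308m"   # base letters + combining diaeresis
precomposed = "F\u00e4ltstr\u00f6m"    # single precomposed code points

print(decomposed == precomposed)            # raw code point comparison
print(equivalent(decomposed, precomposed))  # comparison after normalization
```

The raw comparison treats the two values as different, while the
comparison-time normalization treats them as the same string,
and neither stored value is rewritten.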