Re: Determining which fields are structured

Mark Nottingham <mnot@mnot.net> Wed, 17 November 2021 22:35 UTC

Hi Justin,

> On 18 Nov 2021, at 3:21 am, Justin Richer <jricher@mit.edu> wrote:
> 
> The question at hand: how do you know if a particular field is supposed to be structured or not?

We made an explicit decision that you'd need specific knowledge of the header in some way; you can't recognise whether a field is structured just by looking at it, nor can you necessarily tell which top-level type it is (list, dictionary, or item).

> While working on an implementation for HTTP Signatures, I put in a bit of code that simply tries to parse any field as a Dictionary, List, or Item, and if it doesn’t throw an error, marks it as whatever kind of structured field worked. This seemed to work for the most part, but I quickly hit one case that surprised me:
> 
> 	Host: example.org
> 
> This field parsed as sf-dictionary, which I wasn’t expecting at all because it doesn’t look or feel like a dictionary item. However, after talking with Mark Nottingham, it turns out that this fits the ABNF just fine: it’s a valid single-key dictionary with one key of “example.org” and no value, which is interpreted as a boolean “True” value.

Aside: the ABNF is only illustrative; the algorithms are normative.
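
For illustration, the Host case can be reproduced with a minimal Python sketch. It assumes only the key rules from the dictionary parsing algorithm in RFC 8941 (a key starts with a lowercase letter or "*", then allows lowercase letters, digits, "_", "-", "." and "*"), and it handles only members that are bare keys. The function name parse_bare_key_dictionary is hypothetical, and this is not a full Structured Fields parser.

```python
import string

# Characters allowed in a Structured Fields key (per RFC 8941).
KEY_START = set(string.ascii_lowercase) | {"*"}
KEY_CHARS = KEY_START | set(string.digits) | set("_-.")


def parse_bare_key_dictionary(value: str) -> dict:
    """Parse a dictionary whose members are all bare keys.

    Simplification (assumption of this sketch): no "=value" parts, no
    parameters, and no empty input. Anything else raises ValueError.
    """
    result = {}
    for member in value.strip(" ").split(","):
        key = member.strip(" \t")
        if not key or key[0] not in KEY_START:
            raise ValueError(f"not a bare key: {key!r}")
        if any(c not in KEY_CHARS for c in key):
            raise ValueError(f"not a bare key: {key!r}")
        # A member with no value is interpreted as Boolean true.
        result[key] = True
    return result


print(parse_bare_key_dictionary("example.org"))
```

In this sketch "example.org" is accepted as a one-member dictionary whose value is Boolean true, while "example.org:8080" is rejected because ":" is not a key character. Successful parsing therefore says little about whether a field was meant to be structured.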

> So with that I’d like to re-assert my support for the “Retrofit of Structured Fields for HTTP” draft: https://www.ietf.org/archive/id/draft-nottingham-http-structure-retrofit-00.html
> 
> In particular, if the WG picks up this document, I would also like to see us add a column to the HTTP Field Registry for SF type, and register all of the existing ones that we know work as different fields (leaving unknown or undefined ones as blank, maybe? Or something like “unstructured” if we know it’s specifically not meant to be structured, like Host): https://www.iana.org/assignments/http-fields/http-fields.xhtml
> 
> This resource would help code like mine, as I’d be able to pull in that table and have some sense of what to expect when trying to parse a given field.
> 
> This would also help push the goal of having any new fields be built using structured field types — new field definitions would be required to fill out that column when they register the field name.

I think that's reasonable. I could see adding an annotation as to whether the type is "native" (i.e., the field is specified as a structured field) or "retrofit" (i.e., structured parsing isn't the normative algorithm for the field, but you might have some luck using it).
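
As a rough sketch of how code might consume such a registry column, the Python below keeps a small lookup table keyed by field name. Everything in it is an assumption for illustration: the names SFType, Entry, REGISTRY and lookup are hypothetical, the table entries are examples rather than actual registry contents, and a sf_type of None stands in for "known to be unstructured".

```python
from enum import Enum
from typing import NamedTuple, Optional


class SFType(Enum):
    LIST = "list"
    DICTIONARY = "dictionary"
    ITEM = "item"


class Entry(NamedTuple):
    sf_type: Optional[SFType]  # None: known to be unstructured
    native: bool               # True: specified as a structured field


# Illustrative entries only; not taken from the IANA registry.
REGISTRY = {
    "priority": Entry(SFType.DICTIONARY, native=True),
    "cache-control": Entry(SFType.DICTIONARY, native=False),  # retrofit
    "content-length": Entry(SFType.ITEM, native=False),       # retrofit
    "host": Entry(None, native=False),                        # unstructured
}


def lookup(field_name: str) -> Optional[Entry]:
    """Return the entry for a field, or None if the field is unknown."""
    return REGISTRY.get(field_name.lower())


for name in ("Host", "Cache-Control", "X-Unknown"):
    print(name, lookup(name))
```

A caller would try structured parsing only when lookup returns an entry with a type, and could treat a retrofit entry as best effort, falling back to the field's own parsing rules on failure.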

Cheers,


--
Mark Nottingham   https://www.mnot.net/