Re: New Version Notification for draft-kamp-httpbis-structure-01.txt (fwd)

"Poul-Henning Kamp" <phk@phk.freebsd.dk> Fri, 11 November 2016 13:47 UTC

To: Julian Reschke <julian.reschke@gmx.de>
cc: HTTP Working Group <ietf-http-wg@w3.org>
In-reply-to: <f8b3b877-6b6e-f002-f237-311e91b86d82@gmx.de>
From: "Poul-Henning Kamp" <phk@phk.freebsd.dk>
References: <78354.1477853918@critter.freebsd.dk> <f8b3b877-6b6e-f002-f237-311e91b86d82@gmx.de>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-ID: <34040.1478871774.1@critter.freebsd.dk>
Content-Transfer-Encoding: quoted-printable
Date: Fri, 11 Nov 2016 13:42:54 +0000
Message-ID: <34041.1478871774@critter.freebsd.dk>
Subject: Re: New Version Notification for draft-kamp-httpbis-structure-01.txt (fwd)
Archived-At: <http://www.w3.org/mid/34041.1478871774@critter.freebsd.dk>

--------
In message <f8b3b877-6b6e-f002-f237-311e91b86d82@gmx.de>, Julian Reschke writes:

>        ascii_string = * %x20-7e
>                # This is a "safe" string in the sense that it
>                # contains no control characters or multi-byte
>                # sequences.  If that is not fancy enough, use
>                # unicode_string.
>
>        unicode_string = * unicode_codepoint
>                # XXX: Is there a place to import this from ?
>                # Unrestricted unicode, because there is no sane
>                # way to restrict or otherwise make unicode "safe".
>
>It's not clear why there's even a distinction...

To give designers of HTTP headers a trivial way to define strings
which are "safe" and free from needless complexity vs. strings where
the recipient should be prepared to deal with BOM, RLM and "MAN
IN BUSINESS SUIT LEVITATING".
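
As a rough illustration of the distinction, here is a minimal Python
sketch of how a recipient might tell the two string types apart. The
function names are hypothetical and not part of the draft; the only
thing taken from the draft is the %x20-7e range for ascii_string.

```python
def is_ascii_string(value: str) -> bool:
    """Return True if every character is in %x20-7e.

    This mirrors the quoted rule `ascii_string = * %x20-7e`: printable
    ASCII only, no control characters, nothing above 0x7e. The empty
    string matches because `*` allows zero repetitions.
    """
    return all(0x20 <= ord(ch) <= 0x7E for ch in value)


def classify(value: str) -> str:
    """Hypothetical helper: name the narrowest string type that fits."""
    return "ascii_string" if is_ascii_string(value) else "unicode_string"


if __name__ == "__main__":
    samples = [
        "max-age=3600",   # plain printable ASCII
        "tab\there",      # control character
        "\ufeffhello",    # leading BOM
        "\U0001F574",     # MAN IN BUSINESS SUIT LEVITATING
    ]
    for sample in samples:
        print(repr(sample), classify(sample))
```

A header defined in terms of ascii_string can reject anything that
fails the first check without ever looking at Unicode properties.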

>Also, it needs to be stated whether the grammar is octet or character 
>based. For an abstract datamodel, the latter probably makes more sense.

The abstract data model is abstract, so it is obviously neither.

For the h1 serialization, I don't see how it makes a difference, unless
somebody is running HTTP/1 in EBCDIC or Morse code?

>        h1_common-structure-header =
>                ( field-name ":" OWS ">" h1_common_structure "<" )
>                        # Self-identifying HTTP headers
>                ( field-name ":" OWS h1_common_structure ) /
>                        # legacy HTTP headers on white-list, see {{iana}}
>
>Do not mix message block ABNF with field value ABNF. Just define what's 
>inside the field value.
>
>        h1_element = identifier * (";" identifier ["=" h1_value])
>
>Shouldn't the second "identifier" be "token"?

Yes, probably.

>How would a generic recipient decide whether it needs to handle "\u"? 
>What's the point of having different ABNF productions?

Based on the definition/data-dictionary of the header in question.

Remember: this is only the data model and its h1 serialization.  For
each HTTP header, it will (still) be necessary to define what the data
is or can be.

>Also: this puts raw non-ASCII UTF-8 in the string value. It's not clear 
>that this is a good idea for HTTP/1, 

Neither is it obvious that it is going to cause any problems.

For reasons of transmission efficiency, I'm not keen on mandating
\uXXXX for all non-ASCII unicode unless experimentation on the live
network indicates that we have to, or we decide for reasons of purity
that HTTP/1 can never have the high bit set.

Either way, another good reason to keep the "safe" string type.
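
To put a rough number on the transmission-efficiency point, the Python
sketch below compares raw UTF-8 against a \uXXXX escaped form. It
assumes the escape works on UTF-16 code units, as in JSON; that is an
assumption made for the example, since the exact escape rules are not
spelled out in this message. The function name is hypothetical.

```python
def escape_non_ascii(value: str) -> str:
    """Replace every non-ASCII character with \\uXXXX escapes.

    Assumption: escapes are UTF-16 code units (JSON style), so a
    character outside the BMP becomes a surrogate pair of two escapes.
    """
    out = []
    for ch in value:
        if ord(ch) < 0x80:
            out.append(ch)
            continue
        units = ch.encode("utf-16-be")
        for i in range(0, len(units), 2):
            unit = int.from_bytes(units[i:i + 2], "big")
            out.append(f"\\u{unit:04X}")
    return "".join(out)


if __name__ == "__main__":
    for text in ["caf\u00e9", "\U0001F574"]:
        raw = len(text.encode("utf-8"))
        escaped = len(escape_non_ascii(text))
        print(f"{text!r}: {raw} bytes raw UTF-8, {escaped} bytes escaped")
```

Under that assumption, each escaped BMP character costs six bytes and
each non-BMP character costs twelve, against two to four bytes in raw
UTF-8.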

>Introduction.
>
>        h1_common_structure = ">" h1_common_structure "<"
>
>That's a bit too recursive

No, that is deliberately making recursion possible, as the simplest
possible way to define complex data structures.
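
To show why a recursive rule keeps the parser small, here is a Python
sketch of a recursive-descent parser for a toy grammar. The grammar is
a simplification invented for this example, not the draft's: a
structure is a comma-separated list of items, and an item is either a
bare word or a nested structure wrapped in ">" and "<".

```python
def parse_structure(text: str) -> list:
    """Parse the toy grammar into nested Python lists.

    Hypothetical grammar (not from the draft):
        structure = item *( "," item )
        item      = word / ">" structure "<"
    """
    pos = 0

    def parse_list() -> list:
        nonlocal pos
        items = []
        while True:
            if pos < len(text) and text[pos] == ">":
                pos += 1                      # consume ">"
                items.append(parse_list())    # recurse for the nested level
                if pos >= len(text) or text[pos] != "<":
                    raise ValueError("missing closing '<'")
                pos += 1                      # consume "<"
            else:
                start = pos
                while pos < len(text) and text[pos] not in ",<>":
                    pos += 1
                if start == pos:
                    raise ValueError(f"expected item at offset {pos}")
                items.append(text[start:pos])
            if pos < len(text) and text[pos] == ",":
                pos += 1
                continue
            return items

    result = parse_list()
    if pos != len(text):
        raise ValueError(f"unexpected {text[pos]!r} at offset {pos}")
    return result


if __name__ == "__main__":
    print(parse_structure("a,>b,c<,d"))
    print(parse_structure(">>x<<"))
```

One function that calls itself handles any nesting depth; no separate
rule is needed per level.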

>(speaking of which: "_" isn't allowed in ABNF names)

Noted.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.