Re: [apps-discuss] [Technical Errata Reported] RFC6839 (4367)

Graham Klyne <gk@ninebynine.org> Sun, 17 May 2015 14:25 UTC

Message-ID: <5558A4C0.9050900@ninebynine.org>
Date: Sun, 17 May 2015 15:25:04 +0100
From: Graham Klyne <gk@ninebynine.org>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:31.0) Gecko/20100101 Thunderbird/31.3.0
MIME-Version: 1.0
To: Ned Freed <ned.freed@mrochek.com>
References: <20150515131052.8E76D180092@rfc-editor.org> <CALaySJ++ptrFqjjC=mRC9zH8ns18bermy2YAfYYLx5OtX0Zdqw@mail.gmail.com> <CAPQd5oTZZKimSWcQaLBeHmq7o-npxvL8KM3HRQPW9JQPHs_ONw@mail.gmail.com> <55562081.6070504@att.com> <CAPQd5oRws8pQo7qR6xG2E0_=4vka-ymQO8sb_gAOup5_56F11g@mail.gmail.com> <555624A6.5050505@att.com> <55578A38.2010609@ninebynine.org> <01PM1VPYNTIY0000AQ@mauve.mrochek.com>
In-Reply-To: <01PM1VPYNTIY0000AQ@mauve.mrochek.com>
Content-Type: text/plain; charset="windows-1252"; format="flowed"
Content-Transfer-Encoding: 7bit
X-Oxford-Username: zool0635
Archived-At: <http://mailarchive.ietf.org/arch/msg/apps-discuss/1zGlgKOs-KrMRGDXN--mjJ9rNbY>
Cc: Barry Leiba <barryleiba@computer.org>, "tony+sss@maillennium.att.com" <tony+sss@maillennium.att.com>, "apps-discuss@ietf.org" <apps-discuss@ietf.org>, RFC Errata System <rfc-editor@rfc-editor.org>
Subject: Re: [apps-discuss] [Technical Errata Reported] RFC6839 (4367)

On 17/05/2015 02:21, Ned Freed wrote:
>> On 15/05/2015 17:53, Tony Hansen wrote:
>> > There is one more constraint for a file written with utf-8 to use the
>> > encoding of 8bit: the lines are limited to 998 bytes in length (not
>> > counting the CRLF line terminators). See RFC 2045 for details.
>
>> I think that would be a problem for JSON - there's no way (I know of) to break
>> long text strings over multiple lines, so that restriction would rule out very
>> long string values.  So maybe "binary" is the right choice.
>
> The same can be said for XML - multimegabyte XML objects containing no line
> breaks at all are pretty common. And while you can in theory encode text()
> nodes as CDATA, introducing line breaks into XML isn't something you can just
> do and leave the semantics unchanged.
>
> It's also the case that the line terminators in an XML or JSON object may not
> always be CRLF, and thus an object may not be compatible with 7bit or 8bit on
> that basis.
>
> But the fact remains that a lot of XML and JSON objects are 8bit or even 7bit
> compatible. In fact a lot of formats based on XML or JSON are by their nature
> _always_ 8bit or 7bit compatible. And I see some value in pointing that out
> where it's appropriate to do so.
>
> I suppose we could insist on a full rundown of the issues in every
> registration. But do you really think this is a useful thing to do, rather
> than leaving it a judgement call on the part of the person constructing
> the registration?

No, I don't.

I raised the point because I've found that I can construct XML in a way that
avoids very long lines (YMMV).  But in my work with JSON I've noticed that there
is no way to break a long string value across multiple lines, even when the
string itself is a multiline value: a line break inside a JSON string has to be
escaped (as \n), so the whole value stays on one physical line.
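
To illustrate the JSON point, here is a minimal sketch using Python's standard
json module.  The sample value and the helper name rejects_raw_newline are my
own inventions for the example, not taken from any spec; the 998 figure is the
line-length limit Tony mentioned.

```python
import json

# Hypothetical value: a multiline string whose total length exceeds 998 bytes.
value = "\n".join(["x" * 400] * 3)

encoded = json.dumps({"text": value})

# The serializer escapes each newline as the two characters \n, so the
# whole string value ends up on a single physical line.
physical_lines = encoded.split("\n")
longest = max(len(line.encode("utf-8")) for line in physical_lines)


def rejects_raw_newline() -> bool:
    """Return True if a literal newline inside a JSON string fails to parse."""
    try:
        # The \n here is a real newline character inside the JSON text.
        json.loads('{"text": "line one\nline two"}')
    except json.JSONDecodeError:
        return True
    return False


if __name__ == "__main__":
    print("physical lines:", len(physical_lines))
    print("longest line exceeds 998 bytes:", longest > 998)
    print("raw newline rejected:", rejects_raw_newline())
```

Whitespace between JSON tokens can still be used to break lines, so this only
bites when a single string value is itself longer than the limit.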

Checking back... the encoding considerations for +xml in 
https://tools.ietf.org/html/rfc6839#section-4.1 call for 7bit or 8bit for UTF-8, 
and binary for UTF-16/32.  Your comment suggests some UTF-8 encoded XML would 
not satisfy the 8bit constraint on line length, and binary should at least be an 
option there.

And, as noted, https://tools.ietf.org/html/rfc7159#section-11 just says "binary" 
for JSON, which seems OK to me, but as you say 7bit or 8bit might be OK for some 
JSON data.
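
As a rough sketch of the 7bit / 8bit / binary distinction being discussed, the
following Python function picks the least permissive label that a given
serialized object would satisfy.  The function name and the exact set of checks
are my own assumptions, based on my reading of the RFC 2045 rules (998-octet
lines, CRLF-only line breaks, no NUL octets); treat it as illustrative rather
than a definitive test.

```python
MAX_LINE = 998  # octets per line, not counting the CRLF terminator


def minimal_transfer_encoding(data: bytes) -> str:
    """Return "7bit", "8bit" or "binary" for an already-serialized object.

    Hypothetical helper: the checks reflect one reading of RFC 2045.
    """
    # NUL octets are not allowed in 7bit or 8bit data.
    if b"\x00" in data:
        return "binary"

    for line in data.split(b"\r\n"):
        # Any CR or LF left after splitting on CRLF is a bare one.
        if b"\r" in line or b"\n" in line:
            return "binary"
        # Over-long lines also rule out 7bit and 8bit.
        if len(line) > MAX_LINE:
            return "binary"

    # Octets above 127 (e.g. non-ASCII UTF-8) need at least 8bit.
    if any(octet > 127 for octet in data):
        return "8bit"
    return "7bit"


if __name__ == "__main__":
    samples = {
        "short ASCII, CRLF": b'{"a": 1}\r\n',
        "non-ASCII UTF-8": '{"a": "\u00e9"}\r\n'.encode("utf-8"),
        "bare LF": b'{"a": 1}\n',
        "long string": b'"' + b"x" * 2000 + b'"',
    }
    for label, payload in samples.items():
        print(label, "->", minimal_transfer_encoding(payload))
```

A check along these lines could only be applied per object; a registration has
to state something that holds for every object of the type, which is where the
judgement call comes in.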

(I suspect that in practice these concerns don't arise as 7bit and 8bit are 
primarily email concerns, and my experience is that JSON is mostly used in HTTP 
exchanges.)

#g
--