Re: [Cbor] MIME tag 257 vs 36

Carsten Bormann <cabo@tzi.org> Fri, 25 September 2020 06:00 UTC

From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <7D9A02D7-6D3B-4466-9682-D957BEDCCA4C@island-resort.com>
Date: Fri, 25 Sep 2020 07:59:54 +0200
Cc: Jim Schaad <ietf@augustcellars.com>, cbor@ietf.org
Message-Id: <2F089689-3E53-4DB1-81D7-60841721818E@tzi.org>
References: <77902B73-54E2-455C-88D3-D9CC62EDD84E@island-resort.com> <4271C433-0B38-4B05-AD44-01830EDBD834@tzi.org> <D4F397FE-79BF-41A5-9B14-3C2D9E7A83FA@island-resort.com> <C4D07067-D855-401F-9EA5-5F11F3896835@island-resort.com> <03e801d68f77$297a2800$7c6e7800$@augustcellars.com> <31F3B345-A983-45BA-BF8E-BEBB3881A13D@tzi.org> <7D9A02D7-6D3B-4466-9682-D957BEDCCA4C@island-resort.com>
To: Laurence Lundblade <lgl@island-resort.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/nJbe7S7ekODhWJMqLleDU76t24w>
Subject: Re: [Cbor] MIME tag 257 vs 36

Hi Laurence,

On 2020-09-25, at 07:24, Laurence Lundblade <lgl@island-resort.com> wrote:
> 
> I did some more reading:
> - CRLF is required for MIME headers, q-p, b64 CTEs (RFC 2045)
> - CRLF is required for text/xxx types (RFC 2046)
> - Unicode Format for Network Interchange requires CRLF (RFC 5198)
> - HTTP headers require CRLF

This is one of the big areas where the letter of the standards and the reality can differ a lot.  Newer versions of the standards tend to contain a larger dose of reality; e.g., RFC 7230:

   Although the line terminator for the start-line and header fields is
   the sequence CRLF, a recipient MAY recognize a single LF as a line
   terminator and ignore any preceding CR.

RFC 2616 was somewhat more weaselly:

   RFC 2046 requires that content with a type of "text" represent
   line breaks as CRLF and forbids the use of CR or LF outside of line
   break sequences. HTTP allows CRLF, bare CR, and bare LF to indicate a
   line break within text content when a message is transmitted over
   HTTP.

   Where it is possible, a proxy or gateway from HTTP to a strict MIME
   environment SHOULD translate all line breaks within the text media
   types described in section 3.7.1 of this document to the RFC 2049
   canonical form of CRLF. 

Since HTTP is what happens for most interchange, expect the strict CRLF requirement of RFC 2046 to be ignored.
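
As a minimal sketch of the tolerant behaviour that the RFC 7230 text above permits (not taken from any particular HTTP implementation; the function name `split_header_lines` is hypothetical), a recipient can split on LF and drop one preceding CR:

```python
def split_header_lines(raw):
    """Split a header block on LF, ignoring a CR directly before each LF."""
    lines = []
    for line in raw.split(b"\n"):
        if line.endswith(b"\r"):
            line = line[:-1]  # drop the single CR that preceded the LF
        lines.append(line)
    # A final line terminator leaves one empty element at the end; drop it.
    if lines and lines[-1] == b"":
        lines.pop()
    return lines


# Mixed CRLF and bare-LF terminators give the same kind of result.
header_lines = split_header_lines(b"Host: example.org\r\nAccept: */*\n")
```

With this approach, strictly CRLF-terminated input and LF-only input produce identical line lists.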

> - Git stores LF and translates to local line ending

Git is rooted in reality.
(It gained CRLF support only after some gyrations.)

> These text formats choose a canonical line ending for on-the-wire messages so they can know to correctly translate to the local text format. These days it is just CRLF for Windows and LF for Linux/MacOS,

Right.

> it used to be CR for MacOS and in theory any local representation could be used including line lengths or a database of records.

I wish people would stop talking about CR line ends; these used to exist, but they no longer do in reality (outside very weird closed environments such as certain PDF files).  RFC 7230 reflects what the reality is, today, which is no longer very tolerant of pure-CR line endings (it’s been 20 years now since those became obsolete).

> It seems clear that tag 257 and 36 do require CRLF per the MIME standard. That’s all fine and there’s nothing more to do other than reference the MIME standard.

That’s what they say.
Any implementation that doesn’t accept LF line endings is pretty much broken for the real world.
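
For illustration, Python's standard `email` parser is one example of such tolerance: it reads the same header fields whether the message uses CRLF or bare LF. The message bytes below are made up; they stand in for the content of a tag 257 byte string that is assumed to have been extracted from its CBOR wrapper already.

```python
import email

# Hypothetical MIME message, once with CRLF and once with bare LF line endings.
crlf_msg = (
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"\r\n"
    b"Hello\r\n"
)
lf_msg = crlf_msg.replace(b"\r\n", b"\n")

parsed_crlf = email.message_from_bytes(crlf_msg)
parsed_lf = email.message_from_bytes(lf_msg)

# Both variants yield the same list of (name, value) header pairs.
same_headers = parsed_crlf.items() == parsed_lf.items()
```

The body is deliberately not compared here, since the parser keeps whatever line endings the payload arrived with.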

> You all may have thought about this more than me, but it also seems there is room for the definition of some tags for major type 3 used with line-oriented text since major type 3 is silent on the line ending convention (as it should be). Maybe:
> 
>   Tag xx for IETF CRLF line endings (means translate CRLF to local; perhaps error on CR)
>   Tag yy for MNU (means translate LF or CRLF to local; perhaps error on CR)

It may not be necessary to tag something for what it already is.
But it may be useful to be able to say in CDDL what a text string should look like; that may be something that could be added to the MNU specification or to a companion CDDL extension.
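
The two translation policies in the quoted proposal could be sketched roughly as follows. This is only an illustration: the function name `to_local` and its parameters are hypothetical, and rejecting a bare LF in the strict (CRLF-only) mode is an assumption, since the proposal only mentions a possible error on CR.

```python
import os
import re

BARE_CR = re.compile(r"\r(?!\n)")   # CR not followed by LF
BARE_LF = re.compile(r"(?<!\r)\n")  # LF not preceded by CR
LINE_END = re.compile(r"\r?\n")     # CRLF or LF


def to_local(text, accept_bare_lf, local=os.linesep):
    """Translate on-the-wire line endings to the local convention.

    accept_bare_lf=False models the CRLF-only policy ("tag xx");
    accept_bare_lf=True models the LF-or-CRLF policy ("tag yy").
    """
    if BARE_CR.search(text):
        raise ValueError("bare CR in text")
    if not accept_bare_lf and BARE_LF.search(text):
        # Assumption: the strict policy treats a bare LF as an error.
        raise ValueError("bare LF where CRLF is required")
    return LINE_END.sub(lambda _match: local, text)
```

For example, `to_local("a\r\nb\nc", accept_bare_lf=True, local="\n")` translates both line endings to LF, while the same input with `accept_bare_lf=False` raises `ValueError`.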

> CBOR protocols that use line-oriented data can specify which on-the-wire line ending they use, so the tags are not absolutely required, but they may be useful. They may also slightly help CBOR protocol designers using line-oriented text realize they need to say what the line ending is.

In reality, the line ending on the Internet is [CR] LF.  There is no other place CR can be used outside playing teleprinter (RFC 5198), so most software can simply treat CR as pure noise, which it typically does.
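
Treating CR as noise can be as simple as the following sketch (the function name `lines_ignoring_cr` is hypothetical, not from any specification):

```python
def lines_ignoring_cr(text):
    """Split text into lines on LF after discarding every CR."""
    lines = text.replace("\r", "").split("\n")
    # A final line terminator leaves one empty element at the end; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    return lines


# CRLF and LF input produce the same lines.
sample = lines_ignoring_cr("a\r\nb\nc\r\n")
```

The trade-off is that text using CR alone as its line ending collapses into a single line, which matches the position above that such text no longer needs to be supported.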

Grüße, Carsten