Re: [Cbor] MIME tag 257 vs 36

Laurence Lundblade <lgl@island-resort.com> Fri, 25 September 2020 17:11 UTC

From: Laurence Lundblade <lgl@island-resort.com>
Message-Id: <03268088-A7A8-4F7A-B923-ED3334BD992A@island-resort.com>
Date: Fri, 25 Sep 2020 10:11:00 -0700
In-Reply-To: <2F089689-3E53-4DB1-81D7-60841721818E@tzi.org>
Cc: cbor@ietf.org, Jim Schaad <ietf@augustcellars.com>
To: Carsten Bormann <cabo@tzi.org>
References: <77902B73-54E2-455C-88D3-D9CC62EDD84E@island-resort.com> <4271C433-0B38-4B05-AD44-01830EDBD834@tzi.org> <D4F397FE-79BF-41A5-9B14-3C2D9E7A83FA@island-resort.com> <C4D07067-D855-401F-9EA5-5F11F3896835@island-resort.com> <03e801d68f77$297a2800$7c6e7800$@augustcellars.com> <31F3B345-A983-45BA-BF8E-BEBB3881A13D@tzi.org> <7D9A02D7-6D3B-4466-9682-D957BEDCCA4C@island-resort.com> <2F089689-3E53-4DB1-81D7-60841721818E@tzi.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/FPDDV8OJZXrFNYEA5HJKGE9aN-Q>
Subject: Re: [Cbor] MIME tag 257 vs 36

Hi Carsten,

Thanks for the detailed explanation.

Is this a good summary of the upshot?

Text lines in Internet protocols (on the wire) are delimited by either CRLF or just LF. Officially, many protocols specify CRLF, but implementations often accept either. CBOR major type 3 text strings can use either line ending, or even a mixture of both.

Operating systems usually have a line-ending convention: Windows uses CRLF; Linux and macOS use LF. Some applications on a given OS work with either, and some prefer the OS’s line-ending convention.

The majority of use cases and CBOR protocols using type 3 text will work with either line ending. However, some use cases or protocols may require one specific line ending, in which case translation to and/or from the local line-ending convention, typically that of the OS, is necessary.
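
To make that translation step concrete, here is a minimal Python sketch of what I have in mind. The function names to_wire and to_local are made up for illustration and do not come from any CBOR library; the sketch assumes the text has already been decoded from the type 3 item, and it deliberately leaves a bare CR untouched rather than guessing whether it should be an error.

```python
import os


def to_wire(text: str, crlf: bool = False) -> str:
    """Normalize any mix of CRLF and LF to one on-the-wire convention.

    Hypothetical helper: a protocol would pick crlf=True or crlf=False once.
    """
    # Collapse CRLF to LF first so a mixture of both ends up uniform.
    lf_only = text.replace("\r\n", "\n")
    return lf_only.replace("\n", "\r\n") if crlf else lf_only


def to_local(text: str, linesep: str = os.linesep) -> str:
    """Accept CRLF, LF, or a mixture, and emit the local line ending.

    Hypothetical helper: linesep defaults to the OS convention.
    A bare CR (not followed by LF) is passed through unchanged.
    """
    return text.replace("\r\n", "\n").replace("\n", linesep)


if __name__ == "__main__":
    mixed = "line one\r\nline two\nline three"
    # repr() makes the line endings visible when printed.
    print(repr(to_wire(mixed, crlf=True)))
    print(repr(to_local(mixed)))
```

A receiver that only ever calls to_local is tolerant of either line ending on the wire, which is the common case described above; to_wire is only needed when a protocol insists on one form.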

Are you proposing that CDDL be able to specify a line-ending convention for type 3 items?

LL



> On Sep 24, 2020, at 10:59 PM, Carsten Bormann <cabo@tzi.org> wrote:
> 
> Hi Laurence,
> 
> On 2020-09-25, at 07:24, Laurence Lundblade <lgl@island-resort.com> wrote:
>> 
>> I did some more reading:
>> - CRLF is required for MIME headers, q-p, b64 CTEs (RFC 2045)
>> - CRLF is required for text/xxx types (RFC 2046)
>> - Unicode Format for Network Interchange requires CRLF (RFC 5198)
>> - HTTP headers require CRLF
> 
> This is one of the big areas where the letter of the standards and the reality can differ a lot.  Newer versions of the standards tend to contain a larger dose of reality; e.g., RFC 7230:
> 
>   Although the line terminator for the start-line and header fields is
>   the sequence CRLF, a recipient MAY recognize a single LF as a line
>   terminator and ignore any preceding CR.
> 
> RFC 2616 was somewhat more weasely:
> 
>   RFC 2046 requires that content with a type of "text" represent
>   line breaks as CRLF and forbids the use of CR or LF outside of line
>   break sequences. HTTP allows CRLF, bare CR, and bare LF to indicate a
>   line break within text content when a message is transmitted over
>   HTTP.
> 
>   Where it is possible, a proxy or gateway from HTTP to a strict MIME
>   environment SHOULD translate all line breaks within the text media
>   types described in section 3.7.1 of this document to the RFC 2049
>   canonical form of CRLF. 
> 
> Since HTTP is what happens for most interchange, expect the strict CRLF requirement of RFC 2046 to be ignored.
> 
>> - Git stores LF and translates to local line ending
> 
> Git is rooted in reality.
> (It gained CRLF support only after some gyrations.)
> 
>> These text formats choose a canonical line ending for on-the-wire messages so they can know to correctly translate to the local text format. These days it is just CRLF for Windows and LF for Linux/MacOS,
> 
> Right.
> 
>> it used to be CR for MacOS and in theory any local representation could be used including line lengths or a database of records.
> 
> I wish people would stop talking about CR line ends; these used to exist, but they no longer do in reality (outside very weird closed environments such as certain PDF files).  RFC 7230 reflects what the reality is, today, which is no longer very tolerant of pure-CR line endings (it’s been 20 years now since those became obsolete).
> 
>> It seems clear that tag 257 and 36 do require CRLF per the MIME standard. That’s all fine and there’s nothing more to do other than reference the MIME standard.
> 
> That’s what they say.
> Any implementation that doesn’t accept LF line endings is pretty much broken for the real world.
> 
>> You all may have thought about this more than me, but it also seems there is room for the definition of some tags for major type 3 used with line-oriented text since major type 3 is silent on the line ending convention (as it should be). Maybe:
>> 
>>  Tag xx for IETF CRLF line endings (means translate CRLF to local; perhaps error on CR)
>>  Tag yy for MNU (means translate LF or CRLF to local; perhaps error on CR)
> 
> It may not be necessary to tag something for what it already is.
> But it may be useful to be able to say in CDDL what a text string should look like; that may be something that could be added to the MNU specification or to a companion CDDL extension.
> 
>> CBOR protocols that use line-oriented data can specify which on-the-wire line ending they use, so the tags are not absolutely required, but they may be useful. They may also slightly help CBOR protocol designers using line-oriented text realize they need to say what the line ending is.
> 
> In reality, the line ending on the Internet is [CR] LF.  There is no other place CR can be used outside playing teleprinter (RFC 5198), so most software can simply treat CR as pure noise, which it typically does.
> 
> Grüße, Carsten
> 