Re: [apps-discuss] Objection to processing draft-ietf-appsawg-text-markdown-* documents as WG drafts (was: Re: Benoit Claise's Discuss on draft-ietf-appsawg-text-markdown-use-cases-02: (with DISCUSS and COMMENT))

Martin J. Dürst <duerst@it.aoyama.ac.jp> Mon, 13 July 2015 07:48 UTC

To: John C Klensin <john-ietf@jck.com>, Sean Leonard <dev+ietf@seantek.com>, The IESG <iesg@ietf.org>
References: <BC704810D276B2B3DD5EFBAE@JcK-HP8200.jck.com>
From: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
Organization: Aoyama Gakuin University
Message-ID: <55A36D59.1010101@it.aoyama.ac.jp>
Date: Mon, 13 Jul 2015 16:48:41 +0900
Archived-At: <http://mailarchive.ietf.org/arch/msg/apps-discuss/Yj5bkt7oBZJzb3WXurfT5kFeEL8>
Cc: appsawg-chairs@ietf.org, apps-discuss@ietf.org, draft-ietf-appsawg-text-markdown-use-cases.shepherd@ietf.org, draft-ietf-appsawg-text-markdown-use-cases@ietf.org, draft-ietf-appsawg-text-markdown-use-cases.ad@ietf.org, Benoit Claise <bclaise@cisco.com>

On 2015/07/13 08:20, John C Klensin wrote:

> For example, the first paragraph of Section 1.1 of
> draft-ietf-appsawg-text-markdown includes "a linear sequence of
> characters in some character set (code)".  That just isn't
> acceptable terminology.  Not only does it not conform to the
> recommendations of RFC 6365, but, in a slightly different
> environment, it would probably be read as meaning something
> entirely different from what was probably intended.

Agreed. "Character set" as used in RFC 2046 is not a term we want to use 
in 2015.


> That
> paragraph goes on to say "Because they are non-printing, these
> characters" (referring to "line breaks, page breaks, or other
> control characters) "are also hard to enter with standard
> keyboards."   At least for European writing systems, that is
> plain silly unless one has a keyboard that lacks an "Enter" or
> "Return" function or is using a _very_ strange input method
> editor (IME).

For line breaks, I'd argue that they are very easy to enter on pretty
much any keyboard, because pretty much any keyboard has an Enter key
(and because pretty much every modern language has line or paragraph
breaks).

For page breaks and other control characters, I'd argue that they are 
difficult to enter with the average keyboard, in any language.

So just remove "line breaks", and the problem should be fixed.


> The next paragraph goes on to make a suggestion
> about "overload certain characters with additional meanings". At
> least for SGML (and its descendents), that is not the way what
> happens is described.  I'd suggest it is even less true of
> LaTex, but YMMD.  What might be intended is something like
> "certain characters or character sequences are treated as
> reserved delimiters, with the strings they delimit acting as
> processing, identification, or formatting directions".

I'd agree that this isn't how SGML or LaTeX would describe it, but it's 
not actually in any way wrong. In XML, depending on context, '>' means 
itself or "close tag" (or any of a few more obscure meanings). We are 
still in the introduction, after all.
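
As a small illustration of the '>' point (a sketch only, using Python's
standard-library XML parser; the element name "a" and its content are
made up for the example): the same character closes the tags and also
appears literally in the element's text.

```python
import xml.etree.ElementTree as ET

# Three '>' characters here are tag delimiters; the one between
# "1" and "0" is ordinary character data and simply means itself.
elem = ET.fromstring("<a>1 > 0</a>")

tag, text = elem.tag, elem.text
print(tag, repr(text))
```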


> Continuing with this theme, the "charset" portion of Section 2
> of draft-ietf-appsawg-text-markdown-06 says:
>
> 	"...will get along just fine by operating on character
> 	codes that lie in printable US-ASCII, blissfully
> 	oblivious to coded values outside of that range."
>
> I don't know what that means in spite of being regularly
> mistaken for an expert in the area.  Given that you want to be
> CCS-independent (see RFC6365), I think the first part probably
> refers to "graphic characters in the ASCII repertoire", but I
> don't know what "blissfully oblivious..." is trying to tell me.
> Is it that each Markdown processor has, or assumes, a CCS and
> encoding and, if anything is encountered outside that range or
> is a non-graphic character in that range, it will be ignored?
> Noting that set of exclusions would ignore the character known
> as SP, I suggest that any such Markdown processor would be
> seriously broken.  It is more likely that the sentence is wrong.

I know what the sentence tries to refer to. It refers to the fact that,
as long as the syntax you are looking at only uses 7-bit byte values
with ASCII semantics (including the usual control characters and space),
a processor will work for a wide range of character encodings.

The problem is that this applies to some encodings, but not to others. 
The criterion is a certain kind of strong ASCII-compatibility, namely 
that characters in the ASCII range (including C0) are always represented 
directly as 7-bit bytes, and that these 7-bit bytes always represent the 
corresponding characters and nothing else.

Encodings that qualify are of course US-ASCII itself, straightforward
8-bit encodings starting with iso-8859-1 and including vendor encodings
(Windows, Mac, ...), some multibyte encodings such as GB2312 and EUC-JP,
and UTF-8 (but beware of the BOM). However, some other encodings do not
qualify, in particular iso-2022-jp, Shift_JIS, GBK, and GB18030, and of
course UTF-(16|32)(LE|BE).
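
One half of the criterion can be checked mechanically. The following is
only a rough sketch, not a complete test: the function name
leaks_ascii_bytes is made up for this example, the sample character is
my own choice, and the encoding names are those of Python's bundled
codecs. It encodes purely non-ASCII text and reports whether any
resulting byte falls in the 7-bit range, which is where a byte-oriented
processor would mistake it for an ASCII character.

```python
def leaks_ascii_bytes(text: str, encoding: str) -> bool:
    """Return True if encoding non-ASCII text yields any byte below 0x80.

    Hypothetical helper for illustration. It checks only that 7-bit
    bytes never occur inside the representation of non-ASCII characters;
    it does not check that ASCII characters map to themselves.
    """
    if any(ord(ch) < 0x80 for ch in text):
        raise ValueError("text must contain only non-ASCII characters")
    return any(byte < 0x80 for byte in text.encode(encoding))


# KATAKANA LETTER SO, chosen because its Shift_JIS trail byte is 0x5C,
# the byte that ASCII assigns to the backslash.
SAMPLE = "\u30bd"

for name in ("utf-8", "euc_jp", "shift_jis", "iso2022_jp", "utf-16-le"):
    print(name, leaks_ascii_bytes(SAMPLE, name))
```

For this sample, utf-8 and euc_jp report False, while shift_jis,
iso2022_jp, and utf-16-le report True. A single character cannot prove
that an encoding qualifies, only that it does not.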

So the text should definitely be more careful here.


Regards,   Martin.