Re: [rfc-i] Unicode in xml2rfc v3
Marc Petit-Huguenin <marc@petit-huguenin.org> Sun, 20 December 2020 18:27 UTC
To: Carsten Bormann <cabo@tzi.org>
References: <20201219215415.CFEBA2AE17AC@ary.qy> <53f68fa2-933f-8909-0c37-6e8e1d5e9c9b@petit-huguenin.org> <93bd1bc0-229-3914-ba71-ccaf1976f69@taugh.com> <7b588e1d-74db-4bab-6154-4e8306fd779b@petit-huguenin.org> <482B5895-0A89-43B4-9CF0-D8E6B0A5EB5B@tzi.org>
From: Marc Petit-Huguenin <marc@petit-huguenin.org>
Message-ID: <9dc7f667-dc1f-4b35-3632-874c50d915de@petit-huguenin.org>
Date: Sun, 20 Dec 2020 10:26:49 -0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.6.0
MIME-Version: 1.0
In-Reply-To: <482B5895-0A89-43B4-9CF0-D8E6B0A5EB5B@tzi.org>
Content-Language: en-US
Subject: Re: [rfc-i] Unicode in xml2rfc v3
Cc: RFC Interest <rfc-interest@rfc-editor.org>, "John R. Levine" <johnl@taugh.com>

I am not arguing for the use of formal methods (at least not here, that would be in fdt@). I am arguing that normative text can be understood as paraphrasing a formal specification (which stays invisible and in most cases nonexistent) and that concision, including using ASCII, is key for that paraphrasing. See section IV of [1].

[1] Zave P. Experiences with protocol description. In Workshop on Rigorous Protocol Engineering (W-RiPE’11) 2011 Oct.

On 12/20/20 9:47 AM, Carsten Bormann wrote:
> Hi Marc,
>
> you describe an interesting research program.
>
> There are quite a few challenges before something like that can be a reality.
> Not that we haven’t tried [1].
> Overzealous attempts at applying formal description techniques helped kill OSI [2], and I don’t want to be on the guilty side again.
>
> Until we have made more progress on this (*), most RFCs will contain large parts that need to be understood by humans, if only because of the incredible span of subject matter that RFCs cover. Any help we can get there is good, and making use of a capability that we now pretty much universally have, namely the use of beyond-ASCII characters, is highly indicated.
>
> Grüße, Carsten
>
> [1] e.g., https://doi.org/10.1016/0140-3664(80)90151-6
> [2] http://www.cs.columbia.edu/~hgs/papers/2011/Dagstuhl%2011042.pdf
> (*) Obviously, we already have XDR syntax, ABNF, YANG, CDDL, ...
>
>> On 2020-12-20, at 17:39, Marc Petit-Huguenin <marc@petit-huguenin.org> wrote:
>>
>> On 12/19/20 5:29 PM, John R Levine wrote:
>>> On Sat, 19 Dec 2020, Marc Petit-Huguenin wrote:
>>>> I care exclusively about specifications that can be implemented as interoperable programs. The minimal formulation for such specifications is a dependent type, which can always be expressed in ASCII.
>>> I entirely agree that it makes sense to write code in ASCII.
>>> But most of the contents of RFCs is not code, it's text, and we have hundreds of years of experience typesetting text. Look at any decently produced book or magazine and you will see that the character set is a lot broader than ASCII, which makes it a lot more readable.
>>
>> I am not arguing that all text produced should be in ASCII. In fact non-normative parts of a standard could use non-ASCII -- I do not read these anyway because I believe that a standard should be implementable even after being stripped of all the informative parts (abstract, introduction, overview of operations, examples, any diagram that is not complete, any list that is not exhaustive, informative references, appendices).
>>
>> I would even go further, in that everything produced by the IRTF stream should use the whole Unicode character set, so that they look more like a paper written in LaTeX, and just forgo the text version. These documents are meant to be read, not to be implemented, and there is really nothing that comes close to a nicely typeset PDF to absorb information.
>>
>> Where I draw the line is in the parts that are meant to be implemented, aka normative text. These are to be (at least virtually) translated into a dependent type (or higher order intuitionistic logic), itself then derived into a program (aka, the stuff I am paid to produce). To be able to do that translation I need unambiguous text, and with as little flourish as possible. Even before adding non-ASCII characters, my observation is that the normative text in RFCs already has way too many words, and I wish that an editor specialized in the topic (not the RFC editor) would have spent time distilling these sentences to their essence. Which in turn would make the translation explained above easier, which in turn would make the programs derived from it safer, which in turn would make the Internet work better.
>>
>> Now admitting non-ASCII for some RFCs or some parts of an RFC requires discipline so that's a dead end. One way could be to annotate each section with a normative boolean attribute and prevent <u> (and other stuff) in a normative=true section. That will not happen so as a fallback the <u> thing seems a good way for me to generate a plain text that replaces these with the CLDR short text (as suggested in a previous email) when preparing for printing. I already use a patched xml2rfc (to print RFCs with pagination), so that's not a big deal. I do not believe that the IETF Trust allows for redistribution of stripped down RFCs, so other people will not profit from that improvement.
>>

-- 
Marc Petit-Huguenin
Email: marc@petit-huguenin.org
Blog: https://marc.petit-huguenin.org
Profile: https://www.linkedin.com/in/petithug
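
[Editorial illustration, not part of the original message.] The two ideas in the quoted text (flagging <u> inside a section marked normative, and replacing <u> with a short ASCII description before printing) might be sketched roughly as below. This is only a sketch under stated assumptions: the `normative` attribute is hypothetical (the message itself says it "will not happen"), the helper names `describe`, `u_in_normative`, and `flatten_u` are invented for illustration, and Python's `unicodedata.name` is used as a stand-in for the CLDR short names, which are not in the standard library. It is not the patched xml2rfc mentioned in the message.

```python
import unicodedata
import xml.etree.ElementTree as ET


def describe(text):
    """Spell out each non-ASCII character as [UNICODE NAME].

    Assumption: Unicode character names stand in for CLDR short names here.
    """
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        else:
            # Fall back to a U+XXXX code point if the character has no name.
            out.append("[" + unicodedata.name(ch, "U+%04X" % ord(ch)) + "]")
    return "".join(out)


def u_in_normative(root):
    """Return the text of every <u> inside a section with normative="true".

    The 'normative' attribute is hypothetical; it only illustrates the check.
    """
    seen = []
    for section in root.iter("section"):
        if section.get("normative") == "true":
            for u in section.iter("u"):
                # Nested normative sections would otherwise report a <u> twice.
                if not any(u is s for s in seen):
                    seen.append(u)
    return [u.text or "" for u in seen]


def flatten_u(root):
    """Replace each <u> element in place with an ASCII description of its text."""
    for parent in list(root.iter()):
        prev = None
        for child in list(parent):
            if child.tag != "u":
                prev = child
                continue
            text = describe(child.text or "") + (child.tail or "")
            if prev is None:
                # No earlier sibling: the text belongs to the parent's own text.
                parent.text = (parent.text or "") + text
            else:
                # Otherwise it follows the previous sibling, so extend its tail.
                prev.tail = (prev.tail or "") + text
            parent.remove(child)
```

A checker could call `u_in_normative` on a parsed document and report any hits as errors, while a print-preparation step could call `flatten_u` and then serialize the tree with `ET.tostring`.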