Re: Require guidance on Unicode in IETF formats
Lisa Dusseault <lisa@osafoundation.org> Wed, 10 January 2007 21:36 UTC
From: Lisa Dusseault <lisa@osafoundation.org>
Date: Wed, 10 Jan 2007 13:25:09 -0800
To: Stephane Bortzmeyer <bortzmeyer@nic.fr>
Cc: "Ted Hardie - App. Area Director" <hardie@qualcomm.com>, cosmogol@ietf.org
Subject: Re: Require guidance on Unicode in IETF formats
I am not sure I have any special wisdom or can point to any policy requirements on this one. Our i18n requirements for IETF standards usually focus on what gets shown to the application user, and to a lesser extent the administrative user, not on what the protocol implementor or debugger (or log file or wire trace) sees. Thus error messages need not be Unicode and need not be translated, as long as it's reasonable that a client implementation could look up an ASCII error message or the associated code and figure out what to display in the user's language. Method names and header/parameter names in protocols can definitely be ASCII, and not translated.

So who's the *user* of a State Machine Description Language (SMDL)? If I can use an SMDL to describe a bunch of states for a protocol that is properly internationalized for the final end-user of the *protocol*, that seems like the minimum that we can base on general IETF policy requirements. If there are further requirements, such as comments being internationalized, those are community requirements rather than a direct consequence of IETF/IESG policy.

I will point out, since it was brought up earlier, that ABNF is expressed in ASCII but it *can* specify protocol syntax in UTF-8 or another encoding. Do we need ABNF to be able to declare rule names with non-ASCII characters, or to allow non-ASCII characters in comments? Would we bother rewriting ABNF to make that possible?

Specific use cases may be helpful here. One use case could be "A German speaker communicating with German-speaking coworkers about their code-base needs to be able to name a state something like 'Exclusiv Verändert'". If everybody agrees to support that use case, then the SMDL needs to be able to support non-ASCII (or at least obviously encoded) state names.
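To make the "non-ASCII (or at least obviously encoded)" distinction concrete, here is a minimal Python sketch. It is only an illustration: the function names are hypothetical, the backslash-escape form is just one possible "obviously encoded" convention and not anything Cosmogol or ABNF defines, and it assumes state names contain no literal backslash.

```python
# Illustrative sketch only: these helpers are hypothetical and not part of
# any SMDL or Cosmogol draft.

def is_ascii_name(name: str) -> bool:
    """Return True if every character of the state name is ASCII."""
    return all(ord(ch) < 128 for ch in name)


def to_ascii_label(name: str) -> str:
    """Encode a state name as ASCII, replacing each non-ASCII character
    with a visible backslash escape (assumed convention)."""
    return name.encode("ascii", "backslashreplace").decode("ascii")


def from_ascii_label(label: str) -> str:
    """Reverse to_ascii_label, assuming the original name held no backslash."""
    return label.encode("ascii").decode("unicode_escape")


if __name__ == "__main__":
    state = "Exclusiv Verändert"
    print(is_ascii_name(state))    # the name is rejected by an ASCII-only rule
    label = to_ascii_label(state)
    print(label)                   # ASCII-only form that an ASCII grammar accepts
    print(from_ascii_label(label) == state)
```

An ASCII-only grammar would accept the escaped label while a tool could still display the original name, at the cost of labels that are harder to read in the source text.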
Alternatively, the consensus could be that this use case isn't necessary, either because in practice programmers use state machines and they're used to ASCII labels, or because of a decision to limit the scope of the SMDL to IETF RFCs, where labels are even more consistently in English.

Lisa

On Jan 10, 2007, at 12:30 PM, Stephane Bortzmeyer wrote:

> We require some guidance from our Area Directors about the use of
> Unicode in an IETF format. On the mailing list cosmogol@ietf.org, a
> discussion was raised on whether we should accept only ASCII in the
> language we define (our work is to define a format, not a protocol) or
> the full Unicode character set.
> (http://www1.ietf.org/mail-archive/web/cosmogol/current/msg00007.html
> and follow-ups.)
>
> Some people claimed that Unicode support was more or less mandatory at
> the IETF and that a format without it had no chance of being
> adopted. Besides, internationalization is a very good thing, anyway,
> for the world-wide Internet.
>
> Some people feared that mandating Unicode would complicate the grammar
> and would drastically reduce the number of tools available to write
> parsers for this format. They think that Cosmogol, being intended
> mostly for RFCs or other ultra-technical usages, does not have the same
> requirements as a general protocol like HTTP or NNTP.
>
> We identified the following RFCs as possibly relevant:
>
> RFC 2277 / BCP 18 IETF Policy on Character Sets and Languages
>
> RFC 2223 Instructions to RFC Authors
>
> RFC 3536 Terminology Used in Internationalization in the IETF
>
> But none seems to bring a clear answer. Is Unicode support a MUST, a
> SHOULD or a MAY in a new protocol?
>
> How many *new* IETF formats are in Unicode? (Apart from those based
> only on XML, like Atom in RFC 4287.) Old formats like ABNF do not
> count because they derive from an older format.