RE: [Ltru] Re: Remove extlang from ABNF?

Martin Duerst <duerst@it.aoyama.ac.jp> Tue, 11 December 2007 07:50 UTC

Message-Id: <6.0.0.20.2.20071211163740.0a090850@localhost>
X-Sender: duerst@localhost
X-Mailer: QUALCOMM Windows Eudora Version 6J
Date: Tue, 11 Dec 2007 16:48:42 +0900
To: Peter Constable <petercon@microsoft.com>, LTRU Working Group <ltru@ietf.org>
From: Martin Duerst <duerst@it.aoyama.ac.jp>
Subject: RE: [Ltru] Re: Remove extlang from ABNF?
In-Reply-To: <DDB6DE6E9D27DD478AE6D1BBBB83579561E51429AA@NA-EXMSG-C117.redmond.corp.microsoft.com>
References: <E1J01vI-0003cW-Rd@megatron.ietf.org> <019601c83818$b06c3070$6601a8c0@DGBP7M81> <DDB6DE6E9D27DD478AE6D1BBBB83579561E51429AA@NA-EXMSG-C117.redmond.corp.microsoft.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

At 01:16 07/12/07, Peter Constable wrote:
>
>> From: Doug Ewell [mailto:dewell@roadrunner.com]
>
>> >> That's my whole point - the danger that specs writers might look at
>> >> the dropped extlang and say "they are dropping features between
>> >> versions of BCP 47, so we better refer to an RFC *only* and even
>> >> leave 'or its successor' out".
>> >
>> > But this is a key: we are *not* dropping any features. We are
>> > dropping the possibility of a future feature. The change to the
>> > ABNF (whether by removing the extlang subtag entirely or by
>> > renaming and/or comments) is to clean it up so that implementers
>> > do not implement for non-and-never-to-be-features.
>>
>> I think what Felix meant was not that we are dropping features, but
>> that it may appear to the outside observer that we are dropping
>> features.
>
>I understood that. But I think it's significant that we are *not* removing 
>any features, and it seems to me that could be explained easily enough.

[chairs hat OFF]

It looks like this could be explained easily enough, but I'm quite
sure that this won't be the case. XML, and XML Schema, are very
strictly defined languages. Fortunately, we managed to get XML
away from including a grammar for language tags in an early erratum/
corrigendum. But if XML Schema has indeed used RFC 4646 for defining
the syntactic range of language tags, then we should not remove
some productions. It is clear to us as well as to them that there
cannot be any valid language tags of the form ab-cde-fgh. But
these tags, in RFC 4646 and therefore in XML Schema version foo,
are well-formed. If they change to be non-well-formed in a new
version of XML Schema, that essentially means that there are
potentially some documents that conform to a schema in one
version of XML Schema, but not in a newer one. Even if these
language tags have not been valid, and don't make sense, that's
a very bad idea. As an extreme example, such a tag may by accident
be part of a document used for managing a nuclear plant. With
a software update, the document could suddenly become non-conforming,
triggering some error.
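
To make the well-formed versus valid distinction concrete, here is a
minimal sketch in Python. It is not taken from RFC 4646 or from any
XML Schema implementation; the function name langtag_pattern and the
allow_extlang flag are made up for illustration. It transcribes the
RFC 4646 langtag production into a regular expression, with the
extlang production switchable, and it ignores grandfathered and
private-use-only tags.

```python
import re

def langtag_pattern(allow_extlang):
    """Regex for the RFC 4646 'langtag' production (illustrative only).

    allow_extlang=True  keeps extlang = *3("-" 3ALPHA).
    allow_extlang=False models a grammar with that production removed.
    """
    extlang = r"(?:-[A-Za-z]{3}){0,3}" if allow_extlang else ""
    # language = (2*3ALPHA [ extlang ]) / 4ALPHA / 5*8ALPHA
    language = "(?:[A-Za-z]{2,3}" + extlang + "|[A-Za-z]{4}|[A-Za-z]{5,8})"
    script = r"(?:-[A-Za-z]{4})?"                  # 4ALPHA
    region = r"(?:-(?:[A-Za-z]{2}|[0-9]{3}))?"     # 2ALPHA / 3DIGIT
    variant = r"(?:-(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}))*"
    # singleton is any alphanum except "x"/"X"
    extension = r"(?:-[0-9A-WY-Za-wy-z](?:-[A-Za-z0-9]{2,8})+)*"
    privateuse = r"(?:-[xX](?:-[A-Za-z0-9]{1,8})+)?"
    return re.compile(
        language + script + region + variant + extension + privateuse
    )

with_extlang = langtag_pattern(True)
without_extlang = langtag_pattern(False)

for tag in ("en-US", "zh-Hant-TW", "ab-cde-fgh"):
    print(tag,
          with_extlang.fullmatch(tag) is not None,
          without_extlang.fullmatch(tag) is not None)
```

Under these assumptions, en-US and zh-Hant-TW are accepted by both
patterns, while ab-cde-fgh is accepted only when the extlang
production is present. That is exactly the kind of document that
would change from conforming to non-conforming.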

Peter, I recommend that you talk to some XML Schema folks inside
Microsoft to understand some of their thinking. I'm sure Felix
can provide some points of reference if needed.

As for implementation effort, I agree that leaving the extlang
production in increases implementation effort. But quite a bit
of it can be saved by knowing that these aren't really used anymore
(i.e. implementations only have to do the parsing for them,
they don't have to provide any additional functionality).
Also, we already have a long list of tests and some regular
expressions out there which cover extlangs, both of which can help
implementers.
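
As a rough illustration of "parsing only", the following Python sketch
splits off any extlang subtags and then does nothing further with
them. The function name split_extlang is hypothetical, and the code
assumes its input is already a well-formed RFC 4646 langtag; it is not
a validator.

```python
def split_extlang(tag):
    """Return (primary language, extlang subtags, remaining subtags).

    Assumes 'tag' is already well-formed. A 3-letter alphabetic subtag
    directly after a 2- or 3-letter primary language can only be an
    extlang, because script is 4 letters, region is 2 letters or
    3 digits, and variants are longer or start with a digit.
    """
    subtags = tag.split("-")
    primary = subtags[0]
    extlangs = []
    i = 1
    if 2 <= len(primary) <= 3:
        while (i < len(subtags) and len(extlangs) < 3
               and len(subtags[i]) == 3
               and subtags[i].isascii() and subtags[i].isalpha()):
            extlangs.append(subtags[i])
            i += 1
    return primary, extlangs, subtags[i:]

# The extlangs are recognized and set aside; no lookup or matching
# behaviour is attached to them.
primary, extlangs, rest = split_extlang("ab-cde-fgh")
print(primary, extlangs, rest)
```

The design choice here is that the caller simply ignores the second
element of the result, so the extra effort is limited to the loop
above.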

Regards,    Martin.

>The only real stability concern is that some existing parsers would 
>recognize certain hypothetical tags they might encounter as being well 
>formed when in fact they are not well formed. What possible malicious 
>effect could this have? Is this a tag in a query statement? It very likely 
>won't match any content. Is this a tag attributed to content? There likely 
>won't be any requests for it. Only in some closed system in which somebody 
>decided to use proprietary extlang subtags are both going to exist by 
>design. If the parser is encountering a tag somehow constructed such that 
>it has a not-by-design extlang, it won't be catching a condition that 
>could be considered an error, but the error will still be reflected in a 
>failure to match anything, and the essential fix must be in the process 
>*constructing* that tag rather than in the parser.
>
>
>Peter
>_______________________________________________
>Ltru mailing list
>Ltru@ietf.org
>https://www1.ietf.org/mailman/listinfo/ltru


#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp     
