Re: Unicode/UTF-8 issues (was: comments on draft-ietf-sasl-anon-00)

"Kurt D. Zeilenga" <Kurt@OpenLDAP.org> Thu, 20 February 2003 19:10 UTC

Date: Thu, 20 Feb 2003 11:09:07 -0800
To: Alexey Melnikov <mel@messagingdirect.com>
From: "Kurt D. Zeilenga" <Kurt@OpenLDAP.org>
Subject: Re: Unicode/UTF-8 issues (was: comments on draft-ietf-sasl-anon-00)
Cc: Philip Guenther <guenther@sendmail.com>, ietf-sasl@imc.org
In-Reply-To: <3E551D7A.BFB50CE8@messagingdirect.com>
References: <5.2.0.9.0.20030220092854.01a0bd18@127.0.0.1>

At 10:24 AM 2/20/2003, Alexey Melnikov wrote:
>"Kurt D. Zeilenga" wrote:
>
>> At 11:54 PM 2/19/2003, Philip Guenther wrote:
>> >The syntax for UTF-8 characters in the draft permits "non-shortest form"
>> >encodings
>>
>> I'm not exactly sure what you are referring to here.  The
>> draft says that the trace information is transferred as a
>> string of UTF-8 encoded Unicode characters.  A non-shortest
>> form UTF-8 encoding of a Unicode character is invalid per
>> RFC 2279.  I believe draft-yergeau-rfc2279bis-04.txt is
>> clearer on this, so I'll change the reference.
>
>I believe Philip was referring to the ABNF, which allows for overlong
>UTF-8 sequences.

I gather that you are saying that the ABNF is not precise enough
(please correct me if I gathered wrongly).  I can see that if
one ignored all other text and the comments within the ABNF,
one could presume that the message, in hex, 0C 30 was valid.
But if one actually ignored all the other text and the comments,
one could also presume that the invalid messages 00 (in hex),
foo"bar"@x (in ASCII), and many others were valid.

Obviously, the grammar is not intended to be taken out of
context.  It is provided as a tool for understanding the
text of the specification.
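
To illustrate the distinction being discussed, here is a rough
Python sketch.  The pattern LOOSE_UTF8 is a hypothetical grammar
written only for this example (it is not the ABNF from the draft):
it checks lead and continuation byte ranges but not shortest form.
The helper names are likewise made up.  A strict UTF-8 decoder, by
contrast, rejects non-shortest-form sequences such as C0 B0 (an
overlong encoding of U+0030).

```python
import re

# Hypothetical loose grammar, NOT the draft's actual ABNF: a lead byte
# followed by the right number of continuation bytes, with no check
# that the sequence is the shortest form for its code point.
LOOSE_UTF8 = re.compile(
    rb"(?:[\x00-\x7F]"
    rb"|[\xC0-\xDF][\x80-\xBF]"
    rb"|[\xE0-\xEF][\x80-\xBF]{2}"
    rb"|[\xF0-\xF7][\x80-\xBF]{3})*"
)


def matches_loose_grammar(data: bytes) -> bool:
    """True if every byte of data fits the loose pattern above."""
    return LOOSE_UTF8.fullmatch(data) is not None


def is_valid_utf8(data: bytes) -> bool:
    """True if Python's strict UTF-8 decoder accepts data."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


shortest = b"\x30"        # U+0030 in its one-byte (shortest) form
overlong = b"\xC0\xB0"    # same code point in a non-shortest two-byte form

for label, data in (("shortest", shortest), ("overlong", overlong)):
    print(label, matches_loose_grammar(data), is_valid_utf8(data))
```

The point of the sketch is only that a byte-range grammar taken by
itself accepts more than the UTF-8 definition does; the prose
requirement is what excludes the overlong form.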
  
Kurt