Re: Unicode/UTF-8 issues (was: comments on draft-ietf-sasl-anon-00)

Alexey Melnikov <mel@messagingdirect.com> Thu, 20 February 2003 18:25 UTC

Message-ID: <3E551D7A.BFB50CE8@messagingdirect.com>
Date: Thu, 20 Feb 2003 11:24:58 -0700
From: Alexey Melnikov <mel@messagingdirect.com>
Organization: ACI WorldWide / MessagingDirect
X-Mailer: Mozilla 4.79 [en] (Windows NT 5.0; U)
X-Accept-Language: en
MIME-Version: 1.0
To: "Kurt D. Zeilenga" <Kurt@OpenLDAP.org>
CC: Philip Guenther <guenther@sendmail.com>, ietf-sasl@imc.org
Subject: Re: Unicode/UTF-8 issues (was: comments on draft-ietf-sasl-anon-00)
References: <5.2.0.9.0.20030220092854.01a0bd18@127.0.0.1>
Content-Type: text/plain; charset="koi8-r"
Content-Transfer-Encoding: 7bit
Sender: owner-ietf-sasl@mail.imc.org
Precedence: bulk
List-Archive: <http://www.imc.org/ietf-sasl/mail-archive/>
List-ID: <ietf-sasl.imc.org>
List-Unsubscribe: <mailto:ietf-sasl-request@imc.org?body=unsubscribe>

"Kurt D. Zeilenga" wrote:

> At 11:54 PM 2/19/2003, Philip Guenther wrote:
> >The syntax for UTF-8 characters in the draft permits "non-shortest form"
> >encodings
>
> I'm not exactly sure what you are referring to here.  The
> draft says that the trace information is transferred as a
> string of UTF-8 encoded Unicode characters.  A non-shortest
> form UTF-8 encoding of a Unicode character is invalid per
> RFC 2279.  I believe draft-yergeau-rfc2279bis-04.txt is
> more clear on this, so I'll change the reference.

I believe Philip was referring to the ABNF, which allows overlong UTF-8
sequences.
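
To make the concern concrete, here is a rough Python sketch. The byte
ranges in decode_permissive() are only an assumption about the general
shape of a loose grammar (a lead-byte range followed by tail bytes);
they are not copied from the draft's ABNF, and both function names are
made up for illustration. The point is that such a grammar accepts
C0 AF as a two-octet sequence that decodes to U+002F ("/"), while a
strict UTF-8 decoder rejects it as a non-shortest form.

```python
def decode_permissive(data: bytes) -> list[int]:
    """Decode by byte pattern only, as a loose grammar would accept it.

    Hypothetical helper: it checks lead/tail byte ranges but does NOT
    check that each sequence is the shortest form for its code point.
    """
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            n, cp = 0, b                  # single octet (ASCII)
        elif 0xC0 <= b <= 0xDF:
            n, cp = 1, b & 0x1F           # two-octet lead byte
        elif 0xE0 <= b <= 0xEF:
            n, cp = 2, b & 0x0F           # three-octet lead byte
        elif 0xF0 <= b <= 0xF7:
            n, cp = 3, b & 0x07           # four-octet lead byte
        else:
            raise ValueError(f"invalid lead byte 0x{b:02X}")
        tail = data[i + 1 : i + 1 + n]
        if len(tail) != n or any(t & 0xC0 != 0x80 for t in tail):
            raise ValueError("missing or malformed tail byte")
        for t in tail:
            cp = (cp << 6) | (t & 0x3F)   # append 6 payload bits per tail byte
        out.append(cp)
        i += 1 + n
    return out


def is_strict_utf8(data: bytes) -> bool:
    """True if Python's built-in strict UTF-8 decoder accepts the octets."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


overlong_slash = b"\xC0\xAF"              # overlong two-octet form of "/"
print(decode_permissive(overlong_slash))  # the loose decoder yields U+002F
print(is_strict_utf8(overlong_slash))     # the strict decoder rejects it
```

A receiver that filters on the octets and then decodes loosely can be
bypassed this way, which is why the grammar itself should exclude such
sequences rather than relying on the prose alone.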

> If, however, you mean that the string of Unicode characters
> is not normalized using an algorithm which produces the
> minimum number of code points, yes.  This is as intended.
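
For contrast, the normalization question is a separate matter from
overlong encodings: two different strings of code points can both be
perfectly valid shortest-form UTF-8. A small sketch using Python's
standard unicodedata module (the choice of NFC/NFD here is only for
illustration; the draft does not name a normalization form in the text
quoted above):

```python
import unicodedata

composed = "\u00e9"        # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"     # U+0065 followed by U+0301 COMBINING ACUTE ACCENT

# Both spellings are valid, shortest-form UTF-8; they simply differ in
# how many code points (and therefore octets) they use.
composed_octets = composed.encode("utf-8")      # 2 octets: C3 A9
decomposed_octets = decomposed.encode("utf-8")  # 3 octets: 65 CC 81

# Normalization maps between the two spellings.
nfc = unicodedata.normalize("NFC", decomposed)  # composes to one code point
nfd = unicodedata.normalize("NFD", composed)    # decomposes to two code points

print(len(nfc), len(nfd))
print(composed_octets != decomposed_octets)
```

So a string that is not normalized is still well-formed UTF-8, whereas
an overlong sequence is not.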

Regards,
Alexey
__________________________________________
R & D, ACI Worldwide/MessagingDirect
Watford, UK

Work Phone: +44 1923 81 2877
Home Page: http://orthanc.ab.ca/mel

I speak for myself only, not for my employer.
__________________________________________