RE: Protocol Action: 'UTF-8, a transformation format of ISO 10646' to Standard (fwd)

Rainer Gerhards <rgerhards@hq.adiscon.com> Thu, 14 August 2003 20:41 UTC

Date: Thu, 14 Aug 2003 22:41:18 +0200
From: Rainer Gerhards <rgerhards@hq.adiscon.com>
Subject: RE: Protocol Action: 'UTF-8, a transformation format of ISO 10646' to Standard (fwd)
Message-ID: <20140418112159.2560.84450.ARCHIVE@ietfa.amsl.com>

Chris and all,

I am still struggling with UTF-8 and all the syslog RFCs.

http://www.ietf.org/internet-drafts/draft-yergeau-rfc2279bis-05.txt, in
section 4, says:

"   For the convenience of implementors using ABNF, a definition of UTF-8
   in ABNF syntax is given here.

   A UTF-8 string is a sequence of octets representing a sequence of UCS
   characters. An octet sequence is valid UTF-8 only if it matches the
   following syntax, which is derived from the rules for encoding UTF-8
   and is expressed in the ABNF of [RFC2234].

   UTF8-octets = *( UTF8-char )
   UTF8-char   = UTF8-1 / UTF8-2 / UTF8-3 / UTF8-4
   UTF8-1      = %x00-7F
   UTF8-2      = %xC2-DF UTF8-tail
   UTF8-3      = %xE0 %xA0-BF UTF8-tail / %xE1-EC 2( UTF8-tail ) /
                 %xED %x80-9F UTF8-tail / %xEE-EF 2( UTF8-tail )
   UTF8-4      = %xF0 %x90-BF 2( UTF8-tail ) / %xF1-F3 3( UTF8-tail ) /
                 %xF4 %x80-8F 2( UTF8-tail )
   UTF8-tail   = %x80-BF
"

If you look at this definition, everything beyond UTF8-1 requires 8-bit
octets. All of the current RFCs/IDs describe 7-bit US-ASCII only. So I
don't see any way to use UTF-8 in the current framework.
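
To make the point concrete, here is a rough sketch of a validator that
walks the ABNF quoted above rule by rule. It is only an illustration and
not part of the draft; the function names is_valid_utf8 and is_seven_bit
are made up for this example. It shows that only the UTF8-1 rule stays
within 7 bits, while every other rule needs octets of %x80 and above.

```python
def is_seven_bit(data: bytes) -> bool:
    """True if every octet fits the UTF8-1 rule (%x00-7F), i.e. US-ASCII."""
    return all(b <= 0x7F for b in data)


def is_valid_utf8(data: bytes) -> bool:
    """Check an octet sequence against the UTF8-octets ABNF (hypothetical helper)."""
    i, n = 0, len(data)

    def tails(start: int, count: int) -> bool:
        # True if exactly `count` UTF8-tail octets (%x80-BF) follow at `start`.
        chunk = data[start:start + count]
        return len(chunk) == count and all(0x80 <= b <= 0xBF for b in chunk)

    while i < n:
        b = data[i]
        if b <= 0x7F:                       # UTF8-1 = %x00-7F
            i += 1
        elif 0xC2 <= b <= 0xDF:             # UTF8-2 = %xC2-DF UTF8-tail
            if not tails(i + 1, 1):
                return False
            i += 2
        elif 0xE0 <= b <= 0xEF:             # UTF8-3
            if i + 1 >= n:
                return False
            lo, hi = 0x80, 0xBF
            if b == 0xE0:
                lo = 0xA0                   # %xE0 %xA0-BF UTF8-tail
            elif b == 0xED:
                hi = 0x9F                   # %xED %x80-9F UTF8-tail
            if not (lo <= data[i + 1] <= hi and tails(i + 2, 1)):
                return False
            i += 3
        elif 0xF0 <= b <= 0xF4:             # UTF8-4
            if i + 1 >= n:
                return False
            lo, hi = 0x80, 0xBF
            if b == 0xF0:
                lo = 0x90                   # %xF0 %x90-BF 2( UTF8-tail )
            elif b == 0xF4:
                hi = 0x8F                   # %xF4 %x80-8F 2( UTF8-tail )
            if not (lo <= data[i + 1] <= hi and tails(i + 2, 2)):
                return False
            i += 4
        else:                               # %x80-C1 and %xF5-FF never start a character
            return False
    return True
```

With this sketch, a plain ASCII message such as b"syslog" passes both
checks, while the UTF-8 encoding of a word like "Grüße" is valid UTF-8
but fails is_seven_bit, which is exactly the conflict with a 7-bit-only
message format.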

Am I missing something?

Rainer


> -----Original Message-----
> From: Chris Lonvick [mailto:clonvick@cisco.com]
> Sent: Thursday, August 14, 2003 3:48 PM
> To: syslog-sec@employees.org
> Subject: Protocol Action: 'UTF-8, a transformation format of
> ISO 10646' to Standard (fwd)
>
>
> Since we're on the subject.
>
> Thanks,
> Chris
>
> ---------- Forwarded message ----------
> Date: Mon, 11 Aug 2003 16:17:04 -0400
> From: The IESG <iesg-secretary@ietf.org>
> To: IETF-Announce:  ;
> Cc: Internet Architecture Board <iab@iab.org>,
>      RFC Editor <rfc-editor@rfc-editor.org>
> Subject: Protocol Action: 'UTF-8,
>      a transformation format of ISO 10646' to Standard
>
> The IESG has approved the Internet-Draft 'UTF-8, a
> transformation format of ISO 10646'
> <draft-yergeau-rfc2279bis-05.txt> as a Standard. This
> document has been reviewed in the IETF but is not the product
> of an IETF Working Group. The IESG contact person is Ted Hardie.
>
> Technical Summary
>
> This document updates the specification of UTF-8,
> an encoding of the UCS which is designed to be
> compatible with many current applications and protocols.
> UTF-8 has the characteristic of preserving the full US-ASCII
> range, providing compatibility with file systems, parsers and
> other software that rely on US-ASCII values but are
> transparent to other values. This memo obsoletes and replaces
> RFC 2279.
>
>
> Working Group Summary
>
> This draft and the interoperability reports associated with
> it were discussed on the IETF-charsets@iana.org mailing list.
> Archives may be found at
> http://lists.w3.org/Archives/Public/ietf-charsets/ among other
> places.
>
>
> Protocol Quality
>
> This specification was reviewed for the IESG by Patrik Falstrom.
>
>
>
>
>
