RE: Protocol Action: 'UTF-8, a transformation format of ISO 10646' to Standard (fwd)

Glen Zorn <gwz@cisco.com> Fri, 15 August 2003 06:22 UTC

Date: Thu, 14 Aug 2003 23:22:01 -0700
From: Glen Zorn <gwz@cisco.com>
Subject: RE: Protocol Action: 'UTF-8, a transformation format of ISO 10646' to Standard (fwd)
Message-ID: <20140418112159.2560.80806.ARCHIVE@ietfa.amsl.com>

owner-syslog-sec@employees.org writes:

> Chris and all,
>
> I am still struggling with UTF-8 & ALL syslog RFCs.
>
> http://www.ietf.org/internet-drafts/draft-yergeau-rfc2279bis-05.txt,
> in 4. says:
>
> "   For the convenience of implementors using ABNF, a definition of UTF-8
>    in ABNF syntax is given here.
>
>    A UTF-8 string is a sequence of octets representing a sequence of
>    UCS characters. An octet sequence is valid UTF-8 only if it
>    matches the following syntax, which is derived from the rules for
>    encoding UTF-8 and is expressed in the ABNF of [RFC2234].
>
>    UTF8-octets = *( UTF8-char )
>    UTF8-char   = UTF8-1 / UTF8-2 / UTF8-3 / UTF8-4
>    UTF8-1      = %x00-7F
>    UTF8-2      = %xC2-DF UTF8-tail
>    UTF8-3      = %xE0 %xA0-BF UTF8-tail / %xE1-EC 2( UTF8-tail ) /
>                  %xED %x80-9F UTF8-tail / %xEE-EF 2( UTF8-tail )
>    UTF8-4      = %xF0 %x90-BF 2( UTF8-tail ) / %xF1-F3 3( UTF8-tail )
>                  / %xF4 %x80-8F 2( UTF8-tail )
>    UTF8-tail   = %x80-BF
> "
>
> If you look at this definition, 8 bit characters are required.

Maybe I'm reading it wrong, but UTF8-1 appears to be 7 bits.
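
A minimal Python sketch of the quoted ABNF illustrates the point: any
octet in %x00-7F matches UTF8-1 by itself, so a pure 7-bit US-ASCII
message is already valid UTF-8. The function names and the sample syslog
line below are hypothetical, chosen only for illustration; this is one
possible transcription of the grammar, not code from the draft.

```python
def utf8_char_length(data: bytes, i: int) -> int:
    """Length of the UTF8-char starting at data[i], or 0 if no rule matches."""

    def tails(k: int, n: int) -> bool:
        # True if n UTF8-tail octets (%x80-BF) are present starting at index k.
        return k + n <= len(data) and all(0x80 <= b <= 0xBF for b in data[k:k + n])

    def second_in(lo: int, hi: int) -> bool:
        # True if the second octet exists and lies in the restricted range lo..hi.
        return i + 1 < len(data) and lo <= data[i + 1] <= hi

    b0 = data[i]
    if b0 <= 0x7F:                                   # UTF8-1 = %x00-7F
        return 1
    if 0xC2 <= b0 <= 0xDF:                           # UTF8-2
        return 2 if tails(i + 1, 1) else 0
    if b0 == 0xE0:                                   # UTF8-3, first alternative
        return 3 if second_in(0xA0, 0xBF) and tails(i + 2, 1) else 0
    if 0xE1 <= b0 <= 0xEC or 0xEE <= b0 <= 0xEF:     # UTF8-3, two plain tails
        return 3 if tails(i + 1, 2) else 0
    if b0 == 0xED:                                   # UTF8-3, excludes surrogates
        return 3 if second_in(0x80, 0x9F) and tails(i + 2, 1) else 0
    if b0 == 0xF0:                                   # UTF8-4, first alternative
        return 4 if second_in(0x90, 0xBF) and tails(i + 2, 2) else 0
    if 0xF1 <= b0 <= 0xF3:                           # UTF8-4, three plain tails
        return 4 if tails(i + 1, 3) else 0
    if b0 == 0xF4:                                   # UTF8-4, capped at U+10FFFF
        return 4 if second_in(0x80, 0x8F) and tails(i + 2, 2) else 0
    return 0                                         # %x80-C1 and %xF5-FF never start a char


def is_valid_utf8(data: bytes) -> bool:
    """UTF8-octets = *( UTF8-char ): consume characters until the input is exhausted."""
    i = 0
    while i < len(data):
        n = utf8_char_length(data, i)
        if n == 0:
            return False
        i += n
    return True


def is_seven_bit(data: bytes) -> bool:
    """True if every octet fits in 7 bits, i.e. matches UTF8-1."""
    return all(b <= 0x7F for b in data)


if __name__ == "__main__":
    # Illustrative message only; not taken from this thread.
    msg = b"<34>Oct 11 22:14:15 mymachine su: 'su root' failed"
    print(is_seven_bit(msg), is_valid_utf8(msg))
```

Only messages that carry characters outside US-ASCII need octets with
the high bit set; those are the UTF8-2, UTF8-3 and UTF8-4 cases.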

> All of the current RFCs/I-Ds describe 7 bit US-ASCII only. So I don't see any
> way to use UTF-8 in the current framework.

But RFC 3164 just _documents_ BSD syslog, it doesn't _define_ IETF
syslog; I-Ds are made to be changed.  Presumably, the purpose of this WG
is not just to document an ancient protocol, but to improve it.

>
> Am I missing something?
>
> Rainer
>
>
>> -----Original Message-----
>> From: Chris Lonvick [mailto:clonvick@cisco.com]
>> Sent: Thursday, August 14, 2003 3:48 PM
>> To: syslog-sec@employees.org
>> Subject: Protocol Action: 'UTF-8, a transformation format of ISO
>> 10646' to Standard (fwd)
>>
>>
>> Since we're on the subject.
>>
>> Thanks,
>> Chris
>>
>> ---------- Forwarded message ----------
>> Date: Mon, 11 Aug 2003 16:17:04 -0400
>> From: The IESG <iesg-secretary@ietf.org>
>> To: IETF-Announce:  ;
>> Cc: Internet Architecture Board <iab@iab.org>,
>>      RFC Editor <rfc-editor@rfc-editor.org>
>> Subject: Protocol Action: 'UTF-8,
>>      a transformation format of ISO 10646' to Standard
>>
>> The IESG has approved the Internet-Draft 'UTF-8, a transformation
>> format of ISO 10646' <draft-yergeau-rfc2279bis-05.txt> as a Standard.
>> This document has been reviewed in the IETF but is not the product
>> of an IETF Working Group. The IESG contact person is Ted Hardie.
>>
>> Technical Summary
>>
>> This document updates the specification of UTF-8,
>> an encoding of the UCS which is designed to be
>> compatible with many current applications and protocols. UTF-8 has
>> the characteristic of preserving the full US-ASCII range, providing
>> compatibility with file systems, parsers and other software that rely
>> on US-ASCII values but are transparent to other values. This memo
>> obsoletes and replaces RFC 2279.
>>
>>
>> Working Group Summary
>>
>> This draft and the interoperability reports associated with it were
>> discussed on the IETF-charsets@iana.org mailing list. Archives may be
>> found at http://lists.w3.org/Archives/Public/ietf-charsets/ among other
>> places.
>>
>>
>> Protocol Quality
>>
>> This specification was reviewed for the IESG by Patrik Falstrom.

Hope this helps,

~gwz

"They that can give up essential liberty to obtain a little temporary
safety deserve neither..."
-- Benjamin Franklin, 1759

"It is forbidden to kill; therefore all murderers are punished unless
they kill in large numbers and to the sound of trumpets."
-- Voltaire
