[rfc-i] RFCs vs US-ASCII
DXDragon at yandex.ru ( Роман Донченко ) Wed, 06 October 2010 18:07 UTC
From: "DXDragon at yandex.ru"
Date: Wed, 06 Oct 2010 22:07:55 +0400
Subject: [rfc-i] RFCs vs US-ASCII
References: <4CAC93D7.9040502@gmx.de> <m85pa69hi286mm2ltvgfbsblds6grq8u5o@hive.bjoern.hoehrmann.de>
Message-ID: <op.vj51zhm83qy9a8@tortoise>
Bjoern Hoehrmann <derhoermi at gmx.net> wrote on Wed, 06 Oct 2010 19:40:00 +0400:

> * Julian Reschke wrote:
>> I was just made aware of
>>
>> http://www.rfc-editor.org/rfc/rfc2557.txt
>>
>> which has at least one instance of a non-ASCII character (É).
>>
>> Are there more?
>
> It would appear so,
>
> % grep -lrP "[\x80-\xff]" *
> rfc1305.txt
> rfc2166.txt
> rfc2302.txt
> rfc2497.txt
> rfc2557.txt
> rfc2708.txt
> rfc2875.txt
>
> For instance, the combination 0x9f and 0xf7 seems to be used for quote
> marks. A quick check with iconv does not suggest a particular encoding.

Here's what I was able to decipher:

rfc1305.txt: ± (U+00B1 PLUS-MINUS SIGN) and ↑ (U+2191 UPWARDS ARROW),
encoding unknown.

rfc2166.txt: “ (U+201C LEFT DOUBLE QUOTATION MARK) and ” (U+201D RIGHT
DOUBLE QUOTATION MARK), encoded with windows-1252.

rfc2302.txt: I didn't find any non-ASCII characters in this one.

rfc2497.txt: SHY (U+00AD SOFT HYPHEN), encoded with ISO-8859-1.

rfc2557.txt: É (U+00C9 LATIN CAPITAL LETTER E WITH ACUTE), encoded with
ISO-8859-1.

rfc2708.txt: Some kind of apostrophe (0xC6), encoding unknown. Used to be
’ (U+2019 RIGHT SINGLE QUOTATION MARK) in
draft-ietf-printmib-job-protomap-02.txt.

rfc2875.txt: Some kind of apostrophe (0xC6) and quotes (0xF4 and 0xF6),
encoding unknown (but the same as in rfc2708.txt). Used to be ordinary
ASCII apostrophe and quotation marks in draft-ietf-pkix-dhpop-02.txt.

Others:

rfc64.txt: µ (U+00B5 MICRO SIGN), encoded with ISO-8859-1.

rfc101.txt, rfc177.txt, rfc178.txt, rfc182.txt, rfc227.txt, rfc234.txt,
rfc235.txt, rfc243.txt, rfc270.txt, rfc282.txt, rfc288.txt, rfc290.txt,
rfc292.txt, rfc303.txt: é (U+00E9 LATIN SMALL LETTER E WITH ACUTE),
encoded with ISO-8859-1 (in the RFC Online attribution notice).

rfc237.txt, rfc306.txt, rfc307.txt, rfc310.txt, rfc313.txt, rfc315.txt,
rfc316.txt, rfc317.txt, rfc323.txt, rfc327.txt, rfc367.txt, rfc369.txt:
é (U+00E9 LATIN SMALL LETTER E WITH ACUTE) and è (U+00E8 LATIN SMALL
LETTER E WITH GRAVE), encoded with ISO-8859-1 (in the RFC Online
attribution notice).

rfc441.txt: what looks like NBSP (U+00A0 NO-BREAK SPACE), encoded with
ISO-8859-1, as well as é and è in the RFC Online attribution notice.

Hope this helps,

Roman.
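
For anyone who wants to repeat this kind of check, here is a minimal Python sketch of the approach described above: collect the distinct bytes at or above 0x80 in a file, then show what each would mean under a few candidate encodings. It is not the procedure actually used for the list above. The function names (`non_ascii_bytes`, `describe`) and the choice of candidate encodings are illustrative assumptions. A byte that decodes to a sensible character under one candidate is only a hint, not proof of the encoding.

```python
import sys
import unicodedata

# Candidate encodings to try; an assumption for illustration, taken from
# the two encodings named in the message above.
CANDIDATES = ("iso-8859-1", "windows-1252")


def non_ascii_bytes(data):
    """Return the sorted distinct byte values >= 0x80 found in data."""
    return sorted({b for b in data if b >= 0x80})


def describe(byte, encodings=CANDIDATES):
    """Map each candidate encoding to 'U+XXXX NAME' for the given byte.

    Bytes the encoding leaves undefined are reported as 'undefined';
    characters without a Unicode name (e.g. C1 controls) as '<unnamed>'.
    """
    result = {}
    for enc in encodings:
        try:
            ch = bytes([byte]).decode(enc)
        except UnicodeDecodeError:
            result[enc] = "undefined"
        else:
            name = unicodedata.name(ch, "<unnamed>")
            result[enc] = f"U+{ord(ch):04X} {name}"
    return result


if __name__ == "__main__":
    # Usage: python scan.py rfc2557.txt rfc2166.txt ...
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            data = f.read()
        for b in non_ascii_bytes(data):
            print(path, f"0x{b:02X}", describe(b))
```

A byte such as 0x93 illustrates why the candidates need comparing: under windows-1252 it is a left double quotation mark, while under ISO-8859-1 it is an unnamed C1 control character, which makes windows-1252 the more plausible reading in running text.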