[rfc-i] Feedback on Section 3.4 in draft-iab-rfc-nonascii-02, U+ syntax
dev+ietf at seantek.com (Sean Leonard) Wed, 31 August 2016 17:02 UTC
From: dev+ietf at seantek.com (Sean Leonard)
Date: Wed, 31 Aug 2016 10:02:22 -0700
Subject: [rfc-i] Feedback on Section 3.4 in draft-iab-rfc-nonascii-02,
U+ syntax
Message-ID: <ecd3d504-764e-2b6d-72bd-3343ad22660d@seantek.com>
/(Sent this to the authors, and the suggestion was that this is the
right mailing list for public discussion.)/
**********
Hello draft-iab-rfc-nonascii-02 people, here is feedback
on draft-iab-rfc-nonascii-02.
Section 3.4 of draft-iab-rfc-nonascii-02 provides no fewer than six
preferred alternatives for how to represent a single Unicode character
or code point. They all pretty much say “the ___ character (___)” in
various permutations. None of these is inherently wrong.
However, The Unicode Standard itself (9.0.0 and prior versions) provides
a specific convention in Appendix A:
“U+[x][x]xxxx NAME OF CHARACTER”
Notably, the convention does not use a “the ___ character” formulation.
Grammatically, the convention is a character, so an article is omitted.
A conforming example would be:
1. Temperature changes in the Temperature Control Protocol are
indicated by U+2206 INCREMENT.
I would like to propose that this be used as at least a priority
alternative.
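For illustration, the notation can be generated mechanically from the
character itself. This is only a minimal Python sketch: it assumes the
standard unicodedata module carries the character names of the relevant
Unicode version, and the helper name u_plus is made up for this example.

```python
import unicodedata


def u_plus(ch: str) -> str:
    """Format a single character as 'U+XXXX NAME OF CHARACTER'.

    Uses at least four uppercase hex digits, as in the U+[x][x]xxxx pattern.
    unicodedata.name() raises ValueError for code points without a name.
    """
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


print(u_plus("\u2206"))      # U+2206 INCREMENT
print(u_plus("\U0001F631"))  # U+1F631 FACE SCREAMING IN FEAR
```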
In The Unicode Standard, two other conventions are noted:
U+1F631 “😱” FACE SCREAMING IN FEAR
U+1F631 “😱”
These conventions show all-caps, and small-caps (which for PDF
presentation purposes, are actually stored as lowercase). They also show
curly quotes. I asked the Unicode mailing list over the weekend and the
general sense is that the uppercase is normative in plain text (as shown
in the UCD) but case distinctions, along with space and (nearly all)
hyphens, are not relevant for unambiguous identification.
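To make the "not relevant for identification" point concrete, here is a
rough Python sketch of comparing two spellings of a name. It is a
deliberate simplification: it drops every space, underscore and hyphen,
whereas the loose-matching rule for character names in UAX #44 keeps a
few hyphens significant (hence "nearly all" above). The function names
loose_key and same_name are hypothetical.

```python
import unicodedata


def loose_key(name: str) -> str:
    """Reduce a character name to a case- and separator-insensitive key.

    Simplified on purpose: removes ALL spaces, underscores and hyphens,
    so it does not honor the hyphen exceptions of the full rule.
    """
    return "".join(c for c in name if c not in " _-").upper()


def same_name(a: str, b: str) -> bool:
    """True if two spellings identify the same name under loose_key."""
    return loose_key(a) == loose_key(b)


# The small-caps (lowercase) and ALL-CAPS forms compare equal.
print(same_name("face screaming in fear", unicodedata.name("\U0001F631")))
```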
draft-iab-rfc-nonascii-02 is only concerned with characters, not
semantics or presentation formats (unlike xml2rfc format). Assuming that
plain text is the norm for purposes of draft-iab-rfc-nonascii-02, I
suppose that it is sufficient for the plain text to have an ALL-CAPS
name. I was going to suggest a novel xml2rfc element for Unicode code
points, such as <ucode name="yes">😱</ucode> that would be transformed
into the output above in plain text mode. However, the xml2rfc
transformer can detect such text by looking for the presence of “U+1F631
FACE SCREAMING IN FEAR”, and apply CSS to it in the HTML output instead,
viz.:
span.uniname {          /* CHAR STYLES */
  text-transform: lowercase;
  font-variant: small-caps;
  font-size: 110%;
}
As discussed here:
<http://www.unicode.org/mail-arch/unicode-ml/y2016-m08/0055.html>
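The detection step could look roughly like the following Python sketch.
This is not how xml2rfc works today; it is only an illustration of the
idea. It assumes a code point is written as U+ followed by 4 to 6
uppercase hex digits, one space, and the exact ALL-CAPS name, and it
wraps the name in the span.uniname class from the CSS above. The
function name mark_unicode_names is hypothetical.

```python
import html
import re
import unicodedata

CODE_POINT = re.compile(r"U\+([0-9A-F]{4,6})")


def mark_unicode_names(text: str) -> str:
    """Escape text for HTML, wrapping verified names in span.uniname."""
    out = []
    pos = 0
    for m in CODE_POINT.finditer(text):
        cp = int(m.group(1), 16)
        if cp > 0x10FFFF:
            continue  # not a valid code point
        # Empty default: unnamed code points are simply left alone.
        name = unicodedata.name(chr(cp), "")
        expected = " " + name
        # Only mark up the name if it really follows the code point.
        if name and text.startswith(expected, m.end()):
            out.append(html.escape(text[pos:m.end() + 1]))
            out.append(f'<span class="uniname">{html.escape(name)}</span>')
            pos = m.end() + len(expected)
    out.append(html.escape(text[pos:]))
    return "".join(out)


print(mark_unicode_names("indicated by U+2206 INCREMENT."))
```

Checking the name against the character database, rather than matching
any run of capitals, avoids swallowing unrelated uppercase words that
happen to follow a code point.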
Personally I do not see the need for quotations around the character.
U+____ SP ? SP NAME ought to be good enough: the single ? is going to
be non-ASCII anyway. However, there are implications for combining marks,
with or without quotes; this needs to be thought through. Consider:
U+0308 “◌̈” COMBINING DIAERESIS vs.
U+0308 ◌̈ COMBINING DIAERESIS vs.
U+0308 “̈” COMBINING DIAERESIS vs.
U+0308 ̈ COMBINING DIAERESIS.
See
<http://stackoverflow.com/questions/2224772/whats-the-unicode-glyph-used-to-indicate-combining-characters>
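One possible policy, sketched in Python purely as an illustration: put
U+25CC DOTTED CIRCLE in front of any character whose general category
is a mark, and leave everything else alone. Whether the draft should
adopt this (or quotes, or nothing) is exactly the open question; the
helper names display_form and describe are hypothetical.

```python
import unicodedata

DOTTED_CIRCLE = "\u25CC"  # conventional base for showing a lone mark


def display_form(ch: str) -> str:
    """Return ch, prefixed with a dotted circle if it is a mark."""
    # General categories Mn, Mc and Me all start with "M".
    if unicodedata.category(ch).startswith("M"):
        return DOTTED_CIRCLE + ch
    return ch


def describe(ch: str) -> str:
    """Format as 'U+XXXX <char> NAME' using the display form."""
    return f"U+{ord(ch):04X} {display_form(ch)} {unicodedata.name(ch)}"


print(describe("\u0308"))
print(describe("\u2206"))
```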
The question is what happens when the ? is a specific protocol element,
which frequently (but not always) is quoted, such as "+" and treated as
verbatim text <spanx style="verb"> or the new <tt> in xml2rfc v3.
Section 3.6 (and elsewhere) discusses “U+ notation” without a reference.
Appendix A of [UnicodeCurrent] is appropriate.
Sean