[rfc-i] Feedback on draft-iab-rfc-nonascii-02, allowable characters
dev+ietf at seantek.com (Sean Leonard) Wed, 31 August 2016 17:05 UTC
From: dev+ietf at seantek.com (Sean Leonard)
Date: Wed, 31 Aug 2016 10:05:51 -0700
Subject: [rfc-i] Feedback on draft-iab-rfc-nonascii-02, allowable characters
Message-ID: <c21e9705-b4a8-1d52-3d6d-a2e5749d49ed@seantek.com>
/(Part 2: questions about what characters beyond ASCII are allowed)/

**********

Hello draft-iab-rfc-nonascii-02 people, here is feedback on draft-iab-rfc-nonascii-02.

Then there is the issue of curly quotes, both in U+ syntax and in general. Are curly quotes allowed? Should they be allowed in general in non-ASCII RFCs, or replaced with straight quotes? The xml2rfc tool currently down-converts smart quotes to straight quotes in plain text, but does not upconvert straight quotes to smart quotes in HTML. This has implications for how “verbatim” text (aka literal text strings) is notated in the RFC formats.

What about marks that are currently allowed by xml2rfc, such as U+2014 EM DASH, which is converted to -- in plain text? I happen to use that character aggressively as the prose calls for it, so it would be good to know how it will show up in the plain text format, if at all.

What about other punctuation marks such as ? ? ? ? ? ? ? etc.? The whole raft of Unicode space characters such as EM QUAD, EM SPACE, etc.?

What about characters that have strong mathematical value such as × MULTIPLICATION SIGN, ÷ DIVISION SIGN, and ∑ N-ARY SUMMATION, and the whole block of mathematical operators? Such mathematical characters might be especially useful for cryptographic specifications.

And what about block elements and geometric shapes (U+2500-U+25FF) in <artwork>?

Overall the implications of this draft are that uses that are not explicitly mentioned (author names, protocol elements, addresses) are discouraged or prohibited; therefore, characters like EM DASH and BULLET that can be represented (however imperfectly) in ASCII ought to continue to be used as such. Yet the text plainly states: “To support this move away from ASCII, RFCs will switch to supporting UTF-8 as the default character encoding and allow support for a broad range of Unicode character support.”
That supports the proposition that all code points that are renderable in a modern, monospace, freely available font (i.e., Courier New) are fair game, as well as code points that modern operating systems are likely to render /or/ that would appear in author names (emoji and CJK characters, Indic scripts, Arabic scripts).

Note: Courier New 5.13 (Windows 7) includes coverage for 2852 characters and 3254 glyphs; the version with Windows 10 supports even more, I think.

Sean