Re: [xml2rfc-dev] xml2rfc would not be able to render RFC 7997

Hi Tom,

> On Oct 15, 2019, at 11:51 AM, Tom Pusateri <pusateri@bangj.com> wrote:
> 
> If you’re going to change the use of non-ASCII characters in RFCs, there have been many requests for Unicode math symbols in paragraph text.
> 
> I feel like someone is going to shoot me for saying this, but really, it’s 2019. We should be able to write ≤ instead of <=.

Sure. I don’t mention math symbols explicitly in RFC 7997; do you think they need separate handling from other non-ASCII characters? Or does RFC 7997 need a special section just on math?

-Heather

> 
> Tom
> 
>> On Oct 15, 2019, at 2:35 PM, Heather Flanagan <rse@rfc-editor.org> wrote:
>> 
>>> On Oct 14, 2019, at 11:58 PM, Julian Reschke <julian.reschke@gmx.de> wrote:
>>> 
>>> So,
>>> 
>>> RFC 7997 is "The Use of Non-ASCII Characters in RFCs". In
>>> <https://www.greenbytes.de/tech/webdav/rfc7997.html#rfc.section.3.2> it
>>> says:
>>> 
>>>> Example Acknowledgements section:
>>>> 
>>>> OLD:
>>>> 
>>>> The following people contributed significant text to early versions of this draft: Patrik Faltstrom, William Chan, and Fred Baker.
>>>> 
>>>> PROPOSED/NEW:
>>>> 
>>>> The following people contributed significant text to early versions of this draft: Patrik Fältström (Faltstrom), 陈智昌 (William Chan), and Fred Baker.
>>> 
>>> However,
>>> <https://tools.ietf.org/html/draft-levkowetz-xml2rfc-v3-implementation-notes-09#appendix-A.1>
>>> states:
>>> 
>>>> A.1.  <u>
>>>> 
>>>>  In xml2rfc vocabulary version 3, the elements <author>,
>>>>  <organisation>, <street>, <city>, <region>, <code>, <country>,
>>>>  <postalLine>, <email>, <seriesInfo>, and <title> may contain non-
>>>>  ascii characters for the purpose of rendering author names,
>>>>  addresses, and reference titles correctly.  They also have an
>>>>  additional "ascii" attribute for the purpose of proper rendering in
>>>>  ascii-only media.
>>>> 
>>>>  In order to insert Unicode characters in any other context, xml2rfc
>>>>  vocabulary v3 requires that the Unicode string be enclosed within an
>>>>  <u> element.  The element will be expanded inline based on the value
>>>>  of a "format" attribute.  This provides a generalised means of
>>>>  generating the 6 methods of Unicode renderings listed in [RFC7997],
>>>>  Section 3.4, and also several others found in for instance the RFC
>>>>  Format Tools example rendering of RFC 7700, at
>>>>  https://rfc-format.github.io/draft-iab-rfc-css-bis/sample2-v2.html.
>>>> 
>>>>  The "format" attribute accepts either a simplified format
>>>>  specification, or a full format string with placeholders for the
>>>>  various possible Unicode expansions.
>>>> 
>>>> A.1.1.  Expansion of simplified <u> format specifications
>>>> 
>>>>  The simplified format consists of dash-separated keywords, where each
>>>>  keyword represents a possible expansion of the Unicode character or
>>>>  string; use for example "<u "lit-num-name">foo</u>" to expand the
>>>>  text to its literal value, code point values, and code point names.
>>>> 
>>>>  A combination of up to 3 of the following keywords may be used,
>>>>  separated by dashes: "num", "lit", "name", "ascii", "char".  The
>>>>  keywords are expanded as follows and combined, with the second and
>>>>  third enclosed in parentheses (if present):
>>>> 
>>>>     "num"    The numeric value(s) of the element text, in U+1234
>>>>              notation
>>>> 
>>>>     "name"   The Unicode name(s) of the element text
>>>> 
>>>>     "lit"    The literal element text, enclosed in quotes
>>>> 
>>>>     "char"   The literal element text, without quotes
>>>> 
>>>>     "ascii"  The value of the 'ascii' attribute on the <u> element
>>>> 
>>>>  In order to ensure that no specification mistakes can result for
>>>>  rendering methods that cannot render all Unicode code points, "num"
>>>>  MUST always be part of the specified format.
>>>> 
>>>>  The default value of the "format" attribute is "lit-name-num".
>>> 
>>> So, unless I'm missing something, the only way to get non-ASCII
>>> characters into regular prose is using <u>, and using <u> implies
>>> automatic expansion of characters to numerical representations of the
>>> codepoints.
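
To make the concern above concrete, here is a minimal Python sketch of the simplified "format" expansion as I read the quoted A.1.1 text. It is not xml2rfc's actual code. The function name expand_u and the ascii_value parameter are made up for illustration. The quoted text also leaves two points open, so I am assuming them here: the second and third keywords share one parenthesized group separated by a comma, and multi-character strings list their code points separated by spaces.

```python
import unicodedata

# Keywords listed in the quoted A.1.1 text.
ALLOWED_KEYWORDS = ("num", "lit", "name", "ascii", "char")


def expand_u(text, fmt="lit-name-num", ascii_value=None):
    """Expand the text of a <u> element using a simplified format string.

    Hypothetical helper for illustration only; not part of xml2rfc.
    """
    keys = fmt.split("-")
    if not 1 <= len(keys) <= 3:
        raise ValueError("format must combine one to three keywords")
    for key in keys:
        if key not in ALLOWED_KEYWORDS:
            raise ValueError("unknown keyword: " + key)
    # The quoted text says "num" MUST always be part of the format.
    if "num" not in keys:
        raise ValueError('"num" must be part of the format')
    if "ascii" in keys and ascii_value is None:
        raise ValueError('"ascii" keyword needs an ascii attribute value')

    expansions = {
        # Assumption: code points of a multi-character string are space-separated.
        "num": " ".join("U+%04X" % ord(ch) for ch in text),
        # Assumption: names of a multi-character string are comma-separated.
        "name": ", ".join(unicodedata.name(ch, "<unnamed>") for ch in text),
        "lit": '"%s"' % text,
        "char": text,
        "ascii": ascii_value,
    }

    result = expansions[keys[0]]
    if len(keys) > 1:
        # Assumption: second and third keywords share one parenthesized group.
        result += " (" + ", ".join(expansions[k] for k in keys[1:]) + ")"
    return result


print(expand_u("≤"))              # "≤" (LESS-THAN OR EQUAL TO, U+2264)
print(expand_u("ö", "char-num"))  # ö (U+00F6)
```

Under this reading, every accepted format emits at least the U+ code point, so a name written with <u> in an Acknowledgements section could not be rendered as the bare characters alone.
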
>>> 
>>> Possible solutions:
>>> 
>>> 1) In RFC 7997bis, remove the suggestion to allow non-ASCII names in
>>> Acknowledgements etc.
>>> 
>>> 2) Relax the requirements for <u> so that it doesn't *need* to be used
>>> in prose.
>>> 
>>> 3) Relax the requirement about output formats for <u>.
>>> 
>>> My preference would be 2) or 3).
>> 
>> I agree that 1) is not ideal - won’t go that route.
>> 
>> I like 3) over 2) because the point of <u> is to help be clear in text that might be semantically important for the spec about what characters are being used. If we just say “any prose”, I feel like that might open us up to the confusion we’re trying to avoid. Does that make sense?
>> 
>> I haven’t added <u> to the 7991bis doc. I’m currently looking at reverting <seriesInfo> as per https://github.com/rfc-format/draft-iab-xml2rfc-v3-bis/issues/7, so I’m not far away from <u>.
>> 
>> -Heather
>> 
>>> 
>>> Best regards, Julian
>>> 
>>> PS: tracked for now at
>>> <https://trac.tools.ietf.org/tools/xml2rfc/trac/ticket/416>
>>> 
>>> _______________________________________________
>>> xml2rfc-dev mailing list
>>> xml2rfc-dev@ietf.org
>>> https://www.ietf.org/mailman/listinfo/xml2rfc-dev
>> 