Re: [xml2rfc-dev] xml2rfc would not be able to render RFC 7997

On 2019-10-15 20:35, Heather Flanagan wrote:
> 
> 
>> On Oct 14, 2019, at 11:58 PM, Julian Reschke <julian.reschke@gmx.de> wrote:
>> 
>> So,
>> 
>> RFC 7997 is "The Use of Non-ASCII Characters in RFCs". In
>> <https://www.greenbytes.de/tech/webdav/rfc7997.html#rfc.section.3.2> it
>> says:
>> 
>>> Example Acknowledgements section:
>>> 
>>> OLD:
>>> 
>>> The following people contributed significant text to early versions of this draft: Patrik Faltstrom, William Chan, and Fred Baker.
>>> 
>>> PROPOSED/NEW:
>>> 
>>> The following people contributed significant text to early versions of this draft: Patrik Fältström (Faltstrom), 陈智昌 (William Chan), and Fred Baker.
>> 
>> However,
>> <https://tools.ietf.org/html/draft-levkowetz-xml2rfc-v3-implementation-notes-09#appendix-A.1>
>> states:
>> 
>>> A.1.  <u>
>>> 
>>>   In xml2rfc vocabulary version 3, the elements <author>,
>>>   <organisation>, <street>, <city>, <region>, <code>, <country>,
>>>   <postalLine>, <email>, <seriesInfo>, and <title> may contain non-
>>>   ascii characters for the purpose of rendering author names,
>>>   addresses, and reference titles correctly.  They also have an
>>>   additional "ascii" attribute for the purpose of proper rendering in
>>>   ascii-only media.
>>> 
>>>   In order to insert Unicode characters in any other context, xml2rfc
>>>   vocabulary v3 requires that the Unicode string be enclosed within an
>>>   <u> element.  The element will be expanded inline based on the value
>>>   of a "format" attribute.  This provides a generalised means of
>>>   generating the 6 methods of Unicode renderings listed in [RFC7997],
>>>   Section 3.4, and also several others found in for instance the RFC
>>>   Format Tools example rendering of RFC 7700, at https://rfc-
>>>   format.github.io/draft-iab-rfc-css-bis/sample2-v2.html.
>>> 
>>>   The "format" attribute accepts either a simplified format
>>>   specification, or a full format string with placeholders for the
>>>   various possible Unicode expansions.
>>> 
>>> A.1.1.  Expansion of simplified <u> format specifications
>>> 
>>>   The simplified format consists of dash-separated keywords, where each
>>>   keyword represents a possible expansion of the Unicode character or
>>>   string; use for example "<u "lit-num-name">foo</u>" to expand the
>>>   text to its literal value, code point values, and code point names.
>>> 
>>>   A combination of up to 3 of the following keywords may be used,
>>>   separated by dashes: "num", "lit", "name", "ascii", "char".  The
>>>   keywords are expanded as follows and combined, with the second and
>>>   third enclosed in parentheses (if present):
>>> 
>>>      "num"    The numeric value(s) of the element text, in U+1234
>>>               notation
>>> 
>>>      "name"   The Unicode name(s) of the element text
>>> 
>>>      "lit"    The literal element text, enclosed in quotes
>>> 
>>>      "char"   The literal element text, without quotes
>>> 
>>>      "ascii"  The value of the 'ascii' attribute on the <u> element
>>> 
>>>   In order to ensure that no specification mistakes can result for
>>>   rendering methods that cannot render all Unicode code points, "num"
>>>   MUST always be part of the specified format.
>>> 
>>>   The default value of the "format" attribute is "lit-name-num".
>> 
>> So, unless I'm missing something, the only way to get non-ASCII
>> characters into regular prose is using <u>, and using <u> implies
>> automatic expansion of characters to numerical representations of the
>> codepoints.
>> 
>> Possible solutions:
>> 
>> 1) In RFC 7997bis, remove the suggestion to allow non-ASCII names in
>> Acknowledgements etc.
>> 
>> 2) Relax the requirements for <u> so that it doesn't *need* to be used
>> in prose.
>> 
>> 3) Relax the requirement about output formats for <u>.
>> 
>> My preference would be 2) or 3).
> 
> I agree that 1) is not ideal - won’t go that route.
> 
> I like 3) over 2) because the point of <u> is to help be clear in text that might be semantically important for the spec about what characters are being used. If we just say “any prose”, I feel like that might open us up to the confusion we’re trying to avoid. Does that make sense?

The problem here is that if you relax the requirements on <u> too much,
it loses its function.  Its current function is exactly to permit
insertion of non-ASCII in prose, but only if there is an expansion that
guarantees that the resulting specification is always explicit.  If it's
possible to use <u> to insert arbitrary non-ASCII without expansion,
you're effectively back at no limitations on non-ASCII at all.
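
To make the expansion rule quoted from A.1.1 concrete, here is a minimal
Python sketch of the simplified format handling.  It is not xml2rfc's
actual code: the function name expand_u and its parameters are
hypothetical, and the way the second and third parts are wrapped (one
pair of parentheses each) is an assumption, since the quoted text only
says they are "enclosed in parentheses".

```python
import unicodedata


def expand_u(text, fmt="lit-name-num", ascii_value=None):
    """Expand a simplified <u> format (e.g. "lit-num-name") for `text`.

    Hypothetical helper for illustration only, not part of xml2rfc.
    """
    keywords = fmt.split("-")
    if not 1 <= len(keywords) <= 3:
        raise ValueError("a format combines between one and three keywords")
    # The quoted draft requires "num" so that the code points are always
    # spelled out, even where the glyph itself cannot be rendered.
    if "num" not in keywords:
        raise ValueError('"num" must be part of the format')

    parts = []
    for keyword in keywords:
        if keyword == "num":
            parts.append(" ".join(f"U+{ord(ch):04X}" for ch in text))
        elif keyword == "name":
            parts.append(", ".join(unicodedata.name(ch) for ch in text))
        elif keyword == "lit":
            parts.append(f'"{text}"')
        elif keyword == "char":
            parts.append(text)
        elif keyword == "ascii":
            if ascii_value is None:
                raise ValueError('"ascii" needs the ascii attribute value')
            parts.append(ascii_value)
        else:
            raise ValueError(f"unknown keyword: {keyword}")

    # Assumption: the first part stands alone and each later part gets
    # its own parentheses.
    return parts[0] + "".join(f" ({part})" for part in parts[1:])


print(expand_u("\u00f6", "lit-num-name"))
```

Because "num" is mandatory, there is no format in this sketch that
yields the bare character alone, which is the property the paragraph
above is describing.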

I'm very strongly against removing the restriction on <u>.  In that case
it's better to permit any unicode in prose in general, and just drop <u>.

For the specific purpose of permitting non-ascii names in acknowledgements,
I'd like to suggest that we consider approaches that build on the current
<author> entry instead.  For author, we already have well-defined handling
of ASCII and non-ASCII parts that we can build on. Some possible variations:

 * Add a role="contributor" to <author>, and automatically generate a
   contributors section.

 * Add a role="contributor" to <author>, and make it possible to use <xref>
   to pull in contributor names at selected points in prose.

 * Add a role="contributor" to <author>, and add a new <aref> element that
   lets you reference (insert names from) such entries in prose.

 * Permit insertion of <author> entries in prose directly.
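
As a rough illustration of the first variation, the Python sketch below
collects names from <author> entries marked role="contributor".  The
role value "contributor" is only the proposal made in this message, not
existing vocabulary; the function name contributor_names, the sample
XML, and the "Example Editor" entry are made up for the example.  The
fullname and asciiFullname attributes are the ones <author> already
defines in the v3 vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: role="contributor" is the proposed value, not
# something the current vocabulary defines.
SAMPLE = """<front>
  <author fullname="Patrik Fältström" asciiFullname="Patrik Faltstrom"
          role="contributor"/>
  <author fullname="陈智昌" asciiFullname="William Chan"
          role="contributor"/>
  <author fullname="Fred Baker" role="contributor"/>
  <author fullname="Example Editor" role="editor"/>
</front>"""


def contributor_names(xml_text, ascii_only=False):
    """Return display names for all <author role="contributor"> entries."""
    root = ET.fromstring(xml_text)
    names = []
    for author in root.iter("author"):
        if author.get("role") != "contributor":
            continue
        full = author.get("fullname", "")
        plain = author.get("asciiFullname")
        if ascii_only:
            # ASCII-only media: fall back to the ASCII form when given.
            names.append(plain or full)
        elif plain and plain != full:
            # Show the native form followed by the ASCII form.
            names.append(f"{full} ({plain})")
        else:
            names.append(full)
    return names


print(", ".join(contributor_names(SAMPLE)))
print(", ".join(contributor_names(SAMPLE, ascii_only=True)))
```

The point of the sketch is that the ASCII fallback comes from the
existing <author> attributes, so no <u> expansion is involved.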

Regards,

	Henrik

> I haven’t added <u> to the 7991bis doc. I’m currently looking at reverting <seriesInfo> as per https://github.com/rfc-format/draft-iab-xml2rfc-v3-bis/issues/7, so I’m not far away from <u>. 
> 
> -Heather
> 
>> 
>> Best regards, Julian
>> 
>> PS: tracked for now at
>> <https://trac.tools.ietf.org/tools/xml2rfc/trac/ticket/416>
>> 
>> _______________________________________________
>> xml2rfc-dev mailing list
>> xml2rfc-dev@ietf.org
>> https://www.ietf.org/mailman/listinfo/xml2rfc-dev
> 
