Re: [xml2rfc-dev] xml2rfc would not be able to render RFC 7997

"Andrew G. Malis" <agmalis@gmail.com> Tue, 15 October 2019 19:30 UTC

From: "Andrew G. Malis" <agmalis@gmail.com>
Date: Tue, 15 Oct 2019 15:30:35 -0400
Message-ID: <CAA=duU0UEMPRRSjzm=K2FUsHSnntky2aNTB0Ni1tgrMZ_4SoBA@mail.gmail.com>
To: Henrik Levkowetz <henrik@levkowetz.com>
Cc: Heather Flanagan <rse@rfc-editor.org>, Julian Reschke <julian.reschke@gmx.de>, XML Developer List <xml2rfc-dev@ietf.org>
Content-Type: multipart/alternative; boundary="000000000000629d290594f805df"
Archived-At: <https://mailarchive.ietf.org/arch/msg/xml2rfc-dev/w2Yf8qeel_C-d4ocRRO_ZNPGiEs>
Subject: Re: [xml2rfc-dev] xml2rfc would not be able to render RFC 7997

Henrik,

It's not just contributors, but general acknowledgements as well, such as
the example in Julian's email that kicked off this thread. Why not just
allow non-ASCII everywhere? That solves Tom's problem as well.

Cheers,
Andy


On Tue, Oct 15, 2019 at 3:07 PM Henrik Levkowetz <henrik@levkowetz.com>
wrote:

>
> On 2019-10-15 20:35, Heather Flanagan wrote:
> >
> >
> >> On Oct 14, 2019, at 11:58 PM, Julian Reschke <julian.reschke@gmx.de> wrote:
> >>
> >> So,
> >>
> >> RFC 7997 is "The Use of Non-ASCII Characters in RFCs". In
> >> <https://www.greenbytes.de/tech/webdav/rfc7997.html#rfc.section.3.2> it
> >> says:
> >>
> >>> Example Acknowledgements section:
> >>>
> >>> OLD:
> >>>
> >>> The following people contributed significant text to early versions of
> >>> this draft: Patrik Faltstrom, William Chan, and Fred Baker.
> >>>
> >>> PROPOSED/NEW:
> >>>
> >>> The following people contributed significant text to early versions of
> >>> this draft: Patrik Fältström (Faltstrom), 陈智昌 (William Chan), and Fred
> >>> Baker.
> >>
> >> However,
> >> <https://tools.ietf.org/html/draft-levkowetz-xml2rfc-v3-implementation-notes-09#appendix-A.1>
> >> states:
> >>
> >>> A.1.  <u>
> >>>
> >>>   In xml2rfc vocabulary version 3, the elements <author>,
> >>>   <organisation>, <street>, <city>, <region>, <code>, <country>,
> >>>   <postalLine>, <email>, <seriesInfo>, and <title> may contain non-
> >>>   ascii characters for the purpose of rendering author names,
> >>>   addresses, and reference titles correctly.  They also have an
> >>>   additional "ascii" attribute for the purpose of proper rendering in
> >>>   ascii-only media.
> >>>
> >>>   In order to insert Unicode characters in any other context, xml2rfc
> >>>   vocabulary v3 requires that the Unicode string be enclosed within an
> >>>   <u> element.  The element will be expanded inline based on the value
> >>>   of a "format" attribute.  This provides a generalised means of
> >>>   generating the 6 methods of Unicode renderings listed in [RFC7997],
> >>>   Section 3.4, and also several others found in for instance the RFC
> >>>   Format Tools example rendering of RFC 7700, at
> >>>   https://rfc-format.github.io/draft-iab-rfc-css-bis/sample2-v2.html.
> >>>
> >>>   The "format" attribute accepts either a simplified format
> >>>   specification, or a full format string with placeholders for the
> >>>   various possible Unicode expansions.
> >>>
> >>> A.1.1.  Expansion of simplified <u> format specifications
> >>>
> >>>   The simplified format consists of dash-separated keywords, where each
> >>>   keyword represents a possible expansion of the Unicode character or
> >>>   string; use for example "<u format="lit-num-name">foo</u>" to expand the
> >>>   text to its literal value, code point values, and code point names.
> >>>
> >>>   A combination of up to 3 of the following keywords may be used,
> >>>   separated by dashes: "num", "lit", "name", "ascii", "char".  The
> >>>   keywords are expanded as follows and combined, with the second and
> >>>   third enclosed in parentheses (if present):
> >>>
> >>>      "num"    The numeric value(s) of the element text, in U+1234
> >>>               notation
> >>>
> >>>      "name"   The Unicode name(s) of the element text
> >>>
> >>>      "lit"    The literal element text, enclosed in quotes
> >>>
> >>>      "char"   The literal element text, without quotes
> >>>
> >>>      "ascii"  The value of the 'ascii' attribute on the <u> element
> >>>
> >>>   In order to ensure that no specification mistakes can result for
> >>>   rendering methods that cannot render all Unicode code points, "num"
> >>>   MUST always be part of the specified format.
> >>>
> >>>   The default value of the "format" attribute is "lit-name-num".
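
The expansion rules quoted above can be sketched in Python. This is only
an illustration of the quoted text, not xml2rfc's actual code: the function
name expand_u, the separators used for multi-character strings, and the
choice to wrap the second and third parts in separate parentheses are all
assumptions.

```python
import unicodedata

# Keywords listed in the quoted section A.1.1.
KEYWORDS = {"num", "lit", "name", "ascii", "char"}


def expand_u(text, fmt="lit-name-num", ascii_value=None):
    """Expand the text of a <u> element per a simplified format (hypothetical helper)."""
    keys = fmt.split("-")
    if not 1 <= len(keys) <= 3:
        raise ValueError("format must combine one to three keywords")
    unknown = [k for k in keys if k not in KEYWORDS]
    if unknown:
        raise ValueError(f"unknown keyword(s): {unknown}")
    # The quoted draft says "num" MUST always be part of the format.
    if "num" not in keys:
        raise ValueError('"num" must be part of the format')
    if "ascii" in keys and ascii_value is None:
        raise ValueError('"ascii" keyword needs an ascii attribute value')

    parts = []
    for key in keys:
        if key == "num":
            # Code points in U+1234 notation; space-separated (assumption).
            parts.append(" ".join(f"U+{ord(c):04X}" for c in text))
        elif key == "name":
            # Unicode names; comma-separated (assumption).
            parts.append(", ".join(unicodedata.name(c, "<unnamed>") for c in text))
        elif key == "lit":
            parts.append(f'"{text}"')
        elif key == "char":
            parts.append(text)
        else:  # "ascii"
            parts.append(ascii_value)

    # First part as-is; second and third in parentheses.
    return parts[0] + "".join(f" ({p})" for p in parts[1:])


print(expand_u("\u00e4", "lit-num-name"))
```

Under these assumptions a format such as "lit-name" is rejected because it
lacks "num", which is the restriction this thread is about: a name
inserted in prose through <u> cannot avoid the code point expansion.
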
> >>
> >> So, unless I'm missing something, the only way to get non-ASCII
> >> characters into regular prose is using <u>, and using <u> implies
> >> automatic expansion of characters to numerical representations of the
> >> codepoints.
> >>
> >> Possible solutions:
> >>
> >> 1) In RFC 7997bis, remove the suggestion to allow non-ASCII names in
> >> Acknowledgements etc.
> >>
> >> 2) Relax the requirements for <u> so that it doesn't *need* to be used
> >> in prose.
> >>
> >> 3) Relax the requirement about output formats for <u>.
> >>
> >> My preference would be 2) or 3).
> >
> > I agree that 1) is not ideal - won’t go that route.
> >
> > I like 3) over 2) because the point of <u> is to help be clear in text
> > that might be semantically important for the spec about what characters are
> > being used. If we just say “any prose”, I feel like that might open us up
> > to the confusion we’re trying to avoid. Does that make sense?
>
> The problem here is that if you relax the requirements on <u> too much,
> it loses its function.  Its current function is exactly to permit
> insertion of non-ASCII in prose, but only if there is an expansion that
> guarantees that the resulting specification always is explicit.  If it's
> possible to use <u> to insert arbitrary non-ascii without expansion,
> you're effectively back at no limitations on non-ascii at all.
>
> I'm very strongly against removing the restriction on <u>.  In that case
> it's better to permit any unicode in prose in general, and just drop <u>.
>
> For the specific purpose of permitting non-ascii names in acknowledgements,
> I'd like to suggest that we consider approaches that build on the current
> <author> entry instead.  For author, we already have well-defined handling
> of ASCII and non-ASCII parts that we can build on. Some possible
> variations:
>
>  * Add a role="contributor" to <author>, and automatically generate a
>    contributors section.
>
>  * Add a role="contributor" to <author>, and make it possible to use <xref>
>    to pull in contributor names at selected points in prose
>
>  * Add a role="contributor" to <author>, and add a new <aref> element that
>    lets you reference (insert names from) such entries in prose.
>
>  * Permit insertion of <author> entries in prose directly.
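
The first variation above (a generated contributors section) could look
roughly like the following Python sketch. It is not xml2rfc code:
role="contributor" is only the proposal made in this message, the helper
name contributor_names and the sample <front> fragment are invented for
illustration, and the use of the fullname/asciiFullname attributes follows
the v3 <author> element as I understand it.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: role="contributor" is the proposed, not an existing, value.
# "A. N. Author" is a placeholder for a regular document author.
SAMPLE = """<front>
  <author fullname="A. N. Author"/>
  <author role="contributor" fullname="Patrik F\u00e4ltstr\u00f6m"
          asciiFullname="Patrik Faltstrom"/>
  <author role="contributor" fullname="\u9648\u667a\u660c"
          asciiFullname="William Chan"/>
  <author role="contributor" fullname="Fred Baker"/>
</front>"""


def contributor_names(front_xml):
    """Return display names for <author role="contributor"> entries."""
    root = ET.fromstring(front_xml)
    names = []
    for author in root.iter("author"):
        if author.get("role") != "contributor":
            continue  # regular authors are rendered elsewhere
        full = author.get("fullname", "")
        ascii_full = author.get("asciiFullname")
        # Show the ASCII form in parentheses only when it differs.
        if ascii_full and ascii_full != full:
            names.append(f"{full} ({ascii_full})")
        else:
            names.append(full)
    return names


print(", ".join(contributor_names(SAMPLE)))
```

The same lookup could serve the <xref> or <aref> variations by returning
a single entry by anchor instead of the whole list.
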
>
>
> Regards,
>
>         Henrik
>
>
>
> > I haven’t added <u> to the 7991bis doc. I’m currently looking at
> > reverting <seriesInfo> as per
> > https://github.com/rfc-format/draft-iab-xml2rfc-v3-bis/issues/7, so I’m
> > not far away from <u>.
> >
> > -Heather
> >
> >>
> >> Best regards, Julian
> >>
> >> PS: tracked for now at
> >> <https://trac.tools.ietf.org/tools/xml2rfc/trac/ticket/416>
> >>
> >> _______________________________________________
> >> xml2rfc-dev mailing list
> >> xml2rfc-dev@ietf.org
> >> https://www.ietf.org/mailman/listinfo/xml2rfc-dev
> >