Re: [xml2rfc-dev] <artset> feedback

Julian Reschke <julian.reschke@gmx.de> Thu, 09 May 2019 12:11 UTC

To: Henrik Levkowetz <henrik@levkowetz.com>, XML Developer List <xml2rfc-dev@ietf.org>
References: <eb78385f-9ac0-01e8-8b4a-572d8890c1a1@greenbytes.de> <fe361119-60b1-e269-be2c-de8aa6987db9@levkowetz.com>
From: Julian Reschke <julian.reschke@gmx.de>
Message-ID: <d461160c-1a87-999d-3368-2abc797252e6@gmx.de>
Date: Thu, 09 May 2019 14:10:45 +0200
Archived-At: <https://mailarchive.ietf.org/arch/msg/xml2rfc-dev/faax8iHhGfXHUQtlY8nXlz2jEPQ>
Subject: Re: [xml2rfc-dev] <artset> feedback

On 09.05.2019 13:57, Henrik Levkowetz wrote:
> Hi Julian,
>
> On 2019-05-09 13:24, Julian Reschke wrote:
>> Hi there,
>>
>> see below for some feedback on <artset>, as currently described in
>> <https://tools.ietf.org/html/draft-levkowetz-xml2rfc-v3-implementation-notes-08#section-3.1.1>.
>>
>> I have worked on an implementation in rfc2629.xslt (which should be
>> fairly complete), and have my own set of tests at
>> <https://greenbytes.de/tech/webdav/rfc2629xslt/v3test.xml#artset>. So
>> please read this as implementer's *practical* feedback.
>>
>> In general I found this change to be very disruptive, because it affects
>> all code that deals with <artwork>, and everything related to it
>> (numbering, references, etc). A simpler approach (as proposed back then)
>> IMHO would be far better.
>>
>> That said, here are the details:
>>
>>
>> 1) The grammar allows <artset> to be empty
>>
>> This makes things really hard. How is an empty <artset> element expected
>> to be handled? Does it contribute to the counting of paragraphs? Can it
>> be target of an <xref>? These things of course could be defined, but it
>> would be much simpler to require at least one <artwork> child element.
>
> Good point.  What would be your proposed grammar change to address this?

       artset =
         element artset {
           attribute xml:base { text }?,
           attribute xml:lang { text }?,
           attribute anchor { xsd:ID }?,
           attribute pn { xsd:ID }?,
           artwork+
         }

(I think)
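
Until the grammar is tightened, a processor could flag empty <artset>
elements itself. The following is only a minimal sketch in Python, assuming
un-namespaced v3 vocabulary; the function name empty_artsets and the sample
document are made up for illustration and are not part of any existing tool.

```python
import xml.etree.ElementTree as ET


def empty_artsets(xml_text):
    """Return the anchor (or None) of every <artset> without an <artwork> child."""
    root = ET.fromstring(xml_text)
    return [
        artset.get("anchor")
        for artset in root.iter("artset")
        if artset.find("artwork") is None
    ]


# Hypothetical input: one well-formed <artset> and one empty one.
doc = """<section>
  <artset anchor="ok"><artwork type="ascii-art">+--+</artwork></artset>
  <artset anchor="bad"/>
</section>"""

print(empty_artsets(doc))
```

With "artwork+" in the schema, such a check would be redundant, because
validation would already reject the second <artset>.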

>> 2) anchor propagation and <xref>
>>
>> "The first anchor on an <artwork> element within an <artset> element
>> will be promoted to the <artset> element if it has none; apart from
>> that, anchors on <artwork> elements within an <artset> element will be
>> removed by the preptool."
>>
>> I understand that this is supposed to make the author's life easier - an
>> existing <artwork> element can be moved into a new <artset> container
>> without modification.
>>
>> However, it causes lots of edge cases, such as: what happens if I <xref>
>> an <artwork> element that does not appear in the rendered output? I
>> *assume* the intention is to say that the anchor propagation applies (1)
>> not only at the preptool stage, and (2) applies *both* to the <artwork>
>> elements and all <xref>s referencing them - but the spec would need to
>> say way more about that.
>
> Ok.
>
>> 3) Invisible content
>>
>> Having multiple <artwork> alternatives can lead to content being present
>> in the canonical XML but not appearing in the rendered output. I can
>> see that this is a problem with textual fallback content already, but
>> allowing any number of alternative <artwork> elements in the canonical
>> XML makes this a much more serious problem.
>
> I don't see this -- the essential problem is the same whether the number
> of alternatives is 2 or more than 2.  And that is built into the idea
> that the XML should be able to provide richer content for HTML than for
> text; I don't see any way around this.

If we just have "rich" and "text fallback", we can always capture that
in HTML too (<img alt="..."> etc). It might not be visible by default,
but it would at least be included.
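
To illustrate what I mean, here is a rough sketch in Python (not what any
current renderer does, as far as I know). The helper name img_with_fallback
and the file name are invented for the example; it simply carries the text
fallback along as the alt attribute of the <img> element.

```python
from html import escape


def img_with_fallback(src, fallback_text):
    """Build an <img> element that keeps the text fallback in its alt attribute."""
    return '<img src="{}" alt="{}">'.format(
        escape(src, quote=True),
        escape(fallback_text, quote=True),
    )


# Hypothetical rich artwork plus its textual fallback.
print(img_with_fallback("fig1.svg", 'A "box" <1>'))
```

The fallback text would not be shown by default, but it would still be
present in the HTML output.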

>> 4) Processing model
>>
>> "This would let the renderer pick the most appropriate <artwork>
>> instance for its format from the alternatives present within an <artset>
>> element, based on the "type" attribute of each enclosed <artwork>
>> element. If more than one <artwork> element is found within an <artset>
>> element, with the same "type" attribute, the renderer could select the
>> first one, or possibly choose between the alternative instances based on
>> the output format and some quality of the alternative instances that
>> made one more suitable than the other for that particular format, such
>> as size, aspect ratio, or whatnot."
>>
>> "Implementation:  Xml2rfc as of version 2.19.0 implements this, with a
>> preference list when rendering to HTML and PDF of ( "svg", "binary-art",
>> "ascii-art" ), while the text renderer uses the list
>>         ( "ascii-art", ) -- i.e., one entry only."
>>
>> So there is no precise processing model, due to the fact that we can't
>> predict what kind of alternative formats will come up. The description
>> of the *actual* model is specific to a certain version of one
>> implementation.
>
> The serious fuzziness here comes from
>
>    1) not prescribing whether the first or some other <artwork> should be
>       chosen when there are multiple instances with the same type.
>       For this, I propose that we codify that the first is always chosen,
>       or alternatively, make it an error to have more than one of each type.

+1 to always pick the first.
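
For concreteness, "first one wins" combined with a preference list could
look roughly like the Python sketch below. The two lists are the ones quoted
above for xml2rfc 2.19.0; the function name pick_artwork and the sample
<artset> are my own illustration, not code from xml2rfc or rfc2629.xslt.

```python
import xml.etree.ElementTree as ET

# Preference lists as quoted from the implementation notes (xml2rfc 2.19.0).
HTML_PREFS = ("svg", "binary-art", "ascii-art")
TEXT_PREFS = ("ascii-art",)


def pick_artwork(artset, prefs):
    """Return the first <artwork> of the most preferred type, or None."""
    for wanted in prefs:
        for artwork in artset.findall("artwork"):  # document order
            if artwork.get("type") == wanted:
                return artwork
    return None


# Hypothetical <artset> with two candidates of the same type.
artset = ET.fromstring(
    '<artset>'
    '<artwork type="ascii-art" name="a1"/>'
    '<artwork type="svg" name="s1"/>'
    '<artwork type="svg" name="s2"/>'
    '</artset>'
)

print(pick_artwork(artset, HTML_PREFS).get("name"))
print(pick_artwork(artset, TEXT_PREFS).get("name"))
```

Note that with this rule the second "svg" alternative can never be selected
by any renderer, which is an argument for making duplicates an error.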

>    2) having implementation-specific preference lists.  We could codify this
>       more strictly too.  Would the xml2rfc settings work for you here?
>
>> In addition, this depends on the "type" attribute which as per RFC 7991
>> is sort of advisory only (see
>> <https://greenbytes.de/tech/webdav/rfc7991.html#element.artwork.attribute.type>).
>>
>> I believe it would be better (and more author friendly) to actually
>> inspect the contents of the <artwork>, and decide based on that (that's
>> what I'm currently doing in rfc2629.xslt).
>
> We've had this discussion before, and disagree.  I maintain that an explicit
> type is better than an implicit type.  An implicit type leads to _huge_
> difficulties in specifying exactly how the inspection is done and how
> the result is interpreted.

Let me disagree. The only case relevant for RFCs right now is SVG, and
that can be detected very easily by the processor without the presence
of a type attribute.
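
As a rough illustration (a Python sketch, not the actual rfc2629.xslt code),
the check amounts to looking for an <svg> child element in the SVG
namespace. The function name contains_svg is made up for this example.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def contains_svg(artwork):
    """True if the <artwork> element has an <svg> child in the SVG namespace."""
    return any(child.tag == "{%s}svg" % SVG_NS for child in artwork)


# Hypothetical inputs: neither <artwork> carries a type attribute.
with_svg = ET.fromstring(
    '<artwork><svg xmlns="http://www.w3.org/2000/svg"/></artwork>'
)
plain_text = ET.fromstring("<artwork>+--+\n|  |\n+--+</artwork>")

print(contains_svg(with_svg), contains_svg(plain_text))
```

This only covers inline SVG; artwork pulled in through the "src" attribute
would need a separate rule.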

>> I have spent a significant amount of time to implement this, but I'd
>> still prefer to throw this all away and make a less-intrusive extension
>> instead.
>>
>> That being said, point 3) above really needs to be discussed in the
>> context of how the canonical form and the default rendering relate to
>> each other.
>
> And that's the issue that is inherent also in the RFC 7991 schema, since it
> permits 2 different artworks, only one of which will be shown in the
> rendered form.  Nothing new here.  I'm fine with discussing the issue, but
> don't make it out as something introduced with <artset>.

I believe it's much worse than before, thus I want to make sure people
are aware of it.

Best regards, Julian