Re: On XML and $EDITORs (Re: Things that used to be clear (was ...)) "Living Documents") side meeting at IETF105.)

Keith Moore <moore@network-heretics.com> Wed, 10 July 2019 10:09 UTC

To: Nico Williams <nico@cryptonector.com>, Christian Huitema <huitema@huitema.net>
Cc: ietf@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf/fttrcUrQNkxioQ3oOA71WX6sqA4>

On 7/10/19 12:52 AM, Nico Williams wrote:
> On Tue, Jul 09, 2019 at 09:01:11PM -0700, Christian Huitema wrote:
>> On 7/9/2019 1:34 AM, Nico Williams wrote:
>>> XML with webby $EDITOR tooling would do.
>> Tooling is just one of the problems with XML2RFC. The real issue is
>> that XML2RFC is completely specific to the IETF. This translates into
>> training requirements for people who need to actually use that markup
>> language, absence of easy to use tools because the user pool is too
>> small to sustain development, and then a reliance on translators
>> between an easy-to-edit format and the publication. For example, a
>> team of authors would be using markdown and Github, and using a tool
>> chain to produce XML2RFC. But if a copy editor suggests updates to the
>> XML text, these updates cannot easily be imported to the original
>> markdown document, or to the markdown starter for the "bis" project.
> Office and LibreOffice use XML too, but users don't see it.  That's what
> I meant by "webby $EDITOR tooling" above: a bloody real UI, a browser UI.

While that's probably better than editing raw XML, I'm unfavorably
impressed with UIs for editing XML (and that includes UIs that edit HTML
and variants thereof). A possibly familiar example of what I'm talking
about: you're editing a document that is internally represented by HTML
or XML and trying to delete white space between two chunks of text that
are at different levels of hierarchy. All of a sudden you've "deleted
too much" - the visual difference between those two chunks, which
reflects the difference in the XML hierarchy, disappears. You weren't
trying to collapse the hierarchy, you were just trying to get rid of
distracting and meaningless white space. Or a similar problem - you
/want/ extra white space, say, between items in a bulleted list, and the
editor keeps trying to optimize out that white space because it sees it
as superfluous.

At first glance one might assume that the problem is the editor
implementation. But you really can't fix it in a WYSIWYG editor,
precisely because such an editor hides the underlying representation
from the user. And that means that there are circumstances in which
"delete text at the cursor" is ambiguous, or potentially means multiple
things, some of which are invisible. The designer of the editor has a
choice - does each delete remove one thing in the underlying
representation (some of which may be invisible, so it looks to the user
like the delete key is unreliable), or does each delete remove all of
the differences in markup between the preceding and following chunks of
text, or something in between? There's no good answer, especially
because the order in which those invisible things get deleted from the
underlying representation really matters and the user can't see the
order. (Of course it's not only XML-ish representations that have this
problem, but XML-ish representations exacerbate it.)
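
To make the ambiguity concrete, here is a minimal Python sketch. The
element names (doc, ul, li, p) and the two policies (merge_into_left,
unwrap_one_level) are invented for illustration and are not taken from
any real editor. Both are defensible readings of a single Delete
keypress with the cursor right after "first", yet they leave different
documents behind.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: a list item followed by a paragraph.
SRC = "<doc><ul><li>first</li></ul><p>second</p></doc>"


def merge_into_left(root):
    """Policy A (assumed): one Delete removes every markup difference,
    so the paragraph's text is pulled into the list item."""
    li = root.find("ul/li")
    p = root.find("p")
    li.text = (li.text or "") + (p.text or "")
    root.remove(p)
    return root


def unwrap_one_level(root):
    """Policy B (assumed): one Delete removes only the list wrapper,
    turning the item into a plain paragraph that stays separate."""
    ul = root.find("ul")
    li = ul.find("li")
    position = list(root).index(ul)
    root.remove(ul)
    replacement = ET.Element("p")
    replacement.text = li.text
    root.insert(position, replacement)
    return root


result_a = ET.tostring(merge_into_left(ET.fromstring(SRC)), encoding="unicode")
result_b = ET.tostring(unwrap_one_level(ET.fromstring(SRC)), encoding="unicode")
print(result_a)
print(result_b)
```

Under policy B the user would have to press Delete a second time to
join the two chunks, and nothing on screen says which policy is in
effect.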

The fundamental problem is that XML is really a poor representation of
text. This is especially true for editing, but not just for editing.
Text is not hierarchical. How do you represent in XML a comment on a
particular block of text that, say, overlaps multiple XML elements but
doesn't completely contain all of them? In a document which has been
edited by multiple users, how do you represent in XML the changes made
by each user? I'm not saying that it absolutely cannot be done, but
it's either going to be ugly or it's going to abandon many of the
properties that made XML appear to be attractive in the first place.
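
The overlap problem can be sketched in a few lines of Python. The
sample text, the character offsets, and the helper names (nests,
split_at_boundaries) are all assumptions made up for this example.
Spans are half-open (start, end) offsets into a flat string. A set of
spans can be written as one XML tree only if every pair is either
disjoint or nested; the comment below is neither with respect to the
two paragraphs, so it has to be cut into fragments (one of the "ugly"
options) or kept outside the hierarchy as plain offsets.

```python
TEXT = "Alpha beta. Gamma delta."
PARAGRAPHS = [(0, 11), (12, 24)]   # "Alpha beta." and "Gamma delta."
COMMENT = (6, 17)                  # "beta. Gamma", straddling both


def nests(a, b):
    """True if spans a and b are disjoint or one contains the other,
    i.e. they could coexist as elements of a single tree."""
    disjoint = a[1] <= b[0] or b[1] <= a[0]
    a_in_b = b[0] <= a[0] and a[1] <= b[1]
    b_in_a = a[0] <= b[0] and b[1] <= a[1]
    return disjoint or a_in_b or b_in_a


def split_at_boundaries(span, containers):
    """Cut a span into one piece per container that it intersects."""
    pieces = []
    for start, end in containers:
        lo, hi = max(start, span[0]), min(end, span[1])
        if lo < hi:
            pieces.append((lo, hi))
    return pieces


fits_in_tree = all(nests(COMMENT, p) for p in PARAGRAPHS)
fragments = split_at_boundaries(COMMENT, PARAGRAPHS)
print(fits_in_tree)
print([TEXT[lo:hi] for lo, hi in fragments])
```

Once the comment is split, something outside the tree structure (a
shared id, say) has to record that the fragments belong together.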

>> I understand why we adopted an XML format 20 years ago. That was
>> better than NROFF, and there was a hope that the whole publishing
>> industry would standardize on XML. It did not, and now the IETF has
>> its very own markup language.

In some sense, nroff really was better. Probably not better overall,
but at least nroff usually wouldn't throw up its hands and completely
refuse to render a document (within an hour of a deadline) because you
left out or misspelled some directive. And in nroff you didn't have
the UI problem. I'm not arguing for a return to nroff. XML is
definitely more powerful in some ways, and XSLT is nice. (I've written
tools to convert nroff to other representations and it was neither
easy nor fun.) But we went from one obscure and specialized text
representation to another, and the newer representation is in some ways
a poorer reflection of the text than the older one.

Anyway, if we're really going to try to improve our tools, we shouldn't
naively assume that XML is the right direction for the underlying
representation. Again, it can probably be made to work, but I suspect
only by realizing that we simply can't force everything into a
hierarchy. So the XML would not be a natural representation for the
text; it would only be a contrived representation that had to be
converted to and from a better internal representation. (And if you
don't define that internal representation and the algorithm for
conversion, each editing tool is going to do it differently, which
creates another problem - importing the text into a tool, making any
edit at all, and saving it will change the representation and likely
how the text is displayed.)
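
As a rough illustration of that last point, the Python sketch below
exports one and the same internal model (flat text plus offset spans)
in two ways. Both exporters, the element names (c, cs, ce), and the
sample data are hypothetical; neither is a format that any existing
tool is known to produce. The text contains no characters that need
XML escaping, so escaping is skipped.

```python
import xml.etree.ElementTree as ET

TEXT = "Alpha beta. Gamma delta."
PARAGRAPHS = [(0, 11), (12, 24)]   # half-open character offsets
COMMENT = (6, 17)                  # overlaps both paragraphs


def export_fragments(text, paragraphs, comment):
    """Hypothetical tool A: one <c> element per paragraph touched."""
    out = []
    for start, end in paragraphs:
        lo, hi = max(start, comment[0]), min(end, comment[1])
        if lo < hi:
            body = (text[start:lo] + '<c id="1">' + text[lo:hi]
                    + "</c>" + text[hi:end])
        else:
            body = text[start:end]
        out.append("<p>" + body + "</p>")
    return "<doc>" + "".join(out) + "</doc>"


def export_milestones(text, paragraphs, comment):
    """Hypothetical tool B: empty marker elements at the comment's ends.
    Sketch only: assumes both ends fall strictly inside a paragraph."""
    out = []
    for start, end in paragraphs:
        body = ""
        for i in range(start, end):
            if i == comment[0]:
                body += '<cs id="1"/>'
            if i == comment[1]:
                body += '<ce id="1"/>'
            body += text[i]
        out.append("<p>" + body + "</p>")
    return "<doc>" + "".join(out) + "</doc>"


xml_a = export_fragments(TEXT, PARAGRAPHS, COMMENT)
xml_b = export_milestones(TEXT, PARAGRAPHS, COMMENT)
visible_a = "".join(ET.fromstring(xml_a).itertext())
visible_b = "".join(ET.fromstring(xml_b).itertext())
print(xml_a)
print(xml_b)
```

Both outputs are well-formed and carry the same visible text, but a
document saved by one tool and reopened in the other comes back with
different markup unless the conversion is specified somewhere.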

Keith