Re: On XML and $EDITORs (Re: Things that used to be clear (was ...)) "Living Documents") side meeting at IETF105.)

Nico Williams <nico@cryptonector.com> Wed, 10 July 2019 16:56 UTC

Date: Wed, 10 Jul 2019 11:55:03 -0500
From: Nico Williams <nico@cryptonector.com>
To: Keith Moore <moore@network-heretics.com>
Cc: Christian Huitema <huitema@huitema.net>, ietf@ietf.org
Subject: Re: On XML and $EDITORs (Re: Things that used to be clear (was ...)) "Living Documents") side meeting at IETF105.)
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf/3QwsHiogO65i7RbGynAjVjeneE4>

On Wed, Jul 10, 2019 at 06:09:00AM -0400, Keith Moore wrote:
> > Office and LibreOffice use XML too, but users don't see it.  That's
> > what I meant by "webby $EDITOR tooling" above: a bloody real UI, a
> > browser UI.
> 
> While that's probably better than editing raw XML, I'm unfavorably
> impressed with UIs for editing XML (and that includes UIs that edit
> HTML and variants thereof).  [...]

When I use LyX to write I-Ds, I never see either the raw .lyx or the raw
.xml that is produced by lyx2rfc.  I do submit the XML lyx2rfc produces,
naturally, but I don't look at it.  Instead I review the rendered
output.

That's what a good $EDITOR for I-Ds and RFCs should be like to use.

>                      [...].   A possibly familiar example of what I'm
> talking about: you're editing a document that is internally
> represented by HTML or XML and trying to delete white space between
> two chunks of text that are at different levels of hierarchy.   All of
> a sudden you've "deleted too much" - the visual difference between
> those two chunks, that reflects the difference in the XML hierarchy,
> disappears.   You weren't trying to collapse the hierarchy, you were
> just trying to get rid of distracting and meaningless white space.  

I've never had this problem with LyX.  That's because it's NOT a WYSIWYG
editor but a WYSIWYM editor, and among other things it does not let you
add arbitrary amounts of whitespace accidentally.

Maybe I should make it a point to attend a meeting and hackathon just to
showcase LyX and lyx2rfc live, so y'all believe me...  There are
screenshots at https://github.com/nicowilliams/lyx2rfc and you can find
videos of people using LyX with searches like: https://www.youtube.com/results?search_query=LyX

> Or a similar problem - you /want/ extra white space, say, between
> items in a bulleted list, and the editor keeps trying to optimize out
> that white space because it sees it as superfluous.

LyX keeps you from adding unnecessary whitespace accidentally, but will
let you add it if you really want to (IIRC it's <ctrl><space>).

> At first glance one might assume that the problem is the editor
> implementation.   But you really can't fix it in a WYSIWYG editor
> [...]

That's right, you cannot fix this in a WYSIWYG editor.  WYSIWYG is not
the right tool for this job.

Good thing that LyX is a WYSIWYM editor.  Take a look at screenshots or
videos, or https://www.lyx.org/ and https://en.wikipedia.org/wiki/WYSIWYM...

>               [...]. (of course it's not only XML-ish representations
> that have this problem, but XML-ish representations exacerbate it).
> 
> The fundamental problem is that XML is really a poor representation of
> text.  This is especially true for editing, but not just for editing.
>   Text is not hierarchical.   How do you represent in XML a comment on

LyX does a pretty good job of exposing hierarchy without making the user
deal with TeX/LaTeX/XML/any other internal representation.  It can be
done, because it has been done.

> a particular block of text that, say, overlaps multiple XML elements
> but doesn't completely contain all of them?   [...]

Not sure how to do that in *any* document representation format.  In XML
you could encode the relations in element attributes, but while text can
be hierarchical, it's not _relational_ in a semantic way, so you can't
keep users from screwing up any such relations you choose to try to
represent by moving text around.  Still, I also don't understand why you
couldn't make a comment contain the text it comments on for this
particular use case, or why this particular use case is so important.
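
A minimal sketch of the attribute idea, in Python.  The <t> and <comment>
markup and the start/end/offset attribute names are invented for
illustration and not taken from any real schema.  It resolves a comment
that spans two paragraphs, then shows the relation dangling once a user
deletes one of them:

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: these element and attribute names are assumptions
# made up for this sketch, not part of any real document schema.
DOC = """<section>
<t id="p1">First paragraph of the draft.</t>
<t id="p2">Second paragraph of the draft.</t>
<comment start="p1" start-offset="6" end="p2" end-offset="6">Unclear.</comment>
</section>"""


def commented_text(root, comment):
    """Resolve a comment's span via its attributes; None if it dangles."""
    paragraphs = list(root.iter("t"))
    ids = [p.get("id") for p in paragraphs]
    try:
        i = ids.index(comment.get("start"))
        j = ids.index(comment.get("end"))
    except ValueError:
        return None  # a referenced paragraph no longer exists
    if i > j:
        return None  # paragraphs were reordered; the span is meaningless
    a = int(comment.get("start-offset"))
    b = int(comment.get("end-offset"))
    texts = [p.text or "" for p in paragraphs[i:j + 1]]
    if i == j:
        return texts[0][a:b]
    texts[0] = texts[0][a:]    # tail of the first paragraph
    texts[-1] = texts[-1][:b]  # head of the last paragraph
    return " ".join(texts)


root = ET.fromstring(DOC)
comment = root.find("comment")
before = commented_text(root, comment)

# Simulate a user deleting the second paragraph while editing.
root.remove(root.find("t[@id='p2']"))
after = commented_text(root, comment)
```

Nothing in the XML itself stops the edit that breaks the comment; only the
tool can notice it after the fact.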

>                                      [...]?   In a document which has
> been edited by multiple users, how do you represent in XML the changes
> made by each user?   I'm not saying that it absolutely cannot be done,
> but it's either going to be ugly or it's going to abandon many of the
> properties that made XML appear to be attractive in the first place.

I'm not sure that I care about who made what changes in a collaborative
editor.  If I do care, then I'll prefer a merge-based workflow.  LyX
does have a mode for diff/merge, where you get to review every proposed
edit, and accept (then possibly further edit) or reject it, and LyX does
have version control support.

> > > I understand why we adopted an XML format 20 years ago. That was
> > > better than NROFF, and there was a hope that the whole publishing
> > > industry would standardize on XML. It did not, and now the IETF
> > > has its very own markup language.
> 
> In some sense, nroff really was better.   Probably not better overall,
> [...]

Roff does not represent metadata in any way such that it can be
programmatically extracted, except by convention -- there's no "roff
schema".
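
For contrast, a small sketch of what programmatic extraction looks like
when the metadata is in XML.  The fragment below is a heavily simplified,
made-up stand-in for an xml2rfc-style source (the draft name, title and
author are placeholders), and the dictionary layout is just an assumption
for the example:

```python
import xml.etree.ElementTree as ET

# Simplified, invented stand-in for an xml2rfc-style source file.
XML_SRC = """<rfc docName="draft-example-00">
<front>
<title>An Example Draft</title>
<author fullname="A. Author"/>
</front>
</rfc>"""

root = ET.fromstring(XML_SRC)

# Each piece of metadata lives at a known place in the tree, so it can be
# pulled out by path rather than by guessing at formatting conventions.
metadata = {
    "docName": root.get("docName"),
    "title": root.findtext("front/title"),
    "authors": [a.get("fullname") for a in root.iterfind("front/author")],
}
```

With roff the equivalent would be pattern-matching on whatever macros a
given author happened to use.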

> but at least nroff usually wouldn't throw up its hands and completely
> refuse to render a document (within an hour of a deadline) because you
> left out or misspelled some directive.   And in nroff you didn't have
> [...]

roff can error out, or throw warnings and misrender in subtle ways that
you might not see if you miss the warnings.

roff is also a real pain on the eyes.

> the UI problem.    I'm not arguing for a return to nroff.   XML is
> definitely more powerful in some ways, and XSLT is nice.  (I've
> written tools to convert nroff to other representations and it wasn't
> either easy or fun.)  But we went from one obscure and specialized
> text representation to another, and the newer representation is in
> some ways a poorer reflection of the text than the older one.
> 
> Anyway, if we're really going to try to improve our tools, we
> shouldn't naively assume that XML is the right direction for
> underlying representation.   [...]

Perhaps, but any alternative needs to be possible to use in such a way
that we can convert to XML and preserve semantics, or else the
alternative has to be as powerful as XML (which is roughly saying the
same thing).

I would only consider LaTeX as mentioned earlier because a) there are
very good collaborative and non-collaborative web and non-web $EDITORs
for it, and b) it can be used in such a way that one can convert it to
XML while preserving semantics.

Nico
--