Re: [Lager] AD review of draft-ietf-lager-specification-10

Asmus Freytag <asmusf@ix.netcom.com> Mon, 14 March 2016 03:09 UTC

From: Asmus Freytag <asmusf@ix.netcom.com>
To: Barry Leiba <barryleiba@computer.org>, draft-ietf-lager-specification@ietf.org
References: <CALaySJJP0deDOxCs8YSPr72pfyRUsbZBVE9XO=_4d2AvEhVEtQ@mail.gmail.com>
Message-ID: <56E62B63.90805@ix.netcom.com>
Date: Sun, 13 Mar 2016 20:09:23 -0700
Archived-At: <http://mailarchive.ietf.org/arch/msg/lager/gIOxiIuZ1uBlzOqTTaE3TeQ505g>
Cc: lager@ietf.org
Subject: Re: [Lager] AD review of draft-ietf-lager-specification-10

Barry,

most of your comments are straightforward, and as far as they cover my 
contributions to the text, I'll be happy to make the required tweaks. A 
few were worthy of some comment, in some cases asking whether a proposed 
fix would seem sufficient to you.

On 3/13/2016 7:57 AM, Barry Leiba wrote:
> -- Section 6.2.5 --
> Am I the only one who doesn't understand the distinction between
> "difference" and "symmetric difference"?  I had assumed that
> "difference" contained all those in one or the other, but not both...
> but that seems to be what "symmetric difference" is (you say it's
> xor).  Please explain (to me and in the document).


I have no opinion on whether you are the only one, but the term 
"symmetric-difference" defines what you think of as "difference", while 
the latter term means simply "subtract all the elements of one of the 
sets from their union" (normally you'd expect to see "from the other 
set", but that leaves unspecified how to subtract something that's not 
in a set to begin with).

As these terms can be looked up in Wikipedia, with beautiful Venn 
diagrams, I'm not sure that we should spend time on defining them further.

However, perhaps a simple "(subtraction)" after "difference" will put 
the reader on the right track?
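
For illustration only, the two operators can be compared in a few lines of Python. The sets `a` and `b` hold made-up code point values; they are not taken from the draft.

```python
# Hypothetical sets of code points, chosen only for illustration.
a = {0x0061, 0x0062, 0x0063}
b = {0x0062, 0x0063, 0x0064}

# "difference" (subtraction): members of a that are not in b.
difference = a - b

# Equivalent formulation: remove b's members from the union of both sets.
assert difference == (a | b) - b

# "symmetric-difference" (xor): members of exactly one of the two sets.
symmetric_difference = a ^ b
```

Here `difference` keeps only what is unique to `a`, while `symmetric_difference` keeps what is unique to either set.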

> -- Section 6.3.9 --
> I'm confused here, so maybe you can help me understand:
> It seems to me that in this example, the ranges are defined in terms
> of the "mixed-digits" rule, and that the rule is defined in terms of
> the ranges.  How does such a self-referential thing work?


My guess is that you are thinking the "when" attribute "defines" a range 
in some absolute way.

That is not the case. A when attribute relates to whether a label 
containing a code point from the range will be invalid.

If you think of a range like a procedure declaration, then this 
declaration exists and is independent of the nature of the "when" 
attribute. However, when you evaluate the procedure, the result could be 
a no-op (if the context defined in the when attribute is not met).
(Not-when attributes just use an inverse sense of matching, but 
otherwise there is no difference).

The presence of the when attribute does not affect the association of 
tags to code points, which is part of the declaration.

Hence, the later definition of the context rule ("no-mix..") can refer 
to the tag values without being self-referential.
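
A rough Python sketch of that separation, under several assumptions: the two digit ranges and their tag names are picked for illustration, the rule is reduced to a whole-label test rather than an anchored context, and none of the function names come from the draft. Only the rule name "mixed-digits" is taken from your comment.

```python
# Hypothetical model: each range declares its tag unconditionally; the
# "not-when" rule name is consulted only when a label is evaluated.
RANGES = [
    # (first, last, tag, not-when rule name)
    (0x0660, 0x0669, "arabic-indic-digits", "mixed-digits"),
    (0x06F0, 0x06F9, "extended-arabic-indic-digits", "mixed-digits"),
]

def tag_of(cp):
    """Tag lookup uses the declaration only, never the rule."""
    for first, last, tag, _rule in RANGES:
        if first <= cp <= last:
            return tag
    return None

def mixed_digits(label):
    """Rule stated in terms of tags: true if more than one digit set occurs."""
    tags = {tag_of(ord(ch)) for ch in label} - {None}
    return len(tags) > 1

RULES = {"mixed-digits": mixed_digits}

def label_is_valid(label):
    """A code point with not-when invalidates the label if its rule matches."""
    for ch in label:
        for first, last, _tag, rule in RANGES:
            if first <= ord(ch) <= last and RULES[rule](label):
                return False
    return True
```

Note that `tag_of` never looks at the rule, and `mixed_digits` only looks at tags, so neither definition depends on the other being evaluated first.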

You are correct that context rule names are used before the rules are 
declared.

Looking at the text: I think section 5.3.9 is really an example of a 
context rule. So we should move it past 5.4, which covers the context 
rule explicitly for section 5. The text of Section 5.4 could be tweaked 
very slightly to make clear(er) that "action" and "disposition" refer to 
labels that contain a code point with a context rule.

Section 5.3.9 would become new section 5.4.1 and retitled "Example of a 
context rule..."

While we are touching this, mention should be made that rules (without 
anchors) could be used both as context rules and as ordinary rules - in 
principle even in the same LGR; I'm thinking that mention could come in 
either 5.4 or the new 5.4.2 as part of the edits needed to reflect the 
new flow.

> -- Section 8.5 --
>
>     Because of symmetry and transitivity, all variant mappings form
>     disjoint sets in which each of them is a variant of all other
>     members.
>
> I can't sort this one out.  "Each of them" appears to refer to each
> set, and I don't know how a set can be a variant of a member.  But if
> I instead assume that "each of them" is meant to refer to "all variant
> mappings", then I don't understand what the sets have to do with
> anything.  I'm also not following how the disjoint sets are formed in
> the first place.  Can you explain/clarify?


Good catch. Intended was:

    "Because of symmetry and transitivity, all variant mappings form
    disjoint sets. In each of these sets, the source and target of each
    mapping are also variants of the sources and targets of all the
    other mappings."

The use of "and" is deliberate; otherwise you could get a chain instead 
of a full set of n(n-1) crosswise and reciprocal mappings.

By using "source" and "target" I also avoid having to use "code point or 
sequence".

If I make that change, would it fix your problem, or do we need to touch 
more of the discussion in that section?
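
To make the difference between a chain and a full set concrete, here is a minimal Python sketch. The function names and the labels "A", "B", "C", "X", "Y" are hypothetical stand-ins for sources and targets, and reflexive mappings are left out.

```python
from itertools import permutations

def variant_sets(mappings):
    """Group sources and targets into disjoint sets, treating every
    declared mapping as symmetric and transitive."""
    sets = []
    for source, target in mappings:
        merged = {source, target}
        remaining = []
        for s in sets:
            if s & merged:
                merged |= s          # overlapping set: fold it in
            else:
                remaining.append(s)  # unrelated set: keep as-is
        sets = remaining + [merged]
    return sets

def full_mappings(variant_set):
    """All n(n-1) crosswise, reciprocal mappings within one set."""
    return set(permutations(sorted(variant_set), 2))

# Hypothetical input: a chain A -> B, B -> C plus an unrelated pair X -> Y.
declared = [("A", "B"), ("B", "C"), ("X", "Y")]
sets = variant_sets(declared)
```

The chain of two declared mappings over A, B and C expands to six mappings once symmetry and transitivity are applied, which is the n(n-1) count for n = 3.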

> -- Section 4.2 --
>
> A minor question: Why are you making the order significant?  Why is
> that important?


This is perhaps not (entirely) editorial.

There is one strong order dependency in the Schema, which is that 
<action> elements are evaluated in order.
There are several weak order dependencies in the schema that are 
necessary to prevent recursive definitions.
Named rules and classes must be defined before being referenced via 
by-ref, but within that constraint they can be reordered.

There are additional order dependencies that simplify validation 
(logical validation, not schema validation).
Rules referenced in actions (via "match") must be defined; the Unicode 
version must be defined before creating classes based on properties 
(which are version dependent). In a few places, additional 
define-before-use constraints might have been nice but could not be 
implemented. You found the one affecting when rules.
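
As a sketch of why define-before-use keeps logical validation simple, a single pass in document order is enough. The tuple layout and the names below are assumptions made for this illustration, not part of the schema.

```python
def check_define_before_use(elements):
    """Return (user, reference) pairs where a name is used before it is defined.

    `elements` is a hypothetical flattened view of the rules section:
    (kind, name, references) tuples in document order.
    """
    defined = set()
    errors = []
    for kind, name, references in elements:
        for ref in references:
            if ref not in defined:
                errors.append((name, ref))
        # A name becomes visible only after its own element, so a
        # self-reference (recursion) is reported as well.
        if name is not None:
            defined.add(name)
    return errors

# Hypothetical LGR fragment in conventional order.
in_order = [
    ("class", "digits", []),
    ("rule", "leading-digit", ["digits"]),
    ("action", None, ["leading-digit"]),
]
# Same elements with the rule moved ahead of the class it references.
out_of_order = [in_order[1], in_order[0], in_order[2]]
```

With a more "random" ordering, the same check would need a second pass or a dependency graph to rule out recursive definitions.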

Allowing a more "random" ordering of elements would have been possible, 
but there seems to be no use case for it. Conversely, I'm convinced that 
there's a cost in usability. LGR files that stick to a somewhat 
conventional order can also be processed by general tools, for example, 
to view differences, and not just by special-purpose software.

> -- Section 7 and subsections --
> This was a real slog to get through, and I'm quite sure I didn't
> follow a lot of it.  I have to take your word that people who are
> actually doing this stuff can follow it and get it right, including
> all the variant stuff, including the reflexive variants and the
> any-variant vs only-variants vs everything else.  I just can't imagine
> needing to do anything complicated with this and actually getting it
> right.  (This is just a comment; there's nothing actionable here.)


The complications are very necessary but (primarily) relevant to CJK 
tables, where the full scheme has been in use for a while. It is 
difficult and in fact probably the hardest part of the spec. Took me 
forever to get it all sorted out. :)

> -- Appendix A --
>
>     In practice, any LGR that includes the hyphen might also contain
>     rules invalidating any labels beginning, ending, and containing a
>     hyphen in the third and fourth positions as required by [RFC5891].
>
> Mm, then why didn't you put those in the example?  It seems to me that
> getting the rules right is the hard part, and that using real-world
> examples when you can would be helpful.

I now have access to a worked example of this from a different project, so I may just stick it in.
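
For reference, the restriction itself is small. The following Python sketch is not the worked example mentioned above and is not LGR syntax; it only restates the three conditions from the quoted text, with a hypothetical function name.

```python
def violates_hyphen_restrictions(label):
    """True if the label begins or ends with a hyphen, or has hyphens
    in both the third and fourth positions."""
    return (
        label.startswith("-")
        or label.endswith("-")
        or label[2:4] == "--"
    )
```

Expressing the same three conditions as whole label rules in an LGR is the part that takes more care.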

A./