Re: Last Call: <draft-freytag-lager-variant-rules-02.txt> (Variant Rules) to Informational RFC

Asmus Freytag <asmusf@ix.netcom.com> Tue, 14 February 2017 18:31 UTC

Subject: Re: Last Call: <draft-freytag-lager-variant-rules-02.txt> (Variant Rules) to Informational RFC
To: John C Klensin <john-ietf@jck.com>, ietf@ietf.org, IETF-Announce <ietf-announce@ietf.org>
References: <148467380280.32070.11213613399948034139.idtracker@ietfa.amsl.com> <ED9D6B23A636DFCB94044ABD@PSB>
From: Asmus Freytag <asmusf@ix.netcom.com>
Message-ID: <16b7beac-71e2-0a89-adc1-6a576597ab31@ix.netcom.com>
Date: Tue, 14 Feb 2017 10:31:10 -0800
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf/bK_Zo_4GLlIAMm38gvVi4dU4tSg>
Cc: alexey.melnikov@isode.com, draft-freytag-lager-variant-rules@ietf.org, Patrik Fältström <patrik@frobbit.se>

John,

I can't help but feel that there's something of a disconnect between what
the document is actually about and the issues that are being discussed here.

The document is not about policy. It is also not about domain names
exclusively. The same is true for RFC7940 by the way.

RFC7940 is about how you write down a specification for label generation
rules. It is not a specification for designing them. It is also not a
specification for what should be possible in label generation rules for
IDNs in the DNS.

In other words, it is a protocol for capturing a wide range of possible
policies for a range of possible identifier systems, of which domain names
are the most prominent (and first) example.

Variant mechanisms exist today in at least two different RFCs, one for
Arabic and one for Chinese. RFC 7940 provides the tools to express them
rigorously, without passing any judgement on the feasibility or desirability
of defining a policy that allows variants.

The XML formalism is hard to follow for many people and obscures to them
what is happening underneath. Because it is broad enough to cover all
pre-existing schemes, it is also possible to create policies with edge
cases that can get you in trouble, where a label generation ruleset may
not be "well-behaved" in an implementation sense. The current draft is
aimed at explaining these technical aspects, again without addressing the
question whether variants are desirable for the DNS.

There's nothing here that sets policy for the DNS. If that were desirable,
it should be done in a different document. I concur that such a document
would be the correct place to address what is implementable in terms of
*delegated* variants in the DNS, and even that such a document would seem
desirable. I fully agree with what is described as the "magical thinking"
around variants. But that is not something this draft addresses, and it is
not the place of this draft to discuss any of these issues.

RFC7940 is an XML schema. The current draft doesn't change anything
about this schema; therefore, it is not, in my view, material for an
"update" to RFC7940.

If you think of RFC7940 as a programming language for LGRs, the current
document is a style guide. These are two entirely different things. I have
no opinion on whether standards track is appropriate for a style guide,
but I am clear that this draft neither updates nor supersedes any part
of RFC 7940.

The JET documents contain a combination of information: how to capture
a variant relationship in a multi-column plain text format on the one hand,
and what to base the relationship on, on the other hand.

RFC7940 only (!) addresses the first point. It does not address what variant
relations are desirable (or should be allowed) in any zone in the DNS.

The current document goes a step further and examines what kind of
assumptions about the nature of supportable variants go into the design
of the LGR formalism. Contrary to what is claimed, the draft is intended
to make clear that "variants" that are not 1:1 substitutions are effectively
not tractable with this mechanism.

If you have a relationship between two labels that can be characterized
by a distance in perceptual space, and where you define some arbitrary
but non-zero distance as defining confusability between such labels, then
of three labels, two may be confusable with the third, but not confusable
with each other: the relationship is not transitive.
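
To illustrate the point above, here is a minimal sketch (not taken from
the draft or from RFC 7940). The one-dimensional "perceptual positions"
and the threshold are made-up values chosen only to show the effect; real
perceptual distance is not something either document defines.

```python
# Hypothetical positions of three labels on a one-dimensional
# "perceptual" axis; the numbers are invented for illustration.
positions = {"A": 0.0, "B": 1.0, "C": 2.0}

# Arbitrary non-zero distance below which two labels count as confusable.
THRESHOLD = 1.5


def confusable(x, y):
    """Two labels are confusable if they are closer than THRESHOLD."""
    return abs(positions[x] - positions[y]) < THRESHOLD


# A and C are each confusable with B, but not with each other,
# so "confusable" is symmetric without being transitive.
print(confusable("A", "B"), confusable("B", "C"), confusable("A", "C"))
```

Changing THRESHOLD changes which pairs collide, which is the arbitrariness
that a distance-based definition brings with it.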

Non-transitive relationships are not handled well with RFC7940, because
it does not have a way of allocating distances (or locations) in perceptual
space. (Language to that effect is definitely in the latest draft I wrote,
but I don't know whether it's already in the -02.)

If your relationships between blocked variants are symmetric and transitive,
collision checking becomes an O(1) operation, somewhat like a hash lookup.
This makes blocked variants attractive for cases where, in case of a
collision, there is no doubt that the labels are clearly colliding.
(Collisions based on perceptual distance suffer from the arbitrary
selection of the minimal required distance and are always open to
pressures to override or make case-by-case exceptions; see .br.)
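
One way to picture the hash-like check is sketched below. This is an
assumption-laden illustration, not the draft's algorithm: it assumes
1:1 code point variants with no context rules, and the variant sets,
index_key and Registry are hypothetical names and data. Because the
relation is symmetric and transitive, every code point belongs to exactly
one variant set, so each label has a single index key and a collision
check is one dictionary probe, independent of how many labels are
already registered.

```python
# Hypothetical variant sets; each set is one equivalence class.
# (Latin/Cyrillic/Greek lookalikes, chosen only for illustration.)
VARIANT_SETS = [
    {"a", "\u0430"},            # Latin a, Cyrillic a
    {"o", "\u043e", "\u03bf"},  # Latin o, Cyrillic o, Greek omicron
]

# Map every code point to one fixed representative of its set.
REPRESENTATIVE = {cp: min(s) for s in VARIANT_SETS for cp in s}


def index_key(label):
    """Replace each code point by its representative."""
    return "".join(REPRESENTATIVE.get(cp, cp) for cp in label)


class Registry:
    """Toy registry that blocks any label colliding with an existing one."""

    def __init__(self):
        self._by_key = {}

    def register(self, label):
        key = index_key(label)
        if key in self._by_key:   # single hash probe
            return False          # blocked: a variant is already registered
        self._by_key[key] = label
        return True
```

For example, once "foo" is registered, an attempt to register the same
string spelled with Cyrillic and Greek lookalikes for "o" yields the same
key and is blocked.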

One could capture the confusables data in UTS#46 in the form of RFC7940
and mechanically make them transitive and symmetric (they are not
specified that way, although they appear at least to be intended to be
implicitly symmetric). If that is done, one finds that the transitivity
requirement would lead to mapping a number of clearly distinct labels
to each other; for others, the variant relation could be transitive as well.
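
The effect of such a mechanical completion can be sketched with a small
union-find. The input pairs below are invented for illustration and are
not taken from any Unicode data file; close() is a hypothetical helper.

```python
def close(pairs):
    """Group items into classes under the symmetric, transitive closure."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb  # merge the two classes

    classes = {}
    for x in list(parent):
        classes.setdefault(find(x), set()).add(x)
    return sorted(sorted(c) for c in classes.values())


# Made-up input: "l" resembles "1", and "1" resembles "7".
print(close([("l", "1"), ("1", "7"), ("o", "0")]))
```

In this made-up input, "l" and "7" end up in the same class even though
nothing in the input relates them directly; that is the kind of merging
of clearly distinct items that forced transitivity produces.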

RFC7940 allows symmetric-only LGRs, and with suitable tools these
could be used to implement blocking, but the optimizations available
for the transitive case would not hold. (That is something that could
be described in a section in the draft, if it were considered helpful;
however, this author has no information on best strategies for implementing
blocking in a symmetric-only case.)

The Arabic RFC defined positional variants (a necessary feature for that
script) but the context rules for variants that this requires can lead to
undesirable edge cases when one violates the underlying assumption
of symmetry: if I substitute a variant code point in a label it must satisfy
the same context rule, otherwise the mapping is not symmetric.

(The types of context rules one tends to use in Arabic happen to be
well-behaved.)

RFC7940 does not require variant relations to be symmetric and transitive;
it is not unreasonable to be able to have a tool to mechanically complete
the specifications of mappings, but to want the input file to be formally
valid under the XML schema, even if in practice one would want to not
use an LGR that isn't symmetric and transitive.

The work required to prove that an LGR expressed in XML is symmetric and
transitive, as far as the mappings are concerned, is practically the same
as enforcing that constraint, by the way.
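
A rough sketch of why checking and enforcing coincide: the hypothetical
function below computes the mappings that would have to be added to make
a set of 1:1 mappings symmetric and transitive. An empty result is the
proof that the input already satisfies the constraint; a non-empty result
is exactly what a completion tool would add. Identity mappings (a code
point mapped to itself) are deliberately left out here, and context rules
are ignored; both are simplifying assumptions.

```python
def missing_mappings(mappings):
    """Return the (source, target) pairs needed for symmetry and transitivity."""
    closed = set(mappings)
    changed = True
    while changed:
        changed = False
        for a, b in list(closed):
            new = {(b, a)}  # symmetry
            # transitivity: a -> b and b -> d imply a -> d
            new |= {(a, d) for (c, d) in closed if c == b and d != a}
            if not new <= closed:
                closed |= new
                changed = True
    return closed - set(mappings)


# An incomplete specification: a -> b and b -> c only.
print(sorted(missing_mappings({("a", "b"), ("b", "c")})))
```

Running the same function on an already complete set returns an empty
set, so a validator and a completion tool can share the same code.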

The notation used in this draft is simply a shorthand for the formalisms
available in RFC7940, so that it is possible to succinctly write down
examples that aren't overburdened with XML syntax. Really nothing more
and nothing less. If RFC7940 were extended in the future, so could this
symbolic shorthand notation.

I believe it is factually incorrect to say that either RFC7940 or this
draft is in any way constrained to what "ICANN has decided to allow for
the Root". The expert group hired by ICANN has followed some of the
reasoning found in this draft to recommend against some details in certain
proposed definitions of variants, because they were realized to lead to
ambiguous edge cases.

The suggestion that the section showing the correspondence is too cursory
I take as constructive. There's no requirement that prevents it from being
more comprehensive.

A./


On 2/14/2017 1:59 AM, John C Klensin wrote:
> --On Tuesday, January 17, 2017 09:23 -0800 The IESG
> <iesg-secretary@ietf.org> wrote:
>
>> The IESG has received a request from an individual submitter
>> to consider the following document:
>> - 'Variant Rules'
>>    <draft-freytag-lager-variant-rules-02.txt> as Informational
>> RFC
>>
>> The IESG plans to make a decision in the next few weeks, and
>> solicits final comments on this action. Please send
>> substantive comments to the ietf@ietf.org mailing lists by
>> 2017-02-14. Exceptionally, comments may be sent to
>> iesg@ietf.org instead. In either case, please retain the
>> beginning of the Subject line to allow automated sorting.
> Summary: This document should not be published in the IETF
> Stream, at least in its present form, proposed status, and
> relationship to other documents.  An explanation and some
> alternatives appear below.
>
> Details:
>
> This is a difficult document for me to review for multiple
> reasons, including a conviction that the intentions are the
> very best but that the document is artificially constrained by
> essentially political decisions taken outside the IETF or any
> other process that would meet traditional IETF criteria for
> openness, transparency, fairness, and rough consensus.
>
> For IETF decision-making, a largely procedural issue is as, or
> perhaps more, important.  The document bears a relationship to
> the standards-track RFC 7940 that is confusing at best.   The
> "Document Quality" section of the proposed approval notice
> starts "The document largely reflects experience gathered from
> implementing RFC 7940 and creating rulesets based on it".  That
> is a worthy goal and entirely consistent with the "rough
> consensus and running code" principle.  The author has done
> what I believe is a laudable job of coming up with a
> semi-mathematical and testable alternative to the rather
> lengthy, complex, and less easily validated and tested, XML of
> RFC 7940.
>
> If the document were submitted to the ISE as a "I think this
> would be a better way to do things while meeting the same goals
> as RFC 7940" or even as "this is what ICANN is doing (or
> proposes to do) and the community should know about it" piece,
> I'd have little or no objection to it.  However, as an IETF
> stream document that apparently is intended to replace (or at
> least provide an alternative to) large sections of 7940 with a
> different strategy, either
>
>    -- it should be a standards-track document that explicitly
> 	updates and replaces those portions of 7940, with all of
> 	the documentation and explanation the IESG requires of such
> 	updates.  OR
>
>   -- it should be a standards-track document that provides an
> 	alternative to 7940 and that explains the choices and
> 	tradeoffs.
>
> As an IETF Stream Informational specification, it is an
> apparent IETF Informational document that encourages the
> practice of something other than an IETF Proposed Standard that
> addresses the same topics and requirements, with no
> Applicability Statement or other guidance as to when or if it
> should be applied.
>
> I do not believe it should be published on that basis.
>
> Without descending into details and nitpicking, there are also a
> pair of technical problems.
>
> (1) As RFC 7940 points out, the concept of "variant" and hence
> the relationships needed to express the "increased
> requirements of contemporary IDN variant policies" [RFC7940,
> Section 9] has moved considerably beyond the definition of that
> term and concept in RFC 3743 (aka "the JET specification").
> This document further refines the description of those
> relationships.  However, if one is going to move beyond the JET
> concept -- one tailored to the relationship between Simplified
> and Traditional Chinese characters and not about, e.g., visual
> confusion at all -- it is not clear that there is a technical
> basis for saying "these things are variants and those others
> are not".  The "others" can include synonyms, translations,
> orthographic variations that cannot be expressed in simple
> character (or even character sequence) mappings, and so on.
>
> ICANN made a series of decisions (IMO, some of them almost by
> accident and others by side-effect or more political reasons)
> as to what kinds of relationships might be considered variant
> candidates, at least for the root zone.  The grammar proposed
> in this document (and the one of 7940) exclude those other,
> non-ICANN-sanctioned, relationships and cannot, in general,
> represent them in spite of the fact that they might be quite
> appropriate (at least for blocking) in non-root zones and have
> been used in exactly that way (indeed, two of the key areas of
> friction between IDNA2008 and Unicode UTS #46 can be seen in
> exactly those terms).    To a certain extent, that is a
> criticism of 7940 rather than this document, but there is an
> important difference, at least IMO: The grammar of 7940 is
> essentially descriptive and, like most good XML structures,
> could easily be expanded with additional elements or element
> components if the need arose.  It is far less clear how one
> would expand a quasi-mathematical grammar, especially one that
> is heavily dependent on special operator symbols and strict
> typologies, like this one.  Even if one were to figure out how
> to expand this as requirements evolve outside ICANN's control,
> such extensions would raise questions of how to keep this
> document and 7940 synchronized.  That issue might be another
> reason for standards track status and either a more explicit
> discussion of relationships and mapping; one or more IANA
> registries of operators, types, and elements; or both.
>
> (2) Independent of the web and the convenience of HTTP redirect
> facilities, it is not clear how to implement delegated
> variants in a way that is not damaging to the Internet and the
> DNS.  The opportunities for combinatorial explosion and
> consequent operational and zone management problems in all but
> a few very special cases (including, historically, the one that
> at least some of the JET designers had in mind) are
> considerable.  The ICANN solution of "just delegate them all to
> the same party and make it their problem" may be satisfactory
> from their corporate point of view, but is not a way to make
> the Internet work better, especially in the context of
> potentially hundreds of names that have to be kept synchronized
> in a way that leads to consistent behavior across all
> protocols.  RFC 7940 exposes some of this problem as well, but,
> again, is rather more descriptive than this document, which
> moves much closer to a set of executable tests that establish
> what is valid (and presumably reasonable) and what is not.
>
> I, and a few others, have suggested in other contexts that
> "variant" has become part of a magical ritual in which one looks
> at a complex DNS-related problem, solemnly chants the word a
> propitious number of times, and the problem is then assumed to
> disappear or be solved.  The issues above are only a few of the
> cases to which variations on that ritual have been applied.
> Unless the IETF has better solutions than magic, it should not
> be legitimizing the magical thinking by publishing documents
> that appear to encourage delegation of "variants".
>
> Two additional nits, the first an important procedural one and
> both to provide illustrative examples that this document would
> need work even if none of the considerations above applied.
>
> (i) At least since RFC 3552, we have not allowed documents in
> the IETF stream that say "There are no security considerations
> for this memo.".   And yet that is exactly what Section 18 has
> to say.
>
> (ii) Section 16 ("Corresponding XML Notation") is a good idea,
> but, rather than providing a comprehensive mapping, it
> essentially says "here are some examples of the mapping between
> notations; everything else is left as an exercise for the
> reader".  I think that is confusing and unfortunate but, if it
> really is what the IETF and the author want to do, it should be
> made much more explicit.
>
>
> thanks,
>     John Klensin