Re: [iucg] Disallowing code points

JFC Morfin <jefsey@jefsey.com> Fri, 17 July 2009 12:05 UTC

Date: Fri, 17 Jul 2009 14:06:02 +0200
To: John C Klensin <john-ietf@jck.com>, Lisa Dusseault <lisa.dusseault@gmail.com>
From: JFC Morfin <jefsey@jefsey.com>
Cc: Pete Resnick <presnick@qualcomm.com>, iucg@ietf.org
Subject: Re: [iucg] Disallowing code points

At 05:36 17/07/2009, John C Klensin wrote:
>--On Friday, July 17, 2009 04:04 +0200 JFC Morfin
><jefsey@jefsey.com> wrote:
>
> > Lisa, John,
> > I agree with Shawn Steel, let us call a cat a cat.  There are too
> > many words being used without considering their whole
> > implications. This is like just thinking over details instead
> > of also taking care of the whole picture. I can't and I do not
> > think it can be productive, moreover in a reticular system
> > like the Internet?
>
>I think it is precisely the whole picture that we are concerned
>about.  The whole, global identifier, picture, not one country
>or language group at a time.

Dear John,
from 40 years of experience in the navy, communications and policy,
this is the fundamental issue the world has with American culture and
language. It is not an opposition. It is a built-in misunderstanding.
The two understandings of the word "global" are opposed. This is the
core of the IDNA issue.

In American English "global" means the entire Globe. In British 
English, French and other languages "global" means all the parts of 
the same Globe. For example, someone like José Bové fiercely stands 
for smart globalisation against imposed globalization. Globalisation 
is a fact that the Internet technically supports; globalization
("mondialisme" in French) is a policy that uses the Internet to
propagate itself, in particular through influence on standardisation
(cf. RFC 3935).

There is an inner contradiction in your sentence, and we meet it here.
The intended proposition is unilateral while reality is diverse, i.e.
we care about the sum of all the countries, languages, and people.
This translates very simply: the ISOC motto is user centric. The WSIS
objective is people centric. Technical totalitarianism is the
difference: users go by the user guide. People go by their informed
free will. Pete's document is unilateral and as such it can only apply
to one single presentation. However, it is flexible enough (SHOULDs
and MAYs) to accommodate a legitimate diversity. So, it is a
reasonable compromise between rigidity and laxity.

> > 1. When a character is DISALLOWED this means that one way or
> > another if it used something is going to be discarded, i.e.
> > that "something" is mapped to nil. The "something" may be a
> > single code point or the whole domain name, or something in
> > between - the application decides. The consequence is the
> > same: loss of information and typographic bandwidth reduction.
>
>Again, no.  If a character is DISALLOWED, then the string that
>contains it is prohibited as a label, is not looked up, and, I
>hope, results in a message to the user that it is invalid.  That
>has nothing to do with discarding all or part of the string and
>then looking something else (or a subset) up.  The latter would
>be a clear violation of the protocol.

I do not refer to any protocol. I refer to the relation between the
user and the proposed system. I refer also to the way I would
implement (or have implemented) the code and the error message it
returns. Let us keep it simple by using a metric common to the people
and processes involved. This is what I call detail vs. the global
diversity of the whole picture.

> > 2. I consider "protocol level" as a fundamental point indeed
> > (I do not think the IESG and IAB use nearly meaningless
> > phrases in a Charter, moreover when issuing strict
> > conditions). If the Charter says "elimination of character
> > mapping in the protocol" this may only mean one thing. That
> > thing - as we understood it, and the reason we are waiting for
> > Vint's plan to complete - is well documented in Pete Draft:
> > the mapping is to happen outside of the protocol, i.e. not on
> > the wire. Because what is involved is outside of the network
> > control and of the IETF network territory. It is users' usage
> > operation territory.
>
>Yes.  We actually agree about that.  But it has nothing to do
>with which characters are DISALLOWED, which the mapping document
>carefully does not touch.

It has to do with who (IETF or User) disallows, and under which
constraint or advice (MUST or SHOULD).

> > This clearly translates in Pete's I_D:
> >
> >> This document describes the operations that can be
> >
> > First fundamental point: there is no MUST, this is consistent
> > with the rest of the position. The only MUST is to be
> > acceptable to the DNS.
>
>No.  The MUST is to be consistent with the core IDNA2008 specs,
>which include consistency with the DNS.  And that is exactly
>what the rest of the sentence says.

OK: apart from IDNA2003, which I do not discuss, what are the other
constraints made necessary by the IDNA2008 Mapping document?

> >> applied to user input in order to get it into a form
> >> acceptable by  the Internationalized Domain Names in
> >> Applications (IDNA) protocol  [I-D.ietf-idnabis-protocol].
> >> The document describes the underlying  architectural
> >> principles (in section 2 and the general  implementation
> >> procedure (in section 3).
> >
> > I class Patrick's document among the underlying architectural
> > principles the WG is to define that "can be applied to user
> > input" (meaning by the application). This is a BCP not a
> > Standard.
>
>If, by "Patrick's (sic) document" you mean
>draft-ietf-idnabis-tables, it is referenced normatively from
>Protocol and is a core part of the Standards-Track IDNA2008
>specification, not a BCP.

As long as the mapping document does not make it a MUST, it can
legitimately be optional at the User's decision. In such a case the
User will use it as a Unicode-related BCP. Actually, as I say, it is
an NFC restriction, NFC being itself a Unicode restriction. At the end
of the day we will obtain a more restricted version of the visually
based alternative to Unicode I wanted.

> >> It should be noted that this document does not specify the
> >> behavior  of a protocol that appears "on the wire".
> >
> > This conforms to the Charter. Character mapping of any kind
> > MUST not appear on the wire, but off the wire.
>
>Sure.  But, strictly speaking, IDNA does not appear on the wire.
>Only A-labels do.  That has been the source of a lot of
>confusion.  Note that the IRI document, which is well outside
>the WG's scope, does refer to IRIs as protocol elements, which
>pushes them into the "on the wire" category, including their IDN
>components.

This is a topology closure problem. You can define the end of a
protocol where you want, including by using a virtual wire. A protocol
is universal. Here we deal with the edges (network and user sides),
i.e. with a global issue. In other words, the border between the
network and the user is not a line, it is thick. This thickness is
the Interplus. It can be supported via OPES, to keep with IETF
documentation. Actually, I have called networked OPES "ONES" (Open
Network Extended Services) for a few decades. This was my department
at Tymnet in 1985.

> >> It describes an operation that is to be applied to user input
> >> in  order to prepare that user input for use in an "on the
> >> network" protocol.
> >
> > All this clearly describe something which is outside the "on
> > the network" protocol.
>
>Yes.  So?
>
> >>  As unusual as this may be for an IETF protocol document, it
> >>  is a  necessary operation to maintain interoperability.
> >
> > The text underlines that it is unusual and qualifies what is
> > not part of the protocol as an operation. We have widely
> > published that we are in agreement with the Charter and with
> > this text.
>
>Ok.  So we are arguing about what?
>
> > 3. The coup consists in trying to obtain URL indexing
> > stability in invading and annexing that territory, my/our
> > territory. And then starting ruling it, through MUSTs where
> > interoperability calls for negotiation based upon SHOULDs. If
> > I want to make Tatweel resolve the IETF site, I just have to
> > use the 1972 host.txt service, making the Tatweel code-point
> > correspond to the IETF IP. Whatever the way I can enter a
> > tatweel and punycode it, it will work. If I try the same under
> > IDNA2008, my entry will be mapped to nil and nothing will
> > happen. I do not really call this innovation and progress in
> > 37 years :-) Worse, I might also name it an alias/domain name
> > conflict.
>
>There is no coup and the above is either a  misinterpretation or
>outside the WG's scope.

It is outside the WG's technical scope. By nature, and because the
Charter says so. This is why we do not support the coup and support
the Charter. Actually I do not care about WG scopes, I care about
what hurts me. I care about Charters only because I consider a Charter
a contract every party said it would respect.

>URIs are prohibited from containing
>non-ASCII domain names -- I believe globally although there are
>some factions who believe that such names can be included in
>%-escaped UTF-8 in those fields.

In my daily life I never meet a URI, but if I did, I would be hurt
just the same in understanding what it may refer to if the
orthotypography is wrong.

>And you can't use the 1972 Hosts.TXT service because it was shut
>down long ago.  More important, it was strictly limited
>--technically and administratively-- to ASCII letters, digits,
>and hyphens, so there is no way to express a Tatweel in those
>tables.

I keep using "http://nuts" to access ICANN's site. Hosts.txt works
perfectly well with punycoded entries.
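
To make the point concrete, here is a minimal sketch of how a host
table line for a punycoded Tatweel could be produced. It relies only
on Python's built-in "punycode" codec. The names to_ace and
hosts_line, and the address 192.0.2.1 (a documentation address), are
assumptions made for illustration, not anything defined in this
thread or in IDNA. The sketch applies no IDNA validity check at all:
it only shows that the resulting name is plain letters, digits and
hyphens, which is all a host table sees.

```python
ACE_PREFIX = "xn--"
LDH = set("abcdefghijklmnopqrstuvwxyz0123456789-")


def to_ace(label: str) -> str:
    """Punycode one Unicode label and add the ACE prefix.

    No IDNA rule (DISALLOWED, mapping, etc.) is applied here.
    """
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def hosts_line(address: str, label: str) -> str:
    """Build one host-table style line: address, a tab, then the name."""
    return f"{address}\t{to_ace(label)}"


if __name__ == "__main__":
    tatweel = "\u0640"  # ARABIC TATWEEL, the code point discussed here
    # 192.0.2.1 is a placeholder address, not the IETF's.
    print(hosts_line("192.0.2.1", tatweel))
```

Whether a given resolver or application accepts such a name is a
separate question. The sketch only covers the encoding step.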

>In IDNA2008, you simply cannot express Tatweel in a label so, in
>that narrow sense, the behavior under the Hosttable rules and
>that under IDNA2008 is identical.

For the time being I do not know what IDNA2008 is going to be. But I 
am pretty sure that Pouzin's keyword service can support tatweel.

> > Gee! I am on vacations !
>
>Yeah.  And I'm trying to retire.

I am supposed to retire on Oct. 7th (in France they are becoming less
rigid, so I am creating a new business)
Best
jfc


> > At 00:44 17/07/2009, Lisa Dusseault wrote:
> >...
> >> 2. I don't consider DISALLOWED to be a mapping to nil at all.
> >> There  is no requirement for clients to map DISALLOWED
> >> characters to nil  and then request a domain -- a client
> >> could reject a string with  DISALLOWED characters.
>
>Indeed, I think the client is required to do so and that simply
>mapping a DISALLOWED character to nothing would be a violation
>of the specification.  Except for those characters that
>IDNA2003 _required_ be mapped to nothing, that is also true of
>IDNA2003.  There is no provision in either for a conforming
>implementation to drop prohibited characters and look up the rest of
>the label.

NB. the first thing a parser should do is to look for the disallowed
code points and reject the entry (with an error message or a log
entry), the same action as if the entry was empty. You can ask for
a more subtle operation through a SHOULD. You cannot oblige anyone to
perform it (via a MUST) as this has no impact on the wire.
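
The reject-first behaviour described in this NB can be sketched as
follows. This is only an illustration under stated assumptions: the
DISALLOWED set holds a single code point (U+0640, the Tatweel from
this thread) as a stand-in for the real IDNA2008 tables, and the names
check_label and LabelRejected are hypothetical. Nothing is stripped or
mapped to nil: the whole entry is refused, as an empty entry would be.

```python
import unicodedata

# Assumption: a one-entry stand-in for the real DISALLOWED table.
DISALLOWED = {0x0640}  # ARABIC TATWEEL


class LabelRejected(ValueError):
    """Raised when an entry is empty or holds a disallowed code point."""


def check_label(label: str, disallowed=DISALLOWED) -> str:
    """Return the label unchanged, or raise LabelRejected.

    The label is never modified: it is either accepted as is
    or rejected as a whole.
    """
    if not label:
        raise LabelRejected("empty label")
    for index, char in enumerate(label):
        if ord(char) in disallowed:
            name = unicodedata.name(char, "UNNAMED")
            raise LabelRejected(
                f"U+{ord(char):04X} {name} at position {index} is disallowed"
            )
    return label


if __name__ == "__main__":
    for entry in ("example", "ab\u0640c", ""):
        try:
            print("accepted:", check_label(entry))
        except LabelRejected as error:
            # An error message or a log entry, as suggested above.
            print("rejected:", error)
```

Any more subtle treatment (suggesting a correction, for instance)
would be layered on top of this check by the application.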