Re: [Ltru] Re: Last call: BCP 47 second part.

"Mark Davis" <mark.davis@icu-project.org> Thu, 22 June 2006 01:39 UTC

Message-ID: <30b660a20606211839r68e892eep584f85420b5539f7@mail.gmail.com>
Date: Wed, 21 Jun 2006 18:39:34 -0700
From: Mark Davis <mark.davis@icu-project.org>
To: Martin Duerst <duerst@it.aoyama.ac.jp>
Subject: Re: [Ltru] Re: Last call: BCP 47 second part.
In-Reply-To: <6.0.0.20.2.20060619152835.0769d880@localhost>
MIME-Version: 1.0
References: <7.0.1.0.2.20060618165520.03b26178@online.fr> <6.0.0.20.2.20060619152835.0769d880@localhost>
Cc: LTRU Working Group <ltru@ietf.org>
List-Id: Language Tag Registry Update working group discussion list <ltru.ietf.org>

I fully agree with Martin. Nice job.

Mark

On 6/21/06, Martin Duerst <duerst@it.aoyama.ac.jp> wrote:
>
> Dear LTRU members (cc IETF mailing list),
>
> Here are my comments on how I (as a technical contributor)
> propose to address the Last Call comments made below.
> I hope others have some comments, too.
>
> At 23:56 06/06/18, JFC (Jefsey) Morfin wrote:
> >Dear IESG Members,
> >
> >1.  The proposed Draft is not about matching (it is absurd to say that
> >my Italian can "match" your Japanese in order for us to understand each
> >other better). It is about using pattern matching techniques in order to
> >filter lists against a langtag with two results (max one answer, no max)
> >and in two cases (well formed langtag or not). However, the wording is
> >such that without examples it is difficult to understand the
> >specifications of the pattern matching function that is being used - and
> >therefore the possible applications and the purpose of the Draft.  The
> >algorithm of this function is undocumented and there is no obligation to
> >document it, which may lead to blocking conflicts if two filters may have
> >to interoperate. This proposition is NOT scalable and does not intend to
> >be scalable.
>
> The matching draft of course makes sure that "ja" and "it" do not
> match. Nothing absurd happening there. Also, what the draft actually
> describes is matching language tags against language ranges; there
> are three matching variants, two for filtering and one for lookup.
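
To make the filtering case concrete, here is a minimal Python sketch of the simplest of the three variants (basic filtering), as I read the draft: a range matches a tag if the two are equal ignoring case, or if the range is a prefix of the tag and the next character in the tag is "-"; the range "*" matches everything. The function names are my own assumptions, not taken from the draft.

```python
from typing import Iterable, List


def basic_filter_match(language_range: str, tag: str) -> bool:
    """Return True if `tag` matches `language_range` under basic filtering.

    Hypothetical helper name; comparison is case-insensitive and purely
    string based, so no registry lookup is involved.
    """
    rng = language_range.lower()
    candidate = tag.lower()
    if rng == "*":
        return True
    # Equal, or the range is a prefix that ends exactly at a "-" boundary.
    return candidate == rng or candidate.startswith(rng + "-")


def basic_filter(language_range: str, tags: Iterable[str]) -> List[str]:
    """Return every tag in `tags` that matches `language_range`."""
    return [t for t in tags if basic_filter_match(language_range, t)]


if __name__ == "__main__":
    tags = ["ja", "ja-JP", "it", "it-IT"]
    print(basic_filter("ja", tags))        # ['ja', 'ja-JP']
    print(basic_filter_match("it", "ja"))  # False
```

Because the comparison is only on subtag boundaries, "ja" and "it" can never match each other, and a range that is not a sensible tag at all simply returns an empty list.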
>
> Some of the matching procedures indeed have to be read carefully.
> But the WG made every attempt to describe them carefully, going
> through several iterations. And we provide examples, too.
>
> The main applications envisioned are described in the draft itself:
> things such as selection of documents (e.g. when searching)
> or document pieces (e.g. for styling) in the case of filtering and
> finding the best match to return documents or document fragments
> in the case of lookup (the prototypical example being HTTP language
> negotiation).
>
> As for conflicting filters, there is no requirement that two filters
> (e.g. in two different protocols) produce exactly the same result.
> Different protocols may have different needs. That's why the draft
> leaves some specifics to be decided by each particular protocol.
>
> As a result, I do not see anything in comment 1. that would need
> addressing in the current draft.
>
>
> >2. Either RFC 3066 Bis is well written (which I think we achieved if
> >strictly limited to the Internationalized ASCII Internet) and well
> >applied (which I can see is not the case: the review mechanism does not
> >respect RFC 3066 Bis) and the filtering is already built-in, and the
> >functional strategies are to be specific to applications and protocols.
> >Or that Draft, which does not seek to first ensure that langtags respect
> >RFC 3066 Bis (i.e. being well formed or corrected in order to become well
> >formed), is a negation of RFC 3066 Bis. I think that authors had
> >filtering in mind (it was the apex of the first unique document) and did
> >not realise that the work achieved in cleaning the first part made its
> >correction by the second part not necessary anymore. That is if the whole
> >purpose was not an undocumented use of the filtering (users mass
> >profiling). If it was not, the whole document can be written as "make
> >sure langtags are well formed and feed them on the pattern matching
> >function of your application/protocol to obtain the results it needs
> >along your language management strategy".
>
> The commenter seems to claim that draft-ietf-ltru-matching conflicts with
> draft-ietf-ltru-registry (here called RFC 3066bis) because the latter
> defines well-formed tags while the former does not require well-formed
> tags. The reason for not requiring checking for well-formed tags when
> matching was discussed extensively in the WG. There is a very clear
> reason: requiring this would mean checking the IANA language subtag
> registry, potentially for every matching operation, which was considered
> operationally infeasible. It would also be an unnecessary performance
> penalty for those who actually use well-formed tags. In general,
> non-well-formed tags or ranges will simply not match anything, which is
> just fine.
>
> The commenter is correct in that there is no absolute need for this draft;
> each protocol or format could come up with its own way of matching
> language tags. After all, RFC 3066bis defines how these tags are built,
> and (to a certain extent) what they mean. However, I consider the current
> draft valuable because it helps protocol/format designers, who in general
> are not experts on language tags and language matching, to choose the
> right kind of matching scheme. Also, one matching scheme was already
> described in RFC 3066, and so it would be difficult to obsolete
> RFC 3066 without this draft.
>
> I therefore don't see any change that would be needed in the current
> draft to address comment 2.
>
> >3. "*" restrictions in the pattern matching function can hardly be
> >understood without several examples. They add usage limitations to the
> >RFC 3066 Bis format, where they should be documented - or the Draft
> >cannot be part of BCP 47. This certainly belongs to the language
> >constraining strategy of the WG-LTRU affinity group and to the interests
> >a co-Chair recently documented. But this is unacceptable to most users,
> >even if it is certainly favourable to a national strategy and to the
> >members of a given consortium. I therefore submit that the IESG Members
> >who are citizens of that nation, or members, or employees of the members
> >of that commercial consortium have a COI.
>
> There are no restrictions on the use of "*" in language ranges.
> There is a very specific treatment of "*" wildcard components in
> language ranges for extended filtering. The actual algorithm in
> the draft is described carefully, and an explanation for why it is
> the way it is is given. This matching algorithm does not add
> any usage limitations to RFC 3066bis. On the contrary, it was
> carefully designed to work well together with RFC 3066bis.
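
The treatment of "*" in extended filtering can be sketched as follows. This is a hedged Python reading of the algorithm described in the draft, not normative text, and the function name is an assumption of mine. The idea: the first subtags must agree (or the range starts with "*"), a "*" later in the range is simply skipped, and non-matching subtags in the tag may be skipped unless they are singletons such as "x".

```python
def extended_filter_match(language_range: str, tag: str) -> bool:
    """Return True if `tag` matches `language_range` under extended filtering.

    Hypothetical helper; subtags are compared case-insensitively.
    """
    range_subtags = language_range.lower().split("-")
    tag_subtags = tag.lower().split("-")

    # The first subtag must match exactly unless the range starts with "*".
    if range_subtags[0] != "*" and range_subtags[0] != tag_subtags[0]:
        return False

    ri, ti = 1, 1
    while ri < len(range_subtags):
        if range_subtags[ri] == "*":
            ri += 1                      # wildcard: skip it in the range
        elif ti >= len(tag_subtags):
            return False                 # range wants more than the tag has
        elif range_subtags[ri] == tag_subtags[ti]:
            ri += 1                      # subtags agree: advance both
            ti += 1
        elif len(tag_subtags[ti]) == 1:
            return False                 # never skip past a singleton
        else:
            ti += 1                      # skip an unmatched tag subtag
    return True


if __name__ == "__main__":
    for candidate in ["de-DE", "de-Latn-DE", "de", "de-x-DE"]:
        print(candidate, extended_filter_match("de-*-DE", candidate))
```

Under this reading the wildcard widens what a range accepts; it does not restrict which tags may be written.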
>
> I do not know of any concrete example where the matching behavior
> would be unacceptable. Any claims that it is "unacceptable to
> most users" are therefore, in my view, just made up out of thin air.
>
> Also, I have no idea what is meant by "language-constraining strategy".
> If anybody wanted to restrict the use of certain languages in certain
> parts of the Internet, they could easily already have done that based
> on RFC 3066, or could do so based on RFC 3066bis, or even just based on
> statistical analysis of the actual content transmitted (with techniques
> such as trigrams). And certainly nobody who actually wanted to do such
> a thing would ask for an RFC or other kind of standard to try to
> legitimate such restrictions, nor would I hope anybody would condone
> such behavior just because it would make use of an RFC.
>
> As a result, I don't think that anything needs to be done to address
> comment 3.
>
> >4. All the above means that the Draft is useful in at least two
> >circumstances:
> >   - if the langtags are not well formed or do not respect the principles
> >     of ISO 639-4 and/or RFC 3066 Bis.
> >   - if the langtags are used for other purposes that are undocumented at
> >     the WG-LTRU Charter.
> >These circumstances should be documented.
>
> ISO 639-4 is still being worked on. The possibility of using
> non-well-formed tags is not something the draft is designed for; it is
> just a consequence of not requiring checking for well-formedness (to
> avoid operational problems). The draft explicitly says that there is no
> need to check for well-formedness, so I don't see what would need to be
> documented further.
>
> The draft, like most IETF work, mentions possible uses of the technology,
> in particular as examples to explain design decisions or choices of
> options for the users (in the case of the draft, the direct users are
> protocols and formats). Any attempt to describe any and all possible uses
> for a technology invariably fails, and so shouldn't be attempted in the
> first place.
>
> I therefore don't see any change that would be necessary based on
> comment 4.
>
> >5. The security section should mention that this Draft encourages the
> >disrespect of the RFC 3066 Bis format and further assists dangerous
> >projects that the IETF has refused to mention in RFC 3066 Bis, such as
> >lingual, cultural, racial, and religious profiling through
> >retro-meta-spam ("I know who you are through which langtags you are not
> >aware that you respond to"), two-tier Internet based upon the lingual
> >characteristics of the users and their supposed market value, lack of
> >conformance to ISO 11179, which may lead the IETF, stakeholders, and
> >users to inadequate, costly, and delaying strategies or to conflicts with
> >the Multilingual Internet - as in the sad DoS against the leading
> >economic language ("en-EU") - or to legal access bans by democratic or
> >privacy oriented countries. All of this lends itself to incentives for an
> >Internet fragmentation.
>
> As explained above, there is no disrespect for RFC 3066bis formats, just
> operational considerations. Also, the draft does not assist any of the
> 'dangerous projects' mentioned above; all of these projects are, if some
> entity is determined to do them and has the necessary access, easily
> possible with various other means.
>
> The problem that RFC 3066bis does not allow en-EU is a problem of RFC
> 3066bis, and may have to be addressed in a future revision, but does not
> affect the matching draft now in last call.
>
> Again, I don't see anything here that would need to be changed in the
> current draft to address this comment.
>
>
> >6. As far as I understand, two Draft compliant filters may result in
> >different responses for the same filtering list and document. My concern
> >is the interoperability of the proposed BCP 47 with Multilingual Internet
> >registries, tags, etc. This interoperability is not ensured - and there
> >is no prospect to see it ensured as it is purposely ignored by authors.
> >This represents no incentive for developers.
>
> The three matching schemes described in the draft all come with a small
> number of options. In this sense, it is e.g. possible that two different
> protocols, having chosen different options, will lead to different
> results. An example would be an HTTP server serving a document in a
> different language than a corresponding FTP server (assuming somebody
> added language negotiation to FTP) for requests with the same language
> priority list. But as such a setup is highly fictional, and the two
> servers are configured separately anyway, having different results simply
> because of different configuration (rather than different language
> matching), or tweaking the configuration to make the results match, are
> both possible. So this kind of interoperability is not of importance in
> practice.
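
As an illustration of how one such option changes the outcome, here is a small Python sketch of lookup as I understand it from the draft: each range in the priority list is truncated from the end until it equals an available tag, a singleton subtag such as "x" is dropped together with the subtag that follows it, and "*" is skipped. What happens when nothing matches is left to the protocol; the `default` parameter and the two "server" configurations below are hypothetical.

```python
from typing import Iterable, Optional, Sequence


def lookup(priority_list: Sequence[str],
           available: Iterable[str],
           default: Optional[str] = None) -> Optional[str]:
    """Return the single best available tag, or `default` if none matches."""
    by_lower = {tag.lower(): tag for tag in available}
    for language_range in priority_list:
        if language_range == "*":
            continue  # the wildcard range is skipped in this sketch
        subtags = language_range.lower().split("-")
        while subtags:
            candidate = "-".join(subtags)
            if candidate in by_lower:
                return by_lower[candidate]
            subtags.pop()  # truncate from the end
            # A singleton left at the end is removed along with what followed.
            while subtags and len(subtags[-1]) == 1:
                subtags.pop()
    return default


if __name__ == "__main__":
    available = ["en", "ja", "it", "zh-Hant"]
    request = ["fr-CH", "de"]
    # Two hypothetical servers that differ only in their configured default.
    print(lookup(request, available, default="en"))  # en
    print(lookup(request, available, default="ja"))  # ja
```

The matching itself is identical in both calls; only the configured fallback differs, which is the kind of divergence described above.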
>
> Also, it is difficult to try to ensure interoperability with something
> called 'Multilingual Internet registries' when such a thing exists
> neither on paper nor in practice.
>
> So there is nothing in comment 6 that would require any changes
> in the current draft.
>
>
> >7. The acknowledgement section mostly quotes those who contributed to the
> >pre-WG-LTRU document (the three WG-LTRU Drafts existed prior to the
> >creation of the WG which never studied and tried to conform to its
> >Charter). This is the privilege of authors to quote who supported them
> >best. However, in this case the document was considerably cleaned through
> >the tough life of the WG. Also, most of the names being quoted are widely
> >known as belonging to a non-IETF affinity group, which reinforces the
> >external understanding that BCP 47 documents are actually not IETF
> >documents. This will most probably limit their consideration. After one
> >year of tough debates I can testify these, good or not, three documents
> >are IETF documents for the Internationalized ASCII Internet. I listed the
> >names I consider missing in a Last Call mail.
> >
> >Authors either did not read it or keep harassing me, since they continue
> >asking for an input. I quote my mail: "every contribution can be a key
> >stone in the final construct. That people like Michael Everson, Ned
> >Freed, Lee Gillam, John C. Klensin, Felix Sasaki, Michel Suignard, and
> >Tex Texin are not quoted seems odd. Others like Scott Hollenbeck and Sam
> >Hartman really helped. What about Karen Broome, M.T. Carrasco Benitez, N.
> >Piercei? Inputs or help from Brian Carpenter, Ted Hardie, Dylan N. Pierce
> >are real.". I do not ask my name to be listed there since I know "it is
> >not [the] interest [of some in the list] to be associated with [my own]
> >name".
> >
> >Or would that mean that the IETF does not really back this deliverable?
> >The question here is: does the IETF want to influence the world (RFC
> >3935), document the Internationalised ASCII Internet, or serve the
> >Multilingual Internet development. Many would like to know.
>
> The claim that the acknowledgement section mostly quotes those who
> contributed to the pre-WG documents is in stark contrast with the
> editors' description of how they formed that list of names. Most if not
> all the people mentioned above are acknowledged by reference, pointing to
> acknowledgement sections in related documents. In my function as
> co-chair, I have asked if anybody feels left out both on the WG list and
> on the IETF list; I have not received anything from anybody.
>
> As a technical contributor, I don't think there is any need for any change
> here.
>
>
> >All the best.
> >jfc
> >
> >PS. Having transparency in mind I copy the IETF main list. This LC ends
> >tomorrow. I do not intend to address the comments. But I will certainly
> >consider them in the appeal I suspect to be unfortunately necessary (NB.
> >Before their first day decision to keep with a twice IETF LC failed
> >document, I proposed the WG-LTRU Chairs to co-write the Drafts so we
> >could finish the work in a few months. I obviously eventually get step by
> >step all what I wanted - in the documents or in the real world: but what
> >a waste of time and effort). Cheers.
>
> I remember well that especially at the start of the WG, the WG co-chairs
> repeatedly asked for actual textual contributions. These were few and far
> between, and were usually rejected by the WG after some discussion.
>
>
> Regards,    Martin.
>
>
>
> #-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
> #-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp
>
>
> _______________________________________________
> Ltru mailing list
> Ltru@ietf.org
> https://www1.ietf.org/mailman/listinfo/ltru
>