Re: [Lucid] FW: [mark@macchiato.com: Re: Non-normalizable diacritics - new property]
Asmus Freytag <asmusf@ix.netcom.com> Thu, 19 March 2015 04:21 UTC
Message-ID: <550A4EC6.3090203@ix.netcom.com>
Date: Wed, 18 Mar 2015 21:21:26 -0700
From: Asmus Freytag <asmusf@ix.netcom.com>
To: lucid@ietf.org
References: <20150311013300.GC12479@dyn.com> <CA+9kkMDZW9yPtDxtLTfY1=VS6itvHtXHF1qdZKtXdwwORwqnew@mail.gmail.com> <55008F97.8040701@ix.netcom.com> <CA+9kkMAcgSA1Ch0B9W1Np0LMn2udegZ=AzU1b26dAi+SDcbGgg@mail.gmail.com> <CY1PR0301MB07310C68F6CFDD46AE22086F82190@CY1PR0301MB0731.namprd03.prod.outlook.com> <20150311200941.GV15037@mx1.yitter.info> <CY1PR0301MB0731F4EBE5EB5C3340F7059282190@CY1PR0301MB0731.namprd03.prod.outlook.com> <20150319014018.GI5743@mx1.yitter.info> <BLUPR03MB1378184CE32E928A3086665582010@BLUPR03MB1378.namprd03.prod.outlook.com> <20150319023029.GA6046@mx1.yitter.info>
In-Reply-To: <20150319023029.GA6046@mx1.yitter.info>
Archived-At: <http://mailarchive.ietf.org/arch/msg/lucid/IvWKLvce3V6DRnZK2JNSaer9IDg>

On 3/18/2015 7:30 PM, Andrew Sullivan wrote:
> On Thu, Mar 19, 2015 at 02:11:56AM +0000, Shawn Steele wrote:
>>> At every level. DNS names are exact-match. Each identifier is unique.
>> For the machine, sure. But if you throw in font weirdness and stuff, they become non-unique (to humans, not to machines) even in ASCII. To make them be more unique one has to impose more rules, like "use a decent font", "lowercase everything". Etc.
>>
> Yes. That's what we're trying to understand.
>
>> No, even all NFC or NFKC would be 100% unique to the machine
> This is either tautologically true, or false. Certainly we learned
> with IDNA2003 that NFKC doesn't work, because while it's good for
> increasing match probability the identifiers aren't stable. So when
> they're handed around through different environments, stuff happens
> that is bad.

As in, buggy implementations?

>
>> What I'm questioning is how unique is good enough?
> Surely that's part of what we're trying to understand?

I understood the IETF concern this way:

"While we all know that human perception can be tricked, and poor rendering doesn't help, we are concerned about the case where careful, conscientious users cannot tell apart two identifiers that the protocol deems unique."

While this looks superficially like the normalization issue, it is not, but it may be (perhaps partially) something that can be addressed at the protocol level with reasonable cost.

So why not deal with it there?

Yes, this doesn't address the rest of the human perception issues, but because the subset we identified doesn't really hinge on any human inadequacies, addressing the current issue should not interfere with other solutions that address these inadequacies.

>
>> We do seem to have some desire to be linguistic. Otherwise Sharp-S and Greek didn't need touched.
> We have people who want identifiers that work as useful things in
> their writing systems. And sharp-s and sigma seemed to be cases where
> people were quite surprised, which means that the identifiers are less
> useful (because if the identifier system furnishes you with surprises,
> that's inconvenient). This new set of cases is in fact another set of
> lurking surprises, which is why some of us are concerned about these
> cases.

As a btw: I'm amazed at the near total lack of IDNs registered for the Latin script in the root. It seems that people like the "fall-back" nature of non-accented ASCII labels for anything that should be accessed universally (top level).

So, for that script at least, you could say that users don't like being surprised by a more linguistically accurate, but less universally accessible way of constructing identifiers.

Interesting....

A./
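
The distinction drawn above between this issue and ordinary normalization can be sketched in Python. This is only a minimal illustration under stated assumptions: the message names no specific code points, so the pair U+08A1 versus U+0628 U+0654 is assumed here as a representative "non-normalizable" case, and the helper name same_after is hypothetical.

```python
import unicodedata

# Assumed example pair (not named in this message):
# U+08A1 ARABIC LETTER BEH WITH HAMZA ABOVE, a single code point with no
# decomposition, versus U+0628 ARABIC LETTER BEH + U+0654 ARABIC HAMZA ABOVE.
single = "\u08A1"
sequence = "\u0628\u0654"

# Contrast pair that normalization does unify: precomposed e-acute versus
# "e" followed by U+0301 COMBINING ACUTE ACCENT.
precomposed = "\u00E9"
decomposed = "e\u0301"


def same_after(form, a, b):
    """Return True if a and b are identical after normalizing to `form`."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


for form in ("NFC", "NFD", "NFKC", "NFKD"):
    # The e-acute pair compares equal under every form; the assumed Arabic
    # pair stays distinct, so an exact-match protocol treats it as two
    # different identifiers.
    print(form, same_after(form, precomposed, decomposed),
          same_after(form, single, sequence))
```

Under these assumptions the two Arabic strings stay distinct to the machine after any of the four normalization forms, which is why normalization alone would not address this class of cases.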