Re: [iucg] Unicode 7.0.0, (combining) Hamza Above, and normalization for comparison
JFC Morfin <jefsey@jefsey.com> Wed, 06 August 2014 12:01 UTC
Date: Wed, 06 Aug 2014 14:01:29 +0200
To: Patrik Fältström <paf@frobbit.se>, Mark Davis <mark@macchiato.com>
From: JFC Morfin <jefsey@jefsey.com>
Archived-At: http://mailarchive.ietf.org/arch/msg/iucg/jA-l_zBniPoSLumyDJo_lbg76OE
Cc: Marc Blanchet <Marc.Blanchet@viagenie.ca>,
IDNA update work <idna-update@alvestrand.no>, "iucg@ietf.org" <iucg@ietf.org>,
gerard lang <gerard_lang@orange.fr>
Subject: Re: [iucg] Unicode 7.0.0, (combining) Hamza Above,
and normalization for comparison
At 07:03 06/08/2014, Patrik Fältström wrote:
>To be honest, I do not think it matters where it is discussed.

I suggest we keep the discussion here. The reason is ICANN's response to the plaintiffs in the .ir, etc. case: "the DNS provides a human interface to the internet protocol addressing system". This seems a good definition for us to share, as it is technically true, easy to understand, and makes a clear distinction between the human and the non-human issues.

The most complex issue, the human confusability of the ISO 10646 code points, calls for a visual-to-binary anti-phishing algorithm. Such an algorithm should be added to the IDNA table, allowing registries to accept or refuse xn-- registrations based upon the domain names already registered.

To start the debate on this issue, I would suggest one possibility for such an algorithm: a mathematical proximity measure that discriminates confusable characters by comparing their 32x32 rasterizations (i.e. 1024-bit structured strings).

I note that this also implies a common font of reference. I do not think this is a problem, as it is on the human side and conflicts will be subject to courts: what counts is the font that local law will consider. It is up to each ccTLD to provide that information and to have it added to ISO 3166, which already lists the administrative languages. We should get those renamed anyway as standardization languages, coupled with the accepted script(s).

Initial questions:
1. What is the URL of the complete Unicode code point table (value/description)?
2. I found rasterizations made for different scripts, but not for all.

jfc
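To make the 32x32 proposal concrete, here is a minimal sketch in Python of one possible proximity measure. The message does not name a metric, so this assumes Hamming distance (the count of differing pixels) between two glyph bitmaps packed as 1024-bit integers. The function names, the threshold of 16 pixels, and the synthetic bar glyphs are all hypothetical. Real input would come from rasterizing code points in the agreed reference font, which is not shown here.

```python
SIZE = 32  # the message proposes 32x32 rasterizations, i.e. 1024 bits per glyph


def pack_bitmap(rows):
    """Pack 32 strings of 32 '0'/'1' characters into one 1024-bit integer."""
    if len(rows) != SIZE or any(len(row) != SIZE for row in rows):
        raise ValueError("expected a 32x32 bitmap")
    return int("".join(rows), 2)


def hamming_distance(a, b):
    """Count the pixels that differ between two packed bitmaps."""
    return bin(a ^ b).count("1")


def is_confusable(a, b, threshold=16):
    """Hypothetical rule: glyphs differing by at most `threshold` pixels
    are treated as confusable. The threshold is an assumption, not a
    value taken from the message or from any standard."""
    return hamming_distance(a, b) <= threshold


# Synthetic glyphs, for illustration only (not real font data).
blank_row = "0" * SIZE
bar_row = "0" * 15 + "11" + "0" * 15  # two-pixel-wide vertical stroke

vertical_bar = pack_bitmap([bar_row] * SIZE)
notched_bar = pack_bitmap([blank_row] + [bar_row] * (SIZE - 1))
horizontal_bar = pack_bitmap(
    [("1" * SIZE if y in (15, 16) else blank_row) for y in range(SIZE)]
)

print(hamming_distance(vertical_bar, notched_bar))     # near-identical shapes
print(hamming_distance(vertical_bar, horizontal_bar))  # clearly different shapes
```

A registry-side check along these lines would compare each glyph of a requested label against the glyphs of already registered labels. Hamming distance is sensitive to small shifts and stroke-weight differences, so a practical measure would probably need alignment or a more tolerant metric.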