Re: [idn] homographs in TrueType fonts

"Mark Davis" <mark.davis@jtcsv.com> Sat, 07 May 2005 00:09 UTC

Message-ID: <014901c55297$e2602be0$7a783009@sanjose.ibm.com>
From: Mark Davis <mark.davis@jtcsv.com>
To: Erik van der Poel <erik@vanderpoel.org>, idn@ops.ietf.org, mb-secissues@opera.com
References: <427A9171.2030409@vanderpoel.org>
Subject: Re: [idn] homographs in TrueType fonts
Date: Fri, 06 May 2005 17:01:13 -0700

Erik, I updated the file at
http://unicode.org/reports/tr36/draft/confusables.txt, incorporating your
list (and others).

There is also the file
http://unicode.org/reports/tr36/draft/confusables-raw.txt, which contains
the raw data; it is not reconciled and not closed. The items with a number or
'skip' in the third field are usually from your data. I do remove some of
your entries for identical glyphs, because those are basically font bugs.

Anyway, comments welcome.

Mark

----- Original Message ----- 
From: "Erik van der Poel" <erik@vanderpoel.org>
To: <idn@ops.ietf.org>
Sent: Thursday, May 05, 2005 14:34
Subject: [idn] homographs in TrueType fonts


> I have written a small program that parses a number of TrueType font
> tables to determine which pairs of Unicode codepoints end up using the
> same glyphs. The ASCII part of the table is included below. Each line
> has a codepoint, its glyph, the other codepoint of the pair, and the
> number of fonts in which that pair is identical.
>
> U+2044 and U+2215 use the same glyph as the slash (U+002F) in a few East
> Asian fonts. Note also that the capital letters I and O have homographs,
> although some apps present domain names in lower case, so those
> homographs would stand out in those apps. For the complete table, see:
>
> http://nameprep.org/tt-hg.html
>
> Erik
>
> 0021(!);01C3;2
> 0022(");02BA;4
> 0022(");05F4;12
> 0027(');0060;1
> 0027(');02B9;4
> 0027(');05F3;12
> 0027(');2032;6
> 0028(();FD3E;3
> 0029());FD3F;3
> 002C(,);201A;9
> 002D(-);2010;12
> 002D(-);2012;1
> 002D(-);2013;2
> 002F(/);2044;3
> 002F(/);2215;4
> 003A(:);05C3;1
> 003C(<);2039;1
> 003E(>);203A;1
> 0049(I);04C0;4
> 004F(O);2D54;1
> 005C(\);00A5;2
> 005C(\);20A9;8
> 0060(`);0300;1
> 0061(a);03B1;4
> 0061(a);0430;52
> 0063(c);0441;51
> 0064(d);0501;1
> 0065(e);0435;55
> 0066(f);0192;1
> 0067(g);0261;2
> 0068(h);04BB;10
> 0069(i);0456;60
> 006A(j);03F3;3
> 006A(j);0458;57
> 006D(m);0442;15
> 006E(n);043F;13
> 006F(o);03BF;48
> 006F(o);043E;52
> 006F(o);0585;1
> 006F(o);1D0F;1
> 0070(p);0440;53
> 0073(s);0455;57
> 0075(u);0438;14
> 0076(v);03BD;27
> 0076(v);03C5;1
> 0076(v);0475;2
> 0078(x);03C7;2
> 0078(x);0445;46
> 0079(y);0443;48
> 007C(|);01C0;1
>
>
>
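
The quoted table uses one record per line: a hex codepoint, its glyph in parentheses, the other codepoint of the pair, and the font count, separated by semicolons. A minimal sketch of reading that format follows. It is not Erik's program; the names parse_pairs and common_homographs and the threshold of 40 fonts are assumptions made for illustration. The regular expression takes exactly one character between the parentheses, so the entries for "(" and ")" themselves still parse.

```python
import re

# One record: HEX(glyph);HEX;COUNT, e.g. "0061(a);0430;52".
# The glyph is exactly one character, which may itself be "(" or ")".
LINE_RE = re.compile(r"^([0-9A-F]{4,6})\((.)\);([0-9A-F]{4,6});(\d+)$")


def parse_pairs(text):
    """Return (codepoint, glyph, other_codepoint, font_count) tuples."""
    pairs = []
    for raw in text.splitlines():
        # Drop the "> " quoting that a mail reply adds, then whitespace.
        line = raw.lstrip("> ").strip()
        match = LINE_RE.match(line)
        if match is None:
            continue  # prose, blank lines, or anything not in table format
        base, glyph, other, fonts = match.groups()
        pairs.append((int(base, 16), glyph, int(other, 16), int(fonts)))
    return pairs


def common_homographs(pairs, min_fonts=40):
    """Keep pairs identical in at least min_fonts fonts, most common first.

    The default threshold is an arbitrary illustrative value.
    """
    kept = [p for p in pairs if p[3] >= min_fonts]
    return sorted(kept, key=lambda p: p[3], reverse=True)


# A few records copied from the quoted table above.
SAMPLE = """\
> 0061(a);0430;52
> 0028(();FD3E;3
> 0069(i);0456;60
> 002F(/);2215;4
"""

if __name__ == "__main__":
    for base, glyph, other, fonts in common_homographs(parse_pairs(SAMPLE)):
        print(f"U+{base:04X} ({glyph}) ~ U+{other:04X} in {fonts} fonts")
```

Filtering on the font count is one possible way to separate widespread look-alikes (such as Latin and Cyrillic letters) from pairs that coincide in only one or two fonts.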