Re: Fwd: Unicode 7.0.0, (combining) Hamza Above, and normalization for comparison
John C Klensin <klensin@jck.com> Wed, 06 August 2014 13:33 UTC
Date: Wed, 06 Aug 2014 09:33:24 -0400
From: John C Klensin <klensin@jck.com>
To: Mark Davis ☕️ <mark@macchiato.com>, IDNA update work <idna-update@alvestrand.no>, Marc Blanchet <Marc.Blanchet@viagenie.ca>
Subject: Re: Fwd: Unicode 7.0.0, (combining) Hamza Above, and normalization for comparison
Message-ID: <E1E14E2EE504828DCD49C3ED@JcK-HP8200.jck.com>
In-Reply-To: <CAJ2xs_H_Gy9b_A5LZj0o9rFffbvbnVGLv+22CD7NhmZhLXE6Rg@mail.gmail.com>
References: <C0D401D76B8D1BA472604BB4@JCK-EEE10> <CAJ2xs_F9+6_+Fz-xFdSGBUV82qmMa33Y8+F9mjinMKx9=YoKcA@mail.gmail.com> <CAJ2xs_H_Gy9b_A5LZj0o9rFffbvbnVGLv+22CD7NhmZhLXE6Rg@mail.gmail.com>
Cc: Asmus Freytag <asmusf@ix.netcom.com>, Ken Whistler <ken.whistler@sap.com>

--On Tuesday, August 05, 2014 16:07 -0700 Mark Davis ☕️ <mark@macchiato.com> wrote:

> I hadn't heard back from John, but I'm guessing that the right place to discuss this is here, based on Marc's email.

Mark (and, by extension, Ken and others who have said much the same thing),

It is certainly a reasonable place. As Patrik suggested, it may not make much difference where it is discussed.

I think we have a fundamental difference in perspectives here or, if you prefer, assumptions about criteria for making decisions. As long as that difference persists, while we can do our best to understand each other and explain those differences carefully, it is not likely that anyone will convince anyone else to change positions.

That situation is further complicated by the observation that I have believed all along (where "all along" goes back prior to Unicode 1.0 to the work that led to ISO 10646 DIS-1) that the nature of human writing systems and their evolution leads to a situation in which there are no perfect solutions, only choices among tradeoffs and compromises. I presume that belief is not controversial: the first sentences of Section 2.2 (Unicode Standard, versions 3 - 6.2 at least) say, I think, almost the same thing.

Many of those reading this will recall, from the earliest days of IDN discussions, that some people argued that the balance struck by Unicode is fundamentally inappropriate for the needs of IDNs and hence that the choice of Unicode as a base was, itself, inappropriate. Some have believed that because they felt that unification lost too much information, others that "compatibility ... with existing standards" led to bad design decisions that even those prior standards would not have made had they not been constrained (e.g., to seven or eight bits) or to arbitrary and duplicative division into "scripts", and so on.
There was even an (IMO extreme) position that IDNs should be based on a phonetic rendering rather than on any sort of coding of writing systems. More of us have taken the position that Unicode represented a reasonable balance --both generally and for IDN needs-- although not a perfect one and that any other existing or hypothetical system would just represent a different balance and different set of tradeoffs and resulting issues.

There were also some fundamental IDN design decisions that had little to do with the structure or contents of Unicode. Perhaps the most important example was that IDNs would not be coded by language or contain any sort of language identifier. That decision involved its own set of tradeoffs: language coding as part of IDN labels would have been possible and would clearly convey some advantages, but, for example, the costs of having false negatives on comparison if a user could not correctly guess the language associated with a string seen "on the side of a bus" were seen as unacceptable. There are also some technical issues resulting from the DNS context in which IDN labels are accessed: even today, thinking persists in some quarters that a top-level domain can be used to identify a language and the language of all of the nodes associated with it. To a [very] limited extent (see below), that could be done as a matter of policy but the DNS structure makes doing it as a technical matter --essentially conditioning comparison of labels deep in the tree based on higher (or TLD) nodes-- impossible.

This note is written in the spirit of a desire to explain, with little expectation of convincing you that my position is, indeed, heading in the right direction.

The "no language coding for IDNs" decision mentioned above has profound implications, especially when multiple languages share a script.
Your experience is obviously much broader, but the vast majority of the Unicode applications to coded character strings I've seen fall into one of three categories:

 * The language context is explicit.

 * Only a subset of Unicode is in use, with that subset chosen for the needs of a particular language community. In some respects, that sort of use is no different from the earlier tradition of choosing or designating (but not necessarily explicitly identifying) a national (or language) standard, "code table", or "code page" and then using it, except that the universality of Unicode is much more elegant (and non-problematic due to statelessness) than, e.g., the designation rules of ISO/IEC 2022.

 * The coded character strings are used for display purposes only and no one cares what the codes are as long as the display on the page comes out the same.

Because of the "no language coding" principle and the need to be able to compare character strings to labels (and labels to each other) with a high degree of accuracy, IDNs do not fall into any of those categories. Subject only to the exclusions (DISALLOWED code points and character sequences not allowed by CONTEXTx or Bidi rules) imposed by IDNA2008 (RFCs 5890-5893) and string length limits, an IDNA label can consist of any sequence of arbitrarily-chosen Unicode characters without regard to language, script, or homogeneity constraints.

Whether it was correct or not in retrospect, one of the assumptions of the IDNA design --an assumption basic enough that we didn't bother to make it explicit in any of the five base documents-- was that, within a script, normalization would be sufficient to cause two different ways of coding the same abstracted glyph form to compare equal. That assumption was based, in part, on strong assurances that few, if any, new precomposed characters would be added and that, if they were, they would decompose back to the previously-existing combining forms.
At least some of us believed that, if there were exception cases of which we should be aware, someone more expert in Unicode would tell us and no one did. Would the knowledge have made any difference to how IDNA was designed? I don't know -- more of those tradeoffs -- but at least we would have been aware that there were cases in which normalization would not behave as we felt we had been led to expect.

Your comments (and earlier ones) about letters in Fula, etc., are extremely interesting and obviously important to Unicode decisions about coding. However, for better or worse, that "no language information" element of the IDNA design renders any distinctions (within a script) based on language irrelevant for IDNA purposes. From that IDNA perspective, U+08A1 (and U+0681 and U+076C) are simply exceptions to the (assumed and possibly incorrect) principle that NFC or NFD normalization is sufficient to assure IDNA comparison accuracy, where "IDNA comparison" is bound to the appearance of the character forms given a choice of type styles but without language information or even language-specific rendering.

In retrospect, that assumption about comparison integrity under normalization may have been naive. Perhaps we would even have done some things differently had we known that a half-dozen years ago. The question is what to do today.

In that regard, the I-D really has two pieces. One is an explanation of the situation. It could (and would) clearly be written differently if IDN labels were language-sensitive within a script or if we could allow different comparison rules for different scripts, but those considerations and options are irrelevant to IDNA except for historical perspective on how we got here. The other is the more operational question of whether U+08A1 should be exceptionally DISALLOWED and, if not, what should be done with it.
It seems to me there are only a few possibilities:

(i) To DISALLOW it as the draft now suggests, acknowledging that doing so creates some issues for anyone intending to include Arabic/Ajami forms of Fula-specific characters or label strings or substrings in the DNS. Doing so requires accepting the inconsistency with U+0681 and U+076C on the assumption that it is too late to change how those characters (or the composing sequences that could mechanically produce them) are handled.

(ii) To DISALLOW U+08A1 and do something unpleasant, despite the risks of invalidating now-valid labels, with U+0681 and U+076C as well, thereby ending up in a consistent state. Or to somehow figure out how to DISALLOW the combining sequences associated with all three.

(iii) To take the "the problem has existed for a very long time" argument to the conclusion it seems to me that you are arguing for, which is that the existence of other cases means that this case has to be allowed too and doesn't make things significantly worse. That may be the right answer for this case but I have to admit that, as a principle, it makes me very nervous, in part because those criteria in Section 2.2 have, necessarily, not been applied consistently across scripts and languages (the text more or less says that -- the comment is not a criticism).

(iv) To conclude that a requirement for NFC is not adequate for IDNA and create a new exception category that applies special, IDNA-specific, normalization rules that would have the effect of disallowing certain character sequences that cannot be disallowed by the current contextual rules and then apply that category to these in-script "it is really a character for that language but only a combining sequence for those others" situations.

Again, all of this is a result of the IDNA "no language information" design constraint.
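
One way to picture option (iv) is as a small exception table that is consulted after NFC. The Python sketch below is only an illustration under stated assumptions: the table name, the function names, and the choice to reject (rather than map) the sequences are hypothetical and are not taken from the draft or from IDNA2008. The three entries are the BEH, HAH, and REH cases discussed above. The check only catches a base letter immediately followed by U+0654; sequences with other combining marks in between are not handled.

```python
import unicodedata

HAMZA_ABOVE = "\u0654"  # ARABIC HAMZA ABOVE (combining mark)

# Hypothetical exception table (an assumption for illustration only):
# combining sequence -> precomposed letter that has no canonical
# decomposition, so NFC never makes the two codings compare equal.
HYPOTHETICAL_EXCLUDED_SEQUENCES = {
    "\u0628" + HAMZA_ABOVE: "\u08A1",  # BEH + HAMZA ABOVE vs. U+08A1
    "\u062D" + HAMZA_ABOVE: "\u0681",  # HAH + HAMZA ABOVE vs. U+0681
    "\u0631" + HAMZA_ABOVE: "\u076C",  # REH + HAMZA ABOVE vs. U+076C
}


def find_excluded_sequences(label):
    """Return the listed sequences that survive NFC inside the label."""
    normalized = unicodedata.normalize("NFC", label)
    return [seq for seq in HYPOTHETICAL_EXCLUDED_SEQUENCES if seq in normalized]


def is_acceptable(label):
    """True if the label contains none of the listed combining sequences."""
    return not find_excluded_sequences(label)
```

Under this sketch a label containing U+08A1 itself would pass, while the same visual form coded as U+0628 U+0654 would be rejected, which removes the duplicate coding without touching the Fula letter.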
To say things equivalent to "NFC works this way, its normalizations have been defined this way, and it is incorrect to say that things 'should' work some other way" is, in that context, not a comment about my draft as much as it is an assertion that NFC, as defined, is less appropriate for IDNA than we might have thought. That may or may not lead to the conclusion that (iv) above is the proper solution for IDNA, that (i) and (ii) are just poor surrogates for it, and that (iii) is just a form of denial. As I have said before, none of this is in any way a criticism of Unicode decisions made with other contexts in mind.

All I can say personally about those four options right now is that I think it is very important that we document the problems, the risks that go with them, and, by extension, the difficulties created for IDNA by those decisions that [otherwise] make sense for Unicode. Which of those options is then chosen may be of less importance.

Especially given the language issue, I don't think it is worth going through your note below point by point. I should, however, comment on one IDNA-related issue. You (or whoever you are quoting) wrote:

> There are four levels at which confusables, including homoglyphs, can be addressed for domain names
>
> 1. Encoding
> 2. Protocol (IDNA)
> 3. Label Generation Ruleset
> 4. String Review

> A more natural level [for addressing confusables] would be the Label Generation Ruleset level.
>...

Unless "string review" is intended to include it, there is at least one more level. Since RFC 1591 or earlier (i.e., long before IDNs) any domain registry, at any level of the tree, can apply rules to the particular labels that will be accepted for entry into its zone, rules that are more restrictive than the constraints of the relevant protocols.
We usually assume that such restrictions will further narrow whatever is permitted at the level above and, often, that it will propagate downward, but there is no technically-plausible way to enforce those relationships (what can be done by draconian administration is another matter).

That level and set of distinctions are important because of their corollaries. First, the LGR process is applicable only to ICANN-delegated TLDs (I believe strictly only those created as part of the new gTLD process since other rules apply to IDN ccTLDs). As far as I know, no one has seriously suggested that they be imposed on second-level registrations within ICANN-contracted TLDs, much less on other second-level registrations or registrations below that level. Second, despite the obvious relationship between "two identical character forms within the same script, coded in different ways, do not compare equal" and confusability, my concern about U+08A1 (and similar situations) is not about the rather subjective issue of confusability; it is about IDNA's expectations about the properties of normalization and their relationship to equality comparison.

By itself, the confusability issue, as you have pointed out several times, is mostly a cross-script similarity issue. ICANN has chosen to sweep many examples of those cross-script similarities away by a prohibition on mixed-script labels (with "COMMON", etc., as special cases) but, again, that prohibition is effective for labels created in ICANN-created zones and perhaps by contracted parties and is nothing more than a recommendation for other zones. (I note that the IDNA work considered a protocol prohibition on mixed-script labels and decided against it because such labels make considerable sense in some cases, especially ones that were expected to occur lower in the tree).

Now, perhaps the best (or least-bad) solution for the present situation would be to combine (iv) with a strong warning about this and other situations in which normalization is insufficient to provide a good language-independent test of visual equality within a script but without language or writing system information, enumerating those cases, and advising registries at all levels of the tree to take appropriate precautions. I can't speak for others, but I'd welcome suggested text and a list of cases along those lines. I'm skeptical about the adoption of such a list as long as UTR 46, and other efforts encouraged by it, essentially discourage adoption of IDNA 2008, but that is just my opinion.

best regards,
    john

> ---------- Forwarded message ----------
> From: Mark Davis ☕️ <mark@macchiato.com>
> Date: Wed, Jul 30, 2014 at 7:38 AM
> Subject: Re: Unicode 7.0.0, (combining) Hamza Above, and normalization for comparison
> To: John C Klensin <john+w3c@jck.com>, Patrik Fältström <paf@frobbit.se>
> Cc: member-i18n-core@w3.org, Asmus Freytag <asmusf@ix.netcom.com>
>
> On what email address is this being discussed? I'd like to convey to that list some comments from an internal discussion about draft-klensin-idna-5892upd-unicode70-00.txt.
>
> (These are not my wording, but I agree with them. I edited slightly for flow. I will add that from a confusability standpoint, the proposed draft accomplishes nothing, since there are thousands of cases of confusable characters; restricting just this one character has no useful effect; like removing a quart of water from a lake.)
>
> For U+08A1, this certainly is a *letter* of Fula (Fulfulde, Pula, ...), a large language spoken across swaths of West Africa. Fula is mostly written with the Latin script, but Islamists also write it in Ajami (Arabic extensions for African languages), particularly in Guinea.
> See:
>
> http://en.wikipedia.org/wiki/Fula_orthographies
>
> The *letter* in question is the one used to write the phoneme /ɓ/, the bilabial implosive. See:
>
> http://en.wikipedia.org/wiki/%C6%81
>
> for the African alphabet convention for the Latin writing of this letter.
>
> For the Arabic Ajami alphabets for Fula, the form has been missing. For whatever reason, in at least one Fulfulde Ajami orthography, this implosive was (reasonably) represented by using a Hamza diacritic on the beh letter. Following the way such *diacritic* (ijam) letter derivations are encoded in the Unicode Standard, a separate, non-decomposed entry was required. Note that this use of Hamza is *different* from the Arabic (language) use of a combining Hamza to indicate a glottal stop, often in combination with a letter that is actually pronounced as a vowel.
>
> As to *why* it was encoded as a single, undecomposed letter, that is explained at length in the proposal document, as well as in the section on Hamza in the Unicode Standard, which you have referred to in the Internet Draft you mention.
>
> The newly encoded character U+08A1 for Unicode 7.0 has *already* been added to the relevant table "Arabic Letters With Hamza Above" in the draft core specification for Unicode 7.0, where, like the long-encoded U+0681 and U+076C, it is noted as having no decomposition. (The core specification will be posted around October -- it is still undergoing its final editing for all the 7.0 additions.)
>
> U+08A1 does not have a canonical decomposition in Unicode 7.0 (nor, of course, will it *ever* have a canonical decomposition, because of normalization stability). This is exactly the same treatment that U+0681 and U+076C got, and for exactly the same reasons. (And, as you know, of course, those characters date back to Unicode 4.1 for U+076C and even earlier, Unicode 1.1 for U+0681.)
>
> Note that it is incorrect to assert that U+08A1 ARABIC LETTER BEH WITH HAMZA ABOVE "should" be normalized to U+0628 ARABIC LETTER BEH + U+0654 ARABIC HAMZA ABOVE. Those are distinct sequences, and they are never going to compare equal in their NFC normalizations.
>
> I am concerned that the Internet Draft here is heading in exactly the wrong direction. If it ends up changing RFC 5892 to override the derivation for U+08A1 and force it to INVALID, all I can see that accomplishing is to guarantee forever that correctly spelled Ajami Fulfulde cannot be used in domain names, and that instead people would end up having to use misspellings to represent their implosive b in domain names.
>
> With all due respect to the Arabic script experts that have been consulted, I rather doubt that they are experts on Ajami orthographies in West Africa, or are in touch with the people who would be supporting those languages and implementing keyboarding and such for West Africa.
>
> Also, I don't see any way you can justify the abrupt (and permanent) discontinuity that this would put in place between the treatment of U+08A1 for Fulfulde and U+076C for Ormuri or U+0681 for Pashto.
>
> If you are looking for a more analogous precedent I suggest, for example:
>
> U+2C65 LATIN SMALL LETTER A WITH STROKE
>
> That was added in Unicode 5.0, and nobody has ever had any problem with it being PVALID in IDNA. It only has limited use in a minor orthography, but what is the harm?
>
> Now, if you examine U+2C65, you could well claim that it *should* be decomposed to "a" plus the combining stroke overlay, U+0338. And both of those have been encoded for a long, long time in the standard, so in principle, somebody *could* have been representing their data for a letter a with stroke before Unicode 5.0 using the sequence with the stroke overlay. It might even look o.k. in text, depending on the font support for the combina̸tion.
> But the Unicode Standard has rules now for the encoding of certain combinations of base letters and diacritic modifiers that overlay or modify the base character form. So U+2C65 was separately encoded. And there is no normalization of the sequence involved. That stroked letter use is, in text, distinct from somebody, say, using a bunch of overlay strokes as a strikethrough convention for some reason: a̸a̸a̸a̸a̸a̸
>
> Consider the Hamza diacritic as falling in this same class of edge cases, if you will.
>
> And in this case, I don't think it will be doing anybody any favors to update RFC 5892 to make U+08A1 DISALLOWED in IDNA. It doesn't "fix" normalization for it. All it accomplishes is to force any Fulfulde user of Ajami orthography to misspell their text in order to use a /ɓ/ in a domain name. It would just create an unexplained (and unfixable) discontinuity between what the domain registrations would accept and what the Fulfulde input and spelling tools would support. Or I guess it would just force people to give up the Arabic spellings and go back to the more widely supported Latin alphabets for Fula to get their domain names.
>
> What would be accomplished by forcing another point incompatibility that just ends up getting carried around forever?
>
> ====
>
> There are four levels at which confusables, including homoglyphs, can be addressed for domain names
>
> 1. Encoding
> 2. Protocol (IDNA)
> 3. Label Generation Ruleset
> 4. String Review
>
> A more natural level [for addressing confusables] would be the Label Generation Ruleset level. For an LGR, there are three ways to deal with homoglyphs, one of which is not available on the protocol level. The first two of these are to rule out a code point (by not including it in the LGR's repertoire), or to rule out a code point or sequence conditionally.
> Unlike using these methods on the Protocol level, doing so on the LGR level means that it is possible to be more restrictive, say, for the root of the DNS than for domains several levels down the tree. The downside of using the LGR is, of course, that it is specific to the given zone on the internet.
>
> The upside is that an LGR has additional mechanisms, such as defining a "blocked" variant. That creates an "either/or" situation, where both are permitted, but not at the same time in the same position of an otherwise identical label. This is a very nice solution for a number of confusables/homoglyphs that are systemic (not dependent on accidents of rendering or "arms length" similarity).
>
> Unlike the final level, String Review, an LGR has the advantage of being applied mechanically without any case-by-case review, which is why it's appropriate for cases like the one that gave rise to this discussion.
>
> In principle, both the Label Generation Ruleset and the String Review are created/carried out by people/entities that have access to the necessary and specific linguistic and script expertise, unlike IDNA which seems to be created largely by protocol experts.
>
>
> On Tue, Jul 22, 2014 at 12:22 AM, John C Klensin <john+w3c@jck.com> wrote:
>
>> Hi. I was asked to forward the announcement of this Internet Draft to this group once it was posted. See attached.
>>
>> For information -- comments welcome, but the core issue may be rather specific to concerns that surround IDNs and IDNA. Or not.
>>
>> Of course, if I/we are still completely confused, corrections and explanations would be welcome.
>>
>> john
>>
>>
>> ---------- Forwarded message ----------
>> From: internet-drafts@ietf.org
>> To: i-d-announce@ietf.org
>> Cc:
>> Date: Mon, 21 Jul 2014 04:03:58 -0700
>> Subject: I-D Action: draft-klensin-idna-5892upd-unicode70-00.txt
>>
>> A New Internet-Draft is available from the on-line Internet-Drafts directories.
>>
>>
>>         Title           : IDNA Update for Unicode 7.0.0
>>         Authors         : John C Klensin
>>                           Patrik Faltstrom
>>         Filename        : draft-klensin-idna-5892upd-unicode70-00.txt
>>         Pages           : 10
>>         Date            : 2014-07-21
>>
>> Abstract:
>>    The current version of the IDNA specifications anticipated that each new version of Unicode would be reviewed to verify that no changes had been introduced that required adjustments to the set of rules and, in particular, whether new exceptions or backward compatibility adjustments were needed. That review was conducted for Unicode 7.0.0 and identified a problematic new code point. This specification updates RFC 5892 to disallow that code point and provides information about the reasons why that exclusion is appropriate. It also applies an editorial clarification that was the subject of an earlier erratum.
>>
>>
>> The IETF datatracker status page for this draft is:
>> https://datatracker.ietf.org/doc/draft-klensin-idna-5892upd-unicode70/
>>
>> There's also a htmlized version available at:
>> http://tools.ietf.org/html/draft-klensin-idna-5892upd-unicode70-00
>>
>>
>> Please note that it may take a couple of minutes from the time of submission until the htmlized version and diff are available at tools.ietf.org.
>>
>> Internet-Drafts are also available by anonymous FTP at:
>> ftp://ftp.ietf.org/internet-drafts/
>>
>> _______________________________________________
>> I-D-Announce mailing list
>> I-D-Announce@ietf.org
>> https://www.ietf.org/mailman/listinfo/i-d-announce
>> Internet-Draft directories: http://www.ietf.org/shadow.html
>> or ftp://ftp.ietf.org/ietf/1shadow-sites.txt
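
The normalization behavior at issue in this message can be checked with Python's standard `unicodedata` module. The sketch below compares each precomposed letter named in the thread with its "mechanical" combining sequence after NFC. The helper names and the table are illustrative assumptions. The ALEF WITH HAMZA ABOVE pair (U+0623 vs. U+0627 U+0654) is not mentioned in the message; it is added only as a contrasting case in which a canonical decomposition does exist. Results depend on the Unicode version bundled with the Python build, although the comparisons shown here come out the same on builds that predate U+08A1.

```python
import unicodedata


def nfc(text):
    """Normalize text to Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text)


def same_after_nfc(a, b):
    """True if the two codings compare equal once both are NFC-normalized."""
    return nfc(a) == nfc(b)


# (precomposed code point, base letter + combining mark)
PAIRS = {
    "ALEF WITH HAMZA ABOVE (contrast case)": ("\u0623", "\u0627\u0654"),
    "BEH WITH HAMZA ABOVE": ("\u08A1", "\u0628\u0654"),
    "HAH WITH HAMZA ABOVE": ("\u0681", "\u062D\u0654"),
    "REH WITH HAMZA ABOVE": ("\u076C", "\u0631\u0654"),
    "LATIN SMALL LETTER A WITH STROKE": ("\u2C65", "a\u0338"),
}

if __name__ == "__main__":
    for name, (precomposed, sequence) in PAIRS.items():
        verdict = "equal" if same_after_nfc(precomposed, sequence) else "NOT equal"
        print(f"{name}: {verdict} after NFC")
```

Only the ALEF pair compares equal. The other four stay distinct after NFC, which is the gap between normalization and visual identity that the message describes.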