[I18n-discuss] Comments on "troublesome-characters" from Arabic script

Abdulaziz Al-Zoman <azoman@citc.gov.sa> Mon, 24 July 2017 04:42 UTC

From: Abdulaziz Al-Zoman <azoman@citc.gov.sa>
To: "'i18n-discuss@iab.org'" <i18n-discuss@iab.org>
CC: Abdulaziz Al-Zoman <azoman@citc.gov.sa>
Date: Mon, 24 Jul 2017 04:41:34 +0000
Message-ID: <EDEC5B615F83D44981FA2D0DCA9971670131709366@ry0mail1.citc.gov.sa>
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18n-discuss/HqYI-ZI60e4fbBjN6KfqvCJfBXE>
Subject: [I18n-discuss] Comments on "troublesome-characters" from Arabic script

Dear Sir/Madam (i18n-discuss),

Please find below my comments on draft-freytag-troublesome-characters-01.txt, “Those Troublesome Characters: A Registry of Unicode Code Points Needing Special Consideration When Used in Network Identifiers”.

The aim of this Internet-Draft is to create a registry of code points that need special consideration when used in identifiers, so that it can guide system administrators in setting parameters for allowable code points in an identifier system and aid applications in creating security aids for users. I have a substantial concern about which code points should be part of this registry, and hence about how its recipients (software developers, registries, registrars, etc.) will interpret the inclusion of a code point in it.

For example, the registry includes some essential characters (letters); if these characters are restricted or blocked because they appear in the registry, the resulting identifiers may be useless. With respect to the Arabic language in particular, the registry contains a large portion of the basic Arabic alphabet, which could leave only a very limited character set for creating identifiers. The Arabic language has 28 essential alphabet characters, and the registry marks 22 of these 28 as troublesome characters; that is, more than 78% of the Arabic alphabet could not be used if these characters were blocked. This would make the use of the Arabic language in network identifiers impractical, while the actual "troublesome characters" are the non-spacing marks.

Therefore, we suggest that the registry include only the problematic code points, such as non-spacing marks, and not the basic characters (these can be mentioned in the comment field, or in the reason given for including a non-spacing mark). Adding basic characters to the registry may prevent their use in identifiers, while the origin of the problem is the misuse of non-spacing marks. Thus, the registry should cover only the non-spacing marks, indicating which of their uses are risk-free and which are harmful and should be blocked.
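
To illustrate the distinction, here is a minimal Python sketch (not part of the draft). It assumes that the 28 basic letters are the code points listed in BASIC_LETTERS and that the marks of interest are U+064B..U+0655; both lists are my own choice for illustration and do not reproduce the draft's registry. It shows that Unicode already separates the two groups by General_Category (Lo for letters, Mn for non-spacing marks) and reproduces the 22-of-28 figure quoted above.

```python
import unicodedata

# Assumed set of the 28 basic Arabic letters (illustrative, not taken from the draft):
# ALEF, BEH, TEH..GHAIN, FEH..WAW, YEH (TEH MARBUTA and ALEF MAKSURA excluded).
BASIC_LETTERS = [
    chr(cp)
    for cp in (0x0627, 0x0628, *range(0x062A, 0x063B), *range(0x0641, 0x0649), 0x064A)
]

# Assumed sample of Arabic non-spacing marks: FATHATAN..SUKUN plus
# MADDAH ABOVE, HAMZA ABOVE and HAMZA BELOW (U+064B..U+0655).
MARKS = [chr(cp) for cp in range(0x064B, 0x0656)]


def is_nonspacing_mark(ch: str) -> bool:
    """Return True if the character's General_Category is Mn."""
    return unicodedata.category(ch) == "Mn"


def blocked_share(listed: int, total: int) -> float:
    """Fraction of the alphabet lost if every listed letter were blocked."""
    return listed / total


if __name__ == "__main__":
    print(len(BASIC_LETTERS), "basic letters,",
          sum(is_nonspacing_mark(c) for c in BASIC_LETTERS), "of them non-spacing marks")
    print(len(MARKS), "marks,",
          sum(is_nonspacing_mark(c) for c in MARKS), "of them non-spacing marks")
    print(f"share blocked: {blocked_share(22, 28):.1%}")
```

A registry entry keyed on the mark rather than on the base letter would leave every code point in BASIC_LETTERS available for identifiers.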

Sincerely yours,
Abdulaziz Al-Zoman


From: tf-aidn-bounces@meswg.org [mailto:tf-aidn-bounces@meswg.org] On Behalf Of Sarmad Hussain
Sent: 4/Jul/2017 12:15 PM
To: TF-AIDN
Subject: [TF-AIDN] "troublesome-characters" from Arabic script


Dear All,

A new version of draft-freytag-troublesome-characters-01.txt has been posted to the IETF repository. This standards track internet-draft aims to address the IAB Statement<https://www.iab.org/documents/correspondence-reports-documents/2015-2/iab-statement-on-identifiers-and-unicode-7-0-0/> and more.

The draft lists many code points from the Arabic script, which may need the review of the task force.  You can send comments to i18n-discuss@iab.org<mailto:i18n-discuss@iab.org>.

Name:           draft-freytag-troublesome-characters
Revision:       01
Title:          Those Troublesome Characters: A Registry of Unicode Code Points Needing Special Consideration When Used in Network Identifiers
Document date:  2017-06-30
Group:          Individual Submission
Pages:          42
URL:            https://www.ietf.org/internet-drafts/draft-freytag-troublesome-characters-01.txt
Status:         https://datatracker.ietf.org/doc/draft-freytag-troublesome-characters/
Htmlized:       https://tools.ietf.org/html/draft-freytag-troublesome-characters-01
Htmlized:       https://datatracker.ietf.org/doc/html/draft-freytag-troublesome-characters-01
Diff:           https://www.ietf.org/rfcdiff?url2=draft-freytag-troublesome-characters-01

Abstract:

    Unicode's design goal is to be the universal character set for all
    applications.  The goal entails the inclusion of very large numbers
    of characters.  It is also focused on written language; special
    provisions have always been needed for identifiers.  The sheer size
    of the repertoire increases the possibility of accidental or
    intentional use of characters that can cause confusion among users,
    particularly where linguistic context is ambiguous, unavailable, or
    impossible to determine.  A registry of code points that can be
    sometimes especially problematic may be useful to guide system
    administrators in setting parameters for allowable code points in an
    identifier system, and to aid applications in creating security aids
    for users.

regards,
Sarmad