[Ltru] Mostly Proofreading Nits (Was: Re: draft-davis-t-langtag-ext)

CE Whitehead <cewcathar@hotmail.com> Fri, 22 July 2011 13:51 UTC

Hi.  This is a response (mostly proofreading nits) to the working draft you've got at:
http://unicode.org/repos/cldr/trunk/docs/rfc/draft-davis-t-langtag-ext.txt

{ GENERAL:  Thanks very much for this link!
To me the intro has sufficient examples.
Also it's nice to get the list of various conventions/standards for transliterations at the end of the intro.
I do think the paragraphs in it could flow a tiny bit better.

So, for the Intro:
text I think might be removed is enclosed by inverted brackets  > <
text I think might be inserted is marked by {}
All my own comments are indicated as { COMMENT: . . .  } }

1.  Introduction par 2 ff

   "Language tags, as defined by [BCP47], are useful for identifying the
   language of content.  There are mechanisms for specifying variant
   subtags for special purposes.  However, these variants are
   insufficient for specifying content that has undergone
   transformations, including content that has been transliterated,
   transcribed, or translated.  The correct interpretation of the
   content may depend upon knowledge of {how the source script or
   language has affected the transformation and even upon knowledge of}
   the conventions used for the transformation."

{ COMMENT: I don't quite see how the following is an example of needing to specify conventions used for transformation -- what you've been talking about above. }

   "> For example, < suppose that Italian or Russian cities on a map are
   transcribed for Japanese users.  Each name needs to be transliterated
   into katakana using rules appropriate for the specific source and
   target language.  When tagging such data, it is important to be able
   to indicate not only the resulting content language ("ja" in this
   case), but also the source language."

* * *

{ COMMENT:  In the text below I do not think "not only . . . but also" is quite right; we've already been told that the language is important, so this is not new info to introduce with a "but also" clause. You can stress that language is important here, but do you need the "not only . . . but also"? }

   "Transforms such as transliterations may vary depending >not only< on
   the basis of the source and target script, >but< {and} also on the source and
   target language.  Thus the Russian <U+041F U+0443 U+0442 U+0438
   U+043D> (which corresponds to the Cyrillic <PE, U, TE, I, EN>)
   transliterates into "Putin" in English but "Poutine" in French.  The
   identifier could be used to indicate a desired mechanical
   transformation in an API, or could be used to tag data that has been
   converted (mechanically or by hand) according to a transliteration
   method."

   "{In addition, }Many different conventions have arisen for how to transform text,
   even between the same languages and scripts.  For example, "Gaddafi"
   is commonly transliterated from Arabic to English as any of (G/Q/K/
   Kh)a(d/dh/dd/dhdh/th/zz)af(i/y).  Some examples of standardized
   conventions used for transcribing or transliterating text include:"
   . . .

{ COMMENT: I do like having the info. at the end of this section . . . }

* * *

2.1 par 4

   "The t extension is not intended for use in structured data that
   already provides for source and target language identifiers.  For
   example, this is the case in localization interchange formats such as
   XLIFF.  In such cases, it would be inappropriate to use "ja-t-it" for
   the target language tag because the source language tag "it" would
   already be present in the data.  Instead one would use the language
   tag "ja"."

{ COMMENT:  The phrase "already present in the data" is confusing. If I have text in Italian or French transliterated from Latin script to Arabic script, I can of course use the it or fr subtag twice, but this text seems to say that if the language is part of the original tag then you should not mention it again after -t. To me it does, anyway. Otherwise this section is fine. }

* * *

2.1 par 5

   "It is sometimes necessary to indicate additional information about
   the transformation.  This additional information is optionally
   supplied after the source in a series of one or more fields, where
   each field consists of a field separator subtag followed by one or
   more non-separator subtags.  Each field separator subtag consists of
   a single letter followed by a single digit."

{ COMMENT: I personally would insert "As noted" or "As noted earlier" or something similar at the beginning of this paragraph; I also did not see why you say "the transformation" here instead of just "a transformation" in general. }

=>

   "As pointed out in Section 1, it is sometimes necessary to indicate
   additional information about a transformation.  This additional
   information is optionally
   supplied after the source in a series of one or more fields, where
   each field consists of a field separator subtag followed by one or
   more non-separator subtags.  Each field separator subtag consists of
   a single letter followed by a single digit."

* * *

2.1 Editorial Note

   "The data and specification will be available by the time this internet draft has
   been approved."

{ COMMENT:  O.k. for now; I am assuming here you will put in more details, for example a date, by the time you send this draft for approval. }

* * *

From: "Martin J. Dürst" <duerst at it.aoyama.ac.jp>
Date: Thu, 21 Jul 2011 19:14:26 +0900

> On 2011/07/08 6:00, Doug Ewell wrote:
>> Pete Resnick <presnick at qualcomm dot com>  wrote:
> . . .
>> I can't find any indication of where within CLDR the list of allowable
>> values will be located.  Saying they're in core.zip is almost useless.
>> Saying they're in common/bcp47 is better, but I'd still like to know
>> what file name, what XML element, etc.  An example would help.
> Agreed here again.

I tend to agree too.

Best,

--C. E. Whitehead
cewcathar@hotmail.com

> Regards,   Martin.
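
The field structure quoted under 2.1 par 5 (a source, then fields that each start with a letter-plus-digit separator subtag) may be easier to see in a short Python sketch. This is only an illustration and not part of the draft: the function name parse_t_extension and the sample tags are assumptions made for this example, and the sketch deliberately ignores other extensions and private-use subtags.

```python
import re

# A field separator subtag is a single letter followed by a single digit,
# for example "m0".
FIELD_SEP = re.compile(r"^[a-z][0-9]$")


def parse_t_extension(tag):
    """Split a language tag into (target, source, fields).

    Simplified sketch: assumes at most one "-t-" extension and that no
    other extension or private-use singleton follows it.
    """
    subtags = tag.lower().split("-")
    if "t" not in subtags[1:]:
        # No t extension: the whole tag is the target language tag.
        return tag.lower(), None, {}

    t_index = subtags.index("t", 1)
    target = "-".join(subtags[:t_index])

    source_parts = []   # subtags before the first field separator
    fields = {}         # separator subtag -> list of its value subtags
    current = None
    for subtag in subtags[t_index + 1:]:
        if FIELD_SEP.match(subtag):
            current = subtag
            fields[current] = []
        elif current is None:
            source_parts.append(subtag)
        else:
            fields[current].append(subtag)

    source = "-".join(source_parts) or None
    return target, source, fields
```

With this sketch, parse_t_extension("ja-t-it") gives the target "ja" and the source "it" with no fields, while an illustrative tag such as "und-Cyrl-t-und-latn-m0-ungegn-2007" gives the target "und-cyrl", the source "und-latn", and one field "m0" holding the subtags "ungegn" and "2007".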