Re: [Ltru] Fw: I-D Action: draft-falk-transliteration-tags-01.txt

Sorry this is a late reply.

On 2011/06/22 1:00, Mark Davis ☕ wrote:
> Those are good issues; thanks for raising them and starting the discussion.
> Comments below.
>
> ------------------------------
> Mark
> *— Il meglio è l’inimico del bene —*
>
>
> On Mon, Jun 20, 2011 at 23:39, "Martin J. Dürst" <duerst@it.aoyama.ac.jp> wrote:
>
>> Hello Mark, others,
>>
>> Overall comment:
>> The idea to reuse language tags to indicate transliteration/transcription
>> source, and to add some additional tags to distinguish methods seems to be
>> reasonable and sound.
>>
>> The description of the structure of the allowed subtags and of the
>> responsibility split between IETF (this draft) and UTC (UTS 35) looks quite
>> messy to me, and should be cleaned up. I'd personally prefer that UTS 35 (or
>> whatever else on the Unicode side) only define the <mechanism> part (after
>> the m0 subtag).
>>
>
> That would be my preference as well (can't speak for my coauthors).
>
> We patterned it this way following what ended up being accepted for the -u-
> extension. That is, the spec is in UTS35, but there is a summary here.

I didn't like that then, I have to admit.

> But
> of course, there are many ways to do it. And maybe this summary is too
> detailed, at least for the mechanism part, and we could just have it in
> UTS35.

The most important thing is to make clear what is 'summary' (i.e. 
non-normative) and what's normative. The second most important thing is 
that the RFC actually define something, not just say "look over there". 
The Unicode side is supposed to be a registry, not a spec (I think Doug 
already pointed that out).

> We considered a number of alternatives:
>
>     - We could define everything after -t- to be the source language, and
>     everything after -m- to be the mechanism. But that burns 2 extension
>     letters, not just one.
>     - We also considered having everything in the -u extension, for which we
>     already have the structure set up. However, that would force us to have
>     artificial source subtags like 'en0' instead of 'en', because the -u-
>     extension wouldn't allow the 2-letter subtags (it already defines a use for
>     them).
>     - We could also have -t- be just the source, and define the mechanism in
>     -u-, also easy. But we felt it would be better to have everything under one
>     extension.

This is the technical aspect, where I think you got it right. What I'm 
talking about above are questions of what spec says what.
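
To make the structure under discussion concrete, here is a minimal sketch 
in Python of how a tag using the proposed layout could be taken apart: the 
subtags right after -t- are read as the source language tag, and subtags 
after a letter+digit key such as m0 are read as the mechanism. The function 
name split_t_extension, the general letter+digit key pattern, and the sample 
tags are my own assumptions for illustration, not text from the draft. It 
does no validation and ignores private-use (-x-) sequences.

```python
import re

# Assumption: a field key is one letter followed by one digit, e.g. "m0".
FIELD_KEY = re.compile(r"^[a-z][0-9]$")


def split_t_extension(tag):
    """Split a tag into (target, source, fields) around the 't' singleton.

    Hypothetical helper: a sketch of the layout discussed in this thread,
    not a validating BCP 47 parser.
    """
    subtags = tag.lower().split("-")
    try:
        t_index = subtags.index("t")
    except ValueError:
        # No 't' extension present: the whole tag is the target.
        return "-".join(subtags), "", {}

    target = "-".join(subtags[:t_index])
    source = []
    fields = {}
    key = None
    for subtag in subtags[t_index + 1:]:
        if len(subtag) == 1:
            break  # another singleton starts a different extension
        if FIELD_KEY.match(subtag):
            key = subtag  # e.g. "m0" opens the mechanism field
            fields[key] = []
        elif key is None:
            source.append(subtag)  # still inside the source language tag
        else:
            fields[key].append(subtag)
    return target, "-".join(source), {k: "-".join(v) for k, v in fields.items()}


print(split_t_extension("und-Cyrl-t-und-latn-m0-ungegn-2007"))
print(split_t_extension("ja-t-it"))
```

Because a key like m0 contains a digit, it cannot be confused with a 
2-letter language subtag such as 'en' or 'it', which is why the source can 
be written as a plain language tag here and needs no artificial form like 
'en0'.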

>> Detailed comments:

>> BCP47 required information: The first three paragraphs should move to the
>> introduction.
>>
>
> Other authors, what do you think?

In version -03, we have two sections titled "Introduction". Not a good 
sign for a spec.

>> "followed by a sequence of subtags that would form a language tag": Here
>> and in general: Don't use 'would'.
>>
>
> Grammatically, it is that the sequence of subtags *would* form a language
> tag if they *were* separated out. They are not actually a language tag,
> because they occur in the middle of another language tag. How would you
> like that to be phrased?

I think "sequence of subtags that form a language tag" is fine.

>>>>>>
>>    The structure of 't' subtags is determined by the Unicode CLDR
>>    Technical Committee, in accordance with the policies and procedures
>>    in http://www.unicode.org/consortium/tc-procedures.html, and subject
>>    to the Unicode Consortium Policies on
>>    http://www.unicode.org/policies/policies.html.
>>>>>>
>>
>>
>> The following paragraph is also difficult to understand. I wouldn't know
>> exactly what falls on what side. I think one major reason is that we are
>> treading new ground here, it's the first time we have a singleton definition
>> that allows reuse of language tags (with a few restrictions) as well as
>> intends to define its own extensions.
>>
>
> These were both patterned after what was used for the -u- extension. We can
> take a look at them to try to clarify.

Please do.

Regards,    Martin.

>>>>>>
>>    Changes that can be made by successive versions of LDML [UTS35] by
>>    the Unicode Consortium without requiring a new RFC include the
>>    allocation of new subtags for use after the 't' extension.  A new RFC
>>    would be required for material changes to an existing 't' subtag, or
>>    an incompatible change to the overall syntactic structure of the 't'
>>    extension; however, such a change would be contrary to the policies
>>    of the Unicode Consortium, and thus is not anticipated.
>>>>>>
>>
>> 2.1 Summary: There seems to be quite some overlap between this section
>> and the part of section 2 before the 2.1 heading.
>>
>>
>> One question I would have as a linguistic researcher is: How much effort
>> and time is involved in getting a 'mechanism' approved? If such 'mechanisms'
>> are e.g. rejected with arguments like "if we accept it, then everybody has
>> to implement it" or so, then I would see that as a problem.
>>
>
> Good point. I'll propose some text.
>
>
>>
>> So much for the moment.
>>
>>
>> Regards,   Martin.
>>
>>
>>
>> On 2011/06/18 6:07, Mark Davis ☕ wrote:
>>
>>> Yoshito, Addison, and I had had an action for a while now from the CLDR
>>> committee to submit a draft for an extension. Rather than go through all
>>> the problems in the falk draft, we put together an alternative approach,
>>> leveraging the work we already did for the -u- extension.
>>>
>>> It just got posted at
>>> http://tools.ietf.org/html/draft-davis-t-langtag-ext-00
>>>
>>> Courtney, I think this provides a superset of the functionality that you
>>> are interested in. Perhaps you can read it over, and we can add you as an
>>> author of the next version of this draft instead of having the two
>>> competing proposals.
>>>
>>> Mark
>>>
>>> *— Il meglio è l’inimico del bene —*
>>>
>>>
>>> On Wed, Jun 15, 2011 at 10:50, Randy Presuhn
>>> <randy_presuhn@mindspring.com> wrote:
>>>
>>>> Hi -
>>>>
>>>> I started out with an off-list response, but I figure this is
>>>> something worth sending to the list.
>>>>
>>>> Off-list, a contributor asked:
>>>>
>>>> ...
>>>>
>>>>> I'd love to see your input. I'd like to make sure I understand
>>>>> all the concerns. Is there any way you could forward this to the list?
>>>>>
>>>>
>>>> My response:
>>>>
>>>> Sorry, already deleted.  As I recall, the main concerns were
>>>>
>>>>   (1) there already *is* support for identifying orthographies
>>>>       (remember German?)
>>>>   (2) the I-D seems to assume that transliterations always result
>>>>       in "Latin" (previous discussion on LTRU included transliterations
>>>>       to Cyrillic and Hangul, among others)
>>>>   (3) the "original orthography" is irrelevant for the transliteration
>>>>       systems I've been able to think of.  (At the same time, some
>>>>       transliteration systems are quite "lossy" and some don't do
>>>>       "round trip" very well.)  Consider also the transliteration of
>>>>       material which was originally in audio form...
>>>>   (4) The draft doesn't clearly distinguish "orthography" from
>>>>       "transliteration".  This may be because the boundary between the
>>>>       two can be fuzzy, but even that is an issue that should be
>>>>       addressed.
>>>>   (5) How this fits in with *transcription* systems (e.g. IPA) should be
>>>>       addressed.  The boundary gets fuzzy with orthographies that are
>>>>       equivalent to phonemic representations of the language.  (e.g.,
>>>>       Pinyin for Mandarin)
>>>>   (6) The proposed singleton usage appears broken and unnecessary.
>>>>
>>>> Or something like that.  I may have forgotten something here, or, in the
>>>> process of reconstruction, thought of something I missed the first time.
>>>>
>>>> Randy
>>>>
>>>> _______________________________________________
>>>> Ltru mailing list
>>>> Ltru@ietf.org
>>>> https://www.ietf.org/mailman/listinfo/ltru
>>>>
>>>>
>>>
>>>
>>>
>>
>