Re: Last Call: 'Tags for Identifying Languages' to BCP

Ned Freed <ned.freed@mrochek.com> Mon, 29 August 2005 16:26 UTC

To: Peter Constable <petercon@microsoft.com>
Message-id: <01LSE90EXRWA000092@mauve.mrochek.com>
Date: Mon, 29 Aug 2005 09:12:33 -0700
From: Ned Freed <ned.freed@mrochek.com>
In-reply-to: "Your message dated Mon, 29 Aug 2005 05:11:37 -0700" <F8ACB1B494D9734783AAB114D0CE68FE06EDF1AB@RED-MSG-52.redmond.corp.microsoft.com>
MIME-version: 1.0
Content-type: TEXT/PLAIN
References: <F8ACB1B494D9734783AAB114D0CE68FE06EDF1AB@RED-MSG-52.redmond.corp.microsoft.com>
Cc: ietf@ietf.org
Subject: Re: Last Call: 'Tags for Identifying Languages' to BCP

> > From: Bruce Lilly <blilly@erols.com>


> > > This
> > > is all what this proposition is about. This proposition is to give
> > > _one_shot_ in a _standardised_ way the language, the script and the
> > > country.
> >
> > This was discussed during Last Call of the previous non-IETF (individual
> > submission) attempt.  IIRC David Singer brought up several examples of
> > other pieces of information (e.g. legal/copyright variations) that could
> > also be negotiated and which might affect the presentation of content (or
> > choice among alternative content).  Lumping all of these separate items
> > into one tag is a poor design as it impedes negotiation and tends toward
> > lengthy tags which are incompatible with fixed-length mechanisms such as
> > MIME encoded-words.

> I agree that it would be poor design to incorporate other pieces of
> information such as legal/copyright variations into language tags, but
> as such pieces of information are not supported by the draft, this
> appears to be irrelevant.

I agree with both points.

> We should rather focus on whether it is good design to incorporate
> information related to linguistic and written-form attributes, as
> supported in the draft, into a single tag. The consensus of the LTRU
> working group is that it is. For instance, the use of separate tags for
> language and script were considered and rejected on the basis that the
> two are not entirely orthogonal. Clear examples of this were considered:
> while the intent of

> Accept-Language: ar, az-Cyrl, ru

> is clear, the intent of

> Accept-Language: ar, az, ru
> Accept-Script: Cyrl

> or of

> Accept-Language: ar, az, ru
> Accept-Script: Arab, Cyrl

> is not clear, nor is it obvious how rules could be specified that would
> make the intent clear, or that would permit expressing the preferences
> reflected in the first instance.

This is such an important point that it deserves to be called out, lest it
be lost in the flurry of messages on this topic.

Designs that separate the tagging of, say, script and language appear at
first glance to be more flexible and general. But appearances can be
deceiving. The problem is that using separate labels does not provide
an easy way of linking the two, and being able to express these
linkages is vital.
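
To make the linkage problem concrete, here is a minimal sketch in Python. It
is only an illustration under stated assumptions: the function names are
hypothetical, the parser handles just the simple "language" and
"language-Script" forms used in the examples above, and Accept-Script is the
hypothetical header from the quoted message, not an existing field.

```python
from itertools import product


def parse_combined(header):
    """Split a combined Accept-Language value into (language, script) pairs.

    Simplifying assumption: a four-letter second subtag is the script;
    anything else is ignored. This is not a full tag parser.
    """
    prefs = []
    for item in header.split(","):
        subtags = item.strip().split("-")
        language = subtags[0]
        script = None
        if len(subtags) > 1 and len(subtags[1]) == 4:
            script = subtags[1]
        prefs.append((language, script))
    return prefs


def candidate_readings(languages, scripts):
    """List every (language, script) pairing that two separate headers
    could be taken to mean, since nothing links one list to the other."""
    langs = [lang.strip() for lang in languages.split(",")]
    scrs = [scr.strip() for scr in scripts.split(",")]
    return list(product(langs, scrs))


# One combined header: each language carries its own script, if any.
combined = parse_combined("ar, az-Cyrl, ru")

# Two separate headers: a receiver can only guess among the pairings.
separate = candidate_readings("ar, az, ru", "Arab, Cyrl")

print(combined)
print(len(separate), "possible pairings:", separate)
```

The combined form yields exactly one script preference, attached to "az".
The separate form leaves six pairings on the table, with nothing to say that
Cyrl was meant for "az" alone.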

> > Tagging identifies characteristics of a particular piece of content. For
> > that purpose alone, it makes little difference (other than regarding the
> > aforementioned compatibility issues with existing IETF mechanisms) whether
> > the characteristics are lumped or separate.

> On the contrary, it makes little difference only if the characteristics
> in question are completely orthogonal.

And in the case of language and script tags the information is almost always
inseparable - as far from orthogonal as you can get.

> As pointed out above, the
> characteristics of linguistic variety and written form are not
> orthogonal, particularly when it comes to expressing user preferences,
> and that it *does* make a difference if they are split into separate
> metadata attributes or they are lumped together into a single metadata
> attribute.

To be totally fair, it would be possible to define a linkage between the two.
However, the representation would end up being fairly complicated, not to
mention being totally incompatible with the existing field syntax. As far as I
can see the only time a multiple field plus linkage would be a win is when the
repetition of subordinate information resulted in an overly long field. But the
sizes of the tags here are so small that this is at best a marginal corner
case.
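
As a rough check on the size argument, the sketch below counts how much room
a language tag takes inside a MIME encoded-word. It assumes the RFC 2047
limit of 75 characters per encoded-word and the RFC 2231 form
"=?charset*language?encoding?text?="; the function name and the choice of
utf-8 with "B" encoding are illustrative assumptions, not anything taken
from the draft.

```python
ENCODED_WORD_LIMIT = 75  # RFC 2047 limit on a single encoded-word


def encoded_text_budget(charset, language, encoding="B"):
    """Return how many characters remain for the encoded text once the
    charset, language tag, encoding and delimiters are accounted for."""
    prefix = f"=?{charset}*{language}?{encoding}?"
    suffix = "?="
    return ENCODED_WORD_LIMIT - len(prefix) - len(suffix)


# Compare a bare language tag with one that also carries a script subtag.
plain = encoded_text_budget("utf-8", "az")
with_script = encoded_text_budget("utf-8", "az-Cyrl")

print(plain, with_script, plain - with_script)
```

Under these assumptions the script subtag costs five characters out of
seventy-five, which fits the view that overly long fields are a marginal
concern for tags of this size.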

In summary, I believe the approach of using separate fields offers no
advantages and has numerous disadvantages over the approach that was
chosen, and that the WG was correct to reject it.

				Ned
