Re: [Ietf-languages] BCP47 violation in the recent extlang ajp change
Hugh Paterson III <sil.linguist@gmail.com> Tue, 28 March 2023 21:26 UTC
From: Hugh Paterson III <sil.linguist@gmail.com>
Date: Tue, 28 Mar 2023 14:25:47 -0700
Message-ID: <CAE=3Ky8R3mKWn+JRnPo-V5m+g+W=jsQ+_T2g3xBS6Ct-p26MNg@mail.gmail.com>
To: Doug Ewell <doug@ewellic.org>
Cc: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>, Christian Despres <christian.j.j.despres@gmail.com>, "ietf-languages@ietf.org" <ietf-languages@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf-languages/HWl2Is0erotKUF3KJp5mWfMIG0E>
If -t and -u are not to be included in BCP-47 (as reserved entities) because they ought to be independently manageable (which I assume is the purpose of a separate RFC), then should the -x extension be moved out to its own RFC? Why should it be included in RFC5646? Wouldn't the same reasons for inclusion or exclusion apply to all singletons? (§2.2/3.7).

Without linking and registration of the various RFC documents defining singletons, doesn't it mean that the only singleton a parser can depend on by implementing BCP-47 is -x-, as that is the only one acknowledged? Different communities could then define their own singletons, and those singletons could clash. For example, I could create an organization called the Open Language Advocacy Community and define a set of singletons, some of which may clash with RFC6067 and RFC6497. Presumably nothing is stopping some of these other singletons from becoming registered as RFCs. This situation may create confusion for parsers that expect -t and -u to be related to RFC6067 and RFC6497. Maybe the more pertinent architectural approach is to have a database of singletons where registration is required, like the IANA database.

On my first read of §2.2 and §3.7, I understood there to be only a single universe of possible singletons. Is it more appropriate to understand the current architecture as a multiverse of infinite options with regard to the semantics of any singleton other than -x-, with the caveat that in each world of the multiverse -x- will in fact have the same semantics?

Kind Regards,
Hugh

On Tue, Mar 28, 2023 at 1:55 PM Doug Ewell <doug@ewellic.org> wrote:

> Hugh Paterson III wrote:
>
> > 1. Any larger BCP-47 effort for revision should likely include
> > incorporating the Unicode Extensions, RFC6067 and RFC6497:
> >
> > https://cldr.unicode.org/index/bcp47-extension
> > http://www.unicode.org/reports/tr35/#36-unicode-bcp-47-u-extension
>
> The T and U extensions are quite properly described in separate RFCs, as
> specified in Section 3.7 of RFC 5646. Folding them into an RFC 5646bis
> would place them on a different definitional level from any future
> extensions that might be created, which would be a Bad Thing™.
>
> I assume the purpose in doing so would be to call better attention to the
> extensions, since RFC 5646 itself predates them and so cannot refer to
> them. The Wikipedia page on BCP 47 already describes the extensions and
> links to the extension RFCs, as does my page; perhaps the maintainers of
> other sites that link to RFC 5646 could be persuaded to link to those
> documents as well (and also to RFC 4647).
>
> > 2. The issue with using Glottolog codes is specifically that ISO 639-3
> > is scoped at the language level, while the informatic model of the
> > Glottolog is specifically scoped to the document level, rather than
> > the language level. That is, Glottolog codes can span the classes of
> > macro-language, idiolect, dialect, or hypothetical reconstructed
> > language. The two scopes are inconsistent with regard to purpose. If
> > the purpose of the positions of the constructed BCP-47 tags is to stay
> > consistent, I don't see how Glottolog codes become a possibility
> > except after -x- or some other yet-to-be defined identifier.
>
> I agree wholeheartedly, but one does occasionally hear suggestions to do
> this. It’s almost as if 8,000 language subtags, plus variants and the
> ability to register more, plus private-use sequences, weren’t enough.
>
> --
> Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org
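
To make the parser question above concrete, here is a minimal Python sketch of how a tag's singletons might be split out and checked against a table of known extensions. It is only an illustration under stated assumptions: the function names and the hard-coded KNOWN_EXTENSIONS table are hypothetical (a real implementation would presumably consult the published extension registry rather than a local copy), and the splitting logic ignores grandfathered tags and tags that begin with "x-".

```python
# Hypothetical local snapshot of singleton -> defining document.
# Assumption: only -u (RFC 6067) and -t (RFC 6497) are listed here.
KNOWN_EXTENSIONS = {
    "u": "RFC 6067",
    "t": "RFC 6497",
}


def split_singletons(tag):
    """Split a language tag into extension sequences and private-use subtags.

    Simplification: the primary language subtag is skipped, and any later
    one-character subtag is treated as a singleton. Everything after "x"
    is treated as private use.
    """
    subtags = tag.lower().split("-")
    extensions = {}
    private_use = []
    current = None
    for index in range(1, len(subtags)):
        subtag = subtags[index]
        if subtag == "x":
            # -x- ends the tag: the rest is private use by definition.
            private_use = subtags[index + 1:]
            break
        if len(subtag) == 1:
            # A new singleton starts a new extension sequence.
            current = subtag
            extensions[current] = []
        elif current is not None:
            extensions[current].append(subtag)
    return extensions, private_use


def describe_singletons(tag):
    """Map each singleton found in the tag to its defining document, if known."""
    extensions, _ = split_singletons(tag)
    return {s: KNOWN_EXTENSIONS.get(s, "unregistered") for s in extensions}


if __name__ == "__main__":
    print(split_singletons("en-Latn-US-u-ca-gregory-t-hi-x-foo-bar"))
    print(describe_singletons("de-a-value"))
```

In this sketch a parser can always recognise -x- from the tag syntax alone, while the meaning of any other singleton depends entirely on what the lookup table contains, which is the dependency the questions above are about.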