Re: [art] [Last-Call] Language tags and YANG

Carsten Bormann <cabo@tzi.org> Thu, 30 June 2022 11:27 UTC

From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <62BD8272.8020706@btconnect.com>
Date: Thu, 30 Jun 2022 13:27:43 +0200
Cc: Francesca Palombini <francesca.palombini@ericsson.com>, "\"Martin J. Dürst\"" <duerst@it.aoyama.ac.jp>, Applications and Real-Time Area Discussion <art@ietf.org>, "last-call@ietf.org" <last-call@ietf.org>, "i18ndir@ietf.org" <i18ndir@ietf.org>
Message-Id: <D01A5C00-888E-4682-9FBB-FA0B49BFBC31@tzi.org>
References: <165511479760.19573.12671700576299137749@ietfa.amsl.com> <63D13796-758D-469B-AFA8-3050C9F87819@tzi.org> <dde9d36c-61e5-afcc-e15a-787c99d5fba9@it.aoyama.ac.jp> <CAN40gSuhSAOH3WRPETXU4s1468eXb_g-=sfWFmXXTvekEddqYQ@mail.gmail.com> <034DDF0F-FEF2-456B-B9ED-76B8F2B6C4BF@tzi.org> <CAN40gSuGJOChjAY9fFD5Gwqn9CaLH09-m5MKb5Gfg8HH9WYjvA@mail.gmail.com> <0359E066-79F3-4AAB-92A5-30B5E01D16CE@tzi.org> <CAN40gSuR12WE=NC-MqGvCX1z+XNVn+5X94VFH1qHE373gbQR_w@mail.gmail.com> <62B57BC0.9080706@btconnect.com> <CAN40gSsg+nbDejC2d34wpLhecUnGTEZL6RAHjT5UUKTJWRS7Uw@mail.gmail.com> <62B59596.2010203@btconnect.com> <AS1PR07MB86163DBAA92FE852CE0A495F98BB9@AS1PR07MB8616.eurprd07.prod.outlook.com> <62BD8272.8020706@btconnect.com>
To: tom petch <daedulus@btconnect.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/art/5TV9f4xInTsjnZ2n4EiXcq6JhSs>
Subject: Re: [art] [Last-Call] Language tags and YANG

On 2022-06-30, at 13:01, tom petch <daedulus@btconnect.com> wrote:
> 
> Second, what is a suitable constraint on a YANG string to be a language tag?  If Carsten's length constraint is adequate, then I would advise authors to RYO as opposed to waiting for 6991-bis; if something more complex is needed, especially if it involves restricting the choice of characters, then it probably belongs in 6991-bis.

I got the regexp from XSD [1], which Martin pointed me to.
Specifically:

[1]: https://www.w3.org/TR/xmlschema11-2/#language

XSD defines »language« using the pattern (iregexp):

[a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*

This in turn cites RFC 3066 (obsoleted by RFC 4646 [itself obsoleted by RFC 5646] and RFC 4647), which contains:

    Language-Tag = Primary-subtag *( "-" Subtag )

    Primary-subtag = 1*8ALPHA

    Subtag = 1*8(ALPHA / DIGIT)

(This is the ABNF way of stating, and explaining, the iregexp above.)

The RFC 3066 ABNF was replaced by the more restrictive ABNF grammar in RFC 4646, which was updated a bit in RFC 5646 (demonstrating that the restrictive approach is less stable).

Note that 
[a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*
is a bit more than a length restriction: it also defines the characters that can be used in a language tag.
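
For illustration only, here is a minimal Python sketch of what the pattern accepts and rejects. It is not taken from XSD or from any of the RFCs; the helper name `is_language_tag` and the sample strings are made up for this example. XSD patterns are implicitly anchored at both ends, so the sketch uses a full match rather than a search.

```python
import re

# The pattern quoted above (XSD "language" type, mirroring the RFC 3066 ABNF).
LANGUAGE_PATTERN = re.compile(r"[a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*")


def is_language_tag(value: str) -> bool:
    """Hypothetical helper: True if the whole string matches the pattern."""
    # fullmatch emulates the implicit anchoring of XSD patterns.
    return LANGUAGE_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    samples = ["en", "de-CH-1996", "i-klingon", "en_US", "123", "abcdefghi", "en-"]
    for candidate in samples:
        print(f"{candidate!r}: {is_language_tag(candidate)}")
```

The rejected samples show both kinds of restriction: "abcdefghi" fails on subtag length, while "en_US" and "123" fail on the permitted characters (an underscore, and digits in the primary subtag).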

I would think that having that as a YANG data type would be good, but of course I have no opinion on whether this needs to be in 6991bis.

Grüße, Carsten