Re: [netmod] Potential additions to rfc6087bis: RegEx guidelines

Juergen Schoenwaelder <j.schoenwaelder@jacobs-university.de> Wed, 30 August 2017 12:32 UTC

Date: Wed, 30 Aug 2017 14:31:56 +0200
From: Juergen Schoenwaelder <j.schoenwaelder@jacobs-university.de>
To: Robert Wilton <rwilton@cisco.com>
Cc: Andy Bierman <andy@yumaworks.com>, Xufeng Liu <Xufeng_Liu@jabil.com>, "netmod@ietf.org" <netmod@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/netmod/12xCBKtuysd9-th_kD96se4twJY>

On Wed, Aug 30, 2017 at 12:48:19PM +0100, Robert Wilton wrote:
> 
> 
> On 30/08/2017 11:29, Juergen Schoenwaelder wrote:
> > On Wed, Aug 30, 2017 at 10:16:30AM +0100, Robert Wilton wrote:
> > > Hi Andy,
> > > 
> > > What I am suggesting makes it easier for readers, because I am a proponent
> > > of simpler regular expressions that are easy to read and understand.
> > > 
> > > For example, I wonder how many YANG model readers would immediately
> > > comprehend what this pattern statement means:
> > > 
> > > pattern "\p{Sc}\p{Zs}?\p{Nd}+\.\p{Nd}{2}"?
> > > 
> > > Does allowing such patterns really make it easier for model readers?
> > This is always difficult to judge but to be fair you have to show how
> > you express _the same_ (and not a subset) with some other kind of
> > regular expressions. (My understanding is that \p{Sc} is a currency
> > symbol.)
> Yes, the expression would cover a currency amount, along with associated
> symbol (e.g. "$200.00").
> 
> If I was writing a module, I would probably use the following pattern
> statement instead, which I think a lot more people would likely be able to
> comprehend:
> 
> pattern "[A-Z]{3}\s?\d+\.\d{2}", using the three-letter ISO 4217 currency
> codes (e.g. "USD 200.00").

But that is not the same. Apples versus oranges. (I expect people to
tell me that (i) currency is irrelevant and (ii) three-letter ASCII
currency codes are better than currency symbols anyway, but that is a
separate discussion I am not interested in.)
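
To make the difference concrete, here is a minimal Python sketch of the
two patterns. It rests on a few assumptions: Python's standard "re"
module does not support \p{...}, so the first pattern is emulated by hand
with unicodedata; XSD patterns are taken to match the whole value; and
\s and \d are given their XSD meanings ([ \t\n\r] and \p{Nd}). The
function names are hypothetical and used only for illustration.

```python
import re
import unicodedata


def matches_symbol_amount(value: str) -> bool:
    # Hand-written approximation of \p{Sc}\p{Zs}?\p{Nd}+\.\p{Nd}{2}
    cat = unicodedata.category
    n = len(value)
    i = 0
    # \p{Sc}: exactly one currency symbol
    if i >= n or cat(value[i]) != "Sc":
        return False
    i += 1
    # \p{Zs}?: optional space separator
    if i < n and cat(value[i]) == "Zs":
        i += 1
    # \p{Nd}+: one or more decimal digits
    start = i
    while i < n and cat(value[i]) == "Nd":
        i += 1
    if i == start:
        return False
    # \.: a literal dot
    if i >= n or value[i] != ".":
        return False
    i += 1
    # \p{Nd}{2}: exactly two decimal digits, then end of value
    return n - i == 2 and all(cat(c) == "Nd" for c in value[i:])


def matches_code_amount(value: str) -> bool:
    # Approximation of [A-Z]{3}\s?\d+\.\d{2}; \s is spelled out as the
    # XSD set, and Python's \d on str already means Unicode category Nd.
    return re.fullmatch(r"[A-Z]{3}[ \t\n\r]?\d+\.\d{2}", value) is not None


for sample in ["$200.00", "\u20ac 5.50", "USD 200.00"]:
    print(repr(sample), matches_symbol_amount(sample), matches_code_amount(sample))
```

Under these assumptions no sample is accepted by both checks: the two
patterns describe disjoint value spaces, not a language and a subset of
it.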

> > 
> > > The proposed guidelines obviously make it easier (or at least no harder) for
> > > tool makers.
> > > 
> > > I agree that there is a minor impact on model writers, but really only in
> > > the sense that the guidelines would be telling them not to use the esoteric
> > > options of the XML regex syntax that they probably don't know about anyway.
> > What is 'esoteric' largely depends on your language environment. What
> > you are saying by 'do not use \p{}' is essentially 'do not use any
> > Unicode, long live ASCII'.
> No, that is not my intention, i.e. I'm not suggesting banning all use of
> \p{}, but instead limiting it to the character classes that seem like they
> may plausibly be used in standardized YANG modules.

This is entirely subjective. And if you still allow some \p{}, what is
the point of the exercise?

> I'm not trying to change what 6020/7950 defines the pattern statement as,
> just give what I perceive as some pragmatic guidance as to what parts of XML
> RE it makes sense to use in standardized YANG modules, making it easier for
> readers and implementations.
> 
> I think that it is fine for companies, vendors, etc to use the full breadth
> of XML RE if they wish.

Implementations have to be prepared to handle XSD patterns if they
claim compliance with YANG 1.0 or 1.1. So all this only helps
non-compliant implementations. This may indeed be a goal, but then we
should spell it out as such (and such implementations may still fail
on the first \p{} that you still allow).

If implementations do not implement the YANG pattern statement but
something else, then they should ignore patterns they can't
understand and treat the pattern as if it had been in a
description clause - i.e., leave it to humans to write the code that
implements the pattern correctly. Note that YANG does not say anything
about how things are implemented.
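
A rough sketch of that fallback, assuming an implementation built on
Python's "re" module rather than on an XSD regex engine: a pattern the
local engine rejects is not enforced and the value is reported as
unchecked. The names compile_pattern and check are hypothetical. Note
that compiling successfully does not guarantee XSD semantics (\s, for
instance, differs between the two dialects), so this only covers the
"cannot understand" case.

```python
import re
from typing import Callable, Optional


def compile_pattern(pattern: str) -> Optional[Callable[[str], bool]]:
    # Return a validator, or None if the local engine rejects the pattern.
    try:
        compiled = re.compile(pattern)
    except re.error:
        # Pattern not understood: treat it like descriptive text only.
        return None
    # XSD patterns are implicitly anchored, hence fullmatch.
    return lambda value: compiled.fullmatch(value) is not None


def check(value: str, pattern: str) -> str:
    validator = compile_pattern(pattern)
    if validator is None:
        return "unchecked"
    return "valid" if validator(value) else "invalid"


# Python's re has no \p{...} support, so this pattern is left unenforced.
print(check("$200.00", r"\p{Sc}\p{Zs}?\p{Nd}+\.\p{Nd}{2}"))
print(check("USD 200.00", r"[A-Z]{3}\s?\d+\.\d{2}"))
```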

/js

-- 
Juergen Schoenwaelder           Jacobs University Bremen gGmbH
Phone: +49 421 200 3587         Campus Ring 1 | 28759 Bremen | Germany
Fax:   +49 421 200 3103         <http://www.jacobs-university.de/>