Re: [netmod] regular expression flavours (again)

"Rob Wilton (rwilton)" <rwilton@cisco.com> Fri, 14 June 2019 09:29 UTC

From: "Rob Wilton (rwilton)" <rwilton@cisco.com>
To: Juergen Schoenwaelder <j.schoenwaelder@jacobs-university.de>, Robert Varga <nite@hq.sk>
CC: NETMOD WG <netmod@ietf.org>
Date: Fri, 14 Jun 2019 09:29:39 +0000
Message-ID: <BYAPR11MB26311142F2841456A42623CDB5EE0@BYAPR11MB2631.namprd11.prod.outlook.com>
References: <291106e34498ebd68f26bf9ff9b679dd5bd8f0cd.camel@nic.cz> <20190612092555.xotrr4moh36xv4kl@anna.jacobs.jacobs-university.de> <4611382f-be58-a20f-1712-e5fb3e4ef3ec@hq.sk> <20190613140655.jyq3iltl2v22ekmb@anna.jacobs.jacobs-university.de>
In-Reply-To: <20190613140655.jyq3iltl2v22ekmb@anna.jacobs.jacobs-university.de>
Archived-At: <https://mailarchive.ietf.org/arch/msg/netmod/WmiVRg1pEEI_DNupsXnMFCWcwaU>


> -----Original Message-----
> From: netmod <netmod-bounces@ietf.org> On Behalf Of Juergen Schoenwaelder
> Sent: 13 June 2019 15:07
> To: Robert Varga <nite@hq.sk>
> Cc: NETMOD WG <netmod@ietf.org>
> Subject: Re: [netmod] regular expression flavours (again)
> 
> >
> > Can we engineer a workable solution for the general case without
> > getting everyone to know the differences in RE engines? Something along
> > the lines of this:
> > https://github.com/openconfig/public/issues/44#issuecomment-501629497
> > perhaps?
> 
> Please let's not (even with the best intentions) create yet another regular
> expression flavour.

Sorry, but I still think that is exactly what we should do. 😊

What we have today is seemingly not working for the industry:

Some implementations might link against the libxml2 regex engine.  Great, they are compliant with IETF YANG models, but not OpenConfig.

Perhaps there are some other implementations that have written their own XML regex engine for the language of their choice.  I suspect that there are few in this camp!

For everyone else, I think that they do what OpenConfig is trying to do, and they just throw the pattern statements into their chosen language's default regex engine.  And the thing is, assuming the beginning/end anchors are handled appropriately, I suspect that this works for 99% of the pattern statements in YANG models.
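As a rough illustration of the anchoring point above: YANG pattern statements use XSD regular expressions, which always match the whole value, whereas most language engines search for a match anywhere unless told otherwise. The sketch below assumes Python's re module as the "default engine"; the helper name yang_pattern_matches and the sample pattern are hypothetical and not taken from any YANG module or implementation.

```python
import re


def yang_pattern_matches(pattern: str, value: str) -> bool:
    """Return True if `value` matches `pattern` in its entirety.

    Hypothetical helper: re.fullmatch supplies the implicit start/end
    anchoring of XSD-style patterns without editing the pattern text.
    """
    return re.fullmatch(pattern, value) is not None


# Illustrative pattern only, not taken from any published YANG module.
IFNAME = r"[a-z]+[0-9]+"

print(yang_pattern_matches(IFNAME, "eth0"))     # whole string matches
print(yang_pattern_matches(IFNAME, "eth0/1"))   # trailing "/1" is rejected
print(re.search(IFNAME, "eth0/1") is not None)  # unanchored search accepts it
```

This only covers anchoring. XSD-specific constructs such as \i, \c, or character class subtraction would still need translating or rejecting before being handed to a non-XSD engine.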

OpenConfig state that they are using POSIX regex, with the expectation that these regular expressions can just be interpreted by most languages.  Unfortunately, that isn't quite true either: most languages' regex engines seem to be derived from PCRE, and there is a minor difference in how '\' is handled within a character group.  There also seems to be a difference between first and longest match for alternation, but I'm not convinced that matters for pattern statements.  Finally, POSIX ERE is only defined for ASCII and doesn't cover Unicode, so it isn't a great choice for YANG.
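To make the two differences above concrete, here is a small sketch using Python's re module, which follows the Perl-style conventions. Python cannot run a POSIX engine, so the POSIX interpretation of the character group is emulated with a hand-written equivalent pattern; that emulation, the variable names, and the sample patterns are assumptions of this sketch and do not come from OpenConfig or any YANG module.

```python
import re

# 1. Backslash inside a character group.
# Perl-style engines (Python's re included) read [\d] as "any digit".
perl_style = re.compile(r"[\d]")

# POSIX bracket expressions treat the backslash as an ordinary character, so
# [\d] there means "a backslash or the letter d". That reading is emulated
# here with an explicitly escaped backslash inside the class.
posix_reading = re.compile(r"[\\d]")

for text in ["5", "d", "\\"]:
    print(repr(text),
          "perl-style:", perl_style.fullmatch(text) is not None,
          "posix reading:", posix_reading.fullmatch(text) is not None)

# 2. First vs longest match for alternation.
# Python takes the first alternative that succeeds, so an unanchored match
# of a|ab against "ab" stops after "a"; POSIX would report the longest, "ab".
prefix = re.match(r"a|ab", "ab").group()

# With whole-string matching, as used for pattern statements, the engine
# backtracks into the second alternative and the value is still accepted.
whole = re.fullmatch(r"a|ab", "ab").group()

print("unanchored match:", prefix)
print("whole-string match:", whole)
```

The second half is consistent with the suspicion that first versus longest match does not change the accept/reject result once the pattern has to cover the whole value; it only changes which substring an unanchored search reports.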

Personally, I think that standardizing a basic regex language that is a common subset of what the common languages implement would probably benefit the industry in general.

I'm sure that someone can post an XKCD of why this is a bad idea 😉

Thanks,
Rob


> 
> /js
> 
> --
> Juergen Schoenwaelder           Jacobs University Bremen gGmbH
> Phone: +49 421 200 3587         Campus Ring 1 | 28759 Bremen | Germany
> Fax:   +49 421 200 3103         <https://www.jacobs-university.de/>
> 