[Cbor] Regular expressions

Joe Hildebrand <hildjj@cursive.net> Sun, 28 February 2021 19:03 UTC

From: Joe Hildebrand <hildjj@cursive.net>
Date: Sun, 28 Feb 2021 12:03:01 -0700
To: cbor@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/lPRX4Jj6HAtaIWpQT7OAQJlk25c>
Subject: [Cbor] Regular expressions

I know that RFC 8949 dropped Regular Expressions as tag 35, but they're still defined and usable from RFC 7049.

That said, what, if anything, are people doing with the regex flags?  I wonder if all this time I should have been encoding

/foo/g

as

35("/foo/g")

instead of

35("foo")

(losing the g flag).  I see from https://mailarchive.ietf.org/arch/msg/cbor/txIKHMXRFzNo7oH-eHZigO1L47w/ that Carsten has previously also assumed that we don't transport the slashes.
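
To make the difference concrete on the wire, here is a rough Python sketch that hand-encodes both forms. It is not taken from any particular CBOR library; the helper names (`cbor_head`, `cbor_text`, `tag35`) are made up for illustration, and it only handles the small lengths needed here.

```python
def cbor_head(major: int, value: int) -> bytes:
    """Encode a CBOR initial byte plus argument (sketch: values below 2**16)."""
    if value < 24:
        return bytes([(major << 5) | value])
    if value < 0x100:
        return bytes([(major << 5) | 24, value])
    if value < 0x10000:
        return bytes([(major << 5) | 25]) + value.to_bytes(2, "big")
    raise ValueError("sketch only handles arguments below 2**16")


def cbor_text(s: str) -> bytes:
    """Encode a UTF-8 text string (major type 3)."""
    data = s.encode("utf-8")
    return cbor_head(3, len(data)) + data


def tag35(s: str) -> bytes:
    """Wrap a text string in tag 35 (major type 6)."""
    return cbor_head(6, 35) + cbor_text(s)


# Same tag, different payloads: only the first one carries the flag.
print(tag35("/foo/g").hex())  # d823662f666f6f2f67
print(tag35("foo").hex())     # d82363666f6f
```

Both are well-formed tag 35 items, so nothing at the CBOR level tells a receiver which convention the sender used.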

If including the slashes won't interop, and if I'm not the only one that's implemented tag 35, I'd be happy to whip up a new doc to describe this, and register a higher tag number for it.  If anyone is interested, we can have a discussion about

35("/foo/g") vs. 35("foo", "g")

And maybe even some sort of info about what kind of regex it is (ECMAScript vs. PCRE, for example).
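
As a strawman for the split-out form, a receiver-side sketch in Python might look like the following. The helper name `split_regex_literal` is hypothetical, it assumes an ECMAScript-style `/pattern/flags` literal, and Python's `re` is only standing in for whatever regex flavor is actually in play. It also shows the interop hazard with slashes: a receiver that assumes no delimiters treats `/foo/g` as a pattern in its own right.

```python
import re
from typing import Tuple


def split_regex_literal(literal: str) -> Tuple[str, str]:
    """Split '/pattern/flags' into (pattern, flags).

    Hypothetical helper: assumes ECMAScript-style delimiters, where flags are
    letters only, so the last '/' always ends the pattern.
    """
    end = literal.rfind("/")
    if not literal.startswith("/") or end == 0:
        raise ValueError("not a /pattern/flags literal")
    return literal[1:end], literal[end + 1:]


pattern, flags = split_regex_literal("/foo/g")

# A receiver that assumes no delimiters compiles the whole string as a pattern,
# so it only matches text that literally contains "/foo/g".
naive = re.compile("/foo/g")
intended = re.compile(pattern)

print(pattern, flags)                             # foo g
print(bool(naive.search("some foo here")))        # False
print(bool(intended.search("some foo here")))     # True
```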

I assume a lot of folks are in the "regexes are too hard to interop" camp, in which case I'll take a nice high tag number and everyone else can ignore it.

-- 
Joe Hildebrand