Re: [Cbor] Validation of maps

Francesca Palombini <francesca.palombini@ericsson.com> Thu, 05 October 2017 10:53 UTC

From: Francesca Palombini <francesca.palombini@ericsson.com>
To: Carsten Bormann <cabo@tzi.org>, "cbor@ietf.org" <cbor@ietf.org>
Thread-Topic: [Cbor] Validation of maps
Thread-Index: AQHS//XL7VetNcuD6kGhRpgLKMKXK6JaFwCAgAHUiQCAeaO90IAAAJQg
Date: Thu, 05 Oct 2017 10:53:11 +0000
Message-ID: <HE1PR0701MB2539BFF4100C57A6C70CC94E98700@HE1PR0701MB2539.eurprd07.prod.outlook.com>
References: <e16da575-bbed-1f52-c754-9938237aa6bc@obj-sys.com> <3FFCD42B-C1DE-43E3-A06D-608CACD55D86@tzi.org> <CANh-dXnucjNP=eZfrEcrVC6HN0XHk0dcw-C+J56rksWxMbX8=A@mail.gmail.com> <HE1PR0701MB253924A5FBD83848583C053898700@HE1PR0701MB2539.eurprd07.prod.outlook.com>
In-Reply-To: <HE1PR0701MB253924A5FBD83848583C053898700@HE1PR0701MB2539.eurprd07.prod.outlook.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/z5LJUogRK7jVnDbWYtYcqnWO8xE>
Subject: Re: [Cbor] Validation of maps

Sorry, sent too early. The question I’d like the working group’s opinion on: are cuts something we want to consider adding at this point?

Francesca

From: Francesca Palombini
Sent: 5 October 2017 12:51
To: 'Jeffrey Yasskin' <jyasskin@chromium.org>; Carsten Bormann <cabo@tzi.org>
Cc: Kevin Braun <kbraun@obj-sys.com>; cbor@ietf.org
Subject: RE: [Cbor] Validation of maps

Agreed with Jeffrey.

Reviving this thread to ask for the working group’s opinion:

From: CBOR [mailto:cbor-bounces@ietf.org] On Behalf Of Jeffrey Yasskin
Sent: 20 July 2017 03:16
To: Carsten Bormann <cabo@tzi.org>
Cc: Kevin Braun <kbraun@obj-sys.com>; cbor@ietf.org
Subject: Re: [Cbor] Validation of maps

By the time CDDL makes it to an RFC, we should be answering questions like this by quoting normative text from https://tools.ietf.org/html/draft-greevenbosch-appsawg-cbor-cddl-11#section-3.5, not just pointing at examples.

Jeffrey

On Tue, Jul 18, 2017 at 11:18 PM, Carsten Bormann <cabo@tzi.org> wrote:
Hi Kevin,

> I know the question of more formally specifying validation rules already came up.  One would think map validation would be fairly obvious, but what happens when key types overlap?
>
> For example, I think the intention is that if you have
>
>   top = { 4 => int, *int => tstr }
>
> then the key 4 must be present with an integer value,

Right, that is the only way to match the first field.
(And there is no way to have that as well as another /4/ key with a text string value.)

> and you can have any number of other integer keys with text string values. Okay, but what about:
>
> top = { ? 4 => int, *int => tstr }
>
> We might say this means that if a key of 4 appears, then it must have an int value.  Or, does it allow a key of 4 to appear with a text string value while considering the optional "4 => int" as being absent?

Yes, that is the semantics.  It is not always what a specifier might intend.

The reason is that the map opens a choice point.  A member with key 4 starts to match the first field.  If the value however does not match (because it is not an int), the matcher falls back to the choice point.  It then tries the other field, and indeed, that matches.
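
A minimal Python sketch of this fallback, under loud assumptions: the field list, the predicate helpers, and match_member are hypothetical names made up for illustration, occurrence indicators (? and *) are ignored, and this is not how any CDDL tool is implemented. It only shows that a member is tried against the fields in order and that a failed value match falls through to the next field.

```python
# Hypothetical, simplified model of map member matching (not a CDDL tool's code).
# Each field is (key_predicate, value_predicate); occurrence counts are ignored.

def is_int(v):
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(v, int) and not isinstance(v, bool)

def is_tstr(v):
    return isinstance(v, str)

def match_member(fields, key, value):
    """Return the index of the first field matching both key and value, else None."""
    for index, (key_ok, value_ok) in enumerate(fields):
        if key_ok(key) and value_ok(value):
            return index
        # Either the key or the value did not match:
        # fall back to the choice point and try the next field.
    return None

# top = { ? 4 => int, *int => tstr }
top = [
    (lambda k: k == 4, is_int),
    (is_int, is_tstr),
]

print(match_member(top, 4, 7))     # 0: matched by "4 => int"
print(match_member(top, 4, "hi"))  # 1: falls back to "*int => tstr"
print(match_member(top, 4, 1.5))   # None: no field matches
```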

In the research underlying CDDL, we have discussed “cuts” (a concept from error handling in Parsing Expression Grammars (PEGs)) as the solution to this.  If ^ represents a cut, write:

top = { ? 4 ^ => int, *int => tstr }

Once the 4 matches, there is no way back; for this member, another match is no longer tried.
A nice side effect is that anything except an int after a key of 4 can give a definite error message of “int expected”.
The cut proposal includes : as an abbreviation for ^=>, so you can simply write:

top = { ? 4: int, *int => tstr }
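
As a rough illustration of what a cut changes, here is a Python sketch under the same kind of assumptions: the tuple layout, CutError, match_member, and describe are hypothetical names, occurrence indicators are ignored, and the error text is only modelled on the “int expected” message mentioned above. Once a key matches a field that carries a cut, a value mismatch becomes an error instead of a fallback.

```python
# Hypothetical sketch of cut semantics; names and error text are illustrative only.

class CutError(Exception):
    """Raised when a value fails to match after a cut has committed to a field."""

def is_int(v):
    return isinstance(v, int) and not isinstance(v, bool)

def is_tstr(v):
    return isinstance(v, str)

def match_member(fields, key, value):
    """Return the index of the first matching field, honoring cuts."""
    for index, (key_ok, cut, value_ok, value_name) in enumerate(fields):
        if not key_ok(key):
            continue
        if value_ok(value):
            return index
        if cut:
            # The key matched a field with a cut: no fallback to later fields.
            raise CutError(f"{value_name} expected")
    return None

# top = { ? 4 ^ => int, *int => tstr }
top = [
    (lambda k: k == 4, True, is_int, "int"),
    (is_int, False, is_tstr, "tstr"),
]

def describe(key, value):
    try:
        return match_member(top, key, value)
    except CutError as error:
        return f"error: {error}"

print(describe(4, 7))     # 0
print(describe(5, "hi"))  # 1
print(describe(4, "hi"))  # error: int expected
```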

> Given the examples in the spec, I guess the intention is for such a thing to mean the key 4, if present, has to have an int value.

Which example leads you to this conclusion?

>  So, there is some kind of "match the most specific key" rule implied (I guess).

Actually, the PEG semantics we have borrowed here is that the *first* match is used.  But a rule is only used if it actually matches!

> How that rule applies in more complex situations (where there is some kind of nesting) probably needs to be spelled out....  Given:
>
>   top = { 1 => 1, ? ( 5 => 5, 6 => 6 ), *int => tstr }
>
> Must keys 5 & 6 be present together,

Yes.

The whole group in the parentheses is optional.

> or does the wildcard allow only one of them to appear?

(That was an early semantics we tried, and it leads down the drain.
It is much better to have a matcher that simply and stupidly follows what’s in the grammar.)

> Or, given:
>
>   top = { 1 => 1, ( 5 => 5 // 6 => 6 ), *int => tstr }
>
> does this mean { 1 : 1, 5 : 5, 6 : "hi" }  is not valid?

No, it is valid.  The first field eats the 1: 1, the second field only matches the 5: 5, so the third field gets to eat zero or more int: tstr, of which 6: "hi" is a match.
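
The field-by-field consumption described here can be sketched in Python. This is a hand-written check for this one rule only; validate and is_int are hypothetical names, the three fields are hardcoded, and there is no backtracking across fields, so it is an illustration rather than a general matcher.

```python
# Hypothetical hardcoded check for:
#   top = { 1 => 1, ( 5 => 5 // 6 => 6 ), *int => tstr }

def is_int(v):
    return isinstance(v, int) and not isinstance(v, bool)

def validate(message):
    remaining = dict(message)
    # Field 1: "1 => 1" is required and consumes that member.
    if remaining.pop(1, None) != 1:
        return False
    # Field 2: group choice; the first alternative that matches consumes its member.
    for key in (5, 6):
        if remaining.get(key) == key:
            del remaining[key]
            break
    else:
        return False  # neither 5 => 5 nor 6 => 6 was present
    # Field 3: "*int => tstr" must account for every remaining member.
    return all(is_int(k) and isinstance(v, str) for k, v in remaining.items())

print(validate({1: 1, 5: 5, 6: "hi"}))  # True
print(validate({1: 1, 6: "hi"}))        # False
```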

> Is the 6 free to match the wildcard when the 5 has satisfied the group choice?

Yes.

>
> Then there are cases where "most specific key" has no meaning,

(Again, we use “first match”.)

> such as when two key types overlap each other and neither is a single-value type.  Consider:
>
>   top = { * (0..10) => tstr, * (5..15) => int }
>
> Does this mean a key of 5 can have either a text string or an int value?

As long as there are no cuts here, yes.

> Or, does it require that a key of 5, if present, must have a value that is both a text string and an int at the same time (i.e. it disallows 5 to appear)?

That would never be the semantics — the fact that there are two branches in a choice that can be fulfilled is not an error.

With a cut like this:

top = { * (0..10) ^ => tstr, * (5..15) => int }

this could mean that keys in 0..10 cut the choice and therefore need to have a text string value, while the rest, 11..15, can be integers, because the choice is cut after matching 0..10.
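
To make the effect on the overlapping range concrete, a small Python sketch follows. The (low, high, cut, type_name) tuples and accepted_types are hypothetical, and CDDL ranges written with .. are treated as inclusive; the function lists which value types a given key may carry, with and without the cut.

```python
# Hypothetical model: each field is (low, high, cut, type_name), ranges inclusive.

def accepted_types(fields, key):
    """Value type names a member with this key may have, in field order."""
    accepted = []
    for low, high, cut, type_name in fields:
        if low <= key <= high:
            accepted.append(type_name)
            if cut:
                break  # the cut commits to this field; later fields are not tried
    return accepted

# top = { * (0..10) => tstr, * (5..15) => int }
no_cut = [(0, 10, False, "tstr"), (5, 15, False, "int")]
# top = { * (0..10) ^ => tstr, * (5..15) => int }
with_cut = [(0, 10, True, "tstr"), (5, 15, False, "int")]

print(accepted_types(no_cut, 5))     # ['tstr', 'int']
print(accepted_types(with_cut, 5))   # ['tstr']
print(accepted_types(with_cut, 12))  # ['int']
```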

So far, we haven’t seen a use case that actually needed the cut, but it is still nice to have that error message.
(We also haven’t implemented it yet, although we will certainly do that over time.)

Another example where a cut helps:

message = orderbeer / orderwine

orderbeer = {
  type: "beer",
  ferment: "bottom" / "top",
}

orderwine = {
  type: "wine",
  color: "red" / "white",
}

If you feed {"type": "wine", "ferment": "top"} into this, you get a rather unspecific error message that tells you things don’t match up: the matcher can’t really know whether the "type" value of "wine" or the "ferment" key is the “cause” of neither branch matching.

If you add a cut:

message = orderbeer / orderwine

orderbeer = {
  type: "beer" ^,
  ferment: "bottom" / "top",
}

orderwine = {
  type: "wine" ^,
  color: "red" / "white",
}

the matcher can tell you right away that the key “ferment” is not allowed in an orderwine message.
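
The difference in diagnostics can be sketched in Python. Everything here is an assumption made for illustration: the dictionaries stand in for the two CDDL rules, check and validate are hypothetical helpers, and the message wording is invented. With use_cut set, a matching "type" value commits to that alternative, so the specific problem can be reported.

```python
# Hypothetical stand-ins for the two rules: key -> set of allowed values.
ORDER_BEER = {"type": {"beer"}, "ferment": {"bottom", "top"}}
ORDER_WINE = {"type": {"wine"}, "color": {"red", "white"}}
ALTERNATIVES = {"orderbeer": ORDER_BEER, "orderwine": ORDER_WINE}

def check(alternative, message):
    """Return None if message matches alternative, else the first problem found."""
    for key, value in message.items():
        if key not in alternative:
            return f"key {key!r} is not allowed"
        if value not in alternative[key]:
            return f"value {value!r} is not allowed for key {key!r}"
    for key in alternative:
        if key not in message:
            return f"key {key!r} is missing"
    return None

def validate(alternatives, message, use_cut):
    for name, alternative in alternatives.items():
        if use_cut and message.get("type") in alternative["type"]:
            # Cut: the "type" member matched, so commit to this alternative.
            problem = check(alternative, message)
            return "ok" if problem is None else f"{name}: {problem}"
        if not use_cut and check(alternative, message) is None:
            return "ok"
    return "no alternative matched"

msg = {"type": "wine", "ferment": "top"}
print(validate(ALTERNATIVES, msg, use_cut=False))  # no alternative matched
print(validate(ALTERNATIVES, msg, use_cut=True))   # orderwine: key 'ferment' is not allowed
```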

Grüße, Carsten
