Re: [Cbor] Validation of maps

Francesca Palombini <francesca.palombini@ericsson.com> Thu, 05 October 2017 10:50 UTC

From: Francesca Palombini <francesca.palombini@ericsson.com>
To: Jeffrey Yasskin <jyasskin@chromium.org>, Carsten Bormann <cabo@tzi.org>
CC: Kevin Braun <kbraun@obj-sys.com>, "cbor@ietf.org" <cbor@ietf.org>
Date: Thu, 05 Oct 2017 10:50:38 +0000
Message-ID: <HE1PR0701MB253924A5FBD83848583C053898700@HE1PR0701MB2539.eurprd07.prod.outlook.com>
References: <e16da575-bbed-1f52-c754-9938237aa6bc@obj-sys.com> <3FFCD42B-C1DE-43E3-A06D-608CACD55D86@tzi.org> <CANh-dXnucjNP=eZfrEcrVC6HN0XHk0dcw-C+J56rksWxMbX8=A@mail.gmail.com>
In-Reply-To: <CANh-dXnucjNP=eZfrEcrVC6HN0XHk0dcw-C+J56rksWxMbX8=A@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/d-7URYBLSFx51afeMP39efuCwag>

Agreed with Jeffrey.

Reviving this thread to ask for the working group's opinion:

From: CBOR [mailto:cbor-bounces@ietf.org] On Behalf Of Jeffrey Yasskin
Sent: 20 July 2017 03:16
To: Carsten Bormann <cabo@tzi.org>
Cc: Kevin Braun <kbraun@obj-sys.com>; cbor@ietf.org
Subject: Re: [Cbor] Validation of maps

By the time CDDL makes it to an RFC, we should be answering questions like this by quoting normative text from https://tools.ietf.org/html/draft-greevenbosch-appsawg-cbor-cddl-11#section-3.5, not just pointing at examples.

Jeffrey

On Tue, Jul 18, 2017 at 11:18 PM, Carsten Bormann <cabo@tzi.org> wrote:
Hi Kevin,

> I know the question of more formally specifying validation rules already came up.  One would think map validation would be fairly obvious, but what happens when key types overlap?
>
> For example, I think the intention is that if you have
>
>   top = { 4 => int, *int => tstr }
>
> then the key 4 must be present with an integer value,

Right, that is the only way to match the first field.
(And there is no way to have that as well as another /4/ key with a text string value.)

> and you can have any number of other integer keys with text string values. Okay, but what about:
>
> top = { ? 4 => int, *int => tstr }
>
> We might say this means that if a key of 4 appears, then it must have an int value.  Or, does it allow a key of 4 to appear with a text string value while considering the optional "4 => int" as being absent?

Yes, that is the semantics.  It is not always what a specifier might intend.

The reason is that the map opens a choice point.  A member with key 4 starts to match the first field.  If the value does not match, however (because it is not an int), the matcher falls back to the choice point.  It then tries the other field, and indeed, that matches.
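
The fallback described above can be sketched in a few lines of Python. This is only an illustration of first-match field selection for a single map member, not the code of any CDDL tool; the names FIELDS and match_member are hypothetical, and occurrence limits (such as "?" allowing at most one match) are left out to keep the sketch short.

```python
# Hypothetical sketch of first-match field selection; not taken from any CDDL tool.

def is_int(v):
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(v, int) and not isinstance(v, bool)

def is_tstr(v):
    return isinstance(v, str)

# top = { ? 4 => int, *int => tstr }
# Each field: (label, key predicate, value predicate)
FIELDS = [
    ("? 4 => int", lambda k: is_int(k) and k == 4, is_int),
    ("* int => tstr", is_int, is_tstr),
]

def match_member(key, value, fields=FIELDS):
    """Return the label of the first field matching both key and value, or None."""
    for label, key_ok, value_ok in fields:
        if key_ok(key) and value_ok(value):
            return label
        # No full match: fall back to the choice point and try the next field.
    return None

print(match_member(4, 7))       # ? 4 => int
print(match_member(4, "four"))  # * int => tstr
print(match_member(4, 1.5))     # None
```

The second call shows the behaviour in question: the key 4 with a text string value is accepted by the wildcard field, while the optional "4 => int" field counts as absent.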

In the research underlying CDDL, we have discussed “cuts” (a concept from error handling in Parsing Expression Grammars (PEGs)) as the solution to this.  If ^ represents a cut, write:

top = { ? 4 ^ => int, *int => tstr }

Once the 4 matches, there is no way back; for this member, another match is no longer tried.
A nice side effect is that anything except an int after a key of 4 can give a definite error message of “int expected”.
The cut proposal includes : as an abbreviation for ^=>, so you can simply write:

top = { ? 4: int, *int => tstr }
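
A minimal sketch of how a cut could change the member matching, assuming the semantics described above: once a key matches a field marked with a cut, no later field is tried for that member. The names CutError, FIELDS and match_member are hypothetical and only serve the illustration.

```python
# Hypothetical sketch of a cut in map member matching; not taken from any CDDL tool.

class CutError(Exception):
    """Raised when a key matched a field with a cut but the value did not."""

def is_int(v):
    return isinstance(v, int) and not isinstance(v, bool)

def is_tstr(v):
    return isinstance(v, str)

# top = { ? 4: int, *int => tstr }     (":" abbreviates "^ =>")
# Each field: (label, key predicate, value predicate, has_cut, expected type name)
FIELDS = [
    ("? 4: int", lambda k: is_int(k) and k == 4, is_int, True, "int"),
    ("* int => tstr", is_int, is_tstr, False, "tstr"),
]

def match_member(key, value, fields=FIELDS):
    for label, key_ok, value_ok, has_cut, expected in fields:
        if not key_ok(key):
            continue
        if value_ok(value):
            return label
        if has_cut:
            # The key committed us to this field: report a definite error
            # instead of falling back to the next field.
            raise CutError(f"key {key!r}: {expected} expected")
    return None

print(match_member(4, 7))        # ? 4: int
print(match_member(5, "five"))   # * int => tstr
try:
    match_member(4, "four")
except CutError as e:
    print(e)                     # key 4: int expected
```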

> Given the examples in the spec, I guess the intention is for such a thing to mean the key 4, if present, has to have an int value.

Which example leads you to this conclusion?

>  So, there is some kind of "match the most specific key" rule implied (I guess).

Actually, the PEG semantics we have borrowed here is that the *first* match is used.  But only rules are matched that indeed match!

> How that rule applies in more complex situations (where there is some kind of nesting) probably needs to be spelled out....  Given:
>
>   top = { 1 => 1, ? ( 5 => 5, 6 => 6 ), *int => tstr }
>
> Must keys 5 & 6 be present together,

Yes.

The whole group in the parentheses is optional.

> or does the wildcard allow only one of them to appear?

(That was an early semantics we tried, and it leads down the drain.
It is much better to have a matcher that simply and stupidly follows what’s in the grammar.)

> Or, given:
>
>   top = { 1 => 1, ( 5 => 5 // 6 => 6 ), *int => tstr }
>
> does this mean { 1 : 1, 5 : 5, 6 : "hi" }  is not valid?

No.  The first field eats the 1: 1, the second field only matches the 5: 5, so the third field gets to eat zero or more int: tstr, of which 6: "hi" is a match.

> Is the 6 free to match the wildcard when the 5 has satisfied the group choice?

Yes.
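
The group choice example can be sketched as follows, under two simplifying assumptions that are not from the thread: the group choice is expanded by hand into two alternative field lists, and members are assigned greedily to the first field that matches and still has capacity (a real matcher would need proper backtracking). All names are hypothetical.

```python
# Hypothetical sketch; greedy assignment stands in for a real backtracking matcher.

INF = float("inf")

def is_int(v):
    return isinstance(v, int) and not isinstance(v, bool)

def is_tstr(v):
    return isinstance(v, str)

def const(c):
    # Predicate for an integer constant such as 5
    return lambda v: is_int(v) and v == c

# Each field: (key predicate, value predicate, min occurrences, max occurrences)
ONE = (const(1), const(1), 1, 1)
FIVE = (const(5), const(5), 1, 1)
SIX = (const(6), const(6), 1, 1)
WILD = (is_int, is_tstr, 0, INF)

# top = { 1 => 1, ( 5 => 5 // 6 => 6 ), *int => tstr }
ALTERNATIVES = [[ONE, FIVE, WILD], [ONE, SIX, WILD]]

def matches(members, fields):
    counts = [0] * len(fields)
    for key, value in members.items():
        for i, (key_ok, value_ok, lo, hi) in enumerate(fields):
            if counts[i] < hi and key_ok(key) and value_ok(value):
                counts[i] += 1
                break
        else:
            return False  # no field accepts this member
    # Every field must have been matched at least its minimum number of times
    return all(counts[i] >= f[2] for i, f in enumerate(fields))

def validate(members):
    return any(matches(members, alt) for alt in ALTERNATIVES)

print(validate({1: 1, 5: 5, 6: "hi"}))  # True
print(validate({1: 1, 5: 5, 6: 6}))     # False
print(validate({1: 1}))                 # False
```

In the first call, 5: 5 satisfies the group choice and 6: "hi" is taken by the wildcard. In the second call, only one branch of the choice can be taken, and the leftover member has an int value that the wildcard does not accept.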

>
> Then there are cases where "most specific key" has no meaning,

(Again, we use “first match”.)

> such as when two key types overlap each other and neither is a single-value type.  Consider:
>
>   top = { * (0..10) => tstr, * (5..15) => int }
>
> Does this mean a key of 5 can have either a text string or an int value?

As long as there are no cuts here, yes.

> Or, does it require that a key of 5, if present, must have a value that is both a text string and an int at the same time (i.e. it disallows 5 to appear)?

That would never be the semantics — the fact that there are two branches in a choice that can be fulfilled is not an error.

With a cut like this:

top = { * (0..10) ^ => tstr, * (5..15) => int }

this could mean that keys 0..10 cut the choice and therefore need to have a text string value, while the rest, 11..15, can be integers, because the choice is cut after matching 0..10.
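
A small sketch of the overlapping-range case, with and without the cut on the first field. It checks a single member only and assumes the cut semantics described above; the names in_range and accepts are hypothetical.

```python
# Hypothetical sketch of overlapping key ranges, with an optional cut on the first field.

def is_int(v):
    return isinstance(v, int) and not isinstance(v, bool)

def is_tstr(v):
    return isinstance(v, str)

def in_range(lo, hi):
    return lambda k: is_int(k) and lo <= k <= hi

def accepts(key, value, cut_first):
    """Check one member against { * (0..10) [^] => tstr, * (5..15) => int }."""
    fields = [
        (in_range(0, 10), is_tstr, cut_first),
        (in_range(5, 15), is_int, False),
    ]
    for key_ok, value_ok, has_cut in fields:
        if not key_ok(key):
            continue
        if value_ok(value):
            return True
        if has_cut:
            return False  # committed to this field; no fallback
    return False

# Without a cut, key 5 may carry either type.
print(accepts(5, "five", cut_first=False))  # True
print(accepts(5, 5, cut_first=False))       # True
# With the cut, keys 0..10 must carry a text string; 11..15 may still be ints.
print(accepts(5, 5, cut_first=True))        # False
print(accepts(12, 12, cut_first=True))      # True
```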

So far, we haven’t seen a use case that actually needed the cut, but it is still nice to have that error message.
(We also haven’t implemented it yet, although we will certainly do that over time.)

Another example where a cut helps:

message = orderbeer / orderwine

orderbeer = {
  type: "beer",
  ferment: "bottom" / "top",
}

orderwine = {
  type: "wine",
  color: "red" / "white",
}

If you feed {"type": "wine", "ferment": "top"} into this, you get a rather unspecific error message that tells you things don't match up: the matcher can't really know whether the "type" value of "wine" or the "ferment" key is the "cause" of neither branch matching.

If you add a cut:

message = orderbeer / orderwine

orderbeer = {
  type: "beer" ^,
  ferment: "bottom" / "top",
}

orderwine = {
  type: "wine" ^,
  color: "red" / "white",
}

the matcher can tell you right away that the key “ferment” is not allowed in an orderwine message.
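
The difference in error reporting can be sketched like this. The sketch assumes that a cut after the "type" value commits the matcher to that branch; the schema encoding (allowed values per key as Python sets), the message wording and the function names are hypothetical.

```python
# Hypothetical sketch of error reporting with and without a cut; not from any CDDL tool.

# message = orderbeer / orderwine, with allowed values per key
SCHEMAS = {
    "orderbeer": {"type": {"beer"}, "ferment": {"bottom", "top"}},
    "orderwine": {"type": {"wine"}, "color": {"red", "white"}},
}

def check_branch(msg, fields):
    """Return None if msg matches the branch, else a specific error string."""
    for key, value in msg.items():
        if key not in fields:
            return f'key "{key}" is not allowed'
        if value not in fields[key]:
            return f'value "{value}" is not allowed for key "{key}"'
    for key in fields:
        if key not in msg:
            return f'key "{key}" is missing'
    return None

def validate(msg, use_cut):
    for name, fields in SCHEMAS.items():
        if use_cut and msg.get("type") in fields["type"]:
            # The "type" value matched: commit to this branch and report
            # its specific error instead of trying the other branch.
            err = check_branch(msg, fields)
            return "ok" if err is None else f"{name}: {err}"
        if not use_cut and check_branch(msg, fields) is None:
            return "ok"
    return "no branch of message matches"

msg = {"type": "wine", "ferment": "top"}
print(validate(msg, use_cut=False))  # no branch of message matches
print(validate(msg, use_cut=True))   # orderwine: key "ferment" is not allowed
```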

Grüße, Carsten
