Re: [Cbor] Validation of maps
Jeffrey Yasskin <jyasskin@chromium.org> Thu, 20 July 2017 01:16 UTC
From: Jeffrey Yasskin <jyasskin@chromium.org>
Date: Thu, 20 Jul 2017 03:15:52 +0200
Message-ID: <CANh-dXnucjNP=eZfrEcrVC6HN0XHk0dcw-C+J56rksWxMbX8=A@mail.gmail.com>
To: Carsten Bormann <cabo@tzi.org>
Cc: Kevin Braun <kbraun@obj-sys.com>, cbor@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/60Jc-t0SwMA2x0MsnY1zx4fZaGI>
Subject: Re: [Cbor] Validation of maps
By the time CDDL makes it to an RFC, we should be answering questions like this by quoting normative text from https://tools.ietf.org/html/draft-greevenbosch-appsawg-cbor-cddl-11#section-3.5, not just pointing at examples.

Jeffrey

On Tue, Jul 18, 2017 at 11:18 PM, Carsten Bormann <cabo@tzi.org> wrote:

> Hi Kevin,
>
> > I know the question of more formally specifying validation rules already came up. One would think map validation would be fairly obvious, but what happens when key types overlap?
> >
> > For example, I think the intention is that if you have
> >
> >     top = { 4 => int, *int => tstr }
> >
> > then the key 4 must be present with an integer value,
>
> Right, that is the only way to match the first field.
> (And there is no way to have that as well as another /4/ key with a text string value.)
>
> > and you can have any number of other integer keys with text string values. Okay, but what about:
> >
> >     top = { ? 4 => int, *int => tstr }
> >
> > We might say this means that if a key of 4 appears, then it must have an int value. Or, does it allow a key of 4 to appear with a text string value while considering the optional "4 => int" as being absent?
>
> Yes, that is the semantics. It is not always what a specifier might intend.
>
> The reason is that the map opens a choice point. A member with key 4 is starting to match the field. If the value however does not match (because there is no int), the matcher falls back to the choice point. It then tries the other field, and indeed, that matches.
>
> In the research underlying CDDL, we have discussed “cuts” (a concept from error handling in Parsing Expression Grammars (PEGs)) as the solution to this. If ^ represents a cut, write:
>
>     top = { ? 4 ^ => int, *int => tstr }
>
> Once the 4 matches, there is no way back; for this member, another match is no longer tried.
> A nice side effect is that anything except an int after a key of 4 can give a definite error message of “int expected”.
> The cut proposal includes : as an abbreviation for ^=>, so you can simply write:
>
>     top = { ? 4: int, *int => tstr }
>
> > Given the examples in the spec, I guess the intention is for such a thing to mean the key 4, if present, has to have an int value.
>
> Which example leads you to this conclusion?
>
> > So, there is some kind of "match the most specific key" rule implied (I guess).
>
> Actually, the PEG semantics we have borrowed here is that the *first* match is used. But only rules are matched that indeed match!
>
> > How that rule applies in more complex situations (where there is some kind of nesting) probably needs to be spelled out.... Given:
> >
> >     top = { 1 => 1, ? ( 5 => 5, 6 => 6 ), *int => tstr }
> >
> > Must keys 5 & 6 be present together,
>
> Yes.
> The whole group in the parentheses is optional.
>
> > or does the wildcard allow only one of them to appear?
>
> (That was an early semantics we tried, and it leads down the drain.
> It is much better to have a matcher that simply and stupidly follows what’s in the grammar.)
>
> > Or, given:
> >
> >     top = { 1 => 1, ( 5 => 5 // 6 => 6 ), *int => tstr }
> >
> > does this mean { 1 : 1, 5 : 5, 6 : "hi" } is not valid?
>
> No. The first field eats the 1: 1, the second field only matches the 5: 5, so the third field gets to eat zero or more int: tstr, of which 6: “hi” is a match.
>
> > Is the 6 free to match the wildcard when the 5 has satisfied the group choice?
>
> Yes.
>
> > Then there are cases where "most specific key" has no meaning,
>
> (Again, we use “first match”.)
>
> > such as when two key types overlap each other and neither is a single-value type. Consider:
> >
> >     top = { * (0..10) => tstr, * (5..15) => int }
> >
> > Does this mean a key of 5 can have either a text string or an int value?
>
> As long as there are no cuts here, yes.
>
> > Or, does it require that a key of 5, if present, must have a value that is both a text string and an int at the same time (i.e. it disallows 5 to appear)?
>
> That would never be the semantics; the fact that there are two branches in a choice that can be fulfilled is not an error.
>
> With a cut like this:
>
>     top = { * (0..10) ^ => tstr, * (5..15) => int }
>
> this could mean that keys 0..10 cut the choice and therefore need to have a text string value, while the rest, 11..15, can be integers, because the choice is cut after matching 0..10.
>
> So far, we haven’t seen a use case that actually needed the cut, but it is still nice to have that error message.
> (We also haven’t implemented it yet, although we will certainly do that over time.)
>
> Another example where a cut helps:
>
>     message = orderbeer / orderwine
>
>     orderbeer = {
>       type: "beer",
>       ferment: "bottom" / "top",
>     }
>
>     orderwine = {
>       type: "wine",
>       color: "red" / "white",
>     }
>
> If you feed {"type": "wine", "ferment": "top"} into this, you get a rather unspecific error message that tells you things don’t match up: the matcher can’t really know whether the “type” value of “wine” or the “ferment” key is the “cause” of neither branch matching.
>
> If you add a cut:
>
>     message = orderbeer / orderwine
>
>     orderbeer = {
>       type: "beer" ^,
>       ferment: "bottom" / "top",
>     }
>
>     orderwine = {
>       type: "wine" ^,
>       color: "red" / "white",
>     }
>
> the matcher can tell you right away that the key “ferment” is not allowed in an orderwine message.
>
> Grüße, Carsten
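
To make the "first match" and "cut" behaviour discussed in the quoted message concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the CDDL tool's implementation and not normative text: the names Field, validate, min_count, max_count and cut are hypothetical, only flat fields are modelled (no nested groups and no // group choices), and map members are processed one at a time in dictionary order.

```python
from dataclasses import dataclass
from typing import Any, Callable

Pred = Callable[[Any], bool]
INF = float("inf")


@dataclass
class Field:
    """One map field such as `? 4 => int` (hypothetical model, flat fields only)."""
    name: str            # text used in error messages
    key: Pred            # does the member key match?
    value: Pred          # does the member value match?
    min_count: int = 1   # 0 models "?" and "*"
    max_count: float = 1 # INF models "*"
    cut: bool = False    # models "^": no fallback once the key has matched


def is_int(x: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(x, int) and not isinstance(x, bool)


def is_tstr(x: Any) -> bool:
    return isinstance(x, str)


def eq(c: Any) -> Pred:
    return lambda x: type(x) is type(c) and x == c


def in_range(lo: int, hi: int) -> Pred:
    return lambda x: is_int(x) and lo <= x <= hi


def validate(data: dict, fields: list[Field]) -> list[str]:
    """Return a list of error strings; an empty list means the map matched."""
    errors = []
    counts = [0] * len(fields)
    for k, v in data.items():
        for i, f in enumerate(fields):
            if counts[i] >= f.max_count or not f.key(k):
                continue
            if f.value(v):
                counts[i] += 1   # first matching field wins
                break
            if f.cut:
                # Key matched and the field is cut: report, do not fall back.
                errors.append(f"key {k!r}: value does not match {f.name}")
                break
            # No cut: fall back to the choice point and try the next field.
        else:
            errors.append(f"key {k!r}: no field matches")
    for i, f in enumerate(fields):
        if counts[i] < f.min_count:
            errors.append(f"missing required field {f.name}")
    return errors


wildcard = Field("* int => tstr", is_int, is_tstr, 0, INF)
required = [Field("4 => int", eq(4), is_int), wildcard]
optional_no_cut = [Field("? 4 => int", eq(4), is_int, 0), wildcard]
optional_with_cut = [Field("? 4 ^ => int", eq(4), is_int, 0, cut=True), wildcard]
overlap_no_cut = [
    Field("* (0..10) => tstr", in_range(0, 10), is_tstr, 0, INF),
    Field("* (5..15) => int", in_range(5, 15), is_int, 0, INF),
]
overlap_with_cut = [
    Field("* (0..10) ^ => tstr", in_range(0, 10), is_tstr, 0, INF, cut=True),
    Field("* (5..15) => int", in_range(5, 15), is_int, 0, INF),
]

print(validate({4: "hi"}, optional_no_cut))    # falls back to the wildcard
print(validate({4: "hi"}, optional_with_cut))  # the cut reports a definite error
print(validate({5: 1}, overlap_no_cut))        # the second field accepts 5 => int
```

In this sketch, { 4: "hi" } is accepted without the cut because the member falls through to the wildcard field, and rejected with a specific message once the first field carries a cut, which mirrors the behaviour described in the quoted message. A matcher covering nested groups and // choices would need real backtracking over groups rather than this per-member loop.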