Re: [Cbor] Handling duplicate map keys

Jim Schaad <ietf@augustcellars.com> Mon, 25 November 2019 04:35 UTC

From: Jim Schaad <ietf@augustcellars.com>
To: 'Jeffrey Yasskin' <jyasskin@chromium.org>, 'Laurence Lundblade' <lgl@island-resort.com>
CC: 'Jeffrey Yasskin' <jyasskin=40google.com@dmarc.ietf.org>, cbor@ietf.org
References: <CANh-dX=EVDa4EwrgWKof4kCD3nkfV3BvH0Cg5ZKOivmXJ_Dm8g@mail.gmail.com> <E5C74E1F-A5F4-4AB8-9787-3999C4697C3B@island-resort.com> <CANh-dX=1gkAtfSG-yzCsVAkk=oLM-=dN_JLCr1kQK3d6Jb0fSw@mail.gmail.com> <F81E1A57-6072-44FA-A148-8F3ED7520791@island-resort.com> <CANh-dXmxzoEjb-yhL-R1p-xvBF8kZwOpzoS_fziPfuxFhbykhA@mail.gmail.com>
In-Reply-To: <CANh-dXmxzoEjb-yhL-R1p-xvBF8kZwOpzoS_fziPfuxFhbykhA@mail.gmail.com>
Date: Sun, 24 Nov 2019 20:35:44 -0800
Message-ID: <052d01d5a349$ceaecd00$6c0c6700$@augustcellars.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/NGfhDTeNrZsCeD7KYpzdwA9xvGg>
Subject: Re: [Cbor] Handling duplicate map keys

From: CBOR <cbor-bounces@ietf.org> On Behalf Of Jeffrey Yasskin
Sent: Sunday, November 24, 2019 2:49 AM
To: Laurence Lundblade <lgl@island-resort.com>
Cc: Jeffrey Yasskin <jyasskin=40google.com@dmarc.ietf.org>; Jeffrey Yasskin <jyasskin@chromium.org>; cbor@ietf.org
Subject: Re: [Cbor] Handling duplicate map keys

 

Comments inline:

 

On Sat, Nov 23, 2019 at 7:21 AM Laurence Lundblade <lgl@island-resort.com> wrote:

On Nov 23, 2019, at 1:11 AM, Jeffrey Yasskin <jyasskin@chromium.org> wrote:

 

On Sat, Nov 23, 2019 at 10:00 AM Laurence Lundblade <lgl@island-resort.com> wrote:

Here’s a rough proposed text:

 

The protocol designer should make a choice for maps as to whether duplicates are allowed or not, particularly as to whether duplicates would cause security or functionality problems.

 

The protocol designer should only require duplicate detection when necessary as it can have the following implications:

*	Some generic decoders do not support duplicate detection because the underlying facilities in their programming environment to represent maps can’t detect duplicates
*	Some generic decoders do not support duplicate detection because it requires more code and is not required

*	It requires more resources to implement: 1) memory to store all the keys in the map, 2) more code, 3) more CPU time, as much as O(n^2). This is particularly an issue with big maps in constrained environments.

It is suggested that protocol designers require duplicate detection only on the particular maps for which there is an issue.
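
The resource trade-off listed above can be sketched as follows. This is only an illustration, not taken from any real decoder: the function names are hypothetical, and it assumes each map key is available as its encoded bytes so that two keys are duplicates exactly when their encodings are equal.

```python
from typing import Iterable, List


def has_duplicate_quadratic(encoded_keys: List[bytes]) -> bool:
    """Compare every key against every later key.

    Needs no bookkeeping beyond the key list itself, but performs
    O(n^2) comparisons in the worst case.
    """
    for i in range(len(encoded_keys)):
        for j in range(i + 1, len(encoded_keys)):
            if encoded_keys[i] == encoded_keys[j]:
                return True
    return False


def has_duplicate_with_set(encoded_keys: Iterable[bytes]) -> bool:
    """Remember every key seen so far in a hash set.

    Roughly O(n) time, at the price of extra memory proportional
    to the number (and size) of keys in the map.
    """
    seen = set()
    for key in encoded_keys:
        if key in seen:
            return True
        seen.add(key)
    return False


# Hypothetical encoded keys: the last one repeats the second.
keys = [b"\x01", b"\x02", b"\x61a", b"\x02"]
print(has_duplicate_quadratic(keys), has_duplicate_with_set(keys))
```

On a constrained device, neither option is free: the first costs CPU time on big maps, the second costs memory.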

 

Decoders will typically fall into one of these categories:

*	Full duplicate detection
*	Pass all items up to caller, allowing the caller to implement duplicate detection or not
*	No duplicate detection

A generic decoder should identify which it is. Some may support more than one, selectable by configuration.
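
To make the three categories concrete, here is a minimal sketch of a decoder back end with a selectable mode. It is an assumption-laden illustration: `DupMode`, `DuplicateKeyError` and `build_map` are hypothetical names, the input is a list of already-decoded (key, value) pairs rather than real CBOR bytes, and keys are assumed to be plain strings.

```python
from enum import Enum
from typing import Any, Dict, List, Tuple, Union

Pair = Tuple[Any, Any]


class DupMode(Enum):
    FULL_DETECTION = "full"   # decoder rejects duplicate keys itself
    PASS_THROUGH = "pass"     # decoder hands every item to the caller
    NO_DETECTION = "none"     # decoder builds a native map, no checks


class DuplicateKeyError(ValueError):
    """Raised in FULL_DETECTION mode when a key repeats."""


def build_map(pairs: List[Pair], mode: DupMode) -> Union[Dict[Any, Any], List[Pair]]:
    if mode is DupMode.PASS_THROUGH:
        # Nothing is lost: the caller sees every pair, duplicates included,
        # and can apply its own policy.
        return list(pairs)
    if mode is DupMode.NO_DETECTION:
        # Python's dict constructor silently keeps the last value for a
        # repeated key, so the duplicate becomes invisible to the caller.
        return dict(pairs)
    result: Dict[Any, Any] = {}
    for key, value in pairs:
        if key in result:
            raise DuplicateKeyError(f"duplicate map key: {key!r}")
        result[key] = value
    return result


pairs = [("a", 1), ("b", 2), ("a", 3)]
print(build_map(pairs, DupMode.PASS_THROUGH))
print(build_map(pairs, DupMode.NO_DETECTION))
```

The NO_DETECTION branch also shows the first implication in the list above: a native map type can hide duplicates before any check has a chance to run.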

 

I don't think generic decoders are relevant here. Protocol implementations often accumulate their input into a data structure even if they use a streaming decoder to read the input. Any such protocol can detect duplicates (of the keys it uses at all) for at most the cost of an extra bit per field.
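
A minimal sketch of the "extra bit per field" idea, under stated assumptions: the labels, the `Message` structure and `accumulate` are hypothetical, and the (label, value) pairs stand in for whatever a streaming decoder would hand to the protocol layer.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Optional, Tuple

# Hypothetical integer labels for an imaginary protocol.
LABEL_NAME = 1
LABEL_SIZE = 2
LABEL_FLAGS = 3

# One bit per field the protocol actually uses.
_BIT = {LABEL_NAME: 1 << 0, LABEL_SIZE: 1 << 1, LABEL_FLAGS: 1 << 2}
_ATTR = {LABEL_NAME: "name", LABEL_SIZE: "size", LABEL_FLAGS: "flags"}


@dataclass
class Message:
    name: Optional[str] = None
    size: Optional[int] = None
    flags: Optional[int] = None
    seen: int = 0  # bitmask recording which fields have been filled


def accumulate(items: Iterable[Tuple[Any, Any]]) -> Message:
    """Fill a Message from streamed (label, value) pairs.

    Duplicates of fields the protocol uses are rejected; keys the
    protocol does not use are skipped without being stored.
    """
    msg = Message()
    for label, value in items:
        # Only exact ints are treated as labels (this also excludes bool).
        if type(label) is not int or label not in _BIT:
            continue
        bit = _BIT[label]
        if msg.seen & bit:
            raise ValueError(f"duplicate key {label!r}")
        msg.seen |= bit
        setattr(msg, _ATTR[label], value)
    return msg


msg = accumulate([(LABEL_NAME, "example"), (99, "ignored"), (LABEL_SIZE, 42)])
print(msg.name, msg.size, msg.flags)
```

No set of all keys is kept here. The only extra state is the `seen` bitmask, which is why the cost stays at one bit per field regardless of how many unused keys the input contains.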

 

My intention was that “pass all items up” covers your case. 

 

My QCBOR + t_cose implementations work exactly as you describe to implement dup detection in COSE header parameters. It’s the data structures that hold the COSE header params that are used to detect the duplicates, not the generic decoder (QCBOR).  This seems like a good way to handle this problem for lots of use cases. 

 

Yep, that's the model I'm thinking of. I'm clearly struggling to put it into understandable spec language. 

I think you make a good point that if a protocol is going to specify "take first", it can probably specify "error on duplicate key" for the same cost. That might imply some much simpler text:

 

"Protocols based on CBOR SHOULD fail with an error if a map contains a duplicate key, except that if the key isn't used at all, they MAY ignore it instead. Protocols that do not reject duplicate keys MUST (?) document why the cost of rejecting duplicates is too high and why accepting them will not lead to security vulnerabilities. An array might be a better choice for such protocols.”

 

I think I’d invert it and say that protocols that require duplicate detection for security reasons should describe that requirement in security considerations so the implementor gets a good solid hint that they need to worry about it.

 

There's an interesting difference of approach here. It's plausible to say that protocol designers should pick the secure design only when they realize their design has security implications, but I prefer to say that they should pick the secure design unless they think about it and realize the design doesn't have security implications. I think the second will get us noticeably more security in a world where designers don't have time to think about every aspect of every piece of their design.

 

[JLS] It also gets fun in some other ways. It might not matter for the protocol I am designing, because I always reflect back the input that I use to the client application. That means it does not matter what the duplicate behavior is right up to the point in time where I decide that I am going to apply COSE to it. I now have multiple potential behaviors that are required for the application.

 

Jim

 

 

There is a lot of text implying that duplicates are bad, but I think it would be worth being explicit.

 

You are an idiot if you design a protocol that considers duplicate input valid or necessary. Duplicate map keys are always an error in CBOR and will not work with many generic decoders.

 

I wouldn't say "idiot", just "wrong". The spec already says you "MUST NOT" do this, but I'm not opposed to making that statement more direct.

 

Thanks,
Jeffrey