Re: [Cellar] Security considerations: recursive elements

Jerome Martinez <jerome@mediaarea.net> Wed, 17 January 2018 21:01 UTC

To: cellar@ietf.org
References: <CAHUoETL6+2XokNy5skB7dzjuzowoL8kV9gNLgd6HeJYiZcXpOQ@mail.gmail.com>
From: Jerome Martinez <jerome@mediaarea.net>
Message-ID: <ef896210-ed4b-7afe-5e4f-bd99298acb51@mediaarea.net>
Date: Wed, 17 Jan 2018 22:01:41 +0100
In-Reply-To: <CAHUoETL6+2XokNy5skB7dzjuzowoL8kV9gNLgd6HeJYiZcXpOQ@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cellar/Kxya03EbdW14QOvZQzgXn1BLRUA>
Subject: Re: [Cellar] Security considerations: recursive elements

On 17/01/2018 21:33, Michael Bradshaw wrote:
> [...]
>
> For example, something like "a parser SHOULD handle recursion up to X 
> levels deep, and MAY abort the parse if it reaches Y levels deep".
>
> Thoughts from others?

An EBML parser cannot descend into nested elements without the
corresponding dictionary (from the DocType).
So IMO it is not like XML or JSON: an XML or JSON parser tries to
read whatever nested elements it finds, but an EBML parser does not try
to read any element that is not in the dictionary (so the limit is the
dictionary, which usually cannot be provided by the attacker).
It is like JSON or XML only if the attacker can provide the dictionary.
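
To illustrate the point above, here is a minimal sketch (not a real EBML
library) of a parser that only descends into elements its dictionary
declares as master elements. The two-entry DICTIONARY, the function
names, and the sample bytes are assumptions made for this example, and
unknown-size elements are not handled.

```python
MASTER, LEAF = "master", "leaf"

# Hypothetical two-entry dictionary; a real one comes from the DocType's spec.
DICTIONARY = {
    0x1A45DFA3: ("EBML", MASTER),
    0x4282: ("DocType", LEAF),
}


def read_vint(buf, pos, keep_marker):
    """Decode one variable-length integer and return (value, next_pos)."""
    if pos >= len(buf):
        raise ValueError("truncated input")
    first = buf[pos]
    length, mask = 1, 0x80
    # The position of the first set bit gives the total length in bytes.
    while length <= 8 and not (first & mask):
        mask >>= 1
        length += 1
    if length > 8 or pos + length > len(buf):
        raise ValueError("invalid or truncated variable-length integer")
    # Element IDs keep the length marker bit, element sizes drop it.
    value = first if keep_marker else first & (mask - 1)
    for byte in buf[pos + 1:pos + length]:
        value = (value << 8) | byte
    return value, pos + length


def parse(buf, start=0, end=None, depth=0, found=None):
    """Collect (depth, name) for every element known to the dictionary."""
    end = len(buf) if end is None else end
    found = [] if found is None else found
    pos = start
    while pos < end:
        elem_id, pos = read_vint(buf, pos, keep_marker=True)
        size, pos = read_vint(buf, pos, keep_marker=False)
        payload_end = pos + size
        if payload_end > end:
            raise ValueError("element overruns its parent")
        name, kind = DICTIONARY.get(elem_id, (None, None))
        if name is not None:
            found.append((depth, name))
        if kind == MASTER:
            # Recursion happens only for masters listed in the dictionary.
            parse(buf, pos, payload_end, depth + 1, found)
        # Unknown elements are skipped by size; their payload is never parsed.
        pos = payload_end
    return found


sample = (
    bytes.fromhex("1A45DFA3") + b"\x91"   # EBML header, 17 payload bytes
    + b"\x42\x82\x88" + b"matroska"       # DocType = "matroska"
    + b"\x81\x84\x81\x82\x81\x80"         # unknown ID 0x81 wrapping nested data
)
print(parse(sample))  # [(0, 'EBML'), (1, 'DocType')]
```

The nested bytes inside the unknown element never trigger a recursive
call, so in this sketch the depth reached is decided by the dictionary
and not by the file.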

And the JSON spec specifies no limit:
https://tools.ietf.org/html/rfc7159#section-9
"An implementation may set limits on the maximum depth of nesting"

There may be legitimate reasons to have thousands of nesting levels, and
there is already a lot of debate about that with JSON (the last tests I
saw showed a limit of ~100 levels for the default Ruby JSON parser and
~1000 levels for the Python JSON parser).
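
As an illustration of "an implementation may set limits on the maximum
depth of nesting", here is one possible sketch of a caller-chosen limit
applied before handing the text to Python's json module. The names
nesting_depth and loads_limited are made up for this example, and the
limit of 100 is arbitrary.

```python
import json


def nesting_depth(text):
    """Return the deepest array/object nesting level found in a JSON text."""
    depth = deepest = 0
    in_string = escaped = False
    for ch in text:
        if in_string:
            # Brackets inside string literals do not count as nesting.
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "[{":
            depth += 1
            deepest = max(deepest, depth)
        elif ch in "]}":
            depth -= 1
    return deepest


def loads_limited(text, max_depth):
    """Parse JSON, rejecting input nested deeper than max_depth."""
    if nesting_depth(text) > max_depth:
        raise ValueError("nesting deeper than %d levels" % max_depth)
    return json.loads(text)


print(nesting_depth('{"a": [1, {"b": "]]]["}]}'))  # 3
```

The limit is a policy decision of the caller, which is why no single
number fits every use.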

So the sentence should separate the two sources of security issues, the
input (the file itself) and the dictionary (e.g. the Matroska specs):
- a generic parser (reading the dictionary from the input) MAY set limits
on the maximum depth of nesting. An implementation MAY set limits on the
length and contents.
- a specialized parser (e.g. a Matroska parser) SHOULD handle recursion up
to the maximum nesting level provided by the supported dictionary of the
document, or 2 nesting levels, whichever is larger (a specialized
format could have only 1 nesting level but EBML needs at least 2, for
DocType etc.). An alternative is to say that implementations should rely
on the security paragraph of the supported format.
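
For the specialized-parser case, "the maximum nesting level provided by
the supported dictionary" could be computed along these lines. This is
only a sketch: the dictionary below is a simplified, hypothetical excerpt
(parent name mapped to child names), and max_nesting is a name made up
for this example. Note that a dictionary with a recursive element has no
finite maximum, so the sketch raises in that case.

```python
# Simplified, hypothetical excerpt: parent element -> allowed children.
DICTIONARY = {
    "EBML": ["DocType", "DocTypeVersion"],
    "Segment": ["Info", "Tracks"],
    "Tracks": ["TrackEntry"],
    "TrackEntry": ["TrackNumber", "Video"],
    "Video": ["PixelWidth", "PixelHeight"],
}
ROOTS = ["EBML", "Segment"]


def max_nesting(dictionary, roots):
    """Return the deepest nesting level the dictionary allows."""

    def depth(name, ancestors):
        if name in ancestors:
            # An element that can contain itself allows unbounded nesting.
            raise ValueError("recursive element %r: no finite maximum" % name)
        children = dictionary.get(name, [])
        below = (depth(child, ancestors | {name}) for child in children)
        return 1 + max(below, default=0)

    return max(depth(root, frozenset()) for root in roots)


# Deepest path here: Segment / Tracks / TrackEntry / Video / PixelWidth
print(max_nesting(DICTIONARY, ROOTS))  # 5
```

A parser for such a non-recursive dictionary could use that value as its
limit; for a dictionary with recursive elements, a limit has to come from
somewhere else, for example from the format's own security paragraph.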

I am not in favor of writing a number, because there is no good number
to provide; it depends too much on the content. I am in favor of doing
what the JSON sentences do (no number, but explicitly saying that
limits are expected).
I am undecided whether to put this kind of text in a "parser" section
instead of the security section, as the JSON RFC does.