Re: [Cellar] Benjamin Kaduk's Discuss on draft-ietf-cellar-ebml-15: (with DISCUSS and COMMENT)

Steve Lhomme <slhomme@matroska.org> Fri, 27 December 2019 09:18 UTC

To: Benjamin Kaduk <kaduk@mit.edu>
Cc: The IESG <iesg@ietf.org>, Steven Villereal <villereal@gmail.com>, draft-ietf-cellar-ebml@ietf.org, cellar-chairs@ietf.org, Codec Encoding for LossLess Archiving and Realtime transmission <cellar@ietf.org>
References: <157676970970.27491.11040479061607849531.idtracker@ietfa.amsl.com> <CAOXsMFJKk3HTEjoAJ9URhGt97SA++kNDp3HCVMscj+qED5+VgA@mail.gmail.com> <20191224184147.GP35479@kduck.mit.edu>
From: Steve Lhomme <slhomme@matroska.org>
Message-ID: <3fb0107e-a5d3-5d3a-0d15-f556b97755ae@matroska.org>
Date: Fri, 27 Dec 2019 10:17:58 +0100
Archived-At: <https://mailarchive.ietf.org/arch/msg/cellar/hU9TvVi9qzuAeqoUIZVK3S4K0Fk>

On 2019-12-24 19:41, Benjamin Kaduk wrote:
> On Tue, Dec 24, 2019 at 04:20:25PM +0100, Steve Lhomme wrote:

>>> Section 11.1.16
>>>
>>>     Identically Recurring Elements SHOULD include a CRC-32 Element as a
>>>     Child Element; this is especially recommended when EBML is used for
>>>     long-term storage or transmission.  If a Parent Element contains more
>>>
>>> I'm not sure if the "long-term" is intended to also bind as "long-term
>>> transmission" (though I'm not sure what it would mean in that case).
>>> It's also not entirely clear what kinds of transmission would benefit
>>> from this, as reliable media presumably don't need redundancy for
>>> reliability, but unreliable media can't really be used to carry EBML
>>> without some framing requirements to know when elements start.
>>
>> Actually, as long as you have the "packets" in the right order you can
>> use the EBML stream. You can also use it if you're missing some
>> packets. The Checksum can help determine if the data are valid or not,
>> even if the underlying transport loses some bits. It could technically
>> be used as-is as protocol on top of IP, like TCP or UDP.
> 
> Huh, interesting.  Though I thought that UDP (and IP itself) didn't
> guarantee in-order delivery.

UDP and IP have no notion of order. But you can create protocols on top 
that do. Basically an EBML-based streaming format would just be:
[Packet Number][EBML element]
With each EBML element smaller than 4000 or 1512 octets to fit nicely in 
Ethernet.
Anyway, that's definitely out of the scope of this document ;)
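
To make the [Packet Number][EBML element] idea concrete, here is a minimal sketch in Python. The 4-byte big-endian packet number and the names frame, unframe and reassemble are assumptions for illustration only; nothing like this is defined by the draft.

```python
import struct

# Hypothetical header: a 4-byte big-endian packet number (an assumption).
HEADER = struct.Struct(">I")

def frame(packet_number, ebml_element):
    """Prefix one serialized EBML element with its packet number."""
    return HEADER.pack(packet_number) + ebml_element

def unframe(datagram):
    """Split a datagram back into (packet number, EBML element bytes)."""
    (packet_number,) = HEADER.unpack_from(datagram)
    return packet_number, datagram[HEADER.size:]

def reassemble(datagrams):
    """Return (elements in packet-number order, missing packet numbers).

    Datagrams may arrive in any order, as UDP allows. Gaps between the
    lowest and highest number seen are reported rather than treated as fatal.
    """
    received = dict(unframe(d) for d in datagrams)
    if not received:
        return [], []
    first, last = min(received), max(received)
    missing = [n for n in range(first, last + 1) if n not in received]
    ordered = [received[n] for n in sorted(received)]
    return ordered, missing
```

The receiver can then feed the ordered elements to an EBML reader and decide for itself what to do about the missing numbers.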

>>> Section 12
>>>
>>>     If a Master Element contains a CRC-32 Element that doesn't validate,
>>>     then the EBML Reader MAY ignore all contained data except for
>>>     Descendant Elements that contain their own valid CRC-32 Element.
>>>
>>> Ignoring only part of the known questionable content could have
>>> significant security considerations, if (e.g.) security-relevant
>>> restrictions are in the garbled part of the document but the sensitive
>>> content has a (valid) redundant CRC.
>>
>> That's why it's a MAY. If a Matroska Segment has a CRC and each frame
>> in it has a CRC. If the top CRC is invalid, we can still use some of
>> the frames that have a valid CRC. It's not a requirement but a
>> possibility.
> 
> I agree that it's an implementation choice for whether or not to do this,
> but please add some text in the Security Considerations that mentions the
> risk of handling incomplete-but-interdependent data when implementations
> choose to do this sort of thing.

I think it really depends on the dependency of the data at the semantic 
level. For example, an EBML Element may define the type of data found in 
another EBML Element. If the type is damaged, the interpretation of the 
data can create many kinds of issues.

As for the CRC I think it's similar. It depends on what the CRC covers 
and the decision should probably be made at the semantic level, even 
with various options on how to handle the data.
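
A rough sketch of how a reader might combine the two levels, in Python. It uses a simplified in-memory model rather than real EBML parsing: the Element class, content_bytes, the requires mapping and the element names are all hypothetical, and zlib.crc32 merely stands in for the CRC-32 check.

```python
import zlib
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Element:
    name: str
    data: bytes = b""
    children: List["Element"] = field(default_factory=list)
    stored_crc: Optional[int] = None  # value of a CRC-32 child, if present

def content_bytes(el):
    """Stand-in for the serialized bytes the CRC would cover."""
    return el.data + b"".join(
        c.name.encode() + content_bytes(c) for c in el.children)

def crc_ok(el):
    return zlib.crc32(content_bytes(el)) == el.stored_crc

def salvage(el, trusted=True):
    """Names of elements kept under the Section 12 MAY rule.

    An element with its own CRC is judged by that CRC alone; an element
    without one inherits the verdict of its nearest checked ancestor.
    """
    if el.stored_crc is not None:
        trusted = crc_ok(el)
    kept = [el.name] if trusted else []
    for child in el.children:
        kept += salvage(child, trusted)
    return kept

def usable(kept, requires):
    """Semantic pass: drop elements whose dependencies were not kept."""
    return [n for n in kept if all(d in kept for d in requires.get(n, []))]
```

With a damaged top-level CRC, a frame carrying its own valid CRC survives salvage, but usable still drops it if the element defining its type was lost. That second decision is the one that belongs to the format built on EBML.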