Re: [Cellar] Second AD review of draft-ietf-cellar-ebml-10

"Alexey Melnikov" <aamelnikov@fastmail.fm> Thu, 24 October 2019 12:34 UTC

Message-Id: <8cdc2a28-787b-419e-84b5-9b998a18390d@www.fastmail.com>
In-Reply-To: <89CEEC4A-78D9-40BF-8A4D-732C7A199F30@dericed.com>
References: <3835cda8-7bfb-4178-bec7-b0acff9327ba@www.fastmail.com> <F50D112A-91E8-482B-A78F-8557480331BC@dericed.com> <52b3f63f-fddb-4438-be5f-f61359307f98@www.fastmail.com> <89CEEC4A-78D9-40BF-8A4D-732C7A199F30@dericed.com>
Date: Thu, 24 Oct 2019 13:33:57 +0100
From: Alexey Melnikov <aamelnikov@fastmail.fm>
To: Dave Rice <dave@dericed.com>
Cc: cellar@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/cellar/koyh6rLPjpqwmL9iUx2xouKD2g4>
Subject: Re: [Cellar] Second AD review of draft-ietf-cellar-ebml-10

Hi Dave,

On Thu, Jul 11, 2019, at 2:53 PM, Dave Rice wrote:
>> On Jul 11, 2019, at 9:36 AM, Alexey Melnikov <aamelnikov@fastmail.fm> wrote:
>> 
>> Hi Dave,
>> I removed comments where we are in agreement. A few followups:
>> 
>> On Sat, Jul 6, 2019, at 6:23 PM, Dave Rice wrote:
>>> Hello Alexey,
>>> 
>>> Thanks much for providing this thorough review. I’ll try to reply point by point below with references to either pull requests or issues or offer followup questions. In a few places I ping Steve Lhomme and Michael Richardson as the comments relate to text they originated in their work.
>>> 
>>>> EBMLElementOccurrence = [EBMLMinOccurrence] "*" [EBMLMaxOccurrence]
>>>> EBMLMinOccurrence = 1*DIGIT
>>>> EBMLMaxOccurrence = 1*DIGIT
>>>> 
>>>> Are there any upper limits on allowed values for these fields? Even if you don't encode them using ABNF, it would be good to mention them in an ABNF comment.
>>>> 
>>>> VariableParentOccurrence = [PathMinOccurrence] "*" [PathMaxOccurrence]
>>>> PathMinOccurrence = 1*DIGIT
>>>> PathMaxOccurrence = 1*DIGIT
>>>> 
>>>> Same comment as above.
>>> 
>>> I looked for examples of that sort of commenting but didn’t find much guidance. Eventually I simply appended " ; no upper limit" to each of the four referenced lines and added that to https://github.com/cellar-wg/ebml-specification/pull/265.
>> I think this is the wrong fix. Is it sufficient for an implementation to use a 32-bit value to represent any of these? A 64-bit value?
>> "No upper limit" is not going to be interoperable.
> 
> The smallest possible EBML Element is 2 bytes (1 byte of Element ID, 1 byte of Element Data Size, and 0 bytes of Element Data). The upper limit on the number of occurrences is determined by the limit of the Element Data Size of the Parent Element. If the EBML Document has EBMLMaxSizeLength at its default of 8, then the upper limit of an Element Data Size is 72,057,594,037,927,934 bytes. If the smallest possible Element is 2 bytes, then the upper limit of that Element’s occurrence would be 72,057,594,037,927,934 divided by two.
> 
> So if EBMLMaxSizeLength = 8, then this is possible
> 
> <RootElement>
> <TwoByteElement/> # 36,028,797,018,963,967 occurrences
> </RootElement>
> 
> but this would overflow the Element Data Size of the Root Element
> <RootElement>
> <TwoByteElement/> # 36,028,797,018,963,968 occurrences
> </RootElement>
> 
> However, EBML allows the EBML Document Type to set an EBMLMaxSizeLength value higher than 8, and as that value is incremented the upper limit would expand exponentially.
I will try one more time: if I write an implementation that has a hardcoded limit of 8, would it be considered compliant with this specification? I think asking implementations to support arbitrary-length values is a big ask.

> In my draft I would consider "no upper limit" and 36,028,797,018,963,967 to be effectively the same, or I would say that the definition does not define the limit but that in practice it is limited by the capacity of the Element Data Size of the parent Element.
I will let the document start IETF LC without this issue being fully resolved, but I suspect it will come up again during IESG review.
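
For reference, the arithmetic quoted above can be checked with a short Python sketch. It assumes the usual EBML variable-size integer layout, in which each octet of the Element Data Size contributes 7 usable bits and the all-ones value is reserved for "unknown size"; that assumption reproduces the 72,057,594,037,927,934 figure for a length of 8. The function names are hypothetical and chosen only for illustration.

```python
def max_element_data_size(size_length: int) -> int:
    """Largest Element Data Size for a size field of `size_length` octets.

    Assumption: 7 usable bits per octet, with the all-ones value
    reserved for "unknown size" (hence the "- 2", not "- 1").
    """
    if size_length < 1:
        raise ValueError("size_length must be at least 1")
    return 2 ** (7 * size_length) - 2


def max_occurrences(size_length: int, smallest_element: int = 2) -> int:
    """Upper bound on child occurrences inside one parent element,
    assuming every child is the smallest possible element (2 bytes)."""
    return max_element_data_size(size_length) // smallest_element


def fits_unsigned(value: int, bits: int) -> bool:
    """True if `value` is representable as an unsigned `bits`-bit integer."""
    return 0 <= value < 2 ** bits


limit = max_occurrences(8)
print(f"{limit:,}")                 # 36,028,797,018,963,967
print(fits_unsigned(limit, 32))     # False
print(fits_unsigned(limit, 64))     # True
```

Under these assumptions, a 32-bit counter is too small when EBMLMaxSizeLength is 8, while a 64-bit counter is sufficient; a size length of 10 or more would exceed 64 bits.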

>>>> 11.1.10.1. label
>>>> 
>>>>  The label provides a concise expression for human consumption that
>>>>  describes what the value of the "<enum>" represents.
>>>> 
>>>> Is it worth adding a cross reference to the "lang" attribute here?
>>> 
>>> Do you mean to express the language of the term used within the label? Currently the language of the label is undefined and since it is an attribute that label is not repeatable.
>> 
>> To be honest I am not yet sure how I feel about "undefined language" here. Need to think about that.
>> But either way, I think adding some text saying that the "lang" attribute doesn't apply would be helpful.
> 
> In the case of Matroska, the labels are in English. In the EBML definition we could say that the labels are in English unless the definition of that associated EBML Document Type claims otherwise.
Ok.

>>>> 12. Considerations for Reading EBML Data
>>>> 
>>>>  If a Master Element contains a CRC-32 Element that doesn't validate,
>>>>  then the EBML Reader MAY ignore all contained data except for
>>>>  Descendant Elements that contain their own valid CRC-32 Element.
>>>> 
>>>> I don't fully understand your use of "MAY ... except ..." here.
>>>> Can you elaborate on why an implementation would ignore data contained in a Master Element and not ignore Descendant Elements, even if their own CRC-32 is valid?
>>> 
>>> For instance, suppose a Matroska file has three metadata tags, each with a CRC value, and the parent Tags element has one as well, like this:
>>> 
>>> <Tags crc=invalid>
>>>  <Tag crc=valid>
>>>  <Tag crc=valid>
>>>  <Tag crc=invalid>
>>> </Tags>
>>> 
>>> We’re trying to say that even though the contents of the <Tags> element are invalid, the valid child elements may still be used.
>> So to me this means that after discarding all invalid elements you end up with something like this:
>> 
>> <Tags >
>>  <Tag crc=valid>
>> </Tags>
>> 
>> As this is an incomplete document, I am struggling to understand what it can be used for?
> 
> In that case, the valid Tags are still useful even if they sit within a parent element whose contents are invalid. Perhaps another example would be in the Attachments Element:
> 
> <Attachments crc=invalid>
> <Attachment crc=invalid>Poster Art</Attachment>
> <Attachment crc=valid>Subtitle Font</Attachment>
> </Attachments>
> 
> Here some bit damage occurs to the Poster Art, so that Attachment and its parent Attachments now have invalid CRCs. However, the subtitle font is still OK and could be used in the presentation without an issue.
I suggest some examples along the lines of your reply should be added to the document. This is not a property commonly found in IETF formats, so talking more about it would be useful.
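
As one possible illustration of the behaviour discussed above, the following Python sketch walks a simplified element tree and keeps only the data a reader could still trust. It is a toy model, not an EBML parser: the `Element` class, the `crc_ok` flag (standing in for the result of checking a CRC-32 Element), and the `salvage` function are all hypothetical names, and the flat Attachment entries mirror the simplified example in the quoted text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Element:
    name: str
    # True/False: this element's own CRC-32 validated or failed.
    # None: the element carries no CRC-32 of its own.
    crc_ok: Optional[bool] = None
    payload: str = ""
    children: List["Element"] = field(default_factory=list)


def salvage(element: Element, parent_trusted: bool = True) -> List[str]:
    """Collect payloads that are still usable.

    An element with its own CRC is trusted exactly when that CRC is valid.
    An element without a CRC inherits trust from its nearest ancestor.
    Children are always visited, so a valid descendant survives even
    inside a parent whose CRC failed.
    """
    trusted = parent_trusted if element.crc_ok is None else element.crc_ok
    usable: List[str] = []
    if trusted and element.payload:
        usable.append(element.payload)
    for child in element.children:
        usable.extend(salvage(child, trusted))
    return usable


attachments = Element(
    name="Attachments",
    crc_ok=False,
    children=[
        Element(name="Attachment", crc_ok=False, payload="Poster Art"),
        Element(name="Attachment", crc_ok=True, payload="Subtitle Font"),
    ],
)

print(salvage(attachments))  # ['Subtitle Font']
```

In this model the damaged poster art is dropped, while the subtitle font is kept because its own CRC still validates.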

Best Regards,
Alexey