Re: [Cellar] On the multiplicity of Info elements

Michael Niedermayer <michael@niedermayer.cc> Mon, 11 January 2016 12:44 UTC

Date: Mon, 11 Jan 2016 13:43:24 +0100
From: Michael Niedermayer <michael@niedermayer.cc>
To: Steve Lhomme <slhomme@matroska.org>
Message-ID: <20160111124324.GL13213@nb4>
References: <CAHUoETLC4dQQ7=TOuTXZ3aDjKCCJgz2s-8Gb33MoSAP3hgRQiQ@mail.gmail.com> <BEA72D66-EA3D-4CF0-987D-836E95287F39@dericed.com> <20151230091811.GA19636@bunkus.org> <CAOXsMFLCbe-W=h+tQpdRa8Nh0jz=xdbZTXEmoXsgQTbA=4OPCQ@mail.gmail.com> <C0E5EBA2-2A56-46F9-A049-629EFB11F280@dericed.com> <CAOXsMF+gc0d2LEisfHm0jnjDGQKcYquEMBt7FnZ_uuSNF=C0iw@mail.gmail.com> <568AC10F.9030303@gmx.de> <CAOXsMFKJJhzU-3CYqguDePY42T+Vvhx9ytAfvoM6xyqaZY+N4g@mail.gmail.com> <20160105164958.GB13213@nb4> <CAOXsMFJ73VF9N4KPvm9QpOEKzyiUQ9APT2F70A7S5wktihf9aA@mail.gmail.com>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg="pgp-sha1"; protocol="application/pgp-signature"; boundary="1MV0VfA6Y2yiVCnw"
Content-Disposition: inline
In-Reply-To: <CAOXsMFJ73VF9N4KPvm9QpOEKzyiUQ9APT2F70A7S5wktihf9aA@mail.gmail.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Archived-At: <http://mailarchive.ietf.org/arch/msg/cellar/i_fOPISs09nHsKbaEwbszBJ735w>
Cc: cellar@ietf.org, "Sebastian G. <bastik>" <bastik.public.mailinglist@gmx.de>
Subject: Re: [Cellar] On the multiplicity of Info elements

On Mon, Jan 11, 2016 at 10:46:33AM +0100, Steve Lhomme wrote:
> 2016-01-05 17:49 GMT+01:00 Michael Niedermayer <michael@niedermayer.cc>:
> > On Tue, Jan 05, 2016 at 09:20:00AM +0100, Steve Lhomme wrote:
> >> 2016-01-04 19:59 GMT+01:00 Sebastian G. <bastik>
> >> <bastik.public.mailinglist@gmx.de>:
> >> > 04.01.2016, 08:11 Steve Lhomme:
> >> >> 2016-01-03 16:53 GMT+01:00 Dave Rice <dave@dericed.com>:
> >> >>>
> >> >>>> On Jan 2, 2016, at 3:47 AM, Steve Lhomme <slhomme@matroska.org>
> >> >>>> wrote:
> >> >>>>
> >> >>>> 2015-12-30 10:18 GMT+01:00 Moritz Bunkus <moritz@bunkus.org>:
> >> >>>>> Hey,
> >> >>>>>
> >> >>>>> I only remember the discussion around Tracks being multiple,
> >> >>>>> not particularly for the other ones. Our intent way back when
> >> >>>>> was to allow muxers to write multiple instances of _the same
> >> >>>>> information_ in different places in order to make the file more
> >> >>>>> resilient against damage or incomplete downloads with protocols
> >> >>>>> like BitTorrent.
> >> >>>>
> >> >>>> Yes, that's the idea for the Track Info as it's vital to the
> >> >>>> usability of the file, as well as the Segment Info. I'm not sure
> >> >>>> it's used in practice though. Since the goal of CELLAR is
> >> >>>> archiving solutions it may still make sense.
> >> >>>
> >> >>> Perhaps declaring that an Element may be repeated but must be
> >> >>> repeated identically should be a new EBML Element Attribute, so
> >> >>> there can be a distinction between the repeatability of Segment/Info
> >> >>> and the repeatability of SimpleBlock.
> >> >>
> >> >> That might be good. After all, not all elements make sense as repeated
> >> >> ones. For example, in Matroska you don't want a Cluster (timestamped
> >> >> data) to be repeated.
> >> >>
> >> >>>>> The same reasoning could be applied to Info. Both elements are
> >> >>>>> absolutely crucial to playback; the other level 1 elements, save
> >> >>>>> for the clusters, simply aren’t.
> >> >>>
> >> >>> But what should happen when the reader finds differences in
> >> >>> repeated-but-should-be-identical elements?
> >> >>
> >> >> Good question. Maybe repeated elements should have a CRC? If a CRC
> >> >> is wrong (or not found), the parser could look for a copy.
> >> >
> >> > I like the CRC idea for repeated elements, but it still does not define
> >> > how players should behave if they encounter two elements, even with
> >> > valid CRCs, not matching each other.
> >>
> >> Also, what about repeated elements that are not master elements? You'd
> >> have no way of telling which is the best version. So repeated elements
> >> should probably be master elements. Maybe a CRC should be mandatory too
> >> (not sure if real-life files already follow this rule). That's the only
> >> way a parser would be able to tell which version is correct, as far as I
> >> can see.
> >
> > I don't disagree, but I think "no way" is not correct.
> > It's possible that external means, like errors from reading the data,
> > could be used to detect which repeated versions are bad. Parsing errors
> > could also indicate what is bad, and if there are 3 or more copies,
> > simple majority "voting" on a byte-by-byte basis could be used
> > to construct a "good" version. This could also be done when all copies
> > fail their CRC checks. No idea if that would be useful in any actual
> > real-world use case, but a player or repair tool trying to be
> > exceptionally resilient to damage could do things like that.
> 
> True, some "forensics" could be applied to recover the damaged data
> when all CRCs fail. I don't think the specs should specify such
> recovery techniques, nor should a general reader care too much about
> that. The first level of CRC checking (find one that matches) should
> be good enough for that level.
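
A minimal sketch of that first level ("find one that matches"), in
Python. It assumes each repeated copy is available as a (payload,
stored CRC) pair and uses zlib.crc32 as a stand-in checksum; the name
first_valid_copy is hypothetical, and the real EBML CRC-32 element
layout and byte order are not modelled here.

```python
import zlib
from typing import Iterable, Optional, Tuple


def first_valid_copy(copies: Iterable[Tuple[bytes, int]]) -> Optional[bytes]:
    """Return the payload of the first copy whose CRC-32 matches.

    `copies` yields (payload, stored_crc) pairs in file order.
    Returns None when no copy passes its check.
    """
    for payload, stored_crc in copies:
        # zlib.crc32 is only a stand-in for however the container
        # actually stores and computes its checksum (assumption).
        if zlib.crc32(payload) == stored_crc:
            return payload
    return None


good = b"TimestampScale=1000000"
damaged = b"TimestampScale=1000x00"
crc = zlib.crc32(good)

# The first copy is damaged, so the reader falls through to the second.
chosen = first_valid_copy([(damaged, crc), (good, crc)])
```

A reader that gets None back has no copy it can trust and would have
to fall back to whatever recovery it chooses to implement, if any.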

I agree.

The possibility of such "forensic" techniques should be considered
in the design of the spec, though, in places where it might be useful.
I am not sure CRCs are the only such place ...
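
To make the byte-by-byte majority "voting" idea concrete, here is a
rough Python sketch. It assumes at least 3 copies of equal length are
already extracted as byte strings; the name majority_vote is
hypothetical, and nothing here is proposed for the spec itself.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_vote(copies: Sequence[bytes]) -> Optional[bytes]:
    """Rebuild a payload by strict per-byte majority across copies.

    Returns None if there are fewer than 3 copies, if the copies
    differ in length, or if any byte position has no strict majority.
    """
    if len(copies) < 3 or len({len(c) for c in copies}) != 1:
        return None
    out = bytearray()
    # zip(*copies) walks the copies column by column (one byte each).
    for column in zip(*copies):
        value, count = Counter(column).most_common(1)[0]
        if 2 * count <= len(copies):
            return None  # tie or no majority at this position
        out.append(value)
    return bytes(out)


original = b"Segment/Info"
# Three copies, each damaged at a different byte position.
damaged = [b"Xegment/Info", b"Segment/Inxo", b"SegmEnt/Info"]
rebuilt = majority_vote(damaged)
```

If a CRC is stored alongside the copies, a repair tool would probably
want to check the rebuilt bytes against it before trusting the result,
since voting cannot help when the same bytes are damaged in most copies.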

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The worst form of inequality is to try to make unequal things equal.
-- Aristotle