Re: [Cellar] My notes on https://datatracker.ietf.org/doc/draft-ietf-cellar-flac/

Martijn van Beurden <mvanb1@gmail.com> Mon, 03 April 2023 16:45 UTC

From: Martijn van Beurden <mvanb1@gmail.com>
Date: Mon, 03 Apr 2023 18:45:00 +0200
Message-ID: <CADQbU6-0Ys5xkG8cUvMQv5PSXdCHd8C-0JVbFz5q+9HKCT30gw@mail.gmail.com>
To: Spencer Dawkins at IETF <spencerdawkins.ietf@gmail.com>
Cc: Codec Encoding for LossLess Archiving and Realtime transmission <cellar@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cellar/ZTZjZnXQdgir58W7TdNAn7ikNC4>
Subject: Re: [Cellar] My notes on https://datatracker.ietf.org/doc/draft-ietf-cellar-flac/

Hi once again,

You said 'could you update the working group' but I read 'could you update
the draft'.

I've tried to address all comments as best I could and tried to find
similar issues you might have missed. Additionally I've addressed some
issues that were raised by others and some things I found while working on
the draft.

The document has improved quite a bit thanks to your input and I think it
is ready for the next step.

Kind regards, Martijn van Beurden

On Mon, 3 Apr 2023 at 16:39, Martijn van Beurden <mvanb1@gmail.com> wrote:

> Hi Spencer,
>
> There were no comments that needed further clarification.
>
> Please note that there are a few changes/improvements in the new draft not
> related to your review. It might be a good idea to look at the diff at some
> point. In particular, the extension of the coded number section might be
> worth taking a look at.
>
> Kind regards,
>
> Martijn van Beurden
>
> On Sun, 2 Apr 2023 at 23:56, Spencer Dawkins at IETF <
> spencerdawkins.ietf@gmail.com> wrote:
>
>> Hi, Martijn,
>>
>> Could you update the working group on the status of the FLAC draft? Do
>> you think there are additional shepherd comments that you need
>> clarifications on?
>>
>> Best,
>>
>> Spencer
>>
>> On Tue, Feb 28, 2023, 00:15 Spencer Dawkins at IETF <
>> spencerdawkins.ietf@gmail.com> wrote:
>>
>>> Hi, again,
>>>
>>> On Sun, Feb 26, 2023 at 8:42 PM Spencer Dawkins at IETF <
>>> spencerdawkins.ietf@gmail.com> wrote:
>>>
>>>> Dear Cellar,
>>>>
>>>> I'm still working my way through
>>>> https://datatracker.ietf.org/doc/draft-ietf-cellar-flac/. I still need
>>>> to read the Appendices, but I wanted to pass along my notes so far.
>>>>
>>>> In general, I like the document. I do have questions, especially about
>>>> the use of BCP14 terminology, and some suggestions for readability, but
>>>> this should get discussion started.
>>>>
>>>> I expect to have the reviews of the appendices done by tomorrow at
>>>> close of business, my time.
>>>>
>>>
>>> "Define 'close of business'", but I finished reading through the
>>> appendices, and looked back at the IANA Considerations section.
>>>
>>> I said in my first set of notes that I like the document, and I like it
>>> more, now that I've had time to reflect on it.
>>>
>>> A large number of my comments are about BCP 14 requirements language, so
>>> please don't think of them as a lot of comments. They were really two or
>>> three comments made multiple times:
>>>
>>>    - Whether a BCP 14 keyword was an interoperability requirement or a
>>>    statement of fact,
>>>    - If a MUST is correct, what a decoder should do if it detects that
>>>    the MUST was not satisfied, and
>>>    - If a SHOULD is correct, why it's not a MUST, and what the decoder
>>>    should do if it's not satisfied.
>>>
>>> I really like all the appendices in this document - they were especially
>>> well-written and seemed helpful.
>>>
>>> I should also mention that I noticed one or two things in the Appendices
>>> that made me wonder about what I'd read in the formal specification, so
>>> added comments that should be addressed in the formal specification.
>>>
>>> Please do ask me questions about anything I seem to be getting wrong or
>>> didn't explain clearly, and remember - the working group participants are
>>> the experts on this technology.
>>>
>>> Best,
>>>
>>> Spencer
>>> ------
>>>
>>> 14.  IANA Considerations
>>>
>>> Could I suggest that you add the following text here?
>>>
>>> "In accordance with the procedures set forth in [RFC4288], this document
>>> registers one new media type, "audio/flac", as defined in the following
>>> section."
>>>
>>> And please do add the reference, of course.
>>>
>>> 14.1.  Media type registration
>>>
>>>    The following information serves as the registration form for the
>>>
>>>    "audio/flac" media type.  This media type is applicable for FLAC
>>>
>>>    audio packaged in its native container.  FLAC audio packaged in
>>>
>>>    another container will take on the media type of its container, for
>>>
>>>    example audio/ogg when packaged in an Ogg container or video/mp4 when
>>>
>>>    packaged in a MP4 container alongside a video track.
>>>
>>> I'd suggest putting a blank line after each line below.
>>>
>>>    Type name: audio
>>>
>>>    Subtype name: flac
>>>
>>>    Required parameters: none
>>>
>>>    Optional parameters: none
>>>
>>>    Encoding considerations: as per this document
>>>
>>> I'd suggest saying "as per THISRFC".
>>>
>>>    Security considerations: see section 12
>>>
>>> I'd suggest saying "see the Security Considerations in section 12 of
>>> THISRFC".
>>>
>>>    Interoperability considerations: no known concerns
>>>
>>> I would have guessed you'd say "see the descriptions of past format
>>> changes in Appendix B of THISRFC".
>>>
>>>    Published specification: THISRFC
>>>
>>>    Applications that use this media type: ffmpeg, apache, firefox
>>>
>>>    Fragment identifier considerations: none
>>>
>>>    Additional information:
>>>
>>>      Deprecated alias names for this type: audio/x-flac
>>>
>>>      Magic number(s): fLaC
>>>
>>>      File extension(s): flac
>>>
>>>      Macintosh file type code(s): none
>>>
>>>    Person & email address to contact for further information: IETF CELLAR WG
>>>
>>>    Intended usage: COMMON
>>>
>>>    Restrictions on usage: N/A
>>>
>>>    Author: IETF CELLAR WG
>>>
>>>    Change controller: IESG
>>>
>>> Best practice for stating the change controller has varied wildly over
>>> time. My suggestion is to provide this as "Internet Engineering Task Force
>>> (mailto:iesg@ietf.org)."
>>>
>>>    Provisional registration? (standards tree only): NO
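
The magic number in the registration template above lends itself to a quick check. What follows is a minimal sketch, assuming the four bytes "fLaC" sit at the very start of a stream in the native container. The function name is hypothetical, and the check makes no attempt to handle data that something else may have prepended to the file.

```python
FLAC_MAGIC = b"fLaC"  # magic number as listed in the registration template


def looks_like_native_flac(data: bytes) -> bool:
    """Return True if the data begins with the FLAC magic number.

    This only inspects the first four bytes; it does not validate
    the rest of the stream.
    """
    return data[:4] == FLAC_MAGIC


if __name__ == "__main__":
    print(looks_like_native_flac(b"fLaC\x00\x00\x00\x22"))
    print(looks_like_native_flac(b"OggS\x00\x02"))
```

FLAC audio inside another container (for example Ogg) would fail this check by design, which matches the template's statement that such files take on the media type of their container.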
>>>
>>> Appendix A.  Numerical considerations
>>>
>>>    In order to maintain lossless behavior, all arithmetic used in
>>>
>>>    encoding and decoding sample values MUST be done with integer data
>>>
>>>    types to eliminate the possibility of introducing rounding errors
>>>
>>>    associated with floating-point arithmetic.  Use of floating-point
>>>
>>>    representations in analysis (e.g. finding a good predictor or Rice
>>>
>>>    parameter) is not a concern, as long as the process of using the
>>>
>>>    found predictor and Rice parameter to encode audio samples is
>>>
>>>    implemented with only integer math.
>>>
>>>    Furthermore, the possibility of integer overflow MUST be eliminated
>>>
>>>    by using data types large enough to never overflow.
>>>
>>> I have two observations here. First, I think you can say "can be
>>> eliminated" - this is a statement of fact. Beyond that, all of these
>>> appendices say they are non-normative, so BCP 14 language in explicitly
>>> non-normative (parts of) standards track specifications is likely to
>>> confuse both reviewers and implementers who haven't participated in the
>>> working group.
>>>
>>> Second, I'd suggest saying "using data types large enough to never
>>> overflow in practice".
>>>
>>>    Choosing a
>>>
>>>    64-bit signed data type for all arithmetic involving sample values
>>>
>>>    would make sure the possibility for overflow is eliminated, but
>>>
>>>    usually smaller data types are chosen for increased performance,
>>>
>>>    especially in embedded devices.  This section will provide guidelines
>>>
>>>    for choosing the right data type in each step of encoding and
>>>
>>>    decoding FLAC files.
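
To illustrate how a data type narrower than 64 bits might be chosen, here is a rough sketch of a worst-case width calculation for the sum computed by a linear predictor. The bound used (sample bit depth, plus coefficient precision, plus the rounded-up base-2 logarithm of the predictor order) is a conservative assumption made for this sketch and is not taken from the text quoted above. Both function names are hypothetical.

```python
def predictor_sum_bits(sample_bits: int, coeff_bits: int, order: int) -> int:
    """Conservative number of signed bits needed to hold a prediction sum.

    Each product of a sample and a coefficient fits in
    sample_bits + coeff_bits bits; summing `order` such products can
    add at most ceil(log2(order)) further bits.
    """
    if min(sample_bits, coeff_bits, order) < 1:
        raise ValueError("all arguments must be positive")
    growth = (order - 1).bit_length()  # equals ceil(log2(order)) for order >= 1
    return sample_bits + coeff_bits + growth


def smallest_signed_type(bits_needed: int) -> int:
    """Return the narrowest of 16, 32 or 64 bits that can hold bits_needed."""
    for width in (16, 32, 64):
        if bits_needed <= width:
            return width
    raise OverflowError("no supported integer type is wide enough")


if __name__ == "__main__":
    # 16-bit audio, 12-bit coefficients, order 8: 31 bits, so 32-bit math suffices.
    print(smallest_signed_type(predictor_sum_bits(16, 12, 8)))
    # 24-bit audio, 15-bit coefficients, order 32: 44 bits, so 64-bit math is needed.
    print(smallest_signed_type(predictor_sum_bits(24, 15, 32)))
```

The bit depth passed in should be the one actually being predicted, which may differ from the nominal bit depth of the audio.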
>>>
>>> Appendix C.  Interoperability considerations
>>>
>>>    As documented in appendix past format changes (#past-format-changes),
>>>
>>>    there have been some changes and additions to the FLAC format.
>>>
>>>    Additionally, implementation of certain features of the FLAC format
>>>
>>>    took many years, meaning early decoder implementations could not be
>>>
>>>    tested against files with these features.  Finally, many lower-
>>>
>>>    quality FLAC decoders only implement a subset of FLAC features
>>>
>>>    required for playback of the most common FLAC files.
>>>
>>> I woke up after reading this sentence, and went back looking through the
>>> document for mentions of FLAC "subset". I don't think that's mentioned
>>> before the end of Section 7, and that's a forward reference to Section 8,
>>> which describes this.
>>>
>>> I'd suggest mentioning in both the Abstract and Introduction that this
>>> specification includes the specification for the complete FLAC format, as
>>> well as the FLAC subset. My apologies for the late mention about this!
>>>
>>>    This appendix provides some considerations for encoder
>>>
>>>    implementations aiming to create highly compatible files.  As this
>>>
>>>    topic is one that might change after this document is finished,
>>>
>>>    consult this web page (https://github.com/ietf-wg-cellar/flac-
>>>
>>>    specification/wiki/Interoperability-considerations) for more up-to-
>>>
>>>    date information.
>>>
>>> C.1.  Non-subset streams
>>>
>>>    As described in section format subset (#format-subset), FLAC
>>>
>>>    specifies a subset of its capabilities as the Subset format.  Certain
>>>
>>>    decoders may choose to only decode FLAC files conforming to the
>>>
>>>    limitations imposed by the Subset.  Therefore, when maximum
>>>
>>>    compatibility with decoders is desired it is RECOMMENDED to stay
>>>
>>>    within the limitations of the FLAC Subset format.
>>>
>>> This last sentence ^^^^ pretty clearly isn't a BCP 14 RECOMMENDED, it's
>>> a statement of fact. I'd recommend replacing it with something like
>>> "Therefore, maximum compatibility with decoders is achieved when FLAC files
>>> are created using the limited FLAC Subset format."
>>>
>>> C.2.  Variable block size
>>>
>>>    Because it is often difficult to find the optimal arrangement of
>>>
>>>    block sizes for maximum compression, most encoders choose to create
>>>
>>>    files with a fixed block size.  Because of this many decoder
>>>
>>>    implementations suffer from bugs when handling variable block size
>>>
>>>    streams or do not decode them at all.
>>>
>>> This ^^^^ might be clearer as "Because of this many decoder
>>> implementations receive minimal use when handling variable block size
>>> streams, and this can reveal bugs, or reveal that the implementations do
>>> not decode them at all."
>>>
>>>    Furthermore, as is explained
>>>
>>>    in section addition of block size strategy flag (#addition-of-block-
>>>
>>>    size-strategy-flag), there have been some changes to the way variable
>>>
>>>    block size streams were encoded.  Because of this, when maximum
>>>
>>>    compatibility with decoders is desired it is RECOMMENDED to only use
>>>
>>>    fixed block size streams.
>>>
>>> Again, this ^^^^ last sentence is a statement of fact. I'd suggest
>>> something like "Therefore, maximum compatibility with decoders is achieved
>>> when FLAC files are created using fixed block size streams."
>>>
>>> C.3.  5-bit Rice parameter
>>>
>>>    As the addition of the 5-bit Rice parameter as described in section
>>>
>>>    addition of 5-bit Rice parameter (#addition-of-5-bit-rice-parameter)
>>>
>>>    was quite a few years after the FLAC format was first introduced,
>>>
>>>    some early decoders might not be able to decode files containing such
>>>
>>>    Rice parameters.  The introduction of this was specifically aimed at
>>>
>>>    improving compression of 24-bit PCM audio and compression of 16-bit
>>>
>>>    PCM audio only rarely benefits from using a 5-bit Rice parameters.
>>>
>>>    Therefore, when maximum compatibility with decoders is desired it is
>>>
>>>    RECOMMENDED to not use 5-bit Rice parameters when encoding audio with
>>>
>>>    a bit depth of 16 bits or lower.
>>>
>>> Again, this ^^^^ last sentence is a statement of fact. I'd suggest
>>> something like "Therefore, maximum compatibility with decoders is achieved
>>> when FLAC files are created using 4-bit Rice parameters when encoding audio
>>> with a bit depth of 16 bits or lower."
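
The advice above could be turned into a small encoder-side check along the following lines. This is a sketch only. It assumes that a 4-bit Rice parameter field can carry the values 0 through 14 and a 5-bit field the values 0 through 30, with the all-ones value of each field reserved as the escape code; those ranges reflect my understanding of the format and are not stated in the quoted text. The function names are hypothetical.

```python
def rice_parameter_field_width(max_parameter: int) -> int:
    """Return 4 or 5, the field width needed to store max_parameter.

    Assumes 0..14 fits a 4-bit field and 15..30 needs a 5-bit field,
    with the all-ones value of either field reserved as an escape.
    """
    if not 0 <= max_parameter <= 30:
        raise ValueError("Rice parameter out of range")
    return 4 if max_parameter <= 14 else 5


def follows_compatibility_advice(bit_depth: int, max_parameter: int) -> bool:
    """True if the choice avoids 5-bit parameters at 16 bits or lower."""
    if bit_depth > 16:
        return True  # the advice only concerns bit depths of 16 or lower
    return rice_parameter_field_width(max_parameter) == 4


if __name__ == "__main__":
    print(follows_compatibility_advice(16, 14))
    print(follows_compatibility_advice(16, 15))
    print(follows_compatibility_advice(24, 20))
```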
>>>
>>> C.4.  Rice escape code
>>>
>>>    Escapes Rice partitions are only seldom used as it turned out their
>>>
>>>    use provides only very small compression improvement.  As many
>>>
>>>    encoders therefore do not use these by default or are not capable of
>>>
>>>    producing them at all, it is likely many decoder implementation are
>>>
>>>    not able to decode them correctly.  Therefore, when maximum
>>>
>>>    compatibility with decoders is desired it is RECOMMENDED to not use
>>>
>>>    escaped Rice partitions.
>>>
>>> :-) Same comment.
>>>
>>> "Therefore, maximum compatibility with decoders is achieved when FLAC
>>> files are created without using escaped Rice partitions."
>>>
>>> C.5.  Uncommon block size
>>>
>>>    For unknown reasons some decoders have chosen to support only common
>>>
>>>    block sizes except for the last block.  Therefore, when maximum
>>>
>>>    compatibility with decoders is desired it is RECOMMENDED to only use
>>>
>>>    common block sizes as listed in section block size bits (#block-size-
>>>
>>>    bits) for all but the last block.
>>>
>>> "Therefore, maximum compatibility with decoders is achieved when FLAC
>>> files are created only using common block sizes as listed in section block
>>> size bits (#block-size-bits) for all but the last block."
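
A check for this could look roughly like the sketch below. The set of common block sizes is reproduced from my reading of the block size bits table (192, then 144 times a power of two from 576 to 4608, then the powers of two from 256 to 32768) and should be verified against the draft itself. The names are hypothetical.

```python
from typing import Sequence

# Assumed list of common block sizes; verify against the block size bits table.
COMMON_BLOCK_SIZES = frozenset(
    [192]
    + [144 * 2**n for n in range(2, 6)]   # 576, 1152, 2304, 4608
    + [2**n for n in range(8, 16)]        # 256 through 32768
)


def uses_only_common_block_sizes(block_sizes: Sequence[int]) -> bool:
    """True if every block except the last has a common block size.

    The last block is exempt because it holds whatever samples remain.
    """
    return all(size in COMMON_BLOCK_SIZES for size in block_sizes[:-1])


if __name__ == "__main__":
    print(uses_only_common_block_sizes([4096, 4096, 1234]))
    print(uses_only_common_block_sizes([4096, 1234, 4096]))
```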
>>>
>>> C.6.  Uncommon bit depth
>>>
>>>    Most audio is stored in bit depths that are a whole number of bytes,
>>>
>>>    e.g. 8, 16 or 24 bit.  There is however audio with different bit
>>>
>>>    depths.  A few examples:
>>>
>>>    *  DVD-Audio has the possibility to store 20 bit PCM audio
>>>
>>>    *  DAT and DV can store 12 bit PCM audio
>>>
>>>    *  NICAM-728 samples at 14 bit, which is companded to 10 bit
>>>
>>>    *  8-bit µ (U+00B5)-law can be losslessly converted to 14 bit
>>>
>>>       (Linear) PCM
>>>
>>>    *  8-bit A-law can be losslessly converted to 13 bit (Linear) PCM
>>>
>>>    FLAC can store these bit depths directly, but because they are
>>>
>>>    uncommon, some decoders are not able to process the resulting files
>>>
>>>    correctly.  It is possible to store these formats in a FLAC file with
>>>
>>>    a more common bit depth without sacrificing compression by padding
>>>
>>>    each sample with zero bits to a bit depth that is a whole byte.  FLAC
>>>
>>>    will detect these wasted bits.  This transformation leaves no
>>>
>>>    ambiguity in how it can be reversed and is thus lossless.  See
>>>
>>>    section wasted bits per sample (#wasted-bits-per-sample) for details.
>>>
>>>    Therefore, when maximum compatibility with decoders is required, it
>>>
>>>    is RECOMMENDED to pad samples of such audio with zero bits to the bit
>>>
>>>    depth that is the next whole number of bytes.
>>>
>>> "Therefore, maximum compatibility with decoders is achieved when FLAC
>>> files are created by padding samples of such audio with zero bits to the
>>> bit depth that is the next whole number of bytes."
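
The padding described above, and why it can be reversed, can be sketched as follows. This is an illustration under simple assumptions: samples are plain signed integers, padding is a left shift by the number of missing bits, and "wasted bits" means the count of least-significant bits that are zero in every sample. The helper names are hypothetical and not taken from any FLAC implementation.

```python
from typing import List, Tuple


def pad_to_whole_bytes(samples: List[int], bit_depth: int) -> Tuple[List[int], int]:
    """Shift samples left so the bit depth becomes the next multiple of 8."""
    padded_depth = -(-bit_depth // 8) * 8  # round up to a whole number of bytes
    shift = padded_depth - bit_depth
    return [s << shift for s in samples], padded_depth


def count_wasted_bits(samples: List[int]) -> int:
    """Count the low-order bits that are zero in every sample."""
    combined = 0
    for s in samples:
        combined |= s
    if combined == 0:
        return 0  # all samples are zero; nothing meaningful to report
    return (combined & -combined).bit_length() - 1


if __name__ == "__main__":
    original = [-524288, 1, 524287]             # extremes of 20-bit audio
    padded, depth = pad_to_whole_bytes(original, 20)
    wasted = count_wasted_bits(padded)
    restored = [s >> wasted for s in padded]    # exact, since the low bits are zero
    print(depth, wasted, restored == original)
```

Because the shifted-out bits are all zero, shifting right by the detected count restores the original values exactly. Note that if the original samples already share trailing zero bits, the detected count exceeds the padding amount, so a decoder-side tool needs to know the original bit depth to undo only the padding.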
>>>
>>> Appendix D.  Examples
>>>
>>>    This informational appendix contains short example FLAC files which
>>>
>>>    are decoded step by step.  These examples provide a more engaging way
>>>
>>>    to understand the FLAC format than the formal specification.  The
>>>
>>>    text explaining these examples assumes the reader has at least
>>>
>>>    cursory read the specification
>>>
>>> I had to look it up ^^^^, but "cursorily" is the adverb form of this
>>> word.
>>>
>>>    and that the reader refers to the
>>>
>>>    specification for explanation of the terminology used.  These
>>>
>>>    examples mostly focus on the lay-out of several metadata blocks and
>>>
>>>    subframe types and the implications of certain aspects (for example
>>>
>>>    wasted bits and stereo decorrelation) on this lay-out.
>>>
>>> I wonder if it is helpful to contrast the examples in this Appendix to
>>> the tests in the FLAC decoder testbench that you mention at the end of the
>>> Security Considerations section.
>>>
>>> D.1.3.  Signature and streaminfo
>>>
>>>    Note that anywhere a number of samples is mentioned (block size,
>>>
>>>    total number of samples, sample rate), interchannel samples are
>>>
>>>    meant.
>>>
>>> Is this ^^^^ always the case? I ask because I didn't see this helpful
>>> guidance anywhere in the formal specification. If it's always the case,
>>> this statement should occur at the right place in the formal specification
>>> (I defer to the authors on where the right place is).
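
If the answer is yes, the practical consequence is that durations follow from the sample count and sample rate alone, without regard to the channel count. A small sketch, assuming that an interchannel sample means one sample per channel taken at the same instant (the function names are hypothetical):

```python
from fractions import Fraction


def duration_seconds(total_interchannel_samples: int, sample_rate_hz: int) -> Fraction:
    """Exact duration; the channel count does not enter into it."""
    return Fraction(total_interchannel_samples, sample_rate_hz)


def individual_sample_count(total_interchannel_samples: int, channels: int) -> int:
    """Number of single-channel sample values stored across all channels."""
    return total_interchannel_samples * channels


if __name__ == "__main__":
    # 88200 interchannel samples at 44100 Hz is 2 seconds, mono or stereo.
    print(duration_seconds(88200, 44100))
    print(individual_sample_count(88200, 2))
```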
>>>