[Cellar] Re: AD Evaluation: draft-ietf-cellar-tags-18

Robert Sparks <rjsparks@nostrum.com> Tue, 26 August 2025 15:08 UTC

Return-Path: <rjsparks@nostrum.com>
X-Original-To: cellar@mail2.ietf.org
Delivered-To: cellar@mail2.ietf.org
Received: from localhost (localhost [127.0.0.1]) by mail2.ietf.org (Postfix) with ESMTP id A8B12592EC87 for <cellar@mail2.ietf.org>; Tue, 26 Aug 2025 08:08:28 -0700 (PDT)
X-Virus-Scanned: amavisd-new at ietf.org
X-Spam-Flag: NO
X-Spam-Score: 0.519
X-Spam-Level:
X-Spam-Status: No, score=0.519 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_ADSP_DISCARD=1.8, DKIM_INVALID=0.1, DKIM_SIGNED=0.1, KHOP_HELO_FCRDNS=0.399, T_SPF_HELO_PERMERROR=0.01, T_SPF_PERMERROR=0.01] autolearn=no autolearn_force=no
Authentication-Results: mail2.ietf.org (amavisd-new); dkim=fail (1024-bit key) reason="fail (message has been altered)" header.d=nostrum.com
Received: from mail2.ietf.org ([166.84.6.31]) by localhost (mail2.ietf.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id pFSbZ-QSFtp6 for <cellar@mail2.ietf.org>; Tue, 26 Aug 2025 08:08:28 -0700 (PDT)
Received: from nostrum.com (raven-v6.nostrum.com [IPv6:2001:470:d:1130::1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-256) server-digest SHA256) (No client certificate requested) by mail2.ietf.org (Postfix) with ESMTPS id C7D5A592EC82 for <cellar@ietf.org>; Tue, 26 Aug 2025 08:08:27 -0700 (PDT)
Received: from [192.168.1.103] (47-186-49-96.fdr02.plan.tx.ip.frontiernet.net [47.186.49.96]) (authenticated bits=0) by nostrum.com (8.18.1/8.18.1) with ESMTPSA id 57QF8LEN069450 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO); Tue, 26 Aug 2025 10:08:22 -0500 (CDT) (envelope-from rjsparks@nostrum.com)
DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=nostrum.com; s=default; t=1756220903; bh=uVRkQITqY/JPAJ6Xp9ix9chIphgrAP8qvOVHBCrmCGQ=; h=Date:Subject:To:Cc:References:From:In-Reply-To; b=dkexwXnyxtdEv7T2sZsSe0/HLC70Oe/guI7ioE8N94vEnFKrBHuXYn++tUz5RMcXF JuO/tp4oY7El+Oli86cDXysT4ZqutPrsoV33ADay8INWFmmC9BQuqfazy74OgXB51M kQqMeXFBAqyFkOsQNWeVSe1NBVSJM9jEuuxCZuFs=
X-Authentication-Warning: raven.nostrum.com: Host 47-186-49-96.fdr02.plan.tx.ip.frontiernet.net [47.186.49.96] claimed to be [192.168.1.103]
Message-ID: <6ca6656f-c4c9-4d4e-999c-3f3d147387c0@nostrum.com>
Date: Tue, 26 Aug 2025 10:08:16 -0500
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
To: Steve Lhomme <slhomme@matroska.org>, Spencer Dawkins at IETF <spencerdawkins.ietf@gmail.com>
References: <CAMzqgozwprvj=BFmhzmSR4oG=07xdOz=fh5hV6Z+ZmY_UKniSw@mail.gmail.com> <0468E722-3BEB-4C2C-82FC-0F2C6A6B0A1B@matroska.org> <CAMzqgoxF7JdW6=18r1mzFbsPF6R0cYi-255FWHufy=hJp6bn7A@mail.gmail.com> <31DAC6FD-9CEB-496E-8BBF-D524476E6AB9@matroska.org> <CAMzqgoxMqCv6j+SUg9SrSu763-rRek=Rz-AGm2326TE2px=XWQ@mail.gmail.com> <DC6A5FAD-0F20-42BF-BDEE-B45843AB3CBD@matroska.org> <CAKKJt-efoJfa+YijGFZdgcPEufvxEoSq7ruqF_oD_vg4ArLBsg@mail.gmail.com> <a1d432e1-3624-4cb8-87bc-5a05f5129e83@matroska.org>
Content-Language: en-US
From: Robert Sparks <rjsparks@nostrum.com>
In-Reply-To: <a1d432e1-3624-4cb8-87bc-5a05f5129e83@matroska.org>
Content-Type: text/plain; charset="UTF-8"; format="flowed"
Content-Transfer-Encoding: 8bit
Message-ID-Hash: J3ZTWSABRIIUE4BUN2QVURS7UKALC564
X-Message-ID-Hash: J3ZTWSABRIIUE4BUN2QVURS7UKALC564
X-MailFrom: rjsparks@nostrum.com
X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; header-match-cellar.ietf.org-0; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header
CC: Orie <orie@or13.io>, cellar@ietf.org
X-Mailman-Version: 3.3.9rc6
Precedence: list
Subject: [Cellar] Re: AD Evaluation: draft-ietf-cellar-tags-18
List-Id: Codec Encoding for LossLess Archiving and Realtime transmission <cellar.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cellar/9xPKrHpNl_w9aCEHaV7nvRs_5Mg>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cellar>
List-Help: <mailto:cellar-request@ietf.org?subject=help>
List-Owner: <mailto:cellar-owner@ietf.org>
List-Post: <mailto:cellar@ietf.org>
List-Subscribe: <mailto:cellar-join@ietf.org>
List-Unsubscribe: <mailto:cellar-leave@ietf.org>

Hi Steve - I'll work with you directly to resolve this.

RjS

On 8/26/25 8:55 AM, Steve Lhomme wrote:
> I just tried but I cannot login to the IETF website. It requires a new 
> password because the old one was too short. But I never get the email 
> to reset my password.
>
> I'll wait, maybe it's taking time. Hopefully it will let me log in 
> tonight...
>
> Steve
>
> On 8/25/2025 5:26 PM, Spencer Dawkins at IETF wrote:
>> Hi, Steve,
>>
>> On Sun, Aug 24, 2025 at 2:39 AM Steve Lhomme <slhomme@matroska.org> wrote:
>>
>>     Hi Orie,
>>
>>     I merged the last pending change. Should I publish a
>>     draft-ietf-cellar-tags-19 with it? That might be what we send to
>>     the RFC Editors in the forthcoming steps.
>>
>>
>> Yes, please! That will allow Orie to start IETF Last Call, the next 
>> step in the process.
>>
>> Best,
>>
>> Spencer
>>
>>
>>     Steve
>>
>>>     On 14 Aug 2025, at 23:21, Orie <orie@or13.io> wrote:
>>>
>>>     Hi Steve & Cellar,
>>>
>>>     Sorry for the delay in replying to your note.
>>>
>>>     The note about unichars was just a comment (non blocking
>>>     feedback), in case it was useful.
>>>
>>>     This PR remains open:
>>>
>>>     https://github.com/ietf-wg-cellar/matroska-specification/pull/1020
>>>
>>>     As far as I can tell you have addressed the rest of my comments.
>>>
>>>     Regards,
>>>
>>>     OS, ART AD
>>>
>>>
>>>
>>>     On Sun, Jul 20, 2025 at 8:08 AM Steve Lhomme <slhomme@matroska.org> wrote:
>>>
>>>         Hi,
>>>
>>>         Limiting commits to a subject that needs to be addressed in
>>>         here.
>>>
>>>>         On 7 Jul 2025, at 18:27, Orie <orie@or13.io> wrote:
>>>>
>>>>         Hi Steve & Cellar,
>>>>
>>>>         Inline replies prefixed with OS, sorry to have not gotten you
>>>>         this feedback in time for the draft cut off.
>>>>
>>>>         Unless noted, you can assume I have no objection to
>>>>         your comments, but please point out if you are expecting a
>>>>         specific response from me.
>>>>
>>>>         On Sun, Jun 29, 2025 at 5:55 AM Steve Lhomme
>>>>         <slhomme@matroska.org> wrote:
>>>>
>>>>             Hi Orie,
>>>>
>>>>             Thanks a lot for your detailed review. I’ll add comments
>>>>             and some Merge Requests with fixes where possible.
>>>>
>>>>>             On 20 Jun 2025, at 22:54, Orie <orie@or13.io> wrote:
>>>>
>>>>
>>>>>             ### UTF-8 "letters"
>>>>>
>>>>>             ```
>>>>>             195   Official TagName values MUST consist of UTF-8
>>>>>             capital letters,
>>>>>             196   numbers and the underscore character '_'.
>>>>>
>>>>>             198   Official TagName values MUST NOT contain any space.
>>>>>
>>>>>             200   Official TagName values MUST NOT start with the
>>>>>             underscore character
>>>>>             201   '_'; see Section 3.1.
>>>>>             ```
>>>>>
>>>>>             ABNF might be helpful for this.
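
For illustration only, the three quoted TagName rules could be checked along these lines. This is a minimal sketch under stated assumptions, not text from the draft: it reads "capital letters" as Unicode general category Lu and "numbers" as category Nd, and it rejects the empty string, none of which the quoted text spells out. The function name is hypothetical.

```python
import unicodedata


def is_official_tag_name(name: str) -> bool:
    """Hypothetical check of the quoted TagName rules.

    Assumptions (not stated in the draft text quoted above):
    - "capital letters" means Unicode general category Lu
    - "numbers" means Unicode general category Nd
    - an empty name is rejected
    """
    # Rule: MUST NOT start with the underscore character.
    if not name or name.startswith("_"):
        return False
    for ch in name:
        if ch == "_":
            continue
        # Spaces (category Zs) and lowercase letters (Ll) fail here,
        # which also covers the "MUST NOT contain any space" rule.
        if unicodedata.category(ch) not in ("Lu", "Nd"):
            return False
    return True
```

With these assumptions, `is_official_tag_name("PART_2")` is accepted while `"_PRIVATE"`, `"SUB TITLE"` and `"Subtitle"` are rejected. A different reading of "capital letters" (ASCII only, for example) would change only the category test.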
>>>>
>>>>             That would definitely be useful. However there doesn’t
>>>>             seem to be any logic for UTF-8 upper (or lower) case
>>>>             values, or even simply letters, as opposed to symbols.
>>>>
>>>>             Looking at RFC 5234 (ABNF), it doesn’t really take
>>>>             Unicode into account; beyond ASCII it only describes
>>>>             byte values (RFC 3629 defines UTF-8 with such a
>>>>             byte-level ABNF).
>>>>
>>>>             I can add a name to describe a UTF-8 uppercase letter and
>>>>             then write the rest of the ABNF with that. But I don’t
>>>>             think that’s valid.
>>>>
>>>>
>>>>         OS: Consider if this is helpful:
>>>>         https://datatracker.ietf.org/doc/draft-bray-unichars/
>>>>         (for defining the repertoire you are expecting software to
>>>>         recognize)
>>>>
>>>>
>>>>>
>>>>
>>>>>             ### Type UTF-8
>>>>>
>>>>>             ```
>>>>>             824       | SUBTITLE | UTF-8 | Sub Title of the entity.             This is     |
>>>>>             ```
>>>>>
>>>>>             Are emoji allowed? what about control characters?
>>>>
>>>>             Any UTF-8 character. This is stored in a binary format
>>>>             that doesn’t need escaping. So any valid UTF-8 value is
>>>>             fine.
>>>>
>>>>>             Is there a more specific way to describe this type?
>>>>>
>>>>>             Consider using
>>>>>             https://datatracker.ietf.org/wg/precis/documents/ or
>>>>>             https://datatracker.ietf.org/doc/draft-bray-unichars/.
>>>>
>>>>             Not sure what you are proposing here. The value is
>>>>             anything that is a valid UTF-8 string. The meaning is a
>>>>             “sub title”. Maybe “under title” is better, or is there a
>>>>             more appropriate word in English?
>>>>             It seems that “subtitle” is commonly used for that:
>>>>             https://www.writeitgreat.com/post/title-vs-subtitle-what-s-the-difference
>>>>
>>>>             In the linked ID3 tag, the definition is
>>>>             “Subtitle/Description refinement”.
>>>>
>>>>
>>>>         OS: See comment about repertoires above, please read the
>>>>         reference and consider if an attacker might abuse the ability
>>>>         to inject arbitrary unicode.
>>>
>>>         If I understand correctly the concern is that UTF-8 allows
>>>         more than displayable characters and can even contain
>>>         ill-formed data.
>>>
>>>         We can probably limit the tag names and the tag values to the
>>>         “non problematic” values, i.e. the ones described in “4.3.
>>>         Unicode Assignables”. But then we should also do that for
>>>         Unicode strings in EBML and in Matroska. Limiting the values
>>>         in the tags document alone would be inconsistent with the
>>>         document that defines these elements as UTF-8 values, i.e.
>>>         RFC 9559.
>>>         Given this is defining the “official values” that should be
>>>         used, we may impose extra rules compared to the raw/any UTF-8
>>>         allowed by the format. It may also be part of a guideline.
>>>         However it cannot be a MUST because we don’t know what people
>>>         have put in these elements so far. If the UTF-8 is correct,
>>>         then there’s no reason to make it invalid because it’s
>>>         considered problematic now.
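
To make the "guideline rather than MUST" idea concrete, here is a rough sketch of a non-fatal check that reports code points outside the "Unicode Assignables" subset. The ranges reflect my understanding of that subset in draft-bray-unichars (surrogates, noncharacters, and legacy controls other than tab, LF and CR are excluded) and should be verified against the draft text. Both function names are hypothetical.

```python
from typing import List


def is_unicode_assignable(cp: int) -> bool:
    """Assumed membership test for the "Unicode Assignables" subset."""
    if cp in (0x09, 0x0A, 0x0D):          # tab, LF, CR are kept
        return True
    if cp < 0x20 or 0x7F <= cp <= 0x9F:   # other C0 controls, DEL, C1 controls
        return False
    if 0xD800 <= cp <= 0xDFFF:            # surrogates
        return False
    if 0xFDD0 <= cp <= 0xFDEF:            # noncharacter block
        return False
    if (cp & 0xFFFE) == 0xFFFE:           # U+xFFFE / U+xFFFF in every plane
        return False
    return cp <= 0x10FFFF


def problematic_code_points(raw: bytes) -> List[int]:
    """Return code points outside the assumed subset.

    Ill-formed UTF-8 raises UnicodeDecodeError from the strict decoder;
    well-formed but "problematic" content is only reported, not rejected.
    """
    text = raw.decode("utf-8")
    return [ord(ch) for ch in text if not is_unicode_assignable(ord(ch))]
```

Under these assumptions an emoji in a SUBTITLE value yields an empty list, while an embedded NUL is reported. A reader or muxer could log such findings without treating existing files as invalid.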
>>>
>>>         Which leads me to the other point. I would be willing to use a
>>>         reference to the Unicode Assignables section as the basis for
>>>         our rules. But that document is not yet published. I could use
>>>         its ABNF, but it may also be bogus or might be extended before
>>>         it’s published. So would using this document as a reference
>>>         delay the publication of the tags spec until that document
>>>         is eventually published (if ever)?
>>>
>>>         Steve
>>>
>>
>