[Cellar] Re: Andy Newton's Discuss on draft-ietf-cellar-tags-20: (with DISCUSS and COMMENT)

Steve Lhomme <slhomme@matroska.org> Sun, 11 January 2026 12:20 UTC

From: Steve Lhomme <slhomme@matroska.org>
Message-Id: <399A5709-05A1-4160-B89B-13ECE136819E@matroska.org>
Content-Type: multipart/alternative; boundary="Apple-Mail=_3D8A667B-BB75-439C-8ED9-E354713C5D70"
Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3826.700.81.1.3\))
Date: Sun, 11 Jan 2026 13:20:36 +0100
In-Reply-To: <176772159265.3617026.15718053658319749010@dt-datatracker-5656579b89-p6k4r>
To: Andy Newton <andy@hxr.us>
References: <176772159265.3617026.15718053658319749010@dt-datatracker-5656579b89-p6k4r>
CC: The IESG <iesg@ietf.org>, cellar-chairs@ietf.org, cellar@ietf.org, draft-ietf-cellar-tags@ietf.org, spencerdawkins.ietf@gmail.com
X-Mailman-Version: 3.3.9rc6
Precedence: list
Subject: [Cellar] Re: Andy Newton's Discuss on draft-ietf-cellar-tags-20: (with DISCUSS and COMMENT)
List-Id: Codec Encoding for LossLess Archiving and Realtime transmission <cellar.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cellar/0LVvyuxUaDIqjD6wsU-IysJ-8x0>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cellar>
List-Help: <mailto:cellar-request@ietf.org?subject=help>
List-Owner: <mailto:cellar-owner@ietf.org>
List-Post: <mailto:cellar@ietf.org>
List-Subscribe: <mailto:cellar-join@ietf.org>
List-Unsubscribe: <mailto:cellar-leave@ietf.org>

Hi Andy,

Thanks for your review. My comments are inline below.

> On 6 Jan 2026, at 18:46, Andy Newton via Datatracker <noreply@ietf.org> wrote:
> 
> Andy Newton has entered the following ballot position for
> draft-ietf-cellar-tags-20: Discuss
> 
> When responding, please keep the subject line intact and reply to all
> email addresses included in the To and CC lines. (Feel free to cut this
> introductory paragraph, however.)
> 
> 
> Please refer to https://www.ietf.org/about/groups/iesg/statements/handling-ballot-positions/ 
> for more information about how to handle DISCUSS and COMMENT positions.
> 
> 
> The document, along with other ballot positions, can be found here:
> https://datatracker.ietf.org/doc/draft-ietf-cellar-tags/
> 
> 
> 
> ----------------------------------------------------------------------
> DISCUSS:
> ----------------------------------------------------------------------
> 
> # Andy Newton, ART AD, comments for draft-ietf-cellar-tags-20
> CC @anewton1998
> 
> * line numbers:
>  -
>  https://author-tools.ietf.org/api/idnits?url=https://www.ietf.org/archive/id/draft-ietf-cellar-tags-20.txt&submitcheck=True
> 
> * comment syntax:
>  - https://github.com/mnot/ietf-comments/blob/main/format.md
> 
> * "Handling Ballot Positions":
>  - https://ietf.org/about/groups/iesg/statements/handling-ballot-positions/
> 
> ## Thanks to the Reviewers
> 
> I want to thank the many reviewers of this draft, particularly Sean Turner for
> his ARTART review.
> 
> ## Discuss
> 
> As noted in https://www.ietf.org/blog/handling-iesg-ballot-positions/,
> a DISCUSS ballot is just a request to have a discussion on the following topics.
> 
> ### UTF-8 and Problematic Code Points.
> 
> 196        Official TagName values MUST consist of UTF-8 capital letters,
> 197        numbers and the underscore character '_'.
> 
> In addition to Roman's DISCUSS on capitalized UTF-8, should the tags be
> limited to exclude the problematic code points as described in RFC 9839?

Because there’s no proper definition of capital letters in UTF-8, I’m more inclined to limit TagName to the Latin capital letters. That basically turns the TagName element into an ASCII (“string” type) element rather than a UTF-8 element, without breaking backward compatibility. https://www.rfc-editor.org/rfc/rfc9559#name-tagname-element

As for the problematic code points, that seems to be a more general problem for all Matroska UTF-8 elements, or even for EBML in general.
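
As a rough illustration of the restricted TagName syntax, here is a minimal Python sketch. It assumes that “Latin capital letters” means ASCII A-Z only; the function name `is_ascii_tagname` is made up for this example and does not come from the draft.

```python
import re

# Assumption: "Latin capital letters" is read as ASCII A-Z, so an official
# TagName is limited to A-Z, 0-9 and the underscore character.
_TAGNAME_SYNTAX = re.compile(r"[A-Z0-9_]+")


def is_ascii_tagname(name: str) -> bool:
    """Return True if name uses only A-Z, 0-9 and '_' (and is not empty)."""
    return _TAGNAME_SYNTAX.fullmatch(name) is not None
```

With this reading, a name such as `KEYWORDS` passes, while lowercase or accented names do not.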

> And from your Security Considerations sections:
> 
> 1671       Most of the time strings are kept as-is and don't pose a security
> 1672       issue, apart from invalid UTF-8 values.  Implementations MUST
> 1673       validate TagString inputs for UTF-8 correctness and reasonable length
> 1674       before use, in accordance with the security considerations in
> 1675       Section 10 of [RFC3629].
> 
> I think you have to apply RFC 9839 to make a statement that UTF-8 values
> don't apply a security risk.

Not being a UTF-8 or security expert, I’d rather stick with the Security Considerations of the RFC that defines UTF-8 (RFC 3629).
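
For reference, the validation the draft asks for (UTF-8 correctness and reasonable length) could look like the Python sketch below. The 64 KiB limit and the name `validate_tagstring` are assumptions for illustration only; the draft does not define a specific limit. The sketch relies on Python's strict UTF-8 decoder, which rejects overlong forms and encoded surrogates.

```python
# Hypothetical limit; the draft only says "reasonable length".
MAX_TAGSTRING_BYTES = 64 * 1024


def validate_tagstring(raw: bytes, max_bytes: int = MAX_TAGSTRING_BYTES) -> str:
    """Decode a TagString payload, rejecting oversized or malformed input."""
    if len(raw) > max_bytes:
        raise ValueError("TagString exceeds the configured length limit")
    try:
        # Strict decoding fails on any byte sequence that is not valid UTF-8.
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        raise ValueError("TagString is not valid UTF-8") from exc
```

This does not filter the RFC 9839 problematic code points; it only covers what the quoted Security Considerations text requires.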

> ### Beginning Underscore
> 
> 204        It is RECOMMENDED that tag names start with the underscore character
> 205        '_' for non official tags that are not meant to be added to the list
> 206        of official tags.
> 
> Why is the RECOMMENDED? Can it be a MUST? If it is RECOMMENDED, what are the
> ramifications of not following the recommendation?

It cannot be a MUST because there are tons of files out there with custom tag names (I mentioned FFmpeg and MKVToolNix in another review) that don’t follow this rule. The rule was added more recently so that we have an official way to define tags that are private.
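
To show what the RECOMMENDED wording means for a reader, here is a small Python sketch. The function `classify_tagname` and the two-entry `REGISTERED` set are hypothetical; a real implementation would use the full list from the draft. The point is that a name without the leading underscore is still accepted, just not recognized.

```python
# Tiny sample for illustration; not the full list of registered tag names.
REGISTERED = {"INSTRUMENTS", "KEYWORDS"}


def classify_tagname(name: str) -> str:
    """Classify a TagName without ever rejecting it."""
    if name in REGISTERED:
        return "registered"
    if name.startswith("_"):
        # Follows the recommendation: explicitly marked as private.
        return "private"
    # Legacy custom names fall here; they stay valid.
    return "unrecognized"
```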

> ### TagString Formatting
> 
> 215        Multiple items SHOULD NOT be stored as a list in a single TagString.
> 216        If there is more than one tag value with the same name to be stored,
> 217        it is RECOMMENDED to use separate SimpleTags with that name for each
> 218        value.
> 
> Can this be a MUST NOT? Why allow them to be stored as a list at all. And if
> this advice is not followed, what happens? Is it that some software won't
> interoperate? If that is the case, I think you are better off stating this as a
> MUST NOT.

As mentioned in Éric’s review: the text used to be worded that way, but that was reverted in https://github.com/ietf-wg-cellar/matroska-specification/pull/1030 after the AD review from Orie: https://mailarchive.ietf.org/arch/msg/cellar/4ebLFttRb_I8SFu5yMQSIDPuk2E/

This matters especially because INSTRUMENTS and KEYWORDS suggest they can group multiple values separated by a comma.


> 220        Preexisting files may have used multiple values in the same TagString
> 221        but given there is no defined delimiters they cannot be easily split
> 222        into multiple elements.  INSTRUMENTS (Section 4.4) and KEYWORDS
> 223        (Section 4.6) tags allow using a comma as a separator.  However, it
> 224        is RECOMMENDED to use separate SimpleTags with each containing a
> 225        single instrument or keyword value, respectively.
> 
> Why are separate tags only RECOMMENDED? Can this be a MUST as well? If it is to
> remain as a RECOMMENDED, what are the consequences of not following the
> recommendation?

Precisely because of preexisting files. We cannot make them invalid just because we now want stricter rules.

There is no consequence to grouping or not grouping elements. Matroska Readers should be able to group values (or keep them split) in any case, as it has always been an option. As with a database, it’s better to use atomic values that can be referenced multiple times.
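
A reader that wants atomic values can handle both layouts the same way. The Python sketch below is only an illustration: `atomic_values` and the (TagName, TagString) pair representation are assumptions, and trimming whitespace around the comma is a choice made here, not something the draft specifies. Only INSTRUMENTS and KEYWORDS are split, since other tags have no defined delimiter.

```python
# Only these tags allow a comma as a separator, per the draft text quoted above.
COMMA_SEPARATED_TAGS = {"INSTRUMENTS", "KEYWORDS"}


def atomic_values(simple_tags):
    """Group (TagName, TagString) pairs into one list of values per name."""
    grouped = {}
    for name, value in simple_tags:
        if name in COMMA_SEPARATED_TAGS:
            # Assumption: surrounding whitespace and empty items are dropped.
            parts = [p.strip() for p in value.split(",") if p.strip()]
        else:
            # No defined delimiter for other tags, so keep the string whole.
            parts = [value]
        grouped.setdefault(name, []).extend(parts)
    return grouped
```

A file with one SimpleTag containing "piano, violin" and a file with two separate SimpleTags then produce the same result.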

> ----------------------------------------------------------------------
> COMMENT:
> ----------------------------------------------------------------------
> 
> ## Comments
> 
> ### Support Of Other DISCUSS Positions
> 
> I support Éric's DISCUSS on UTF-8 spaces. Should that be whitespace?
> 
> I support Gory's position on the "Official". IMHO, the doc is describing
> "registered" values, not "official" values.
> 
> And like Gory, I support Med's position on the specification required for DEs.
> 
> 
>