[AVTCORE] Comments on draft-shin-avtcore-rtp-multi-opus-01

"Timothy B. Terriberry" <tterribe@xiph.org> Thu, 06 November 2025 19:14 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/avt/DQVOPHb54WDCBhy2O6e_elKB05c>

Hi Shun,

Thanks again for working on this and bringing it to avtcore.

To follow up on the discussion in the room, if you are unsure about the 
purpose of mapping families in RFC 7845, I think the easiest way to 
think about them is that they are used to assign a _meaning_ to the 
audio data transmitted on the wire. They do not affect the format of the 
RTP payload itself (beyond establishing the number of streams and number 
of channels those get decoded into), but they tell you what those 
channels are. This can be simple speaker positions (mapping families 0 
and 1), or spherical harmonics as in Ambisonics (mapping families 2 and 
3, see RFC 8486), or something that must be agreed upon externally 
(mapping family 255). I think that even if this draft limits itself to 
mapping family 1, we should have a plan for how additional families 
could be supported. That said, I agree with Jean-Marc that adding 
support for mapping families 2 and 255 should be relatively painless.
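As a rough illustration of the point above, the mapping family can be read as a key that selects how the decoded channels are interpreted, without changing the payload format. The sketch below is not taken from the draft; the function name and the descriptive strings are hypothetical, and the family assignments are the ones described in RFC 7845 and RFC 8486.

```python
def describe_mapping_family(family):
    """Return a short description of what the decoded channels mean.

    Hypothetical helper for illustration only; the strings are informal
    summaries, not normative text.
    """
    if not 0 <= family <= 255:
        raise ValueError("mapping family is a single unsigned octet (0..255)")
    if family == 0:
        return "mono or stereo from a single stream (speaker positions)"
    if family == 1:
        return "1 to 8 channels in Vorbis speaker order"
    if family in (2, 3):
        return "Ambisonics (spherical harmonics, RFC 8486)"
    if family == 255:
        return "no defined meaning; agreed upon externally"
    return "reserved, not described here"


for family in (0, 1, 2, 3, 255):
    print(family, "->", describe_mapping_family(family))
```

A signaling design that carries the family as an explicit value leaves room to add families later without touching the payload format.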

I also had a few more pedantic comments on your draft that I thought 
were better suited to the mailing list than our limited meeting time. I 
don't think any of these should block working group adoption of the 
draft (and I would be in support of adoption).

1) In Section 6.1.3, are all of the fields listed there mandatory? Can I 
leave out channel_mapping for 1 or 2 output channels with one stream 
(i.e., the equivalent of mapping family 0)?

2) RFC 7845 imposes some implicit limitations on the values of the 
num_streams, coupled_streams, and channel_mapping fields. E.g., because 
they are encoded in octets and treated as unsigned, they cannot be 
negative or exceed 255. Encoding them as text in SDP does not enforce 
those limitations. Should this draft make them explicit?

There are also some explicit limitations that are tied to the mapping family, 
e.g., no more than 8 output channels for mapping family 1. This draft 
never discusses individual mapping families, so it may not be clear 
which of those limitations are intended to apply.

3) It is probably not a great idea to define channel_mapping with a 
normative reference to third-party source code (I tried visiting the 
libwebrtc link, but it just served me a blank page... probably that is 
some issue on my end, but I think it illustrates the point). Is this 
just RFC 7845's channel mapping as a comma-separated list? Does it 
support silence channels (255)?

4) For the SHOULD NOT in Section 7, what does it mean for the answerer 
to support the offered configuration? The ability to parse the format? 
The ability to decode or record it? The ability to render to some number 
(>2) of speakers?

What reasons would someone have for silently down-converting to stereo 
anyway (i.e., why is this SHOULD NOT instead of MUST NOT)?

5) What reasons would someone have for not including a stereo 
alternative (again, why is this a SHOULD and not a MUST)?
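
The range questions in comments 2 and 3 can be made concrete with a small sketch. It assumes, as an open question rather than a fact about the draft, that channel_mapping is a comma-separated list of decimal values and that 255 marks a silent channel as in RFC 7845. The function names and the optional mapping_family argument are hypothetical; the checks mirror the limits RFC 7845 places on the corresponding octet fields.

```python
SILENCE = 255  # RFC 7845 index meaning "this output channel is silent"


def parse_channel_mapping(text):
    """Parse a comma-separated list of decimal octets, e.g. "0,4,1,2,3,5".

    Assumed text format; the draft may define something different.
    """
    entries = []
    for token in text.split(","):
        token = token.strip()
        # Reject signs, empty entries, and non-ASCII digits up front.
        if not (token.isascii() and token.isdigit()):
            raise ValueError(f"not an unsigned decimal integer: {token!r}")
        value = int(token)
        if value > 255:
            raise ValueError(f"{value} does not fit in one octet")
        entries.append(value)
    return entries


def check_multistream_params(channels, num_streams, coupled_streams,
                             channel_mapping, mapping_family=None):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if not 1 <= channels <= 255:
        problems.append("channels must be in 1..255")
    if not 1 <= num_streams <= 255:
        problems.append("num_streams must be in 1..255")
    if not 0 <= coupled_streams <= num_streams:
        problems.append("coupled_streams must be in 0..num_streams")
    decoded = num_streams + coupled_streams  # number of decoded channels
    if decoded > 255:
        problems.append("num_streams + coupled_streams must not exceed 255")
    if len(channel_mapping) != channels:
        problems.append("channel_mapping needs one entry per output channel")
    for i, index in enumerate(channel_mapping):
        if index != SILENCE and not 0 <= index < decoded:
            problems.append(
                f"channel_mapping[{i}]={index} is neither a decoded channel nor 255")
    # Family-specific limit; only applied when a family is actually known.
    if mapping_family == 1 and channels > 8:
        problems.append("mapping family 1 allows at most 8 output channels")
    return problems


# 5.1 surround layout from RFC 7845: 4 streams, 2 of them coupled.
mapping = parse_channel_mapping("0,4,1,2,3,5")
print(check_multistream_params(6, 4, 2, mapping, mapping_family=1))
```

Whether an SDP parser should reject such values outright or only refuse the configuration is a separate design choice that this sketch does not address.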