Re: [Moq] Data model and MoQ streams

Suhas Nandakumar <suhasietf@gmail.com> Fri, 02 December 2022 02:11 UTC

From: Suhas Nandakumar <suhasietf@gmail.com>
Date: Thu, 01 Dec 2022 18:11:13 -0800
Message-ID: <CAMRcRGQraBaPAUjQ-woTwtfCwxpzMkc+p-ffwSGCkxmQbPBRjw@mail.gmail.com>
To: Roberto Peon <fenix=40meta.com@dmarc.ietf.org>
Cc: Christian Huitema <huitema@huitema.net>, "Law, Will" <wilaw@akamai.com>, Luke Curley <kixelated@gmail.com>, MOQ Mailing List <moq@ietf.org>
Content-Type: multipart/alternative; boundary="000000000000cb0f9705eecedaeb"
Archived-At: <https://mailarchive.ietf.org/arch/msg/moq/V0GK7zEOfkWkNDIqMkEq8pXKE-Q>
Subject: Re: [Moq] Data model and MoQ streams

Agree with Will and Roberto.

Within QuicR, for example, the manifest/catalog is just another type of
object that can be published and subscribed to. It is delivered via QUIC
streams, with knobs to control its priority and other properties.
The same is true of media objects. Having the common idea of
resources/objects with names that you publish and subscribe to can keep
application/domain logic out of the delivery/transport protocol.
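
A minimal sketch of that idea, assuming a toy in-memory relay: the catalog and the media are both just named objects, and the relay forwards by name without looking at the payload. The class names, the `priority` field and the object names are hypothetical and are not QuicR's actual API.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, DefaultDict, List


@dataclass(frozen=True)
class NamedObject:
    name: str       # hypothetical naming scheme, not QuicR's actual one
    payload: bytes  # opaque to the relay
    priority: int   # delivery knob; lower value = more urgent (assumed)


class Relay:
    """Forwards objects by name only; it never inspects the payload."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[
            str, List[Callable[[NamedObject], None]]
        ] = defaultdict(list)

    def subscribe(self, name: str, callback: Callable[[NamedObject], None]) -> None:
        self._subscribers[name].append(callback)

    def publish(self, obj: NamedObject) -> int:
        """Deliver obj to every subscriber of its name; return the fan-out count."""
        callbacks = self._subscribers[obj.name]
        for callback in callbacks:
            callback(obj)
        return len(callbacks)
```

A catalog published under one name and a video object published under another take exactly the same code path; only the subscribers know which one describes the other.
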


On Thu, Dec 1, 2022 at 11:03 AM Roberto Peon <fenix=
40meta.com@dmarc.ietf.org> wrote:

> If the “catalog” is another media flow, then that could certainly work.
> That’d certainly not be O(n^2).
>
> I don’t think patching is even needed so long as we have a
> ‘stream-of-messages’ thing (where one can ask for message X)—it is just way
> easier that way!
>
> -=R
>
>
>
> I also like “catalog” FWIW
>
>
>
> *From: *Law, Will <wilaw@akamai.com>
> *Date: *Thursday, December 1, 2022 at 9:59 AM
> *To: *Roberto Peon <fenix@meta.com>
> *Cc: *Christian Huitema <huitema@huitema.net>, MOQ Mailing List <
> moq@ietf.org>, Luke Curley <kixelated@gmail.com>
> *Subject: *Re: [Moq] Data model and MoQ streams
>
> @Roberto - what if the “manifest” were just another media object flow
> that the client subscribed to? The client would receive it at the start of
> playback, then subscribe to updates, which it would receive
> automatically. Such a subscribe-able manifest can be sent as a monolithic
> object, or broken up into a subscription of its sub-components as you
> suggest in the thread. The problem with the latter approach is that you
> still need an overall descriptor to indicate what sub-components are
> available. The “manifest” will likely be compact and only dispatched when
> there is a change in the publisher’s content mix (adaptation sets,
> representations etc. to use DASH terms, not with every GOP as with HLS).
> Such changes are likely to be on the order of minutes apart, so the
> manifest’s data contribution over the wire is de minimis compared to the
> other media flows.
>
> We could conceive of a patch update mechanism, in which the initial
> receipt is the full manifest and then you subscribe to a flow of delta
> updates. Since clients can join at arbitrary times and we don’t want to
> have to produce custom updates per client, this introduces the complexity
> of signaling to indicate to the client which base version it should apply
> the patch to, or else all patches are a delta from the base and not from
> each other, which reduces the patch efficiency.
>
> I think the simplicity of a solution in which 1) the manifest is super
> compact, 2) it only updates when the publisher changes its overall content
> mix, and 3) updates carry the complete manifest is most attractive and
> scalable. If n is the number of changes in the inventory, then the
> manifest is dispatched with order O(n) not O(n^2).
>
>
>
> Cheers
>
> Will
>
>
>
> BTW - I use the term “manifest” to describe the inventory offered by a
> publisher. We have strong connotations around “playlist” and
> “manifest” with the HLS and DASH formats respectively. To avoid inheriting
> unintended assumptions, we should use a new term in the world of MoQ. I
> propose the term “*catalog*” to represent this inventory. It’s
> syntactically concise, not overloaded in the format world and
> understandable without explanation. You’ll see this term used in some
> upcoming drafts which Suhas and I will release shortly. But for now,
> think about “catalog” and whether it might be a preferred alternative to
> “manifest”.
>
>
>
>
>
> *From: *Roberto Peon <fenix@meta.com>
> *Date: *Thursday, December 1, 2022 at 9:02 AM
> *To: *Luke Curley <kixelated@gmail.com>, "Law, Will" <wilaw@akamai.com>
> *Cc: *Christian Huitema <huitema@huitema.net>, MOQ Mailing List <
> moq@ietf.org>
> *Subject: *Re: [Moq] Data model and MoQ streams
>
>
>
> Manifests that are not appendable are going to need to send O(n^2) data
> when new things are added to the manifest. This has been particularly bad
> with live streaming, where the lengths, sizes, and codec parameters change
> over the lifetime of the video/stream.
>
> An (appendable) stream of messages would allow a manifest that was at most
> O(n) transfer.
> Even better would be to take some of the structure from pre-existing
> manifests, and represent it as streams of messages, i.e. a stream of
> periods, a stream of representations, etc.
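
A rough sketch of the arithmetic behind the O(n^2) versus O(n) claim, assuming the manifest grows by one fixed-size entry per update. The entry size and the function names are illustrative assumptions, not measurements.

```python
def full_resend_bytes(updates: int, entry_size: int) -> int:
    """Total bytes when update k resends all k entries seen so far."""
    return sum(k * entry_size for k in range(1, updates + 1))


def append_only_bytes(updates: int, entry_size: int) -> int:
    """Total bytes when each update carries only the newly added entry."""
    return updates * entry_size
```

Under these assumptions, 1000 updates of 100-byte entries cost 50,050,000 bytes with full resends and 100,000 bytes with an appendable stream.
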
>
> This can be (re) leveraged for both person-to-person and broadcast so long
> as a player has the ability to receive something other than what it
> requested.
>
> (e.g. if a manifest is a list of things that /could/ be sent, but there is
> some complication in generating or fetching the most desired
> representation, then some other representation should be sent)
>
> -=R
>
>
>
> *From: *Moq <moq-bounces@ietf.org> on behalf of Luke Curley <
> kixelated@gmail.com>
> *Date: *Wednesday, November 30, 2022 at 4:25 PM
> *To: *Law, Will <wilaw@akamai.com>
> *Cc: *Christian Huitema <huitema@huitema.net>, MOQ Mailing List <
> moq@ietf.org>
> *Subject: *Re: [Moq] Data model and MoQ streams
>
> My current thought is that the sender advertises what tracks are
> available for subscription, including any custom metadata:
>
>
>
> TRACK 1: resolution=480p,  bitrate=2000, codec=avc1.4d002a
>
> TRACK 2: resolution=720p,  bitrate=4000, codec=avc1.4d002a
>
> TRACK 3: resolution=1080p, bitrate=6000, codec=av01.0.15M.10
>
> TRACK 4: bitrate=128, codec=mp4a.40.2, language=eng
>
> TRACK 5: bitrate=128, codec=mp4a.40.2, language=jpn
>
>
>
> This is effectively a manifest. We could potentially leverage an existing
> manifest (ex. HLS/DASH), use an existing container (ex. MP4 moov), or make
> something custom. Personally, I like sending an init segment (ex. mp4 moov)
> since it already has a lot of this information and is required to configure
> the decoder, but I digress.
>
>
>
>
>
> The receiver chooses which tracks to receive:
>
>
>
> PLAY [3, 2, 1]
>
> PLAY 4
>
>
>
> This means "send me 1080p, otherwise 720p, otherwise 480p based on my
> available bitrate" and "send me English". The relay does need to keep a
> mapping from track ID to bitrate but I think that's fine. The order is also
> important, so the relay can naively check if the track is enabled and below
> the available bitrate, rather than needing to sort the tracks itself based
> on business logic.
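
The naive check described above could look roughly like this. It is a sketch only: the `Track` structure and `select_track` are hypothetical names, and treating a track whose bitrate equals the available bitrate as acceptable is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Track:
    track_id: int
    bitrate: int          # same unit as in the advertisement (e.g. 2000)
    enabled: bool = True


def select_track(
    preference: List[int], tracks: Dict[int, Track], available_bitrate: int
) -> Optional[int]:
    """Return the first preferred track that is enabled and fits the bitrate.

    The receiver's order is trusted as-is; the relay does no sorting.
    """
    for track_id in preference:
        track = tracks.get(track_id)
        if track is not None and track.enabled and track.bitrate <= available_bitrate:
            return track_id
    return None
```

With tracks 1, 2 and 3 at 2000, 4000 and 6000, `PLAY [3, 2, 1]` at an available bitrate of 5000 resolves to track 2.
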
>
>
>
>
>
> This could be very useful for seamless track switching:
>
>
>
> TRACK 6: resolution=720p, bitrate=3000, codec=avc1.4d002a,
> advertisement=true, enabled=false
>
> PLAY [6, 3, 2, 1]
>
> ...
>
> TRACK_UPDATE 6: enabled=true
>
>
>
> The relay would start to send track 6 after it's been enabled with no
> round-trip required. The sender can optionally disable tracks 3/2/1 to
> prevent non-advertisement content from being available during that time.
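
A small sketch of why no round trip is needed, assuming the relay keeps the receiver's preference list and re-evaluates it whenever the sender updates a track. The `Subscription` class and its dictionary layout are hypothetical.

```python
from typing import Dict, List, Optional


class Subscription:
    """Relay-side state for one PLAY request (hypothetical structure)."""

    def __init__(self, preference: List[int], tracks: Dict[int, dict]) -> None:
        self.preference = preference
        self.tracks = tracks  # track ID -> {"bitrate": int, "enabled": bool}

    def track_update(self, track_id: int, **changes) -> None:
        """Apply a TRACK_UPDATE from the sender, e.g. enabled=True."""
        self.tracks[track_id].update(changes)

    def current(self, available_bitrate: int) -> Optional[int]:
        """Re-evaluate the stored preference list; the receiver sends nothing new."""
        for track_id in self.preference:
            track = self.tracks[track_id]
            if track["enabled"] and track["bitrate"] <= available_bitrate:
                return track_id
        return None
```

With `PLAY [6, 3, 2, 1]` stored and track 6 disabled, the relay falls through to the regular tracks; once `TRACK_UPDATE 6: enabled=true` arrives, the same list resolves to track 6.
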
>
>
>
>
>
> On Wed, Nov 30, 2022 at 1:54 PM Law, Will <wilaw@akamai.com> wrote:
>
> I want to highlight one issue with scalability as we begin to propose
> solutions such as “*Or the receiver could just tell the sender to
> choose a track from a subset (ex. only these tracks, which are below 720p).
> The sender only needs to know the maximum bitrate for each track.*” This
> scheme requires the sender to have knowledge of what tracks are available
> for a given resource, along with their bitrates, resolutions and any other
> attributes on which a client may choose to filter (such as language,
> captions, accessibility etc). When that sender is an edge relay, we must
> maintain state of the content “package” in order to implement these
> server-side decisions. This requires memory, and requires the relay to
> parse some type of package description, such as a manifest. It also assumes
> that the relay delivering the subscription is the one that previously
> delivered that manifest, which may not be true. There will be many types of
> manifests, their format will change all the time and when they do the
> entire delivery surface must be updated to accommodate the
> evolution. In a highly scaled system, having the ability for the edges not
> to have to maintain content state over time, not to have knowledge of
> the media, and to easily substitute one relay for another during delivery
> leads to a more robust architecture. At a higher level, we can think of
> knowledge of the internal offerings of a given live resource as a contract
> between the publisher and subscriber(s), where those agents are the only
> ones that need to understand the composition and the relays simply follow
> routing instructions and do not make decisions outside the state of the
> WebTransport connections which they manage.
>
>
>
> I can illustrate this with two different means to achieve what Luke hinted
> at in this thread.
>
>
>
> Option A
>
> Client: Hey server, send me the highest bitrate stream of 720p or below
> that you can for the resource ABC123
>
> Server: I must previously have received the manifest describing what
> streams are available for ABC123, parsed it, stored it. So I look up the
> streams, filter out those > 720p and select the highest bitrate from the
> remainder.
>
>
>
> It would be a more scalable design if the relay only had to maintain state
> about the WebTransport connections. This can be accomplished by the
> subscriber providing the list of qualifying subscription identifiers, along
> with their target bitrates, and then asking the sender (which is a relay)
> to pick one.
>
>
>
> Option B
>
> Client: Hey relay, from this list of stream IDs and bitrates, send me the
> highest bitrate appropriate for my connection
> [“123”:1Mbps, “456”:2Mbps, “789”:5Mbps].
>
> Relay: I see your throughput is 3Mbps so I’m sending you stream 456.
>
>
>
>
>
> Option B is easier to scale. The relay doesn’t need to know that “456” is
> part of ABC123. I can ask any relay for this content, even if it has never
> seen resource ABC123 before. It allows a lot of flexibility in how the
> content is described/referenced by delegating the composition of the
> resource to the publishers and subscribers and having relays respond to a
> very simple, low-level set of forwarding instructions.
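
Option B can be sketched as a stateless choice over the list the client supplies. `pick_stream` is a hypothetical name, and the throughput value is assumed to come from the relay's own view of the connection.

```python
from typing import Dict, Optional


def pick_stream(offers: Dict[str, float], throughput_mbps: float) -> Optional[str]:
    """Pick the highest-bitrate stream that does not exceed the throughput.

    `offers` maps stream ID -> bitrate in Mbps, exactly as sent by the client;
    the relay needs no knowledge of which resource the IDs belong to.
    """
    fitting = {sid: rate for sid, rate in offers.items() if rate <= throughput_mbps}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)
```

For the list in the example and a throughput of 3 Mbps, this returns stream "456". Nothing about resource ABC123 is stored or looked up.
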
>
>
>
> Cheers
>
> Will
>
>
>
>
>
> *From: *Luke Curley <kixelated@gmail.com>
> *Date: *Wednesday, November 30, 2022 at 10:03 AM
> *To: *Christian Huitema <huitema@huitema.net>
> *Cc: *MOQ Mailing List <moq@ietf.org>
> *Subject: *Re: [Moq] Data model and MoQ streams
>
>
>
> I like the summary; no disagreements here.
>
>
>
> I think any confusion has been caused by loose terminology and loose
> requirements. I'm going to take a stab at both but I don't really know what
> I'm doing or how to be most effective in IETF.
>
>
>
> The media bitrate needs to be adjusted in response to congestion. For 1:1
> the encoder can change the encoded bitrate, but for 1:N we need a bitrate
> ladder.
>
>
>
> HLS/DASH works by letting the receiver choose the next rendition
> (audio+video track) to download based on decoder support and network
> conditions. Unfortunately, the receiver has very little information about
> congestion when media is delivered frame-by-frame (more info
> <https://github.com/kixelated/warp-draft/issues/44>). This is a
> fundamental problem with LL-DASH and Twitch's LHLS.
>
>
>
> The solution is to expose the congestion controller's estimated bitrate
> from the sender. This could be pushed periodically like an RTCP sender
> report but that has a delay, especially during congestion.
>
>
>
> I propose an alternative. In Warp, a session advertises subscribable
> tracks (aka media streams) that can have different content, encodings,
> bitrate, etc. There can be multiple active subscriptions and for each
> subscription, the receiver asks the sender for *one* track from a
> provided list. The sender uses the congestion controller's estimated
> bitrate, rounding down to choose the track. This sender-side ABR is
> extremely simple and has worked great in production.
>
>
>
> This gives the receiver the ability to control the desired experience
> while allowing it to delegate ABR responsibility. The receiver could
> request specific tracks using subscriptions with a list of size 1,
> implementing receiver-side ABR if they would like. Or the receiver could
> just tell the sender to choose a track from a subset (ex. only these
> tracks, which are below 720p). The sender only needs to know the maximum
> bitrate for each track.
>
>
>
> On Mon, Nov 28, 2022, 6:44 PM Christian Huitema <huitema@huitema.net>
> wrote:
>
> This email stems from an ongoing discussion of the "data model" used by
> MoQ on Slack. Slack is a great tool for rapid exchanges, but not every
> member of this list follows it. Also, it is not archived, which means
> that the exchanges will disappear after a few weeks. So, email. Lots of
> what follows is my personal take on the debate.
>
> The questions started with exchanges between Luke and Suhas about the
> names of variables used in protocol headers. These exchanges were made a
> bit harder because we don't have good agreement on the data model behind
> MoQ, including agreements on how to name what. Part of that is because
> different teams are working on different scenarios, such as streaming
> and real time, and also different network configurations, such as with
> relay or not.
>
> I think that we have some agreement about what MoQ shall do: enabling
> the transport of media streams. The client opens a QUIC connection using
> Web Transport and requests one or several media streams. The server
> sends the corresponding data, until the client somehow closes the media
> stream. That means we also have an agreement about what is out-of-scope:
> some communication scenarios require the orchestration of several media
> streams, such as multiple audio, video and other streams from multiple
> participants in a conference. I would expect applications doing that to
> open multiple MoQ streams, perhaps using multiple connections, and
> organizing the orchestration themselves. The "MoQ stream" would be the
> building block.
>
> We have a bit of a discussion on what the "MoQ stream" is. There is
> broad agreement on the general concept that the media is composed of a
> series of "objects", organized as series of groups (GOP). But then there
> are differences, because a given media stream (say, a video) can be
> encoded in multiple ways, say high, medium and low definition. The Warp
> draft calls these different "renditions" of the media stream.
>
> The differences are largely due to the way different teams plan to
> handle congestion. One way is to have the server decide. The media is
> sent as a series of GOP, each on its own QUIC stream. At the beginning
> of each GOP, the server looks at transmission conditions and decides
> what rendition to use for the next GOP. This is a very convenient way to
> manage congestion, but it imposes constraints: each rendition shall have
> the same notion of GOP, which is not obvious if, for example, the low def
> and high def codecs are operating in parallel. In that architecture,
> relays have to acquire all renditions of a MoQ stream, so they can do
> the real time adaptation. Real time clients also have to upload multiple
> renditions so relays can get them and adapt.
>
> Another way is to let the client decide. The client asks for a specific
> rendition, and the server provides exactly that. In case of congestion,
> the server drops some data to fit into the available bandwidth. The
> client notices that, closes the current stream, and opens a new MoQ
> stream with a lower definition. Adaptation takes a bit longer than in
> the previous scenario, but there is no requirement to synchronize GOP
> boundaries across different algorithms. The relay management is also a
> bit simpler.
>
> Then there are mixed scenarios. The client might ask for both low def
> and high def, display high def as long as it receives it correctly,
> switch to low def if high def stutters. The server would send GOP for
> low def and high def in parallel, using a higher priority for low def.
> Relays could use similar strategies, asking for all available renditions.
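
A simplified sketch of the mixed scenario, assuming strict priority between whole GOPs and a fixed byte budget per sending interval. Real QUIC stream prioritization interleaves at a finer granularity; the `schedule` function and the sizes are illustrative only.

```python
import heapq
from typing import List, Tuple


def schedule(gops: List[Tuple[int, str, int]], budget_bytes: int) -> List[str]:
    """Return the labels of the GOPs sent within the budget, in send order.

    Each GOP is (priority, label, size_bytes); a lower priority value is
    sent first, so low def is given a lower value than high def.
    """
    heap = list(gops)
    heapq.heapify(heap)
    sent: List[str] = []
    while heap:
        _priority, label, size = heapq.heappop(heap)
        if size > budget_bytes:
            break  # under congestion, everything of lower priority waits
        budget_bytes -= size
        sent.append(label)
    return sent
```

When the budget covers both renditions the client receives high def and displays it; when it does not, only low def arrives and the client falls back to it.
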
>
> I think these positions are not as far apart as it seems. They lead to
> exposing the "rendition" property prominently in the protocol. (I hope
> we can find a way to do that in a manner independent of media and
> codec.) This would lead to something like:
>
> * client requests a media (by name) and specifies the renditions that it
> is willing to receive. Server responds with some kind of accept message.
> * server transmits the media as a series of GOP, with each GOP starting
> with identification of which media stream and what rendition this is.
> Servers may send several GOP renditions in parallel, each using its own
> QUIC stream, with appropriate priorities. Servers choose what to send
> based on client preferences and network conditions.
> * relays act as client vis a vis the origin or the upstream relay, act
> as server for the clients or the downstream relays, typically request
> multiple renditions in parallel.
>
> There are things to iron out. Media names are typically long URIs. We
> would want a short identifier or "media ID" in the QUIC stream headers.
> The mapping from media name to media ID could be negotiated as part of
> the initial request/accept exchange. The GOP headers would have to carry
> a rendition ID -- or maybe we negotiate a unique ID for each valid
> combination of media and rendition, add complexity and save a few bits.
> (I could defend both sides of that argument.)
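
One way to picture the "separate media ID plus rendition ID" variant is the sketch below. The 2-byte/1-byte/4-byte fixed-width layout, the GOP sequence field and all names are assumptions made for illustration; a real design would more likely use QUIC variable-length integers.

```python
import struct
from typing import Dict, Tuple

# Hypothetical layout: 2-byte media ID, 1-byte rendition ID,
# 4-byte GOP sequence number, network byte order, no padding.
GOP_HEADER = struct.Struct("!HBI")


class MediaIdTable:
    """Maps long media names to short IDs agreed in the request/accept exchange."""

    def __init__(self) -> None:
        self._ids: Dict[str, int] = {}

    def assign(self, media_name: str) -> int:
        """Return the existing ID for a name, or allocate the next one."""
        return self._ids.setdefault(media_name, len(self._ids) + 1)


def encode_gop_header(media_id: int, rendition_id: int, gop_seq: int) -> bytes:
    return GOP_HEADER.pack(media_id, rendition_id, gop_seq)


def decode_gop_header(data: bytes) -> Tuple[int, int, int]:
    return GOP_HEADER.unpack(data[: GOP_HEADER.size])
```

The alternative of negotiating one ID per media/rendition combination would drop the rendition byte from each header at the cost of a larger mapping table to agree on up front.
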
>
> OK. That message is already too long. But I hope it helps inform the
> WG and make progress.
>
> -- Christian Huitema
>
>
>
>
> --
> Moq mailing list
> Moq@ietf.org
> https://www.ietf.org/mailman/listinfo/moq
>