Re: [Sframe] Partial decodability and IDUs

Sergio Garcia Murillo <> Sun, 22 November 2020 22:48 UTC


On 20/11/2020 17:01, Magnus Westerlund wrote:
> Hi,
> If one looks at this from the RTP payload level, what becomes of potential use and
> in some cases necessary, I think, is the following.
> The sender will put one or more IDUs into independent SFRAMEs and provide the
> metadata for them, which can include the following:
>   - Media Stream + Encoder instance
>   - Scalability layer and dependency information
>   - Decoding order
>   - Associated Capture/Presentation time
>   - timestamp for any consumption related time line (if not redundant)
>   - If it represents a switch point (IDR, Intra, etc.)
>   - Will produce output (Video marker bit type, i.e. this IDU(s) will result in
> decoded output) (likely for which layers)
>   - Reliability level (how necessary the data is: parameter sets are necessary for
> any decoding, versus a non-referenced slice).
> Whether to carry one or more IDUs per SFRAME is a trade-off between overhead and the
> benefits of encoding different IDUs in independent SFRAMEs. For example, whether to
> encode H.26x parameter set NALUs independently depends on how one likes to use them, or
> whether they should be teamed with their related "slice" NALUs, in a STAP-style
> aggregation of NALUs inside the SFRAME.
> I think the point is to look at what would be needed to enable the
> functionality in the RTP payload format and for the SFUs to be able to correctly
> select which SFRAMEs to forward to a particular receiver.
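
For illustration only, the metadata listed above could be sketched as a record 
that an SFU filters on. This is a minimal sketch, not taken from any draft: 
every name here (IduMetadata, Reliability, select_for_receiver) is 
hypothetical, and it assumes that a layer depends only on lower layers.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Optional


class Reliability(IntEnum):
    # Hypothetical levels; a higher value means more necessary for decoding.
    DISCARDABLE = 0  # e.g. a non-referenced slice
    REFERENCED = 1   # other IDUs depend on it
    ESSENTIAL = 2    # e.g. parameter sets


@dataclass(frozen=True)
class IduMetadata:
    stream_id: int         # media stream + encoder instance
    spatial_layer: int     # scalability layer information
    temporal_layer: int
    decoding_order: int
    capture_time_ms: int   # associated capture/presentation time
    is_switch_point: bool  # IDR, intra, etc.
    produces_output: bool  # decoding this IDU yields output
    reliability: Reliability
    timeline_ts: Optional[int] = None  # only if not redundant


def select_for_receiver(frames: List[IduMetadata],
                        max_spatial: int,
                        max_temporal: int) -> List[IduMetadata]:
    """Keep the SFRAMEs within the receiver's target layers, in decoding order."""
    kept = [f for f in frames
            if f.spatial_layer <= max_spatial
            and f.temporal_layer <= max_temporal]
    return sorted(kept, key=lambda f: f.decoding_order)
```

The selection uses only the metadata, so under these assumptions it would 
not need access to the encrypted payload.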

From my point of view, partial decodability is not something that the 
SFU cares about.

The only RTCP recovery mechanism I am aware of that could apply to IDUs 
is the slice loss indication (SLI), which requires macroblock 
information. That is a much lower layer of information than the one SFUs 
work with (it may not even be available if the payload is encrypted).

Moreover, the SFU will not relay the SLIs received from the receiving 
endpoints, as it has no way of tracking SLI requests from endpoints and 
mapping them to ones already sent, which it would need in order to avoid 
flooding the encoder with SLI requests. So I am afraid that the 
theoretical benefit of partial decoding would be greatly reduced in the 
typical usage of SFrame (i.e. the non-p2p use case).

Also, before adding support for it, I would like to get real 
implementation feedback, as it is currently not implemented in 
(lib)webrtc and I have never heard anyone asking for it.

> If some implementation makes it easy for itself and aggregates more in one
> SFRAME, that should be possible. It is a trade-off to a large degree between
> functionality, cost of doing repair of missing parts, and complexity to
> implement.

And also complexity to specify.

> There is some basic level which should be very easy to accomplish and which one
> should never go below, because then one is truly throwing away the basic
> functionalities of RTP.
> I still think one needs to write a bit of analysis of the different media
> encoders' functionality and what are shared concepts, what have special quirks
> that need to be thought about. We also, I think, need to find the terminology for
> what the different aspects like I listed above really should be called and how
> they are mapped to different encoders, in audio, video, tactile, etc.
> I will note that we didn't make much headway in the RTP streams debate before we
> actually created the taxonomy of that. People were talking past each other.

Are you proposing to do something like RFC 7656 but for video codecs? I 
think that would be great, but I also understand that it would be a 
different draft than the packetization or description header, right?

> I will also note that a very interesting aspect of the reliability puzzle for
> these codecs is what you can determine about a missing RTP packet based on
> knowing which SSRC and sequence number is missing. So if you have a
> scalable codec and transmit all the packets on a single SSRC, then a missing
> sequence number (detected through the gap) will not tell the receiver if it
> needs the packet urgently or not (such as the difference between a base layer and the
> highest enhancement layer). In some applications that might not matter; in
> others it may. So in the latter cases you might want to distribute the layers on
> different SSRCs so that one can immediately determine how important the
> information is. I think that possibility will exist if this RTP payload format
> is done correctly.

While I think this information is very valuable (in fact, it is a 
must-have from the SFU perspective), I think it belongs in the SFU 
information header extension part and not the packetization part 
(although they could be implemented together).
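
The point about sequence number gaps can be sketched as follows. This is an 
illustrative sketch only: the function names and the SSRC-to-layer mapping 
are hypothetical, and it assumes packets arrive in order on each SSRC so 
that any forward jump in the 16-bit sequence number is a loss.

```python
from typing import Dict, List, Optional, Tuple


def find_gaps(seqs: List[int]) -> List[int]:
    """Return the 16-bit RTP sequence numbers missing between received ones."""
    missing = []
    for prev, cur in zip(seqs, seqs[1:]):
        delta = (cur - prev) & 0xFFFF  # handles wrap-around at 65535
        if delta >= 0x8000:
            continue  # looks like reordering or a duplicate, not a loss
        for i in range(1, delta):
            missing.append((prev + i) & 0xFFFF)
    return missing


def classify_losses(received: List[Tuple[int, int]],
                    ssrc_to_layer: Dict[int, str]
                    ) -> List[Tuple[int, int, Optional[str]]]:
    """Map each missing (ssrc, seq) to a layer, or None if it cannot be known."""
    by_ssrc: Dict[int, List[int]] = {}
    for ssrc, seq in received:
        by_ssrc.setdefault(ssrc, []).append(seq)
    losses = []
    for ssrc, seqs in by_ssrc.items():
        # An SSRC that carries all layers has no single layer to report.
        layer = ssrc_to_layer.get(ssrc)
        for seq in find_gaps(seqs):
            losses.append((ssrc, seq, layer))
    return losses
```

With all layers on one SSRC the mapping is empty and every loss comes back 
with layer None. With one SSRC per layer the gap alone identifies the layer.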

Best regards