Re: [Sframe] Partial decodability and IDUs

Sergio Garcia Murillo <> Thu, 19 November 2020 23:23 UTC

To: Bernard Aboba <>
From: Sergio Garcia Murillo <>
Date: Fri, 20 Nov 2020 00:23:16 +0100
Subject: Re: [Sframe] Partial decodability and IDUs

If we are able to say something like:

"The encoder is responsible for dividing the encoded byte stream for each 
spatial video frame into one or more byte chunks called IDUs before passing 
each of them to the SFrame layer independently, in such a way that if any 
of the IDUs is not fully received, the rest of the IDUs can be concatenated 
after decrypting and still be partially decodable by the decoder without 
any further byte stream manipulation."

Then I would be fine, but I doubt it will be that easy.
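
To make the property in that sentence concrete, here is a minimal Python sketch, under loud assumptions: the function names (protect_idu, send_frame, receive_frame) are hypothetical, the "cipher" is a toy SHA-256 keystream XOR standing in for the real SFrame AEAD, and the IDUs are placeholder byte strings rather than real slices or tiles. It only illustrates that each IDU is protected on its own, so the receiver can decrypt whatever arrived and concatenate it without touching the bytes.

```python
import hashlib
from typing import List, Optional


def _keystream(key: bytes, counter: int, length: int) -> bytes:
    """Toy keystream (an assumption, NOT the real SFrame cipher)."""
    out = b""
    block = 0
    while len(out) < length:
        seed = key + counter.to_bytes(8, "big") + block.to_bytes(4, "big")
        out += hashlib.sha256(seed).digest()
        block += 1
    return out[:length]


def protect_idu(key: bytes, counter: int, idu: bytes) -> bytes:
    """Protect one IDU on its own; XOR makes this its own inverse."""
    stream = _keystream(key, counter, len(idu))
    return bytes(a ^ b for a, b in zip(idu, stream))


def send_frame(key: bytes, first_counter: int, idus: List[bytes]) -> List[bytes]:
    """Sender side: every IDU gets its own counter and its own ciphertext."""
    return [protect_idu(key, first_counter + i, idu) for i, idu in enumerate(idus)]


def receive_frame(key: bytes, first_counter: int,
                  received: List[Optional[bytes]]) -> bytes:
    """Receiver side: decrypt the IDUs that arrived, skip the lost ones
    (None), and concatenate the rest with no byte stream manipulation."""
    parts = []
    for i, ciphertext in enumerate(received):
        if ciphertext is None:
            continue  # lost IDU: nothing to decrypt
        parts.append(protect_idu(key, first_counter + i, ciphertext))
    return b"".join(parts)


# Placeholder IDUs; real ones would be e.g. slices or tiles from the encoder.
idus = [b"slice-0|", b"slice-1|", b"slice-2|"]
key = b"demo-key"

protected = send_frame(key, 0, idus)
with_loss = [protected[0], None, protected[2]]  # the middle IDU never arrives

partial = receive_frame(key, 0, with_loss)
print(partial)  # b'slice-0|slice-2|'
```

Whether the concatenated output is really decodable is exactly the codec-specific part that the sketch does not cover.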

Best regards


On 20/11/2020 0:13, Bernard Aboba wrote:
> Sergio said:
> "You are right that the SFrame layer does not need to know 
> what is fed in for encryption, but in order to have a 
> working end-to-end solution for WebRTC, someone will need to define 
> what these IDUs are and how they are generated and reassembled for each 
> codec if we want to have interoperable implementations on different devices."
> [BA] The job of an SFrame sender is to encrypt and packetize the 
> bitstream provided by the encoder.  For an SFU to act on the 
> packetization in an optimal way, some rules have to be followed (such 
> as not packetizing frames from different layers in the same packet). 
> So the sender/packetizer will have codec-specific logic, so that it can 
> parse the bitstream for metadata and figure out how to do the 
> packetization in a manner appropriate for that codec. This could 
> include separately packetizing IDUs (slices/tiles), but it is not clear 
> to me that this needs to be required for all use cases.
> The SFM does not peer into the payload; it only acts on the metadata 
> included in RTP header extensions, so the recovery/forwarding/dropping 
> decision can be largely codec-independent.
> The receiver decrypts and de-packetizes the bitstream, then provides 
> it to the decoder.  The recovery/decryption/de-packetization process 
> should also be codec-independent.
> On Thu, Nov 19, 2020 at 2:11 PM Sergio Garcia Murillo wrote:
>     You are right that the SFrame layer does not need to
>     know what is fed in for encryption, but in order to
>     have a working end-to-end solution for WebRTC, someone will need
>     to define what these IDUs are and how they are generated and
>     reassembled for each codec if we want to have interoperable
>     implementations on different devices.
>     That process is codec-dependent and would require quite a lot of
>     effort (including supporting it in the agnostic packetization), so
>     I would prefer to have strong arguments in favor of doing it.
>     On 19/11/2020 22:53, Justin Uberti wrote:
>>     The encoder needs to be aware of any mechanism to generate IDUs
>>     (e.g., slices), and typically each of these IDUs will be handed
>>     up to the consumer individually. So the SFRAME layer doesn't need
>>     to do any splitting, it just knows that it should treat each IDU
>>     as something it needs to individually SFRAME and packetize.
>>     On Thu, Nov 19, 2020 at 1:40 PM Sergio Garcia Murillo wrote:
>>         Hi all,
>>         As most of you already know, this morning I gave a
>>         presentation in AVTCORE introducing the need to specify
>>         a codec-agnostic video packetization format.
>>         I got an AP to create an initial draft so it could be
>>         reviewed and accepted.
>>         However, there were two main concerns that we should address
>>         in this group:
>>           * Historically, avtcore has explicitly decided not to be
>>             payload agnostic and has declined to standardize codec-
>>             agnostic payload formats in a number of cases.  If that is
>>             to be changed, it needs to be done deliberately.
>>           * Need to define the "minimum decoding unit" or
>>             "independently decodable unit", that SFrame will work with.
>>         Regarding the second one, the options are:
>>           * Full video frames (just use whatever is the encoder output)
>>           * Spatial layer frames
>>           * "Independently decodable subframes" like H.264 slices, VP8
>>             partitions or AV1 tiles, which allow partial decodability
>>             and are mainly aimed at enhancing packet loss resilience.
>>         Spatial layer frames are the minimum we should target, as
>>         otherwise it would prevent SFUs from using SVC codecs. So the
>>         question is whether we should go deeper and support smaller
>>         partitions of the frames or not.
>>         AFAIK, libwebrtc currently does not support partial
>>         decodability and I personally haven't seen any practical
>>         usage of it in the RTC world (while it makes a lot of sense
>>         in the streaming/broadcasting world), but I would like to hear
>>         the views and experience of the other members of this
>>         group. Also note that if we are going to support them in
>>         SFrame, this will require a greater effort because we will
>>         need to explicitly define how the frames must be split before
>>         being encrypted by SFrame for *each* possible video codec
>>         (H.264, H.265, VP8, VP9, AV1, ...).
>>         There was also the question of whether and how we should
>>         support other codec features like DON/interleaved mode for
>>         H.264, which I also think we should not support, mainly because
>>         we are not currently using it in WebRTC implementations.
>>         What do you think?
>>         Best regards
>>         Sergio