Re: [Sframe] [dispatch] Magnus Westerlund's Block on charter-ietf-sframe-00-00: (with BLOCK and COMMENT)

Sergio Garcia Murillo <> Mon, 14 September 2020 09:39 UTC

To: Magnus Westerlund <>, "" <>, "" <>, "" <>
From: Sergio Garcia Murillo <>
Date: Mon, 14 Sep 2020 11:39:15 +0200

On 14/09/2020 9:29, Magnus Westerlund wrote:
> Hi,
> Please see inline
> On Fri, 2020-09-11 at 17:32 +0200, Sergio Garcia Murillo wrote:
>> On 11/09/2020 13:19, Magnus Westerlund wrote:
>>> Hi,
>>>>> So do I assume correctly that if the application layer using SFRAME
>>>>> has multiple independent streams, or sets of dependent streams, there
>>>>> will be no support for that aspect in the SFRAME layer? Instead it is
>>>>> up to the application to either put such information inside the
>>>>> encapsulated part of the SFRAME or map it to the lower transport layer,
>>>>> such as RTP SSRCs and extension information.
>>>> If I understood you correctly, SFrame is kind of agnostic to the media
>>>> streams. Currently it has a single global frame counter for all the
>>>> transport, so the number of streams/dependencies between them is not
>>>> known/needed by SFrame.
>>> Yes, and when a multi-stream application with layered encoding is
>>> protected by SFRAME and transported over RTP with repair functions, the
>>> higher-layer application will need a model for how it maps individual
>>> SFRAMEs to RTP-layer functions to enable them to do their work. You get
>>> very limited functionality if an SFU system tries to send everything over
>>> a single RTP SSRC. So this becomes a discussion in the RTP payload format
>>> context when carrying SFRAMEs.
>> I don't really understand your point. Let's take VP9 SVC as an example.
>> Chrome will pass each picture from the video stream to the VP9 SVC
>> encoder. The encoder will produce n frames (one per spatial layer) for a
>> given picture.
>> Chrome will get each frame with its metadata (spatial/layer id + layer
>> structure info + previous frame dependencies + future frame dependencies)
>> and pass it over to the insertable streams.
>> The payload will go to JavaScript, where SFRAME will encrypt it and
>> return it as a binary blob.
>> The browser will get the binary blob, packetize it into several RTP
>> packets according to the new packetization format, and add the metadata
>> as a header extension.
>> Is this process what you are describing, or is there anything missing?
> Then the client will send these RTP packets to an SFU that will need to
> decide which of the packets to forward for which layers to a particular
> receiver. To be able to do this, the metadata first needs to be included so
> that the SFU can make decisions based on the packets it receives. In the RTP
> context this is likely a more generic structure like the RTP header
> extension for Frame Marking (draft-ietf-avtext-framemarking).
> For packets it doesn't receive, it lacks the metadata. Thus, for a more
> efficient repair of missing packets, and to repair only the packets that the
> SFU actually needs, you have to map the layered structure into RTP-level
> structures so that the SFU can determine whether the missing packet belongs
> to a layer that it actually intends to forward. Putting everything in a
> single SSRC requires the SFU to repair all losses regardless of whether it
> needs the packet or not.
> An even more extreme example is a client that has 3 video cameras and
> produces 3 video encodings. In the SFRAME context it is possible to take the
> video frames from all three encodings and put them in a single RTP stream
> (SSRC) with SFRAMEs from that end-point. However, that would force your
> metadata to also contain source identification rather than using different
> SSRCs. Putting multiple encodings on a single RTP stream would deprive the
> SFU of the possibility to do rate control per encoding (RFC 5104 - TMMBR) as
> well as to pause streams it doesn't forward at all (RFC 7728 - Stream PAUSE),
> because all the existing RTP mechanisms that are included in WebRTC work on
> RTP streams, not on sub-streams that don't have an identifier in the RTP
> layer. In addition, the SFU would have to repair any loss for the aggregated
> stream of the three encodings.

No one has spoken of putting everything on a single SSRC. The new RTP
packetization would change the way the payload is packetized into the
RTP packets, but not the RTP/SSRC/MID mapping of the media streams, which
would be the same as if SFRAME were not used. So this is a non-issue.

>>>>>> However, if the encrypted output is going to be transmitted over RTP
>>>>>> using a new packetization format, we need to address how to negotiate
>>>>>> that in the SDP.
>>>>> Good
>>>>>> Note that this new packetization format will be valid not only for
>>>>>> transporting encrypted frames (SFrame or not) but also for any other
>>>>>> audio/video frames.
>>>>> I understand the potential exists. However, the recommendation for RTP
>>>>> has been to try to consider Application Level Framing. In the case of
>>>>> SFRAME that consideration moves up above SFRAME, into the choice of how
>>>>> data is split into SFRAMEs. However, if there is an RTP payload format
>>>>> for SFRAME, then it should carry SFRAMEs. If you start putting other
>>>>> binary objects into it, then you create a new demultiplexing point for
>>>>> format between the RTP payload format and the SFRAME processing. That
>>>>> appears unmotivated.
>>>> Note that SFRAME is not the only encryption possible with W3C insertable
>>>> streams; it is up to the application to define its own crypto if it
>>>> wants. So the payload must be able to transport non-SFRAME opaque blobs.
>>> So I would consider this a misuse of the RTP payload format. The media type
>>> will be for SFRAMEs. I understand that you might not be able to prevent this
>>> in WebRTC. However, from an interoperability point of view you will be
>>> stating that this is SFRAMEs. RTP will not be able to tell a difference. But
>>> I don't see that this will explicitly support carrying things other than
>>> SFRAME, because that means bringing in other considerations, including the
>>> security model for the E2E usage.
>> Not sure if ignoring the reality of how the packetization format is
>> going to be used within insertable streams is a good idea.
> If you are passing things other than SFRAME, you are also sending them without
> end-to-end protection. Why is this happening? If the information was intended
> for the SFU, then I think we should identify it as such, rather than having the
> SFU hunt for it. If not, why are you sending it outside of the SFRAME
> envelope?

This is an application concern. I don't know why they would choose to do 
it, but the fact is that they can do it.

>>>>> Should the RTP payload format aspect of the charter be more explicit
>>>>> in that it needs to discuss the general model of how to use SFRAME and
>>>>> how an application can use the facilities of RTP to get good
>>>>> performance from RTP mechanisms like FEC and retransmission?
>>>> I don't think so. RTX and FEC frames MUST NOT be modified/affected by
>>>> SFRAME/the new packetization format, so they work out of the box. We can
>>>> be explicit about that in the charter as a requirement.
>>> So I am not saying that RTX or FEC shall be modified. What I am saying is
>>> that the SFRAME payload format should discuss the impact on the performance
>>> of these functions depending on how you structure the SFRAMEs across SSRCs.
>>> The simplest example is the one where you put all layers for a video
>>> encoding in a single SSRC. In such a stream you lose a single RTP packet
>>> between the transmitting end-point and the SFU. Now the SFU is only
>>> forwarding the base layer and not the enhancement layers. If the base layer
>>> and enhancement layer share an RTP stream, the SFU can't determine whether
>>> the missing packet contains base layer data or enhancement layer data. If
>>> the base layer and enhancement layer were on different SSRCs, a missing
>>> packet on the enhancement layer RTP stream is one that the SFU can ignore,
>>> as it is not forwarding that stream anyway. This same argument can
>>> be applied between media sources. So how you map your application-level
>>> structures onto RTP does matter for the resulting performance of RTP.
>>> Putting everything on a single SSRC is not a path to good transport
>>> performance.
>> But that is already the case for SVC codecs, so there is nothing new to
>> specify in that regard.
> So for SVC and other scalable codecs there exist options for how to do this.
> And whereas you can write an RTP packetizer for SVC that basically takes a
> mode and a layer structure configuration as input, that is not possible for
> SFRAME, as that information is not visible in the RTP payload. Instead,
> information on how to map it to the RTP layer must come with each SFRAME to
> be packetized.
> Thus, in my view the SFRAME RTP payload format needs to discuss the core of
> these issues to make clear to the application both its options and the
> impact of those options.

This is already possible for any SVC codec re-using the Dependency 
Descriptor header extension:

If you read my example above, this information is known before the frame
goes to the Insertable Stream process, so it will be available for the
RTP packetization without going via SFRAME.
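
As a rough sketch of that flow (Python is used purely for illustration, and
every name below is hypothetical): the encoder-side metadata travels next to
the frame, only the payload goes through the transform, and the packetizer
attaches the metadata to each packet in the clear. The XOR step is only a
placeholder for the JavaScript transform, it is not SFrame and not secure,
and the real Dependency Descriptor has a different wire format than this
simplified structure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FrameMetadata:
    # Layer information known to the encoder before any encryption happens.
    spatial_id: int
    temporal_id: int
    depends_on: List[int]  # frame numbers this frame depends on


@dataclass
class RtpPacket:
    ssrc: int
    seq: int
    marker: bool               # set on the last packet of a frame
    header_ext: FrameMetadata  # sent in the clear, readable by the SFU
    payload: bytes             # opaque to the SFU


def placeholder_encrypt(payload: bytes, key: int) -> bytes:
    # Stand-in for the application transform; NOT SFrame and NOT secure.
    return bytes(b ^ key for b in payload)


def packetize(blob: bytes, meta: FrameMetadata, ssrc: int,
              first_seq: int, mtu: int = 4) -> List[RtpPacket]:
    # Split the opaque blob into mtu-sized chunks. The metadata comes from
    # the encoder, not from the (encrypted) payload.
    chunks = [blob[i:i + mtu] for i in range(0, len(blob), mtu)] or [b""]
    return [
        RtpPacket(ssrc, first_seq + n, n == len(chunks) - 1, meta, chunk)
        for n, chunk in enumerate(chunks)
    ]


def sfu_should_forward(packet: RtpPacket, max_spatial_id: int) -> bool:
    # The SFU decides from the header extension alone, never the payload.
    return packet.header_ext.spatial_id <= max_spatial_id


# One enhancement-layer frame going through the pipeline.
meta = FrameMetadata(spatial_id=1, temporal_id=0, depends_on=[0])
blob = placeholder_encrypt(b"encoded-frame", key=0x5A)
packets = packetize(blob, meta, ssrc=0x1234, first_seq=100)
```

In this sketch the SSRC is simply whatever the stream would have used without
encryption, and an SFU forwarding only the base layer can drop these packets
by checking sfu_should_forward(packet, 0) without touching the payload.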

Best regards