Re: [rtcweb] Fwd: New Version Notification for draft-fineberg-avtext-temporal-layer-ext-00.txt

Stephan Wenger <> Fri, 19 July 2013 16:03 UTC

From: Stephan Wenger <>
To: Justin Uberti <>, Bernard Aboba <>
Date: Fri, 19 Jul 2013 15:33:02 +0000
Subject: Re: [rtcweb] Fwd: New Version Notification for draft-fineberg-avtext-temporal-layer-ext-00.txt

I also believe that 16 bits should be enough.  For H.264 and VP8 that has already been demonstrated.  For H.265, some initial thoughts below.  Apologies for the word-count.

The scalable version of H.265 (called SHVC) is currently under development.  The current working draft can be found here:  Therein, the options for defining layering structures are considerably more complex.  To start, we have 3 bits for the temporal ID in the NAL unit header of the H.265 version 1 (HEVC) base specification, just like in SVC (temporal scalability is already nicely supported in version 1).  In the scalable extension, the NAL unit header contains a six-bit field that points into a data structure known as the "Video Parameter Set" (VPS).  Inside the VPS, those six bits are mapped to a position in a directed graph (specified through "dimension_id[][]"), which tells you about the reference relationship between the layer in question and its parent layer.  One can recursively follow the graph to determine what used to be called dependency_id, quality_id, view_id, and whatnot.  The six-bit pointer field can be (or rather: is to be, when possible) organized by the encoder such that it is prudent for a middle box to throw away NAL units (belonging to layers) with higher values of the six-bit field first, before throwing away NAL units with lower values.  Relying on this feature, 3+6 bits == 9 bits should be fine for the header extension.
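
To make the bit counting concrete, here is a minimal sketch in Python.  The two-byte H.265 NAL unit header is laid out as 1 forbidden bit, 6 bits of NAL unit type, the six-bit field discussed above, and 3 bits of temporal ID (coded as ID plus one).  The function names and the packed 9-bit layout are assumptions made for illustration; they are not taken from any draft.

```python
def parse_hevc_nal_header(header: bytes) -> dict:
    """Split the two-byte H.265 NAL unit header into its fields."""
    if len(header) < 2:
        raise ValueError("an H.265 NAL unit header is two bytes long")
    word = (header[0] << 8) | header[1]
    return {
        "nal_unit_type": (word >> 9) & 0x3F,
        "layer_id": (word >> 3) & 0x3F,    # the six-bit field (zero in version 1)
        "temporal_id": (word & 0x07) - 1,  # the header codes temporal ID plus one
    }


def pack_layer_info(layer_id: int, temporal_id: int) -> int:
    """Hypothetical layout: 6-bit layer id above a 3-bit temporal id (9 bits)."""
    if not 0 <= layer_id < 64:
        raise ValueError("layer_id must fit in 6 bits")
    if not 0 <= temporal_id < 8:
        raise ValueError("temporal_id must fit in 3 bits")
    return (layer_id << 3) | temporal_id


fields = parse_hevc_nal_header(bytes([0x02, 0x12]))
packed = pack_layer_info(fields["layer_id"], fields["temporal_id"])
```

The packed value never exceeds 9 bits, which leaves 7 of the 16 bits spare.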

That said, the ordering by the encoder is just a recommendation, and there may well be cases where different pruning strategies are advisable.  For example, a layering structure could be constructed that expands into two branches, one using 2D scalable tools only, the other including view_id for multiview coding.  By looking at the six-bit field alone, a middle box will not be able to meaningfully remove all NAL units belonging to one of the branches while pruning the other branch.  To deal with that scenario, there would be two options: either to represent the dimension_id[][] (and associated control info) in the header extension, or to require the middle box to have access to the VPS and be able to interpret its content.  The former could take considerably more than 16 bits, and we would be talking about a variable-length data structure.  The latter requires the middle box to have state and a mechanism to intercept the VPS (through signaling, as the encrypted in-band VPS would not be useful under the assumption that the middle box does not have the key to the media, which is the motivation of the draft in the first place).  I personally don't mind the second mechanism at all, as I'm a big fan of out-of-band parameter set transmission, and any middle box must be in the signaling path anyway to meaningfully manipulate RTP.  I do not like the first option due to its variable, and possibly substantial, overhead.
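
The two-branch problem can be sketched as follows.  The layer graph below is entirely hypothetical (five layers, one 2D branch and one multiview branch), and the function names are mine; the point is only to contrast pruning by a threshold on the six-bit field with pruning by following the reference graph, which is what access to the VPS would allow.

```python
# Hypothetical layer graph: layer id -> ids of the layers it references directly.
DEPENDS_ON = {
    0: [],   # base layer
    1: [0],  # 2D spatial enhancement
    2: [0],  # second view (multiview branch)
    3: [1],  # 2D quality enhancement on top of layer 1
    4: [2],  # enhancement of the second view
}


def keep_by_threshold(max_layer_id: int) -> set:
    """Pruning that sees only the six-bit field: keep every id up to a limit."""
    return {layer for layer in DEPENDS_ON if layer <= max_layer_id}


def keep_by_graph(target: int) -> set:
    """Pruning with the reference graph: keep the target and all its ancestors."""
    needed, stack = set(), [target]
    while stack:
        layer = stack.pop()
        if layer not in needed:
            needed.add(layer)
            stack.extend(DEPENDS_ON[layer])
    return needed


threshold_set = keep_by_threshold(3)  # also keeps layer 2 from the other branch
graph_set = keep_by_graph(3)          # keeps only the 2D branch
```

With this graph, a receiver that wants layer 3 still gets layer 2 forwarded under threshold pruning, although nothing it decodes refers to it.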


From: Justin Uberti <<>>
Date: Friday, 19 July, 2013 06:32
To: Bernard Aboba <<>>
Cc: "<>" <<>>, "<>" <<>>
Subject: Re: [rtcweb] Fwd: New Version Notification for draft-fineberg-avtext-temporal-layer-ext-00.txt

Agree those are the right codecs to design for. Since in each case there are fairly low limits on the number of supported layers (i.e. 3 spatial layers for SVC), I think it should be possible to pack the temporal, spatial, quality layer ids into 16 bits.
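
As a rough illustration of the packing, here is a minimal Python sketch.  The field order and the widths (3 bits temporal, 3 bits spatial, 4 bits quality) are assumptions chosen for the example, not something the draft specifies; they add up to 10 bits and so leave 6 of the 16 bits unused.

```python
# Assumed field order and widths in bits; purely illustrative.
FIELDS = (("temporal", 3), ("spatial", 3), ("quality", 4))


def pack(ids: dict) -> int:
    """Pack the layer ids into one integer, first field in the highest bits."""
    value = 0
    for name, width in FIELDS:
        if not 0 <= ids[name] < (1 << width):
            raise ValueError(f"{name} id does not fit in {width} bits")
        value = (value << width) | ids[name]
    return value


def unpack(value: int) -> dict:
    """Inverse of pack(): peel the fields off starting from the lowest bits."""
    ids = {}
    for name, width in reversed(FIELDS):
        ids[name] = value & ((1 << width) - 1)
        value >>= width
    return ids


packed = pack({"temporal": 2, "spatial": 1, "quality": 3})
```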

On Fri, Jul 19, 2013 at 1:56 AM, Bernard Aboba <<>> wrote:
If we can support VP8/9 as well as H.264/5 SVC
that would be a start. It seems doable to me.

On Jul 18, 2013, at 8:34 PM, "Adam Fineberg" <<>> wrote:


Are there other codecs you are thinking should be supported?  If it's generalized I would think we want to be able to cover all known scalable codecs. I'll look into the H264/SVC fields to see how to encode them in a generalized header.


On 7/18/13 7:40 PM, Bernard Aboba wrote:
I think it may be possible to generalize this.  For example, for H.264/SVC which can support temporal, spatial and quality scalability, you would need the quality_id and dependency_id in addition to the temporal_id (what you call the temporal layer index).
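
For reference, a short Python sketch of where those three values sit in H.264/SVC.  It assumes the three-byte NAL unit header extension laid out as in RFC 6190 (R, I, PRID, N, DID, QID, TID, U, D, O, RR); the function name is hypothetical.

```python
def parse_svc_extension(ext: bytes) -> dict:
    """Read the layer identifiers from the 3-byte SVC NAL unit header extension.

    Assumed bit layout (RFC 6190):
      byte 0: R(1) I(1) PRID(6)
      byte 1: N(1) DID(3) QID(4)
      byte 2: TID(3) U(1) D(1) O(1) RR(2)
    """
    if len(ext) < 3:
        raise ValueError("the SVC header extension is three bytes long")
    return {
        "priority_id": ext[0] & 0x3F,
        "dependency_id": (ext[1] >> 4) & 0x07,
        "quality_id": ext[1] & 0x0F,
        "temporal_id": (ext[2] >> 5) & 0x07,
    }


ids = parse_svc_extension(bytes([0x85, 0x23, 0x20]))
```

The three identifiers together take 3 + 4 + 3 = 10 bits.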

Date: Thu, 18 Jul 2013 08:45:38 -0700
Subject: Re: [rtcweb] Fwd: New Version Notification for draft-fineberg-avtext-temporal-layer-ext-00.txt


Good question.  I'm not familiar enough with the parameter requirements of all other scalable codecs to be able to generalize.  If you'd like to help specify them, I'd be fine revising the draft to generalize.


On 7/17/13 8:26 PM, Bernard Aboba wrote:
Since the need is not codec specific (e.g. it arises with any codec supporting temporal, spatial and quality scalability), why a VP8-specific RTP extension?

Date: Wed, 17 Jul 2013 17:09:46 -0700
Subject: [rtcweb] Fwd: New Version Notification for draft-fineberg-avtext-temporal-layer-ext-00.txt


I'm working on WebRTC services and have found that, when developing services that forward VP8 video streams, taking advantage of VP8 temporal scaling requires getting the temporal layer information from the RTP header, which in turn requires decrypting the SRTP packets. This is undesirable both because the middle-box needs access to the keys and because of the added overhead of the decrypt/encrypt cycle. This draft proposes an RTP header extension that carries the VP8 temporal layer information, so that forwarding can be done without SRTP decryption. Comments welcome.
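
A minimal sketch of the forwarding decision, in Python.  It assumes the one-byte header extension format of RFC 5285 (profile 0xBEDE) and relies on SRTP leaving the RTP header and its extensions in the clear.  The extension id and the idea that the low 3 bits of the first data byte hold the temporal layer index are assumptions made for illustration; the draft defines the actual format.

```python
import struct


def find_one_byte_extension(packet: bytes, wanted_id: int):
    """Return the data of the one-byte-header extension element, or None."""
    if len(packet) < 12 or packet[0] >> 6 != 2:
        raise ValueError("not an RTP version 2 packet")
    if not packet[0] & 0x10:  # X bit clear: no header extension present
        return None
    offset = 12 + 4 * (packet[0] & 0x0F)  # skip fixed header and CSRC list
    if len(packet) < offset + 4:
        raise ValueError("truncated header extension")
    profile, words = struct.unpack_from("!HH", packet, offset)
    if profile != 0xBEDE:  # not the one-byte header form
        return None
    offset += 4
    end = offset + 4 * words
    if end > len(packet):
        raise ValueError("truncated header extension")
    while offset < end:
        byte = packet[offset]
        if byte == 0:  # padding byte
            offset += 1
            continue
        ext_id, length = byte >> 4, (byte & 0x0F) + 1
        if ext_id == 15 or offset + 1 + length > end:
            break  # reserved id or malformed element: stop parsing
        if ext_id == wanted_id:
            return packet[offset + 1 : offset + 1 + length]
        offset += 1 + length
    return None


def should_forward(packet: bytes, ext_id: int, max_temporal_layer: int) -> bool:
    """Forward unless the (assumed) temporal layer index exceeds the limit."""
    data = find_one_byte_extension(packet, ext_id)
    if data is None:
        return True  # no layer information: forward rather than guess
    return (data[0] & 0x07) <= max_temporal_layer  # assumed field position


# Example packet: X bit set, extension id 1 carrying temporal layer 2.
rtp_header = bytes([0x90, 96]) + struct.pack("!HII", 1, 0, 0x1234)
extension = struct.pack("!HH", 0xBEDE, 1) + bytes([0x10, 0x02, 0x00, 0x00])
packet = rtp_header + extension + b"encrypted payload"
```

Nothing in this path touches the encrypted payload, so the middle box needs no keys.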

Adam Fineberg
fineberg at<>

-------- Original Message --------
Subject:        New Version Notification for draft-fineberg-avtext-temporal-layer-ext-00.txt
Date:   Tue, 09 Jul 2013 10:02:05 -0700
From:   internet-drafts at<>
To:     Adam Fineberg <fineberg at><>

A new version of I-D, draft-fineberg-avtext-temporal-layer-ext-00.txt
has been successfully submitted by Adam Fineberg and posted to the
IETF repository.

Filename:        draft-fineberg-avtext-temporal-layer-ext
Revision:        00
Title:           A Real-Time Transport Protocol (RTP) Header Extension for VP8 Temporal Layer Information
Creation date:   2013-07-08
Group:           Individual Submission
Number of pages: 6

   This document defines a mechanism by which packets of Real-Time
   Transport Protocol (RTP) video streams encoded with the VP8 codec can
   indicate, in an RTP header extension, the temporal layer information
   about the frame encoded in the RTP packet.  This information can be
   used in a middlebox performing bandwidth management of streams
   without requiring it to decrypt the streams.

_______________________________________________ rtcweb mailing list<>

