[rtcweb] VP8 payload, decoder processing capabilities (was Re: Resolution negotiation - a contribution)

Stephan Wenger <stewe@stewe.org> Mon, 16 April 2012 21:40 UTC

From: Stephan Wenger <stewe@stewe.org>
To: Harald Alvestrand <harald@alvestrand.no>
Thread-Topic: VP8 payload, decoder processing capabilities (was Re: [rtcweb] Resolution negotiation - a contribution)
Thread-Index: AQHNHBl/zPrIuhmKBk2PF+XRFqQqHg==
Date: Mon, 16 Apr 2012 21:40:09 +0000
Message-ID: <CBB1D76E.85DD1%stewe@stewe.org>
In-Reply-To: <4F87D9B1.4000206@alvestrand.no>
Cc: "rtcweb@ietf.org" <rtcweb@ietf.org>, "payload@ietf.org" <payload@ietf.org>
Subject: [rtcweb] VP8 payload, decoder processing capabilities (was Re: Resolution negotiation - a contribution)

Hi all,

For context: Harald and I have been at odds for a while now about the
lack of support for a code point in the VP8 payload that can be used to
negotiate a maximum decoder/bitstream complexity.  Specifically, Harald
(and other VP8 payload folks) suggested that generic mechanisms, such as
the a=framerate attribute of RFC 4566 in conjunction with the picture size
aspect of the imageattr of RFC 6236 can be used, at least in the rtcweb
context.  However, as far as I understood our argument, these two
mechanisms in combination are not meant as a limit for decoder complexity
(in terms of samples/sec processing rate), but rather as an indication,
from receiver to sender, of an upper bound of what is "useful to send".
See the email below.  To me, it's quite obvious that an indication of
"useful to send" includes "my decoder can handle this"; however, it could
be more restrictive in that factors other than decoder horsepower could
also be at play, such as screen size, user interface settings, and so on.

I believe that the combination of what can be signaled using the above
mechanisms should be sufficient for rtcweb.  However, I also believe that
it is insufficient for general purpose use, mostly because it requires the
support of RFC 6236, which is not exactly a widely deployed technology.
Further, the a=framerate attribute is no longer particularly useful,
because variable frame rates are now the norm, at least for software
encoding/decoding.

In previous posts on the payload list (in response to the VP8 payload
WGLC), I have commented on the practical shortcomings of the (lack of)
complexity negotiation, and suggested that this needs to be fixed.

Two options:

1) Codify Harald's mechanism (based on a=framerate and imageattr) in the
VP8 payload draft, at MUST strength:  "In a declarative context, a
prospective media sender supporting this (VP8 payload) specification MUST
support RFC 4566 a=framerate and RFC 6236 imageattr, and MUST include code
points according to both mechanisms to identify the properties of the
media stream.  In an offer-answer context, both offerer and answerer
supporting this VP8 payload specification MUST support a=framerate and
imageattr, and MUST include both in their respective offer/answer
messages, so as to identify an operation point that will not overload the
media decoder's capabilities."

The issue with this approach, IMO, is that we are dealing here with three
individual code points (framerate, horizontal and vertical picture size),
where a single code point ought to be sufficient for determining whether a
decoder is capable of decoding a stream, at least from a complexity
viewpoint.
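
To illustrate why one number can stand in for the three, here is a
minimal sketch (not taken from any draft; the function name is
hypothetical).  It assumes decoder load is approximated by width x height
x framerate, and shows two quite different operating points that cost the
same in samples per second:

```python
def samples_per_second(width: int, height: int, framerate: float) -> int:
    """Approximate decoder load as samples (pixels) processed per second.

    Assumption for illustration: load scales with width * height * framerate.
    """
    return int(width * height * framerate)


# Two different operating points with the same processing requirement.
vga_30 = samples_per_second(640, 480, 30)    # VGA at 30 Hz
hd_10 = samples_per_second(1280, 720, 10)    # 720p at 10 Hz

print(vga_30, hd_10)
```

Independent maxima on framerate, width and height cannot express this
trade-off: a receiver that advertises 640x480 at 30 Hz would rule out the
720p stream at 10 Hz even though its decoder could handle it.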

2) Include, in the VP8 payload, a negotiable SDP code point indicating the
complexity of a stream, in units of samples per second processing
requirements or a derivative thereof (such as levels, as used in the MPEG
world).  For example, the VP8 payload could include a single, optional,
negotiable parameter "SamplePerSecond".  If SamplePerSecond were absent in
the SDP, a value of xxxxx must be inferred.  (A sensible value for xxxxx
could be, for example, 9216000, which is the number of samples per second
for VGA resolution at 30 Hz.)  If SamplePerSecond is present in a
declarative context, it indicates the minimum processing requirements a
decoder must support in order to successfully decode the stream.  In a
symmetric offer-answer context, SamplePerSecond can be used to "dial down"
the complexity of the stream to a value that both encoder and decoder can
support.
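
The "dial down" behaviour could be sketched roughly as follows.  This is
only an illustration under stated assumptions: the parameter is assumed to
be carried as "SamplePerSecond=<integer>" in a semicolon-separated
parameter string (that syntax is not defined anywhere yet), the default
uses the example value suggested above for xxxxx, and the function names
are hypothetical.

```python
import re

# Assumption: the example default suggested above (VGA at 30 Hz).
DEFAULT_SAMPLES_PER_SECOND = 640 * 480 * 30


def parse_samples_per_second(params: str) -> int:
    """Return the SamplePerSecond value from a parameter string.

    Falls back to the inferred default when the parameter is absent.
    The "name=value; name=value" syntax is an assumption for this sketch.
    """
    match = re.search(r"(?:^|;)\s*SamplePerSecond=(\d+)", params)
    return int(match.group(1)) if match else DEFAULT_SAMPLES_PER_SECOND


def negotiate(offer_params: str, answer_params: str) -> int:
    """Symmetric offer-answer: settle on the lower of the two limits."""
    return min(
        parse_samples_per_second(offer_params),
        parse_samples_per_second(answer_params),
    )


# Offerer can handle 720p at 30 Hz; answerer sends no parameter at all.
agreed = negotiate("SamplePerSecond=27648000", "")
print(agreed)
```

Taking the minimum means neither side is ever asked to decode more than it
advertised, and an endpoint that omits the parameter is treated as
supporting only the inferred default.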

My preference is obviously the second proposal, but I'm willing to help
flesh out either or both of them, just not today :-)

Regards,
Stephan
 


On 4.13.2012 00:45 , "Harald Alvestrand" <harald@alvestrand.no> wrote:

>On 04/12/2012 11:13 PM, Stephan Wenger wrote:
>>
>> On 4.12.2012 12:08 , "Harald Alvestrand"<harald@alvestrand.no>  wrote:
>>
>>> On 04/12/2012 08:19 PM, Stephan Wenger wrote:
>>>> Hi Harald,
>>>> Thanks for this strawman.  I believe it should work, but I fail to see
>>>> how a two dimensional negotiation requirement (negotiating max values
>>>> for framerate and image size--which, in turn, also has two-dimensional
>>>> properties) leads to better interop than a one dimensional negotiation
>>>> (pixels per second processing requirement).
>>> Stephan,
>>>
>>> I do not see this (or the requirement from the use-cases document)
>>> first and foremost a decoder complexity negotiation; it is a negotiation
>>> of how much data it is useful to send, given the recipient's intended
>>> use of that data.
>> Then such a negotiation should be executed in addition.  Decoder cycle
>> requirements do matter in practical implementations.
>Feel free to propose language that captures this requirement. As noted,
>my I-D fragment doesn't.
>
>