Re: [payload] Submission of new versions of H.265/HEVC payload format

"Wang, Ye-Kui" <yekuiw@qti.qualcomm.com> Fri, 09 August 2013 18:01 UTC

From: "Wang, Ye-Kui" <yekuiw@qti.qualcomm.com>
To: Tom Kristensen <tomkrist@cisco.com>
Thread-Topic: [payload] Submission of new versions of H.265/HEVC payload format
Thread-Index: AQHOczrzTxJvXBhLWkGnuMyZJW3g0ZlJxGPggEO4VoD//+8kkA==
Date: Fri, 09 Aug 2013 17:55:46 +0000
Message-ID: <8BA7D4CEACFFE04BA2D902BF11719A83384D6529@nasanexd02f.na.qualcomm.com>
References: <8BA7D4CEACFFE04BA2D902BF11719A83383F92DF@nasanexd02f.na.qualcomm.com> <CAFHv=r9kG4sxTeGc9VyFmv=OaPcFxGmjugPqGRiuhfPsJB6AiQ@mail.gmail.com> <8BA7D4CEACFFE04BA2D902BF11719A833844E9AB@nasanexd02f.na.qualcomm.com> <5204D7FB.1020507@cisco.com>
In-Reply-To: <5204D7FB.1020507@cisco.com>
Cc: "payload@ietf.org" <payload@ietf.org>
Subject: Re: [payload] Submission of new versions of H.265/HEVC payload format

Thanks Tom for the answers and clarifications. I am convinced that we should add max-tr and max-tc on top of what was agreed at the avtcore session in Berlin.

I will include them in the next version unless people have issues with it - so please speak up if you do have issues.

BR, YK

From: Tom Kristensen [mailto:tomkrist@cisco.com]
Sent: Friday, August 09, 2013 4:52 AM
To: Wang, Ye-Kui
Cc: Tom Kristensen; payload@ietf.org
Subject: Re: [payload] Submission of new versions of H.265/HEVC payload format

A late response after summer holiday and inbox flooding season; comments/answers inline below.

On 06/27/2013 07:28 PM, Wang, Ye-Kui wrote:
Hi Tom,

Thanks for the review and the suggestions. I have some comments on the suggestions.

On max-fps: The following paragraph in draft-kristensen-payload-rtp-h241param-00 seems to be the main reason for the introduction of max-fps. Just for a better understanding, could you please clarify why the receiver side would have a preferred frame rate? Does that relate to rendering/display capability? If that is the case, then discarding of the additional frames should be done by the rendering module instead of the decoder, as conforming decoders should just output frames that are indicated, by the bitstream, to be output.

   If the encoder chooses to send at a higher frame rate than preferred
   by the receiver side, the decoder will normally discard the
   additional frames after decoding them.  The transmission of the extra
   frames and the processing of frames to just discard them are wasteful
   and the bandwidth and processing could be used more effectively.

Stephan provided convincing arguments for this in the IETF-87 AVTCORE session. And as we know, both H.264 and H.265 allow very high/extreme frame rates for a given level when the resolution is small enough. The max-fps parameter is meant to set a limit on the useful and preferred maximum frame rate.

And yes, the decoder is one thing. Render/display capability another. The value for max-fps might stem from display/render capability or as Stephan mentioned at IETF-87 - the use of video for content sharing with a typically high resolution/low framerate stream.


Not relevant for H.265/HEVC payload, but for H.264/AVC payload in RFC 6184, would introduction of such a parameter cause any IOP issue between a legacy sender and a new receiver? I think that should be analyzed in draft-kristensen-payload-rtp-h241param-00. Certainly, it would also be good to clarify the above question related to rendering/display capability in draft-kristensen-payload-rtp-h241param-00 too.

Thanks, note taken! I will update the upcoming version of draft-kristensen-payload-rtp-h241param with the interop/backwards compatibility issues valid for H.264 and the rendering/display vs. decoder distinction.


On max-lts: It makes sense to me, but we would need to specify a lower bound for the value, such that it would at least not contradict the minimum tile size implied by other parameters.

This parameter is no longer proposed, since the signalling need is taken care of in the merged proposal for parallelization tools agreed upon at IETF-87. The minimum tile size should be limited to the lower bound on the tile size in H.265; text might be needed for the dec-parallel-cap.


On max-tr and max-tc: It seems that they are saying that there are scenarios where you would need more tiles than allowed by the indicated level. Could you please give a (practical) example where this is needed? I am asking this because in JCT-VC there had been a lot of discussions after which the level limits of MaxTileRows and MaxTileCols were concluded.

The values of maxTileRows and maxTileColumns defined in HEVC were a compromise, mainly between people who wanted to limit the maximum number of tiles for their single-core decoder design and people who wanted a large number of tiles for their multi-core encoder design. Moreover, when making maxTileRows/Cols in HEVC level-dependent (rather than resolution-dependent) it was already anticipated that additional flexibility could be achieved during call negotiation. (See document JCTVC-J0235.)

There are at least three reasons for adding max-tr and max-tc as additional optional parameters:

   1)  Typically video conferencing uses bit rates that are significantly lower than the values specified in HEVC for a given level/resolution. For example, it is expected that 1080p30 can use bit rates of 1 Mbps or below, while HEVC specifies a maximum bit rate of 12 Mbps for 1080p30 (level 4). To avoid designing CABAC decoding for unnecessarily high bit rates, the practical solution that has been used in the industry since H.264 days has been to signal support for a low level, and then send optional parameters to indicate support for a higher resolution. For example, a 1080p30 capable decoder could signal:
    -  max-recv-level-id = 2.1 (maxBR = 3000)
    -  max-ls = 1920x1080
    -  max-lps = 1920x1080x30
However, without being able to signal max-tc and max-tr, the encoder would have to comply with the values of maxTileRows and maxTileColumns of level 2.1 (maxTileRows = maxTileColumns = 1).

   2)  Even if a decoder could signal the "true" value of max-recv-level-id, there are encoders that would benefit from using an even higher number of tiles. For example, for level 4 (1080p30), HEVC specifies maxTileRows = maxTileColumns = 5, limiting the total number of tiles to 25. On the other hand, there exist processing devices having as many as 64 cores. Also, for software encoder designs, it might be beneficial to have significantly more tiles than cores for load balancing purposes. A decoder that signals values of max-tr and max-tc larger than those indicated by the signaled level/resolution would give the encoder more freedom to choose the appropriate balance between number of tiles and level of parallelization.

   3)  Finally, for the sake of consistency, it seems reasonable to support the same negotiation mechanisms for all parameters of HEVC tables A-1 and A-2, i.e. not only those corresponding to max-ls, max-lps, max-cpb, max-dpb, and max-br.


Cheers,
-- Tom



BR, YK

From: payload-bounces@ietf.org [mailto:payload-bounces@ietf.org] On Behalf Of Tom Kristensen
Sent: Thursday, June 27, 2013 6:34 AM
To: Wang, Ye-Kui
Cc: Geir Arne Sandbakken; payload@ietf.org; Tom Kristensen
Subject: Re: [payload] Submission of new versions of H.265/HEVC payload format

We have done a first review of the new version of this draft. Seems promising. However, we are missing a couple of things.

First, we would like to include a max-fps parameter in line with the proposed draft draft-kristensen-payload-rtp-h241param-00. A similar parameter has been part of H.241 for a while and will eventually be valid for H.265 in H.323 land too. Note that this is a limiting parameter, different from the other max-* parameters (max-ls, max-lps, max-cpb, max-dpb and max-br).

We propose max-fps as something along the lines of:
   ------
   max-fps:  The value of max-fps is an integer indicating the maximum
      frame rate in units of hundredths of frames per second that can be
      efficiently received.  The signaled highest level is conveyed in
      the value of the profile-level-id parameter or the max-recv-level
      parameter.  The max-fps parameter MAY be used to signal that the
      receiver has a constraint in that it is not capable of decoding
      video efficiently at the full frame rate that is implied by the
      signaled highest level and, if present, one or more of the
      parameters max-ls, max-lps or max-br.

      Notice that the value of max-fps is not necessarily the frame rate
      at which the maximum frame size can be sent, it constitutes a
      constraint on maximum frame rate for all resolutions.

         Informational note: The max-fps parameter is semantically
         different from max-ls, max-lps, max-cpb, max-dpb,
         and max-br in that max-fps is used to signal a constraint,
         lowering the maximum frame rate from what is implied by the
         signaled MaxLumaSR and MaxLumaPS.

      The encoder MUST use a frame rate equal to or less than this
      value.  In cases where the max-fps parameter is absent the
      encoder is free to choose any frame rate according
      to the highest signaled level and any signaled optional
      parameters.
   ------
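
As a rough illustration of the proposed semantics (a sketch only; the function names below are hypothetical and not taken from any draft), max-fps is expressed in hundredths of frames per second and acts as an upper bound on whatever frame rate the sender would otherwise choose:

```python
from typing import Optional


def max_fps_to_hz(max_fps: int) -> float:
    """Convert a max-fps value (hundredths of frames per second) to frames per second."""
    return max_fps / 100.0


def clamp_frame_rate(desired_hz: float, max_fps: Optional[int]) -> float:
    """Return the frame rate the encoder may use under the proposed max-fps text.

    desired_hz is the rate the encoder would pick from the signaled highest
    level and any other optional parameters; deriving it is outside this sketch.
    """
    if max_fps is None:
        # Parameter absent: max-fps adds no constraint.
        return desired_hz
    # Parameter present: the encoder must not exceed the signaled rate.
    return min(desired_hz, max_fps_to_hz(max_fps))
```

For instance, a receiver signaling max-fps=3000 would limit a sender that could otherwise send 60 frames per second to 30, at every resolution.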

Second, we are missing an ability to signal support for tiles (number of tiles used/to expect) and maximum tile size. We need the ability to signal support for a number of tiles higher than what is implied by the signalled level (as listed in Table A-1 in the HEVC specification). These two parameters have the same semantics as the existing max-* parameters (max-ls, max-lps, max-cpb, max-dpb and max-br).

We propose max-tr (or max-tile-rows) and the equivalent description for max-tc (or max-tile-cols) along the lines of:
   ------
   (Will be introduced together with max-ls, max-lps, max-cpb, max-dpb, max-br, as parameters that "MAY be used to signal the capabilities of a receiver implementation.")
capabilities...:

   max-tr:
      The value of max-tr is an integer indicating the maximum number
      of tile rows. The max-tr parameter signals that the receiver is
      capable of decoding video with a larger number of tile rows than
      is required by the signaled highest level.

      When max-tr is signaled, the receiver must be able to decode NAL
      unit streams that conform to the signaled highest level, with the
      exception that the MaxTileRows value in Table A-1 of [HEVC] for
      the signaled highest level is replaced with the value of max-tr.
      The value of max-tr MUST be greater than or equal to the value of
      MaxTileRows given in Table A-1 of [HEVC] for the highest
      level. Senders MAY use this knowledge to send pictures utilizing
      a larger number of tile rows than is indicated in the signaled
      highest level.

   max-tc:
      (As for max-tr with s/max-tr/max-tc/
                          s/MaxTileRows/MaxTileCols/
                          s/rows/columns/)
   ------
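
The proposed max-tr/max-tc rules can be sketched as follows. This is an illustration under stated assumptions, not text from any draft: the function and table names are hypothetical, and the table holds only the MaxTileRows/MaxTileCols values quoted in this thread (levels 2.1 and 4), not the full Table A-1.

```python
from typing import Optional, Tuple

# (MaxTileRows, MaxTileCols) for the two levels discussed in this thread only.
LEVEL_TILE_LIMITS = {
    "2.1": (1, 1),
    "4": (5, 5),
}


def effective_tile_limits(
    level: str,
    max_tr: Optional[int] = None,
    max_tc: Optional[int] = None,
) -> Tuple[int, int]:
    """Return the (rows, cols) tile limits a sender may use.

    A signaled max-tr or max-tc replaces the level's limit, and must be
    greater than or equal to it.
    """
    rows, cols = LEVEL_TILE_LIMITS[level]
    if max_tr is not None:
        if max_tr < rows:
            raise ValueError("max-tr must be >= MaxTileRows of the signaled level")
        rows = max_tr
    if max_tc is not None:
        if max_tc < cols:
            raise ValueError("max-tc must be >= MaxTileCols of the signaled level")
        cols = max_tc
    return rows, cols
```

With level 2.1 and no optional parameters the sender is held to a single tile; signaling max-tr=5 and max-tc=5 alongside level 2.1 would lift that to the 5x5 grid that level 4 allows.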

The last tile related parameter is a new parameter named max-lts to specify maximum luma tile size to make sure the decoder doesn't receive a stream with too large resolution and/or bitrate in one single tile. The max-lts parameter is a limiting parameter - as described for max-fps above.

We propose max-lts to signal the maximum size allowed in one tile along the lines of:
   ------
   max-lts:
      The value of max-lts, Maximum Luma Tile Size, is an integer
      indicating the maximum tile size in units of luma samples. The
      max-lts parameter signals the maximum size of one tile that the
      decoder is able to decode.

         Informational note: The max-lts parameter is semantically
         different from max-ls, max-lps, max-cpb, max-dpb,
         and max-br in that max-lts is used to signal a constraint
         on the maximum tile size.

      The encoder MUST use a tile size equal to or less than this
      value.  In cases where the max-lts parameter is absent the
      encoder is free to choose any tile size according
      to the highest signaled level and any signaled optional
      parameters.
   ------

Please consider these additions. More text needed to merge the new parameters into the media subtype specification, offer/answer rules and so on - of course. To be dealt with later.

-- Tom

On 11 June 2013 19:00, Wang, Ye-Kui <yekuiw@qti.qualcomm.com> wrote:
Dear All,

We have just submitted versions 02 and 03 of draft-schierl-payload-rtp-h265, for which the links are as follows:

Version 02:
URL:             http://www.ietf.org/internet-drafts/draft-schierl-payload-rtp-h265-02.txt
Htmlized:        http://tools.ietf.org/html/draft-schierl-payload-rtp-h265-02
Diff:            http://www.ietf.org/rfcdiff?url2=draft-schierl-payload-rtp-h265-02

Version 03:
URL:             http://www.ietf.org/internet-drafts/draft-schierl-payload-rtp-h265-03.txt
Htmlized:        http://tools.ietf.org/html/draft-schierl-payload-rtp-h265-03
Diff:            http://www.ietf.org/rfcdiff?url2=draft-schierl-payload-rtp-h265-03

Version 02 includes huge changes compared to the earlier submitted version 01. In a short summary, the authors have worked hard to try to make the design complete, from both the payload structure and the signaling points of view. Compared to version 02, version 03 only contains a correction of the email address of an author.

Because the industry is eager to deploy H.265/HEVC and other standards bodies such as 3GPP rely on the payload format for support of H.265/HEVC in RTP based video services, we need to progress this draft relatively quickly. Therefore, we would like to request quick reviews from interested parties and also request WG status for this draft.

BR, YK (on behalf of the authors)
_______________________________________________
payload mailing list
payload@ietf.org
https://www.ietf.org/mailman/listinfo/payload



--
# Cisco                         |  http://www.cisco.com/telepresence/
## tomkrist@cisco.com            |  http://www.tandberg.com
###                               |  http://folk.uio.no/tomkri/