Re: [AVTCORE] [payload] Clarifying the H.265 RTP payload format specification's text on when to set the RTP "M" bit

"Wang, Ye-Kui" <> Fri, 09 August 2013 22:57 UTC


Now I understand your intention better; I had not seen that the note was intended for sender implementers.

For this purpose, indeed (as Mo pointed out earlier?), the sender first of all needs to make sure that the RTP timestamp for each RTP packet is correct. As long as the RTP timestamp is correct, it is also easy to know when the M bit should be set. Or are you saying that there is no problem with setting the RTP timestamp but there is a problem with setting the M bit? If so, please clarify why.

I have no problem with putting in some information along the lines you suggested, if the suggested text is good. However, the suggested wording below still has a few problems (or shortcomings).

Unfortunately the contents of a NAL unit, alone, does not tell a RTP sender implementation whether or not the NAL unit ends an access unit.  Instead, the implementation can obtain this information separately, from the encoder.  If, however, this information is not available directly from the encoder (e.g., because the implementation is sending data that consists solely of a sequence of pre-encoded NAL units), then it must instead inspect subsequent NAL units, to determine whether or not the NAL unit ends an access unit.  The following rule can be used:
    A NAL unit ends an access unit if it is a VCL NAL unit, and the next-occurring VCL NAL unit has the high-order bit of the first byte after its NALU header equal to 1.
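
For concreteness, the suggested rule can be sketched roughly as follows. This is only an illustration in Python: the function names are hypothetical (not from any library), each NAL unit is assumed to be a bytes object without start code, and the sketch follows the rule exactly as worded, so it inherits the shortcomings discussed in this message. It relies on the HEVC NAL unit header being two bytes, with nal_unit_type in bits 1..6 of the first byte and values 0..31 denoting VCL NAL units.

```python
from typing import List


def nal_unit_type(nal: bytes) -> int:
    # The HEVC NAL unit header is 2 bytes; after the 1-bit
    # forbidden_zero_bit comes the 6-bit nal_unit_type.
    return (nal[0] >> 1) & 0x3F


def is_vcl(nal: bytes) -> bool:
    # nal_unit_type values 0..31 are VCL NAL units.
    return nal_unit_type(nal) < 32


def starts_new_picture(nal: bytes) -> bool:
    # High-order bit of the first byte after the 2-byte NALU header
    # (first_slice_segment_in_pic_flag for a VCL NAL unit).
    return len(nal) > 2 and (nal[2] & 0x80) != 0


def ends_access_unit(nals: List[bytes], i: int) -> bool:
    """Apply the suggested rule literally to nals[i] (hypothetical helper)."""
    if not is_vcl(nals[i]):
        return False  # the rule as worded only considers VCL NAL units
    for nxt in nals[i + 1:]:
        if is_vcl(nxt):
            return starts_new_picture(nxt)
    return False  # no later VCL NAL unit: the rule gives no answer
```

Note that a sender using this has to look ahead past any non-VCL NAL units until the next VCL NAL unit arrives.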

The problems (or shortcomings) are:

- I don't think it is good to describe as unfortunate the fact that a NAL unit by itself does not indicate whether it is the last one of an AU.

- Saying that the information can be obtained from the encoder is vague to me. I guess you meant the bitstream instead, as the encoder may be absent, e.g. when dealing with pre-encoded video.

- Saying that inspecting subsequent NAL units is a must is not really correct, as I believe timing information can be used too. As mentioned above, you have to make sure the RTP timestamp value is correctly set anyway.

- Last but not least, an AU may also end with a non-VCL NAL unit, e.g. a filler data NAL unit, an end of sequence NAL unit, an end of bitstream NAL unit, a suffix SEI NAL unit, etc.
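
To illustrate the timing-based alternative, here is a minimal sketch in Python. It assumes the sender already has a correct RTP timestamp for every packet, that packets are listed in transmission order for a single stream, and that consecutive access units have different timestamps. The function name is hypothetical. Because all packets of an AU share one timestamp, trailing non-VCL NAL units (suffix SEI, filler data, etc.) are covered without inspecting NAL unit contents.

```python
from typing import List


def marker_bits(timestamps: List[int]) -> List[bool]:
    """Return the M bit for each packet, given per-packet RTP timestamps.

    Hypothetical helper: a packet is treated as the last one of its
    access unit when the next packet carries a different timestamp,
    or when there is no next packet.
    """
    bits = []
    for i, ts in enumerate(timestamps):
        is_final_packet = i + 1 == len(timestamps)
        bits.append(is_final_packet or timestamps[i + 1] != ts)
    return bits
```

A live sender would not have the whole list in advance, so in practice this means holding back one packet until the timestamp of the following one is known.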


From: Ross Finlayson
Sent: Friday, August 09, 2013 1:11 PM
Subject: Re: [payload] [AVTCORE] Clarifying the H.265 RTP payload format specification's text on when to set the RTP "M" bit

Though I don't think it is really necessary, I think we can either simply add a note after the semantics of the M bit saying that the first slice header bit indicates that the slice is the first slice of a picture, or mention this as part of the introduction to HEVC.

Because the new text is intended to help implementers figure out when to set the 'M' bit, it should be added just after the existing 'M' bit text.  (But why not just use the paragraph that I suggested?)

The addition is not really necessary because the first slice header bit is expected to be used mainly by video decoders, for detecting the start of a new picture without relying on timing information (because it might not be available to the decoder, etc.), while receivers at the RTP level can always use the M bit and/or the RTP timestamp.

Yes, but that's irrelevant here, because we're talking about how to clarify the text for implementers of RTP *senders*, not RTP receivers.  (If the current proposed RTP payload format specification were to be used only by the implementers of H.265/RTP receivers, then I would have no problem with it.  But that's not the case.)

it is so easy to follow the semantics, and I don't see a reason why implementers would not follow what is being specified.

The reason is (as I've explained before) that for many RTP sender implementers who aren't very familiar with the H.265 codec specification, it will not be immediately obvious whether a NAL unit ends an access unit.  That's why I'm proposing adding a (very small) bit of extra text to clarify this.

I think the core of the disagreement here is that there are two different views on what a RTP payload format specification should be:

One view is that a RTP payload format specification should be the shortest, most concise possible document that describes how to send/receive RTP packets for a particular codec, for someone who is already very familiar with the codec specification.  From the perspective of a codec designer, this point of view makes perfect sense: The codec specification is the 'core document'; the RTP payload format specification is logically just an 'appendix'.  Therefore, from this point of view, there seems little sense in adding extra, redundant text to the RTP payload format document, because it's information that is already available in the codec specification.

However, there are two problems with this point of view.  The first problem is that it makes the specification 'fragile'.  Someone who happens to misread the codec specification is more likely to implement the RTP payload format incorrectly.

The second problem with this point of view is a practical one.  Many (if not most) people who implement RTP payload formats are not codec experts.  They just want to know how to transmit data that they've received from an encoder, or (on the receiving end) want to know how to feed data that they've received from a RTP stream into a decoder.  It's not reasonable to expect them to have intimate knowledge of the codec specification (especially since they may want to implement the RTP payload formats for several codecs - not just one).  Nor will they always have time to read through the codec specification in detail to find information that could just as easily have been made available in the RTP payload format specification.

An alternative view of what a RTP payload format should be (and this is the view that, I think, is shared by most members of the AVT(CORE) working group) is that a RTP payload format specification should have sufficient detail to allow someone to implement the payload format (sending and/or receiving), even if they have only a limited knowledge of the codec specification, and even if this means adding text to the RTP payload format specification that might - from the point of view of the codec designer - seem somewhat redundant.

I hope this helps you understand why adding just a small bit of redundant text (such as the paragraph I suggested) to the RTP payload format specification (after the existing 'M' bit text) will help implementers, and make it more likely that they will implement the RTP payload format correctly.

Ross Finlayson
Live Networks, Inc.