Re: [AVTCORE] [payload] Clarifying the H.265 RTP payload format specification's text on when to set the RTP "M" bit

Ross Finlayson <finlayson@live555.com> Sat, 10 August 2013 00:54 UTC

From: Ross Finlayson <finlayson@live555.com>
Content-Type: multipart/alternative; boundary="Apple-Mail=_D679A071-4D6C-4368-9856-7FA361D3ABA3"
Message-Id: <A6FDCB43-F221-4BE0-BEDA-1F1E5394DDEC@live555.com>
Mime-Version: 1.0 (Mac OS X Mail 6.5 \(1508\))
Date: Fri, 9 Aug 2013 17:47:50 -0700
References: <C1F36850-2B72-4A98-97B7-8847C9C90CB0@live555.com> <3879D71E758A7E4AA99A35DD8D41D3D91D4C7AA7@xmb-rcd-x14.cisco.com> <A0D9B971-CDE0-4AF7-9A70-C280514AEF10@live555.com> <8BA7D4CEACFFE04BA2D902BF11719A83384D4C47@nasanexd02f.na.qualcomm.com> <5D781561-69CD-4E0D-9377-B4B26036E691@live555.com> <8BA7D4CEACFFE04BA2D902BF11719A83384D5FB3@nasanexd02f.na.qualcomm.com> <69F25DB2-6926-4ACA-935D-90376A2A7BFA@live555.com> <8BA7D4CEACFFE04BA2D902BF11719A83384D690F@nasanexd02f.na.qualcomm.com>
To: "avt@ietf.org" <avt@ietf.org>, "payload@ietf.org" <payload@ietf.org>
In-Reply-To: <8BA7D4CEACFFE04BA2D902BF11719A83384D690F@nasanexd02f.na.qualcomm.com>
X-Mailer: Apple Mail (2.1508)
Subject: Re: [AVTCORE] [payload] Clarifying the H.265 RTP payload format specification's text on when to set the RTP "M" bit

> Now I understand your intention better – did not see that the note was intended for sender implementers. 

OK, thanks.


>  For this purpose, indeed (as Mo pointed out earlier?), the sender first of all needs to make sure the RTP timestamp for each RTP packet is correct. As long as the RTP timestamp is correct, it is also easy to know when the M bit should be set. Or are you saying that there is no problem with setting the RTP timestamp but there is a problem with setting the M bit? If yes, please clarify why.

Figuring out the RTP timestamp isn't (nearly as much of) a problem, because this can be done by knowing the stream's frame rate.  But in this email thread, I'm focusing only on the RTP 'M' bit, because for it (unlike the timestamp) there's less of an incentive for a sender implementation to 'get it right'.  I'm worried that if the RTP payload format specification doesn't make it crystal clear when a NAL unit ends an 'access unit', then an implementer might be tempted to just not bother setting the bit at all.  I don't want to see that happen.  Hence this email thread.
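
As a rough illustration of the timestamp side, here is a minimal Python sketch of deriving an RTP timestamp from the frame rate. It assumes a constant frame rate, a 90 kHz RTP clock, and a frame index counted from the start of the stream; the names `RTP_CLOCK_HZ` and `rtp_timestamp` are made up for this example and do not come from any specification or library.

```python
from fractions import Fraction

# Assumption: a 90 kHz RTP clock, as commonly used for video payload formats.
RTP_CLOCK_HZ = 90000


def rtp_timestamp(frame_index: int, frame_rate: Fraction, initial: int = 0) -> int:
    """Return the 32-bit RTP timestamp for a frame, assuming a constant frame rate.

    `initial` stands in for the (normally random) timestamp offset of the stream.
    """
    ticks = round(frame_index * RTP_CLOCK_HZ / frame_rate)
    # RTP timestamps are 32 bits wide and wrap around.
    return (initial + ticks) % (1 << 32)


if __name__ == "__main__":
    # Frame 1 of a 25 fps stream, then frame 2 of a 30000/1001 fps stream.
    print(rtp_timestamp(1, Fraction(25)))
    print(rtp_timestamp(2, Fraction(30000, 1001)))
```

Every NAL unit of the same access unit would carry the same value, which is why the timestamp alone does not tell a sender which packet is the last one of the access unit.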


> I have no problem with putting in some information along the lines you suggested, if the suggested text is good. However, the suggested wording below still has a few problems (or shortcomings).
>  
> ----------
> Unfortunately the contents of a NAL unit, alone, do not tell an RTP sender implementation whether or not the NAL unit ends an access unit.  Instead, the implementation can obtain this information separately, from the encoder.  If, however, this information is not available directly from the encoder (e.g., because the implementation is sending data that consists solely of a sequence of pre-encoded NAL units), then it must instead inspect subsequent NAL units, to determine whether or not the NAL unit ends an access unit.  The following rule can be used:
>     A NAL unit ends an access unit if it is a VCL NAL unit, and the next-occurring VCL NAL unit has the high-order bit of the first byte after its NALU header equal to 1.
> ----------
>  
> The problems (or shortcomings) are:
> - I don’t think it is good to call it unfortunate that a NAL unit itself does not indicate whether it is the last one of an AU.

Why not?  It *is* "unfortunate".  I'm not intending to 'insult' H.265 here :-)


> - Saying that the information can be obtained from the encoder is vague to me. I guess you meant the bitstream instead, as the encoder may be absent, e.g. when dealing with pre-encoded video.

I think you may have been confused by my use of the word "can" in the 2nd sentence of my proposed paragraph.  Here is a rewording of the 2nd (& 3rd) sentences; I hope this will make things clearer:

	"Instead, the implementation may be able to obtain this information separately, from the encoder.  If, however, the implementation cannot obtain this information directly from the encoder (e.g., because the implementation is sending data that consists solely of a sequence of pre-encoded NAL units), then it must instead inspect subsequent NAL units, to determine whether or not the NAL unit ends an access unit."


> - Last but not least, an AU may also end with a non-VCL NAL unit, e.g. a filler data NAL unit, an end of sequence NAL unit, an end of bitstream NAL unit, a suffix SEI NAL unit, etc.

Arggh!  Now, I hope, you see the problem :-)  Can you then help me figure out a more accurate wording for the rule?  In particular:
	1/ Suppose we have a VCL NAL unit, followed immediately by another VCL NAL unit with the first bit of its slice header set (i.e., the high-order bit of the first byte after its NALU header is 1).  In this case, there's no doubt: The first VCL NAL unit definitely ends an 'access unit' - correct?
	2/ Suppose we have a VCL NAL unit, followed by a sequence of one or more non-VCL NAL units, followed immediately by a VCL NAL unit with the first bit of its slice header set.  In this case, which (if any) of the non-VCL NAL units ends an 'access unit'?  Can we figure out a rule (ideally, an easy rule, based solely on the "nal_unit_type"s) for this?
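
To make the two cases concrete, here is a minimal Python sketch of the check a sender would do on pre-encoded NAL units. It is only an illustration under stated assumptions, not proposed specification text: it assumes the 2-byte H.265 NAL unit header (with "nal_unit_type" in bits 1-6 of the first byte, and values below 32 being VCL types) and NAL units passed in as byte strings without start codes. The function names are made up for this example. It answers only case 1; case 2 is reported as undecided (`None`), because the rule for it is exactly the open question.

```python
from typing import List, Optional


def nal_unit_type(nal: bytes) -> int:
    # H.265 NAL unit header: forbidden_zero_bit (1), nal_unit_type (6),
    # nuh_layer_id (6), nuh_temporal_id_plus1 (3).
    return (nal[0] >> 1) & 0x3F


def is_vcl(nal: bytes) -> bool:
    # nal_unit_type values 0..31 are VCL types; 32 and above are non-VCL.
    return nal_unit_type(nal) < 32


def starts_new_picture(nal: bytes) -> bool:
    # High-order bit of the first byte after the 2-byte NAL unit header,
    # i.e. the first bit of the slice header.
    return is_vcl(nal) and len(nal) > 2 and (nal[2] & 0x80) != 0


def ends_access_unit(nals: List[bytes], i: int) -> Optional[bool]:
    """Return True/False when case 1 decides it, None when it does not."""
    if not is_vcl(nals[i]):
        return None  # case 2: a non-VCL NAL unit; no rule agreed yet
    if i + 1 >= len(nals):
        return None  # no following NAL unit to inspect yet
    nxt = nals[i + 1]
    if is_vcl(nxt):
        # Case 1: the very next NAL unit is a VCL NAL unit.
        return starts_new_picture(nxt)
    return None  # case 2: non-VCL NAL unit(s) follow; undecided here


if __name__ == "__main__":
    first = bytes([0x02, 0x01, 0x80])   # type 1, first slice of a picture
    later = bytes([0x02, 0x01, 0x00])   # type 1, later slice of the same picture
    sei = bytes([0x50, 0x01, 0x00])     # type 40 (suffix SEI), non-VCL
    stream = [first, later, first, sei, first]
    print([ends_access_unit(stream, i) for i in range(len(stream))])
```

Returning `None` rather than guessing keeps the undecided situations visible: a sender built on this sketch would still need a rule for case 2 before it could set the 'M' bit correctly on every access unit.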


Ross Finlayson
Live Networks, Inc.
http://www.live555.com/