Re: [AVTCORE] Clarifying the H.264 RTP payload format specification's text on when to set the RTP "M" bit

Ross Finlayson <finlayson@live555.com> Sun, 04 August 2013 21:51 UTC


> If you are willing to detect the last NALU of a frame by waiting for the first VCL NALU of the next frame, that is very simple in both H.264 and H.265. The first bit after the VCL NALU header is set if it’s the first VCL NALU of a frame. This works in H.265 because the slice header explicitly has such a flag.

Thanks.  This is exactly the kind of information that would be useful for implementers.


> Note that RTP receivers are technically forbidden to rely on the marker bit. But many implementations detect whether it is reliable, then use it as an optimization to minimize latency if reliable.

Yes, it's unfortunate that there may be implementations out there that do not set the RTP 'M' bit correctly.  But to reduce the likelihood of this happening, we should try to make our RTP payload format specifications as clear as is reasonable about when to set the 'M' bit.  (It is common - in non-interactive environments - for RTP transmitting applications to obtain their encoded data as a byte stream.)


>  I don’t think the draft needs any more text beyond what it currently says.

On the contrary - I think you've explained that some additional text would be both useful and reasonable.  (Once again, because we're not MPEG, we should not feel the need to restrict ourselves to fully specifying only the behavior that's required of RTP *receivers*.)

So I suggest adding the following paragraph to the H.265 RTP payload format specification, just after the existing 'M bit' text:

----------
Unfortunately, the contents of a NAL unit alone do not tell an RTP sender implementation whether or not the NAL unit ends an access unit.  Instead, the implementation can obtain this information separately, from the encoder.  If, however, this information is not directly available from the encoder (e.g., because the implementation is sending data that consists solely of a sequence of pre-encoded NAL units), then it must instead inspect the next NAL unit to determine whether or not the current NAL unit ends an access unit.  The following rule can be used:
    The current NAL unit ends an access unit if it is a VCL NAL unit, and if either:
	- the next NAL unit is not a VCL NAL unit, or
	- the next NAL unit is a VCL NAL unit, and the high-order bit of the first byte after its NALU header is 1.
----------
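
To make the rule above concrete, here is a minimal sketch of how a sender might apply it to H.265 NAL units.  It assumes each NAL unit is passed as raw bytes with any start code already stripped, so that it begins with the two-byte H.265 NAL unit header.  The function names are hypothetical, and treating "no next NAL unit" (end of stream) as ending the access unit is my own assumption; the rule itself does not cover that case.

```python
from typing import Optional

# Illustrative sketch only: helper names are hypothetical, not from any spec or library.
H265_NAL_HEADER_LEN = 2  # the H.265 NAL unit header is two bytes long


def h265_nal_unit_type(nal: bytes) -> int:
    """Return the 6-bit nal_unit_type (bits 1..6 of the first header byte)."""
    return (nal[0] >> 1) & 0x3F


def is_vcl(nal: bytes) -> bool:
    """In H.265, VCL NAL units have nal_unit_type in the range 0..31."""
    return h265_nal_unit_type(nal) < 32


def first_payload_bit_set(nal: bytes) -> bool:
    """True if the high-order bit of the first byte after the NAL header is 1."""
    return len(nal) > H265_NAL_HEADER_LEN and bool(nal[H265_NAL_HEADER_LEN] & 0x80)


def ends_access_unit(current: bytes, next_nal: Optional[bytes]) -> bool:
    """Apply the suggested rule to decide whether 'current' ends an access unit."""
    if not is_vcl(current):
        return False
    if next_nal is None:
        # Assumption (not part of the rule): end of stream ends the access unit.
        return True
    if not is_vcl(next_nal):
        return True
    return first_payload_bit_set(next_nal)
```

A sender along these lines would hold back each VCL NAL unit until the next one has been read, then set the 'M' bit on the last RTP packet of the held NAL unit whenever ends_access_unit() returns True.  For H.264 the same structure would apply with a one-byte NAL unit header and a different test for VCL NAL unit types.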

Ross Finlayson
Live Networks, Inc.
http://www.live555.com/