Re: [AVTCORE] Clarifying the H.264 RTP payload format specification's text on when to set the RTP "M" bit

"Wang, Ye-Kui" <yekuiw@qti.qualcomm.com> Thu, 08 August 2013 16:40 UTC

From: "Wang, Ye-Kui" <yekuiw@qti.qualcomm.com>
To: Ross Finlayson <finlayson@live555.com>, "avt@ietf.org" <avt@ietf.org>, "payload@ietf.org" <payload@ietf.org>

I share Mo's opinion that there is no need to add text on this point to the H.264 or the H.265 draft; the existing semantics of the M bit are sufficient.

First, implementers are expected to have basic knowledge of the video coding specification itself. If we tried to explain every coding-level detail in the payload format, there would be a great deal to cover.

Second, as I mentioned in the AVTCORE session in Berlin, the M bit in the H.264 payload format is not reliable, because RFC 6184 supports multi-time aggregation packets (MTAPs). This is no longer an issue in the H.265 payload format, where MTAPs are not supported, so the M bit can be a reliable indication. Of course, bad packetization or sender implementations can make it unreliable, but they can make anything unreliable, so they don't count.

Lastly, the suggested text is too vague and even incorrect. For example, in the part copied below, what is "the current NAL unit"? Also, a non-VCL NAL unit can appear between two VCL NAL units in an AU; the language below would suggest that the first of the two VCL NAL units ends the AU, which is definitely incorrect.
    The current NAL unit ends an access unit if it is a VCL NAL unit, and if either:
            - the next NAL unit is not a VCL NAL unit, or
            - the next NAL unit is a VCL NAL unit, and the high-order bit of the first byte after its NALU header is 1.
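
To make the objection concrete, here is a minimal Python sketch that applies the quoted rule literally to one access unit containing a non-VCL NAL unit between two VCL NAL units. The `Nal` type, the function name, and the sample stream are hypothetical, introduced only for illustration; they are not taken from any specification or implementation.

```python
from typing import List, NamedTuple

class Nal(NamedTuple):
    is_vcl: bool     # True for a VCL (slice) NAL unit
    first_bit: bool  # high-order bit of the first byte after the NALU header

def ends_au_quoted_rule(units: List[Nal], i: int) -> bool:
    """Apply the quoted rule literally to units[i]."""
    cur = units[i]
    if not cur.is_vcl or i + 1 >= len(units):
        # The quoted rule says nothing when no next NAL unit is available.
        return False
    nxt = units[i + 1]
    return (not nxt.is_vcl) or nxt.first_bit

# Assumed example stream: AU 1 carries two slices with a non-VCL NAL unit
# between them, followed by the first slice of AU 2.
stream = [
    Nal(True, True),    # AU 1, first slice
    Nal(False, False),  # AU 1, non-VCL NAL unit between the two slices
    Nal(True, False),   # AU 1, second slice (the real end of AU 1)
    Nal(True, True),    # AU 2, first slice
]

flags = [ends_au_quoted_rule(stream, i) for i in range(len(stream))]
print(flags)  # [True, False, True, False]
```

The first flag is the problem case: the rule reports the first slice of AU 1 as ending the access unit, although a second slice of the same access unit follows the non-VCL NAL unit.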

BR, YK

From: avt-bounces@ietf.org [mailto:avt-bounces@ietf.org] On Behalf Of Ross Finlayson
Sent: Sunday, August 04, 2013 2:52 PM
To: avt@ietf.org
Subject: Re: [AVTCORE] Clarifying the H.264 RTP payload format specification's text on when to set the RTP "M" bit

If you are willing to detect the last NALU of a frame by waiting for the first VCL NALU of the next frame, that is very simple in both H.264 and H.265. The first bit after the VCL NALU header is set if it's the first VCL NALU of a frame. This works in H.265 because the slice header explicitly has such a flag.
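
The check described above can be sketched in Python as follows. This is an illustrative sketch, not text from either payload format. It assumes that the NAL unit is passed without a start code, that H.264 slices arrive in order (so the slice with first_mb_in_slice equal to 0, whose ue(v) coding is a single 1 bit, comes first), and that only H.264 NAL unit types 1, 2 and 5 are examined. The function names and sample bytes are hypothetical.

```python
# H.264 NAL unit types that begin with a slice header (non-IDR slice,
# data partition A, IDR slice). Partitions B and C are skipped in this sketch.
H264_SLICE_HEADER_TYPES = {1, 2, 5}

def is_vcl_with_slice_header(nal: bytes, codec: str) -> bool:
    if codec == "h264":
        # One-byte header: nal_unit_type is the low 5 bits.
        return (nal[0] & 0x1F) in H264_SLICE_HEADER_TYPES
    # H.265 two-byte header: nal_unit_type is bits 1..6 of the first byte;
    # types below 32 are VCL NAL units.
    return ((nal[0] >> 1) & 0x3F) < 32

def starts_new_frame(nal: bytes, codec: str) -> bool:
    """True if this VCL NAL unit has the first bit after its header set."""
    header_len = 1 if codec == "h264" else 2
    if len(nal) <= header_len or not is_vcl_with_slice_header(nal, codec):
        return False
    return bool(nal[header_len] & 0x80)

# Made-up sample NAL units (header bytes plus one payload byte).
samples = [
    (bytes([0x65, 0x88]), "h264"),        # IDR slice, first bit set
    (bytes([0x41, 0x1A]), "h264"),        # non-IDR slice, first bit clear
    (bytes([0x06, 0x80]), "h264"),        # SEI (non-VCL), bit ignored
    (bytes([0x26, 0x01, 0xAF]), "h265"),  # IDR_W_RADL, first bit set
    (bytes([0x02, 0x01, 0x50]), "h265"),  # TRAIL_R, first bit clear
    (bytes([0x40, 0x01, 0x8C]), "h265"),  # VPS (non-VCL), bit ignored
]
results = [starts_new_frame(nal, codec) for nal, codec in samples]
print(results)  # [True, False, False, True, False, False]
```

A sender working from a byte stream would hold back each NAL unit until the next one has been inspected, which is the waiting cost mentioned above.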

Thanks.  This is exactly the kind of information that would be useful for implementers.



Note that RTP receivers are technically forbidden to rely on the marker bit. But many implementations detect whether it is reliable, then use it as an optimization to minimize latency if reliable.

Yes, it's unfortunate that there may be implementations out there that do not set the RTP 'M' bit correctly.  But to reduce the likelihood of this happening, we should try to make our RTP payload format specifications as clear as is reasonable about when to set the 'M' bit.  (It is common - in non-interactive environments - for RTP transmitting applications to obtain their encoded data as a byte stream.)



I don't think the draft needs any more text beyond what it currently says.

On the contrary - I think you've explained that some additional text would be both useful and reasonable.  (Once again, because we're not MPEG, we should not feel obliged to specify only the behavior that's required of RTP *receivers*.)

So I suggest adding the following paragraph to the H.265 RTP payload format specification, just after the existing 'M bit' text:

----------
Unfortunately, the contents of a NAL unit alone do not tell an RTP sender implementation whether or not the NAL unit ends an access unit.  Instead, the implementation can obtain this information separately, from the encoder.  If, however, this information is not directly available from the encoder (e.g., because the implementation is sending data that consists solely of a sequence of pre-encoded NAL units), then it must instead inspect the next NAL unit to determine whether or not the current NAL unit ends an access unit.  The following rule can be used:
    The current NAL unit ends an access unit if it is a VCL NAL unit, and if either:
          - the next NAL unit is not a VCL NAL unit, or
          - the next NAL unit is a VCL NAL unit, and the high-order bit of the first byte after its NALU header is 1.
----------

Ross Finlayson
Live Networks, Inc.
http://www.live555.com/