[AVT] Comments on the major open issue of draft-ietf-avt-rtp-svc-07.txt, cross layer decoding order dependency

<mike.nilsson@bt.com> Fri, 08 February 2008 18:14 UTC

From: mike.nilsson@bt.com
To: roni.even@polycom.co.il, ye-kui.wang@nokia.com, jonathan@vidyo.com, schierl@hhi.fhg.de, csp@csperkins.org, Yann.Leprovost@alcatel-lucent.fr, stewe@stewe.org, tom@vidyo.com, tom.taylor@rogers.com, rjesup@wgate.com
Cc: avt@ietf.org

The problem to be solved can be summarised as follows. The video
encoder, or other source of coded video data, produces a sequence of
chunks of data known as NAL units. These are to be transmitted over two
or more RTP sessions. At the receiver, the data is to be put back into a
single sequence with the same order as in the original sequence from the
encoder or data source. This data is then input to the video decoder and
is decoded and output. There are other variants, where for example, the
receiver is not a decoder, but some other device such as a MANE, but the
core problem of re-establishing the original order of NAL units is the
same.

 

One of the solutions to this problem, the CL-DON solution, allocates a
monotonically increasing sequence number to each of the NAL units from
the encoder, transports these numbers through the network, and uses them
to re-establish the original order of NAL units. The NAL units received
on the multiple RTP sessions are simply ordered according to this
monotonically increasing sequence of numbers.
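
As a rough illustration of how little the receiver has to do in this
case, here is a minimal Python sketch. The NalUnit class and its field
names are hypothetical and are not taken from the draft, and wrap-around
of the number space is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NalUnit:
    don: int              # hypothetical field: cross-layer decoding order number
    session: int          # index of the RTP session the unit arrived on
    payload: bytes = b""  # NAL unit bytes (unused in this sketch)


def restore_order_cl_don(sessions):
    """Merge NAL units from several RTP sessions into decoding order.

    Assumes the decoding order number does not wrap within the window
    being sorted.
    """
    merged = [nal for session in sessions for nal in session]
    return sorted(merged, key=lambda nal: nal.don)
```

The order in which the sessions are passed in does not matter, since
only the transported number is used.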

 

The other solution to this problem, the classical solution, when
operated according to the rules in the current version of the draft,
works as follows. The NAL units from the encoder are grouped into access
units and associated with a non-monotonic number (the timestamp, which
represents output (display) order rather than decoding order).
Effectively, the NAL units are being labelled with (almost) arbitrary
labels. These labelled NAL units are then separated into multiple RTP
streams, and a monotonically increasing sequence number is applied
independently in each RTP stream. Note that both of these steps are also
performed in the CL-DON solution, but there they do not have to be used
to restore decoding order. At the receiver, the independent
monotonically increasing sequence numbers are used to re-order packets
in each RTP stream. These are then grouped according to label
(timestamp) in each RTP stream. Then the NAL units from the lower layers
are "merged" with the NAL units in the highest enhancement layer,
grouping together NAL units with the same label (timestamp). Finally,
SEI NAL units must be moved to the start of each group (access unit), if
they were transmitted anywhere other than the base RTP session.
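
The receiver steps just described might be sketched in Python as
follows. This is only my reading of the procedure, not text from the
draft: the (sequence number, timestamp, NAL type) tuples and the "SEI"
string are hypothetical representations, and sequence number wrap-around
is ignored.

```python
from itertools import groupby


def group_by_timestamp(packets):
    """Re-order one RTP stream by sequence number, then group by timestamp.

    packets: list of (seq, timestamp, nal_type) tuples for one stream.
    Returns a list of (timestamp, [nal_type, ...]) in stream order.
    """
    ordered = sorted(packets, key=lambda p: p[0])
    return [(ts, [p[2] for p in grp])
            for ts, grp in groupby(ordered, key=lambda p: p[1])]


def merge_classical(streams):
    """streams: base layer first, highest enhancement layer last."""
    grouped = [group_by_timestamp(s) for s in streams]
    by_ts = [dict(g) for g in grouped]
    # The highest layer alone dictates the order of access units.
    top_order = [ts for ts, _ in grouped[-1]]
    access_units = []
    for ts in top_order:
        nals = [n for layer in by_ts for n in layer.get(ts, [])]
        sei = [n for n in nals if n == "SEI"]
        rest = [n for n in nals if n != "SEI"]
        access_units.append((ts, sei + rest))  # SEI moved to the start
    return access_units
```

Note that any timestamp missing from the highest layer never appears in
top_order, so the lower layer data for that instant is silently dropped.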

 

This suffers from the need for the highest layer to have NAL units at
every time instant for which there is a NAL unit in any lower layer.
And, because this process must work regardless of how many of the RTP
sessions are received, the same has to apply to any layer with regard to
the layers below it. While this can be overcome by inserting filler data
NAL units, it does seem to have a problem with packet loss, as this
condition cannot be guaranteed after loss. Given that the highest layer
may often be transmitted with the least error protection, this is a
major limitation of this approach.

 

However, the classical solution can be operated in a different way at
the receiver to overcome this limitation, at the cost of additional
complexity. As before, at the receiver, the independent monotonically
increasing sequence numbers are used to re-order packets in each RTP
stream, and then these are grouped according to label (timestamp) in
each RTP stream. Then the sequences of labels (timestamps) in each
stream can be analysed, and in many (but not all) cases, the decoding
order of the labels (timestamps) can be deduced, and then used to
restore the decoding order.

 

In the example below, the top two RTP sessions operate at a given frame
rate and the base layer is operating at half the frame rate. Packet loss
has affected one access unit in the top layer and one access unit in the
middle layer.

 

 Top layer:     4     1  3  8  6  5  7 12 10
 Middle layer:  4  2  1  3  8     5  7 12 10
 Base layer:    4  2        8  6       12 10

 

However, decoding order can be restored by noticing from the middle
layer that NAL units with label=2 are to be decoded before those with
label=1. Similarly, the top layer tells us that NAL units with label=6
are to be decoded before those with label=5.
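
One way to mechanise this deduction, offered only as a sketch, is to
treat each pair of consecutive labels in each stream as a "decode
before" constraint and take a topological order of the result. The
function name is hypothetical, labels are assumed to be unique within
the window analysed, and timestamp wrap-around is ignored.

```python
def deduce_decoding_order(streams):
    """streams: per-session label sequences, each in sequence-number order.

    Returns (order, ambiguous). ambiguous is True when the constraints
    allow more than one order, in which case order is only one candidate.
    """
    successors, indegree = {}, {}
    for seq in streams:
        for label in seq:
            successors.setdefault(label, set())
            indegree.setdefault(label, 0)
        for before, after in zip(seq, seq[1:]):
            if after not in successors[before]:
                successors[before].add(after)
                indegree[after] += 1
    ready = sorted(l for l, d in indegree.items() if d == 0)
    order, ambiguous = [], False
    while ready:
        if len(ready) > 1:
            ambiguous = True  # two labels with no known relative order
        label = ready.pop(0)
        order.append(label)
        for nxt in sorted(successors[label]):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    return order, ambiguous


top = [4, 1, 3, 8, 6, 5, 7, 12, 10]
middle = [4, 2, 1, 3, 8, 5, 7, 12, 10]
base = [4, 2, 8, 6, 12, 10]
print(deduce_decoding_order([top, middle, base]))
# ([4, 2, 1, 3, 8, 6, 5, 7, 12, 10], False)
```

For the example above the constraints pin down a single order, so the
ambiguous flag is not set.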

 

But if both middle and top layers lost their NAL units with label=2, as
shown below, it would be more difficult to re-establish decoding order,
because from the RTP and payload layers alone it is not possible to
determine whether label=2 comes before or after label=1. It may be
possible to determine the order by looking into pic_timing SEI messages,
if present (not guaranteed), or a best guess could be made by making
assumptions based on previous GOP structures (and the order of
timestamps). Alternatively, it may be better to discard all NAL units
with labels 1 and 3 rather than to risk feeding data to the decoder in
the wrong order.

 

 Top layer:     4     1  3  8  6  5  7 12 10
 Middle layer:  4     1  3  8     5  7 12 10
 Base layer:    4  2        8  6       12 10
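
The discard option could be sketched in Python along the following
lines. The rule used here is my own assumption, not something from the
draft: a label that is absent from the base layer is dropped whenever
its order relative to some other received label cannot be derived from
the per-stream sequences. Base layer labels are never dropped.

```python
def reachable(successors, start):
    """Return every label known to be decoded after `start`."""
    seen, stack = set(), [start]
    while stack:
        for nxt in successors.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def labels_to_discard(streams):
    """streams: per-session label sequences, base layer last."""
    successors = {}
    for seq in streams:
        for before, after in zip(seq, seq[1:]):
            successors.setdefault(before, set()).add(after)
    labels = {label for seq in streams for label in seq}
    later = {label: reachable(successors, label) for label in labels}
    base = set(streams[-1])
    discard = set()
    for a in labels - base:
        for b in labels - {a}:
            # Neither label is known to precede the other.
            if b not in later[a] and a not in later[b]:
                discard.add(a)
                break
    return discard


top = [4, 1, 3, 8, 6, 5, 7, 12, 10]
middle = [4, 1, 3, 8, 5, 7, 12, 10]
base = [4, 2, 8, 6, 12, 10]
print(labels_to_discard([top, middle, base]))  # {1, 3}
```

With labels 1 and 3 removed, the remaining labels have a single
consistent order, 4 2 8 6 5 7 12 10.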

 

My conclusion is that while using a non-monotonic set of numbers
(timestamps) to re-establish decoding order is possible in many but not
all cases, it is a fairly complex process, particularly if it is to make
best use of all packets received when some are lost, as in the second
method above. And in practice I feel that the second method would be
implemented because the performance of the first in the case of packet
loss could be unacceptably poor.

 

The major weakness of the CL-DON method is that it is not backwards
compatible with the single NAL unit mode of RFC 3984.

 

One way to overcome this would be to use some backward compatible
mechanism to transport the CL-DON information in the base RTP session
operating in single NAL unit mode. The RTP header extension mechanism is
one way that this could be done, but I know that there are objections to
doing this.

 

However, the single NAL unit mode was introduced into RFC 3984 primarily
"for low-delay applications that are compatible with systems using ITU-T
Recommendation H.241". 

 

Hence, if there is a need for backwards compatibility with the single
NAL unit mode, and this is itself very debatable, then this need would
seem to be restricted to low delay applications, where it is unlikely
that access units would be encoded in a different order to output
(display) order.

 

Consequently, a solution to the whole problem of restoring the decoding
order of NAL units is to define a class of receiver that supports the
full CL-DON method, and the classical method restricted to cases where
the timestamps are monotonically increasing. This restricted case of the
classical method is much simpler to implement than the general case, and
provides backwards compatibility with the intended applications of the
single NAL unit mode.
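
To indicate why the restricted case is simpler, here is a minimal Python
sketch under my own assumptions: each stream is reduced to its list of
access unit timestamps in sequence-number order, the function name is
hypothetical, and timestamp wrap-around is ignored. Because decoding
order then equals timestamp order, no analysis across streams is needed.

```python
def merge_monotonic(streams):
    """streams: base layer first; each a list of access unit timestamps.

    Returns (timestamp, layer) pairs in decoding order, lower layers
    first within each access unit.
    """
    for ts_list in streams:
        if any(a >= b for a, b in zip(ts_list, ts_list[1:])):
            raise ValueError("timestamps are not monotonically increasing; "
                             "the restricted method does not apply")
    tagged = [(ts, layer)
              for layer, ts_list in enumerate(streams)
              for ts in ts_list]
    return sorted(tagged)
```

A lost access unit in any layer simply leaves a gap, and the order of
everything else is unaffected.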

 

 

Best regards

 

Mike

 

Mike Nilsson 
Multimedia Analysis and Coding
BT Group Chief Technology Office
___________________________ 

Sirius House (B54-MH), Room 92 
Adastral Park, Martlesham Heath, Ipswich, IP5 3RE, UK 
Tel:    +44 1473 645413 
Mobile: +44 7917 025433 
Fax:    +44 1908 862365 
Email:  mike.nilsson@bt.com
