[AVT] Some comments on <draft-flaks-avt-rtp-ac3-03.txt>

Ross Finlayson <finlayson@live.com> Sat, 05 April 2003 02:40 UTC

Date: Fri, 04 Apr 2003 18:36:33 -0800
To: thh@dolby.com, jasonfl@microsoft.com
From: Ross Finlayson <finlayson@live.com>
Cc: avt@ietf.org

I have recently been reviewing (& implementing) the proposed RTP payload 
format for AC-3 Audio <draft-flaks-avt-rtp-ac3-03.txt>, and I have a number 
of comments:

1/ This is now an official AVT Working Group item, correct?  If so, the 
file name of future drafts should have the form "draft-ietf-avt-rtp-ac3-*.txt".

2/ I don't see any real need for the "NDU" (number of data units) byte at 
the front of each RTP payload.  If an incoming RTP packet contains more 
than one 'data unit' (frame), then the receiver already needs to figure out 
the frame boundaries within the packet (presumably by computing the size of 
each frame from its internal header bytes).  In doing this, it can count 
how many frames are present - but knowing this count in advance provides no 
useful information.
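
To illustrate the point, here is a minimal Python sketch of a receiver that finds the frame boundaries and gets the frame count as a side effect.  The name "frame_size_at" is a hypothetical helper (it is not defined in the draft); it stands in for whatever code computes a frame's size in bytes from its internal header.  The "toy_size" function is a made-up stand-in used only so that the example runs.

```python
from typing import Callable, List, Tuple

def split_frames(payload: bytes,
                 frame_size_at: Callable[[bytes, int], int]) -> List[Tuple[int, int]]:
    """Return (offset, length) for each complete frame in the payload.

    frame_size_at(payload, offset) is a hypothetical, caller-supplied
    function that returns the size in bytes of the frame starting at offset.
    """
    frames = []
    offset = 0
    while offset < len(payload):
        size = frame_size_at(payload, offset)
        if size <= 0 or offset + size > len(payload):
            raise ValueError("frame at offset %d does not fit in payload" % offset)
        frames.append((offset, size))
        offset += size
    return frames

# Toy stand-in, for illustration only: the first byte of each "frame"
# holds that frame's total length.  Real AC-3 headers do not work this way.
def toy_size(payload: bytes, offset: int) -> int:
    return payload[offset]

frames = split_frames(bytes([3, 0, 0, 2, 0]), toy_size)
print(len(frames))  # the count falls out of the walk itself
```

The receiver has to do this walk anyway to locate each frame, so a count carried at the front of the payload tells it nothing new.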

3/ The rules for how fragmented frames are packed into RTP packets are not 
made clear.  In particular, can an RTP packet legally contain the following:
         - n (>=1) complete frames, followed by an initial fragment of a
                 following frame?
or
         - the trailing fragment of a frame, followed by one or more complete
                 frames (or the initial fragment of a following frame)?
In most existing RTP payload formats, to simplify the packing rules, 
neither of these cases is allowed.  I.e., most RTP payload formats (that 
support multiple frames per packet, with fragmentation) specify that an RTP 
packet must contain either (i) one or more complete frames, or (ii) exactly 
one partial fragment of a frame (and nothing more).  The document should 
explain exactly what the packing rules are for this payload format.
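
As a rough sketch of the simpler rule used by most other payload formats (not necessarily the rule this draft intends), the check below accepts a packet only if it holds one or more complete frames, or exactly one fragment.  The labels "COMPLETE" and "FRAGMENT" and the function name are hypothetical, chosen just for this example.

```python
from typing import List

# Hypothetical labels describing each data unit in a packet.
COMPLETE = "complete"
FRAGMENT = "fragment"

def obeys_simple_packing(units: List[str]) -> bool:
    """True if the packet holds (i) one or more complete frames,
    or (ii) exactly one fragment and nothing more."""
    if not units:
        return False
    if all(u == COMPLETE for u in units):
        return True
    return units == [FRAGMENT]

print(obeys_simple_packing([COMPLETE, COMPLETE]))  # case (i)
print(obeys_simple_packing([FRAGMENT]))            # case (ii)
print(obeys_simple_packing([COMPLETE, FRAGMENT]))  # first questionable case above
print(obeys_simple_packing([FRAGMENT, COMPLETE]))  # second questionable case above
```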

4/ What value should the "RDT" field (in the Data Unit Header) take if the 
data unit does not contain any redundant data (i.e., if the "TYP" field is 
00)?  Should "RDT" be 000 in this case?  (It probably doesn't matter, but 
it should be specified.)

In general, it's not clear which of the 4x8=32 possible combinations of the 
(2-bit) "TYP" field and the (3-bit) "RDT" field are actually legal.

5/ What value should the "B" bit have if the 'data unit' is a complete 
frame (i.e., not fragmented)?  Presumably the "B" bit should be 1 in this 
case (because it will, of course, contain the 5/8 point).  But the 
description of "B" is a little ambiguous, because it says "set to 1 if the 
packet contains an AC-3 *fragment* consisting of..."

6/ The "F" bit, as currently specified, seems mostly useless.  Consider, 
for example, the case where a single frame is fragmented over three RTP 
packets.  The first packet has
         F==1, M==0
The second packet has
         F==1, M==0
The third packet has
         F==1, M==1
(where M is the RTP header's "M" bit.)  If the first of these packets gets 
lost, then the receiver will not be able to tell whether the second packet 
is the start of a new frame, or (as is actually the case) the middle of a 
frame.  (I suppose it could figure this out by checking for the AC-3 
'syncword' at the start, but then you wouldn't really need the "F" bit at all.)

I suggest that you replace the "F" bit with an "S" (start of frame) 
bit.  This bit would be set to 1 iff the 'data unit' is the start of a 
frame (or a complete frame).  I.e., using the example above, the first 
packet would have
         S==1, M==0
The second packet would have
         S==0, M==0
The third packet would have
         S==0, M==1
(The fourth possible case
         S==1, M==1
would indicate that the packet contains (one or more) *complete* frames.)
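
The difference between the two schemes can be sketched in a few lines of Python.  The function name "classify" and the role strings are hypothetical; only the bit combinations come from the example above.

```python
def classify(s: int, m: int) -> str:
    """Map the proposed S bit and the RTP marker bit M to a packet role."""
    roles = {
        (1, 0): "first fragment of a frame",
        (0, 0): "middle fragment of a frame",
        (0, 1): "last fragment of a frame",
        (1, 1): "one or more complete frames",
    }
    return roles[(s, m)]

# The three-packet example from above, as (bit, M) pairs.
f_scheme = [(1, 0), (1, 0), (1, 1)]   # current "F" bit
s_scheme = [(1, 0), (0, 0), (0, 1)]   # proposed "S" bit

# With "F", the first and second packets carry identical bits, so losing
# the first one leaves the second indistinguishable from a frame start.
print(f_scheme[0] == f_scheme[1])     # True

# With "S", each packet's role can be read from that packet alone.
for s, m in s_scheme:
    print(classify(s, m))
```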

7/ The "T" (time code present) bit doesn't seem particularly useful.  It 
requires that a sending implementation have detailed knowledge of the 
internals of AC-3 frames, which seems unnecessary.  Instead, I suggest 
omitting this bit, and leaving it up to the receiver to figure out - from 
the internals of each frame - whether it contains a time code (because this 
is something that it presumably needs to do anyway).

8/ As noted in 2/ above, each receiver needs to figure out where the 
boundaries between 'data units' are in each incoming RTP packet.  Can the 
receiver easily do this even when the data unit(s) contain 
redundant/interleaved data - i.e., if "TYP" is 01 or 10?  This wasn't clear 
to me, and I think you should provide a more detailed explanation (and 
perhaps even some 'pseudo-code') that describes how a receiver should parse 
incoming packets.

9/ I found the definition of the '5/8 point' a little confusing:
         "...the 5/8ths point is defined as:
                 5/8-framesize = truncate(framesize/2) + truncate(framesize/8)"
When I first saw this, I thought that "5/8-framesize" was part of the 
formula - i.e., with a division and a subtraction operation!  Instead, I 
suggest changing the definition to:
         "...the 5/8ths point, P, is defined as:
                 P = truncate(framesize/2) + truncate(framesize/8)"
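
Written as code, the definition is a one-liner.  This is a minimal Python sketch, assuming "framesize" is a non-negative integer in whatever units the draft uses; the function name is hypothetical.

```python
def five_eighths_point(framesize: int) -> int:
    """P = truncate(framesize/2) + truncate(framesize/8)."""
    return framesize // 2 + framesize // 8

print(five_eighths_point(768))    # 480
print(five_eighths_point(1117))   # 697
print((5 * 1117) // 8)            # 698
```

Note that truncating each term separately can give a different result from truncate(5*framesize/8), as the last two lines show, so it matters that the definition is stated unambiguously.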

10/ Re. Section 4.2: "SDP usage": Having the number of channels be 
specified in a separate "a=fmtp:" line is unorthodox.  It's far more common 
(and actually suggested in the SDP specification) for the number of 
channels to be specified at the end of the "a=rtpmap:" line - e.g.:
         m=audio 49000 RTP/AVP 100
         a=rtpmap:100 ac3/48000/6
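
With this form, a receiver gets the channel count from the same line as the clock rate.  Below is a minimal Python sketch of parsing such a line; the function name "parse_rtpmap" is hypothetical, and the sketch assumes the usual SDP convention that an omitted channel count on an audio stream means one channel.

```python
from typing import Tuple

def parse_rtpmap(line: str) -> Tuple[int, str, int, int]:
    """Parse 'a=rtpmap:<pt> <name>/<clock rate>[/<channels>]'.

    Returns (payload_type, encoding_name, clock_rate, channels).
    """
    prefix = "a=rtpmap:"
    if not line.startswith(prefix):
        raise ValueError("not an rtpmap attribute: %r" % line)
    pt_str, encoding = line[len(prefix):].split(None, 1)
    parts = encoding.strip().split("/")
    if len(parts) < 2:
        raise ValueError("missing clock rate: %r" % line)
    name = parts[0]
    clock_rate = int(parts[1])
    # Assumption: channels default to 1 when the parameter is omitted.
    channels = int(parts[2]) if len(parts) > 2 else 1
    return int(pt_str), name, clock_rate, channels

print(parse_rtpmap("a=rtpmap:100 ac3/48000/6"))
```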

I hope this helps...

         Ross.


