[AVT] Final minutes for the AVT meeting in Atlanta

Audio/Video Transport Working Group Minutes

Reported by Stephen Casner and Colin Perkins

   The AVT working group met in two sessions at the 55th IETF meeting in
   Atlanta.  In the first session, the group discussed RTP payload
   formats for MIDI, DTMF digits and tones, iLBC speech, ATRAC-X audio,
   and uncompressed video.  The session ended with an important
   discussion of the issues to be resolved for IESG approval of Secure
   RTP.  In the second session, the discussion focused on RTP payload
   formats for MPEG-4 and JVT video plus RTCP extensions for voice
   quality reporting and for SSM sessions, and RTP retransmission.  A
   bonus topic on the RGL codec and payload format was squeezed in at the
   end.

Introduction, Document Status, and Open Issues

   This meeting began with an update by Steve Casner on document
   publication status, including a few issues identified for documents in
   the queue.  One RFC was published since the last meeting (RFC 3389 on
   Comfort Noise payload format), two are in the RFC editor queue (the
   MIME registration for the payload formats in the RTP profile and the
   SDP bandwidth modifiers for RTCP bandwidth, both blocked on the RTP
   specification), and seven are with the IESG.  Of the latter, two are
   the RTP specification and A/V profile (revisions of RFC 1889 and 1890)
   which have been "tentatively approved".  Final approval is pending
   preparation of a set of "RFC Editor notes" to be passed with the
   drafts to the RFC Editor to implement the changes requested by the
   IESG and the resolution of comments by the working group while the
   documents have been under IESG review.  Steve Casner will prepare
   these notes for approval by the Area Directors.

   Some of those RFC Editor notes implement the resolution of an issue
   with the RTP A/V profile that was raised just before the previous
   (54th) IETF meeting.  This was a request to change the sample packing
   order for G.726 audio encoding to be consistent with the packing order
   for ATM AAL2 transport as specified in ITU-T Recommendation I.366.2
   Annex E.  A request for comments on the proposal to make this change
   was sent to at least ten relevant mailing lists in IETF and ITU-T.
   The number of comments was surprisingly small, which indicates that
   there may not be many implementations of G.726 transport in RTP.
   However, the comments did indicate that both packing orders are in use
   and that there are parties opposed to making the change in addition to
   those who proposed the change.

   The conclusion reached by the chairs in consultation with the Area
   Directors is that we need to define MIME subtypes for two payload
   formats reflecting the two packing orders.  We generally prefer not to
   have multiple choices because of the risk of incompatibility that
   imposes, but we are forced into it in this case by an incompatibility
   that already exists.  Furthermore, both packing orders are specified
   in separate areas of ITU-T (AAL2 and X.400 mail).  In order to make
   clear the incompatibility between the existing G726-* payload formats
   and the AAL2 packing, we will add a note in the A/V profile section
   that specifies those formats to note the incompatibility and say that
   a second set of payload formats named AAL2-G726-* will be specified in
   a separate document.  Or, if the IESG agrees, the AAL2-G726-*
   specification will be added as a new section in the profile.  One
   problem would be that the profile is to be published as a Draft
   Standard, which means there should first be two interoperable
   implementations.  Alternatively, a separate draft can be produced
   quickly to be published as a Proposed Standard.

   Flemming Andreasen asked why not make the existing G726-* names
   indicate the AAL2 packing and make up new names for the existing
   payload formats for RTP.  The primary justification for that approach
   would be if most implementations used the existing name to indicate a
   payload format with the same packing as ITU-T I.366.2.  That appears not
   to be the case.  The real issue is not the name but the interpretation
   of static payload type 2, which is assigned to G726-32, since most
   implementations are probably using the 32 kbit/s rate and the static
   payload type rather than the MIME name.  This incompatible
   interpretation exists and can't be avoided.  Consequently, we will
   deprecate the use of static payload type 2.  All systems should
   negotiate a dynamic payload type using the MIME subtypes G726-32 or
   AAL2-G726-32 depending upon which packetization they want to use.  A
   longer summary of the comments and the details of the conclusion was
   posted by the chairs to the AVT mailing list on November 14, just
   before this meeting.
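
   To make the incompatibility concrete, here is a minimal sketch of
   the two packing orders for the 4-bit codewords of the 32 kbit/s rate.
   The function name is hypothetical, and the assignment of orders to
   names (first codeword in the low-order bits for G726-32, in the
   high-order bits for AAL2-G726-32) is an assumption that should be
   checked against the profile and I.366.2 Annex E.

```python
def pack_g726_32(codes, aal2=False):
    """Pack 4-bit G.726 codewords two per octet.

    aal2=False: first codeword goes in the least significant nibble
                (order assumed here for the existing G726-32 format).
    aal2=True:  first codeword goes in the most significant nibble
                (order assumed here for AAL2-G726-32).
    """
    if len(codes) % 2:
        raise ValueError("need an even number of 4-bit codewords")
    out = bytearray()
    for first, second in zip(codes[0::2], codes[1::2]):
        if not (0 <= first <= 15 and 0 <= second <= 15):
            raise ValueError("codewords must fit in 4 bits")
        if aal2:
            out.append((first << 4) | second)
        else:
            out.append((second << 4) | first)
    return bytes(out)


samples = [0x1, 0x2, 0x3, 0x4]
rtp_order = pack_g726_32(samples)               # octets 0x21 0x43
aal2_order = pack_g726_32(samples, aal2=True)   # octets 0x12 0x34
```

   The same four codewords produce different octets, so a receiver that
   assumes the other order decodes every sample pair swapped.  This is
   why a single static payload type cannot safely cover both.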

   Five other drafts have been submitted to the IESG but not yet accepted
   for publication.  These include enhanced CRTP and TCRTP, the secure
   RTP profile, the payload format for EVRC/SMV speech, and the payload
   format for distributed speech recognition.  Our Area Director Allison
   Mankin asked for some changes on ECRTP and TCRTP; revisions were
   submitted.  Discussion of the issues for the secure RTP profile is
   covered later in these minutes.

   Several drafts are in (extended) working group last call.  The RTCP
   feedback profile draft was updated for this meeting to address
   comments from the last call, but the authors did not have time to
   complete a "wording cleanup" pass they want to do, so we will wait for
   that and give the WG a last chance to read it before passing it on to
   the IESG.  Steve Casner asked for the feedback simulation draft to be
   updated and resubmitted so it can accompany the feedback profile as
   an Informational RFC to help convince the IESG that congestion control
   issues have been properly addressed.  José Rey said he would try to do
   this.  The MPEG-4 payload format has been revised to address comments
   regarding the section on interleaving; those were discussed in the
   second AVT session.  Two drafts specify unequal error protection: the
   ULP and UXP FEC mechanisms.  At the previous AVT meeting, Steve Casner
   requested that the ULP draft be changed to update and replace RFC 2733
   FEC rather than extend it.  The motivation is to correct an
   unfortunate design choice in RFC 2733 resulting in the X, P and CC
   bits in the RTP header not following the usual rules (these bits are
   the XOR of the bits in the protected packets instead) and thus
   requiring a special case for header validation.  A new
   draft-ietf-avt-ulp-07.txt was submitted in response to this request,
   but the new design repeats all of the RTP header in the FEC payload so
   the overhead is too large at 7 octets.  It may be possible to just
   insert the problem bits into the FEC header by reducing the mask size
   instead.  This will be discussed with the authors, and others are
   asked to comment as well.  Finally, the SMPTE 292 video draft
   completed last call in October but needed a few tweaks to the security
   considerations and references.
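
   The RFC 2733 header problem mentioned above can be sketched as
   follows.  This is an illustration only: the helper names are
   hypothetical, and only the first octet of the RTP header (V, P, X,
   CC) is modeled.

```python
from functools import reduce


def rtp_first_octet(version=2, padding=0, extension=0, csrc_count=0):
    """Build the first octet of an RTP header: V(2) P(1) X(1) CC(4)."""
    return (version << 6) | (padding << 5) | (extension << 4) | csrc_count


def xor_p_x_cc(first_octets):
    """XOR the P, X and CC bits (low six bits) of the protected packets."""
    return reduce(lambda a, b: a ^ b, (o & 0x3F for o in first_octets), 0)


# Three protected media packets with differing header bits.
protected = [
    rtp_first_octet(csrc_count=1),
    rtp_first_octet(csrc_count=2),
    rtp_first_octet(extension=1),
]

fec_bits = xor_p_x_cc(protected)
fec_cc = fec_bits & 0x0F        # FEC header claims 3 CSRC entries
fec_x = (fec_bits >> 4) & 1     # FEC header claims a header extension
```

   The FEC packet itself carries neither CSRC entries nor an extension,
   so a generic RTP header validator would reject it unless it knows to
   treat the FEC payload type as a special case.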

   Steve also mentioned one new document that is not otherwise on the
   agenda: draft-kreuter-avt-rtp-clearmode-00.txt, a CLEARMODE payload
   format that is just the same as PCMU (G.711) audio except that the
   bits carry ISDN data rather than audio.  A question is what media type
   should be used in an SDP description since the bits are not
   necessarily audio.  There is also the possibility of charter overlap
   with the PWE3 working group.  Comments are requested.

MIDI Wire Protocol Packetization (MWPP)

   Colin Perkins, sitting in for John Lazzaro, gave an update on the MIDI
   Wire Protocol Packetization (draft-ietf-avt-mwpp-midi-rtp-05.txt).
   This revision incorporates many changes reflecting WG comments (the
   change log itself is 2 pages).  There are about 20 open issues
   remaining, however; John plans a -06 revision early in the New Year to
   list those issues and proposed resolutions, and then a -07 revision to
   incorporate the consensus and be ready for working group last call.
   This normative draft on the payload format is now accompanied by a new
   informative draft intended as an implementer's guide for MWPP.  It
   includes a walk-through of sample coding techniques intended to help
   those in the MIDI community who are totally unfamiliar with RTP
   applications.  The new draft is not finished; comments are requested
   on the approach and what should be added or removed.

   In parallel with the document preparation, a reference implementation
   of MWPP in the sfront program is tracking the spec for validation.
   The MIDI Manufacturers Association has also provided comments and
   positive feedback on the MWPP work.  John has also been contacted by
   an IEEE WG that is forming to develop transport of MIDI directly over
   Ethernet (without IP).  He asks whether there are any standards or
   work on using RTP, SDP, RTSP, and SIP in that mode.  Anyone with
   information should let us know.  One answer would be, "Don't do that."

RTP end-to-end liveness test

   Henning Schulzrinne presented a topic resulting from a discussion on
   the mailing list:  Flemming Andreasen had asked whether the RFC 2833
   tones payload format could be extended to include an active end-to-end
   liveness test (an RTP "ping").  The purpose is to detect problems
   above the IP level that might be induced by NATs or firewalls; some
   risks are that the function could be used for DoS attacks or result in
   multicast implosion.  One solution, which doesn't require anything
   new, is to just rely on RTCP reception reports.  A dummy RTP packet,
   perhaps with no payload, can be sent if no real traffic is being sent.
   RTCP already accommodates multicast scaling, although the consequence
   is that the RTCP response is not immediate.  The delay is probably not
   an unreasonable wait.  Not all receivers implement RTCP, but you can
   distinguish that case from a problem in the RTP forward path by
   whether you don't get any RTCP at all or just don't get an RR
   indicating receipt of the RTP packet.  A second solution involves
   signaling (e.g., in SDP) an RTP "ping" capability, then sending a
   special type of RTP packet that would elicit a response packet sent to
   a signaled address or to the source address/port of the request
   packet.  But this solution poses the potential for DoS and implosion
   problems requiring complicated solutions some of which are already in
   RTCP.  That's likely a killer.
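
   The RTCP-based check in the first solution might be sketched as
   below.  The names are hypothetical and the two inputs are assumed to
   be gathered by the sender after waiting a few RTCP intervals; this is
   not taken from any draft.

```python
from enum import Enum


class Liveness(Enum):
    PATH_OK = "an RR reports receipt of our RTP packets"
    FORWARD_PATH_SUSPECT = "RTCP arrives, but no RR covers our packets"
    NO_RTCP = "no RTCP at all; the receiver may not implement it"


def classify(rtcp_seen, rr_covers_our_packets):
    """Classify the session from what the sender observed via RTCP."""
    if not rtcp_seen:
        # Cannot tell a non-RTCP receiver from a dead peer or return path.
        return Liveness.NO_RTCP
    if not rr_covers_our_packets:
        # The peer is alive and sending RTCP, yet never saw our RTP.
        return Liveness.FORWARD_PATH_SUSPECT
    return Liveness.PATH_OK


result = classify(rtcp_seen=True, rr_covers_our_packets=False)
```

   Only the middle case points specifically at the RTP forward path,
   which is the distinction drawn in the discussion above.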

   Flemming favors the RTCP solution, but wants faster response in the
   case of success.  Could the RTCP interval be reduced?  Steve Casner
   responded that the RTCP feedback timing rules would be appropriate.
   Dave Oran asked why we need a dummy packet, why not just send comfort
   noise?  Magnus Westerlund pointed out that works fine for audio
   sessions.  For others, an empty payload may be needed.  Flemming
   confirmed this because in some SIP scenarios early media packets can
   cut off ringback tone.  Dave continued that this was all started by
   people who don't do RTCP... they should just do it!  Roni Even said
   monitoring RTCP is important because if the other side dies, there may
   be no other indication that the packets go into a black hole.  Maybe
   this just needs a hints-for-implementers document.  Henning will put
   the discussion on his RTP web page.

RTP Payload for DTMF Digits, Tones and Signals

   Henning Schulzrinne discussed draft-ietf-avt-rfc2833bis-02.txt, which
   updates the payload format for DTMF tones in RFC 2833.  This payload
   format transports DTMF and other tones in the form of named events as
   an alternative to encoding the tone waveform with low fidelity when a
   high-compression codec is in use.  There is also a second mode in
   which tones are specified by their component frequencies.  An amazing
   amount of email has been received with comments and requests for
   additions, so many people must be implementing and using this payload
   format.  The changes from the -00 revision are:

     - Addition of a formal notion of state to clarify that signals such
       as on/off-hook and the ABCD bits used on T1 trunks represent sets
       of states out of which only one can be active.  Also, the notion
       of soft state was added for signals that reset to default value
       after a period of time.

     - Clarification that events longer than the maximum duration (about 8
       seconds for 8 kHz RTP clock) can be expressed as the concatenation
       of multiple events.

     - Clarification of which tones can meaningfully have a volume
       specified.

     - Addition of a few data tones and clarification of the meaning and
       naming of ANS signals.
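
   The maximum-duration point in the list above follows from the 16-bit
   duration field of the payload format.  A minimal sketch of splitting
   a long event into back-to-back segments, with a hypothetical helper
   name and an assumed 8 kHz clock:

```python
MAX_DURATION = 0xFFFF   # 16-bit duration field, in RTP timestamp units
CLOCK_HZ = 8000         # assumed RTP clock rate


def split_event(total_units):
    """Split a long event into segments that each fit in 16 bits."""
    if total_units <= 0:
        raise ValueError("duration must be positive")
    segments = []
    remaining = total_units
    while remaining > 0:
        seg = min(remaining, MAX_DURATION)
        segments.append(seg)
        remaining -= seg
    return segments


max_seconds = MAX_DURATION / CLOCK_HZ          # about 8.19 seconds
twenty_seconds = split_event(20 * CLOCK_HZ)    # [65535, 65535, 28930]
```

   A 20-second tone therefore needs three concatenated events at this
   clock rate.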

   Colin Perkins expressed concern that the state additions may be
   introducing too much application semantics into the protocol.  Henning
   responded that the concern is understood, but that for the few cases
   that exist the semantics are already fixed.

   There are two open issues.  The first is that some signals (in
   particular MF R1 signals) have acquired different names or
   descriptions over the decades, some of which are not even documented
   well by the ITU, so help is requested to supply definitive references
   for the complete and correct text.  The second issue is more
   significant.  Some (potential) users of the payload format want to
   pass the signals required for fax setup and negotiation, but this
   involves a non-trivial number of bits sent as 300-baud V.21 modem
   data.  Sending these bits as a sequence of tones is very inefficient
   at one symbol (tone) per packet.  This could be improved in various
   ways, but any significant improvement would require redefining the
   fields of the payload format to be interpreted differently.  There is
   a real concern that we're slipping down a dangerous slope of
   mission creep: this is not a signaling protocol.  The purpose of this
   payload format is to convey tones with more fidelity than low-rate
   codecs can provide, and to allow the receiver to avoid the need to
   implement tone detection for some scenarios.  Do we want to support a
   full-featured fax negotiation as a sequence of named events?  Or
   should we say that if you want to do fax you should do T.38 or
   whatever else might be appropriate, and deprecate what is in RFC 2833
   for V.21 now?  Either extend to do the whole job in a reasonable way,
   or don't do it at all.
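
   A rough calculation shows how inefficient one V.21 symbol per packet
   is.  The figures below are assumptions for illustration: IPv4 with no
   header compression, a single 4-octet named-event block per packet,
   and no redundancy or end-of-event retransmissions.

```python
V21_BAUD = 300                      # V.21 carries one bit per symbol
IP_UDP_RTP_HEADER = 20 + 8 + 12     # IPv4 + UDP + RTP header octets
EVENT_PAYLOAD = 4                   # one named-event block

packets_per_second = V21_BAUD       # one symbol (tone) per packet
octets_per_packet = IP_UDP_RTP_HEADER + EVENT_PAYLOAD
bits_on_wire = packets_per_second * octets_per_packet * 8
expansion = bits_on_wire / V21_BAUD
```

   Under these assumptions 300 bit/s of modem data costs 105,600 bit/s
   on the wire, a 352-fold expansion, more than a 64 kbit/s PCM voice
   channel.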

   Jim Rafferty, who has participated in IP-FAX standards work, commented
   that T.38 has its pluses and minuses.  A number of people in ITU might
   be interested in an RTP-based alternative to T.38, but he questioned
   whether it is worth doing at this point in time.  Flemming Andreasen
   agreed that this payload format shouldn't be a new way of sending fax,
   but there is a strong need for it in the initial phases of call
   establishment (V.8, V.8bis, V.25), and most of these signals are sent
   using V.21.  Steve Casner took off his chairman's hat to express the
   opinion that we should do nothing more than provide for the sending of
   tones.  If it is feasible for some applications to send each bit of
   V.21 data as a tone in one RFC 2833 packet, that's fine, but we should
   do nothing to provide a higher-density representation.  Flemming asked
   for a review of the code points that are included; the CI signal is
   there, but TM and JM are not, and might be useful.

   Henning said the important point is to get this work completed, and
   that requires interop testing to allow advancing to Draft Standard.
   The number of points in the matrix is large, including features such
   as redundancy; plus for each codepoint the matrix needs to state what
   it means to be supported.  One attendee indicated that the tones
   portion of the draft (specified by frequencies) has been implemented,
   but a second would be needed for interop testing.  Robert Sparks has
   posted an initial draft of the matrix and others have volunteered to
   help.  They plan to gather as much interop input as possible at SIPit,
   but for those who are not going to be there, please send interop input
   to Robert Sparks (see draft-sparks-avt-2833-interop-00.txt).

   An attendee asked if other forms of DTMF-represented coding can be
   added, e.g. some signaling supplementary services as defined by
   Telcordia related to voiceband data transmission.  Henning replied
   that there is more room to add tones that fit the design of the
   payload format.  If there is something that exists now, and
   preferably is already implemented since we want to get to Draft
   Standard, send the info: common name, succinct description, and
   citable reference.  However, the list is intended to be extendible
   after the draft is published; there is an IANA registration mechanism.

Payload format for iLBC Speech

   Alan Duric presented an update of two drafts on the iLBC speech codec
   and its associated payload format, in draft-ietf-avt-ilbc-codec-00.txt
   and draft-ietf-avt-rtp-ilbc-00.txt, respectively (each was preceded
   by two revisions as individual submissions).  Extensive changes were
   made to the iLBC codec since last meeting.  The number of bits per
   frame was reduced from 416 to 399 bits to fit in 50 bytes while at the
   same time the quality was improved and the complexity was significantly
   reduced (to less than G.729a).  A/B tests by the authors and by third
   parties confirmed the quality improvement which derives from the
   addition of a 57th sample in the quantized residual state and an
   increase in the number of bits allocated to gain (utilizing bits freed
   elsewhere).  A demo SIP client with the iLBC codec is available by
   request from alan.duric@globalipsound.com.
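
   The frame-size arithmetic can be checked with a few lines.  The 30 ms
   frame duration is an assumption not stated in these minutes (only a
   possible 20 ms option is mentioned), so the resulting bit rate is
   indicative only.

```python
import math

FRAME_BITS = 399          # bits per encoded frame after the revision
FRAME_MS = 30             # assumed frame duration

payload_octets = math.ceil(FRAME_BITS / 8)        # 50 octets
pad_bits = payload_octets * 8 - FRAME_BITS        # 1 unused bit
bit_rate = payload_octets * 8 * 1000 / FRAME_MS   # about 13.33 kbit/s
```

   The single unused bit is the "400th bit" raised in the discussion
   that follows.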

   Steve Casner asked why the 400th bit should not be used for something
   more than setting it to zero.  Alan replied that several ideas have
   been proposed and that these will be sent on the mailing list.  Steve
   also commented that the codec seems to be still changing a lot.  We
   don't want to progress this until the codec format has stabilized.
   Alan responded that no further changes are expected on the codec
   itself.  This round of changes completes the work on reduction of
   frame size and complexity as planned.  Plans for a 20ms frame option
   may be dropped because the need does not appear strong.  Comments on
   that are requested.  Work on voice activity detection is ongoing; this
   may be paired with the RFC 3389 Comfort Noise.  That work is expected
   to be completed in time for interop testing planned for the next SIPit
   in February.

   Steve asked whether the sorting of bits for ULP is intended to be
   applied across frames, because the payload format draft is not clear
   on this.  The answer is yes.  Steve said that is appropriate
   (otherwise the sorting does no good), but it is a lot of work which
   gives no advantage in environments without ULP at lower layers.  It
   may be necessary to allow both sorted and unsorted modes as in the AMR
   codec.  We'd like feedback from implementers about the cost and
   utility of the ULP sorting.

   Alan asked about the possibility of adding another document giving the
   qualification criteria for the codec.  Steve replied that this would
   need to be standards track to be effective, but the status even of
   standardizing the codec itself is still not entirely clear.  Generally
   IETF avoids conformance testing.  Stephan Wenger asked when the
   general issue of standardizing media codecs in IETF will be resolved.
   Steve Casner replied that, although the Transport ADs were consulted
   and were in favor of this work before it started, we won't know the
   answer for sure until the work is submitted to the whole IESG for
   approval.

RTP Payload Format for ATRAC-X

   Matthew Romaine presented a new payload format for ATRAC-X audio in
   draft-hatanaka-avt-rtp-atracx-00.txt.  Sony's ATRAC family of
   perceptual codecs is used in MDs and solid-state recorders.  The -X
   version supports multiple channels in a wide range of data rates from
   8kbps to 1.4Mbps.  The payload format supports multiplexing of
   multiple streams and metadata within a single session, redundant data
   to mitigate packet loss, and fragmentation.  The draft details the
   segmentation of streams into segments and the association of segments
   from different streams in the same time slot.  Two open issues were
   identified; the first was how to manage the allocation of metadata
   identifiers.  Some appropriate body could assign static identifiers, as
   is done in MIDI, or the assignments could be a dynamic free-for-all.
   There was no input on this.  The second issue is the determination of
   the RTP timestamp: the draft currently specifies transmit time, but it
   has already been pointed out that a presentation (sampling) timestamp
   is needed to allow synchronization with other streams.  The problem is
   that a single session might carry multiple sampling rates.  Steve
   Casner offered the example of MPEG audio in which the timestamp clock
   rate is always 90kHz synchronized to the sampling clock, which may
   vary in rate.  Could a similar arrangement be used here?  Magnus
   Westerlund suggested that if different rates are needed, perhaps
   different RTP sessions should be used.

   Steve Casner asked why the multiplexing of streams is built into the
   payload format rather than using multiplexing at the UDP/RTP level.
   Is the format derived from something already in use on MD or other
   media and therefore hard to change, or is it a new design that is part
   of the payload format and therefore open to discussion?  Matthew
   responded that the format was developed with streaming in mind; it is
   supposed to be extensible.  Multiple bit rates are supported for
   scalable QoS, and they have specified multi-channel configurations up
   to 7.1 but it could be expanded to 32 channels.  The benefit is
   payload overhead.  Steve asked how this would be used for QoS: keep
   some parts of the packet and throw away others?  That does not work.
   It might make sense for the file format to contain multiple rates for
   scalability, but the packets should only contain the rate appropriate
   for the receiver or you have not achieved the goal of fitting the
   available bandwidth.  If you need to deliver different rates to
   different receivers, send different streams, or layered coding for
   multicast.  Roni Even echoed this concern; if the multiplexing of
   streams is for redundancy, the draft needs to explain the relationship
   between the fragments, redundant segments, etc.

   Colin Perkins asked why redundancy was built into the payload format
   rather than using RFC 2198.  The authors were unaware of 2198.  Steve
   also pointed out that for redundancy to be useful the redundant copy
   may need to be separated further in time than one slot.  He also
   suggested that it would be useful for the authors to review several of
   the other payload formats since several of the architectural ideas
   commonly used in AVT have been missed, such as separate streams for
   separate needs.

   Magnus asked if it is possible for fragments to be independently
   decoded, or must a segment be fully reassembled to decode it.  Matthew
   said the answer depends on the encoder, and he needs to look into this
   further.  In summary, this payload format may need quite a bit of
   change from what is defined so far.

RTP Payload Format for Uncompressed Video

   Ladan Gharai presented updates to draft-ietf-avt-uncomp-video-01.txt.
   In addition to the correction of editorial nits and the inclusion of
   an applicability statement and a comparison to RFC 2431 (BT.656
   video), some new features were added: 12- and 16-bit sample sizes join
   the 8- and 10-bit sizes specified previously, and monochrome,
   4:4:4:4 chrominance subsampling, and RGBA color representations were
   added.  The payload header was unchanged except that the 'M' bit was
   renamed 'C' to avoid possible confusion with the marker bit in the RTP
   header.  The draft has established a list of mandatory SDP fmtp
   parameters and a partial list of optional parameters.  The authors are
   still working on the representation of these parameters, but will
   complete this work for the next draft.

   Ladan identified a few open issues.  Currently only packed sample
   formats are provided; the authors are considering adding planar and
   macro-blocked formats as well.  The planar format, in which color
   planes are sent separately, is straightforward; it would be identified
   by an SDP parameter.  However, it is unclear whether it makes sense to
   have packed and macro-block formats in the same payload format.  To
   accommodate macro-blocks, width and length parameters would have to be
   added to the payload header (there is room), and then the packed
   format would be indicated by a macro-block size of 1.  Stephan Wenger
   would like to see the planar representation added, but has doubts
   about a macro-block-based scheme.  There are applications for which it
   would be useful, but there are too many complications related to
   interlacing.  You can't assume that the shape of a macro-block will be
   16x16 in a progressive scan or in one field.  Sometimes a macro-block
   is a different size with parts from both fields.  It is also affected
   by transcoding.

   A second open issue is the transport of interlaced 4:2:0 color
   subsampling.  This has been discussed on the mailing list and work is
   still in progress.  Lastly, for interlaced video, there is a question
   whether the two fields should have distinct timestamps.  A problem is
   that with the current 90kHz timestamp clock rate, which increments by
   3003 per frame for 29.97fps NTSC video, a fractional increment of
   1501.5 would be needed for the intermediate field timestamp, but the
   RTP timestamp is an integer.  It should be possible instead to derive
   the timestamp from header bits and the frame rate.  Stephan explained
   that you need to have a timestamp for every field in order to indicate
   the proper mapping of fields between 24fps film content and 30fps
   video using 3-2-pulldown, because an individual field may be repeated
   so they do not always appear in even-odd pairs.  However, we need not
   worry about the exact timestamp value for this; it would be safe to
   round up to the next integer.
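
   The arithmetic behind the field timestamp question can be worked
   through exactly.  This is a sketch only; the function name is
   hypothetical and the rounding rule is the "round up" suggestion from
   the discussion, not text from the draft.

```python
from fractions import Fraction
import math

CLOCK_HZ = 90_000
FRAME_RATE = Fraction(30_000, 1001)     # 29.97 fps NTSC, exactly

frame_ticks = CLOCK_HZ / FRAME_RATE     # exactly 3003 ticks per frame
field_ticks = frame_ticks / 2           # 3003/2 = 1501.5 ticks per field


def field_timestamp(n, base=0):
    """RTP timestamp of the n-th field, rounded up, modulo 2**32."""
    return (base + math.ceil(n * field_ticks)) % 2**32


timestamps = [field_timestamp(n) for n in range(5)]
# [0, 1502, 3003, 4505, 6006]
```

   Rounding the cumulative value, rather than adding a rounded
   increment each time, keeps every second field exactly on the frame
   timestamp, so the half-tick error never accumulates.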

Resolution of comments on draft-ietf-avt-srtp-05.txt

   The first session ended with a discussion of IESG security concerns
   regarding the Secure RTP profile (draft-ietf-avt-srtp-05.txt).  For
   this discussion, Allison Mankin introduced herself as the Transport AD
   for this group, Eric Rescorla as security advisor to the Transport
   Area, and Steve Bellovin who is one of the Security Area Directors.

   Eric Rescorla started by noting that the SRTP profile has some unusual
   design features: it uses AES in counter mode, rather than in CBC mode,
   and it offers a choice of several message authentication codes (MACs),
   including no authentication.  These features, in particular the option
   to use AES in counter mode with no authentication, don't make security
   folks comfortable.  Eric then summarized his understanding of the
   issues that require SRTP to use these modes of operation.  The first
   is latency, since shorter packets mean less latency for voice and MACs
   consume bandwidth.  Secondly, wireless channels are noisy and packets
   often contain bit errors.  If integrity checks are used in this
   environment, the bandwidth consumption will be excessive and the bit
   errors may lead to unacceptable packet discard rates due to failure of
   the integrity checks.

   Eric moved on to explain that counter mode has no integrity protection
   unless protected by a MAC.  This is not obviously a problem for voice,
   one of the key applications for SRTP, but may be a problem for other
   types of content.  From a security viewpoint, it is desirable to use
   SRTP with a MAC, but the default MAC in SRTP is a weak 32-bit code and
   there is the option to use SRTP without integrity protection (there is
   also a strong MAC option).  The choices lead to the threat of modified
   message streams and forged traffic, unless the optional strong MAC is
   used.  Two solutions to this problem were proposed:

    1) Make the MAC mandatory and add FEC after encryption to correct bit
       errors so that the integrity check will work on somewhat corrupted
       packets.  There was considerable discussion of this in private
       email with the authors, who were opposed on the grounds that it
       expands packets and makes SRTP uneconomic for cellular links,
       which already employ link-layer FEC.  Eric was not convinced by
       these arguments, citing the qualitative nature of the concerns
       rather than hard numbers giving performance impact.

    2) Define a wireless voice profile for SRTP where the MAC protects
       only the control data leaving the media data unprotected.  The
       reduced MAC causes limited packet expansion, but is less sensitive
       to bit-errors than SRTP as currently specified.  Other types of
       traffic will use a mandatory 80-bit MAC.
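
   The concern about counter mode without a MAC can be illustrated with
   a toy example.  To stay within the standard library, the keystream
   below is built from SHA-256 as a stand-in for AES in counter mode;
   the keys, nonce, message, and helper names are all made up for the
   illustration, and none of this is the SRTP transform itself.

```python
import hashlib
import hmac


def keystream(key, nonce, length):
    """Toy counter-mode keystream (stand-in for AES-CTR)."""
    out = b""
    counter = 0
    while len(out) < length:
        block = key + nonce + counter.to_bytes(4, "big")
        out += hashlib.sha256(block).digest()
        counter += 1
    return out[:length]


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def tag(auth_key, data, tag_len):
    """HMAC-SHA1 truncated to tag_len octets."""
    return hmac.new(auth_key, data, hashlib.sha1).digest()[:tag_len]


enc_key, auth_key, nonce = b"k" * 16, b"a" * 20, b"n" * 12
plaintext = b"PAY 0100 USD"
ciphertext = xor(plaintext, keystream(enc_key, nonce, len(plaintext)))
mac = tag(auth_key, ciphertext, 4)      # 32-bit truncated tag

# Without knowing any key, the attacker flips bits to change "0" to "9".
delta = bytes(4) + bytes([ord("0") ^ ord("9")]) + bytes(7)
forged = xor(ciphertext, delta)

decrypted = xor(forged, keystream(enc_key, nonce, len(forged)))
detected = not hmac.compare_digest(mac, tag(auth_key, forged, 4))
```

   Without the MAC check the receiver accepts "PAY 9100 USD".  Even the
   32-bit tag catches this particular modification, but an attacker who
   can try many forgeries succeeds with probability about 2^-32 per
   attempt, which is the basis for calling that default weak.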

   Mark Baugher noted that SRTP has the ability to use strong integrity
   protection now, but it's not the default.  The question is whether the
   vendors or the users should be able to make the determination, based
   on their environment and their application, whether they want a strong
   MAC or not.  Steve Bellovin agreed with this formulation, but noted
   that the IESG has a strong preference for protocols that are secure by
   default, and a protocol won't be published unless it has strong
   mandatory-to-implement security.  If a protocol has weaker security
   options, it needs a Security Considerations section that describes the
   environments where the weaker options may be acceptable, and explains
   the consequences and tradeoffs of selecting those options.

   Eric Rescorla asked Steve Bellovin if it was acceptable for SRTP to
   have the option of no authentication.  Steve answered that it was
   permitted in certain other situations, but that it would take detailed
   analysis to show where it is safe and useful and where it isn't.

   Mark Baugher asked if changing the default mandatory transforms,
   adding CBC mode as an option, would satisfy the concerns.  Steve
   Bellovin answered that, assuming you meet requirements for safely
   using counter mode, there is no strong need for CBC mode; the MAC is
   much more critical.

   Mark Baugher asked if the security folks are not happy with the
   default 32-bit HMAC-SHA1.  Steve Bellovin replied that he needs to
   think more on that, but the group needs to better analyse the
   environment before he can make a good decision.

   Allison Mankin reminded the group that SRTP is for all environments,
   and expressed her preference for a specification where the MAC was
   mandatory in all cases, with a possible exception for cellular
   telephony.  Elisabetta Carrara reminded Allison that SRTP includes a
   32-bit MAC by default, and that stronger options are specified.  Eric
   Rescorla again noted that it is necessary to analyse individual threats
   and the environment, giving numbers to characterize the impact of
   security on performance.

   Elisabetta noted that the MAC cannot be used in cellular telephony,
   since that environment cannot afford the bandwidth of the MAC.  She
   reminded the group that the requirements driving ROHC and UDPlite also
   apply to SRTP.  Steve Bellovin replied this is the sort of thing that
   has to go into the security considerations section, explaining why the
   environment has these requirements and how they affect security.

   Allison commented that the draft is intended to be general purpose,
   but is optimized for cellular use.  The default transform needs to be
   suitable for the general case, with a non-optional MAC if counter mode
   is used, and justification why weaker options are present for cellular
   operation.

   Steve Casner asked if there was a problem with changing the defaults
   to be more general-purpose, signaling specific settings for telephony
   applications, and clearly documenting the rationale in the security
   considerations section?  There were no objections.

Liaison statement from MPEG

   Steve Casner started the second session by reading a liaison statement
   the group has received from the MPEG committee, stating that they have
   revised the RTP Payload Format for MPEG-4 taking into account comments
   from the last AVT working group last call, and requesting publication
   of the draft as an RFC.

MIME Type Registration for MPEG-4

   Jan van der Meer, sitting in for Young-Kwon Lim, outlined the draft
   draft-lim-mpeg4-mime-01.txt that specifies MIME type registrations for
   the MPEG-4 file formats and their relation to the MPEG4-on-IP
   framework (ISO/IEC 14496-8).

   Steve Casner noted that this draft includes some discussion of RTP
   MIME parameters, which needs to be moved to the payload format drafts.
   Steve also expressed concern that the previous versions of the full
   framework document, submitted to the IETF, had problems which needed
   to be resolved but it's not clear that these have been addressed in
   ISO.  There is a need to address these issues in future, especially if
   this MIME registration and the framework conflict.

   Mike Coleman asked about the difference between streams and files, in
   this context, since MPEG-4 streams are not well defined.  Steve Casner
   and Colin Perkins clarified that this draft should cover only the MP4
   file format, and that the RTP payload format drafts will contain MIME
   types for use with RTP.

   Stephan Wenger asked about the presumed existence of an informational
   RFC, pointing to the MPEG4-on-IP framework.  Colin Perkins and Steve
   Casner explained that this was agreed in the AVT meeting at the 52nd
   IETF (Salt Lake City).

RTP Payload Format for MPEG-4

   Jan van der Meer discussed draft-ietf-avt-mpeg4-simple-05.txt, the RTP
   Payload Format for MPEG-4.  This document is in working group last
   call and several comments, mostly editorial, have been received.  The
   main issues are the suggested replacement of the "Profile" parameter
   with "InterleaveDelay", and whether RTP timestamps should be allowed
   to go backwards when interleaving.  These have been discussed in AVT,
   and in MPEG and ISMA, and it has been agreed to allow both features.
   Current discussion on the mailing list is on the exact meaning of
   interleave delay and emission rules.

   This discussion continued in the meeting with Steve Casner, Stephan
   Wenger, Colin Perkins and Andrea Basso commenting on the RTP system
   model and how it leaves much to the discretion of the receiver when
   compared to the MPEG buffer model.  They saw no need for the emission
   rules, viewing them as implementation details that do not need to be
   specified.  In addition, they noted that the characteristics of an IP
   network are such that the sender cannot control the buffering at the
   receiver.  This also led to discussion of the definition of the
   interleaving delay, with concern expressed that the attempt to
   define the delay precisely is unnecessary, since what is really
   needed is a hint to the receiver suggesting a starting estimate of
   the buffering delay.  Much of the complexity comes from trying to
   tightly bound the interleaving delay, and a tight bound is neither
   necessary nor feasible.

   Stephan Wenger asked what would be the impact of pulling interleaving
   out of the payload format?  Colin Perkins said that this is not
   possible, but we may consider leaving the interleave delay parameter,
   and letting the sender choose an appropriate value without saying how
   to do that.

   Mike Coleman asked about the draft status, since it is not available
   in the archives and because parts of the MPEG committee believe it
   complete, but it clearly is not.  Steve Casner noted that the draft
   will be in the archives after the meeting.  Steve and Colin also noted
   that the current working group last call is not completed.  There will
   be time to review any changes introduced before the draft is advanced.

RTP Payload Format for JVT Video

   Stephan Wenger discussed draft-ietf-avt-rtp-h264-00.txt, the payload
   format for JVT video.  This updates draft-wenger-avt-rtp-jvt-01.txt to
   align with the latest JVT specification and adds MTAPs with 8-, 16-,
   24- and 32-bit timestamp offsets (as discussed at the previous AVT
   meeting).  Stephan is considering removing the 8- and 32-bit timestamp
   offsets, since they are not believed to be useful.

   The next open issue is the relation between this payload format and
   the MPEG-4 payload format, since JVT video is referenced as part of
   MPEG-4.  Stephan believes that using the MPEG-4 format for JVT is not
   acceptable, since MTAPs and STAPs cannot be sent efficiently with that
   format.  He also believes that full binary compatibility between the
   JVT payload format and the MPEG-4 payload format is not achievable.
   However, it is possible to define a common operation point, providing
   compatibility at the expense of limited optimization.

   Steve Casner noted that the draft specifies use of the latest
   timestamp when doing AU aggregation, but that other payload formats
   use the oldest timestamp.  Stephan agreed that this is an issue, and
   should be changed.

   Mike Coleman noted that section 3 says the draft is "not intended to
   be used with MPEG-4 systems" and asked for clarification of what is
   meant.  It is possible to use it with MPEG-4 systems, but there are
   some features of this draft that are not compatible with the MPEG-4
   payload format.  Jan van der Meer noted that some in MPEG will ask
   "what are the features offered with this draft that cannot be
   supported by the MPEG-4 payload format?"  Stephan answered that the
   main reason is STAPs and aggregation which cannot be supported
   efficiently, and multiple fragments of AUs are vital but not supported
   in the MPEG-4 payload format.  There was some discussion of this, and
   it may be appropriate to clarify in a future version of the draft.

RTCP Reporting Extensions

  Alan Clark discussed draft-ietf-avt-rtcp-report-extns-01.txt, on RTCP
  reporting extensions.  This is the combination of the various reporting
  extensions drafts discussed in Yokohama, with the addition of loss
  run-length encoding, updated VoIP metrics, and security and IANA
  considerations.

  Colin Perkins noted that the IANA considerations section needs work to
  specify the registration in detail, and will supply detailed comments
  offline.  Colin also asked if the jitter buffer metrics are useful and
  match implementations?  Have implementors looked at the draft to see if
  the information is meaningful in their context?  Alan Duric noted that
  jitter buffer and PLC functions are separate in the draft, but these
  are sometimes combined in implementations.  Alan Clark said that the
  broad intent is to provide rough info for diagnostic purposes, not an
  exact description of an implementation.

  Alan Clark noted the need to be management friendly, even if SRTP is
  used.  Accordingly, he would like to add a note to the draft indicating
  that the SRTP E bit can be used to send extended RTCP report frames in
  plaintext, even if encryption has been selected as the default setting.
  Colin agreed that this might be possible, but noted that the draft
  shouldn't specify a security policy.  Steve Casner also noted that the
  draft should talk about this issue in the security considerations
  section.

  Steve Casner also highlighted that a receive-only endpoint will not
  know the RTT that is supposed to be included in the VoIP metrics report
  since the RTCP mechanism works only for senders.  Alan Clark replied
  that these metrics are expected to be used for full-duplex
  conversations.  Steve said that in that case, the draft needs to make
  clear in what scenarios the VoIP metrics report is applicable.
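
  The reason a receive-only endpoint cannot supply an RTT can be seen
  from the round-trip calculation in the RTP specification, which only
  the sender of an RTCP sender report can perform: RTT = A - LSR - DLSR,
  where A is the arrival time of the reception report and LSR and DLSR
  are taken from that report.  The following is a minimal sketch of
  that arithmetic; the function names are assumptions made for this
  example, and all times are 32-bit values in units of 1/65536 second.

```python
from typing import Optional


def to_fixed_16_16(seconds: float) -> int:
    """Convert seconds to 32-bit 16.16 fixed point (1/65536 s units)."""
    return int(round(seconds * 65536)) & 0xFFFFFFFF


def rtt_seconds(arrival: int, lsr: int, dlsr: int) -> Optional[float]:
    """Compute A - LSR - DLSR in seconds, or None if no RTT is available.

    arrival: time the reception report arrived at the original sender
    lsr:     last SR timestamp echoed in the reception report
    dlsr:    delay since last SR, as measured by the reporting receiver

    LSR is zero when the reporter has not yet received a sender report,
    so no round-trip time can be derived in that case.
    """
    if lsr == 0:
        return None
    # Mask to 32 bits so the subtraction survives timestamp wraparound.
    return ((arrival - lsr - dlsr) & 0xFFFFFFFF) / 65536.0


# Values from the worked example in the RTP specification:
# A = 46864.500 s, LSR = 46853.125 s, DLSR = 5.250 s.
example_rtt = rtt_seconds(0xB7108000, 0xB7052000, 0x00054000)
```

  An endpoint that never sends data never issues sender reports, so it
  has no LSR of its own echoed back to it and cannot run this
  calculation.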

RTCP Extensions for SSM

  Jörg Ott described changes to draft-ietf-avt-rtcpssm-02.txt, the RTCP
  extensions for source-specific multicast.  The main changes are to the
  security considerations.  In addition, SSRC distribution has been
  removed from this version and cumulative values are now included in the
  distribution.  There are "work-in-progress" changes to the IANA
  considerations section and to use the XR packet formats (on this
  subject, Jörg noted that there are several proposed RTCP extensions
  using packet type 205, and we need to resolve this conflict).

  The security considerations section has been significantly reworked,
  with the assumptions that we need to maintain low overhead, that the
  session parameters are securely distributed out of band, and that the
  security weaknesses should be addressed at the transport layer and
  above since weaknesses may exist in the SSM layer below.  The threats
  identified are denial of service, packet forgery, session replay and
  eavesdropping.  The draft also categorizes threats according to the
  direction of the traffic flow, and discusses the trust models.

  Colin Perkins approved of the security considerations section, but
  would like discussion of specific applications and mandatory security
  behavior for those applications in this draft (e.g., how to use SSM
  with RTSP and SIP).

  Jörg highlighted the issue of relation to other I-Ds, since this uses
  the features of the extended RTCP reporting draft.  He asked about the
  schedule for the RTCP reporting extensions draft.  Alan Clark
  would like to get the RTCP Reporting Extensions draft done quickly, and
  was willing to cooperate on the IANA issues, ensuring they're aligned.

  Jörg asked if future drafts relating to RTCP should include a section
  on SSM considerations?  Steve Casner was not sure if we need to
  establish a requirement, but noted that this draft should have a section
  giving advice to authors of RTCP extensions that might be affected by
  SSM.

  Open issues include cumulative BYE packets, a possible revision to the
  message format, discussion of the relation to other RTP/RTCP
  extensions, completion of IANA considerations, etc.  A revised draft is
  expected by the end of the year.

Retransmission

  The RTP retransmission format
  (draft-ietf-avt-rtp-retransmission-03.txt) was discussed by José Rey.
  This is the merger of the two previous drafts, as was discussed in
  Yokohama.  The new draft uses a dynamic payload format to indicate the
  original payload type of the retransmission.  It supports session
  multiplexing, with streams associated using an a=fmtp parameter and
  FID, and SSRC multiplexing using an a=fmtp parameter to associate the
  retransmission with the original stream.  José also outlined the RTSP
  considerations regarding SSRC-multiplexing.  There will be a minor
  revision shortly, which is expected to be ready for last call.

  Anders Klemets asked if one MUST NOT do session multiplexing and SSRC
  multiplexing in the same session?  It was clarified that this is
  correct.

RGL codec and payload format

  The final presentation was a brief outline of the RGL lossless G.711
  codec, by Michael Ramalho, which was presented as a possible future
  work item.  Steve Casner noted that standardizing codecs is not
  entirely within scope of AVT, and will need discussion, as with iLBC.
  Drafts will be submitted shortly after the meeting.
