RE: Gen-ART and OPS-Dir review of draft-ietf-payload-g7110-03

"Michael Ramalho (mramalho)" <mramalho@cisco.com> Wed, 29 October 2014 15:35 UTC

From: "Michael Ramalho (mramalho)" <mramalho@cisco.com>
To: "Black, David" <david.black@emc.com>, "Paul E. Jones (paulej@packetizer.com)" <paulej@packetizer.com>, "harada.noboru@lab.ntt.co.jp" <harada.noboru@lab.ntt.co.jp>, "muthu.arul@gmail.com" <muthu.arul@gmail.com>, "lei.miao@huawei.com" <lei.miao@huawei.com>, "General Area Review Team (gen-art@ietf.org)" <gen-art@ietf.org>, "ops-dir@ietf.org" <ops-dir@ietf.org>
Subject: RE: Gen-ART and OPS-Dir review of draft-ietf-payload-g7110-03
Date: Wed, 29 Oct 2014 15:35:35 +0000
Message-ID: <D21571530BF9644D9A443D6BD95B91032710E1C9@xmb-rcd-x12.cisco.com>
References: <CE03DB3D7B45C245BCA0D24327794936062C09@MX104CL02.corp.emc.com>
In-Reply-To: <CE03DB3D7B45C245BCA0D24327794936062C09@MX104CL02.corp.emc.com>
Archived-At: http://mailarchive.ietf.org/arch/msg/ietf/qWRQRx-8ttCj9_8RbLgFpAORtok
X-Mailman-Approved-At: Thu, 30 Oct 2014 08:07:22 -0700
Cc: "ietf@ietf.org" <ietf@ietf.org>, "payload@ietf.org" <payload@ietf.org>

David,

The authors of the G.711.0 RTP Payload Draft thank you for the comments below. It is clear from the caliber of your comments that you spent a lot of time on this.

G.711.0 is a variable-length, stateless and lossless compression of G.711 (a sample-oriented encoding), and it causes a lot of confusion for those who think of it as "a codec" instead of the lossless compression mechanism it is.

Thus, this was a hard payload format to write, due to some of the preconceived notions of what G.711.0 is, and an even harder one for someone to review (as it is not the sample-based or fixed-length frame-based encoding that the authors of RFC 3550/3551 assumed/envisioned).

So, I really do thank you for the effort here, David. You must have drawn the short straw.

My responses to your comments/questions are in-line below (my replies are delimited by "\begin {Reply to [issue]}" and my proposed fixes within them are marked with ">>").

Regards,

Michael A. Ramalho, Ph.D.

-----Original Message-----
From: Black, David [mailto:david.black@emc.com] 
Sent: Wednesday, October 22, 2014 11:44 AM
To: Michael Ramalho (mramalho); Paul E. Jones (paulej@packetizer.com); harada.noboru@lab.ntt.co.jp; muthu.arul@gmail.com; lei.miao@huawei.com; General Area Review Team (gen-art@ietf.org); ops-dir@ietf.org
Cc: ietf@ietf.org; payload@ietf.org; Black, David
Subject: Gen-ART and OPS-Dir review of draft-ietf-payload-g7110-03

This is a combined Gen-ART and OPS-DIR review.  Boilerplate for both follows ...

I am the assigned Gen-ART reviewer for this draft. For background on Gen-ART, please see the FAQ at:

<http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.

Please resolve these comments along with any other Last Call comments you may receive.

I have reviewed this document as part of the Operational directorate's ongoing effort to review all IETF documents being processed by the IESG.  These comments were written primarily for the benefit of the operational area directors.
Document editors and WG chairs should treat these comments just like any other last call comments.

Document: draft-ietf-payload-g7110-03
Reviewer: David Black
Review Date: October 22, 2014
IETF LC End Date: October 27, 2014
IESG Telechat date: October 30, 2014

Summary: This draft is on the right track, but has open issues described in the review.

Process note: This is the second draft that I've reviewed recently that has been scheduled for an IESG telechat almost immediately following the end of IETF Last Call.  The resulting overlap of IETF LC with IESG Evaluation can result in significant last-minute changes to the draft when issues are discovered during IETF LC.

This draft describes an RTP payload format for carrying G.711.0 compressed G.711 voice.  The details of G.711.0 compression are left to the ITU-T G.711.0 spec (which is fine), and this draft focuses on how to carry the compressed results in RTP and conversion to/from uncompressed G.711 voice at the communication endpoints.
I found a few major issues and a couple of minor ones, although a couple of the major issues depend on a meta-issue: the intended relationship of this draft to the ITU-T G.711.0 spec.

In general, I expect IETF RFCs to be stand-alone documents that make sense on their own, although one may need to read related documents to completely understand what's going on.  For this draft, I would expect the actual compression/decompression algorithms to be left to the ITU-T spec, and this draft to stand on its own in explaining how to deploy G.711.0 compression/decompression with RTP.  If that expectation is incorrect, and this draft is effectively an RTP Annex to G.711.0 that must be read in concert with G.711.0, then the first two major issues below are not problems as they should be obvious in the G.711.0 spec, although the fact that this draft is effectively an Annex to G.711.0 should be stated.  Otherwise, those two major issues need attention.

-- Major Issues (4):

[A] Section 4.2.3 specifies a detailed decoding algorithm covering how G.711.0 decompression interacts with received RTP G.711.0 payloads.
A corresponding encoding algorithm specification is needed on the sending side for G.711.0 compression interaction with RTP sending.
The algorithm will have some decision points in it that cannot be fully specified, e.g., time coverage of the generated G.711.0 frames.

\begin {Reply to [A]}

I believe you are correct. As with everything associated with G.711.0, a longer answer is required.

At the sender, the G.711.0 encoder itself decides exactly how it sends compressed G.711.0. As an example outlined earlier in Section 3.3.1 (Multiple G.711.0 Output Frame per RTP Payload Considerations), a given G.711.0 encoder could choose to encode 20 ms of input G.711 symbols as: 1) a single 20 ms G.711.0 frame, or 2) two 10 ms G.711.0 frames, or 3) any combination of 5 ms or 10 ms G.711.0 frames. The decision criteria are NOT SPECIFIED in the ITU-T G.711.0 standard; a G.711.0 encoder could choose based on: 1) which encoding resulted in fewer bits, 2) simplicity of operation, such as always using 20 ms G.711.0 frames, or 3) any other criteria of its choosing. Thus the encoding process is NOT DETERMINISTIC in how many G.711.0 frames represent a given ptime of G.711 symbols.

[Aside: Using a 20 ms ptime example, there could be 1, 2, 3 or 4 G.711.0 frames in an RTP payload, in any one of six combinations ([20ms], [10ms:10ms], [10ms:5ms:5ms], [5ms:10ms:5ms], [5ms:5ms:10ms], [5ms:5ms:5ms:5ms]).]
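
The six combinations in the aside can be checked by brute force. The following is only an illustrative sketch: the function name is made up for this note, and it assumes the 8 kHz G.711 sampling rate together with the frame sizes (40, 80, 160, 240 or 320 symbols) mentioned in this reply.

```python
from typing import List, Tuple

# Frame lengths (in G.711 symbols) that one G.711.0 frame may cover,
# as listed in this thread.
FRAME_SIZES = (40, 80, 160, 240, 320)
SAMPLE_RATE_HZ = 8000  # G.711 sampling rate (8 symbols per millisecond)


def frame_partitions(ptime_ms: int) -> List[Tuple[int, ...]]:
    """Enumerate every ordered sequence of frame sizes covering ptime_ms.

    Hypothetical helper for illustration; not part of any specification.
    """
    total_symbols = ptime_ms * SAMPLE_RATE_HZ // 1000
    results: List[Tuple[int, ...]] = []

    def extend(prefix: List[int], remaining: int) -> None:
        if remaining == 0:
            results.append(tuple(prefix))
            return
        for size in FRAME_SIZES:
            if size <= remaining:
                extend(prefix + [size], remaining - size)

    extend([], total_symbols)
    return results


if __name__ == "__main__":
    for combo in frame_partitions(20):
        # Print each combination in milliseconds, e.g. 10ms:5ms:5ms
        print(":".join(f"{size // 8}ms" for size in combo))
```

Nothing in the sketch ranks the combinations; which one a real encoder emits is its own choice, as described above.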

Thus, it is important to note that the >>G.711.0 STANDARD<< only specifies the encoding of an individual input G.711 frame (which can only have lengths of 40, 80, 160, 240 or 320 G.711 symbols) to a valid G.711.0 frame.

The authors of this draft assumed that the G.711.0 compressor/encoder provider has already made the encoding decision on the number of G.711.0 frames INDEPENDENT of the decompressor/decoder and OUTSIDE any sender-side RTP payload processing. That is, the G.711.0 encoder simply passes its result (any of the combinations above) to the G.711.0 RTP layer at the sender to be incorporated into the G.711.0 payload. The RTP layer could then choose to add padding octets (0x00) to form the final G.711.0 payload.

From that perspective, the co-authors of the draft believed what was important for the draft was "what could be on-the-wire". However, since the ITU-T G.711.0 standard only specifies the individual G.711 frame to G.711.0 mapping, there is a benefit in explicitly calling out the possible "payload encoding process" in this section (4) as well.

>>Proposed Action: If my co-authors agree, I could write a very small section titled "G.711.0 RTP Payload Encoding Process" (inserted between the present 4.2.2 and 4.2.3). This paragraph-long section will refer back to Section 3.3.1 and remind the implementer that they can, at their option, choose to use any of the allowable encoding possibilities described in it. I think David is correct: we assumed that some entity PURPOSELY NOT defined by the G.711.0 standard (the provider of the "G.711.0 compressor/encoder") had already made those decisions, and an explicit definition of that decision is not specified anywhere in any SDO document (so why not here?). Indeed, any "standard G.711.0 encoder" offered by a vendor would likely have that functionality within it (so an RTP implementer wouldn't need to know it either). I could also remind the reader that one could use a single G.711.0 frame per ptime (if a G.711.0 frame supported that ptime) for the least complicated encoding case. Would that work, David? Would that work, co-authors?

\end {Reply to [A]}

[B] The G.711.0 frame format is not specified here, making it very difficult to figure out what's going on when G.711.0 frames are concatenated.  A specific example is that the concept of a "prefix code" that occurs at the start of a G.711.0 frame is far too important to be hidden in step H5 of the decoding algorithm in Section 4.2.3.

\begin {Reply to [B]}

We welcome comments on how to improve this section, as it is complicated. We did attempt to describe only what is necessary for understanding.

At the beginning of Section 4.2.3 we IMMEDIATELY reference the ITU-T G.711.0 document, as it is that document that describes how to "decode a G.711.0 bit-stream". We really want the reader who needs to know the details to go there first. Indeed, one could hand the entire G.711.0 payload to the G.711.0 bit stream decoder in the ITU-T G.711.0 reference code, obtain all the uncompressed G.711 samples in the RTP payload, and be finished without knowing anything in this section.

The bit-stream decoder in the ITU-T reference code was defined to parse the individual compressed G.711.0 frames. However the G.711.0 >>STANDARD ITSELF<< defines only the mapping between the 40, 80, 160, 240 or 320 G.711 symbols presented to it and the G.711.0 frame produced from those 40, 80, 160, 240 or 320 samples (i.e., only Section 3.3).

In other words, someone designing a G.711.0 encoder could choose how to partition the uncompressed G.711 symbols into groups of 40, 80, 160, 240 or 320 samples and then individually encode them into individual G.711.0 frames as per my reply to [A].

Any arbitrary value corresponding to a valid "G.711.0 prefix code" is NOT unique (or otherwise special) in that it can appear anywhere within a G.711.0 frame; however, a given value for a prefix code DOES have a unique meaning >>TO THE G.711.0 DECODER<< (not the RTP machinery) when it is present at the beginning of a G.711.0 frame.

The mention of the prefix code (with immediate reference back to the ITU-T specification, I might add) was simply side information conveyed to the reader for purposes of understanding. The G.711.0 decoder actually "reads it" and then uses it to know how many source G.711 samples to produce (in this case exactly M G.711 samples). The only thing the G.711.0 RTP implementer needs to know is that the G.711 sample buffer returned by the G.711.0 decoder will contain exactly M samples of G.711.

To be precise, the ITU-T specified G.711.0 decoder returns not only the samples themselves, but also the number of samples, M, upon its exit (we were not 100% clear on this - fix proposed below). The value of M is important to the RTP decoding process; the value, structure or meaning of "prefix code" isn't. The only exception is that 0x00 has a special meaning when it appears where a prefix code might otherwise be expected.

To accommodate padding, 0x00 may be placed anywhere between the encoded G.711.0 frames (we only recommend that any desired padding be placed at the end of the RTP payload). But to convey this "0x00" for padding, we needed to describe that 0x00 could not be a valid prefix code. If it were not for the desire for padding, we would not have even mentioned that a "prefix code" existed in a G.711.0 frame.

In the text we mention that a "0x00" where a prefix code is expected in a G.711.0 bit stream is "silently ignored" by a G.711.0 frame decoder.

The mention of the prefix code was only for general information of what the G.711.0 decoder actually does (generally how it decodes the frame and that "0x00" isn't a valid prefix code) and what is expected by the RTP machinery when the G.711.0 decoder is finished decoding (the value of M and the M individual G.711 symbols). 

Summary: The interested reader desiring knowledge of how to decode a  G.711.0 bit stream should really read the ITU-T document first; that is why we put the reference to the "ITU-T G.711.0 Reference code" as the FIRST sentence in Section 4.2.3. They don't need to know what a "prefix code" is other than it is used by the G.711.0 decoder to know how many samples (M) it will produce and that the value of M will be returned by the G.711.0 decoder.
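
A minimal sketch of how the RTP layer could walk a payload under the description above: skip any 0x00 octet found where a frame would start, otherwise hand the frame to the G.711.0 decoder and record the M it returns. The `decode_frame` callback and its signature are assumptions made for this note, standing in for the ITU-T reference decoder; `toy_decode_frame` is a deliberately fake frame format used only to exercise the loop and is NOT G.711.0.

```python
from typing import Callable, List, Tuple

# Hypothetical signature: given the payload and the offset of a frame's first
# octet, return (decoded G.711 symbols, number of payload octets consumed).
FrameDecoder = Callable[[bytes, int], Tuple[bytes, int]]


def decode_payload(payload: bytes,
                   decode_frame: FrameDecoder) -> Tuple[bytes, List[int]]:
    """Return all decoded G.711 symbols plus the per-frame sample counts (M)."""
    symbols = bytearray()
    frame_sample_counts: List[int] = []
    offset = 0
    while offset < len(payload):
        if payload[offset] == 0x00:
            # 0x00 where a prefix code is expected is padding: ignore it.
            offset += 1
            continue
        frame, consumed = decode_frame(payload, offset)
        if consumed <= 0 or offset + consumed > len(payload):
            raise ValueError("frame decoder overran the payload or made no progress")
        symbols.extend(frame)
        frame_sample_counts.append(len(frame))  # this is M for the frame
        offset += consumed
    return bytes(symbols), frame_sample_counts


def toy_decode_frame(payload: bytes, offset: int) -> Tuple[bytes, int]:
    """Toy stand-in, NOT G.711.0: first octet n (non-zero), then n literal symbols."""
    n = payload[offset]
    return payload[offset + 1: offset + 1 + n], 1 + n
```

With the toy decoder, `decode_payload(bytes([2, 0xAA, 0xBB, 0, 1, 0xCC, 0, 0]), toy_decode_frame)` yields the three symbols and the per-frame counts `[2, 1]`, with the interleaved and trailing 0x00 octets ignored.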

>>Proposed Action: I would suggest the following change in H5 to make this clearer:
From: The G.711.0 decoder will produce exactly M G.711 source symbols.
To: Then the ITU-T specified G.711.0 decoder will produce exactly M G.711 source symbols and return both the symbols (in a buffer up to 321 octets in length if the in-place ITU-T reference code is used) and the value of M upon exit.

That information - the samples and the value of M - is the only thing the reader needs to know.

Does that work for you, David?

\end {Reply to [B]}

[C] The discussion of use of the SDP ptime parameter is spread out and imprecise (is SDP REQUIRED?, when is ptime REQUIRED, RECOMMENDED, or recommended? - it's not obvious).

A specific example is that this sentence in Section 4.2.4 is an invitation to interoperability problems ("could infer" - how is that done and where do the inputs to that inference come from?):

   Similarly, if the number of
   channels was not known, but the payload "ptime" was known, one could
   infer (knowing the sampling rate) how many G.711 symbols each channel
   contained; then with this knowledge determine how many channels of
   data were contained in the payload.

I would suggest that a subsection be added, possibly at the end of Section 3, to gather/summarize all of the relevant ptime discussion in one place.  I suspect that the contents of this draft are mostly correct wrt ptime, but it's hard to figure out what's going on from the current spread-out text.  It looks like "ptime" could provide a cross-check on correctness of G.711.0 decoding - see minor issue [G] below.

This major issue [C] is independent of the relationship between this draft and the G.711.0 spec.

\begin {Reply to [C]}

We underspecified the use of SDP on purpose, but I also agree that some text on why we wish to leave it underspecified could be useful. In Section 5 we simply say "parameters that may be used to configure [G.711.0 RTP transmission]". Perhaps the MAY should be capitalized? Or more text?

As you know and appreciate, one could put an arbitrary number of G.711.0 frames in a G.711.0 RTP payload and the decoder really won't know how many G.711 samples were compressed in that payload until it decodes the entire payload.

Point A: For systems that use SDP and have specified a ptime (IANA registration for ptime is as an OPTIONAL parameter per WG agreement), a check can be performed to see if the required number of G.711 samples is present.

Point B: For systems that use SDP and have not specified ptime - the payload can still be decoded. In this case there is no a priori expectation on the number of G.711 symbols contained within the G.711.0 RTP payload and thus no check is possible.

Point C: For systems that use SDP we RECOMMEND that ptime SHOULD be used (see IANA registration text). The reason is that such a check can be made!

All three points (A, B & C) have been agreed to during previous meetings/discussions.
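
Points A and B amount to a simple comparison once the payload has been fully decoded. The sketch below is illustrative only: the function name and return convention are invented for this note, and it assumes the 8 kHz G.711 sampling rate and that the symbol count passed in is the total across all channels.

```python
from typing import Optional

SAMPLE_RATE_HZ = 8000  # G.711 sampling rate


def ptime_check(decoded_symbols: int,
                ptime_ms: Optional[int],
                channels: int = 1) -> Optional[bool]:
    """Compare the decoded symbol count against what ptime implies.

    Returns None when ptime was not signaled (Point B: no check possible),
    otherwise True/False for a match/mismatch (Point A).
    """
    if ptime_ms is None:
        return None
    expected = ptime_ms * SAMPLE_RATE_HZ // 1000 * channels
    return decoded_symbols == expected
```

A `False` result corresponds to the case where the receiver would normally discard the packet; `None` simply means there was no a priori expectation to check against.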

However, some USERS of the G.711.0 payload format may wish to use the RTP format itself but NOT use SDP! A good example is an "in-the-middle" compression of a G.711 flow (into a G.711.0 flow) and a corresponding decompression of the G.711.0 flow back into a G.711 flow. This is possible in many network arrangements (e.g., enterprise to enterprise) where the compression and decompression endpoints know the PT corresponding to G.711.0 use within their administrative domain.

[Aside: At one time this RTP Payload format had both the payload definition (this draft) and G.711.0-specific use cases within it. Previous WG discussion supported the splitting out of the use-cases into a separate draft (a "G.711.0 use case" draft). I have such an expired draft, but we agreed to defer work on it until after the RTP payload format was complete. Thus some elements of uses outside of G.711.0 running in the endpoints would be described in the other use-case draft.]

The SDP discussion is a little wordy, but this is a result of G.711.0 not being a codec, but rather a variable-length, frame-based lossless compression/decompression. That is, G.711.0 is NOT a (sample-based or frame-based) codec in the usual sense that RFC 3550/3551 anticipated, but it does require some "G.711 specific" information to be passed to it (e.g., complaw).

For the passage you quoted above, the FOLLOWING TWO SENTENCES in the draft provide a forward reference in the document to when the "channels" and "ptime" parameters are needed and referenced (Section 5.1); because we have had no need prior to that point in the draft to discuss use of ANY particular session negotiation protocol.

SDP is a dominant IETF protocol for media negotiation; but even RFC 3551 mentions H.245 and the fact that other mapping methods are possible (including "no negotiation" methods). Indeed, the "in-the-middle" use case described in this email (and at earlier IETF Payload meetings) may or may not have any a priori negotiation of PT at all within an administrative domain (e.g., the G.711.0 PT may be a network configured parameter specific to a company network).

>>Proposed Action: The discussion of ptime (and the channels parameter) in this section is primarily for the purpose of a check. If it is any comfort, that paragraph has had lots of input to it previously (so you responded to a complicated issue). And since we have no need to describe "ptime issues" or session negotiation issues prior to this point (Section 4.2.4) in the document AND ptime isn't a required negotiation parameter AND we put a forward reference to Section 5.1 for  "ptime" when SDP is used, I hesitate to mention such an optional parameter here in Section 3.
>>Proposed action: No Change (the forward references are enough).

\end {Reply to [C]}

[D] Backwards compatibility.

The problem here is that it's not clear that negotiation (e.g., via SDP) is required.  This sentence in Section 3.1 is a particular problem:

   G.711.0, being both lossless and stateless, may also be employed as a
   lossless compression mechanism anywhere between end systems which
   have negotiated use of G.711.

That's definitely wrong.  Use of G.711.0 when only G.711 has been negotiated will fail to interoperate correctly.

A subsection of section 3 on negotiation and SDP usage would help here.

This major issue [D] is independent of the relationship between this draft and the G.711.0 spec.

\begin {Reply to [D]}

The passage you quote is in Section 3 ("General Information and Use of ITU-T G.711.0 Codec") and is: 1) prior to ANY discussion of the use of G.711.0 in RTP (or even packet networks), and 2) prior to any discussion of media negotiation when using RTP (e.g., SDP). Thus the context for this sentence is at the codec bit stream (or packet payload) level of the ITU-T codec. It stands on its own and is definitely correct.

When the compression of a G.711 payload to a G.711.0 payload occurs somewhere on the end-to-end path, and the corresponding decompression from a G.711.0 payload to a G.711 payload occurs prior to the receiving endpoint, the receiving endpoint doesn't know that the (lossless) compression occurred on the PAYLOAD (the context in this section). As mentioned previously, this is possible in many arrangements (in RTP) where the compression and decompression endpoints know the PT corresponding to G.711.0 use within their administrative domain (a reserved or not-used-in-their-domain PT) and desire to do this.

That is the beauty of lossless compression - the receiving endpoint doesn't know (or need to know) that payload compression occurred. To imply otherwise is to dismiss lossless compression schemes (e.g., CRTP, ECRTP, ROHC) that losslessly compress and decompress arbitrary parts of packets (in the case of CRTP/ECRTP/ROHC, the headers) in between the endpoints without the endpoints' explicit knowledge of the compression.

Please note that this property isn't possible with lossy *CODECS*, as the transcode will typically introduce some distortion which would be unknown to the receiving endpoint but nevertheless present. This is one of the many subtleties that people reading about G.711.0 have when considering it as if it were a (lossy) codec - they ASSUME that G.711.0 is a TRANSCODE and not the lossless, STATELESS compression of the MEDIA PAYLOAD that it is.

Again, we had working group agreement (I think in Quebec) that a use-case document could follow this G.711.0 RTP payload format document to describe how to do the mapping in RTP for these "compression-in-the-middle" cases. High level summary is that you copy the G.711 RTP header verbatim into the G.711.0 RTP header except for the PT. I have a draft on the use case document which I let expire until this RTP payload definition is finished.
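
To make the "copy the header verbatim except for the PT" summary concrete, here is a rough sketch. It is not taken from the use-case draft: the function name is invented, the header layout is the fixed RTP header of RFC 3550, and the sketch refuses packets with the RTP padding bit set because swapping the payload would invalidate the padding count.

```python
def rewrite_payload_type(rtp_packet: bytes, new_pt: int, new_payload: bytes) -> bytes:
    """Copy the RTP header verbatim, change only the PT, attach a new payload.

    Illustrative helper (hypothetical name). Handles CSRC lists and a header
    extension; rejects packets that use RTP-level padding.
    """
    if not 0 <= new_pt <= 127:
        raise ValueError("PT is a 7-bit field")
    if len(rtp_packet) < 12:
        raise ValueError("shorter than the fixed RTP header")
    if rtp_packet[0] & 0x20:
        raise ValueError("RTP padding bit set; not handled by this sketch")

    csrc_count = rtp_packet[0] & 0x0F
    header_len = 12 + 4 * csrc_count
    if rtp_packet[0] & 0x10:  # X bit: a header extension follows the CSRC list
        if len(rtp_packet) < header_len + 4:
            raise ValueError("truncated header extension")
        ext_words = int.from_bytes(rtp_packet[header_len + 2:header_len + 4], "big")
        header_len += 4 + 4 * ext_words
    if len(rtp_packet) < header_len:
        raise ValueError("truncated RTP header")

    header = bytearray(rtp_packet[:header_len])
    header[1] = (header[1] & 0x80) | new_pt  # keep the marker bit, replace the PT
    return bytes(header) + new_payload
```

In a compression-in-the-middle arrangement the compressor would call this with the PT its administrative domain uses for G.711.0 (for example a dynamic value such as 96, which is purely an assumed value here) and the decompressor would call it again with the original G.711 PT.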

>>Proposed Action: No Change. We have more than enough words in the document to describe all the attributes of G.711.0 (Section 3.2) in this section of the document that discusses properties of the >>ITU-T specification<<.

\end {Reply to [D]}

-- Minor issues (3):

[E] Section 4.1:

   The only significant difference is that the
   payload type (PT) RTP header field will have a value corresponding to
   the dynamic payload type assigned to the flow.  This is in contrast
   to most current uses of G.711 which typically use the static payload
   assignment of PT = 0 (PCMU) or PT = 8 (PCMA) [RFC3551] even though
   the negotiation and use of dynamic payload types is allowed for
   G.711.
 
I would change "will have" to "MUST have" and add the following sentence:

   The existing G.711 PT values of 0 and 8 MUST NOT be used for G.711.0
   content.

I suspect that this is obvious to the authors, but it'll help a reader who's not familiar with the importance of the difference between G.711 and G.711.0.

\begin {Reply to [E]}

>>Proposed Action: Happy to fix both (for the reasons given). However, please read my reply to [F] below, I believe the rules actually allow PT = [0|8] in a specific corner case (result is: MUST NOT->SHOULD NOT in your suggestion).

\end {Reply to [E]}

[F] Section 4.1:

      PT - The assignment of an RTP payload type for the format defined
      in this memo is outside the scope of this document.  The RTP
      profiles in use currently mandate binding the payload type
      dynamically for this payload format.

Good start, but not sufficient - cite the "RTP profiles currently in use" and I would expect those citations to be normative references.

Would that be just RFC 3551 and RFC 4585 (both are already normative references), or are there more RTP profiles?

\begin {Reply to [F]}

I think that wording was suggested somewhere along the way, but I can't remember who provided it. It is boilerplate on many RTP payload formats, but others (such as recent RFC 7310) are as simple as " PT - A dynamic payload type; MUST be used" (which appears to be incorrect use of the semicolon, but I digress). In any event, major edits of the first paragraph of 4.1 were made to include the possibility of G.711 not having PT = 0 or PT =8 for exceptional cases (so not even static payload types can be automatically assumed).

According to IANA (http://www.iana.org/assignments/rtp-parameters/rtp-parameters.xhtml#rtp-parameters-2 ) and RFC 3551, the FINAL set of static payload assignments is contained in Table 4 and 5 of RFC 3551.

And, according to RFC 3551, the PT assigned (for a new codec not having a static type) chosen SHOULD first attempt to use a dynamic PT - but there are exceptions cited (e.g., dynamic PT exhaustion). Even codecs that have a static PT assigned MAY negotiate a different PT (e.g., a dynamic PT). And new codecs (after exhaustion of dynamic and other types) MAY actually use a static PT not presently in use (at least I recall someone stated so in a meeting).  So it appears there are a lot of exception cases that preclude knowing (with 100% certainty) any particular PT mapping.

And, according to RFC 3551, dynamic payload types SHOULD NOT be used without a well-defined mechanism to indicate the mapping - SDP or ITU-T H.323/H.245 negotiation or other pre-arrangement are cited (e.g., PT defined within a certain scope or administrative domain) - and a well-defined RTP payload format (this draft).

Thus, not much can be said about the assignment other than what was stated. I could put (yet another) RFC 3551 reference in this paragraph but it would provide no more guidance than already provided a few paragraphs earlier (which references RFC 3551). At a minimum I think I should say that PT of 0 and 8 SHOULD NOT be used for G.711.0.

Re: "PTs currently in use". It is hard to differentiate the profiles "currently IANA registered" and those "currently in use". That is, what is the definition of "currently in use" when you don't have insight into the registered-but-not-in-use profiles (e.g., historic codecs).

>>Proposed Action: I think we should both defer to the Payload WG chairs on this - as they can be expected to know all the exceptions AND the present state of verbiage that goes on "PT -" line of an IANA media registration coming from the Payload WG. Ali and Roni: Please suggest alternate text if you desire, I will accommodate; otherwise I will leave it as is.

\end {Reply to [F]}

[G] Framing errors

Section 4 generally assumes that the G.711.0 decoder gets handed frames generated by the G.711.0 encoder and can't get disaligned.  I'm not convinced that this "just works" based on the text in the draft - major issue [B] is a significant reason why, and explaining that should help.

Some discussion should be added on why the G.711.0 decoder can't get disaligned wrt frame boundaries, or on what the G.711.0 decoder will do when it discovers that it wasn't handed a complete G.711.0 frame.  For example, this error case and how to deal with it are not covered by the algorithm in Section 4.2.3.

\begin {Reply to [G]}

The actual buffer handling to/from the G.711.0 encoding/decoding logic is pretty straightforward, so I really doubt that an encoder that has been exercised sufficiently would pass the G.711.0 frame(s) to the RTP payload incorrectly, or the converse.

However, you are correct in that we should always specify what happens when things don't work as expected. Thanks for the catch.

Consistent with an "error condition catch" Richard Barnes made in 4.2.4 - we do have some information for when an encoder and/or decoder error resulted in an unexpected number of G.711 decoded symbols.

Assuming ptime was signaled, we expect the number of G.711 decoded symbols to equal what we expect from the ptime value at the receiver/decoder. If it doesn't then "we SHOULD discard the packet".

[Aside: We discussed the SHOULD vs MUST on the decoder, the SHOULD won. This is because a given system design might temporarily send a packet inconsistent with the ptime previously signaled but which is structurally correct (has the correct decoded G.711). Such a system might not desire to discard such a packet (as it might appear otherwise correct in the number of samples decoded). However, lacking such a design the usual operational choice is to discard the packet. Thus a SHOULD.]

For the encoder, the length of the G.711.0 RTP payload - excluding padding - should never be greater than the number of input G.711 symbols plus the number of G.711.0 frames (as a given G.711.0 frame can be no greater than one octet more than the number of source symbols). If the number of frames is known to the RTP layer (it may not be) and this constraint is not met, the source packet MUST be discarded.

[Aside: We did NOT discuss the SHOULD vs MAY on the encoder. In my opinion, the MUST is more appropriate - because if the constraint is violated, you KNOW something is wrong.]
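
The encoder-side constraint can be written as a one-line check. This is only a sketch of the bound stated above (each G.711.0 frame is at most one octet longer than the symbols it encodes); the function name is invented, and it assumes the RTP layer knows the individual frame lengths, which, as noted, it may not.

```python
from typing import Sequence


def encoder_output_plausible(num_input_symbols: int,
                             frame_octet_lengths: Sequence[int]) -> bool:
    """True if the unpadded payload length respects the worst-case bound.

    frame_octet_lengths holds the size in octets of each G.711.0 frame placed
    in the payload, excluding any 0x00 padding.
    """
    payload_octets = sum(frame_octet_lengths)
    return payload_octets <= num_input_symbols + len(frame_octet_lengths)
```

For 160 input symbols, a single 161-octet frame or two 81-octet frames sit exactly on the bound; a single 163-octet frame violates it and would indicate an encoder error.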

>>Proposed Action: Add two sentences similar to the above to the end of Section 4.2.2+ (proposed earlier new section on Encoding Process) and Section 4.2.3 (Decoding Process).

\end {Reply to [G]}

-- Nits/editorial comments:

Section 3.2:

   A6  Bounded expansion: Since attribute A2 above requires G.711.0 to
         be lossless for any payload, by definition there exists at
         least one potential G.711 payload which must be
         "uncompressible".

The "by definition" statement assumes that every possible bit string is a valid G.711 input.  If that is correct, it should be explicitly stated.

\begin{nit}

Yes, because Attribute A2, referenced within the quoted sentence, says as much.

Each of the 2^8 values of a G.711 symbol corresponds to a discrete value. There is no restriction on a sample-to-sample (octet to next octet) basis assumed in the G.711 encoding (no "illegal transitions"). Lastly, some "DS0 channels" assume that all the bits can be used for arbitrary digital data (so-called ISDN 64kbps B-channel). Thus it is widely known that if an input can take ANY value of ANY possible concatenation of octets, there is no redundancy that a deterministic compressor can exploit for all possible inputs - there must exist at least one payload that is not compressible.

This is an assertion from the G.711.0 ITU-T document. Anyone who cares to verify it can go to the ITU-T, look up G.711 and instantly see that all the values are "assigned" and there are no illegal transitions specified; thus there is no redundancy to be exploited. I hesitate to insult my readers by giving them any more detail than Attribute A2 says.
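
For anyone who does want the counting spelled out, here is a small sketch of the standard pigeonhole argument. It is not taken from the G.711.0 text, and the function names are hypothetical. It counts every octet string shorter than n octets (including the empty string, which is generous to the compressor) and compares that with the number of possible n-octet inputs.

```python
def count_strings_shorter_than(n_octets, alphabet_size=256):
    """Count all octet strings of length 0 .. n_octets - 1."""
    return sum(alphabet_size ** k for k in range(n_octets))


def some_input_cannot_shrink(n_octets, alphabet_size=256):
    """Return True when there are fewer shorter strings than n-octet inputs.

    A lossless encoder must map distinct inputs to distinct outputs, so if
    this holds, at least one n-octet input cannot map to a shorter output.
    """
    inputs = alphabet_size ** n_octets
    return count_strings_shorter_than(n_octets, alphabet_size) < inputs
```

Because the count of shorter strings is (256^n - 1)/255, which is always below 256^n, the check holds for every n of one octet or more.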

>>Proposed Change: None needed. However if you really feel strongly on this, I could agree to something like the following ... or anything of your choosing that reads better and is accurate. Let me know what you want.

   A6  Bounded expansion: Since attribute A2 above requires G.711.0 to
         be lossless for any payload (which could consist of any concatenation
         of octets each octet spanning the entire space of 2^8 values), by definition
         there exists at least one potential G.711 payload which must be
         "uncompressible".

\end {nit}

   A8  Low Complexity: Less than 1.0 WMOPS average and low memory
         footprint (~5k octets RAM, ~5.7k octets ROM and ~3.6 basic
         operations) [ICASSP] [G.711.0].

Expand WMOPS on first use, and check for other acronyms that need to be expanded on first use.

\begin{nit}

Note: The references define what a WMOPS is.

>>Recommended Action: Since this is the only use of WMOPS, I will expand it there (Weighted Million Operations Per Second) and skip the abbreviation entirely.

RAM and ROM are the only other acronyms left unexpanded. I trust that these don't qualify as "needed", as not even the ITU-T document expands them.

>>Recommended Action: No change to RAM and ROM. It is a reasonable expectation that anyone reading this document will know those two based on context.

\end {nit}

Section 3.3:

   Since the G.711.0 output frame is "self-describing", a G.711.0
   decoder (process "B") can losslessly reproduce the original G.711
   input frame with only the knowledge of which companding law was used
   (A-law or mu-law).

"companding law"?  The term "compression law" is used elsewhere in this draft, including two paragraphs earlier in this section - I suggest using "compression law" consistently.

\begin{nit}

Good catch.

The law that both forms of G.711 use (mu or A) is that of an input-to-output compander (http://en.wikipedia.org/wiki/Companding ), where the output format is discretized.

I will change the one use of "compression law" to "companding law" in Section 3.3 (since G.711 is a companding, sample-based codec).
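
To illustrate what "companding" means here, the following is a rough sketch of the continuous mu-law curve with mu = 255. It is an illustration only: G.711 itself uses a segmented (piecewise-linear) approximation with 8-bit discretized output, which this sketch does not reproduce, and the function names are hypothetical.

```python
import math

MU = 255.0  # mu-law parameter used for the continuous curve


def mu_law_compress(x, mu=MU):
    """Map a normalized sample x in [-1, 1] to [-1, 1], boosting small values."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress for y in [-1, 1]."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)
```

The compressor stretches low-amplitude samples and squeezes high-amplitude ones, and the expander undoes that mapping, hence "compress" plus "expand".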

\end {nit}

Section 6:

   We note that something must be stored for any G.711.0 frames that not
   received at the receiving endpoint, no matter what the cause.

"that not" -> "that are not"

\begin{nit}
Thanks. Will do.
\end {nit}

Section 6.2:

   An entire frame of value 0++ or 0-- is expected to be
   extraordinarily rare when the frame was in fact generated by a
   natural signal (on the order of one in 2^{ptime in samples, minus
   one}), as analog inputs such as speech and music are zero-mean and
   are typically acoustically coupled to digital sampling systems.

This doesn't explain where the 2^{ptime in samples, minus one} order of magnitude estimation came from.  What assumption(s) is(are) being made about randomness and distribution thereof in the analog input?
It might be simpler to delete the parenthesized text.

\begin{nit}
Agreed. Consider the parenthetical deleted.
\end {nit}

Section 11: Congestion Control

This section is mis-named, as it basically (correctly) says that there is nothing useful that can be done in G.711.0 compression to respond to congestion.  I would retitle this to "Congestion Considerations".

\begin{nit}
I would, but the requirements for new RTP payload formats say that there MUST be a section named "Congestion Control" in all newly approved RTP Payload formats!

You are, of course, correct - as the text in this section basically says there is no explicit way to regulate the bit-rate for the purposes of congestion control.
\end {nit}

Are there opportunities to respond to congestion elsewhere, e.g.
dynamically change the sampling rate?  If so, a sentence mentioning them would be good to add.

\begin{nit}
I know of no use of G.711 that changes the sampling frequency from the default - although that is allowed in the SDP (as G.711 is a sample-based codec). The 8000 samples per second is hard-coded in many voice implementations.

Since the whole purpose of G.711.0 is to send G.711 losslessly with lower bandwidth, the use of G.711.0 could be triggered by G.711 negotiated sessions looking for a lower bandwidth solution. Although we could mention this (obvious) fact, the guidelines for this section instruct me to discuss things that can be done with the "codec" this payload format describes for the purposes of congestion control. This is yet another sign that the new RTP guidelines did not anticipate a lossless and stateless compression technique being defined for RTP. We broke a lot of new ground here, thanks for wading through it!

>>Proposed Action: None. I would not have this section in the document except that the new rules for RTP Payload definitions mandate such a section exist.
\end {nit}

idnits 2.13.01 didn't find anything to complain about ;-).

--- Selected RFC 5706 Appendix A Q&A for OPS-Dir review ---

Most of these questions are N/A as this draft specifies a payload format for RTP, so most of the operations and management concerns are wrt RTP and SDP.

A.1.3.  Has the migration path been discussed?

No, see major issue [D] above.

A.1.4   Have the Requirements on other protocols and functional
       components been discussed?

Only in part - major issues [C] and [D] call out shortcomings in the discussion of SDP interactions.

A.1.8   Are there fault or threshold conditions that should be reported?

Yes, the likelihood and consequences of framing problems at the G.711.0 decoder (decoder is handed octet strings that are not G.711.0 frames generated by the encoder) should be discussed.  Major issue [B] needs to be resolved first, and then see minor issue [G].

A.2.  Management Considerations

I would expect that the media type registration (Section 5.1 of this draft) results in this new G.711.0 media type being usable in any relevant management model and/or framework that has some notion of media type.

A.3 Documentation

By itself, this compressed payload format does not look like a likely source of significant operational impacts on the Internet.

The shepherd's writeup indicates that an implementation exists.

Thanks,
--David
----------------------------------------------------
David L. Black, Distinguished Engineer
EMC Corporation, 176 South St., Hopkinton, MA  01748
+1 (508) 293-7953             FAX: +1 (508) 293-7786
david.black@emc.com        Mobile: +1 (978) 394-7754
----------------------------------------------------