Re: [AVTCORE] Benjamin Kaduk's Discuss on draft-ietf-payload-tsvcis-03: (with DISCUSS and COMMENT)

<victor.demjanenko@vocal.com> Thu, 24 October 2019 18:18 UTC

From: victor.demjanenko@vocal.com
To: "'Roni Even (A)'" <roni.even@huawei.com>, 'Benjamin Kaduk' <kaduk@mit.edu>, 'The IESG' <iesg@ietf.org>, 'Catherine Meadows' <catherine.meadows@nrl.navy.mil>, secdir@ietf.org
Cc: draft-ietf-payload-tsvcis@ietf.org, 'Ali Begen' <ali.begen@networked.media>, avtcore-chairs@ietf.org, avt@ietf.org, "'Dave Satterlee (Vocal)'" <Dave.Satterlee@vocal.com>, ietf@ietf.org, avt@ietf.org, draft-ietf-payload-tsvcis.all@ietf.org, "'Victor Demjanenko, Ph.D.'" <victor.demjanenko@vocal.com>
References: <157007038502.8860.1558861534319247512.idtracker@ietfa.amsl.com> <001601d57af9$405efcf0$c11cf6d0$@vocal.com> <6E58094ECC8D8344914996DAD28F1CCD23D79BC0@DGGEMM506-MBX.china.huawei.com> <034a01d58a73$f4d3a1c0$de7ae540$@vocal.com>
In-Reply-To: <034a01d58a73$f4d3a1c0$de7ae540$@vocal.com>
Date: Thu, 24 Oct 2019 13:42:45 -0400
Message-ID: <037e01d58a92$72287510$56795f30$@vocal.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/avt/YlrFexRNTAoJ2qCp0sveupBkx5o>
Subject: Re: [AVTCORE] Benjamin Kaduk's Discuss on draft-ietf-payload-tsvcis-03: (with DISCUSS and COMMENT)

I forgot to address the security comments in my earlier email.  The changes are:

Section 8, second paragraph - Suggested edit by reviewer

(was)
   This RTP payload format and the TSVCIS decoder do not exhibit any
   significant non-uniformity in the receiver-side computational
   complexity for packet processing and thus are unlikely to pose a
   denial-of-service threat due to the receipt of pathological data.
   Additionally, the RTP payload format does not contain any active
   content.  

(now)
   This RTP payload format and the TSVCIS decoder, to the best of our
   knowledge, do not exhibit any significant non-uniformity in the
   receiver-side computational complexity for packet processing and thus
   are unlikely to pose a denial-of-service threat due to the receipt of
   pathological data. Additionally, the RTP payload format does not
   contain any active content.  


Section 8, third paragraph - Suggested edit by reviewer

(was)
   Please see the security considerations discussed in [RFC6562]
   regarding VAD and its effect on bitrates.

(now)
   Please see the security considerations discussed in [RFC6562]
   regarding Voice Activity Detection (VAD) and its effect on bitrates.

Victor

-----Original Message-----
From: victor.demjanenko@vocal.com <victor.demjanenko@vocal.com> 
Sent: Thursday, October 24, 2019 10:05 AM
To: 'Roni Even (A)' <roni.even@huawei.com>; 'Benjamin Kaduk' <kaduk@mit.edu>; 'The IESG' <iesg@ietf.org>
Cc: draft-ietf-payload-tsvcis@ietf.org; 'Ali Begen' <ali.begen@networked.media>; avtcore-chairs@ietf.org; avt@ietf.org; 'Dave Satterlee (Vocal)' <Dave.Satterlee@vocal.com>
Subject: RE: Benjamin Kaduk's Discuss on draft-ietf-payload-tsvcis-03: (with DISCUSS and COMMENT)

Hi Everyone,

First we want to thank everyone for their review and comments for this draft RFC.  We believe we reviewed all the comments and suggestions and incorporated them adequately in the next draft (04).  We'd like to send out this list of exact changes in case anyone has additional comments or thinks the clarifications are inadequate.  We would be most happy to address concerns before publishing draft 04 tomorrow.

With so many emails from a half dozen or more reviewers, we apologize that we cannot address each sender individually.  We hope this detail is sufficient for everyone.

Again, many thanks to all.

Victor & Dave

----------------------------------------------------------------------------------------------

Section 1.1 - Suggested reference to RFC 8088 added.

(was)
   Best current practices for writing an RTP payload format
   specification were followed [RFC2736].

(now)
   Best current practices for writing an RTP payload format
   specification were followed [RFC2736] [RFC8088].


Section 2, paragraphs 3 and 4 - Suggested edits by reviewers

(was)
   In addition to the augmented speech data, the TSVCIS specification
   identifies which speech coder and framing bits are to be encrypted,
   and how they are protected by forward error correction (FEC)
   techniques (using block codes).  At the RTP transport layer, only the
   speech coder related bits need to be considered and are conveyed in
   unencrypted form.  In most IP-based network deployments, standard
   link encryption methods (SRTP, VPNs, FIPS 140 link encryptors or Type
   1 Ethernet encryptors) would be used to secure the RTP speech
   contents.  Further, it is desirable to support the highest voice
   quality between endpoints which is only possible without the overhead
   of FEC.

   TSVCIS augmented speech data is derived from the signal processing
   and data already performed by the MELPe speech coder.  For the
   purposes of this specification, only the general parameter nature of
   TSVCIS will be characterized.  Depending on the bandwidth available
   (and FEC requirements), a varying number of TSVCIS specific speech
   coder parameters need to be transported.  These are first byte-packed
   and then conveyed from encoder to decoder.

(now)
   In addition to the augmented speech data, the TSVCIS specification
   identifies which speech coder and framing bits are to be encrypted,
   and how they are protected by forward error correction (FEC)
   techniques (using block codes).  At the RTP transport layer, only the
   speech-coder-related bits need to be considered and are conveyed in
   unencrypted form.  In most IP-based network deployments, standard
   link encryption methods (SRTP, VPNs, FIPS 140 link encryptors or Type
   1 Ethernet encryptors) would be used to secure the RTP speech
   contents.

   TSVCIS augmented speech data is derived from the signal processing
   and data already performed by the MELPe speech coder.  For the
   purposes of this specification, only the general parameter nature of
   TSVCIS will be characterized.  Depending on the bandwidth available
   (and FEC requirements), a varying number of TSVCIS-specific speech
   coder parameters need to be transported.  These are first byte-packed
   and then conveyed from encoder to decoder.


Section 3, last sentence paragraph 3 - Suggested edit by reviewer

(was)
   When more than one codec data frame is
   present in a single RTP packet, the timestamp is, as always, that of
   the oldest data frame represented in the RTP packet.

(now)
   When more than one codec data frame is
   present in a single RTP packet, the timestamp specified is that of
   the oldest data frame represented in the RTP packet.


Section 3.1, last paragraph - Clarified permission for MELP 600 end-to-end framing bit

(was)
   It should be noted that CODB for both the 2400 and 600 bps modes MAY
   deviate from the values in Table 1 when bit 55 is used as an end-to-
   end framing bit.  Frame decoding would remain distinct as CODA being
   zero on its own would indicate a 7-byte frame for either rate and the
   use of 600 bps speech coding could be deduced from the RTP timestamp
   (and anticipated by the SDP negotiations).

(now)
   It should be noted that CODB for MELPe 600 bps mode MAY deviate from
   the value in Table 1 when bit 55 is used as an end-to-end framing
   bit. Frame decoding would remain distinct as CODA being zero on its
   own would indicate a 7-byte frame for either 2400 or 600 bps rate and
   the use of 600 bps speech coding could be deduced from the RTP
   timestamp (and anticipated by the SDP negotiations).


Section 3.2, first paragraph - Clarifications requested by reviewers

(was)
   The TSVCIS augmented speech data as packed parameters MUST be placed
   immediately after a corresponding MELPe 2400 bps payload in the same
   RTP packet.  The packed parameters are counted in octets (TC).  In
   the preferred placement, shown in Figure 6, a single trailing octet
   SHALL be appended to include a two-bit rate code, CODA and CODB,
   (both bits set to one) and a six-bit modified count (MTC).  The
   special modified count value of all ones (representing a MTC value of
   63) SHALL NOT be used for this format as it is used as the indicator
   for the alternate packing format shown next.  In a standard
   implementation, the TSVCIS speech coder uses a minimum of 15 octets
   for parameters in octet packed form.  The modified count (MTC) MUST
   be reduced by 15 from the full octet count (TC).  Computed MTC = TC-
   15.  This accommodates a maximum of 77 parameter octets (maximum
   value of MTC is 62, 77 is the sum of 62+15).  

(now)
   The TSVCIS augmented speech data as packed parameters MUST be placed
   immediately after a corresponding MELPe 2400 bps payload in the same
   RTP packet.  The packed parameters are counted in octets (TC).  The
   preferred placement, shown in Figure 6, SHOULD be used for TSVCIS
   payloads with TC less than or equal to 77 octets.  In the preferred
   placement, a single trailing octet SHALL be appended to include a
   two-bit rate code, CODA and CODB, (both bits set to one) and a six-
   bit modified count (MTC).  The special modified count value of all
   ones (representing a MTC value of 63) SHALL NOT be used for this
   format as it is used as the indicator for the alternate packing
   format shown next.  In a standard implementation, the TSVCIS speech
   coder uses a minimum of 15 octets for parameters in octet packed
   form.  The modified count (MTC) MUST be reduced by 15 from the full
   octet count (TC).  Computed MTC = TC-15.  This accommodates a maximum
   of 77 parameter octets (maximum value of MTC is 62, 77 is the sum of
   62+15).
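
As a rough illustration of the count arithmetic above, the mapping between TC and MTC in the preferred placement could be sketched as follows.  The function and constant names are hypothetical and not taken from the draft, and the sketch deliberately says nothing about where the bits sit within the trailing octet.

```python
MIN_TC = 15          # minimum packed-parameter octets in a standard implementation
MTC_ALTERNATE = 63   # all-ones MTC is reserved to signal the alternate packing format
MAX_TC = MIN_TC + (MTC_ALTERNATE - 1)  # 62 + 15 = 77


def mtc_from_tc(tc: int) -> int:
    """Return the six-bit modified count (MTC) for an octet count TC."""
    if not MIN_TC <= tc <= MAX_TC:
        raise ValueError(f"TC={tc} does not fit the preferred placement")
    return tc - MIN_TC


def tc_from_mtc(mtc: int) -> int:
    """Recover the octet count TC from a received MTC value."""
    if not 0 <= mtc < MTC_ALTERNATE:
        raise ValueError(f"MTC={mtc} is not valid for the preferred placement")
    return mtc + MIN_TC
```

A sender with TC above 77 would fall outside this mapping and need the alternate format, which is not sketched here.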


Section 3.3, first paragraph - Suggested edit by reviewer

(was)
   A TSVCIS RTP packet consists of zero or more TSVCIS coder frames
   (each consisting of MELPe and TSVCIS coder data) followed by zero or
   one MELPe comfort noise frame.  The presence of a comfort noise frame
   can be determined by its rate code bits in its last octet.

(now)
   A TSVCIS RTP packet payload consists of zero or more consecutive
   TSVCIS coder frames (each consisting of MELPe 2400 and TSVCIS coder
   data), with the oldest frame first, followed by zero or one MELPe
   comfort noise frame.  The presence of a comfort noise frame can be
   determined by its rate code bits in its last octet.


Section 3.3, fourth paragraph - Clarification requested by reviewers

(was)
   TSVCIS coder frames in a single RTP packet MAY be of different coder
   bitrates.  With the exception for the variable length TSVCIS
   parameter frames, the coder rate bits in the trailing byte identify
   the contents and length as per Table 1.

(now)
   TSVCIS coder frames in a single RTP packet MAY have varying TSVCIS
   parameter octet counts.  Each frame's packed parameter octet count
   (length) is indicated in its trailing byte(s).  All MELPe frames in a
   single RTP packet MUST be of the same coder bitrate.  For all MELPe
   coder frames, the coder rate bits in the trailing byte identify the
   contents and length as per Table 1.


Section 4.1 - Editor note removed


Section 4.1 - Change controller is now

(now)
   Change controller: IETF, contact <avt@ietf.org>


Section 5, first paragraph - Suggested edits by reviewers

(was)
   A primary application of TSVCIS is for radio communications of voice
   conversations, and discontinuous transmissions are normal.  When
   TSVCIS is used in an IP network, TSVCIS RTP packet transmissions may
   cease and resume frequently.  RTP synchronization source (SSRC)
   sequence number gaps indicate lost packets to be filled by PLC, while
   abrupt loss of RTP packets indicates intended discontinuous
   transmissions.

(now)
   A primary application of TSVCIS is for radio communications of voice
   conversations, and discontinuous transmissions are normal.  When
   TSVCIS is used in an IP network, TSVCIS RTP packet transmissions may
   cease and resume frequently.  RTP synchronization source (SSRC)
   sequence number gaps indicate lost packets to be filled by Packet
   Loss Concealment (PLC), while abrupt loss of RTP packets indicates
   intended discontinuous transmissions.  Resumption of voice
   transmission SHOULD be indicated by the RTP marker bit (M) set to 1.
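
One way a receiver might act on the paragraph above is sketched below.  This is only an illustration under simplifying assumptions: packets are assumed to arrive in order, and the function name and return shape are hypothetical.

```python
SEQ_MOD = 1 << 16  # RTP sequence numbers are 16 bits wide and wrap around


def inspect_packet(prev_seq: int, seq: int, marker: bool) -> tuple[bool, int]:
    """Return (talkspurt_resumed, packets_lost) for a newly received packet.

    packets_lost is the size of the sequence number gap, i.e. how many
    packets a PLC stage would need to conceal.  A pause with no gap is
    treated as intended discontinuous transmission; the marker bit then
    flags the resumption.  Assumes in-order arrival (no reordering).
    """
    lost = (seq - prev_seq - 1) % SEQ_MOD
    return marker, lost
```

For example, `inspect_packet(10, 13, False)` reports two lost packets, while `inspect_packet(10, 11, True)` reports a resumed talkspurt with nothing to conceal.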


Section 10 - Added reference

(added)
   [RFC8088]  Westerlund, M., "How to Write an RTP Payload Format",
              RFC 8088, DOI 10.17487/RFC8088, May 2017, 
              <http://www.rfc-editor.org/info/rfc8088>.

-------------------------------------------------------------------------------------------------


-----Original Message-----
From: Roni Even (A) <roni.even@huawei.com> 
Sent: Sunday, October 6, 2019 2:09 AM
To: victor.demjanenko@vocal.com; 'Benjamin Kaduk' <kaduk@mit.edu>; 'The IESG' <iesg@ietf.org>
Cc: draft-ietf-payload-tsvcis@ietf.org; 'Ali Begen' <ali.begen@networked.media>; avtcore-chairs@ietf.org; avt@ietf.org; 'Dave Satterlee (Vocal)' <Dave.Satterlee@vocal.com>
Subject: RE: Benjamin Kaduk's Discuss on draft-ietf-payload-tsvcis-03: (with DISCUSS and COMMENT)

Hi,
About the reference to TSVCIS.
The RTP payload is about how to encapsulate the payload in an RTP packet. The objective is to define how an RTP stack can insert the TSVCIS frames into, and extract them from, the RTP packet. Typically it is not required to understand the payload structure in order to be able to perform the encapsulation.
This is why the reference to the payload is Informational and we did not require it to be publicly available.  If there is a need to understand the payload itself for the encapsulation, then we need more information in the RTP payload specification and a publicly available normative reference. I think this is not the case here.

Roni Even 

AVTCore co-chair (ex Payload)

-----Original Message-----
From: victor.demjanenko@vocal.com [mailto:victor.demjanenko@vocal.com] 
Sent: Saturday, October 05, 2019 12:18 AM
To: 'Benjamin Kaduk'; 'The IESG'
Cc: draft-ietf-payload-tsvcis@ietf.org; 'Ali Begen'; avtcore-chairs@ietf.org; avt@ietf.org; 'Victor Demjanenko, Ph.D.'; 'Dave Satterlee (Vocal)'
Subject: RE: Benjamin Kaduk's Discuss on draft-ietf-payload-tsvcis-03: (with DISCUSS and COMMENT)

Everyone,

Thanks for the comments.  I think I misunderstood the ambiguity with respect to changing rates within an RTP packet.  That was not the plan.  An RTP packet must have MELP speech frames of the same rate.  What is possible is that the amount of augmented TSVCIS speech data may vary from one speech frame to the next.  This allows for a dynamic VDR as suggested by the NRL paper.  So an RTP packet may have varying TSVCIS data but must always have MELPe 2400 data.

Again, backwards parsing is necessary, but the timestamp uniformly increments by 22.5 ms per combined MELP/TSVCIS speech frame.
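
To make the timestamp arithmetic concrete, here is a minimal sketch.  The 8000 Hz RTP clock rate is an assumption on my part (it is not stated in this thread), and the names are hypothetical.

```python
CLOCK_RATE_HZ = 8000   # assumption: RTP clock rate, not stated in this thread
FRAME_MS = 22.5        # duration of one combined MELPe/TSVCIS speech frame
TICKS_PER_FRAME = int(CLOCK_RATE_HZ * FRAME_MS / 1000)  # 180 ticks at 8 kHz


def frame_timestamps(packet_ts: int, frame_count: int) -> list[int]:
    """Timestamps of each frame in a packet, oldest frame first.

    packet_ts is the RTP timestamp of the packet, which is that of the
    oldest frame it carries.  Timestamps are 32 bits wide and wrap.
    """
    return [(packet_ts + i * TICKS_PER_FRAME) % (1 << 32)
            for i in range(frame_count)]
```

With these assumptions, a packet carrying three frames and timestamp 1000 covers frames at 1000, 1180 and 1360.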

The NRL paper is a good public reference on the VDR aspects.  The actual TSVCIS spec we had was FOUO so we could not replicate its detail.  (I believe a later spec is public or at least partially public.  I am trying to get this.)  The opaque data is pretty obvious with the TSVCIS spec in hand.

We will address the issues/concerns raised next week.  Other business had priority.

Thank you and enjoy the weekend.

Regards,

Victor & Dave

-----Original Message-----
From: Benjamin Kaduk via Datatracker <noreply@ietf.org> 
Sent: Wednesday, October 2, 2019 10:40 PM
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-payload-tsvcis@ietf.org; Ali Begen <ali.begen@networked.media>; avtcore-chairs@ietf.org; ali.begen@networked.media; avt@ietf.org
Subject: Benjamin Kaduk's Discuss on draft-ietf-payload-tsvcis-03: (with DISCUSS and COMMENT)

Benjamin Kaduk has entered the following ballot position for
draft-ietf-payload-tsvcis-03: Discuss

When responding, please keep the subject line intact and reply to all email addresses included in the To and CC lines. (Feel free to cut this introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-payload-tsvcis/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

I support Magnus' point about the time-ordering of adjacent frames in a packet.

Additionally, I am not sure that there's quite enough here to be interoperably implementable.  Specifically, we seem to be lacking a description of how an encoder or decoder knows which TSVCIS parameters, and in what order, to byte-pack or unpack, respectively.  One might surmise that there is a canonical listing in [TSVCIS], but this document does not say that, and furthermore [TSVCIS] is only listed as an informative reference.  (I couldn't get my hands on my copy, at least on short notice.)  If we limited ourselves to treating the TSVCIS parameters as an entirely opaque blob (codec, convey these N octets to the peer with the appropriate one- or two-byte trailer for payload type identification and framing), that would be interoperably implementable, since the black-box bits are up to some other codec to interpret.

In a similar vein, we mention but do not completely specify the potential for using CODB as an end-to-end framing bit, in Section 3.1 (see Comment), which is not interoperably implementable without further details.


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

Where is [TSVCIS] available?

Is [NRLVDR] the same as
https://apps.dtic.mil/dtic/tr/fulltext/u2/a588068.pdf ?  A URL in the references would be helpful.

Is additional TSVCIS data only present after 2400bps MELPe and the first thing to get dropped under bandwidth pressure?  The abstract and introduction imply this by calling out MELPe 2400 bps speech parameters explicitly, but Section 3 says that TSVCIS augments standard 600, 1200, and 2400 bps MELP frames.

It's helpful that Section 3.3 gives some general guidance for decoding this payload type ("[t]he way to determine the number of TSVCIS/MELPe frames is to identify each frame type and length"), but I think some generic considerations would be very helpful to the reader much earlier, along the lines of "MELPe and TSVCIS data payloads are decoded from the end, using the CODA and CODB (and, if necessary, CODC and others) bits to determine the type of payload.  For MELPe payloads the type also indicates the payload length, whereas for TSVCIS data an additional length field is present, in one of two possible formats.  A TSVCIS coder frame consists of a MELPe data payload followed by zero or one TSVCIS data payload; after the TSVCIS payload's presence/length is determined, then the preceding MELPe payload can be determined and decoded.  Per Section 3.3, multiple TSVCIS frames can be present in a single RTP packet."  This (or something like it) would also serve to clarify the role of the COD* bits, which is otherwise only implicitly introduced.

Section 1.1

RFC 2736 is BCP 36 (but it's updated by RFC 8088 which is for some reason an Informational document and not part of BCP 36?!).

Section 2

   In addition to the augmented speech data, the TSVCIS specification
   identifies which speech coder and framing bits are to be encrypted,
   and how they are protected by forward error correction (FEC)
   techniques (using block codes).  At the RTP transport layer, only the
   speech coder related bits need to be considered and are conveyed in
   unencrypted form.  In most IP-based network deployments, standard

Am I reading this correctly that this text is just summarizing what's in the TSVCIS spec in terms of what needs to be in unencrypted form, so the "only the speech coder related bits[...]" is not new information from this document?  I'm not sure I agree with the conclusion, regardless -- won't the (MELPe) speech coder bits be enough to convey the semantic content of the audio stream, something that one might desire to keep confidential?

   link encryption methods (SRTP, VPNs, FIPS 140 link encryptors or Type
   1 Ethernet encryptors) would be used to secure the RTP speech
   contents.  Further, it is desirable to support the highest voice
   quality between endpoints which is only possible without the overhead
   of FEC.

I think I'm missing a step in how this conclusion was reached.

   TSVCIS will be characterized.  Depending on the bandwidth available
   (and FEC requirements), a varying number of TSVCIS specific speech
   coder parameters need to be transported.  These are first byte-packed
   and then conveyed from encoder to decoder.

Per the Discuss point, how do I know which parameters need to be transported, and in what order?

   Byte packing of TSVCIS speech data into packed parameters is
   processed as per the following example:

      Three-bit field: bits A, B, and C (A is MSB, C is LSB)
      Five-bit field: bits D, E, F, G, and H (D is MSB, H is LSB)

           MSB                                              LSB
            0      1      2      3      4      5      6      7
        +------+------+------+------+------+------+------+------+
        |   H  |   G  |   F  |   E  |   D  |   C  |   B  |   A  |
        +------+------+------+------+------+------+------+------+

   This packing method places the three-bit field "first" in the lowest
   bits followed by the next five-bit field.  Parameters may be split
   between octets with the most significant bits in the earlier octet.
   Any unfilled bits in the last octet MUST be filled with zero.

I agree with Adam that this is very unclear.  A is the MSB of the three-bit field but the LSB of the octet overall?
We probably need an example of splitting a parameter across octets as well, to get the bit ordering right.

Section 3.1

   It should be noted that CODB for both the 2400 and 600 bps modes MAY
   deviate from the values in Table 1 when bit 55 is used as an end-to-
   end framing bit.  Frame decoding would remain distinct as CODA being

Where is the use of CODB as an end-to-end framing bit defined?  If we're going to provide neither a complete description of how to do it nor a reference to a better description, we probably shouldn't mention it at all.

Section 3.2

   RTP packet.  The packed parameters are counted in octets (TC).  In
   the preferred placement, shown in Figure 6, a single trailing octet
   SHALL be appended to include a two-bit rate code, CODA and CODB,

I'd consider saying something about this being the preferred format
("placement") due to its shorter length than the alternative, and say that it "SHOULD be used for TSVCIS payloads with TC less than or equal to 77 octets".

Section 3.3

When a longer packetization interval is used, is that indicated by signaling or RTP timestamps or otherwise?

   TSVCIS coder frames in a single RTP packet MAY be of different coder
   bitrates.  With the exception for the variable length TSVCIS
   parameter frames, the coder rate bits in the trailing byte identify
   the contents and length as per Table 1.

Maybe also note that the penultimate octet gives the length there?

   Information describing the number of frames contained in an RTP
   packet is not transmitted as part of the RTP payload.  The way to
   determine the number of TSVCIS/MELPe frames is to identify each frame
   type and length thereby counting the total number of octets within
   the RTP packet.

terminology nit: if a frame is the combination of MELPe and TSVCIS payload data units then there are two layers of decoding to get a length for the frame, since we have to get the TSVCIS length and then the MELPe length.

Section 4.2

   Parameter "ptime" cannot be used for the purpose of specifying the

nit: missing article ("The parameter")

   will be impossible to distinguish which mode is about to be used
   (e.g., when ptime=68, it would be impossible to distinguish if the
   packet is carrying one frame of 67.5 ms or three frames of 22.5 ms).

So how is the operating mode determined, then?
(I think this is the same question I asked above)
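
The ambiguity in the quoted example can be reproduced with a few lines of arithmetic.  This sketch assumes ptime is rounded up to a whole number of milliseconds, which matches the value 68 in the quoted text but is otherwise my assumption; the function name is hypothetical.

```python
import math


def ptime_ms(frame_ms: float, frames_per_packet: int) -> int:
    """Packet duration as a whole-millisecond ptime value (rounded up)."""
    return math.ceil(frame_ms * frames_per_packet)


one_long_frame = ptime_ms(67.5, 1)       # one frame of 67.5 ms
three_short_frames = ptime_ms(22.5, 3)   # three frames of 22.5 ms
```

Both expressions evaluate to 68, so ptime alone cannot tell the two packetizations apart.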

Section 4.4

   For example, if offerer bitrates are "2400,600" and answer bitrates
   are "600,2400", the initial bitrate is 600.  If other bitrates are
   provided by the answerer, any common bitrate between the offer and
   answer MAY be used at any time in the future.  Activation of these
   other common bitrates is beyond the scope of this document.

It seems important to specify whether this requires a new O/A exchange or can be done "spontaneously" by just encoding different frame types.
(It seems like the latter is possible, on first glance, and this is implied by Section 3.3's discussion of mixing them in a single packet.)
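
For readers following the offer/answer example, a small sketch of the selection is below.  It assumes the initial bitrate is the first bitrate listed by the answerer that the offerer also listed, a rule inferred only from the single example quoted above; the function names are hypothetical.

```python
def parse_bitrates(value: str) -> list[int]:
    """Parse a comma-separated bitrate list such as "2400,600"."""
    return [int(item) for item in value.split(",")]


def negotiate(offer: str, answer: str) -> tuple[int, list[int]]:
    """Return (initial_bitrate, common_bitrates) in the answerer's order.

    Assumption: the answerer's ordering decides the initial bitrate.
    """
    offered = parse_bitrates(offer)
    common = [rate for rate in parse_bitrates(answer) if rate in offered]
    if not common:
        raise ValueError("offer and answer share no bitrate")
    return common[0], common
```

With the example above, `negotiate("2400,600", "600,2400")` picks 600 as the initial bitrate and leaves 2400 as a common bitrate that may be used later.  How a switch to it is activated is exactly the open question raised here.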

Section 5

Please expand PLC at first use (not second).

Section 6

I don't understand the PLC usage.  Is the idea that a receiver, on seeing an SSRC gap, constructs fictitious PLC frames to "fill the gap"
and passes the resulting stream to the decoder?

Section 8

   and important considerations in [RFC7201].  Applications SHOULD use
   one or more appropriate strong security mechanisms.  The rest of this
   section discusses the security-impacting properties of the payload
   format itself.

I thought we described TSVCIS itself (much earlier in the document) as requiring encryption for some data; wouldn't that translate to a "MUST"
here and not a "SHOULD"?