Re: [payload] Progressing draft-ramalho-payload-g7110-00

Noboru Harada <harada.noboru@lab.ntt.co.jp> Tue, 16 August 2011 16:10 UTC

Date: Wed, 17 Aug 2011 01:10:50 +0900
From: Noboru Harada <harada.noboru@lab.ntt.co.jp>
To: "Michael Ramalho (mramalho)" <mramalho@cisco.com>
In-Reply-To: <999109E6BC528947A871CDEB5EB908A0046EE5BB@XMB-RCD-209.cisco.com>
References: <20110814104721.2013.24F8F98F@lab.ntt.co.jp> <999109E6BC528947A871CDEB5EB908A0046EE5BB@XMB-RCD-209.cisco.com>
Message-Id: <20110817011050.DA5D.24F8F98F@lab.ntt.co.jp>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
X-Mailer: Becky! ver. 2.57.01 [ja]
Cc: avtext@ietf.org, payload@ietf.org
Subject: Re: [payload] Progressing draft-ramalho-payload-g7110-00

Dear Michael,

Thanks for the comments.

> MAR: I also agree with your statement that only single channel G.711 is
> traditionally used. Indeed, when PT = [0 | 8] is used RFC 3551 states
> (in Table 4) that channels == 1 by definition for G.711.
> 
> MAR: However, if you look closely at RFC 3551 ... and you use a DYNAMIC
> payload type for G.711 ... you could specify channel > 1 ... because
> it defines how to pack the payload for "sample-based encodings"
> (RFC 3551, Section 4.3).
> 
> MAR: There are some applications (e.g., acceleration of real-time
> protocols over WANs) whereby "multiplexing multiple G.711 flows" is
> desired. Granted there are few, if any, products AT PRESENT doing this.
> However, for those applications, I think a "standardized method" to put
> multiple channels in one payload is desirable.

If there is a strong need for this function, whoever needs it should
first propose a standardized method for "multiplexing multiple G.711
flows" within the G.711 payload.
Then we can discuss how to reflect that function in the G.711.0 payload,
based on the multiplexed G.711 payload so defined.
This way, we can be sure what the real requirements for the
functionality are.


> MAR: I agree with you ... there is no need to add a delimiter for the
> channels if you know ptime. In the next revision I will have:
> 
> 1: channels as an optional parameter (as it is now)
> 2: if optional channels parameter is present and > 1, then ptime becomes
> a required parameter (not specified in the present draft).

I'm fine with this proposal.
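
To make the rule concrete, here is a minimal sketch of the check a
receiver might apply. It is only an illustration, not text from the draft:
the helper name check_params, the dictionary of parameters, and the choice
to treat an absent "channels" as one channel are all assumptions.

```python
def check_params(params):
    """Validate a dict of payload-format parameters (hypothetical helper).

    Rule sketched from the thread: "channels" is optional, and "ptime"
    becomes required once "channels" is present and greater than 1.
    """
    # Assumption: an absent "channels" parameter means a single channel.
    channels = int(params.get("channels", 1))
    if channels < 1:
        raise ValueError("channels must be at least 1")

    ptime = params.get("ptime")
    if channels > 1 and ptime is None:
        raise ValueError("ptime is required when channels > 1")

    return channels, (int(ptime) if ptime is not None else None)
```

For example, check_params({"channels": "2", "ptime": "20"}) is accepted,
while check_params({"channels": "2"}) is rejected.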


> MAR: Regarding your channel example above, this is similar to what
> RFC 3551 already specifies for channel demarcation of sample based
> recordings ... from RFC 3551 ....

Note that the channel example proposed here differs slightly from what
RFC 3551 describes when any channel contains more than one G.711.0
frame.

> MAR: If channels > 1, the buffers in the G.711 decoding process may
> need to be larger ... but that is easily accomplished with some #defines
> in the given application.

The G.711 decoding process itself does not require a larger buffer.
Only the buffer for the decoded G.711 samples must be large enough to
hold them, that is, "ptime octets" for each channel ("ptime * channels"
octets in total).
Therefore, we don't have to change the processing buffer size that
the current G.711.0 reference software implementation uses.
The bitstream can be processed frame by frame, from the first frame to
the last, regardless of whether it contains a multi-channel stream.
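
As a rough illustration (not part of the draft), the separation step
after decoding could be sketched as below. It assumes 8 kHz sampling, so
that one channel occupies 8 * ptime octets for a ptime given in
milliseconds, and it assumes the channel-by-channel layout from my earlier
example (all of the left channel, then all of the right channel). The
helper name split_channels is hypothetical, and the G.711.0 decoding
itself is taken as already done.

```python
def split_channels(decoded, ptime_ms, channels):
    """Split decoded G.711 octets into one bytes object per channel.

    Assumptions (not from the draft): 8 kHz sampling, one octet per
    sample, and channels stored one after another rather than interleaved.
    """
    per_channel = 8 * ptime_ms          # octets per channel at 8 kHz
    expected = per_channel * channels   # total decoded octets in the packet
    if len(decoded) != expected:
        raise ValueError(
            "expected %d decoded octets, got %d" % (expected, len(decoded))
        )
    return [
        decoded[i * per_channel:(i + 1) * per_channel]
        for i in range(channels)
    ]
```

With ptime = 20 and channels = 2, the decoded buffer holds 320 octets and
each returned channel holds 160 samples, matching the two-channel example.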


> MAR: Can you formulate a proposal for the long recording case? Perhaps
> we can write in 1 second chunks with a length byte for that 1 second.
> Something like (for N chunks of G.711.0 data in the file and "|"
> indicating concatenation here):
> 
> | A | B1 | C1 | B2 | C2 | B3 | C3 | ... | B(N-1) | C(N-1) | B(N) | ...
> where
> 
> A = Fixed length Preamble (has codec name, ptime and number_of_channels)
> 
> Ci = Variable length G.711.0 data for as many channels specified in
> chunk i
> 
> Bi = 16 bit uint
>     {if (Bi == 65535)
>         Indicates EoF;          // for B(N) above
>     elseif (Bi == 65534)
>         Indicates End_segment;  // for B(N-1) above, where you have
>                                 // less than one second of data in
>                                 // the last segment
>     else
>         Number_of_bytes in Ci;
>     }
> 
> MAR: The above with Bi a uint16 would accommodate a worst-case 8
> channels.

This proposal seems OK.
I have no strong opinion on the long recording case payload.
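
For reference, a reader for this chunk layout might look like the sketch
below. Several details are my assumptions because the proposal leaves them
open: the Bi field is read in network byte order (big-endian), the length
of the fixed preamble A is supplied by the caller, and a short final chunk
announced by End_segment is taken to run up to a trailing EoF field. The
name read_chunks is hypothetical.

```python
import struct

EOF_MARK = 65535          # Bi value that indicates EoF
END_SEGMENT_MARK = 65534  # Bi value that announces a short final chunk


def read_chunks(data, preamble_len):
    """Return the list of chunk payloads Ci that follow the preamble A."""
    chunks = []
    pos = preamble_len
    while True:
        if pos + 2 > len(data):
            raise ValueError("truncated file: missing length field")
        # Assumption: Bi is a big-endian unsigned 16-bit integer.
        (b_i,) = struct.unpack_from(">H", data, pos)
        pos += 2
        if b_i == EOF_MARK:
            return chunks
        if b_i == END_SEGMENT_MARK:
            # Assumption: the short chunk ends where the trailing EoF
            # field begins, since no explicit length is given for it.
            end = len(data) - 2
            if end < pos or struct.unpack_from(">H", data, end)[0] != EOF_MARK:
                raise ValueError("End_segment not followed by EoF field")
            chunks.append(data[pos:end])
            return chunks
        if pos + b_i > len(data):
            raise ValueError("truncated file: chunk shorter than Bi")
        chunks.append(data[pos:pos + b_i])
        pos += b_i
```

If the final length rule turns out differently, only the End_segment
branch would need to change.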


> MAR: For adoption in the IETF ... I think using their existing
> convention is always better (unless you have a killer fault with it).
> I think the IETF wants someone with a "storage mode decoder" to see
> something like "#!<codec_name>\n" as the preamble to any storage mode
> recording.

I see your point.
I agree.
The current proposal is fine, then.
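
For illustration, recognizing such a preamble could be sketched as
follows. The two strings are the A-law and MU-law magic numbers quoted
from the draft; the helper name detect_law and the returned labels are
hypothetical.

```python
# Magic numbers from the draft text: "#!G7110A\n" and "#!G7110M\n".
MAGIC = {
    b"#!G7110A\n": "A-law",
    b"#!G7110M\n": "mu-law",
}


def detect_law(data):
    """Return (law, remaining_octets) for a storage mode file image."""
    for magic, law in MAGIC.items():
        if data.startswith(magic):
            # Everything after the preamble is G.711.0 data.
            return law, data[len(magic):]
    raise ValueError("no G.711.0 storage mode preamble found")
```

A file without the preamble, such as the raw output of the reference
software, is rejected by this check.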



Best Regards,

Noboru



> Hi Noboru,
> 
> Thank you for your reply.
> 
> My answers are in-line below with "MAR:".
> 
> Michael Ramalho
> 
> -----Original Message-----
> From: Noboru Harada [mailto:harada.noboru@lab.ntt.co.jp] 
> Sent: Saturday, August 13, 2011 9:47 PM
> To: Michael Ramalho (mramalho); avtext@ietf.org; payload@ietf.org
> Cc: harada.noboru@lab.ntt.co.jp; avtext@ietf.org; payload@ietf.org
> Subject: Re: Progressing draft-ramalho-payload-g7110-00
> 
> Dear Michael and all,
> 
> 
> Thanks for taking care of the G.711.0 payload format issues.
> 
> Regarding ISSUES 1 and 2,
> I'm fine with the proposal to have two separate documents.
> 
> MAR: Thanks.
> 
> For ISSUE 3, please see my comments below.
> 
> > >>>>ISSUE 3: Any comments on this goal?
> > 
> > Assuming the partitioning proposed above is accepted, the only
> > significant open items for the G.711.0 payload format draft are:
> > 
> > OPEN ITEM 1 - Is the specification of multiple G.711.0 "channels"
> > within a single G.711.0 RTP session desired? If so, is the proposed
> > method acceptable (reserving a presently unused "prefix code" as a
> > channel delimiter and changing the decoding heuristic in Section
> > 4.2.3)?
> >
> > and
> > 
> > OPEN ITEM 2 - The specification of the storage mode for long
> > recordings.
> 
> For OPEN ITEM 1, I'm not fully confident that supporting multiple
> G.711.0 channels is really useful.
> 
> As stated in the current draft, I'm not sure there is any real
> application need to have more than one G.711 channel per RTP session.
> I have never seen such a multi-channel implementation in traditional
> G.711 systems.
> 
> For teleconference applications, there may be several alternatives such
> as down-mix all channels into one channel at MCU server or mesh
> connection using NxN RTP sessions.
> Note that I'm fine with having the function if someone could show us
> that there is strong need and show us reasonable application scenarios.
> 
> MAR: I also agree with your statement that only single channel G.711 is
> traditionally used. Indeed, when PT = [0 | 8] is used RFC 3551 states
> (in Table 4) that channels == 1 by definition for G.711.
> 
> MAR: However, if you look closely at RFC 3551 ... and you use a DYNAMIC
> payload type for G.711 ... you could specify channel > 1 ... because
> it defines how to pack the payload for "sample-based encodings"
> (RFC 3551, Section 4.3).
> 
> MAR: There are some applications (e.g., acceleration of real-time
> protocols over WANs) whereby "multiplexing multiple G.711 flows" is
> desired. Granted there are few, if any, products AT PRESENT doing this.
> However, for those applications, I think a "standardized method" to put
> multiple channels in one payload is desirable.
> 
> Regarding the proposed method, I don't think any channel delimiter is
> required for the channel separation if we make use of the given "ptime"
> and "channels" information.
> 
> With following definition, no channel delimiter is needed.
> We can just decode all samples using the current decoding heuristic in
> Section 4.2.3 and then separate decoded samples.
>  ----------------------------------------------------------
> | left channel (160 samples) | right channel (160 samples) |
> | (G.711.0 frames +padding)  | (G.711.0 frames +padding)   |
>  ----------------------------------------------------------
> 
> MAR: YOU ARE RIGHT!
> 
> MAR: My brain was so fixated on "not needing ptime" in writing the draft
> (I documented ptime as an optional parameter in Section 5.1) that I
> neglected to consider that IF you considered ptime, THEN you would know
> where the channel boundaries were by the decoded G.711 bitstream!
> 
> MAR: I agree with you ... there is no need to add a delimiter for the
> channels if you know ptime. In the next revision I will have:
> 
> 1: channels as an optional parameter (as it is now)
> 2: if optional channels parameter is present and > 1, then ptime becomes
> a required parameter (not specified in the present draft).
> 
> MAR: Regarding your channel example above, this is similar to what
> RFC 3551 already specifies for channel demarcation of sample based
> recordings ... from RFC 3551 ....
> 
> <begin table>
> channels  description  channel
>                        1    2    3    4    5    6
> 2         stereo       l    r
> 3                      l    r    c
> 4                      l    c    r    S
> 5                      Fl   Fr   Fc   Sl   Sr
> 6                      l    lc   c    r    rc   S
> <end table>
> 
> I believe that this number-of-channels issue should be solved not at
> the G.711.0 bitstream level but at the RTP payload level.
> Even though there are some RESERVED magic numbers available in the
> G.711.0 specification, we should restrict ourselves to introducing as
> few magic numbers as possible for solving RTP payload issues.
> 
> MAR: Agreed.
> 
> MAR: If channels > 1, the buffers in the G.711 decoding process may
> need to be larger ... but that is easily accomplished with some #defines
> in the given application.
> 
> > OPEN ITEM 2 - The specification of the storage mode for long
> > recordings.
> 
> Regarding the storage mode, I had a discussion with some experts who
> are developing VoIP services.
> They said that supporting two channels may be helpful for the storage
> mode.
> There is a need to store recorded down-link and up-link data in one
> file (e.g., recording a conversation between a customer and an operator
> for a call-center application).
> 
> MAR: I can see the use case. And given that one side of this
> communication is typically "silence/background noise", this is a good
> application for G.711.0.
> 
> MAR: Anticipating two channels for this application - do we need to
> introduce a parameter of "ptime" in the storage mode definition
> (in addition to channels) so that the channel boundaries are
> self-evident?
> 
> MAR: Can you formulate a proposal for the long recording case? Perhaps
> we can write in 1 second chunks with a length byte for that 1 second.
> Something like (for N chunks of G.711.0 data in the file and "|"
> indicating concatenation here):
> 
> | A | B1 | C1 | B2 | C2 | B3 | C3 | ... | B(N-1) | C(N-1) | B(N) | ...
> where
> 
> A = Fixed length Preamble (has codec name, ptime and number_of_channels)
> 
> Ci = Variable length G.711.0 data for as many channels specified in
> chunk i
> 
> Bi = 16 bit uint
>     {if (Bi == 65535)
>         Indicates EoF;          // for B(N) above
>     elseif (Bi == 65534)
>         Indicates End_segment;  // for B(N-1) above, where you have
>                                 // less than one second of data in
>                                 // the last segment
>     else
>         Number_of_bytes in Ci;
>     }
> 
> MAR: The above with Bi a uint16 would accommodate a worst-case 8
> channels.
> 
> Other comments on sections 7.1 and 7.2:
> 
> We may want to amend the ITU-T Rec. G.711.0 reference software to
> add support for the "0x01" defined in Section 7.1 (G.711.0 Erasure
> Frame), because implementing it without changing the G.711.0 decoder
> is quite complicated.
> 
> MAR: I wish I had thought of the concept of an erasure frame when we did
> the ITU-T standardization ;-(.
> 
> MAR: Assuming that we make an open-source version of G.711.0 available,
> we could make that small change in the code (to recognize and generate
> an erasure frame). And at a later time update the ITU-T documents.
> 
> ---
> The magic number for G.711.0 A-law corresponds to the ASCII character
> string "#!G7110A\n", i.e., "0x23 0x21 0x47 0x37 0x31 0x31 0x30 0x41
> 0x0A".
> Likewise, the magic number for G.711.0 MU-law corresponds to the ASCII
> character string "#!G7110M\n", i.e., "0x23 0x21 0x47 0x37 0x31 0x31 0x30
> 0x4D 0x0A".
> ---
> 
> I have no strong opinion but I think we'd better think of an advantage
> of using any RESERVED magic number for the G.711.0 Storage Mode header
> instead of "#".
> Starting from "#" looks good but "#" is already used as a pre-fix in the
> G.711.0 specification.
> Which means the short recordings storage mode file will never be able to
> be decoded by the ITU-T G.711.0 reference software.
> 
> MAR: I am a little confused. The magic number I chose for the draft
> simply used the existing IETF convention (look at the RFCs for iLBC and
> similar one-channel codecs). Consider this as a "preamble" to the entire
> file.
> 
> If we assigned any RESERVED prefix such as "0x01" or "0x47" as the first
> byte of the file, we could add the support to the ITU-T G.711.0
> reference software (perhaps, as an informative appendix).
> Note that the only difference between the short recordings storage mode
> file and the file that the G.711.0 reference software generates is
> the existence of this header part.
> On the other hand, this may not be a big issue because we can assign a
> unique file extension to the file so that the application can recognize
> that it is a storage mode file.
> 
> What do you think of it?
> 
> MAR: For adoption in the IETF ... I think using their existing
> convention is always better (unless you have a killer fault with it).
> I think the IETF wants someone with a "storage mode decoder" to see
> something like "#!<codec_name>\n" as the preamble to any storage mode
> recording.
> 
> MAR: Considering the fact that the encoded sound files may need to be
> encrypted for some sensitive applications - having the "decoder name"
> inside the (encrypted) file instead of the header will be necessary
> (although it may also help in cryptographic attacks - I have an
> outstanding question on this with a crypto expert in Cisco).
> 
> Best Regards,
> 
> Noboru
> 
> MAR: Thanks Noboru!
> 
> 
> > Dear AVTEXT and PAYLOAD list members,
> > 
> >  
> > 
> > At IETF 81 I presented the initial draft for G.711.0 payload format
> > (draft-ramalho-payload-g7110-00). This email solicits opinions for
> > continuing the work in this draft.
> > 
> >  
> > 
> > This draft had the usual detail expected in a payload format draft
> > PLUS some recommendations and use cases for employing the (lossless
> > and stateless) G.711.0 compression "in the middle" of an end-to-end
> > G.711 call/session.
> > 
> >  
> > 
> > The rough consensus I interpreted from presenting the G.711.0 draft
> > was that this draft should be split into two drafts:
> > 
> >  
> > 
> > 1 - A "G.711.0 only" payload format draft (mostly the existing draft
> > without Section 6).
> > 
> >  
> > 
> > and
> > 
> >  
> > 
> > 2 - A "G.711.0 use cases / best practices" draft describing use of
> > G.711.0 in the middle of an end-to-end G.711 call (mostly Section 6).
> > 
> >  
> > 
> > >>>ISSUE 1: Any objection to this partitioning or other suggestion?
> > 
> >  
> > 
> > Furthermore, it has been suggested that the former be an AVT PAYLOAD
> > draft (e.g., draft-ramalho-payload-g7110-01) and the latter be an AVT
> > EXT draft (e.g., draft-ramalho-avtext-g7110usecases-00).
> > 
> >  
> > 
> > >>>>ISSUE 2: Any objection to partitioning the draft into two drafts
> > targeted for two different IETF AVT WGs or other suggestion?
> > 
> >  
> > 
> > The idea (which I agree with) is that the payload draft be targeted to
> > be an eventual standards track RFC and the avtext draft be targeted to
> > be an informational RFC. This suggestion was only briefly mentioned at
> > the meeting, but was supported in my discussions afterwards.
> > 
> >  
> > 
> > >>>>ISSUE 3: Any comments on this goal?
> > 
> >  
> > 
> > Assuming the partitioning proposed above is accepted, the only
> > significant open items for the G.711.0 payload format draft are:
> > 
> >  
> > 
> > OPEN ITEM 1 - Is the specification of multiple G.711.0 "channels"
> > within a single G.711.0 RTP session desired? If so, is the proposed
> > method acceptable (reserving a presently unused "prefix code" as a
> > channel delimiter and changing the decoding heuristic in Section
> > 4.2.3)?
> > 
> >  
> > 
> > and
> > 
> >  
> > 
> > OPEN ITEM 2 - The specification of the storage mode for long
> > recordings.
> > 
> >  
> > 
> > >>>>ISSUE 4: Any commentary on the above issues are welcome.
> > 
> >  
> > 
> > Thanks in advance for any comments or suggestions you have.
> > 
> >  
> > 
> > Michael Ramalho
> > 
> >  
> > 
> >  
> > 
> > Michael Ramalho
> > Technical Leader
> > Product Development
> > mramalho@cisco.com <mailto:mramalho@cisco.com> 
> > Phone: +1 919 476 2038
> > Mobile: +1 941 544 2844
> > 
> > 
> > 
> > Cisco Systems, Inc.
> > 4564 Tuscana Drive
> > Sarasota, FL 34241-4201
> > United States
> > http://ramalho.webhop.info
> > Skype: mramalho_mar42
> > 
> >  
> > 
> >  
> > 
> > 
> >  
> > 
> >  
> > 
> 
> --------------------------------------
> Noboru Harada
> NTT Communication Science Laboratories
> Tel: +81 46 240 3676
> FAX: +81 46 240 3145
> E-mail: harada.noboru@lab.ntt.co.jp
> --------------------------------------

--------------------------------------
Noboru Harada
NTT Communication Science Laboratories
Tel: +81 46 240 3676
FAX: +81 46 240 3145
E-mail: harada.noboru@lab.ntt.co.jp
--------------------------------------