Re: [AVT] Comments on draft-perkins-avt-srtp-vbr-audio
Colin Perkins <csp@csperkins.org> Sun, 12 December 2010 13:46 UTC
From: Colin Perkins <csp@csperkins.org>
In-Reply-To: <DBB1DC060375D147AC43F310AD987DCC1DEA8105FE@ESESSCMS0366.eemea.ericsson.se>
Date: Sun, 12 Dec 2010 13:47:58 +0000
Content-Transfer-Encoding: quoted-printable
Message-Id: <94B9F600-7846-400A-874A-08D1E0C732BB@csperkins.org>
References: <DBB1DC060375D147AC43F310AD987DCC1DEA81037C@ESESSCMS0366.eemea.ericsson.se> <4CE2D2FC.2000203@octasic.com> <DBB1DC060375D147AC43F310AD987DCC1DEA8105FE@ESESSCMS0366.eemea.ericsson.se>
To: Ingemar Johansson S <ingemar.s.johansson@ericsson.com>
X-Mailer: Apple Mail (2.1082)
Cc: "avt@ietf.org" <avt@ietf.org>
Subject: Re: [AVT] Comments on draft-perkins-avt-srtp-vbr-audio
Hi,

On 17 Nov 2010, at 07:45, Ingemar Johansson S wrote:
> Hi
>
> Answers inline below
>
> /Ingemar
>
>> -----Original Message-----
>> From: Jean-Marc Valin [mailto:jean-marc.valin@octasic.com]
>> Sent: den 16 november 2010 19:53
>> To: Ingemar Johansson S
>> Cc: avt@ietf.org; Colin Perkins
>> Subject: Re: Comments on draft-perkins-avt-srtp-vbr-audio
>>
>> Hi Ingemar,
>>
>> On 10-11-16 07:17 AM, Ingemar Johansson S wrote:
>>> Today it may be quite far-fetched to imagine that much useful
>>> information can be extracted this way (I even have problems getting
>>> our automated speech recognition exchange to understand me...).
>>> However, anyone who has read a spy novel by Tom Clancy or John le
>>> Carré realizes that eavesdropping is more or less picking fragments
>>> of information from many different sources (including trash-bins).
>>> This, in combination with Moore's law, says that implementors should
>>> be aware of this issue.
>>
>> I think you've pretty much summed up the idea we were trying to
>> convey. Conversational speech recognition is indeed hard enough when
>> we have the audio that recognizing from VBR is quite far-fetched
>> unless the vocabulary is highly constrained.
>>
>> On the other hand, the real worry I have with VBR and VAD is for
>> pre-recorded prompts like you have in an IVR. If an attacker knows
>> the IVR prompts (e.g. has an account at the same bank as you), then
>> they can extract patterns that are very precise and obtain 100%
>> identification on the known prompts. This is the case even with the
>> counter-measures described in the draft. On the other hand, anything
>> that isn't pre-recorded is not something that worries me too much.
>
> Then I would say that a general recommendation is to always use
> padding for VBR, or to turn off DTX, in the case you describe above.
> This should be a very small amount of the total traffic volume, so one
> doesn't need to worry about increased network load.
> How is the padding for the VBR case negotiated? I guess you don't
> intend to negotiate this, or do you? DTX can be turned off, but the
> padding is not possible to negotiate today.
>
>>
>>> Section 4: My personal feeling is that the recommendations go too
>>> far. The idea is that an eavesdropper can extract information from
>>> the length of the talk spurts. This to me sounds like a much more
>>> difficult task than the VBR case (which is difficult already). A
>>> hangover of e.g. 1s is very likely to give 100% activity time for
>>> speakers of a particular south-european nationality :-)
>>
>> Any suggestion for a reasonable hangover half-life?
>
> I would actually prefer no extra hangover other than that already
> inherent in the codec. This is more due to (imagined) network load
> reasons than security reasons, however, as I believe the security
> concerns are not that severe. If we think security, then why not just
> recommend turning off DTX completely for very sensitive applications?
> A 1s overhang will IMHO drive the voice activity factor so high that
> one may as well turn off DTX completely.

I just submitted -05 that attempts to give some more nuanced guidance here. Feedback would be appreciated.

Cheers,
Colin

>>
>>> Section 5: A variation on the padding is to randomly pad a fraction
>>> of the packets up to a large size (less than or equal to the largest
>>> packet size from the codec). This, I believe, should confuse the
>>> eavesdropper algorithm considerably. It is possible that the same
>>> can be applied to the VAD case in section 4 as well.
>>
>> I'm not sure what you are suggesting. I think VAD is a bit different
>> from VBR because it's a binary decision. If you decide not to send a
>> packet based on the VAD data, then you can't just do padding.
>> Similarly, if the VAD triggers a low-rate mode, then the padding
>> would have to be high enough to make the packets indistinguishable
>> from high-rate mode, which means it acts as an overhang. Or maybe I
>> didn't quite understand what you were suggesting.
>
> Hmm, you are right (brain fart). I was thinking of, for instance, an
> AMR case where you can actually transmit frames of type NO_DATA after
> the SID_FIRST frame, but unless you make it an overhang (like you
> already suggested) you still get "holes" which are easily detected by
> the eavesdropper.
>
>>
>> Cheers,
>>
>> Jean-Marc
>>
> _______________________________________________
> Audio/Video Transport Working Group
> avt@ietf.org
> https://www.ietf.org/mailman/listinfo/avt

--
Colin Perkins
http://csperkins.org/
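
Illustrative note on the two countermeasures discussed in this thread. The sketch below is not taken from the draft; it is a minimal Python illustration under stated assumptions. `pad_lengths` models the Section 5 suggestion of padding a random fraction of packets up to the largest codec packet size, and `apply_hangover` models a VAD hangover (overhang) that keeps transmission on for some frames after speech ends. All function and parameter names are hypothetical, and the hangover here is a fixed frame count, which is a simplification of the "hangover half-life" mentioned above.

```python
import random


def pad_lengths(lengths, max_len, fraction, rng):
    """Return on-the-wire payload lengths after random padding.

    Each packet is padded to max_len with probability `fraction`;
    otherwise its length is left unchanged. `max_len` is assumed to be
    the largest packet size the codec can produce.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    padded = []
    for n in lengths:
        if n > max_len:
            raise ValueError("packet longer than max_len")
        padded.append(max_len if rng.random() < fraction else n)
    return padded


def apply_hangover(vad_flags, hangover_frames):
    """Return per-frame transmit decisions with a fixed hangover.

    After the last active frame, transmission continues for up to
    `hangover_frames` further frames, which hides short pauses but
    raises the voice activity factor.
    """
    decisions = []
    remaining = 0
    for active in vad_flags:
        if active:
            remaining = hangover_frames
            decisions.append(True)
        elif remaining > 0:
            remaining -= 1
            decisions.append(True)
        else:
            decisions.append(False)
    return decisions


if __name__ == "__main__":
    rng = random.Random(1)  # seeded only to make the demo repeatable
    print(pad_lengths([12, 31, 20, 7], max_len=32, fraction=0.5, rng=rng))
    print(apply_hangover([True, False, False, False, True, False], 2))
```

With `fraction=1.0` every packet has the same length, which is the "always pad" case; with `fraction=0.0` the original lengths leak unchanged. Assuming 20 ms frames (an assumption, not stated in the thread), a 1 s hangover corresponds to `hangover_frames=50`, which shows why a long hangover can push the activity factor close to that of DTX being switched off.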