Re: [AVT] Comments on draft-perkins-avt-srtp-vbr-audio
Ingemar Johansson S <ingemar.s.johansson@ericsson.com> Wed, 17 November 2010 07:44 UTC
From: Ingemar Johansson S <ingemar.s.johansson@ericsson.com>
To: Jean-Marc Valin <jean-marc.valin@octasic.com>
Date: Wed, 17 Nov 2010 08:45:02 +0100
Cc: "avt@ietf.org" <avt@ietf.org>, Colin Perkins <csp@csperkins.org>
Subject: Re: [AVT] Comments on draft-perkins-avt-srtp-vbr-audio

Hi

Answers inline below
/Ingemar

> -----Original Message-----
> From: Jean-Marc Valin [mailto:jean-marc.valin@octasic.com]
> Sent: 16 November 2010 19:53
> To: Ingemar Johansson S
> Cc: avt@ietf.org; Colin Perkins
> Subject: Re: Comments on draft-perkins-avt-srtp-vbr-audio
>
> Hi Ingemar,
>
> On 10-11-16 07:17 AM, Ingemar Johansson S wrote:
> > Today it may be quite far-fetched to imagine that much useful
> > information can be extracted this way (I even have problems getting
> > our automated speech recognition exchange to understand me...).
> > However, anyone who has read a spy novel by Tom Clancy or John le
> > Carré realizes that eavesdropping is more or less picking fragments
> > of information from many different sources (including trash bins).
> > This, in combination with Moore's law, says that implementors should
> > be aware of this issue.
>
> I think you've pretty much summed up the idea we were trying to
> convey. Conversational speech recognition is indeed hard enough when
> we have the audio that recognizing from VBR is quite far-fetched
> unless the vocabulary is highly constrained.
>
> On the other hand, the real worry I have with VBR and VAD is for
> pre-recorded prompts like you have in an IVR. If an attacker knows the
> IVR prompts (e.g. has an account at the same bank as you), then they
> can extract patterns that are very precise and obtain 100%
> identification on the known prompts. This is the case even with the
> counter-measures described in the draft. On the other hand, anything
> that isn't pre-recorded is not something that worries me too much.

Then I would say that a general recommendation is to always use padding for VBR, or to turn off DTX, in the case you describe above. Such traffic should be a very small share of the total traffic volume, so one does not need to worry about increased network load.
How is the padding for the VBR case negotiated? I guess you don't intend to negotiate it, or do you? DTX can be turned off, but padding cannot be negotiated today.

>
> > Section 4: My personal feeling is that the recommendations go too
> > far. The idea is that an eavesdropper can extract information from
> > the length of the talk spurts. This to me sounds like a much more
> > difficult task than the VBR case (which is already difficult). A
> > hangover of e.g. 1 s is very likely to give 100% activity time for
> > speakers of a particular south-European nationality :-)
>
> Any suggestion for a reasonable hangover half-life?

I would actually prefer no extra hangover other than that already inherent in the codec. This is, however, more for (imagined) network load reasons than for security reasons, as I believe the security concerns are not that severe. If we are thinking about security, then why not just recommend turning off DTX completely for very sensitive applications? A 1 s overhang will IMHO drive the voice activity factor so high that one may as well turn off DTX completely.

>
> > Section 5: A variation on the padding is to randomly pad a fraction
> > of the packets up to a large size (less than or equal to the largest
> > packet size from the codec). This, I believe, should confuse the
> > eavesdropper algorithm considerably. It is possible that the same
> > can be applied to the VAD case in section 4 as well.
>
> I'm not sure what you are suggesting. I think VAD is a bit different
> from VBR because it's a binary decision. If you decide not to send a
> packet based on the VAD data, then you can't just do padding.
> Similarly, if the VAD triggers a low-rate mode, then the padding would
> have to be high enough to make the packets indistinguishable from
> high-rate mode, which means it acts as an overhang. Or maybe I didn't
> quite understand what you were suggesting.

Hmm, you are right (brain fart).
I was thinking of, for instance, the AMR case, where you can actually transmit frames of type NO_DATA after the SID_FIRST frame. But unless you make it an overhang (as you already suggested), you still get "holes" that are easily detected by the eavesdropper.

>
> Cheers,
>
> Jean-Marc
>
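
For illustration only, here is a minimal Python sketch of the two padding options discussed above: padding every VBR packet to the largest codec packet size, and the Section 5 variation of randomly padding only a fraction of the packets. The function names, the example packet sizes, and the choice of a uniformly random target size between the packet's own size and the maximum are all assumptions made for this sketch; none of them comes from the draft.

```python
import random


def pad_to_max(sizes, max_size):
    """Pad every packet to max_size, so packet length carries no information."""
    return [max_size for _ in sizes]


def pad_random_fraction(sizes, max_size, fraction, rng):
    """Pad each packet with probability `fraction`.

    A padded packet gets a random size between its own size and max_size
    (an assumption; the mail only says "up to a large size"). Packets that
    are not selected keep their original size.
    """
    padded = []
    for size in sizes:
        if rng.random() < fraction:
            padded.append(rng.randint(size, max_size))
        else:
            padded.append(size)
    return padded


def overhead(original, padded):
    """Relative increase in payload bytes caused by the padding."""
    return (sum(padded) - sum(original)) / sum(original)


if __name__ == "__main__":
    # Hypothetical VBR payload sizes in bytes, one per packet.
    sizes = [20, 34, 61, 28, 45, 61, 22, 38]
    max_size = 61
    rng = random.Random(1)  # seeded so the run is repeatable

    full = pad_to_max(sizes, max_size)
    partial = pad_random_fraction(sizes, max_size, 0.5, rng)
    print("pad all:      overhead = %.2f" % overhead(sizes, full))
    print("pad fraction: overhead = %.2f" % overhead(sizes, partial))
```

The sketch only compares byte overhead; it says nothing about how much the random variant actually confuses a classifier, which would need to be measured separately.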