Re: [cnit] [stir] Reputation vs Display name (was Textual caller ID)

Hadriel Kaplan <hadriel.kaplan@oracle.com> Wed, 28 August 2013 13:02 UTC

From: Hadriel Kaplan <hadriel.kaplan@oracle.com>
In-Reply-To: <4B1956260CD29F4A9622F00322FE053193812973DF@BOBO1A.bobotek.net>
Date: Wed, 28 Aug 2013 09:02:29 -0400
Message-Id: <DEAFB574-73A7-4EC3-8AE5-416D8F9AF592@oracle.com>
References: <4B1956260CD29F4A9622F00322FE053193812973D8@BOBO1A.bobotek.net> <CAOPrzE2aM9bWj+Txby+u=dXiaaFeaKF5BTJfYYCQ18QgYXU2OQ@mail.gmail.com> <521CF2B9.1050200@cs.tcd.ie> <CAOPrzE1Lg4r17CDhBP35Qj1wf0kUr_MqJJS4Dt998Ev9YgAoXA@mail.gmail.com> <FD959F56-275F-4547-831B-98C2B8760A0C@oracle.com> <61E614C6-950C-48A1-BDB2-242519DE71C3@brianrosen.net> <78218B3E-8293-4F7F-85C2-EA6547BD312E@oracle.com> <9767A883-6957-4AEA-9F3C-77B270EE13CD@brianrosen.net> <4B1956260CD29F4A9622F00322FE053193812973DF@BOBO1A.bobotek.net>
To: Alex Bobotek <alex@bobotek.net>
Cc: "cnit@ietf.org" <cnit@ietf.org>, Henning Schulzrinne <Henning.Schulzrinne@fcc.gov>, "Dwight, Timothy M (Tim)" <timothy.dwight@verizon.com>, Stephen Farrell <stephen.farrell@cs.tcd.ie>, Brian Rosen <br@brianrosen.net>
Subject: Re: [cnit] [stir] Reputation vs Display name (was Textual caller ID)

Unfortunately, your point 3 is the problem: we know that some originating "service providers" cannot be trusted.  Some of them are just lazy or understaffed, but some are actually complicit: letting their customers generate unverified and unverifiable calling numbers and names is part of their business model.

That's the same problem that is making us do STIR to begin with.  If we could trust all originating service providers, or even thought that reputation of originating providers alone was sufficient, we wouldn't need half of the stuff STIR is defining.  For example, we could have simply implemented draft-kaplan-sip-asserter-identity, which was far simpler than STIR will be.  That would have enabled creating reputation services to weed out bad originating service providers.

But when I talked to some of the carriers about doing that, they had concerns over the notion of deciding an entire originating service provider was untrustworthy; concerns regarding lawsuits from those originating providers, threshold level issues, and big problems if it was used for international calls.  The thing they liked about STIR was it would be easily and legally defensible, including at an international level.

-hadriel


On Aug 28, 2013, at 1:55 AM, Alex Bobotek <alex@bobotek.net> wrote:

> I accept as generally positive the points that the originating service provider should be presenting, and possibly signing, the display name.  The originating service provider is usually in the best position to provide this information.  And if some notion of originator reputation (e.g., 'score') is also presented and signed by the originating service provider, great.
> 
> A display name service cannot provide an adequate basis for trust for several reasons: 
> 
> 1.  Linear unstructured text naming cannot prevent fooling the recipient.  If an attacker wants to have their display name be Charlotte C Union and vish, smish or phish Charlotte Credit Union's customers, they may.  What might be Ms. Charlotte's 'personal' calls shouldn't be blocked.  
> 
> 2.  Further, even a perfect taxonomy of caller types (bank, person, ...) that tells a called party that the caller is a person rather than a bank would offer inadequate protection to vulnerable (e.g., permanently or temporarily disabled) individuals.  Criminals are creative and will always find ways to exploit the vulnerable with naming similarities.  Nor could it, at least without global adoption of identity authentication and registration requirements such as India's, be counted on to indicate telemarketing.  I do not see how a practical role-based naming taxonomy could be comprehensive enough to prevent these exploits, but my ears are open if someone could point me to such an explanation.
> 
> 3.  We cannot assume that all originating service providers will be trustworthy or even well intentioned.  The opening of the service provider market is increasing access to both reputable and disreputable parties.  Display Names provided/signed only by an originating service provider should not be trusted more than the service provider itself.  
> 
> Given unique authenticated identities of the calling party and originating service provider, reputation-based systems incorporating called party feedback and calling pattern analysis can be, and currently are being, used successfully in some telephony applications (e.g., text messaging) to pressure peer service providers to police their originations and originating parties, as well as to block calls from disreputable callers and, if necessary, disreputable service providers.  In much more heavily abused communications media (i.e., email), reputation does the heavy lifting.
> 
> We need notions of service provider and originator reputation to control several types of abuse such as vishing and smishing that authenticated names cannot adequately mitigate.  Called party Display Name is beneficial, but insufficient.  
> 
> It appears to me the 'score' Brian mentions is one notion of reputation.  But it's only one, and IMHO reputation needs much more extensive discussion.   
> 
> Personally, I believe that the telephony security framework should include the following:
> 
> * Authenticated identity (phone number)
> * Display name
> * Reputation framework that can be used to filter calls and indicate trust, 
> 
> and possibly an
> 
> * Authorization framework that can be used to prove authority to use (e.g., by a fundraising boiler room or integrated messaging system) a particular identity.
> 
> 
> Regards,
> 
> Alex
> 
> 
> 
> -----Original Message-----
> From: Brian Rosen [mailto:br@brianrosen.net] 
> Sent: Tuesday, August 27, 2013 3:50 PM
> To: Hadriel Kaplan
> Cc: Henning Schulzrinne; cnit@ietf.org; Alex Bobotek; Dwight, Timothy M (Tim); Stephen Farrell
> Subject: Re: [cnit] [stir] Reputation vs Display name (was Textual caller ID)
> 
> We're proposing to change CNAM in a couple of ways:
> 1. We're proposing to move the name determination from the termination to the origination
> 2. We're proposing to send a score
> 
> We think the score can be used very effectively by the termination.  It can be used with a threshold, and the threshold can be set by the termination service provider to match what service level it offers.  More interestingly, it can be used by the termination device to display some useful data to the consumer that is not black or white.
> 
> We want the score to be as accurate as it can be.  The databases have dozens of fields, not all of which are populated for every record.  The more fields you provide, the more accurate the score is (and the higher the score gets).  If all you query with is name and phone number, the confidence level is medium to low.  If you have more data, like address for example, the confidence is considerably higher.  It's not that address is better, it's that if you have a match of name AND number AND address, you have a much smaller error.
> 
> The databases we have today use scores.
> 
> Just as an example, my company has such a database.  It has dozens of fields.  One of the products that is driven by the database is a CNAM service.  When the termination side dips the database, it does so only with the telephone number.  The scores are fairly low, but there is a threshold (I don't actually know the details).  You get either a name or "no entry found" out of that dip.  The database scores the data and applies a threshold to it.
> 
> The exact same database is used for a call center caller match query.  You call the call center; the operator asks your name and address, gets the phone number from ANI, and queries the database (same database) to get a score of how likely it is that the data in the query matches (name and phone number and address match each other).  The service you get depends on that score.
> 
> We can provide a much better service if we have more information, and the devices and services downstream can make use of score data to decide how to present the call.
> 
> Brian
> 
> On Aug 27, 2013, at 6:22 PM, Hadriel Kaplan <hadriel.kaplan@oracle.com> wrote:
> 
>> 
>> On Aug 27, 2013, at 4:41 PM, Brian Rosen <br@brianrosen.net> wrote:
>> 
>>> As I keep saying, over and over, what they are used for today is termination dips, where all you have to query with is the telephone number, and that gets poor scores.
>> 
>> Yes, I know they're queried on termination; and yes, I know they're 
>> queried using the source telephone number.  STIR will provide validity 
>> for that source number, so that you can't pretend to be a source 
>> number you aren't.  That should make things better for calling names, 
>> if the content of the calling name databases is accurate.  You've been 
>> claiming the content of the databases is fairly accurate. (and as far 
>> as I can tell, they have been relatively accurate so far, for at least 
>> CNAM databases though maybe not LIDB ones)
>> 
>> I know many folks don't like the CNAM model, but I believe they don't like it due to the pricing model - not due to bad content, nor due to having to physically query it.  They don't like the fact that the receiver of calls has to pay extra for getting data the far-end wanted to be delivered to begin with.
>> 
>> I don't know what you mean by "that gets poor scores".  As far as I know, there is no such thing as a "score" in the existing PSTN calling name market.  There are name/number/phone-service "types" or category, but not scores of name accuracy afaik.  Are there such things?  Why would anyone claim their score is anything but perfect?
>> 
>> 
>>> What we need is to do the dip at the origination side, where you have more information to make the score larger, and securely carry it in the SIP signaling.  That is the problem to be solved - I have the name, the score, and the identity of the validator.  I have to get that information across reliably, and reliably includes preventing messing with it at the origination side (so the sender can't lie about the score).
>> 
>> I think that jumps to a solution - presumably the problem is "calling names aren't reliable"; the problem is not "we can't send scores securely in SIP".
>> 
>> This whole topic is reminiscent of the debates the SIPPING WG had years ago on:
>> draft-wing-sipping-spam-score
>> draft-schwartz-sipping-spit-saml
>> 
>> -hadriel
>> 
>
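
For readers following the thread, the score-and-threshold model described in Brian's quoted message can be sketched roughly as below. This is only an illustration, not how any real CNAM database works: the `Record` type, the `WEIGHTS` table, the point values, and the `match_score` and `present_name` helpers are all hypothetical names invented here. The only ideas taken from the thread are that a query supplying more matching fields earns a higher score, and that the terminating side applies its own threshold.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical database record; real databases reportedly have dozens of fields.
@dataclass(frozen=True)
class Record:
    number: str
    name: str
    address: Optional[str] = None

# Hypothetical per-field points (out of 100). Integers avoid float rounding.
WEIGHTS = {"number": 30, "name": 30, "address": 40}

def _same(a: Optional[str], b: Optional[str]) -> bool:
    """Case-insensitive comparison; a missing value never matches."""
    return a is not None and b is not None and a.strip().lower() == b.strip().lower()

def match_score(record: Record, query: dict) -> int:
    """Sum the points of every field the query supplies that matches the record."""
    return sum(
        points
        for field, points in WEIGHTS.items()
        if _same(query.get(field), getattr(record, field))
    )

def present_name(record: Record, query: dict, threshold: int) -> Optional[str]:
    """Return the stored name only if the score meets the caller-chosen threshold."""
    return record.name if match_score(record, query) >= threshold else None

record = Record("+12025550100", "Charlotte Credit Union", "1 Main St")

# Termination-style dip: only the telephone number is available.
number_only = {"number": "+12025550100"}
# Origination-style dip: name, number and address are all available.
full_query = {
    "number": "+12025550100",
    "name": "charlotte credit union",
    "address": "1 Main St",
}

print(match_score(record, number_only))       # low score
print(match_score(record, full_query))        # higher score
print(present_name(record, number_only, 50))  # below threshold, so no name
print(present_name(record, full_query, 50))
```

In this sketch the threshold is a parameter rather than a constant, to mirror the point that each terminating provider (or device) could pick its own cutoff, or display the raw score instead of a yes/no answer.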