Re: [cnit] [stir] Reputation vs Display name (was Textual caller ID)

Alex Bobotek <alex@bobotek.net> Wed, 28 August 2013 06:02 UTC

From: Alex Bobotek <alex@bobotek.net>
To: Brian Rosen <br@brianrosen.net>, Hadriel Kaplan <hadriel.kaplan@oracle.com>
Date: Tue, 27 Aug 2013 22:55:06 -0700
Cc: "cnit@ietf.org" <cnit@ietf.org>, Stephen Farrell <stephen.farrell@cs.tcd.ie>, "Dwight, Timothy M (Tim)" <timothy.dwight@verizon.com>, Henning Schulzrinne <Henning.Schulzrinne@fcc.gov>
Subject: Re: [cnit] [stir] Reputation vs Display name (was Textual caller ID)

I accept as generally positive the points that the originating service provider should present, and possibly sign, the display name.  The originating service provider is usually in the best position to provide this information.  And if some notion of originator reputation (e.g., a 'score') is also presented and signed by the originating service provider, great.

A display name service cannot provide an adequate basis for trust for several reasons: 

1.  Linear, unstructured text names cannot prevent the recipient from being fooled.  If an attacker wants their display name to be 'Charlotte C Union' in order to vish, smish or phish Charlotte Credit Union's customers, they may.  Yet what might be Ms. Charlotte's 'personal' calls shouldn't be blocked.

2.  Further, even a perfect taxonomy of caller types (bank, person, ...) that tells a called party that the caller is a person rather than a bank would offer inadequate protection to vulnerable (e.g., permanently or temporarily disabled) individuals.  Criminals are creative and will always find ways to exploit the vulnerable with naming similarities.  Nor could such a taxonomy be counted on to indicate telemarketing, at least not without global adoption of identity authentication and registration requirements such as India's.  I do not see how a practical role-based naming taxonomy could be comprehensive enough to prevent these exploits, but my ears are open if someone could point me to such an explanation.

3.  We cannot assume that all originating service providers will be trustworthy or even well intentioned.  The opening of the service provider market is increasing access to both reputable and disreputable parties.  Display Names provided/signed only by an originating service provider should not be trusted more than the service provider itself.  
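
To make point 1 concrete, here is a rough sketch (mine, for illustration only) of why plain-text names are hard to police.  It uses Python's standard difflib to compare candidate display names against a protected name.  The function name, the sample names other than the Charlotte ones, and the idea of a similarity cutoff are all assumptions for the example, not part of any proposal on this list.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two display names.

    Case and runs of whitespace are normalized first. This is a generic
    string comparison, not a real anti-spoofing algorithm.
    """
    def norm(s: str) -> str:
        return " ".join(s.lower().split())

    return SequenceMatcher(None, norm(a), norm(b)).ratio()


protected = "Charlotte Credit Union"

# Hypothetical display names presented by callers.
for candidate in ["Charlotte C Union", "Charlotte Smith", "Acme Plumbing"]:
    print(f"{candidate!r}: {name_similarity(candidate, protected):.2f}")
```

The lookalike scores very close to the protected name, but an innocent "Charlotte Smith" also scores well above an unrelated name.  Any cutoff low enough to catch creative lookalikes risks blocking Ms. Charlotte's personal calls, which is the dilemma described in point 1.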

Given unique authenticated identities of the calling party and originating service provider, reputation-based systems incorporating called party feedback and calling pattern analysis can be, and currently are being, used successfully in some telephony applications (e.g., text messaging) to pressure peer service providers to police their originations and originating parties, as well as to block calls from disreputable callers and, if necessary, disreputable service providers.  In much more heavily abused communications media such as email, reputation does the heavy lifting.
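
As a rough illustration of what I mean by combining called party feedback with calling pattern analysis, consider the sketch below.  Everything in it is an assumption made up for the example: the field names, the weights, the smoothing constant and the 0..1 scale do not describe any deployed system.

```python
from dataclasses import dataclass


@dataclass
class OriginatorStats:
    calls_placed: int  # calls attributed to one authenticated identity
    complaints: int    # abuse reports from called parties (feedback)
    short_calls: int   # calls abandoned within seconds (a pattern signal)


def reputation(stats: OriginatorStats, prior_calls: int = 50) -> float:
    """Return a reputation in [0, 1], where 1.0 is best.

    prior_calls smooths the rates so a new originator with little
    history is not condemned by a single complaint. The weights (20 and
    0.5) are illustrative only.
    """
    total = stats.calls_placed + prior_calls
    complaint_rate = stats.complaints / total
    short_rate = stats.short_calls / total
    penalty = min(1.0, 20 * complaint_rate + 0.5 * short_rate)
    return 1.0 - penalty


def provider_reputation(originators: list[OriginatorStats]) -> float:
    """Call-weighted average of the reputations of a provider's originators."""
    total_calls = sum(o.calls_placed for o in originators)
    if total_calls == 0:
        return 1.0  # no traffic observed, so nothing to hold against it
    weighted = sum(reputation(o) * o.calls_placed for o in originators)
    return weighted / total_calls
```

The point of the second function is that a provider which fails to police its originators inherits their reputation, which is what creates the pressure on peer service providers.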

We need notions of service provider and originator reputation to control several types of abuse such as vishing and smishing that authenticated names cannot adequately mitigate.  Called party Display Name is beneficial, but insufficient.  

It appears to me the 'score' Brian mentions is one notion of reputation.  But it's only one, and IMHO reputation needs much more extensive discussion.   

Personally, I believe that the telephony security framework should include the following:

* Authenticated identity (phone number)
* Display name
* Reputation framework that can be used to filter calls and indicate trust, 

and possibly an

* Authorization framework that can be used to prove authority to use (e.g., by a fundraising boiler room or integrated messaging system) a particular identity.


Regards,

Alex



-----Original Message-----
From: Brian Rosen [mailto:br@brianrosen.net] 
Sent: Tuesday, August 27, 2013 3:50 PM
To: Hadriel Kaplan
Cc: Henning Schulzrinne; cnit@ietf.org; Alex Bobotek; Dwight, Timothy M (Tim); Stephen Farrell
Subject: Re: [cnit] [stir] Reputation vs Display name (was Textual caller ID)

We're proposing to change CNAM in a couple of ways:
1. We're proposing to move the name determination from the termination to the origination
2. We're proposing to send a score

We think the score can be used very effectively by the termination.  It can be used with a threshold, and the threshold can be set by the termination service provider to match what service level it offers.  More interestingly, it can be used by the termination device to display some useful data to the consumer that is not black or white.
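
A minimal sketch of the threshold idea described above, assuming a score normalized to the range 0..1 (the actual scale is not stated in this thread).  The function name, the two cutoffs and the display strings are hypothetical.

```python
def present_call(name: str, score: float,
                 provider_threshold: float = 0.5,
                 high_confidence: float = 0.8) -> str:
    """Decide what the terminating device shows for an incoming call.

    provider_threshold is set by the termination service provider to
    match the service level it offers. Between the two cutoffs the
    device shows the name with a caveat instead of a black-or-white
    answer.
    """
    if score < provider_threshold:
        return "Unverified caller"
    if score < high_confidence:
        return f"{name} (unconfirmed)"
    return name


print(present_call("Charlotte Credit Union", 0.3))
print(present_call("Charlotte Credit Union", 0.6))
print(present_call("Charlotte Credit Union", 0.9))
```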

We want the score to be as accurate as it can be.  The databases have dozens of fields, not all of which are populated for every record.  The more fields you provide, the more accurate the score is (and the higher the score gets).  If all you query with is name and phone number, the confidence level is medium to low.  If you have more data, like address for example, the confidence is considerably higher.  It's not that address is better, it's that if you have a match of name AND number AND address, you have a much smaller error.
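
One way to picture why more matching fields give a higher score: treat each agreeing field as independent evidence that shrinks the remaining chance of error.  The sketch below is only an illustration under that assumption.  The weights, field names and sample record are invented, and real databases will score differently.

```python
# Hypothetical per-field evidence weights: the fraction of the remaining
# error that a match on this field removes.
FIELD_WEIGHTS = {"number": 0.3, "name": 0.4, "address": 0.7}


def match_score(query: dict, record: dict) -> float:
    """Return a 0..1 confidence that the query describes this record."""
    error = 1.0
    for field, weight in FIELD_WEIGHTS.items():
        if field not in query:
            continue  # field not supplied: adds no evidence either way
        if query[field] != record.get(field):
            return 0.0  # a contradicting field: treat as no match
        error *= 1.0 - weight  # each agreeing field shrinks the error
    return 1.0 - error


record = {"number": "+17045550100", "name": "C Smith", "address": "1 Main St"}

number_only = match_score({"number": "+17045550100"}, record)
with_name = match_score({"number": "+17045550100", "name": "C Smith"}, record)
with_address = match_score(record, record)
print(number_only, with_name, with_address)
```

A number-only query yields the lowest score, name plus number a middling one, and name AND number AND address the highest, not because address is a better field but because the combined match leaves a much smaller error.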

The databases we have today use scores.

Just as an example, my company has such a database.  It has dozens of fields.  One of the products that is driven by the database is a CNAM service.  When the termination side dips the database, it does so only with the telephone number.  The scores are fairly low, but there is a threshold (I don't actually know the details).  You get a name, or "no entry found", out of that dip.  The database scores the data and applies a threshold to it.

The exact same database is used for a call center caller match query.  You call the call center, the operator asks your name and address, gets the phone number from ANI, and queries the database (same database) to get a score of how likely it is that the data in the query match (name and phone number and address match each other).  The service you get depends on that score.

We can provide a much better service if we have more information, and the devices and services downstream can make use of score data to decide how to present the call.

Brian

On Aug 27, 2013, at 6:22 PM, Hadriel Kaplan <hadriel.kaplan@oracle.com> wrote:

> 
> On Aug 27, 2013, at 4:41 PM, Brian Rosen <br@brianrosen.net> wrote:
> 
>> As I keep saying, over and over, what they are used for today is termination dips, where all you have to query with is the telephone number, and that gets poor scores.
> 
> Yes, I know they're queried on termination; and yes, I know they're 
> queried using the source telephone number.  STIR will provide validity 
> for that source number, so that you can't pretend to be a source 
> number you aren't.  That should make things better for calling names, 
> if the content of the calling name databases is accurate.  You've been 
> claiming the content of the databases is fairly accurate. (and as far 
> as I can tell, they have been relatively accurate so far, for at least 
> CNAM databases though maybe not LIDB ones)
> 
> I know many folks don't like the CNAM model, but I believe they don't like it due to the pricing model - not due to bad content, nor due to having to physically query it.  They don't like the fact that the receiver of calls has to pay extra for getting data the far-end wanted to be delivered to begin with.
> 
> I don't know what you mean by "that gets poor scores".  As far as I know, there is no such thing as a "score" in the existing PSTN calling name market.  There are name/number/phone-service "types" or category, but not scores of name accuracy afaik.  Are there such things?  Why would anyone claim their score is anything but perfect?
> 
> 
>> What we need is to do the dip at the origination side, where you have more information to make the score larger, and securely carry it in the SIP signaling.  That is the problem to be solved - I have the name, the score, and the identity of the validator.  I have to get that information across reliably, and reliably includes preventing messing with it at the origination side (so the sender can't lie about the score).
> 
> I think that jumps to a solution - presumably the problem is "calling names aren't reliable"; the problem is not "we can't send scores securely in SIP".
> 
> This whole topic is reminiscent of the debates the SIPPING WG had years ago on:
> draft-wing-sipping-spam-score
> draft-schwartz-sipping-spit-saml
> 
> -hadriel
>