Re: [cnit] CNIT Charter bashing..

I just don't see this working. It gets worse in the way you describe since there's no way for the recipient to know what secret mix of tools the originating carrier used, when they changed that secret mix and what the number may mean. Also, unlike in the fraud case, where you have a good underlying ground truth (the bad guy stole your money or turns out to be a deadbeat), with caller name information, the recipient has no good way to check whether a score of 70 corresponds to anything. They'll only know if somebody asserts "FBI" and it turns out that the caller doesn't have a badge.

From a consumer perspective, this is completely useless. What would I do with a score of 70? Is that good or bad? Is it only good if the call is from AT&T (which I won't know) but bad if it's from T-Mobile?

I think we need more than assertion and "magic happens" here...

More constructively, I think the model that provides broad indications of how the information was derived is helpful:

(1) self-asserted ("123 Bogus Street, Anytown" is acceptable)

(2) validated (i.e., the company name exists and any other information, such as street address, exists), but no guarantee that the caller is entitled to that name

(3) billing information (the name and location corresponds to a billing or credit card record)

(4) verified (using third party services such as KBA, above whatever threshold the originator considers good enough)

This corresponds very roughly to the current web model - (1) is somebody's gmail address or Google-hosted blog; (2) is a dedicated domain name with plausible whois information; (3) is a standard TLS cert and (4) is an EV cert.
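
A minimal sketch of how the four categories above could be represented, assuming they are treated as an ordered scale (the list numbers them 1 to 4, but the ordering, the names `NameAssurance` and `meets`, and the mapping table are illustrative assumptions, not anything defined on this list):

```python
from enum import IntEnum


class NameAssurance(IntEnum):
    """Hypothetical labels for the four ways a caller name was derived."""
    SELF_ASSERTED = 1  # caller supplied it; nothing was checked
    VALIDATED = 2      # name and address exist, entitlement not checked
    BILLING = 3        # matches a billing or credit card record
    VERIFIED = 4       # third-party verification such as KBA


# The rough web analogy from the paragraph above, kept as plain strings.
WEB_ANALOGY = {
    NameAssurance.SELF_ASSERTED: "gmail address or Google-hosted blog",
    NameAssurance.VALIDATED: "dedicated domain with plausible whois",
    NameAssurance.BILLING: "standard TLS cert",
    NameAssurance.VERIFIED: "EV cert",
}


def meets(level: NameAssurance, minimum: NameAssurance) -> bool:
    """True if `level` is at least as strong as `minimum` (assumed ordering)."""
    return level >= minimum
```

A recipient could then express policy as a minimum category, for example `meets(NameAssurance.BILLING, NameAssurance.VALIDATED)`, without having to interpret a numeric score.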

Henning

________________________________________
From: Brian Rosen [br@brianrosen.net]
Sent: Friday, June 12, 2015 7:20 PM
To: Henning Schulzrinne
Cc: cnit@ietf.org
Subject: Re: [cnit] CNIT Charter bashing..

Okay so, I’ll try to give you meaningful answers to the questions you raise.  I believe I’ve done this before.
> On Jun 12, 2015, at 3:39 PM, Henning Schulzrinne <Henning.Schulzrinne@fcc.gov> wrote:
>
> We have gone around that issue a few times, I think. Without trying to rehash the arguments, I think the value is completely meaningless to the recipient. The value is currently used within a closed environment - company A uses a validation service V and presumably gets a good sense of what "70" means and whether that value is sufficiently high to do whatever they need to do (extend credit, open an account, show tax return information).

Actually, many companies have multiple sources of information, and sometimes use more than one of them.  Since there is zero standardization, the company has to know how to interpret the results, but since there are a relatively small number of sources, it works fine.  When dealing with a simple score, you have some kind of a scaling factor that normalizes them.  That idea will work just fine here.  The receiving end will have scaling factors for the sources it encounters.  There may even be services that provide scaling factors.  You accumulate and adjust your scaling factors based on the observations you have on the data.  Here, it might be customer complaints, or UIs that have “report errors” buttons, or periodic secondary validation systems.  My company could pretty easily calculate a pretty good set of scaling factors given a large enough sample of historical call data with scores based on the databases we have.  So can others.  If a new scoring service shows up, it might have a pretty low scaling factor until you build up enough data to raise it.

So the termination SP, or device, will apply a scale to the score based on the identity of the source of the score and use that to determine what to do.   Depending on the device, it could change the appearance of the name depending on the scaled score, or it could subject the name to alternative validation, or use a third party name source.
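
A rough sketch of the scaling idea described above, under stated assumptions: the source names, factor values, thresholds, and the calibration rule (observed accuracy divided by mean claimed confidence) are all hypothetical and only illustrate one way a receiver might do this.

```python
from dataclasses import dataclass

# Hypothetical per-source scaling factors in [0, 1], kept by the receiver.
SCALING_FACTORS = {"scorer-a.example": 0.95, "scorer-b.example": 0.6}
DEFAULT_FACTOR = 0.3  # assumed low trust for a source with no history yet


@dataclass
class NameAssertion:
    name: str
    score: int   # 0-100 confidence as asserted by the scoring source
    source: str  # identity of whoever signed the name and score


def estimate_factor(observations):
    """Estimate a factor from (score, was_correct) pairs.

    Assumed rule: observed accuracy divided by mean claimed confidence,
    clamped to [0, 1]. With no data, fall back to DEFAULT_FACTOR.
    """
    if not observations:
        return DEFAULT_FACTOR
    accuracy = sum(1 for _, ok in observations if ok) / len(observations)
    claimed = sum(score for score, _ in observations) / (100 * len(observations))
    if claimed == 0:
        return DEFAULT_FACTOR
    return max(0.0, min(1.0, accuracy / claimed))


def scaled_score(assertion):
    """Apply the receiver's factor for this source to the asserted score."""
    factor = SCALING_FACTORS.get(assertion.source, DEFAULT_FACTOR)
    return assertion.score * factor


def display_decision(assertion, show_at=60.0, verify_at=30.0):
    """Pick an action from the scaled score (thresholds are assumptions)."""
    s = scaled_score(assertion)
    if s >= show_at:
        return "show"      # display the name as received
    if s >= verify_at:
        return "verify"    # subject the name to alternative validation
    return "fallback"      # use a third party name source instead
```

With these made-up numbers, a score of 93 from `scorer-a.example` is shown as received, a score of 70 from `scorer-b.example` is sent for secondary validation, and a score of 70 from an unknown source falls back to another name source.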

> When company B receives this information, it is completely meaningless to that recipient, as the value will depend on what information the customer A provided, whether A used V, W or X for validation and when this information was validated. Does 70 mean that 70% of the customers with that type of information are indeed who they claim to be? Or just that the person answered 7 out of 10 questions correctly?
The score can be defined as a confidence percentage.  70 means there is a 70% chance the name is correct.  The scoring service is free to use any method it wants to come up with the score.
>
> Thus, this information is meant to be interpreted within a particular context, and taking it out of this context renders it meaningless.
The context is very well defined - what is the name of the entity placing the call?  The score is the confidence we have in the answer provided.

>
> Thus, unless these issues can be addressed, we would be conveying information that pretends to be accurate, but is just noise. Brian, you repeat the same idea, but never address the issues that get raised again and again. This is not helpful.
No, it’s the exact opposite.  When you send just a name without a score, you are pretending it’s 100% accurate, and that is clearly wrong.  When you send a score, you acknowledge that the data is not 100% accurate, and you show what your confidence is in the information you are providing.  Since we have a lot of experience, we know how to get quality scoring.  What we can’t do is mandate a good scoring methodology or source, so we have to have some more complications like the scaling factor.  But we do that now, because there is lots of competition for data, having more than one source is common and yet we know they don’t all provide the same quality of data.  So we deal with that.

>
> We also now know from the IRS tax return debacle that knowledge-based authentication varies in quality. I'm sure whoever got access to the tax returns scored a 100 on whatever questions the web site asked…
Certainly.  Any scoring system can be gamed.  Nothing is perfect.  But just saying the guy’s name is Clark Kent is nowhere near as useful as saying that Name Scoring Service Inc says there is a 93% probability that the guy’s name is Clark Kent.  If someone says 100%, you should not believe them.

These kinds of systems are common.  We use them in lots of environments.  We know how they work.  We know what their limitations are.  This isn’t rocket science, but it is data science.  Why are you so skeptical of proven systems?

Brian
>
> ________________________________________
> From: Brian Rosen [br@brianrosen.net]
> Sent: Friday, June 12, 2015 1:51 PM
> To: Dwight, Timothy M (Tim)
> Cc: Richard Shockey; philippe.fouquart@orange.com; Henning Schulzrinne; cnit@ietf.org; Stephen Farrell
> Subject: Re: [cnit] CNIT Charter bashing..
>
> Yes, it would be assigned by the entity that signed the name.
>
> It’s not true that it would always be the highest possible value.  If the entity that provided it did that, the receiving entity might not believe it, and choose to use an alternative name source (or at least check another service to see what it thought).  Modern systems that collect names subject user data to verification sources that are getting very accurate, but those services have scoring systems that don’t result in black/white results.  Older systems blithely accept whatever the customer says, and those are not accurate.  Some services have something simpler, like the name has to match a credit card name, but we know those are pretty spoofable these days.  Real systems use external verification services that provide scores for this kind of thing.
>
> I’m simply proposing that we allow an optional confidence, in the range of 0-100, and that it be part of the data signed by the data provider.  No one has to send it, no one has to look at it.  But it represents what is state of the art these days on asserting names and I think it’s valuable.
>
> Brian
>> On Jun 12, 2015, at 9:56 AM, Dwight, Timothy M (Tim) <timothy.dwight@verizon.com> wrote:
>>
>> Who would assign the confidence value?  If it's assigned by the entity that operates the calling name database, why would it ever be less than the highest possible value?  If it's set by some other entity, on what basis do they determine the value they assign?  It seems like we're going to stumble over business issues.
>>
>> Tim
>>
>>
>> -----Original Message-----
>> From: cnit [mailto:cnit-bounces@ietf.org] On Behalf Of Brian Rosen
>> Sent: Friday, June 12, 2015 11:28 AM
>> To: Richard Shockey
>> Cc: philippe.fouquart@orange.com; Henning Schulzrinne; cnit@ietf.org; Stephen Farrell
>> Subject: Re: [cnit] CNIT Charter bashing..
>>
>> One possible extra bit is that we need to know WHO signed.  That could be easy (identity in a cert for the signature), but it’s a requirement.
>>
>> I still want an optional confidence value, because the source is often not authoritative.
>>
>> If we’re thinking we’re using the existing display name, and coming up with a way to sign it, then, like stir, the termination side can decide what it wants to do if it gets a display name but no signature.  The sender has the option to provide the name or not, and provide the signature or not.
>>
>> We COULD consider a new header that would contain the name encrypted for a destination TN (To:).  That would afford privacy to the name to middle boxes that we would not have today with display name.  I would not be opposed to that.  This would work like the offline stir proposal, where the sender obtains the public key of the recipient and encrypts the name for the recipient.
>>
>> Brian
>>
>>> On Jun 12, 2015, at 8:49 AM, Richard Shockey <richard@shockey.us> wrote:
>>>
>>>
>>> Henning is right. No one is forcing anything. Existing anonymous
>>> calling protections still apply.
>>>
>>>
>>> Again my point is that in a great many cases interconnected SIP
>>> between NA carriers is covered by other security mechanisms.
>>>
>>> Right now your FaceTime session is totally in the clear. My concern is
>>> that if we end up going down the rat hole of trying to create perfect
>>> end-to-end security, nothing will get done.
>>>
>>>
>>>
>>> On 6/12/15, 10:17 AM, "Stephen Farrell" <stephen.farrell@cs.tcd.ie> wrote:
>>>
>>>>
>>>>
>>>> On 12/06/15 15:13, Henning Schulzrinne wrote:
>>>>> In almost all cases of interest, the calling party *wants* to
>>>>> disclose accurate information to the called party, so the privacy
>>>>> issues don't seem to arise. They would only arise if there was
>>>>> forced disclosure; I don't think anybody is proposing that.
>>>>
>>>> Privacy issues could also arise if a middlebox could now see
>>>> sensitive information that it previously could not see. I think that
>>>> is independent of whether disclosure is desired by either of the
>>>> endpoints.
>>>>
>>>> S.
>>>>
>>>> _______________________________________________
>>>> cnit mailing list
>>>> cnit@ietf.org
>>>> https://www.ietf.org/mailman/listinfo/cnit
>>>
>>>
>>
>
>