Re: [Geopriv] draft-jones-geopriv-sigpos-survey

Kipp Jones <kjones@skyhookwireless.com> Wed, 17 October 2012 20:50 UTC

From: Kipp Jones <kjones@skyhookwireless.com>
To: GEOPRIV WG <geopriv@ietf.org>
Date: Wed, 17 Oct 2012 15:50:43 -0500

Thanks again for all of the comments, and apologies for the delayed response.  I've attempted to assimilate all of the comments and responses here and will provide a revision of the draft that incorporates these and cleans up a couple of other nits.


>>>Usage Models [Martin Thomson]
There seem to be two very different usage models: (1) crowd-source, (2) contracted to do a survey with specialized equipment  ... These models seem to be quite different, especially when one talks about licensing

On reviewing the comments and discussions, I think this issue muddies the water.  The primary purpose of this specification is for 'formal' and 'semi-formal' (directed) surveys of a given venue.  Therefore, I'm stripping out language and consideration for the 'crowdsourcing' of venue surveys and explicitly identifying this as not addressed.  I believe there will be many different models that will be experimented with for the crowdsource model and I think it is too early to try to nail down a specific model/specification to encompass these use cases.

However, directed venue surveys have been performed for years, and the process should be able to be wrangled into a structure that supports a more common model for interoperability, privacy, and extensibility.


>Licensing / Terms of Use

There were several discussions on the topic of licensing, ownership and related topics.  I'll try to summarize and respond below.

The primary intent for including the licensing component in the specification is to provide both data provenance as well as rights declarations for usage of the data. I've simplified the structure to support a more focused set of licensing terms and clarified their meanings in this context.

As several pointed out, the idea of data 'ownership' is a tricky one, especially in light of the fact that this specification is covering signals that are broadcast and in the public spectrum.  However, it is not the signal information that is being addressed, but rather the location information.  To wit, the signals are associated with locations, some of which may be within controlled access areas in a given venue.  For example, it is reasonable that a corporation would desire to have accurate indoor location within its facilities, but it may not want this to be publicly accessible.

The specific data is a product of the venue owner that either produced it or contracted to have it produced.  Thus, there are no specific deterrents from another person/company producing a similar dataset, excepting perhaps gaining legal access to the facility for this purpose.

To make this more clear, I've changed the wording from 'owner' to 'licensor', as this better describes the relationship and allows some flexibility as to who the licensor is (venue owner, proprietor, leaser, etc.).



>>>>[Martin Thomson]: This is a minor structuring point, but I think the
location of the beacon is a property of the venue (at a point in time)
instead of a property of the survey. I see a survey as a collection of
measurements

I agree with Martin's point. The intent is for the venue owner/manager to be able to retain rights to the location data that is generated during a venue survey. Thus the rights to the location data generated by the survey are the property of the licensor.  It is feasible that somebody could surreptitiously survey a venue without the consent of the venue owner, but I don't think we want to, or should, deal with that scenario within this specification.


>Data Usage and Rights

There is an additional issue having to do with derivative uses of the data.  High-quality survey data could be used to 'seed' a location system, allowing an organization to build their own beacon location database over time. At some point, the original data could be removed but the derived database could continue to be used.  For data licensors who wish to protect against this, the notion of derivative rights is included in the license structure.

>>>>[Robin Wilton]
I think the 'finesse' here might be to draw the distinction between
i - the venue owner's entitlement to exercise their rights relating to the premises, the private networks etc
ii - the rights individuals might have relating to data about them that is captured as a result of their presence/movement/actions in a given space

This specification does not address the actual determination of end users' locations.  Instead, it is concerned with the rights of the Venue data licensor with respect to their hardware and the location of that data and the related signals.  And, as I noted above, I am excluding the 'crowdsource' model from this specification, thus eliminating the concerns (I hope) with respect to the location data of end users.


>>>> [Martin Thomson] it may be possible to deal with licensing declarations out of band, I am interested in hearing a clear argument for why in-band declaration of licensing in the interchange is the
right way forward

While I agree that it would be possible to deal with licensing in an out-of-band model, the lack of connectivity between the data and the licensing is at best disconcerting.  Including the terms of use with the data means that the terms travel with the data. There is no doubt about the terms.  I'm leaning on my experience with software, wherein the license terms are either spelled out within the code or have a reference to a specific license or set of terms of use.

I believe the explicit linkage of data to the license provides a stronger bond between the two and thus more assurance that the terms will be honored.
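
As a rough illustration of what "the terms travel with the data" could look like, here is a minimal Python sketch. It is not the draft's syntax: the class names, the field names (licensor, terms_uri, allow_derivatives), the JSON serialization, and the example values are all assumptions made up for this illustration.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class License:
    # Hypothetical fields; the draft defines its own license structure.
    licensor: str             # venue owner, proprietor, etc.
    terms_uri: str            # reference to the full terms of use
    allow_derivatives: bool   # may the data seed a derived database?


@dataclass
class Survey:
    venue: str
    license: License          # in-band: the terms are part of the document
    measurements: list = field(default_factory=list)


def serialize(survey: Survey) -> str:
    """Serialize the survey; the license is always emitted with the data."""
    return json.dumps(asdict(survey), sort_keys=True)


survey = Survey(
    venue="Example Venue",
    license=License("Example Venue LLC", "https://example.com/terms", False),
    measurements=[{"bssid": "00:11:22:33:44:55", "rssi": -60}],
)
doc = json.loads(serialize(survey))
```

Because the license is a required field of the survey object in this sketch, a consumer that parses the document cannot obtain the measurements without also receiving the licensor and the derivative-rights declaration.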

>>>Transport Mechanism
[Adam Roach] What appears to be missing is a discussion of how these documents move around the network. Based on the microphone discussion with the author, it appears that the current plan for moving this information around is something like using PUT or POST requests over https. I think it would be of great use to cover this topic, either in this document or in a separate one.

Thanks Adam, excellent call.  I will add this to the new revision.  The focus will be HTTP/TLS as transport.


Key things to consider would be:
* What https URL is used for the PUT or POST?

I would expect the URL to be service provider dependent.  Thus if Venue X produced a survey of their Venue, they would likely already be in contact with a location service provider to acquire the proper URL and credentials. It would also be reasonable for them to know their own URL in the case of self-provisioning.

  o How is it provisioned into survey devices?


The URL and credentials would be added during the configuration/set up for the survey along with additional information (e.g. Licensor information, Building location, Licensing info, etc.).

  o Is it valid to use http instead of https?

HTTPS only.


* What kind of authentication is used? Basic? Digest? Client certs? S/MIME signatures? Something else?

Basic over HTTPS

* What can the client infer from the HTTP response code regarding the state of the information it has submitted?

Good question. I'm using RFC5985 (HELD) as a template for defining this section.  If there is a better model, feel free to point it out.  The draft revision will include this information.

* Is there utility in defining additional transport mechanisms?

Not currently. Do you think there needs to be?


  o SIP SUBSCRIBE/NOTIFY?

Not sure why this would be necessary.

  o Is there value in distributing this information in Atom over HTTP?

Again, I'm not sure of the value of this. Can you elaborate?



* Do we need to define an inter-server protocol for exchanging information between survey aggregators?

I believe the same mechanism can be used for inter-aggregator communication.
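
For illustration only, the answers above (POST, HTTPS rather than HTTP, Basic authentication, a provider-specific URL provisioned at survey setup) could be sketched as follows in Python. The function name, the example URL and credentials, and the "application/xml" media type are assumptions and are not taken from the draft; the request is built but not sent.

```python
import base64
import urllib.request
from urllib.parse import urlsplit


def build_survey_upload(url: str, user: str, password: str,
                        body: bytes) -> urllib.request.Request:
    """Build (but do not send) an HTTPS POST carrying a survey document."""
    # Refuse plain http, on the reading that uploads are HTTPS only.
    if urlsplit(url).scheme != "https":
        raise ValueError("survey uploads must use https")
    # HTTP Basic authentication: base64 of "user:password".
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/xml",  # assumed media type
        },
    )


# Hypothetical URL and credentials, as provisioned during survey setup.
req = build_survey_upload(
    "https://lsp.example.com/surveys", "surveyor", "s3cret", b"<survey/>"
)
# urllib.request.urlopen(req) would perform the upload; the meaning of each
# response code is still to be defined in the draft revision.
```

Since the same request shape works for any HTTPS endpoint, it could also serve for aggregator-to-aggregator exchange, with only the URL and credentials changing.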


>>>Capturing NMEA/Ephemerides
[Martin Thomson] You might also consider capturing ephemerides if you are capturing NMEA sentences and if you want an archival format.

I'm looking into this to see what additional value the ephemeris data would provide and how common this practice is.


>>>Privacy Concerns wrt Device Characteristics
[Martin Thomson] There are privacy implications on the use of device characteristics, useful as they are.

I believe the privacy concerns would be more applicable in a 'crowdsource' model of survey.  Do you still have concerns with respect to device characteristics in the directed survey model?






========================================
FROM Alissa Cooper - IETF 84 Minutes


--------------------------
Indoor Signal Position Conveyance by Kipp Jones (of Skyhook)
---------------------------
Document: draft-jones-geopriv-sigpos-survey
Slides: http://www.ietf.org/proceedings/84/slides/slides-84-geopriv-0.pdf

-- Adam Roach: This is an interesting problem space. However, I don't
see anything in the spec about transport mechanism for this data. How
this is sent across the network will have a lot of effect
Answer: I expect this will probably be transported by HTTPS, but I
welcome discussion of how to talk about transport in the document

-- Martin Thomson: I have read this. There seem to be two very
different usage models: (1) crowd-source, (2) contracted to do a
survey with specialized equipment
  ... These models seem to be quite different, especially when one
talks about licensing
  ... I think it may be possible to deal with licensing
declarations out of band, I am interested in hearing a clear argument
for why in-band declaration of licensing in the interchange is the
right way forward
  ... and you certainly make different use of data depending on
what model it comes from
  ... raises questions about crowd-sourcing in private spaces
Answer: There seems to be some fuzziness in the industry with
regards to things like definitions of private spaces with regards to
wireless signals.

-- Martin Thomson: This is a minor structuring point, but I think the
location of the beacon is a property of the venue (at a point in time)
instead of a property of the survey. I see a survey as a collection of
measurements. We should talk about this more offline

-- Robin Wilton: I think this is interesting. I think there are some
issues to explore which may be applicable to other contexts (in
particular, from a privacy and data-rights point-of-view).
1. User opt-in opt-out by default
2. Transparency
3. Use of the data
Therefore, it may be valuable to document some of these
considerations in the document.

-- Alissa: Question about the other players in this space. We have a
lot of relevant expertise in this room, but we don't have all the
people who would benefit from this standardization.
Answer: This is somewhat of a chicken-egg problem. It is easier to
get people interested in participating once it has been accepted by
a working group. I appreciate any guidance on how best to solve this
chicken-egg problem.

-- Many hands think this is an interesting problem that could benefit
from st





Thanks Ted - entirely rational. I was perhaps misinterpreting the emphasis of this work (i.e. on mapping the premises, rather than on mapping the movement of customers) - so apologies if I was tilting at the wrong windmill ;^)


I think the 'finesse' here might be to draw the distinction between

i - the venue owner's entitlement to exercise their rights relating to the premises, the private networks etc

ii - the rights individuals might have relating to data about them that is captured as a result of their presence/movement/actions in a given space

(And I also fully appreciate that there are different legal contexts (and different social expectations) relating to this, depending which country etc this is happening in.)

Engagement/discussion with venue owners would be great… because stakeholder engagement is always good, and because down the road, one foreseeable outcome is some form of self-regulatory code of conduct on the collection/use of this kind of data.

Yrs.,
Robin


Robin Wilton
Technical Outreach Director - Identity and Privacy
Internet Society

email: wilton@isoc.org<mailto:wilton@isoc.org>
Phone: +44 705 005 2931
Twitter: @futureidentity




On 1 Aug 2012, at 12:54, Ted Morgan wrote:

Fair points and maybe there is better language to use but the venue owners (think Best Buy, MGM Grand, Mall owners, corp campuses, Ball parks) are very protective about what happens on their premises and do view those wifi networks as something very proprietary.  These aren't public hotspots.  Now, we know that anyone can observe and collect these signals on their own, but what is trying to be accomplished with this standard is to have the venue owners participate in and drive the mapping of their venues.  Some comments we have heard "why should I let someone scan my ball park and then make money off of apps/services which we see none of?" "I don't want XXXX coming in here and mapping my store only to offer comparative shopping services to show people how to leave my store and buy a product elsewhere".  So if they are going to be involved, they want control.  We may not like it but our alternative is to watch the adoption move at the rate it has for the last ten years (slow).  With this model, they can say "anyone can come into my venue and do detailed mapping as long as they support the standard".

Hopefully the venue owners will get involved in this list as well so you can hear their concerns directly.


On Aug 1, 2012, at 2:20 PM, Robin Wilton wrote:

The word 'ownership' always makes me nervous in this kind of context… and apologies, this is one of my pet rant topics, so I promise not to beat the subject to death. Bottom line: it may be healthier if we refer to venues as being 'data controllers', rather than 'data owners'.

In brief, the underlying point is this:

You've probably all seen privacy threads where an aggrieved data subject says "All I want is to be given back *my* data"… The implicit assumption is that, in some way, I 'own' my [sic] personal data. Unfortunately, not far down the line that leads to all kinds of unwanted consequences, and therefore we're better off not starting out with a model based on concepts of 'ownership' if at all possible.

For instance, as Bob Blakley pithily put it, "You can't control the stories other people tell about you". There's lots of personal data about you over which you have no control, let alone 'ownership', because it's generated by other people. The only time you get control over it is, for instance, if the information is libellous. Even then, you don't get 'ownership' of the data, but you get the opportunity to exercise certain rights pertaining to it.

Similarly, a model based on a concept of 'ownership' doesn't work well for informational resources that can be 'stolen' from you, yet still leave you in possession of the data. Think of copyright digital media… you own the CD of Beethoven's 5th., but there are rights to do with the original work (or the performance) that you don't enjoy.

Legally, there are distinctions between the treatment of "personal property" (or personalty) and "real property" (or realty), and if you want to delve into that aspect, my own belief is that we're better off treating personal data as if it were realty than as if it were personalty. (Happy to take that offline… ;^)

I know this is a rather terse and dense statement of the issue, and my apologies again for that - there are doubtless points here that could be unpacked ad infinitum - but suffice to say, I think an approach based on assumptions of 'rights' over data has fewer problems than one based on assumptions of 'ownership'. I think this is especially true of personal data (including location/tracking/behavioural data): it makes little sense to claim that I 'own' the data collected about my path through a shopping mall, but it makes a lot of sense to claim that I have certain rights relating to data about me.

Hope this is helpful -


Robin

Robin Wilton
Technical Outreach Director - Identity and Privacy
Internet Society

email: wilton@isoc.org<mailto:wilton@isoc.org>
Phone: +44 705 005 2931
Twitter: @futureidentity




On 1 Aug 2012, at 10:29, Ted Morgan wrote:

Well, the venues will be unwilling to interchange their data if there is not some licensing model associated with it.  That ownership question has dramatically slowed the adoption of indoor technologies.  In fact venue owners have told us data ownership is the single biggest factor keeping them from deploying in-store location technology.  If not part of a standard like this, where would that *promise* be accommodated?


On Aug 1, 2012, at 12:03 PM, Martin Thomson wrote:

My own:

Given what I know of the deployment models for this sort of
technology, I don't see that license information is a) useful to an
automaton, and b) necessary for a standardized interchange format.

Also, I don't see the beacons as being relevant to the survey, but
more to the venue.  (You might also consider capturing ephemerides if
you are capturing NMEA sentences and if you want an archival format.)

There are privacy implications on the use of device characteristics,
useful as they are.

To Adam's points:

I believe that there is a fair degree of pre-arrangement involved in
this that may alleviate most of your concerns.  A description of the
two primary modes of operation is important, because that makes the
considerations clearer.

On 31 July 2012 14:02, Robin Wilton <wilton@isoc.org<mailto:wilton@isoc.org>> wrote:
Thanks Adam -

In the same spirit, here's a quick recap of my remarks at the mic….

The 'venue survey' use-case is an interesting one in its own right, but also
gives an opportunity to explore issues which will be relevant to privacy in
other contexts. Three such issues are:

1 - Opt-in versus opt-out by default. There have been a lot of
'opted-in-by-default' service deployments in recent years, with
corresponding concern about user notice and consent. The venue survey
use-case is an opportunity to explore the alternative of
'opted-out-by-default' and to look at ways of encouraging the data subject
to take part on the basis of an engaged awareness.

2 - User transparency; as noted by another commenter, there's a potential
lack of transparency here, with the consequence that individuals may have no
idea that their information is being collected or why… A paper that explores
the various possible means of making users aware of what's going on would be
of value.

3 - The need to take account of possible third party access to the
geolocation data collected through the survey. The temptation is to design
for the 'straight-through' use-case where the only parties involved are the
data subject, the venue owner and the data collector. For privacy purposes,
the proposed document ought at least to acknowledge the risks arising out of
potentially malicious third-party access to data collected.

Hope this helps -

Robin

Robin Wilton
Technical Outreach Director - Identity and Privacy
Internet Society

email: wilton@isoc.org<mailto:wilton@isoc.org>
Phone: +44 705 005 2931
Twitter: @futureidentity




On 31 Jul 2012, at 13:50, Adam Roach wrote:

This is intended to be a reiteration of the points I made at the microphone
today, for the purpose of stimulating discussion on the mailing list.

The document covers an aspect of something that I think is interesting and
has utility. At the moment, it covers a syntax for conveying survey
information as well as a fairly rigorous description of what each element in
that information means.

What appears to be missing is a discussion of how these documents move
around the network. Based on the microphone discussion with the author, it
appears that the current plan for moving this information around is
something like using PUT or POST requests over https. I think it would be of
great use to cover this topic, either in this document or in a separate one.

Key things to consider would be:

* What https URL is used for the PUT or POST?
  o How is it provisioned into survey devices?
  o Is it valid to use http instead of https?
* What kind of authentication is used? Basic? Digest? Client certs?
S/MIME signatures? Something else?
* What can the client infer from the HTTP response code regarding the
state of the information it has submitted?
* Is there utility in defining additional transport mechanisms?
  o SIP SUBSCRIBE/NOTIFY?
  o Is there value in distributing this information in Atom over HTTP?
* Do we need to define an inter-server protocol for exchanging
information between survey aggregators?


In terms of permissions (which is related to the privacy question raised by
the presentation), it seems that this kind of document would be well suited
for talking about using Common Policy (RFC 4745). However, without having a
better grip on the larger architecture in which these documents will be
used, I can't really come up with useful suggestions about how such
documents are conveyed/applied/etc.


/a

_______________________________________________
Geopriv mailing list
Geopriv@ietf.org<mailto:Geopriv@ietf.org>
https://www.ietf.org/mailman/listinfo/geopriv







..............................................
Kipp Jones
Chief Architect/Privacy Czar
kjones@skyhookwireless.com<mailto:kjones@skyhookwireless.com>
m: 404.213.9293 | @skykipp
