Re: [dhcwg] Stephen Farrell's Discuss on draft-ietf-dhc-dhcpv6-active-leasequery-03: (with DISCUSS and COMMENT)

Kim Kinnear <kkinnear@cisco.com> Tue, 14 July 2015 17:30 UTC

From: Kim Kinnear <kkinnear@cisco.com>
In-Reply-To: <55A46641.6030204@cs.tcd.ie>
Date: Tue, 14 Jul 2015 13:30:00 -0400
Message-Id: <1C365154-9EB5-401F-A130-B51B0C77294E@cisco.com>
References: <20150708114206.28697.67541.idtracker@ietfa.amsl.com> <74130B2E-D0EC-4263-9403-421EED783B92@cisco.com> <55A1160A.8050805@cs.tcd.ie> <2F494A40-0C52-420D-BBAC-8242F5C5BD8C@cisco.com> <55A46641.6030204@cs.tcd.ie>
To: Stephen Farrell <stephen.farrell@cs.tcd.ie>
Archived-At: <http://mailarchive.ietf.org/arch/msg/dhcwg/-ifcbNXRoLGUhaKwSQR_-fPnH-o>
Cc: dhc-chairs@ietf.org, draft-ietf-dhc-dhcpv6-active-leasequery@ietf.org, The IESG <iesg@ietf.org>, dhcwg@ietf.org, Kim Kinnear <kkinnear@cisco.com>
Subject: Re: [dhcwg] Stephen Farrell's Discuss on draft-ietf-dhc-dhcpv6-active-leasequery-03: (with DISCUSS and COMMENT)

Stephen,

Maybe we are getting closer to something we can both
live with, here...

On Jul 13, 2015, at 9:30 PM, Stephen Farrell <stephen.farrell@cs.tcd.ie> wrote:

> 
> Hiya,
> 
> On 14/07/15 01:57, Kim Kinnear wrote:
>> 
>> Stephen,
>> 
>> In an attempt to narrow down the discussion, I've selected a few sections
>> to respond to here, but I'm actually trying to respond to your entire 
>> concern w.r.t. privacy.
>> 
>> On Jul 11, 2015, at 9:11 AM, Stephen Farrell <stephen.farrell@cs.tcd.ie> wrote:
>> 
>>> Sure, I don't doubt there's utility here. But the actual utility
>>> is still vague for me. I also don't doubt that it's not vague for
>>> you or those deploying but it's not really possible (for me) to try
>>> analyse the privacy properties without some more detailed knowledge
>>> of the purpose(s) to which this data will be put.
>> 
>> I do not believe that we are going to be able to give you the information
>> as to which data elements are "necessary" and how they will be used.
>> We simply don't know, and in many cases our customers won't (or even
>> can't) tell us.  I think for the purposes of analysis here, that we
>> should assume that people using Active Leasequery are going to use all
>> of the data that the DHCP server has available to it about each client
>> and that it can send out in an Active Leasequery response.  I don't
>> doubt that vendor independent extensions will be used as well to
>> enhance the information with specific items in different
>> implementations.
>> 
>> Essentially, our customers are building a parallel database which
>> contains essentially everything they can get from an Active Leasequery
>> response (or a Bulk Leasequery response).  Trying to limit the data
>> that they can acquire from an Active Leasequery by limiting it in this
>> document will only, in my view, take us back to where they will find
>> other approaches to acquire the missing information.  Which is exactly
>> the reason that we developed an Active Leasequery-like capability in
>> the first place -- to remove the need to develop one-off solutions for
>> each customer.
> 
> Yes, there're deep conflicting principles here - we have the general,
> long-standing and well understood computer science goal that we all
> understand of enabling re-use as much as possible vs. the much more
> recent and not at all well understood privacy goal of limiting
> (ab)uses. (One elucidation of which is that re-use for purposes not
> initially declared should be forbidden.) I doubt we'll solve that
> general conundrum in this case, and we shouldn't try do that by
> blocking a document like this.
> 
> However, while we won't solve the general problem here, I think it's
> fair to push back a bit and ask what are the main things that users
> of the shadow DB are after. It's a little hard to believe that it's
> really all down to fixing up routes, but for all I know maybe it is.

It is certainly well beyond fixing up routes.  That is one use of Bulk
Leasequery, but Active Leasequery is designed to give administrative
personnel and their tools access to the IP <-> DUID mapping.
Additionally, the information sent in about the type of device (often
found in the vendor-class option) and any
vendor-specific-information-option is used, as is the
relay-agent-remote-id in the Relay-Forward message.  The interface-id
is another option that would be used, and the client-fqdn option is
another frequently used item.

While certainly not the only use to which this data is being put, one
example use is to allow a service technician who is at someone's house
to access a tool back in the back office that will tell the technician
whether or not the device that they have in their hands has acquired
an IP address and how far (if at all) it has progressed through
the provisioning process unique to this service provider.  Certainly
the DHCPv6 Active Leasequery information isn't the only information
that is used when troubleshooting a problem like this, but some of
that information would be useful to the diagnostic tool that the
service technician would access.

There are many other back-end OSS kinds of things that people can do
with this data, including billing for service where connectivity is
defined as "service".

> I do know that it's really hard to do a realistic privacy analysis
> without knowing anything about that - in such cases one generally
> tends to conservatively assume the worst, as you can imagine. And
> maybe that worst is not the reality, in which case we'd be wasting
> our time and maybe producing a worse outcome by making such bad
> and pessimistic assumptions.

I clearly don't understand what a "realistic privacy analysis"
involves, since I think that what authorized people can do with the
information that they can get by Active Leasequery or regular RFC 5007
Leasequery or RFC 5460 Bulk Leasequery is all pretty reasonable stuff.
How this data is used by the legitimate users seems to me to be a
non-issue, since anyone with access to the DHCP server can certainly
get it to export this information with little effort. Ensuring that it
is only used by legitimate users seems the goal.

As I said, we are more than willing to say that you SHOULD allow
configuration of which options you will allow to be sent through an
Active Leasequery connection.  We could even make that a MUST if it
would ease some of your concerns.
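
To make the "configure which options may be sent" idea concrete, here
is a minimal sketch in Python.  Everything in it is an assumption for
illustration: the function name, the shape of the binding data, and
the allow-list are hypothetical and are not taken from the draft or
from any particular server.  Only the numeric option codes are the
standard DHCPv6 assignments.

```python
# Hypothetical sketch: filter the options reported for one binding
# against an administrator-configured allow-list. Names and data shapes
# are illustrative assumptions, not anything defined by the draft.

# Standard DHCPv6 option codes (IANA assignments).
OPTION_CLIENTID = 1
OPTION_VENDOR_CLASS = 16
OPTION_VENDOR_OPTS = 17
OPTION_INTERFACE_ID = 18
OPTION_REMOTE_ID = 37
OPTION_CLIENT_FQDN = 39


def filter_options(binding_options, allowed_codes):
    """Return only the options whose codes the administrator allowed.

    binding_options: dict mapping option code -> raw option payload.
    allowed_codes:   set of option codes permitted on this connection.
    """
    return {
        code: payload
        for code, payload in binding_options.items()
        if code in allowed_codes
    }


# Example: the server knows three options for a client, but the
# administrator has only allowed the client identifier to be reported.
known = {
    OPTION_CLIENTID: b"\x00\x01\x00\x01",
    OPTION_VENDOR_CLASS: b"example-device",
    OPTION_CLIENT_FQDN: b"host.example.com",
}
reported = filter_options(known, {OPTION_CLIENTID})
```

With an empty allow-list nothing beyond what the administrator has
explicitly chosen would be reported, which is the conservative
starting point.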


>> 
>> That said, it would certainly serve the goal of privacy if there were
>> no standardized way to acquire this information.  We have no
>> particular incentive to standardize this Active Leasequery other than
>> it seemed like a "good thing to do".  Clearly from a privacy standpoint
>> it isn't.  Seems like from a user's standpoint, though, there is some
>> value in having this standardized.  Thus, we have added TLS support at
>> the request of the former AD.
>> 
>>>> 
>>>> 	Ultimately, we expect that the people in charge of the data on
>>>> 	the DHCP server are the same people in charge of the data that
>>>> 	a processing element can discover from the DHCP server through
>>>> 	the use of the active leasequery protocol.  These same people
>>>> 	have control of the data in both elements (as well as the
>>>> 	network on which the raw DHCP data flows).  Of course all of
>>>> 	this information can be garnered from a tcpdump of the wire on
>>>> 	which the DHCP server sits (and we have customers doing that
>>>> 	too, to meet their needs for a near-real time database of
>>>> 	client<->IP address binding information).
>>> 
>>> Sure a local tcpdump is what it is, but if someone deploys this
>>> without implementing and turning on TLS and screws up their f/w config
>>> then the entire world could find this data. And with zmap that's no
>>> longer unrealistic if the DHCP server has an IPv4 address. And we do
>>> know that people do screw up f/w configs. So the thing to analyse I
>>> think is the balance between the probability and downsides of
>>> publishing this information to the world, vs. the costs of making
>>> the spec more restrictive/protective. I'm not sure we've got that
>>> balance quite right, but then as I said I don't yet understand the
>>> beneficial uses of this.
>> 
>> You have put this well -- what are the probability and downsides of
>> making the information the DHCP server knows easily available vs. the
>> cost of making this spec more restrictive.
>> 
>> I want to emphasize as well that we expect that the people who control
>> the information on the DHCP server are the same people who will
>> control the configuration of the DHCP server (regarding Active
>> Leasequery) as well as the database constructed by the requestor of
>> the Active and Bulk Leasequery information.
>> 
>> Yes, someone could mistakenly leave their firewall unconfigured
>> regarding DHCP and thus let anyone see what the DHCP server knows.
>> 
>> Of course this can already be done with DHCPv6 Bulk Leasequery (RFC
>> 5460) as well as DHCPv6 Leasequery (RFC 5007).  I recognize, however,
>> that just because some existing protocols don't protect some data is
>> no justification for allowing a new protocol to not protect some data.
>> So I'm not using that as a justification for anything, yet I did want
>> to point it out.
>> 
>> I believe that anyone who allows TCP connections to port 547 on their
>> DHCP server is at risk of losing data to Bulk Leasequery already, and
>> so one would normally prevent TCP connections to port 547.  And that
>> would also cover Active Leasequery.
>> 
>> But mistakes certainly do happen when configuring networks, and I deal
>> with supporting customers who make such mistakes frequently.  So
>> perhaps simply telling people to "prevent access by unauthorized
>> requestors" isn't enough (which is essentially what we have done 
>> so far).
>> 
>> It would be straightforward to alter the draft to say that you MUST
>> NOT make Active Leasequery access available by default, and that it
>> MUST require explicit configuration in order to allow the DHCP server
>> to respond to any Active Leasequery request.  Then someone would have
>> to take an action to create a privacy problem.
>> 
>> Further we could say that it MUST take explicit configuration to allow
>> Active Leasequery without TLS even if TLS wasn't implemented.  So you
>> would have two explicit hurdles to overcome to do something dumb
>> regarding privacy.  This would largely prevent people who knew nothing
>> about Active Leasequery from having it enabled on their servers,
>> as well as encouraging others to use it in a protected way.
> 
> Your suggestions from the above two paragraphs and your earlier idea
> of being able to configure which information will be reported seem
> to me like they're definitely on the right track. If the WG think
> those are ok and if folks would implement like that then I think we
> should probably be able to close out on all this fairly quickly.

That would be great, and I think the WG would have no problem with
making that a SHOULD or a MUST.  I can't control what people
implement, but I think that sort of requirement typically isn't a
difficult thing to implement in a server's configuration, so there
would probably be good compliance with that recommendation (SHOULD) or
requirement (MUST).

> 
> That said, I'll admit that while using TLS is fairly simple, having
> it on by default does have some admin overhead, so I'd like to be
> confident that that overhead is something implementers would likely
> find worth putting up with before we consider that to be the right
> answer here. To be clear - the overhead is nothing to do with CPU
> consumption of run-time TLS, it's all down to ensuring the keys and
> certificates required are provisioned in a way that users find
> acceptable. That can be done, but requires some more developer effort
> compared to just opening a cleartext socket.

I see no problem in making TLS the default whether or not it is implemented,
and requiring explicit action to *not* use TLS even if it is not available
in a particular implementation.  While a bit unusual, nobody is going to
have a problem with that -- especially if the alternative is to not have
this standardized at all.  
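
As a rough sketch of the two hurdles discussed above (Active
Leasequery off unless explicitly enabled, and cleartext refused unless
explicitly allowed), the server-side decision might look like the
following Python.  The class, field, and function names are
hypothetical and chosen only for illustration; they do not come from
the draft or from any implementation.

```python
# Hypothetical sketch of the "two explicit hurdles" policy. All names
# are illustrative assumptions, not taken from the draft.
from dataclasses import dataclass


@dataclass(frozen=True)
class ActiveLeasequeryConfig:
    # Hurdle 1: the service is off unless an administrator enables it.
    enabled: bool = False
    # Hurdle 2: cleartext connections are refused unless explicitly allowed.
    allow_without_tls: bool = False


def may_serve(config, connection_uses_tls):
    """Decide whether to answer an Active Leasequery request."""
    if not config.enabled:
        return False  # never configured: do not respond at all
    if connection_uses_tls:
        return True  # enabled and protected by TLS
    return config.allow_without_tls  # cleartext only by explicit choice


default_config = ActiveLeasequeryConfig()
tls_only = ActiveLeasequeryConfig(enabled=True)
cleartext_ok = ActiveLeasequeryConfig(enabled=True, allow_without_tls=True)
```

With the defaults, a server that has never been configured refuses
every request, and an implementation without TLS serves nothing until
someone sets both options on purpose.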

> 
>> 
>>> I'm asking wouldn't it be better if the spec said that "Mutually
>>> authenticated TLS MUST be supported and SHOULD be used and SHOULD
>>> be the default."
>>> 
>>> Right now, it says TLS SHOULD be implemented, which is consistent
>>> with overall IETF BCPs. I'm asking if, for this protocol, with
>>> the potential privacy exposure, we ought go further than those (now
>>> somewhat old) BCPs and say to just turn on TLS, since it ought not
>>> be that hard and really ought be the default.
>> 
>> Our answer is no, we don't believe that we should go further and
>> require TLS, in that we don't see this protocol as creating a privacy
>> exposure of sufficiently unusual magnitude to warrant taking that
>> extra step.  Possibly because we don't view the information available
>> as unusually ... useful to someone intent on violating privacy
>> concerns.  Useful, certainly, but not extraordinarily useful, and so
>> not requiring extraordinary measures to protect.
> 
> Well I'd argue that mandating use of TLS is getting more and more
> common and will soon be far from extraordinary for anything with
> potentially bad privacy consequences. Think the US OPM or Target
> or Sony or even hacking-team - if systems defaulted to protecting
> data at rest and in transit as much as possible, then it could well
> be the case that fewer such incidents will happen.

Whether or not TLS becomes mandatory soon is something that will be
interesting to observe.  I don't think that the data the DHCP server
makes available to legitimate requestors with Active Leasequery has
particularly bad privacy consequences.  I think the DHCP data is at
the other end of that spectrum.  The Target or Sony
data and the DHCP data we are discussing here are so far apart from a
privacy standpoint that I'm astonished that we are even having to
discuss comparing them here.

I recognize that you want to move the world forward to be more secure
from a privacy standpoint.  We have agreed to change the document to
make accidental exposure of the DHCP data much less likely and I'm
glad we have done so.  

In your quest to move the world forward to a more secure future, I
would respectfully suggest that there must be protocols whose data,
were it exposed, would be more egregious than the relatively benign
data we are discussing here.  Perhaps you might move forward to make
TLS implementation and use mandatory in one of these other protocols
where the result would have more obviously concrete benefits?

Thanks -- Kim