Re: comments on draft-ietf-ipv6-privacy-addrs-v2-00.txt

Sorry for the delay in responding to your quick response.. long email ;-)

On Wed, 13 Oct 2004, Suresh Krishnan wrote:
> On Tue, 12 Oct 2004, Pekka Savola wrote:
> >FWIW, I think we should make the specification agnostic of the hash
> >algorithm.  Either MD5 and SHA1 or whatever is just fine.  There is no
> >interoperability problem because I don't see a need to be able to
> >reverse the hashes, and it's just an implementation's internal matter.
> >
> 
> Please see my reply to Brian earlier on the list and let me know if this
> addresses your concern.

OK.

> >substantial
> >-----------
> >
> >1) 
> >
> >The document goes at great depth to discuss the scenarios and justification
> >for privacy addresses, but I don't think the document produces a concise
> >problem statement (like one paragraph of less than 5 sentences) or
> >description about the scenarios or the threat model.  The factoids are
> >dribbled through section 2 but it might make sense to try to collect the
> >important bits together and try to coin up a nice and neat description of
> >the problem that's being fixed.
> 
> How about the following text for the problem statement?
> 
> "Addresses generated using Stateless address autoconfiguration [ADDRCONF]
> contain an embedded 64-bit interface identifier which remains constant
> over time. Anytime a fixed identifier is used in multiple contexts, it
> becomes possible to correlate seemingly unrelated activity using this
> identifier. Since the identifier is embedded within the IPv6 address,
> which is a fundamental requirement of communication, it cannot be easily
> hidden. This document proposes a solution to this issue by generating
> interface identifiers which vary over time."

This describes the approach against correlation rather well, but what
I was more concerned about was what I wrote below -- i.e., _where_
and _by whom_ you expect that correlation to happen.

> >For example, one should take note of the following:
> > - what is the observation point? (e.g., the first paragraph of 2.1 uses
> >IMHO a bad example of sniffing the traffic, which might be better removed
> >or reworded)
> > - what is the assumption about the (in)stability of the prefix to the
> >effects of privacy addresses?
> 
> This is covered in Section 3.6 "Deployment Considerations"

See above.  At least some of this should probably be prominent in the 
justification of this mechanism.

> > - how does this affect stationary nodes?  nomadic nodes?
> 
> I will cover the effect on nomadic nodes in the section I am going to
> write about mobile IP.

OK.

> >Also, security considerations talks about ingress filtering, but
> >doesn't really talk about the main meat of
> >draft-dupont-ipv6-rfc3041harmful-XX.txt, that is, why privacy
> >addresses give privacy only with certain assumptions about the
> >dynamicity (or not) of the used prefixes.  This should go in as a
> >paragraph of its own in security considerations after the above is
> >clear.
> 
> I had already added text in the draft, which I thought would address
> this issue. It is the last paragraph of Section 3.6 "Deployment
> Considerations"
> 
> "If a very small number of nodes(say only one) use a given prefix for
>  extended  periods of time, just changing the interface identifier
>  part of the prefix  may not be sufficient to ensure privacy, since
>  the prefix acts as a constant identifier.  The procedures described
>  in this document are most effective when the prefix is reasonably non
>  static or is used by a fairly large number of nodes"
> 
> Do you think this is not sufficient?

Yes -- that seems like an afterthought which is not applied
sufficiently to the rest of the document -- the discussion & problem
statement in the introduction and the first paragraphs and the
security considerations in particular.

The point is that if you use stable prefixes, RFC3041 addresses don't
help you at all to ensure privacy for stationary nodes in that
network.  That needs to be said loud and clear in the critical places
in the document.
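
To illustrate the point, here is a minimal sketch of what an observer
could do with nothing more than a log of source addresses.  The
addresses, the /64 prefix length, and the function name are all
assumptions made up for the example; nothing here is from the draft.

```python
import ipaddress
from collections import defaultdict

def group_by_prefix(observed, prefix_len=64):
    """Group observed source addresses by their routing prefix."""
    groups = defaultdict(list)
    for addr in observed:
        # strict=False masks off the interface identifier bits.
        net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        groups[str(net)].append(addr)
    return dict(groups)

# Hypothetical log: one stationary node rotating its temporary address
# inside a stable prefix, plus one unrelated node elsewhere.
observed = [
    "2001:db8:1:1:3c2a:11ff:fe00:1",
    "2001:db8:1:1:91d4:7b02:55aa:9e10",
    "2001:db8:1:1:4f0e:c3d1:a77:12bc",
    "2001:db8:2:9:aaaa:bbbb:cccc:dddd",
]
groups = group_by_prefix(observed)
for prefix, addrs in groups.items():
    print(prefix, len(addrs))
```

The three rotated addresses land in the same group, so if only one
node (or one household) sits behind that prefix, rotating the
interface identifier has not hidden anything from this observer.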

> >2) the document also goes on at least in 3 different places to list
> >issues with reverse lookups and such.  Is this really necessary?  I'm
> >not objecting strongly, but there have been arguments that rather than
> >fiddling with protocols, one should consider fixing the broken
> >applications.  This is also discussed in draft-ietf-dnsop-ipv6-issues
> >(as are the issues with DDNS), so it might be worth referring to that
> >informatively from here.
> 
> The dnsop ipv6 issues draft just refers back to RFC3041 regarding
> reverse lookups. Would you still like me to reference it?

The part of the dnsop ipv6 issues draft I referred to was the discussion
about the techniques for using DDNS, and the discussion of the
applicability of reverse DNS in the first place.

Not sure if it needs to be referred to here, but if the draft talks a 
bit about these subjects, referring to another document which tries to 
do a more extensive discussion might not hurt.

> >3) DNA is referred to in section 3.5, unfortunately I think (at least with
> >the current wording) this would need to be a normative reference because
> >it's required for implementation of this spec.  And normative refs must be
> >at least the same maturity level of this spec, and this may pose a problem. 
> >A few possibilities which might solve this problem:
> > - describe DNA just as one alternative to note whether the link change has
> >occurred, and possibly detail some others as well
> > - just wait for DNA spec, and keep this as proposed standard
> > - remove (some of) the specification about link changes, e.g., just saying
> >that for wired interfaces plug in/out and wireless interfaces the interface
> >restart could be the events -- no need for DNA then.
> 
> I like option 1. I will reword this and change the DNA reference to
> informational.

OK.  I guess some techniques other than DNA also need to be noted
(maybe less optimal than DNA)?

> >semi-substantial
> >----------------
> >
> > ==> the draft talks a lot about 'global scope addresses',
> >but I don't think that is really accurate.  The privacy address practices could
> >be applied to site-local or ULA addresses as well, it's just a matter of
> >local policy.  RFC2462bis might provide some ideas how to express
> >'non-link-local' in a better way.
> 
> Site local has been deprecated and ULAs are within the scope of the
> document. I have the following text in the draft at the end of Section 1
> 
> "Introduction"
> 
> " The term "global scope addresses" is used in this
>   document  to collectively refer to "Global unicast addresses" as
>   defined in [ADDRARCH] and "Unique local addresses" as defined in
>   [ULA]"
> 
> Do you have any issues with this definition?

Yes -- this brings up the issue of why one would want to use privacy
addresses with unique local addresses.  This gets back to the
question of correlating by whom and where.  In the domains where you
use ULA addresses, I'd expect you wouldn't be concerned about
correlation..

> >On the other hand, I see benefit in restricting the scope of privacy
> >addresses to just global scope addresses, but then you have to define
> >those somehow and that may be tricky.. because ULA addresses, by some
> >terminology, are also global scope..
> >
> >   Not all nodes and interfaces contain IEEE identifiers.  In such
> >   cases, an interface identifier is generated through some other means
> >   (e.g., at random), and the resultant interface identifier is not
> >   globally unique and may also change over time.
> >
> >==> 'globally unique interface identifier' is a rather absolute
> >statement and may not actually be accurate because it has happened
> >that identifiers have been duplicated.  Maybe soften the tone a bit.
> >(also see the robert elz appeal on addrarch and unique/local bits.)
> 
> OK. The problem in fixing this is that there is no explicit claim that
> IIDs formed from EUIs are globally unique. Would it be OK if I changed
> " resultant interface identifier is not globally unique"
> to
> " resultant interface identifier may not be globally unique"

That seems to be sufficiently softening the wording and it's OK by me.

> >  A more troubling case concerns mobile devices (e.g., laptops, PDAs,
> >   etc.) that move topologically within the Internet.  Whenever they
> >   move (in the absence of technology such as mobile IP [MOBILEIP]),
> >   they form new addresses for their current topological point of
> >   attachment.
> >
> >==> the document mentions Mobile IP, but this doesn't actually matter much,
> >depending on the actual problem statement.  Remember, unless the mobile node
> >would *only* do bidirectional tunneling back to the home agent, every node
> >the MN talks to will know the care-of address in any case, which seems equal
> >in privacy considerations to just laptop moving without mobile IP.
> >
> >Mobile IP is actually relatively incompatible with privacy addresses 
> >as the home address probably acts as a stable identifier, nullifying 
> >the effect of using privacy addresses for care-of address if you do 
> >route optimization.  This could maybe be discussed in security 
> >considerations.  A workaround is having multiple home addresses 
> >(rfc3041 ones), but I don't recall if that's supported or not, and 
> >even then, you reveal the stable prefix as the identifier.
> 
> I will add a paragraph in "Deployment Considerations" about Mobile IP.

OK.

> >   One way to avoid some of the problems discussed above is to use DHCP
> >   for obtaining addresses.  With DHCP, the DHCP server could arrange to
> >   hand out addresses that change over time.
> >
> >==> 'change over time' seems relative.  The key point here is that DHCP
> >should not provide the addresses with (similar) stable identifiers.  Looking
> >at current or planned DHCPv6 deployments, at least some are using (AFAIR)
> >DHCPv6 to give the hosts the same addresses they'd get with stateless
> >address autoconfiguration, including EUI64.  Obviously, such would likely
> >not change sufficiently often, and would include the stable identifier.
> >
> >How I see it, the document seems slightly too forthcoming with
> >proposing DHCPv6 as a solution for privacy here, but that only works
> >if DHCPv6 gives temporary addresses without the stable identifiers,
> >and rotates even those non-identifying addresses in a regular basis
> >("strict pool").  But that seems unnecessary complexity, because there
> >is no need (no address shortage) to do that with IPv6.  So, I don't
> >see why anyone would use DHCPv6 to avoid this problem rather than
> >privacy addresses.
> 
> Agreed. DHCP is proposed as a solution and is qualified by the following
> sentence
> 
> "With DHCP, the DHCP server could arrange to hand out addresses that
>  change over time"
> 
> and the issue you are mentioning is discussed earlier in section 2.2
> 
> "In theory, the address a client gets via DHCP can change over time, but
>  in practice servers often return the same address to the same client
>  (unless addresses are in such short supply that they are reused immediately 
>  by a different node when they become free). Thus, even within sites using
>  DHCP, clients frequently end up using the same address for weeks to 
>  months at a time."
> 
> Would you like me to add stronger wording or to explicitly discourage
> the use of DHCPv6 for privacy purposes?

The discussion in section 2.2 is about IPv4, where there arguably
can be some address shortage.  The quote you give is not
sufficiently strong for DHCPv6, because the node would basically NEVER
have to change the address.  So DHCPv6 does not practically address
this concern, unless the address rotation is manually configured, and
nobody will bother to do that.

Therefore I think the claim that DHCP(v6) can solve this problem needs
to be restated along the lines of: "DHCPv6 could only solve the
problem if the addresses it gave didn't have stable identifiers, and
those addresses were manually rotated periodically so that each node
would get new addresses suitably frequently.  Because this does not
happen automatically, and manual renumbering operations can be
considered extremely burdensome, DHCPv6 is not a solution for address
privacy; DHCPv6 can only be used as an alternative for handing out
temporary addresses [but explain why this doesn't necessarily make
much difference compared to just doing RFC3041]".

> >   4.  By default, generate a set of addresses from the same
> >       (randomized) interface identifier, one address for each prefix
> >       for which a global address has been generated via stateless
> >       address autoconfiguration.  Using the same interface identifier
> >       to generate a set of temporary addresses reduces the number of IP
> >       multicast groups a host must join.  Nodes join the solicited-node
> >       multicast address for each unicast address they support, and
> >       solicited-node addresses are dependent only on the low-order bits
> >       of the corresponding address.  This default behaviour was made to
> >       address the concern that a node that joins a large number of
> >       multicast groups may be required to put its interface into
> >       promiscuous mode, resulting in possible reduced performance.
> >
> >==> what you seem to be saying, in a rather complex way in these
> >steps, is that a random address is generated for each received prefix
> >[regardless of interfaces].  Wouldn't there be a simpler way to specify
> >that?
> 
> Not really. The identifier is generated per interface. So a random address
> is generated per prefix per interface. The idea of the paragraph is to
> explain the rationale behind using the same random identifier with all
> the prefixes received on an interface. I agree there might be a simpler 
> way of stating this. I will try to come up with something.

OK.
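
FWIW, the multicast rationale in step 4 is easy to check.  A minimal
sketch, with a made-up identifier and made-up prefixes (assumptions
for the example only): the solicited-node group is ff02::1:ff00:0/104
plus the low-order 24 bits of the unicast address, so one identifier
reused across prefixes maps to a single group.

```python
import ipaddress

SOLICITED_NODE_BASE = int(ipaddress.IPv6Address("ff02::1:ff00:0"))

def solicited_node(addr):
    """Solicited-node multicast address for a unicast IPv6Address."""
    low24 = int(addr) & 0xFFFFFF
    return ipaddress.IPv6Address(SOLICITED_NODE_BASE | low24)

def make_addr(prefix, iid):
    """Combine a /64 prefix (as text) with a 64-bit interface identifier."""
    return ipaddress.IPv6Address(int(ipaddress.IPv6Address(prefix)) | iid)

# Hypothetical randomized identifier and prefixes on one interface.
iid = 0x91D47B0255AA9E10
prefixes = ["2001:db8:1:1::", "2001:db8:2:9::", "fd00:1234:5678:1::"]

addrs = [make_addr(p, iid) for p in prefixes]
groups = {solicited_node(a) for a in addrs}
print(groups)
```

Three addresses, one group to join.  With a different identifier per
prefix the node would in general join one group per address.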

> >      A node highly concerned about privacy MAY use different interface
> >       identifiers on different prefixes, resulting in a set of global
> >       addresses that cannot be easily tied to each other.  This may be
> >       useful, for example, to a mobile node using multiple wireless
> >       interfaces to connect to multiple independent networks.
> >
> >==> I think the example is flawed, or I don't understand this concern.  
> >Multiple wireless interfaces have their own MAC addresses, and
> >multiple independent networks have their own prefixes.  If you connect
> >using interface 1 to network 1 and interface 2 to network 2, there is
> >no way to connect the privacy addresses derived from them.  Rather, if
> >you use interface 1 to connect to networks 1,2,..,N, you're
> >identifiable, but this is then again the regular roaming scenario
> >and requires no special considerations...
> 
> This paragraph explicitly allows the following scenario.
> 
> Let's say interface IF1 is connected to networks N1,N2 and N3. The node
> MAY generate DIFFERENT identifiers I1,I2 and I3 for each of these prefixes
> to make addresses N1|I1, N2|I2, and N3|I3 if it so desires, instead of
> being limited to N1|I1,N2|I1, and N3|I1.

Then you should be saying 'a single wireless interface' instead of 
'multiple wireless interfaces', and this would be OK.
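
The two options in that scenario can be written out as a small
sketch.  The prefixes and identifier values below are placeholders
chosen for the example, not anything from the draft.

```python
import ipaddress

def make_addr(prefix, iid):
    """Combine a /64 prefix (as text) with a 64-bit interface identifier."""
    return ipaddress.IPv6Address(int(ipaddress.IPv6Address(prefix)) | iid)

def iid_of(addr):
    """Low-order 64 bits of an address."""
    return int(addr) & (2**64 - 1)

# Prefixes N1, N2, N3 all received on the single interface IF1.
n1, n2, n3 = "2001:db8:1::", "2001:db8:2::", "2001:db8:3::"
# Placeholder identifiers I1, I2, I3 (a real node would randomize these).
i1, i2, i3 = 0x1111111111111111, 0x2222222222222222, 0x3333333333333333

# Default behaviour: N1|I1, N2|I1, N3|I1
shared = [make_addr(n, i1) for n in (n1, n2, n3)]
# Permitted alternative: N1|I1, N2|I2, N3|I3
distinct = [make_addr(n, i) for n, i in ((n1, i1), (n2, i2), (n3, i3))]

print(len({iid_of(a) for a in shared}), len({iid_of(a) for a in distinct}))
```

In the default case the low 64 bits tie the three addresses together;
in the alternative case only the single interface (not visible on the
wire) connects them.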

> >       B.
> >           +  If the received Valid Lifetime is greater than 2 hours
> >              update the lifetime of the temporary address to the
> >              received lifetime.
> >           +  If the RemainingLifetime of the temporary address is less
> >              than or equal to 2 hours ignore the received option.
> >           +  Otherwise set the valid lifetime to 2 hours.
> >       C.  These steps are necessary to prevent a denial of service
> >           attack where a bogus advertisement contains prefixes with
> >           very small Valid Lifetimes
> >
> >==> isn't this fundamentally duplication of specification from rfc2462
> >checks ?  Couldn't you just specify that the lifetimes are checked as
> >specified in section X.X.X.y of rfc2462 ?
> 
> No. This check does not exist in RFC2462. It was added later in
> RFC2462bis. Since 2462bis is still a draft I included it here. I can
> remove it and add a reference to 2462bis if needed. But this needs to be
> a normative reference.

2462bis is slated for (recycling at) DS, and is also relatively far in
the process, so I don't think it would be a problem normatively
referring to what that doc specifies.

That is, folks who would implement RFC3041bis but not 2462bis could
just look at 2462bis and take that specification from there.
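
For reference, the three bullets under B boil down to something like
the sketch below.  It follows only the text as quoted in this mail,
in the order given; the function name and the use of seconds are my
assumptions, and the 2462bis wording may differ in detail.

```python
TWO_HOURS = 2 * 60 * 60  # seconds

def update_valid_lifetime(received, remaining):
    """Return the new valid lifetime (seconds) for a temporary address.

    received  -- Valid Lifetime carried in the advertised prefix option
    remaining -- RemainingLifetime of the existing temporary address
    """
    if received > TWO_HOURS:
        # Long advertised lifetimes are accepted as they are.
        return received
    if remaining <= TWO_HOURS:
        # Ignore the option: the address keeps what it already had.
        return remaining
    # A short advertised lifetime cannot cut the address below 2 hours.
    return TWO_HOURS

print(update_valid_lifetime(received=60, remaining=86400))
```

The last case is the denial-of-service protection from C: a bogus
advertisement with a tiny Valid Lifetime can shorten the address to
two hours, but not expire it immediately.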

> >   6.  The node MUST Perform duplicate address detection (DAD) on the
> >       generated temporary address.  If DAD indicates the address is
> >       already in use, the node MUST generate a new randomized interface
> >       identifier as described in Section 3.2 above, and repeat the
> >       previous steps as appropriate up to 5 times.  If after 5
> >       consecutive attempts no non-unique address was generated,
> >
> >==> 5 times seems like really a stretch.  Couldn't we lower that down
> >to, say, 3 times?  That would still be sufficiently many repetitions,
> >but lower the unnecessary delay and attempts significantly in case
> >there are problems.
> 
> I will make this a configuration variable and default it to 3.

OK.
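
Roughly what I'd expect an implementation to look like, as a sketch
only.  The `is_duplicate` callback stands in for the actual DAD
procedure and `max_retries` for the proposed configuration variable;
both names are hypothetical.  Plain random bits are used instead of
the hash-based generation in Section 3.2 to keep it short.

```python
import secrets

UL_BIT = 0x02 << 56  # universal/local bit of the 64-bit identifier

def generate_temporary_iid(is_duplicate, max_retries=3):
    """Try up to max_retries fresh identifiers.

    Returns the first identifier that passes DAD, or None if every
    attempt collided (the caller should then log an error and give up).
    """
    for _ in range(max_retries):
        iid = secrets.randbits(64) & ~UL_BIT  # clear the u/l bit
        if not is_duplicate(iid):
            return iid
    return None

# Usage with stand-in DAD results:
print(generate_temporary_iid(lambda iid: False) is not None)  # DAD passes
print(generate_temporary_iid(lambda iid: True))               # DAD always fails
```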

> >  When a temporary address becomes deprecated, a new one MUST be
> >   generated.  This is done by repeating the actions described in
> >   Section 3.3, starting at step 3).  Note that, except for the
> >   transient period when a temporary address is being regenerated, in
> >   normal operation at most one temporary address corresponding to a
> >   public address should be in a non-deprecated state at any given time.
> >
> >==> I'm flinching back at the 'corresponding' language.  The
> >implementation shouldn't need to keep track of public -> private
> >address mappings, and if it does, this should be made more explicit.  
> >Instead, it may need to track prefixes, and possibly interfaces (or
> >interface-identifiers).
> 
> I will change the language to make the node track the prefixes instead
> of corresponding public addresses.

OK.

> >  If a very small number of nodes(say only one) use a given prefix for
> >   extended   periods of time, just changing the interface identifier
> >   part of the prefix  may not be sufficient to ensure privacy, since
> >   the prefix acts as a constant identifier.
> >
> >==> i'd rather say s/may not be/is not/, because there's nothing conditional
> >(IMHO) about it.
> 
> There is. The data collector NEEDS to know that there is only one node.
> Otherwise she/he may not be able to correlate the collected data with
> two different addresses.

This depends on whether the data collector is interested in tracking a
particular user inside the prefix, or just the prefix and whoever is
using it.

For example, the FBI might want to correlate traffic that originates
from a particular house.  They might not care that much whether it's
the father or the son whose laptop is doing the talking -- it probably
wouldn't even matter much, because the people inside the prefix might
even share their computers.

What the data collector NEEDS to know (or make an educated guess
about) AFAICS is whether the specific set of prefixes have a small or
large number of users behind them.  But that kind of information can
usually be gleaned from publicly available information if they
really want.

> >10  References
> >
> >==> needs to be broken down to informative and normative.  It's not clear
> >whether this is intended for PS or DS, but if DS, normative refs must only
> >include those specs which are DS or BCP; if PS, PS is also OK.
> 
> OK. I have made this change for the new version of the draft based
> on Brian's comments.

OK.

> >   A more interesting case concerns always-on connections (e.g., cable
> >   modems, ISDN, DSL, etc.) that result in a home site using the same
> >   address for extended periods of time.  This is a scenario that is
> >   just starting to become common in IPv4 and promises to become more of
> >   a concern as always-on internet connectivity becomes widely
> >   available.  Although it might appear that changing an address
> >   regularly in such environments would be desirable to lessen privacy
> >   concerns, it should be noted that the network prefix portion of an
> >   address also serves as a constant identifier.  All nodes at (say) a
> >   home, would have the same network prefix, which identifies the
> >   topological location of those nodes.  This has implications for
> >   privacy, though not at the same granularity as the concern that this
> >   document addresses.  Specifically, all nodes within a home would be
> >   grouped together for the purposes of collecting information.  This
> >   issue is difficult to address, because the routing prefix part of an
> >   address contains topology information and cannot contain arbitrary
> >   values.
> >
> >==> while the topic was 'Address Usage in IPv4 today', the text jumps in the
> >middle to describe v6 considerations (about the prefix portion).  Maybe this
> >needs to go somewhere else, or at least broken down more explicitly (e.g.,
> >to a different paragraph).
> 
> Not necessarily. It is still talking about IPv4 prefixes and IPv4
> addresses. 

I fail to see that.. homes aren't allocated a /24, /25, /26 or
whatever v4 prefixes, so saying that "the network prefix portion of an
address also serves as a constant identifier" seems wrong --
especially if the observer cannot know or reasonably guess where the
prefix part ends and the address part begins.

> >   An implementation might want to keep track of which addresses are
> >   being used by upper layers so as to be able to remove a deprecated
> >   temporary address from internal data structures once no upper layer
> >   protocols are using it (but not before). [...]
> >
> >==> this issue seems to be discussed in nearly the same fashion in two
> >different places in the draft.  Could this just be summarized in one, and
> >described in full in the other, or something?
> 
> One of the places talks about WHEN an application is allowed to remove
> deprecated addresses. The other one talks about Future Work to be done.
> I don't see any elegant way of combining these two.

Couldn't one just say something like this in section 3.4:

   As an optional optimization, an implementation MAY remove a
   deprecated temporary address that is not in use by applications or
   upper-layers as outlined in section 6. 

.. and delete the rest ?

(Note: I'm not sure whether this has even been implemented -- if not,
it should be removed completely, at least from sect 3.4)

> >8.  Security Considerations
> >
> >   Ingress filtering is being deployed as a means of preventing the use
> >   of spoofed source addresses in Distributed Denial of Service(DDoS)
> >   attacks. 
> >
> >==> s/is being/has been/ ?
> 
> Subjective ;-). "Has been" makes it sound like everyone has it deployed
> already. I will leave it as it is unless you have strong feelings.

Sufficiently fine with me right now.  'is being' just sounds as if it
hasn't really been deployed yet, and deployment only started
recently..

The more complex way would be to say "has been and is being"..

-- 
Pekka Savola                 "You each name yourselves king, yet the
Netcore Oy                    kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings

--------------------------------------------------------------------
IETF IPv6 working group mailing list
ipv6@ietf.org
Administrative Requests: https://www1.ietf.org/mailman/listinfo/ipv6
--------------------------------------------------------------------