[saag] Fwd: AD evaluation of draft-gont-numeric-ids-sec-considerations-04

Benjamin Kaduk <kaduk@mit.edu> Tue, 21 July 2020 05:43 UTC

Date: Mon, 20 Jul 2020 22:43:37 -0700
From: Benjamin Kaduk <kaduk@mit.edu>
To: saag@ietf.org
Cc: draft-gont-numeric-ids-sec-considerations@ietf.org, Fernando Gont <fgont@si6networks.com>
Message-ID: <20200721054337.GF41010@kduck.mit.edu>
References: <20200717204604.GV41010@kduck.mit.edu> <c4a6c913-c7ee-2f95-51ce-47bbb13bd647@si6networks.com> <20200721033933.GB41010@kduck.mit.edu>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <20200721033933.GB41010@kduck.mit.edu>
User-Agent: Mutt/1.12.1 (2019-06-15)
Archived-At: <https://mailarchive.ietf.org/arch/msg/saag/FNT7MZ1pw9OHpO46gVgea42kFBs>
Subject: [saag] Fwd: AD evaluation of draft-gont-numeric-ids-sec-considerations-04
Precedence: list

And here are Fernando's responses and my follow-ups.
Sorry again for not getting this right the first time.

-Ben

On Mon, Jul 20, 2020 at 08:39:38PM -0700, Benjamin Kaduk wrote:
> Hi Fernando,
> 
> Oops, it looks like I failed to actually cc: SAAG on my initial comments
> like I had planned to.  [...]
> 
> On Fri, Jul 17, 2020 at 10:41:02PM -0300, Fernando Gont wrote:
> > Hello, Benjamin,
> > 
> > On 17/7/20 17:46, Benjamin Kaduk wrote:
> > [....]
> > > 
> > > Abstract
> > > 
> > >     For more than 30 years, a large number of implementations of the TCP/
> > >     IP protocol suite have been subject to a variety of attacks, with
> > >     effects ranging from Denial of Service (DoS) or data injection, to
> > >     information leakage that could be exploited for pervasive monitoring.
> > > 
> > > This much historical background is probably overkill for an Abstract.
> > 
> > I agree. We probably kept the Abstract from the time the whole project 
> > was a single document, so your suggestion makes perfect sense.
> > 
> > 
> > >     The root of these issues has been, in many cases, the poor selection
> > >     of transient numeric identifiers in such protocols, usually as a
> > >     result of insufficient or misleading specifications.  This document
> > >     formally updates RFC3552, such that RFCs are required to include a
> > >     security and privacy analysis of the transient numeric identifiers
> > >     they specify.
> > > 
> > > I might reformulate the whole thing as something more like:
> > > 
> > > % Poor selection of transient numerical identifiers in protocols such as
> > > % the TCP/IP suite has historically led to a number of attacks on
> > > % implementations, ranging from Denial of Service (DoS) to data
> > > % injection and information leakage that can be exploited by pervasive
> > > % monitoring.  To prevent such flaws in future protocols and
> > > % implementations, this document updates RFC 3552, requiring future RFCs
> > > % to contain analysis of the security and privacy properties of any
> > > % transient numeric identifiers specified by the protocol.
> > 
> > Unless there are objections, I'll apply your suggested change verbatim. 
> > Thanks!
> > 
> > 
> > 
> > > Section 1
> > > 
> > >     Interface Identifiers (IIDs).  These identifiers usually have
> > >     specific properties that must be satisfied such that they do not
> > >     result in negative interoperability implications (e.g. uniqueness
> > >     during a specified period of time), and an associated failure
> > > 
> > > nit: comma after "e.g.".
> > > 
> > >     For more than 30 years, a large number of implementations of the TCP/
> > >     IP protocol suite have been subject to a variety of attacks, with
> > > 
> > > (editorial) the transition from the previous paragraph might flow better
> > > if this was something like "The TCP/IP protocol suite alone has been
> > > subject to a variety of attacks on its numerical identifiers over the past
> > > 30 years or more, with [...]".
> > > 
> > >     o  Predictable DNS TxIDs
> > > 
> > > [DNS TxIDs are pretty hard to argue for being part of the "TCP/IP
> > > protocol suite", so I added a hedge word "alone" in my previous
> > > suggestion]
> > 
> > Well, the DNS is the Internet's directory service, so to speak. But your 
> > suggestion is fine, so I'll apply it.
> > 
> > 
> > 
> > > Section 2
> > > 
> > > What's wrong with the RFC 4949 definition of "identifier"?
> > > (We'd still want to keep the discussion about "transient", of course.)
> > 
> > To be honest, we missed there was this definition of "identifier" in the 
> > RFC series. That said, the definition in RFC4949 also accommodates 
> > non-numeric identifiers, for which the concept of e.g. "linear", 
> > monotonically-increasing, etc. would not apply.
> 
> This is true (and I was not sure whether pulling in 4949 for a single
> definition was going to help much -- it doesn't cover any of the other
> things we need).
> 
> > I would suggest that we replace this definition of "identifier" with the 
> > definition of "transient numeric identifier", which is the specific 
> > identifiers we're concerned with here. 
> 
> That seems like a really good approach (especially since we explicitly say
> we will use the shorter term to refer to the same thing in this document).
> 
> > draft-irtf-pearg-numeric-ids-history defines them as:
> > 
> >     Transient Numeric Identifier:
> >        A data object in a protocol specification that can be used to
> >        definitely distinguish a protocol object (a datagram, network
> >        interface, transport protocol endpoint, session, etc) from all
> >        other objects of the same type, in a given context.  Transient
> 
> This "in a given context" seems to be the only text at present that touches
> on the "transient" part.  That may well be fine; all the ideas that are
> coming to me about "the context in which the identifier is valid is limited
> in scope or time, e.g., a connection lifetime or replay timer" or similar
> feel like they are probably too much detail.
> 
> >        numeric identifiers are usually defined as a series of bits, and
> >        represented using integer values.  These identifiers are typically
> >        dynamically selected, as opposed to statically-assigned numeric
> >        identifiers (see e.g.  [IANA-PROT]).  We note that different
> >        identifiers may have additional requirements or properties
> >        depending on their specific use in a protocol.  We use the term
> >        "transient numeric identifier" (or simply "numeric identifier" or
> >        "identifier" as short forms) as a generic term to refer to any
> >        data object in a protocol specification that satisfies the
> >        identification property stated above.
> > 
> > 
> > 
> > 
> > > Also, we should pick up the new BCP 14 boilerplate from RFC 8174.
> > 
> > Definitely. Will do.
> > 
> > 
> > 
> > > Section 3
> > > 
> > > [*] Overall, I find this section longer than I expected, spending "too
> > > much time" on examples and evangelizing. 
> > 
> > FWIW, we might keep the bullets in this section, and simply, right after 
> > the bullets, refer to draft-irtf-pearg-numeric-ids-history and 
> > draft-irtf-pearg-numeric-ids-generation for further details. Thoughts?
> 
> That would probably work.  I think it would be okay to have maybe another
> sentence for each bullet, too, but let's see what it looks like.
> > 
> > 
> > > I have some specific comments
> > > below, but I'm not sure that just addressing the specific comments would
> > > change the overall impression that the section gives.
> > > 
> > >     While assessing protocol specifications and implementations regarding
> > >     the use of transient numeric identifiers
> > >     [I-D.gont-numeric-ids-history], we found that most of the issues
> > >     discussed in this document arise as a result of one of the following
> > >     conditions:
> > > 
> > > (editorial) A potential rewording that hews closer to "typical RFC
> > > style" might be:
> > > 
> > > % A recent survey of transient numerical identifier usage in protocol
> > > % specifications and implementations [I-D.gont-numeric-ids-history]
> > > % revealed that most of the issues discussed in this document arise as a
> > > % result of one of the following conditions:
> > > 
> > >     A number of protocol implementations (too many of them) simply
> > >     overlook the security and privacy implications of identifiers.
> > >     Examples of them are the specification of TCP port numbers in
> > >     [...]
> > >     On the other hand, there are a number of protocol specifications that
> > >     over-specify some of their associated protocol identifiers.  For
> > >     example, [RFC4291] essentially results in link-layer addresses being
> > >     embedded in the IPv6 Interface Identifiers (IIDs) when the
> > >     [...]
> > > 
> > > I wonder if the writing would be tighter if we diverged from the "one
> > > bullet point, one paragraph" style.  Consider:
> > > 
> > > % Both under-specifying and over-specifying identifier contents is
> > > % hazardous.  TCP port numbers and sequence numbers [RFC0793] and DNS
> > > % TxID [RFC1035] were under-specified, leading to implementations that
> > > % used predictable values and thus were vulnerable to numerous off-path
> > > % attacks.  Over-specification, as for IPv6 Interface Identifiers (IIDs)
> > > % [RFC4291] and Fragment Identification values [RFC2460], leaves
> > > % implementations unable to respond to security and privacy issues
> > > % stemming from the mandated algorithm -- IPv6 IIDs need not expose
> > > % privacy-sensitive link-layer addresses, and predictable Fragment
> > > % Identifiers invite the same off-path attacks that plague TCP.
> > 
> > FWIW, what we tried to do here was to provide one example regarding what 
> > we meant by "under-specification" (the requirements are not clearly 
> > spelled out), "over-specification" (a proposed algorithm overloads the 
> > ID with properties it need not have), etc., such that future specs avoid 
> > making the same error.
> 
> (thanks for making it clear)
> 
> > >     Finally, there are protocol implementations that simply fail to
> > >     comply with existing protocol specifications.  For example, some
> > >     popular operating systems (notably Microsoft Windows) still fail to
> > >     implement transport-protocol port randomization, as specified in
> > >     [RFC6056].
> > > 
> > > It's not clear that this chunk speaks to the third bullet point; from
> > > the IETF perspective, implementation of ("compliance with") our
> > > protocols is optional, and there's not a lever in place to force people
> > > to take updates when we revise/update the spec.  
> > 
> > Agreed. What we meant by this bullet is that there are cases where the 
> > specs do the right thing, but there's a flaw in how the spec is turned 
> > into an implementation. (or well, the spec wasn't implemented at all).
> > 
> > 
> > 
> > > Implementing only part
> > > of a spec is a problem, but if we want to lament lack of adoption of
> > > follow-on updates, we should be writing things differently.
> > 
> > What we meant here is that this is a case where you might face security 
> > issues arising from flawed transient numeric IDs, but this has nothing 
> > to do with suboptimal specs, but rather with suboptimal implementations 
> > (and in that sense there's not much we (IETF) can do about them).
> 
> Sure.  "The right thing to do is properly specified (whether in a given
> document or an update to it), but the implementation just didn't do the
> right thing."
> 
> > 
> > > 
> > >     By requiring protocol specifications to clearly specify the
> > >     interoperability requirements for the transient numeric identifiers
> > >     they specify, the constraints in the possible algorithms to generate
> > >     them, as well as possible over-specification of such identifiers,
> > >     become evident.  Furthermore, requiring specifications to include a
> > > 
> > > nit(?): I'm not sure whether "constraints" is the right word here --
> > > what is being constrained, and by whom?  Would "limitations" or "risks"
> > > be workable?
> > 
> > Probably an issue of "English as second language" here, sorry: I guess 
> > we could have used "requirements" instead of "constraints". i.e., what 
> > we meant is that if you spell out the interoperability requirements, two 
> > things will become evident:
> > 
> > 1) The very "function" the algorithm needs to implement (e.g. if you 
> > need monotonically-increasing IDs, you better make sure that a new 
> > transient ID is larger than its predecessor), and,
> > 
> > 2) It would become evident when you are overspecifying things (e.g., 
> > monotonically-increasing IDs need not be a global counter that starts at 
> > 0 -- that's not necessary to achieve monotonically-increasing IDs).
> 
> Thanks for writing this out, it helps a lot.  I suspect that what we
> actually want is to keep "constraints", but s/in/on/ -- the "constraints on
> the possible algorithms" are exactly the "function the algorithm needs to
> implement", which leads nicely into the "possible over-specification" that
> matches your point (2).
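
A minimal sketch of point (2) above, assuming the only interoperability requirement is that each new identifier is larger than the previous one. The class name, the 32-bit starting range, and the `max_step` bound are illustrative assumptions, not taken from the draft; wrap-around of a fixed-width field is deliberately ignored.

```python
import secrets


class MonotonicIdGenerator:
    """Hypothetical generator: IDs only ever increase, but neither the
    starting point nor the step between IDs is predictable."""

    def __init__(self, start_bits=32, max_step=1024):
        # Random initial value instead of a counter that starts at 0.
        self._last = secrets.randbits(start_bits)
        self._max_step = max_step

    def next_id(self):
        # Random increment of at least 1: ordering is preserved,
        # but the difference between successive values is not constant.
        self._last += 1 + secrets.randbelow(self._max_step)
        return self._last


gen = MonotonicIdGenerator()
ids = [gen.next_id() for _ in range(5)]
```

The "function" the algorithm must implement is just the ordering check; starting at 0 and stepping by 1 are extra properties that the requirement never asked for.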
> 
> > 
> > 
> > >     security and privacy analysis of the transient numeric identifiers
> > >     they specify prevents the corresponding considerations from being
> > >     overlooked at the time a protocol is specified.
> > > 
> > > [*] I really don't think this is an appropriate way to phrase what we're
> > > doing.  To specifically *require* authors to include an analysis that
> > > covers these particular points seems to be quite a divergence from RFC
> > > 3552 -- while the presence of a security considerations section in RFCs
> > > is required, what RFC 3552 claims to provide is just guidelines for how
> > > to write such a section.  Isn't what we're doing here just an
> > > incremental addition to that guidance, not a new hard requirement on
> > > authors?
> > 
> > Indeed.
> > 
> > How about changing the paragraph to:
> > 
> >     Clear specification of the interoperability requirements for the
> >     transient numeric identifiers will help identify possible algorithms
> >     that could be employed to generate them, and also make evident
> >     if such identifiers are being over-specify. A protocol specification
> 
> ("over-specified")
> 
> >     will usually also benefit from a security and privacy analysis of
> >     the transient numeric identifiers they specify, to prevents the
> 
> ("to prevent")
> 
> >     corresponding considerations from being overlooked in the protocol
> >     specification itself.
> > 
> > ?
> 
> Looks good; just the indicated nits.
> 
> > 
> > 
> > > Section 4
> > > 
> > > [*] Similarly to Section 3, this section felt significantly longer than
> > > I was expecting.  Could you say something about the motivation for
> > > putting content here as opposed to (e.g.)
> > > draft-irtf-pearg-numeric-ids-generation?  (Note, this section currently
> > > references draft-gont-predictable-numeric-ids, which is IIRC the
> > > pre-split consolidated document.)
> > 
> > We were not sure if sentences such as "Employing the same identifier 
> > across contexts in which constancy is not required" or "Employing the 
> > same increment space across different contexts" would make sense to the 
> > reader without any context.
> > 
> > I can think of two options to address your comment:
> > 1) Keep the list, and point to draft-irtf-pearg-numeric-ids-generation 
> > right after that.
> > 
> > 2) Strip much of the contents of this section (notably the specific 
> > examples) as follows:
> > 
> > ---- cut here ----
> >     Employing trivial algorithms for generating the identifiers means
> >     that any node that is able to sample such identifiers can easily
> >     predict future identifiers employed by the victim node.
> > 
> >     When one identifier is employed across contexts where such constancy
> >     is not needed, activity correlation is made possible.  For
> >     example, employing an identifier that is constant across networks
> >     allows for node tracking across networks.
> > 
> >     Re-using identifiers across different layers or protocols ties the
> >     security and privacy of the protocol re-using the identifier to the
> >     security and privacy properties of the original identifier (over
> >     which the protocol re-using the identifier may have no control
> >     regarding its generation).  Besides, when re-using an identifier
> >     across protocols from different layers, the goal of isolating the
> >     properties of a layer from that of another layer is broken, and the
> >     privacy and security analysis may be harder to perform, since the
> >     combined system, rather than each protocol in isolation will have to
> >     be assessed.
> 
> I see I made this one longer, but I don't get to complain :)
> 
> >     At times, a protocol needs to convey order information (whether
> >     sequence, timing, etc.).  In many cases, there is no reason for the
> >     corresponding counter or timer to be initialized to any specific
> >     value e.g. at system bootstrap.
> 
> (Unrelated to previous comments) I wonder if it would be worth also saying
> that there may not be a need for the difference between successive counted
> values to be a constant increment (e.g., if order is truly the only needed
> property, without any "counter" nature).  Perhaps adding to the end ", or
> even for the difference between successive values to be predictable" would
> be a minimal-ish change, though I'm still on the fence as to whether it's
> worth saying [in this document].
> 
> >     A node that implements a per-context linear function may share the
> >     increment space among different contexts (please see the "Simple
> >     Hash-Based Algorithm" in [I-D.gont-predictable-numeric-ids]).
> >     Sharing the same increment space allows an attacker that can sample
> >     identifiers in other contexts to e.g. learn how many identifiers have
> >     been generated between two sampled values.
> > 
> >     Finally, some implementations have been found to employ flawed PRNGs.
> >     See e.g. [Klein2007].
> > ---- cut here ----
> > 
> > The above still gives context for the bullets, but reduces the amount of 
> > text and the amount of unnecessary details/examples here.
> > 
> > I'd probably prefer addressing your comment with the modified text, but 
> > I would still be fine with removing the text if you prefer.
> 
> I think we can get the modified text to work; thanks for putting it
> together.  Would you also be able to stub out what it would look like to
> just make this modified text *be* the list itself?  E.g., "Employing
> trivial algorithms for generating the identifiers, which means [...]",
> "Employing the same identifier across contexts in which constancy is not
> required -- when one identifier is employed [...]", etc.  I think we might
> have to see what that looks like before we can decide whether or not it
> will work.
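
To make the "same increment space" item in the quoted text concrete, here is a rough sketch, assuming a scheme of the form "keyed hash of the context plus a counter". The two classes, the 16-bit identifier space, and the `leaked_count` helper are hypothetical names chosen for illustration; this is not the exact algorithm from any of the drafts.

```python
import hashlib
import hmac
import secrets
from collections import defaultdict

SECRET = secrets.token_bytes(16)  # hypothetical per-boot secret
ID_SPACE = 2 ** 16                # assumed 16-bit identifier field


def offset(context: bytes) -> int:
    # A keyed hash gives each context its own unpredictable offset.
    digest = hmac.new(SECRET, context, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


class SharedIncrement:
    """One counter shared by every context (shared increment space)."""

    def __init__(self):
        self.counter = 0

    def next_id(self, context: bytes) -> int:
        self.counter += 1
        return (offset(context) + self.counter) % ID_SPACE


class PerContextIncrement:
    """A separate counter for each context."""

    def __init__(self):
        self.counters = defaultdict(int)

    def next_id(self, context: bytes) -> int:
        self.counters[context] += 1
        return (offset(context) + self.counters[context]) % ID_SPACE


def leaked_count(gen) -> int:
    # The attacker samples two of its own IDs; the victim context
    # generates 7 IDs in between.
    first = gen.next_id(b"attacker")
    for _ in range(7):
        gen.next_id(b"victim")
    second = gen.next_id(b"attacker")
    return (second - first) % ID_SPACE
```

With the shared counter, the difference between the attacker's two samples is 8, which reveals that 7 identifiers were generated for other contexts in between. With per-context counters the difference is 1 and reveals nothing about the victim's activity.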
> 
> > 
> > [....]
> > > 
> > > It's also probably worth noting that this makes the privacy (and, to
> > > some extent, security) analysis harder, since you can no longer just
> > > consider each protocol in isolation and instead have to look at the
> > > combined system.
> > 
> > I've added this one above. Thanks!
> > 
> > 
> > 
> > > 
> > > Section 5
> > > 
> > > [*] This seems like the meat of the document, but it's at risk of
> > > getting lost due to the size of Sections 3 and 4.  Perhaps the
> > > Introduction should specifically say something like "the key guidelines
> > > for protocol designers are found in Section 5" to call it out.
> > 
> > Good grief! I'll add a note to the Intro.
> > 
> > 
> > >     Protocol specifications that specify transient numeric identifiers
> > >     MUST:
> > > 
> > > [*] This gets into the same question as above about "RFC 3552 doesn't
> > > require specific things in the security considerations; why should we?".
> > > I think we should rephrase this, perhaps:
> > > 
> > > % When a protocol specifies transient numerical identifiers, it is
> > > % critical for the security and privacy considerations to include
> > > % analysis that:
> > > 
> > > (with the corresponding verb tense changes for the list itself).
> > > 
> > >     1.  Clearly specify the interoperability requirements for the
> > >         aforementioned identifiers.
> > > 
> > > Hmm, would it be fair to say "interoperability (i.e., functional)"?
> > 
> > I'm not sure if the two words convey the same meaning. You might 
> > probably know better than me for this one (i.e. "English as second 
> > language"). Not sure if someone might interpret "functional 
> > requirements" as a vague description of what the transient numeric ID is 
> > for (e.g. "identifying packets") as opposed to detailed requirements 
> > such as "monotonically increasing...".
> > 
> > That said, no matter which specific term we employ, I guess it might 
> > help to add a parenthesis with something like:
> > 
> > 
> > "(e.g. required properties such as uniqueness, along with the failure 
> > mode if such properties are not met)"
> 
> This is good; we should probably take it.
> 
> > Thoughts?
> 
> In light of the discussion here, I think it is okay to stick with
> "interoperability".  I might consider going with "core interoperability",
> in an attempt to emphasize that we are looking for the actual fundamental
> requirements, not just the stuff that the protocol authors put in as MUST
> because they felt like it, but I think just "interoperability" would work,
> too.
> 
> > 
> > 
> > >     2.  Provide a security and privacy analysis of the aforementioned
> > >         identifiers.
> > > 
> > > This one is perhaps redundant with the revised lead-in, though.
> > 
> > How about if we remove the redundant text from the lead-in?
> 
> So, just
> 
> % When a protocol specifies transient numerical identifiers, it is
> % critical for the security and privacy considerations to:
> 
> ?
> 
> That looks like it would work.
> 
> > My take is that an analysis of any specified numeric IDs is also key for 
> > the "Privacy Considerations" that are expected in RFCs these days. So, 
> > while not mandatory, I'd expect that specs that explicitly do this 
> > homework will have more chances of avoiding these problems.
> 
> Indeed.
> 
> > > 
> > > Section 7
> > > 
> > >     This entire document is about the security and privacy implications
> > >     of transient numeric identifiers, and formally updates [RFC3552] such
> > >     that the "Security Considerations" sections of RFCs are required to
> > > 
> > > [if the other changes go through, the "required" wording will need to be
> > > tweaked]
> > 
> > How about changing this to:
> > 
> >     This entire document is about the security and privacy implications
> >     of transient numeric identifiers, and formally updates [RFC3552] such
> >     that the security and privacy implications of transient numeric
> >     identifiers are considered when writing the "Security Considerations"
> >     section of future RFCs.
> > 
> > ?
> 
> Works for me.
> 
> > 
> > 
> > > Section 9.1
> > > 
> > > We seem to only be using RFCs 793, 2460, 6056, and 8200 as examples, so
> > > they would be okay to relegate to the Informative References section.
> > 
> > Definitely.
> > 
> > 
> > 
> > > Section 9.2
> > > 
> > > Keeping [I-D.gont-predictable-numeric-ids] around for the
> > > Acknowledgments makes sense, but (as noted above) we should refer to the
> > > appropriate post-split document(s) in the main body text.
> > 
> > Done.
> > 
> > 
> > > 
> > > Section 9.3
> > > 
> > > Please consolidate into Section 9.2
> > 
> > Done!
> > 
> > Thanks *a lot* for your comments! They have been really useful.  -- 
> > Please do let us know what you think about the few open issues above 
> > and whether we should rev the document after that.
> 
> I think we have gotten as far as we can get just by talking about potential
> changes, so please go ahead and rev the document so we can see the changes
> and have a chance to change our mind back :)
> 
> Many thanks,
> 
> Ben
