Re: [Anima] Benjamin Kaduk's Discuss on draft-ietf-anima-bootstrapping-keyinfra-22: (with DISCUSS and COMMENT)

Benjamin Kaduk <kaduk@mit.edu> Fri, 16 August 2019 22:33 UTC

Date: Fri, 16 Aug 2019 17:33:16 -0500
From: Benjamin Kaduk <kaduk@mit.edu>
To: Michael Richardson <mcr+ietf@sandelman.ca>
Cc: The IESG <iesg@ietf.org>, draft-ietf-anima-bootstrapping-keyinfra@ietf.org, tte+ietf@cs.fau.de, anima@ietf.org, anima-chairs@ietf.org

On Thu, Aug 15, 2019 at 12:58:45PM -0400, Michael Richardson wrote:
> 
> Benjamin Kaduk <kaduk@mit.edu> wrote:
>     >> + directly.  This is because BRSKI pledges MUST use the CSR Attributes
> 
>     > (This may not need to be a 2119 MUST since we cite 7030.)
> 
> It turns out, in practice, that many EST clients do not use the CSR
> Attributes, so I need this line as a hammer.

Fair enough.
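
(For anyone implementing this, the pledge side is small enough to sketch.
This is only my illustration, not proposed text: it talks to the RFC 7030
csrattrs endpoint, but the "registrar-ca.pem" path is made up, decoding the
CsrAttrs ASN.1 is hand-waved, and the rfc822Name value stands in for
whatever the attributes actually call for.)

    # Hypothetical pledge-side sketch: ask the EST server (registrar) what
    # the CSR should contain, then build the CSR accordingly.
    import base64
    import requests
    from cryptography import x509
    from cryptography.x509.oid import NameOID
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import ec

    def fetch_csr_attributes(est_base_url):
        # GET /.well-known/est/csrattrs returns base64-encoded DER CsrAttrs
        # (RFC 7030); turning that ASN.1 into "which fields to request" is
        # omitted here.
        resp = requests.get(est_base_url + "/.well-known/est/csrattrs",
                            verify="registrar-ca.pem")  # illustrative anchor
        resp.raise_for_status()
        return base64.b64decode(resp.text)

    def build_csr(serial_number, acp_rfc822name):
        # The rfc822Name value is whatever the CSR Attributes told us to use.
        key = ec.generate_private_key(ec.SECP256R1())
        csr = (x509.CertificateSigningRequestBuilder()
               .subject_name(x509.Name(
                   [x509.NameAttribute(NameOID.SERIAL_NUMBER, serial_number)]))
               .add_extension(x509.SubjectAlternativeName(
                   [x509.RFC822Name(acp_rfc822name)]), critical=False)
               .sign(key, hashes.SHA256()))
        return key, csr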

>     >> > "intended" implies that the EST server has some knowledge of what
>     >> > the pledge is expected to be doing in the network, right?
>     >> 
>     >> Yes.  The ACP document is quite specific about the (rfc822Name)
>     >> attributes to assign.  Certainly the attributes could include stuff
>     >> like "ve608.core1.tor1.example.net" if the Registrar knew how this
>     >> device was to be used, but more likely that would be set up
>     >> afterwards.
> 
>     > Hmm, maybe later when we say "the local infrastructure (EST server)
>     > informs the pledge of the proper fields to include in the generated
>     > CSR" we could reiterate that the EST server has local configuration
>     > information to inform this messaging, though it's probably not
>     > necessary.
> 
> I've added the ().
> 
> +          fields to include in the generated CSR (such as rfc822Name). 
> 
>     doc> To alleviate these operational difficulties, the pledge MUST request
>     doc> the EST "CSR Attributes" from the EST server and the EST server
>     doc> needs to be able to reply with the attributes necessary for use of
>     doc> the certificate in its intended protocols/services.  This approach
>     doc> allows for minimal CA integrations and instead the local
>     doc> infrastructure (EST server) informs the pledge of the proper fields
>     doc> to include in the generated CSR.  This approach is beneficial to
>     doc> automated bootstrapping in the widest number of environments.
>     >> 
>     >> > This is convenient, but has some security considerations in that it
>     >> > implies that the validation policy on the CA is somewhat lax, since
>     >> > the EST server is expected to be doing most of the policy controls.
>     >> > Thus, a compromised pledge/device could send a CSR with unauthorized
>     >> > fields and it is likely to be signed, allowing for some level of
>     >> > privilege escalation.  When the registrar acts as a proxy to the CA
>     >> > as well as its EST role, as described later, this risk is
>     >> > diminished.
>     >> 
>     >> I don't really understand.  EST servers are Registration Authorities,
>     >> and they have some kind of privileged access to the CA, and are
>     >> mandated to check the CSR.  I expected to find a statement to this
>     >> effect in RFC7030, in section 4.2.1, but I don't see any particularly
>     >> strong language.  This seems like a quality of implementation issue in
>     >> the Registrar.
> 
>     > The high-level intended workflow described here is roughly "(1) pledge
>     > asks registrar for config; (2) pledge puts that config in a CSR, signs
>     > the CSR, and sends the CSR to registrar; (3) registrar passes CSR to CA
>     > using registrar's implicit authority.  We don't describe any crypto to
>     > check that (2) happens as intended, as opposed to the pledge
>     > dishonestly claiming "oh, and I'm a CA" or "I can provide all ACP
>     > services, even privileged ones", so that has to be done by policy in
>     > the registrar, as you note.  I'm wary of suggesting the workflow that
>     > relies on the registrar's implicit authority at the CA without also
>     > noting the registrar's policy enforcement obligations.  Though it's
>     > possible this is covered elsewhere and doesn't need to be duplicated
>     > here.
> 
> I think it goes back to the RA and more specifically, the CA, being boss of
> what goes into a certificate.   To the point where it generally seems really
> hard to deploy new extensions in the public WebPKI.
> 
> It does say:
> 
>           <t>The registrar MUST also confirm that the resulting CSR is formatted as
>           indicated before forwarding the request to a CA. If the registrar is
>           communicating with the CA using a protocol such as full CMC, which
>           provides mechanisms to override the CSR attributes, then these
>           mechanisms MAY be used even if the client ignores CSR Attribute
>           guidance.</t>

Hmm, I guess I must have missed that or skimmed over it too quickly.
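
For concreteness, the kind of registrar-side gate I read that paragraph as
requiring looks roughly like the sketch below; the function name and the
policy input are mine, not the draft's.

    # Hypothetical registrar check: only forward a CSR to the CA if the
    # identities it requests stay within the CSR Attributes guidance we gave
    # the pledge, so a compromised pledge can't smuggle extra privileges in.
    from cryptography import x509

    def csr_within_policy(csr_pem, allowed_rfc822_names):
        csr = x509.load_pem_x509_csr(csr_pem)
        if not csr.is_signature_valid:
            return False
        try:
            san = csr.extensions.get_extension_for_class(
                x509.SubjectAlternativeName).value
        except x509.ExtensionNotFound:
            return False
        requested = set(san.get_values_for_type(x509.RFC822Name))
        return requested <= set(allowed_rfc822_names)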

>     >> > Section 7
>     >> 
>     >> > If this is non-normative and will need to be fleshed out in a
>     >> > separate document, would an Appendix be more appropriate?
>     >> 
>     >> Sections 9 and 10 refer back to this section in a normative fashion.
> 
>     > Er, wouldn't that make this section no longer non-normative?  (Not that
>     > I could find the references you're talking about, so a clue bat is
>     > welcome.)
> 
> It's of the form, "if you wish to do X, then you MUST do Y"
> (but, X is not a MUST).

That specific construction would seem like an "optional feature" per
https://www.ietf.org/blog/iesg-statement-normative-and-informative-references/
...

> section 9:
>    In recognition of this, some mechanisms are presented in
>    Section 7.2.  The manufacturer MUST provide at least one of the one-
>    touch mechanisms described that permit enrollment to proceed
>    without availability of any manufacturer server (such as the MASA).

... but this is a somewhat different construction.  In isolation, it looks
more like "MUST do at least one of X, Y, Z" without condition on "wish to
do W", and if X, Y, and Z are all in the same place, that place seems
normative to me.  (I will confess I've rather lost track of exactly why
we're debating if this is normative or not; I guess it's just the
disclaimer in Section 7 about "considered non-normative in the generality
of the protocol".)

>     >> > I think this is maybe more of a "does not enforce" than "does not
>     >> > know", since the domainID ends up in the audit logs that the MASA
>     >> > holds.
>     >> 
>     >> Yes, but the domainID does not directly identify the Registrar by
>     >> name.  Assuming a database breach, what does the MASA know that it can
>     >> reveal.
> 
>     > can reveal or be correlated with other sources of information.  The
>     > domainID is derived from the public part of a certificate, which could
>     > well be widely disseminated.  An attacker that compromises the MASA and
>     > retains a presence can watch requests come in and backsolve from
>     > domainID to certificate directly.
> 
> Right, so that's 11.4.3, where the attacker controls the web server.

The "watch requests come in" is, yes.  But if we don't constrain the
visibility/use of the certificates in question, they could be out "in the
wild" and thus correlatable even with just a database dump/snapshot.
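
To make the correlation concrete, a rough sketch (mine, not the draft's; it
assumes the domainID is derived more or less like an RFC 5280 method-1 key
identifier over the domain CA's public key, and the draft's exact derivation
may differ):

    # With only a dump of MASA audit-log entries, an attacker who has
    # harvested public registrar/domain CA certificates elsewhere can
    # recompute domainIDs and link log entries to named organizations.
    import hashlib
    from cryptography import x509
    from cryptography.hazmat.primitives.serialization import (
        Encoding, PublicFormat)

    def domain_id(ca_cert_pem):
        cert = x509.load_pem_x509_certificate(ca_cert_pem)
        spki = cert.public_key().public_bytes(
            Encoding.DER, PublicFormat.SubjectPublicKeyInfo)
        return hashlib.sha1(spki).hexdigest()   # simplified derivation

    def correlate(dumped_domain_ids, harvested_cert_pems):
        known = {domain_id(pem): pem for pem in harvested_cert_pems}
        return {d: known[d] for d in dumped_domain_ids if d in known}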

> At which point, they don't need to compromise the database, I think, as they
> can just access it.  I think of a database dump as being offline: I get
> access to your backups, or I find an SQL injection attack that exfiltrates
> data, but does not give me control.
> 
>     >> > A nonceless voucher can still include an expiration time ... it is
>     >> > just in practice possible for it to never expire, if the target
>     >> > pledge does not have an accurate clock.
>     >> 
>     >> Yes, that's correct.  How many devices with RTCs survive 10 years in a
>     >> warehouse with no power? :-)
> 
>     > The ones with a radioisotope thermoelectric generator? ;)
> 
> So RTCs with RTGs.
> Apparently not much plutonium is left for future missions.
> 
>     >> fixed.
>     >> 
>     >> > Section 9
>     >> 
>     doc> The autonomic control plane that this document provides bootstrap
>     doc> for is typically a medium to large Internet Service Provider
>     doc> organization, or an equivalent Enterprise that has significant
>     doc> layer-3 router connectivity.  (A network consisting of primarily
>     doc> layer-2
>     >> 
>     >> > nit: "is used in" -- the ACP is not the entire organization!
>     >> 
>     >> The text doesn't have "is used in"
> 
>     > (Right, I was saying that you should add it.  But the reworking fixes
>     > the nit, so this is all good.)
> 
> got it.
> 
>     doc> There are a number of design choices that mitigate this risk.  The
>     doc> domain can maintain some privacy since it has not necessarily been
>     doc> authenticated and is not authoritatively bound to the supply chain.
>     >> 
>     >> > Is this really "privacy" or just "semi-plausible deniability"?
>     >> 
>     >> It's a good question.  A domain could use a new certificate for each
>     >> device, could connect via some onion router.  Is that privacy, or
>     >> "semi-plausible deniability"?
> 
>     > It might depend on the details of the certificates used, what CA issued
>     > them (and its policies regarding accuracy and level of detail in
>     > certificates issued therefrom), and whether the certificates are
>     > used/exposed for any other purposes.  So, I don't insist on any
>     > specific language, and was just sharing my thoughts about how it is
>     > possible to think about this, in case it sparked any insight.
> 
> Agreed.
> I think that it's a hard problem.
> 
>     doc> The above situation is to be distinguished from a residential/
>     doc> individual person who registers a device from a manufacturer: that
>     doc> an enterprise/ISP purchases routing products is hardly worth
>     doc> mentioning.  Deviations would, however, be notable.
>     >> 
>     >> > Deviations in what sense?
>     >> 
>     >> After buying only Cisco equipment (like ASR 9000) [observed by number
>     >> of MASA connections after each POP turn up], ISP example.net suddenly
>     >> has started communicating with Juniper's MASA (for MX40s..).
> 
>     > So like "deviations from a historical trend" or "deviations from an
>     > established baseline"?
> 
> added that text.
> 
>     >> > Section 10.3
>     >> 
>     doc> 4.  There is a fourth case, if the manufacturer is providing
>     doc> protection against stolen devices.  The manufacturer then has a
>     doc> responsibility to protect the legitimate owner against fraudulent
>     doc> claims that the equipment was stolen.  Such a claim would cause
>     doc> the manufacturer to refuse to issue a new voucher.  Should the
>     doc> device go through a deep factory reset (for instance, replacement of
>     doc> a damaged main board component), the device would not bootstrap.
>     >> 
>     >> > I'm not sure I understand this scenario -- is it talking about where
>     >> > a third party makes a false theft report in the hopes that the real
>     >> > owner will have to do a deep reset and then have the device fail to
>     >> > bootstrap because of the reported theft?
>     >> 
>     >> Yes.
> 
>     > I think having "In the absence of such manufacturer protection, such a
>     > claim would cause [...]" would have helped me get there.
> 
> added.
> 
>     doc> registrar.  This is mandated anyway because of the operational
>     doc> benefits of an informed administrator in cases where the failure is
>     doc> indicative of a problem.  The registrar is RECOMMENDED to verify
>     doc> MASA
>     >> 
>     >> > I'd also expect some comment about the limited value of the
>     >> > additional information to an attacker in the context where the
>     >> > attacker already would know [other information].
>     >> 
>     >> I'm not sure what other "other information" is yet, so I don't know
>     >> how to fill that in.
> 
>     > IIUC, this text is predicated on a "possibly malicious registrar" that
>     > a pledge tries to register using/through but fails.  So that registrar
>     > is already in a position where the pledge would try to use it -- in the
>     > ACP case, that probably means fairly physically proximate, and maybe
>     > implies a link-local connection.
> 
> Agreed. In the ACP case, imagine an ISP border router on which there are a number
> of links; some of which are internal (legitimate Join Proxy), and some of
> which are external links to another ISP's border router (Q).  The Join Proxy
> on Q really doesn't know if the purpose of that link has changed from
> external to internal.  That's exactly the case of a non-malicious mistake
> in finding the right network.
> 
>     > In the non-ACP case, maybe it also
>     > implies proximity?  I don't know if we can get physical vs. network
>     > proximity implied by a pledge trying to use a registrar, but maybe
>     > sometimes.  My implied question is basically "what else would an
>     > attacker have to do to get a pledge to try to use it as a registrar,
>     > and what does the attacker already have access to by the time it gets
>     > that far?"  So we can set the information learned from voucher parsing
>     > status reports in the context of what else the attacker would already
>     > know.  But I don't have the best picture on the deployment scenarios
>     > here, so I can't answer the question myself :)
> 
> We left the warning there so that the people writing code would think about
> this tussle. My opinion is that security paranoia has often gotten in the way of
> building debuggable security systems that actually work.  
> 
>     doc> this might be an issue during disaster recovery.  This risk can be
>     doc> mitigated by Registrars that request and maintain long term copies
>     doc> of "nonceless" vouchers.  In that way they are guaranteed to be able
>     doc> to bootstrap their devices.
>     >> 
>     >> > This, of course, comes with a different risk of what is something
>     >> > like a long-term credential existing that needs to be protected and
>     >> > stored.
>     >> 
>     >> I partially agree.  Long-term nonceless vouchers still pin a specific
>     >> domainID.  So they need to be available, but they don't need to be
>     >> private.
> 
>     > It's not entirely clear to me that it's okay to make them totally
>     > public (but maybe I am missing something).  That is, in that once the
>     > voucher is public, anyone who can get next to the device can
>     > re-bootstrap it using that voucher, which possibly gets it into a
>     > configuration that's not usable in its current location, and (less
>     > likely) maybe it's hard for the real owner to get the correct
>     > configuration back (if the original MASA is gone or whatever).  So the
>     > voucher is not something I'd want to just put on a public web site.
>     > But I guess maybe you don't have to protect it to the same extent that
>     > you do your crypto keys.
> 
> Not anyone can use the nonceless voucher to bootstrap a device.
> If it did, then that would be a bearer voucher ("cash").
> It still pins a specific domain CA ("pay $X to Mr. Y").
> It's a public document in the same sense that a public key is.

IIUC I don't have to prove possession of the private key corresponding to
a certificate that chains to that domain's CA in order to get the pledge to
accept the voucher.  That is, even if I can't get the device to work for
me, I can still keep someone else from using it (e.g., reusing a nonceless
voucher for the original owner after re-sale).
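
For reference, a strawman of the acceptance checks as I picture them (field
names follow the voucher artifact; accept_voucher() and verify_chain() are
made-up stand-ins, not anything from the draft).  Whether the last step
really forces the presenter to prove possession of a key under the pinned
domain CA is exactly the part I'm unsure about:

    # My reading of the pledge's voucher checks (nonceless case:
    # sent_nonce is None).  verify_chain() is a placeholder for real
    # certificate path validation of the provisional TLS peer.
    def verify_chain(leaf, intermediates, trust_anchor):
        raise NotImplementedError("placeholder for real path validation")

    def accept_voucher(voucher, registrar_tls_chain, pledge_serial,
                       sent_nonce=None):
        if voucher.get("serial-number") != pledge_serial:
            return False
        if sent_nonce is not None and voucher.get("nonce") != sent_nonce:
            return False      # nonced flow: the voucher must echo our nonce
        pinned_ca = voucher["pinned-domain-cert"]
        # The voucher names a domain CA rather than a bearer; the party on
        # the provisional TLS connection still presents a chain under it.
        return verify_chain(leaf=registrar_tls_chain[0],
                            intermediates=registrar_tls_chain[1:],
                            trust_anchor=pinned_ca)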

>     >> > This gets a bit more complicated to reason about when we assume that
>     >> > a pledge will have multiple voucher requests out in parallel,
>     >> > instead of doing them in sequence and timing out (so that there is
>     >> > literally only one valid nonce at a given time)...
>     >> 
>     >> Agreed.  But, we think that doing things serially has too much
>     >> potential to run into head-of-queue challenges; the risk is that if
>     >> devices don't onboard relatively quickly, then BRSKI will get turned
>     >> off, not used, or the vendor might provide some "backdoor"
> 
>     > That's a fair concern, but in and of itself is not an excuse to skip
>     > reasoning through the risks of the parallel workflow.  How much effort
>     > has already been spent doing that reasoning through?  For example, one
>     > might want to require that the pledge track which nonce belongs to the
>     > voucher request submitted through which candidate registrar, but I
>     > didn't work through whether that actually will defend against any
>     > attack scenarios.
> 
> Each attempt needs to use a distinct nonce.

That's true, but not quite what I was looking for.  With two outstanding
nonces, now I have to keep track of which nonce went where and use the
right nonce to attempt to verify a response.  I also have to consider the
case where I send out two requests and get back two valid responses.
Presumably I take whichever one shows up first and validates, but are there
risks associated with rejecting the other one or not knowing whether it was
valid?  If I use predictable nonces, could that allow another party to
affect the outcome of the "race" between which response shows up first?
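
Concretely, the sort of pledge-side bookkeeping I had in mind, purely as an
illustration (the names are mine):

    # One unpredictable nonce per candidate registrar; a voucher is only
    # checked against the nonce sent on that same connection, and the first
    # response that matches and validates settles the race.
    import secrets

    class PendingVoucherRequests:
        def __init__(self):
            self._nonce_by_registrar = {}

        def new_request(self, registrar_id):
            nonce = secrets.token_bytes(16)   # not guessable by third parties
            self._nonce_by_registrar[registrar_id] = nonce
            return nonce

        def nonce_matches(self, registrar_id, voucher_nonce):
            # A "valid" voucher arriving on the wrong connection is rejected.
            return self._nonce_by_registrar.get(registrar_id) == voucher_nonce

        def settle(self):
            # Winner takes all; outstanding attempts are abandoned.
            self._nonce_by_registrar.clear()

That at least takes the predictable-nonce and crossed-wires cases off the
table; the two-valid-responses question remains.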

-Ben

> I thought we said that.  
> We say that in section 11, but I've added it to 4.1.
> 
>     >> I have dug around cabforum.org and root-servers.org for some
>     >> references on what a "well-run secure CA" should be
>     >> doing... surprisingly I didn't find an RFC, or one for DNSSEC root
>     >> operation. Did I miss them?  I thought that there was one.  I'm
>     >> writing some text, and I'll finish this email here, and post the
>     >> resulting text.
> 
>     > I think I had always assumed this was part of the CA/B forum baseline
>     > requirements (https://cabforum.org/baseline-requirements-documents/)
>     > but never actually looked. :-/ We could ask around if it's important
>     > (but I don't think there's an RFC).
> 
> I will save it for an operational document.
> 
> -- 
> ]               Never tell me the odds!                 | ipv6 mesh networks [
> ]   Michael Richardson, Sandelman Software Works        |    IoT architect   [
> ]     mcr@sandelman.ca  http://www.sandelman.ca/        |   ruby on rails    [
> 
>