Re: [Anima] Benjamin Kaduk's Discuss on draft-ietf-anima-bootstrapping-keyinfra-22: (with DISCUSS and COMMENT)

Benjamin Kaduk <kaduk@mit.edu> Wed, 14 August 2019 14:27 UTC

Date: Wed, 14 Aug 2019 09:27:38 -0500
From: Benjamin Kaduk <kaduk@mit.edu>
To: Michael Richardson <mcr+ietf@sandelman.ca>
Cc: The IESG <iesg@ietf.org>, draft-ietf-anima-bootstrapping-keyinfra@ietf.org, tte+ietf@cs.fau.de, anima@ietf.org, anima-chairs@ietf.org
Message-ID: <20190814142737.GV88236@kduck.mit.edu>
References: <156282301326.15131.7510532622479656237.idtracker@ietfa.amsl.com> <17440.1565636744@localhost>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <17440.1565636744@localhost>
User-Agent: Mutt/1.10.1 (2018-07-13)
Archived-At: <https://mailarchive.ietf.org/arch/msg/anima/9_XsPO3AzKoUMh3nbETy0n7967E>
Subject: Re: [Anima] Benjamin Kaduk's Discuss on draft-ietf-anima-bootstrapping-keyinfra-22: (with DISCUSS and COMMENT)

On Mon, Aug 12, 2019 at 03:05:44PM -0400, Michael Richardson wrote:
> 
> https://tinyurl.com/yylruorn contains a diff against -24.
> 
> Benjamin Kaduk via Datatracker <noreply@ietf.org> wrote:
>     > Section 5.8.1
> 
>     doc>    A log data file is returned consisting of all log entries associated
>     doc> with the device selected by the IDevID presented in the request.
>     doc> The audit log may be truncated of old or repeated values as explained
>     doc> below.  The returned data is in JSON format ([RFC7951]), and the
>     doc> Content-Type SHOULD be "application/json".  For example:
> 
>     > If RFC 7951 is to be used, I'd suggest "is JSON-encoded YANG data".
> 
> Typo, it should be 7159.

(Well, 8259, but I think we covered that already.)

>     doc>    This document specifies a simple log format as provided by the MASA
>     doc> service to the registrar.  This format could be improved by distributed
>     doc> consensus technologies that integrate vouchers with technologies such
>     doc> as block-chain or hash trees or optimized logging approaches.  Doing so
>     doc> is out of the scope of this document but is an anticipated improvement
>     doc> for future work.  As such, the registrar client SHOULD anticipate new
>     doc> kinds of responses, and SHOULD provide operator controls to indicate
>     doc> how to process unknown responses.
> 
>     > This would be a great place to talk about the "version" field that's
>     > otherwise ignored.
> 
> As in, what should occur if the "version" is not 1?

Exactly so!

> How about:
> 
>           <t>
>             A registrar that sees a version value greater than 1 indicates
>             an audit log format that has been enhanced with additional
>             information.   No information will be removed in future
>             versions; should an incompatible change be desired in the future,
>             then a new HTTP end point will be used.
>           </t>
>           
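To make the intended behavior concrete, here is a minimal sketch of how a
registrar might treat the "version" field under the proposed text: version 1
is understood, higher versions are accepted as supersets, and anything else is
rejected.  The function name and the returned flag are hypothetical, not part
of the draft.

```python
import json

SUPPORTED_VERSION = 1  # the only audit log version discussed in this thread


def parse_audit_log(body: str):
    """Parse an audit log response and report whether it is a newer version.

    Returns (log, is_newer).  Per the proposed text, versions above 1 only add
    information, so the version-1 fields remain usable in that case.
    """
    log = json.loads(body)
    version = log.get("version")
    # bool is a subclass of int in Python, so exclude it explicitly
    if isinstance(version, bool) or not isinstance(version, int) or version < 1:
        raise ValueError(f"unusable audit log version: {version!r}")
    return log, version > SUPPORTED_VERSION
```

An operator control could then decide whether is_newer merely produces a log
message or blocks further processing.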
> 
>     doc>    anticipated improvement for future work.  As such, the registrar
>     doc> client SHOULD anticipate new kinds of responses, and SHOULD provide
>     doc> operator controls to indicate how to process unknown responses.
> 
>     > Is "registrar client" intended to be both words or just one?
> 
> Probably not, removed.
> 
>     doc>    A registrar SHOULD use the log information to make an informed
>     doc> decision regarding the continued bootstrapping of the pledge.  The
> 
>     > I may be confused, but I thought the registrar was asking for the log
>     > after the voucher had already been issued.  Are these checks supposed to
>     > keep the registrar from forwarding the voucher to the pledge, or just
>     > as a check for future renewal operations?
> 
> The voucher can be returned to the Pledge immediately, as there is never
> any issue about whether the Pledge should join.  The MASA has already made
> that decision.
> 
> The Registrar could avoid passing the voucher on until after the audit log
> checks are done, or could do that concurrently.  What matters is that the
> audit log checks are done prior to the Registrar accepting an enroll
> request.    This is down in (what is now) Figure 4.

Thinking about it that way is very helpful; thank you.

> The Registrar may do additional checks of the audit log at later times,
> but I don't think we have any good advice on this yet.

Okay.

>     doc>    A relatively simple policy is to white list known (internal or
>     doc> external) domainIDs and to require all vouchers to have a nonce and/ or
>     doc> require that all nonceless vouchers be from a subset (e.g. only
>     doc> internal) domainIDs.  A simple action is to revoke any locally issued
> 
>     > nit: missing "of"
> 
> I don't know where the "of" that is missing goes.

"subset of" (though possibly split by the parenthetical)

> While investigating, I broke the big sentence up, and discovered that the
> 5209 reference was not coded as XML. Please clue me in what you mean...
> 
>           <t>
>             A relatively simple policy is to white list known (internal or
>             external) domainIDs.
>             To require all vouchers to have a nonce.
>             Alternatively to require that all nonceless vouchers be from a
>             subset (e.g. only internal) domainIDs.
>             If the policy is violated a simple action is to revoke any
>             locally issued credentials for the pledge in question or to
>             refuse to forward the voucher.  The Registrar MUST then refuse
>             any EST actions, and SHOULD inform a human via a log.
>             A registrar MAY be configured to ignore (i.e. override the above
>             policy) the 
>             history of the device but it is RECOMMENDED that this only be
>             configured if hardware assisted (i.e. TPM anchored) Network
>             Endpoint Assessment (NEA) <xref target="RFC5209" /> is supported.
>           </t>
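For what it's worth, this is how I read the reworked policy, as a rough
sketch.  It combines the three rules (known domainIDs only; optionally require
a nonce; otherwise allow nonceless vouchers only from the internal subset).
The LogEntry fields and parameter names are hypothetical stand-ins for
whatever the audit log actually carries.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class LogEntry:
    domain_id: str        # domainID recorded by the MASA for this voucher
    nonce: Optional[str]  # None models a nonceless voucher


def audit_log_acceptable(entries: Iterable[LogEntry],
                         known: Set[str],
                         internal: Set[str],
                         require_nonce: bool = False) -> bool:
    """Return True if every prior voucher satisfies the simple policy."""
    for entry in entries:
        if entry.domain_id not in known:
            return False  # voucher was issued to a domain we do not recognize
        if entry.nonce is None:
            # Nonceless vouchers: forbidden outright, or internal domains only
            if require_nonce or entry.domain_id not in internal:
                return False
    return True
```

A False result would correspond to the "policy is violated" branch: refuse to
forward the voucher or revoke local credentials, refuse EST actions, and log
for a human.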
> 
>     doc>    credentials for the pledge in question or to refuse to forward the
>     doc> voucher.  A registrar MAY be configured to ignore the history of the
> 
>     > "simple action" in the case that the registrar doesn't like the audit
>     > log results?
> 
> See changes above for better clarity.

And greater clarity is achieved; thanks.

>     >    device but it is RECOMMENDED that this only be configured if
>     > hardware assisted NEA [RFC5209] is supported.
> 
>     > Probably need to expand NEA.
> 
> done, see above.
> 
>     > Section 5.9
> 
>     doc>    Although EST allows clients to obtain multiple certificates by
>     doc> sending multiple CSR requests BRSKI mandates use of the CSR Attributes
>     doc> request and mandates that the registrar validate the CSR against the
>     doc> expected attributes.  This implies that client requests
> 
>     > Where is this requirement stated normatively?
> 
> I guess I never thought of it as a requirement that multiple certificates are
> not supported; it's a restriction because CSR Attributes is used.
> Where is use of CSR Attributes stated?  RFC7030 Section 4.5 has using CSR
> Attributes as a SHOULD.
> 
> I will note that the ACME Integration documents all seem to have missed
> the CSR Attributes part of the flow.
> 
>     Although EST allows clients to obtain multiple certificates by
> -   sending multiple CSR requests BRSKI mandates use of the CSR
> -   Attributes request and mandates that the registrar validate the CSR
> +   sending multiple CSR requests; BRSKI does not support this mechanism

nit: comma, not semicolon  (because the sentence starts with "although").

> +   directly.  This is because BRSKI pledges MUST use the CSR Attributes

(This may not need to be a 2119 MUST since we cite 7030.)

> +   request ([RFC7030] section 4.5).  The registrar MUST validate the CSR

but this change does address my concern, so thank you.

>     against the expected attributes.  This implies that client requests
>     will "look the same" and therefore result in a single logical
>     certificate being issued even if the client were to make multiple
> 
>     > Section 5.9.1
> 
>     doc>    This ensures that the pledge has the complete set of current CA
>     doc> certificates beyond the pinned-domain-cert (see Section 5.6.1 for a
>     doc> discussion of the limitations inherent in having a single certificate
>     doc> instead of a full CA Certificates response.)  Although these
> 
>     > I don't see such a discussion in the indicated section.
> 
> The 5.6 section got split, it should refer to 5.6.2 now.
> 
>     > Section 5.9.2
> 
>     doc>    To alleviate these operational difficulties, the pledge MUST request
>     doc> the EST "CSR Attributes" from the EST server and the EST server needs
>     doc> to be able to reply with the attributes necessary for use of the
>     doc> certificate in its intended protocols/services.  This approach allows
> 
>     > "intended" implies that the EST server has some knowledge of what the
>     > pledge is expected to be doing in the network, right?
> 
> Yes.  The ACP document is quite specific about the (rfc822Name) attributes to
> assign.  Certainly the attributes could include stuff like
> "ve608.core1.tor1.example.net" if the Registrar knew how this device was
> to be used, but more likely that would be set up afterwards.

Hmm, maybe later when we say "the local infrastructure (EST server) informs
the pledge of the proper fields to include in the generated CSR" we could
reiterate that the EST server has local configuration information to inform
this messaging, though it's probably not necessary.

>     doc>    To alleviate these operational difficulties, the pledge MUST request
>     doc> the EST "CSR Attributes" from the EST server and the EST server needs
>     doc> to be able to reply with the attributes necessary for use of the
>     doc> certificate in its intended protocols/services.  This approach allows
>     doc> for minimal CA integrations and instead the local infrastructure (EST
>     doc> server) informs the pledge of the proper fields to include in the
>     doc> generated CSR.  This approach is beneficial to automated bootstrapping
>     doc> in the widest number of environments.
> 
>     > This is convenient, but has some security considerations in that it
>     > implies that the validation policy on the CA is somewhat lax, since the
>     > EST server is expected to be doing most of the policy controls.  Thus,
>     > a compromised pledge/device could send a CSR with unauthorized fields
>     > and it is likely to be signed, allowing for some level of privilege
>     > escalation.  When the registrar acts as a proxy to the CA as well as
>     > its EST role, as described later, this risk is diminished.
> 
> I don't really understand.
> EST servers are Registration Authorities, and they have some kind of
> privileged access to the CA, and are mandated to check the CSR.
> I expected to find a statement to this effect in RFC7030, in section 4.2.1,
> but I don't see any particularly strong language.
> This seems like a quality of implementation issue in the Registrar.

The high-level intended workflow described here is roughly "(1) pledge asks
registrar for config; (2) pledge puts that config in a CSR, signs the CSR,
and sends the CSR to registrar; (3) registrar passes CSR to CA using
registrar's implicit authority".  We don't describe any crypto to check that
(2) happens as intended, as opposed to the pledge dishonestly claiming "oh,
and I'm a CA" or "I can provide all ACP services, even privileged ones", so
that has to be done by policy in the registrar, as you note.  I'm wary of
suggesting the workflow that relies on the registrar's implicit authority
at the CA without also noting the registrar's policy enforcement
obligations.  Though it's possible this is covered elsewhere and doesn't
need to be duplicated here.
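To illustrate the kind of policy enforcement I have in mind, here is a small
sketch in which the registrar compares what it offered in the CSR Attributes
response with what the pledge actually put in the CSR.  Real code would parse
PKCS#10; the plain dictionaries and field names below are simplifying
assumptions.

```python
from typing import Dict, List


def csr_violations(offered: Dict[str, str],
                   requested: Dict[str, str]) -> List[str]:
    """List the ways a CSR departs from the attributes the registrar offered.

    offered:   fields the registrar told the pledge to include
    requested: fields actually present in the pledge's CSR
    """
    problems = []
    for name, value in requested.items():
        if name not in offered:
            problems.append(f"unrequested field: {name}")
        elif offered[name] != value:
            problems.append(f"altered field: {name}")
    for name in offered:
        if name not in requested:
            problems.append(f"missing field: {name}")
    return problems
```

A pledge that adds something like a CA basicConstraints claim would show up as
an "unrequested field", and the registrar would decline to pass the CSR to the
CA.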

>     doc>    In networks using the BRSKI enrolled certificate to authenticate the
>     doc> ACP (Autonomic Control Plane), the EST attributes MUST include the "ACP
>     doc> information" field.  See [I-D.ietf-anima-autonomic-control-plane] for
>     doc> more details.
> 
>     > This MUST seems like it belongs in the ACP spec and need not be
>     > repeated here using normative language.
> 
> I will simplify this, and reference to section 6.1.2 of the ACP.
> 
>            <t>In networks using the BRSKI enrolled certificate to authenticate
> -          the ACP (Autonomic Control Plane), the EST attributes MUST include
> -          the "ACP information" field. See <xref target="I-D.ietf-anima-autonomic-control-plane" /> for more details.</t>
> +          the ACP (Autonomic Control Plane), the EST CSR attributes MUST include
> +          the ACP Domain Information Fields defined in <xref
> +          target="I-D.ietf-anima-autonomic-control-plane" /> section 6.1.2.
> +          </t>

(beating the dead horse, is this any different as "MUST include" vs.
"include"?)

>     > Section 5.9.4
> 
>     doc>    administrators concerning device lifecycle status.  This might
>     doc> include information concerning attempted bootstrapping messages seen by
>     doc> the client, MASA provides logs and status of credential enrollment.
>     doc> [RFC7030] assumes an end user and therefore does not
> 
>     > This looks like a comma splice.
> 
> fixed.
> 
>     doc>    In the case of a FAIL, the Reason string indicates why the most
>     doc> recent enrollment failed.  The SubjectKeyIdentifier field MUST be
>     doc> included if the enrollment attempt was for a keypair that is locally
> 
>     > We haven't talked about POSTing to a new status-report endpoint yet, so
>     > this comes out of the blue.  It would probably be worth adding above
>     > "SHOULD [start a new TLS handshake] and POST an enrollment status
>     > message".
> 
> So I'll re-order some paragraphs, and I see that there is an obsolete
> half-thought, which remained after removing the TLS re-negotiate option that
> TLS1.3 obsoletes. 
> 
>     >      "Status":TRUE /* TRUE=Success, FALSE=Fail"
> 
>     > [same note about JSON comments]
> 
>     >      "reason-context": "Additional information"
> 
>     > Is this supposed to be a string or a JSON object similar to
>     > /voucher_status?
> 
> Already fixed to correct JSON from other comments.
> 
>     doc>    This allows for clients that wish to minimize their crypto
>     doc> operations to simply POST this response without renegotiating the TLS
>     doc> session - at the cost of the server not being able to accurately verify
>     doc> that enrollment was truly successful.
> 
>     > I'd prefer to not have this sort of option available, but I can't
>     > disprove the claim of its utility, so I will not object if it stays.
> 
> Agreed, I've removed it.
> 
>     > Section 5.9.5
> 
>     > Is the idea that the initial BRSKI-EST-issued certificate would
>     > authenticate the client for the subsequent EST requests?
> 
> yes.
> 
>     > Section 7
> 
>     > If this is non-normative and will need to be fleshed out in a separate
>     > document, would an Appendix be more appropriate?
> 
> Section 9 and 10 refer back to this section in a normative fashion.

Er, wouldn't that make this section no longer non-normative?
(Not that I could find the references you're talking about, so a clue bat
is welcome.)

>     > Section 7.1
> 
>     doc>    Pledge: The pledge could be compromised and providing an attack
>     doc> vector for malware.  The entity is trusted to only imprint using secure
>     doc> methods described in this document.  Additional endpoint
> 
>     > How do "could be compromised" and "is trusted to only [...]" go
>     > together?
> 
> The IDevID could be safe in a TPM, on the other side of an intact kernel,
> but the device could be running a VM full of malware.  We can trust it to
> onboard, but not to operate correctly.
> 
>     doc>    Vendor Service, MASA: This form of manufacturer service is trusted
>     doc> to accurately log all claim attempts and to provide authoritative log
>     doc> information to registrars.  The MASA does not know which devices are
>     doc> associated with which domains.  These claims could be
> 
>     > I think this is maybe more of a "does not enforce" than "does not
>     > know", since the domainID ends up in the audit logs that the MASA
>     > holds.
> 
> Yes, but the domainID does not directly identify the Registrar by name.
> Assuming a database breach, what does the MASA know that it can reveal?

can reveal or be correlated with other sources of information.  The
domainID is derived from the public part of a certificate, which could well
be widely disseminated.  An attacker that compromises the MASA and retains
a presence can watch requests come in and backsolve from domainID to
certificate directly.
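A sketch of the correlation I am worried about, assuming the domainID is a
SHA-1 key identifier computed over the certificate's public key (in the style
of RFC 5280's subject key identifier).  The key bytes and labels are
placeholders.

```python
import hashlib
from typing import Dict, Iterable


def key_identifier(subject_public_key: bytes) -> str:
    """Hex SHA-1 digest of the public key bytes (assumed domainID derivation)."""
    return hashlib.sha1(subject_public_key).hexdigest()


def correlate(logged_domain_ids: Iterable[str],
              candidate_keys: Dict[str, bytes]) -> Dict[str, str]:
    """Map logged domainIDs back to the owners of publicly known keys."""
    index = {key_identifier(key): owner for owner, key in candidate_keys.items()}
    return {d: index[d] for d in logged_domain_ids if d in index}
```

Anyone holding the audit log and a collection of public certificates can run
this offline; no secret is needed.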

> The MASA could store more data, but it doesn't have to.
> 
>     >       Current text provides only for a trusted manufacturer.
> 
>     > nit: not a complete sentence.
> 
> I have no memory of what it means, so I've removed it.
> 
>     > Section 7.3
> 
>     doc>    A registrar can choose to accept devices using less secure methods.
>     doc> These methods are acceptable when low security models are needed, as
>     doc> the security decisions are being made by the local administrator, but
>     doc> they MUST NOT be the default behavior:
> 
>     > I'm having a hard time parsing "low security models"; the best I can
>     > come up with is "threat models where low security is adequate".
> 
>     doc>    Lower security modes chosen by the MASA service affect all device
>     doc> deployments unless bound to the specific device identities.  In which
> 
>     > Is this "unless the lower-security behavior is tied to specific device
>     > identities"?
> 
> Yes, changed.
> 
>     doc>    case these modes can be provided as additional features for specific
>     doc> customers.  The MASA service can choose to run in less secure modes
> 
>     > nit: This middle sentence is not a complete sentence.
> 
> Already fixed.
> 
>     doc>    1.  Not enforcing that a nonce is in the voucher.  This results in
>     doc> distribution of a voucher that never expires and in effect makes
> 
>     > A nonceless voucher can still include an expiration time ... it is just
>     > in practice possible for it to never expire, if the target pledge does
>     > not have an accurate clock.
> 
> Yes, that's correct.
> How many devices with RTCs survive 10 years in a warehouse with no power? :-)

The ones with a radioisotope thermoelectric generator? ;)
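More seriously, the point about expiration can be sketched as follows.  This
is only an illustration of the failure mode, with hypothetical names; it
assumes a pledge that skips the expiry check when it has no trustworthy clock.

```python
from datetime import datetime, timezone
from typing import Optional


def voucher_is_expired(expires_on: Optional[datetime],
                       pledge_now: datetime,
                       clock_is_trusted: bool) -> bool:
    """Decide whether the pledge treats a voucher as expired."""
    if expires_on is None:
        return False  # no expiration time was set at all
    if not clock_is_trusted:
        # Without an accurate clock the comparison is meaningless, so the
        # voucher is in practice never seen as expired.
        return False
    return pledge_now >= expires_on


# Example values: a voucher that expired long before the pledge powers up
expiry = datetime(2020, 1, 1, tzinfo=timezone.utc)
power_on = datetime(2030, 1, 1, tzinfo=timezone.utc)
```

With a trusted clock, voucher_is_expired(expiry, power_on, True) rejects the
voucher; with an untrusted one the same voucher is still accepted.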

> Section 7.4 (.1, and .2) have been reworked some more.
> 
> 
>     doc>        subsequent bootstrapping attempts.  That this occurred is
>     doc> captured in the log information so that the registrar can make
>     doc> appropriate security decisions when a pledge joins the Domain.  This is
>     doc> useful to support use cases where registrars might not be online during
>     doc> actual device deployment.  Because this results in
> 
>     > nit: I think that grammatically the "This" at the start of this
>     > sentence refers to the behavior described in the previous sentence (the
>     > availability of the log information) rather than the issuance of the
>     > nonceless voucher.
> 
> fixed.
> 
>     > Section 9
> 
>     doc>    The autonomic control plane that this document provides bootstrap
>     doc> for is typically a medium to large Internet Service Provider
>     doc> organization, or an equivalent Enterprise that has significant layer-3
>     doc> router connectivity.  (A network consisting of primarily layer-2
> 
>     > nit: "is used in" -- the ACP is not the entire organization!
> 
> The text doesn't have "is used in"

(Right, I was saying that you should add it.  But the reworking fixes the
nit, so this is all good.)

> But, I did find that paragraph awkward and rewrote it.
> 
>     > Section 10.1
> 
>     doc>    The MASA audit log includes a hash of the domainID for each
>     doc> Registrar a voucher has been issued to.  This information is closely
>     doc> related to the actual domain identity, especially when paired with the
>     doc> anti-DDoS authentication information the MASA might collect.  This
>     doc> could
> 
>     > I thought I remembered some discussion of collecting this information
>     > and thus what sort of information might be collected, but searching for
>     > "DDoS" in the document didn't find it.
> 
> In order to protect itself from DDoS attacks, it's better if the MASA
> can authenticate every connection.  With full supply chain integration, then
> it knows every customer, and this is easy.
> On the other hand, if it hasn't got full chain integration, then it may need
> to accept connections from any place with any client certificate.  So it
> will need some kind of anti-DDOS system.  section 11.1 speaks about this,
> although it uses the term DoS.  I've synchronized the terms and made
> a forward reference.

Ah, thanks for the pointer to 11.1.

>     doc>    There are a number of design choices that mitigate this risk.  The
>     doc> domain can maintain some privacy since it has not necessarily been
>     doc> authenticated and is not authoritatively bound to the supply chain.
> 
>     > Is this really "privacy" or just "semi-plausible deniability"?
> 
> It's a good question.
> A domain could use a new certificate for each device, could connect via some
> onion router.  Is that privacy, or "semi-plausible deniability"?

It might depend on the details of the certificates used, what CA issued
them (and its policies regarding accuracy and level of detail in
certificates issued therefrom), and whether the certificates are
used/exposed for any other purposes.  So, I don't insist on any specific
language, and was just sharing my thoughts about how it is possible to
think about this, in case it sparked any insight.

>     doc>    Additionally the domainID captures only the unauthenticated subject
>     doc> key identifier of the domain.  A privacy sensitive domain could
> 
>     > It's interesting to see "unauthenticated" used here, since the domainID
>     > is deterministically generated from the pinned-domain-cert that is
>     > included in the voucher and inserted into the pledge's trust store.  So
>     > in a sense it is a committal by "the domain" to tie that domain cert to
>     > that device or have the device otherwise be unusable.
> 
> Yes, so a Registrar can present a new self-signed certificate each time.
> Sure, it has to keep that around, as it's the link for that pledge.
> 
>     > The subsequent
>     > text here seems to be suggesting that instead of having a single root
>     > CA cert used by all devices in the domain, distinct certs would be used
>     > for each device, thus attempting to remove an ability to associate
>     > devices with each other via joint domain membership.  One might imagine
>     > doing this by entering intermediate CAs (or even leafs?) into the
>     > pinned-domain-cert field, but if those certs ever become visible (e.g.,
>     > via certificate transparency), then the domainIDs can be independently
>     > computed and associated to the MASA audit log, and the issuer chain
>     > used to recorrelate the devices by domainID.  So to fully protect
>     > privacy, the per-device pinned-domain-certs would need to be "root"s
>     > (i.e., self-issued), which greatly increases the manageability
>     > complexity and is in effect counter to the goals of ANIMA and BRSKI.
> 
> Agreed.
> If you have suggestions on other ways to mitigate, I'm all ears.

My tentative conclusion so far has been that robust mitigation is going to
be operationally expensive, to the point that I personally would make the
tradeoff of documenting risk and not trying to do a whole lot of
mitigation.  So no brilliant ideas here, unfortunately :-/

>     > Section 10.2
> 
>     doc>    While the contents of the signed part of the pledge voucher request
>     doc> can not be changed, they are not encrypted at the registrar.  The
>     doc> ability to audit the messages by the owner of the network prevents
>     doc> exfiltration of data by a nefarious pledge.  The contents of an
> 
>     > I think "prevents" is too strong -- steganography and concealed
>     > channels are really hard to completely defend against.  "Gives a
>     > mechanism to defend against" would be more accurate, in my opinion.
> 
> Okay.
> 
>     doc>    The above situation is to be distinguished from a residential/
>     doc> individual person who registers a device from a manufacturer: that an
>     doc> enterprise/ISP purchases routing products is hardly worth mentioning.
>     doc> Deviations would, however, be notable.
> 
>     > Deviations in what sense?
> 
> After buying only Cisco equipment (like ASR 9000) [observed by number of MASA
> connections after each POP turn up], ISP example.net suddenly
> has started communicating with Juniper's MASA (for MX40s..).

So like "deviations from a historical trend" or "deviations from an
established baseline"?

>     > Section 10.3
> 
>     doc>    4.  There is a fourth case, if the manufacturer is providing
>     doc> protection against stolen devices.  The manufacturer then has a
>     doc> responsibility to protect the legitimate owner against fraudulent
>     doc> claims that the equipment was stolen.  Such a claim would cause the
>     doc> manufacturer to refuse to issue a new voucher.  Should the device go
>     doc> through a deep factory reset (for instance, replacement of a damaged
>     doc> main board component), the device would not bootstrap.
> 
>     > I'm not sure I understand this scenario -- is it talking about where a
>     > third party makes a false theft report in the hopes that the real owner
>     > will have to do a deep reset and then have the device fail to bootstrap
>     > because of the reported theft?
> 
> Yes.

I think having "In the absence of such manufacturer protection, such a
claim would cause [...]" would have helped me get there.

>     > Section 11
> 
>     doc>    To facilitate logging and administrative oversight, in addition to
>     doc> triggering Registration verification of MASA logs, the pledge reports
> 
>     > I'm not sure if "Registration verification" is a typo or not.
> 
> changed to: _Registrar verification_
> 
>     doc>    registrar.  This is mandated anyway because of the operational
>     doc> benefits of an informed administrator in cases where the failure is
>     doc> indicative of a problem.  The registrar is RECOMMENDED to verify MASA
> 
>     > I'd also expect some comment about the limited value of the additional
>     > information to an attacker in the context where the attacker already
>     > would know [other information].
> 
> I'm not sure what other "other information" is yet, so I don't know
> how to fill that in.

IIUC, this text is predicated on a "possibly malicious registrar" that
a pledge tries, and fails, to register through.  So that registrar is
already in a position where the pledge would try to use it -- in the ACP
case, that probably means fairly physically proximate, and maybe implies a
link-local connection.  In the non-ACP case, maybe it also implies
proximity?  I don't know if we can get physical vs. network proximity
implied by a pledge trying to use a registrar, but maybe sometimes.  My
implied question is basically "what else would an attacker have to do to
get a pledge to try to use it as a registrar, and what does the attacker
already have access to by the time it gets that far?"  So we can set the
information learned from voucher parsing status reports in the context of
what else the attacker would already know.  But I don't have the best
picture on the deployment scenarios here, so I can't answer the question
myself :)

>     doc>    To facilitate truly limited clients EST RFC7030 section 3.3.2
>     doc> requirements that the client MUST support a client authentication model
>     doc> have been reduced in Section 7 to a statement that the registrar "MAY"
>     doc> choose to accept devices that fail cryptographic authentication.  This
>     doc> reflects current (poor) practices in shipping
> 
>     > But section 7 is non-normative!
> 
> okay, but now section 9 refers to parts of it normatively.
> 
>     doc>    devices without a cryptographic identity that are NOT RECOMMENDED.
> 
>     > I guess if we really wanted to disrecommend this practice we could
>     > split it out into a separate document that profiles core BRSKI for such
>     > usage.
> 
> yes.
> 
>     > Section 11.1
> 
>     doc>    this might be an issue during disaster recovery.  This risk can be
>     doc> mitigated by Registrars that request and maintain long term copies of
>     doc> "nonceless" vouchers.  In that way they are guaranteed to be able to
>     doc> bootstrap their devices.
> 
>     > This, of course, comes with a different risk of what is something like
>     > a long-term credential existing that needs to be protected and stored.
> 
> I partially agree.
> Long-term nonceless vouchers still pin a specific domainID.
> So they need to be available, but they don't need to be private.

It's not entirely clear to me that it's okay to make them totally public
(but maybe I am missing something).  That is, once the voucher is
public, anyone who can get next to the device can re-bootstrap it using
that voucher, which possibly gets it into a configuration that's not usable
in its current location, and (less likely) maybe it's hard for the real
owner to get the correct configuration back (if the original MASA is gone
or whatever).  So the voucher is not something I'd want to just put on a
public web site.  But I guess maybe you don't have to protect it to the
same extent that you do your crypto keys.

>     doc>    The issuance of nonceless vouchers themselves creates a security
>     doc> concern.  If the Registrar of a previous domain can intercept protocol
>     doc> communications then it can use a previously issued nonceless voucher to
>     doc> establish management control of a pledge device even after having sold
>     doc> it.  This risk is mitigated by recording the issuance of such vouchers
>     doc> in the MASA audit log that is verified by the subsequent Registrar and
>     doc> by Pledges only bootstrapping when in a factory default state.  This
>     doc> reflects a balance between enabling MASA independence during future
>     doc> bootstrapping and the security of bootstrapping itself.  Registrar
>     doc> control over requesting and auditing nonceless vouchers allows device
>     doc> owners to choose an appropriate balance.
> 
>     > I would expect some discussion here about nonceless vouchers
>     > with/without expiration times.
> 
> Added above.
> 
>     > I also wonder whether the owner or registrar should expect to be doing
>     > some periodic MASA audit log checking, akin to a CT auditor or monitor.
> 
> Yes, but we think that falls into quality of implementation.
> It's not something worth doing for every Registrar.

Sure.
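
A minimal sketch of what such a periodic check might look like, assuming
the registrar has already fetched and parsed the audit log into a list of
event dictionaries.  The field names "domainID" and "nonce" and the
function name are assumptions made for illustration, not taken from the
draft text quoted here.

```python
def suspicious_events(events, our_domain_id):
    """Return audit-log events an owner may want to look at more closely.

    An event is flagged when it names a domain other than ours, or when it
    records a voucher issued without a nonce (a long-lived voucher).
    The event layout is hypothetical.
    """
    flagged = []
    for ev in events:
        foreign = ev.get("domainID") != our_domain_id
        nonceless = not ev.get("nonce")
        if foreign or nonceless:
            flagged.append(ev)
    return flagged


if __name__ == "__main__":
    log = [
        {"domainID": "ours", "nonce": "abc123"},
        {"domainID": "ours", "nonce": None},
        {"domainID": "someone-else", "nonce": "def456"},
    ]
    for ev in suspicious_events(log, "ours"):
        print("review:", ev)
```

Whether a flagged entry is actually a problem is a policy decision for the
owner; a nonceless voucher for our own domain may be entirely intentional.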

>     doc>    of specific pledge devices, helps to mitigate this.  Pledge
>     doc> signatures on the pledge voucher-request, as forwarded by the registrar
>     doc> in the prior-signed-voucher-request field of the registrar
>     doc> voucher-request, significantly reduce this risk by ensuring the MASA
>     doc> can confirm proximity between the pledge and the registrar making the
>     doc> request.  This mechanism is optional to allow for constrained devices.
>     doc> Supply chain integration ("know your customer") is an
> 
>     > And so the protection is only available when all devices served by the
>     > MASA are known to produce signed voucher-requests.
> 
> Which is now mandatory.

Yay

>     doc>    additional step that MASA providers and device vendors can explore.
> 
>     > In terms of adding some more crypto to the BRSKI-MASA flow that would
>     > prevent unauthenticated DoS?
> 
> If every connection is from a known customer, then one can reject the DoS
> at the TLS handshake step.
> 
>     > Section 11.2
> 
>     > I appreciate having this discussion present; thanks!
> 
>     doc>    The fake registrar (Rm) can obtain a voucher signed by the MASA
>     doc> either directly or through arbitrary intermediaries.  Assuming that
> 
>     > I'm not sure what sort of intermediary this is thinking about.
> 
> Join Proxies connected to the systems involved.
> 
>     doc>    This pledge voucher-request would be 'stale' in that it has a nonce
>     doc> that no longer matches the internal state of the pledge.  In order to
>     doc> successfully use any resulting voucher the Rm would need to remove the
>     doc> stale nonce or anticipate the pledge's future nonce state.  Reducing
>     doc> the possibility of this is why the pledge is mandated to generate a
>     doc> strong random or pseudo-random number nonce.
> 
>     > This gets a bit more complicated to reason about when we assume that a
>     > pledge will have multiple voucher requests out in parallel, instead of
>     > doing them in sequence and timing out (so that there is literally only
>     > one valid nonce at a given time)...
> 
> Agreed.
> But, we think that doing things serially has too much potential to run
> into head-of-queue challenges; the risk is that if devices don't onboard
> relatively quickly, then BRSKI will get turned off, not used, or the
> vendor might provide some "backdoor"

That's a fair concern, but in and of itself is not an excuse to skip
reasoning through the risks of the parallel workflow.  How much effort has
already been spent on that reasoning?  For example, one might
want to require that the pledge track which nonce belongs to the voucher
request submitted through which candidate registrar, but I didn't work
through whether that actually will defend against any attack scenarios.
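
To make the idea concrete, here is a rough sketch of the bookkeeping,
assuming the pledge generates a fresh nonce per candidate registrar and
accepts a voucher only if its nonce matches the one sent through that same
registrar.  The class name, method names, and registrar identifiers are
hypothetical, and this has not been analyzed against any specific attack.

```python
import secrets


class NonceTracker:
    """Hypothetical pledge-side state: one outstanding nonce per registrar."""

    def __init__(self):
        self._pending = {}  # registrar identifier -> nonce sent via it

    def new_request(self, registrar_id: str) -> bytes:
        # Fresh random nonce for each voucher-request, bound to the
        # candidate registrar it is submitted through.
        nonce = secrets.token_bytes(16)
        self._pending[registrar_id] = nonce
        return nonce

    def accept_voucher(self, registrar_id: str, voucher_nonce: bytes) -> bool:
        expected = self._pending.get(registrar_id)
        if expected is None:
            return False
        if not secrets.compare_digest(expected, voucher_nonce):
            return False
        # A voucher was accepted: every other outstanding nonce is now stale.
        self._pending.clear()
        return True
```

A voucher that arrives through registrar A but carries the nonce sent via
registrar B is rejected, and once one voucher is accepted all other
in-flight requests are invalidated.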

>     doc>    Additionally, in order to successfully use the resulting voucher the
>     doc> Rm would have to attack the pledge and return it to a bootstrapping
>     doc> enabled state.  This would require wiping the pledge of current
> 
>     > .... and I think there is a different attack if the Rm is in a position
>     > to delay or drop network traffic between the pledge and the intended
>     > registrar, to cause Rm's voucher to be delivered first even though it
>     > is generated after the intended registrar's authorization process.  The
>     > intended registrar would need to require reports on voucher processing
>     > status (or investigate their absence) in order to detect such a case.
> 
> Yes, that's definitely a problem.  We had a lot of difficulty constructing
> this attack, btw.
> 
>     doc>    o Retrieval and examination of MASA log information upon the
>     doc> occurrence of any such unexpected events.  Rm will be listed in the logs
>     doc> along with nonce information for analysis.
> 
>     > How strongly guaranteed is it that a given device will only have a
>     > single specific MASA that it trusts to issue vouchers (and thus limit
>     > the scope of monitoring needed by the owner)?
> 
> There is no guarantee; a device could have multiple IDevID.
> However, a voucher-request from one wouldn't be useable for another.
> It would require malicious code in the pledge, I think.

I'm willing to consider malicious code in the pledge as part of my threat
model :)

>     > Section 11.3
> 
>     doc>    o A Trust-On-First-Use (TOFU) mechanism.  A human would be queried
>     doc> upon seeing a manufacturer's trust anchor for the first time, and then
>     doc> the trust anchor would be installed to the trusted store.  There are
>     doc> risks with this; even if the key to name is validated using something
>     doc> like the WebPKI, there remains the possibility
> 
>     > nit: is this "key to name mapping"?
> 
> fixed.
> 
>     > Section 11.4
> 
>     > It is not entirely clear to me whether device manufacturers are set up
>     > with incentives to maintain a well-run secure CA with strong hardware
>     > protections on the offline signing key for the root CA, cycling through
>     > various levels of intermediates, etc., that CAs in the Web PKI do
>     > today.  If the manufacturer uses a less stringent process, that would
>     > leave the manufacturer's key as a more tempting attack surface, and it
>     > may be worth some discussion here about what damage could be done with
>     > a compromised MASA signing key.  E.g., would an attack require
>     > restoring devices to factory defaults or otherwise waiting for natural
>     > bootstrapping events to occur?  Would the attacker need to be on-path?
>     > Etc.
> 
> I have dug around cabforum.org and root-servers.org for some references on
> what a "well-run secure CA" should be doing... surprisingly I didn't find an
> RFC, or one for DNSSEC root operation. Did I miss them?  I thought that there
> was one.  I'm writing some text, and I'll finish this email here, and post
> the resulting text.

I think I had always assumed this was part of the CA/B Forum baseline
requirements (https://cabforum.org/baseline-requirements-documents/) but
never actually looked. :-/
We could ask around if it's important (but I don't think there's an RFC).

-Ben