Re: [Acme] ACME Renewal Information (ARI) API Proposal

Andrew Ayer <agwa@andrewayer.name> Tue, 24 March 2020 01:03 UTC

Date: Mon, 23 Mar 2020 21:03:50 -0400
From: Andrew Ayer <agwa@andrewayer.name>
To: Roland Shoemaker <roland@letsencrypt.org>
Cc: IETF ACME <acme@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/acme/szDHa5z6qRiAtmeC2ohrePPoBjU>

I think it would be really useful if the renewal information could be
discovered and retrieved by third parties.  This would permit monitoring
services to raise an alarm if a certificate is going to be revoked soon,
just in case the automation fails to renew it (or there is no automation,
which is unfortunately common).  For example, during the recent Let's
Encrypt revocation incident, Cert Spotter used the list of serial numbers
published by Let's Encrypt to alert the users of Cert Spotter who were
affected.

Monitoring services would also be able to detect if a certificate is
renewing earlier or later than recommended by the CA, which would help
site operators tune their integrations for optimal load on the CA.

Obviously, OCSP would be the best way to make this information
discoverable to third parties, as the OCSP URL is right there in the
certificate.  But failing that, it should at least be discoverable
given the ACME directory URL, so as long as you have a map from issuer
to directory URL you can find it.

Regards,
Andrew


On Mon, 23 Mar 2020 15:00:31 -0700
Roland Shoemaker <roland@letsencrypt.org> wrote:

> Hey all,
> 
> At Let's Encrypt we've been thinking about designing some kind of
> renewal information API as an extension to ACME for a while now.
> Recent events have brought this back to the forefront of our minds.
> Below I've attached a write-up detailing our proposal. I'd really
> like to get input on it, especially
> from those working on ACME clients as this work mostly represents
> thoughts from ACME server developers, and as such may not accurately
> capture issues faced by clients.
> 
> If the working group is interested in this as a work product I'll
> spend some time developing an ID based on this outline.
> 
> Thanks!
> Roland
> 
> ----
> 
> This proposal aims to address two issues that affect both ACME and the
> wider web PKI.
> 
> The first of these issues is how a CA should inform subscribers of a
> CA- or third-party-initiated certificate revocation event. In most
> cases this is done via email, or other out-of-band notification
> channels, which may be appropriate for CAs that rely on manual
> processes but seems clunky for ACME based CAs which heavily rely on
> automation. For automated ACME clients the probability that a user
> will act upon a revocation notice (or even receive one if they do not
> provide an account contact) is lower than for manually maintained
> certificates, leading to the possibility of serving a revoked
> certificate until their next renewal window. In cases where the CA
> has a buffer before performing the revocation, being able to inform
> the client of this impending event would allow for seamless renewal
> before the revocation took place, and in the case where the
> revocation has already taken place this would help to significantly
> reduce the impact to the subscriber.
> 
> The second issue is how ACME clients should determine when to renew a
> regular non-revoked certificate. Most clients take one of two routes.
> They are either manually configured to renew at a specific interval
> (e.g. via `cron` or similar) or parse the issued certificate to
> determine the expiration date and choose some date preceding it to
> attempt renewal. While the latter option is better than the former,
> each can cause issues for both the client and the issuing CA. The
> first option causes significant barriers for the issuing CA changing
> certificate lifetimes, as the static renewal window makes assumptions
> about that lifetime that must be manually updated. Both options can
> also cause load clustering for the issuing CA. Being able to indicate
> to the client a period in which the issuing CA suggests renewal would
> allow dynamic changes to the certificate lifetime and smearing of
> load.
> 
> ## ACME vs. OCSP
> 
> The two obvious options for transmitting renewal suggestions to the
> subscriber are via an extension to the ACME protocol, or an extension
> to the OCSP protocol. Each option has advantages and disadvantages.
> 
> For OCSP one of the obvious advantages is that the protocol is already
> designed to carry revocation information, which some clients already
> poll for. An extension could be added to OCSP responses containing
> 'recommended renewal' windows and/or indicators of impending
> revocation. Using OCSP also has the advantage of existing serving and
> caching infrastructure.
> 
> The disadvantages of using OCSP mainly revolve around usage of the
> protocol by relying parties. In order to avoid intolerance for new
> OCSP extensions we would likely want to require clients to indicate
> that they want this information, via either an OCSP request extension
> or an HTTP header, which could increase caching requirements for the
> issuing CA and possibly require changes to their existing caching
> infrastructure.
> 
> For ACME the most obvious advantage is that every ACME client already
> understands the protocol, and should have a relatively easy path to
> being extended to understand a new API endpoint. Using ACME also
> allows a more descriptive, extensible API, rather than requiring us
> to stuff more information into a strictly defined ASN.1 extension.
> Using ACME would allow for requesting information on multiple
> certificates in a single request, which while technically possible
> via OCSP is in reality rarely supported.
> 
> The disadvantages of using ACME mainly revolve around increased load
> on the ACME API for the issuing CA. ACME currently has no endpoints
> that are designed to be routinely polled; adding one could introduce
> a significant load vector which infrastructure has not been designed
> for. Another disadvantage is that if the API were authenticated it
> wouldn't be possible to viably cache the renewal information at a CDN
> layer.
> 
> On balance it seems like ACME is the better choice for this API.
> 
> ## Push vs. Pull
> 
> The CA could either push information to ACME clients, for instance via
> webhook, or it could rely on clients polling for information.
> 
> The push method is challenging because many ACME clients run behind
> firewalls or don't have full access to provide external-facing
> services. For instance, an ACME client might only have the ability to
> provision files under /.well-known/acme-challenge/, or it might only
> have access to modify DNS records.
> 
> The pull method, on the other hand, is straightforward. ACME clients,
> by necessity, need to send HTTPS requests to the CA. They can use
> that same channel to poll periodically.
> 
> The disadvantage of polling is that it provides less timely results
> than pushing. The most relevant constraint is the Baseline
> Requirement that CAs must revoke within 24 hours on key compromise,
> or when validation information "cannot be relied on." Polling must be
> frequent enough that the ACME client will receive notification within
> this 24-hour window, with enough remaining time for manual escalation
> if the automated client fails to act. Polling on a 12-hour interval
> should provide this.
> 
> ## Cacheability
> 
> An important question to answer is whether the results of this API
> need to be cacheable, and if so what level of cacheability they
> should have. One reason for designing the API around cacheability
> would be high request load from repeated requests. The set of users
> making repeated identical requests is likely to be small, and these
> requests are not likely to be made rapidly, suggesting that the API
> doesn't need to be highly cacheable. That said, given that the
> information returned by the API isn't likely to be dynamic (for
> instance, over the lifetime of the certificate it is unlikely to
> change, barring a revocation event), it seems likely that the
> issuing CA would like some way to cache the results in order to
> reduce unnecessary resource usage.
> 
> ## An OCSP-based design (rejected)
> 
> Here we'll sketch out an OCSP-based design for contrast with the
> design proposed below.
> 
> OCSP is frequently fetched by Relying Parties (RPs). We do not want to
> increase the bandwidth usage for normal RP fetches, since that would
> worsen performance for many normal web browsing requests. Also, when
> ACME clients poll, they will want different caching semantics than
> RPs. CAs will want ACME clients to get fresh information about every
> 12 hours, while OCSP responses are commonly cacheable up to their
> NextUpdate, which according to the Microsoft Root Program can be up
> to 7 days after ThisUpdate. While CAs could shorten their NextUpdate
> interval to accommodate ACME clients, this would be an unnecessary
> coupling of concerns.
> 
> Under this proposal, ACME clients that poll OCSP for renewal
> information MUST add an HTTP Header, "ACME-Renewal: 1" to their
> requests. CAs that use CDNs to serve OCSP responses MUST treat the
> ACME-Renewal header as part of their cache key, so that responses to
> ACME clients can have a different Cache-Control: max-age than those
> sent to RPs.
> 
> In the normal case, when no renewal is needed soon, the OCSP response
> will be unchanged. For the "renewal needed soon" case, we have two
> choices to convey that information: An OCSP extension, or an HTTP
> header. The OCSP extension has the advantage that it's signed, but
> has the disadvantage that it requires extra signatures from a CA's
> HSM, at a time when the HSM may already be burdened by signing bulk
> revocation responses.
> 
> In the case where a CA wants an ACME client to renew a certificate,
> the CA responds to all requests that have "ACME-Renewal: 1" in the
> header with a response that has the header "ACME-Renewal:
> renew-by=<datetime>; key-rotate=<true/false>". The ACME client then
> attempts renewal by the specified datetime.
> 
> In both cases the CA MUST include the "Vary" header in its response,
> and must include "ACME-Renewal" among the header names listed.
> 
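For illustration only, here is a minimal Python sketch of how a client
might parse the response header described above. The proposal does not
define the <datetime> syntax, so an RFC 3339 timestamp is assumed here,
and the function name parse_acme_renewal is hypothetical.

```python
from datetime import datetime


def parse_acme_renewal(header_value):
    """Parse 'renew-by=<datetime>; key-rotate=<true/false>'.

    Assumption: <datetime> is an RFC 3339 timestamp; the proposal
    leaves the exact syntax unspecified.
    """
    params = {}
    for part in header_value.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep:  # skip fragments that are not name=value pairs
            params[name.strip().lower()] = value.strip()
    # fromisoformat() only accepts a trailing "Z" on newer Pythons,
    # so normalize it to an explicit UTC offset first.
    renew_by = datetime.fromisoformat(params["renew-by"].replace("Z", "+00:00"))
    key_rotate = params.get("key-rotate", "false").lower() == "true"
    return renew_by, key_rotate
```

A missing key-rotate parameter is treated as false in this sketch; that
default is a choice made here, not something the proposal states.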
> Advantages of this proposal:
> 
> - It does not require a discovery mechanism for ACME clients to find
>   out where to check the status of a certificate; the OCSP URL is
>   already available in the certificate itself.
> - ACME clients can also check the revocation status of the OCSP
>   response. For CAs that don't support renewal notifications, these
>   clients could trigger renewal immediately on noticing a certificate
>   was revoked.
> 
> Disadvantages of this proposal:
> 
> - It combines two different types of requests with different caching
>   at a single URL, inviting subtle mistakes with cache keys.
> - Because OCSP URLs embedded in certificates necessarily use HTTP,
>   the response triggering renewal is unauthenticated. A MITM attacker
>   could use this to trigger early certificate renewal.
> 
> We reject this design.
> 
> ## Proposed API
> 
> Here we propose a roughly sketched out ACME API extension, taking into
> account the topics discussed above.
> 
> Conformant ACME servers should include a new key in the JSON objects
> for finalized orders with the key "renewalInformation". The value of
> this field should contain a unique URL from which renewal information
> can be retrieved. To request renewal information, conforming ACME
> clients should make a GET request to this URL.
> 
> The ACME server should respond to a request with a JSON object
> containing renewal hints for the associated certificate.
> 
> {
>     "suggestedRenewalWindow": {
>         "start": "...",
>         "end": "..."
>     },
>     "keyRotate": true
> }
> 
> The fields of the renewal information object are as follows:
> 
> suggestedRenewalWindow (object, required): A JSON object containing
> two strings, "start" and "end", which indicate the window in which
> the CA recommends renewing the certificate. Conformant ACME clients
> should pick a random time within this window at which to renew the
> certificate. If this window is in the past, conforming clients SHOULD
> immediately attempt to renew the certificate.
> 
> keyRotate (boolean, optional): A boolean indicating whether the ACME
> server requires the renewed certificate to use a new key pair.
> 
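A minimal Python sketch of how a client might choose its renewal time
from such a response. The proposal leaves the timestamp format as
"...", so RFC 3339 strings are assumed; the names parse_time and
pick_renewal_time are hypothetical, and clamping the start of a partly
elapsed window to the current time is a choice made for this sketch.

```python
import random
from datetime import datetime, timedelta, timezone


def parse_time(value):
    # Assumption: timestamps are RFC 3339 strings such as
    # "2020-04-01T00:00:00Z"; the proposal does not fix the format.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def pick_renewal_time(info, now=None, rng=random):
    """Pick a random renewal time inside suggestedRenewalWindow."""
    now = now or datetime.now(timezone.utc)
    window = info["suggestedRenewalWindow"]
    start = parse_time(window["start"])
    end = parse_time(window["end"])
    if end <= now:
        # The whole window is in the past: renew immediately.
        return now
    # Assumption: if the window has already opened, only the
    # remaining part of it is considered.
    start = max(start, now)
    offset = rng.uniform(0, (end - start).total_seconds())
    return start + timedelta(seconds=offset)
```

Choosing uniformly within the window is what spreads renewals out and
avoids the load clustering described earlier in the proposal.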
> The HTTP response should contain a Retry-After header indicating the
> polling interval that the ACME server recommends. Conforming ACME
> clients SHOULD use this value to determine their polling schedule,
> using the returned date as a lower bound for requesting information
> again, rather than using a fixed interval.
> 
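A minimal Python sketch of how a client might turn Retry-After into a
polling schedule. Retry-After may carry either a number of seconds or
an HTTP date, so both forms are handled. The function name
next_poll_time and the 12-hour fallback (taken from the polling
discussion in the proposal) are assumptions of this sketch.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Assumption: fall back to the 12-hour interval discussed in the
# proposal when the header is missing or unparseable.
DEFAULT_INTERVAL = timedelta(hours=12)


def next_poll_time(retry_after, now):
    """Return the earliest time at which to poll again."""
    if retry_after is None:
        return now + DEFAULT_INTERVAL
    value = retry_after.strip()
    if value.isdigit():
        # Delay-seconds form, e.g. "43200".
        return now + timedelta(seconds=int(value))
    try:
        # HTTP-date form, e.g. "Wed, 25 Mar 2020 01:00:00 GMT".
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return now + DEFAULT_INTERVAL
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    # The returned time is a lower bound; never schedule in the past.
    return max(when, now)
```

The result is treated as a lower bound only: a client is free to poll
later than this, for example to add its own jitter.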
> This API is explicitly unauthenticated, and does not use the ACME
> POST-as-GET scheme, as none of the information used by this API is
> considered confidential.
> 
> Conforming ACME servers may construct the renewal URLs included in
> order objects in any fashion they wish as long as the URL is stable
> for the lifetime of the certificate. Conforming clients should store
> this URL locally so that the ACME server does not need to be queried
> in order to learn the URL as the server may delete, or otherwise make
> unavailable, the related order object while the certificate is still
> valid.
> 
> ### Discoverability & URL Construction
> 
> Determining how the ACME server offers renewal information, and how
> the ACME client discovers this information, is a big question. We've
> proposed one design above, but acknowledge that there are trade-offs
> in our design which may make more sense to ACME server implementers
> than ACME client implementers. This section details those trade-offs
> with our proposed design, and another initial design we rejected.
> 
> The design proposed uses a static URL, the format of which is not
> specified. These URLs are provided via the order object, and as they
> have no specified structure, cannot be derived from the certificate
> itself. This means that clients must store the URL locally in order
> to access the API or access the order to learn the URL, although as
> orders may expire, or become inaccessible during the lifetime of the
> certificate, this is not an ideal approach. This also means that ACME
> servers must continue to serve this specific URL for the lifetime of
> the certificate and cannot dynamically change where they serve this
> information from.
> 
> Our initial design specified the construction of the URL, using a
> directory resource to point to the API endpoint and the SHA256 hash
> of the certificate as the token. The upside of this is that it would
> allow clients to construct the URL without any required local state
> other than the certificate itself. It would also allow the ACME
> server to change where it was serving the API endpoint from
> dynamically, as the client would need to query the directory to learn
> the first portion of the URL to append their hash to. The main
> downside here is that it requires clients to make two requests each
> time they want to access the API, one to the directory endpoint, and
> then another to the API endpoint. Depending on the design of the ACME
> server this could cause a significant increase in load, specifically
> to the directory endpoint. Another downside is that this requires
> that we specify the construction of the token beyond simply being
> unique, which adds complexity to the specification.
> 
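A minimal Python sketch of the URL construction in that initial design.
The proposal only says the SHA256 hash of the certificate is used as
the token; hashing the DER encoding, hex-encoding the digest, and
appending it as the final path segment are all assumptions here, as is
the function name renewal_info_url.

```python
import hashlib


def renewal_info_url(base_url, cert_der):
    """Build a renewal information URL from a directory-provided base.

    Assumptions: the token is the lowercase hex SHA-256 digest of the
    DER-encoded certificate, appended as the last path segment.
    """
    token = hashlib.sha256(cert_der).hexdigest()
    return base_url.rstrip("/") + "/" + token
```

For example, with a hypothetical directory entry of
https://acme.example/renewal-info/, the client needs nothing beyond
the certificate bytes to compute the full URL.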
> We could possibly merge these two designs, such that the specification
> specifies how to construct the URL, and provides a directory entry,
> but RECOMMENDS that the ACME client store this URL locally in order
> to reduce load on the ACME server. In the case where the client makes
> a request and receives a 404, for instance because the server has
> changed where it serves the API endpoint from, it would then re-query
> the directory in order to reconstruct the URL. This would provide the
> benefits of both designs while inducing lower load on the ACME
> server, but would require a somewhat more complex client
> design.
> 
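A minimal Python sketch of the client logic for that merged design:
use the locally stored URL, and only re-query the directory after a
404. The callables get_directory and http_get are hypothetical
stand-ins for the client's own HTTP code, and the hex SHA-256 token is
an assumption, since the proposal does not fix the encoding.

```python
import hashlib


def fetch_renewal_info(cert_der, cache, get_directory, http_get):
    """Fetch renewal info, rebuilding the cached URL after a 404.

    cache:         dict used as local storage for the URL
    get_directory: callable returning the base URL from the directory
    http_get:      callable taking a URL, returning (status, body)
    """
    token = hashlib.sha256(cert_der).hexdigest()  # assumed encoding
    if "url" not in cache:
        cache["url"] = get_directory().rstrip("/") + "/" + token
    status, body = http_get(cache["url"])
    if status == 404:
        # The server may have moved the endpoint: rebuild the URL
        # from a fresh directory lookup and retry once.
        cache["url"] = get_directory().rstrip("/") + "/" + token
        status, body = http_get(cache["url"])
    return status, body
```

In the common case this costs one request per poll; the directory is
only contacted on first use and after the endpoint moves.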
> ## Acknowledgements
> 
> This document draws heavily from an internal write-up of the issue by
> Jacob Hoffman-Andrews.