Re: [TLS] New Version Notification for draft-kampanakis-tls-scas-latest-00.txt (ICA Supression)

"Kampanakis, Panos" <kpanos@amazon.com> Wed, 16 February 2022 19:41 UTC

From: "Kampanakis, Panos" <kpanos@amazon.com>
To: Ryan Sleevi <ryan-ietftls@sleevi.com>
CC: "Bytheway, Cameron" <bythewc@amazon.com>, "tls@ietf.org" <tls@ietf.org>
Date: Wed, 16 Feb 2022 19:41:11 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/tls/h4nS90S7E4Fa9JFZqLtqHtwfvFI>

Thanks for the insightful comments, Ryan.

Some responses below (sorry for the long email):

> how is that functionally different than simply saying "Intermediate 2" is the Trust Anchor, using the computed outputs (RFC 5280, Section 6.1.6) for Intermediate 2 as the inputs for RFC 5280, 6.1.1's algorithm for validating End Entity? What value does "Intermediate 1", or "Root 1", serve from a protocol or conceptual layer?

Agreed, it is not different. But I believe adding intermediates to the trust bundle is harder to deploy everywhere, especially when the scope expands to many more CAs. Note that the draft does not treat the ICA list as a trust list; it is just a list used to build the chain. There is some text in the draft that tries to convey that.


> The flags proposal, in effect, is introducing the notion of a "static Directory" - such as the examples you pointed to of Mozilla's CSV output or Filippo's consolidation of that output.

Agreed. But although the directory is the most straightforward option, there is a dynamic cache-building option too. I have experimented with this a bit; a client may not even need a full ICA list. It could build its cache dynamically as it starts connecting to peers: if it has not seen a peer before, it caches that peer's ICAs the first time, and it applies an update algorithm when the cache is full and there is a new cache miss. [4] has some basic details about this. Admittedly, that is the less straightforward and probably less common option. I added some text to the draft to reflect both options, but we do not want to define how the ICAs get discovered. We just want the peer to declare “I somehow know my peer ICAs”.
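
As a rough illustration of the dynamic cache idea, here is a minimal sketch that assumes a least-recently-used eviction policy and SHA-256 fingerprints as cache keys. Neither choice comes from the draft, which deliberately leaves ICA discovery and storage undefined; the class and function names are hypothetical.

```python
import hashlib
from collections import OrderedDict


def fingerprint(cert_der):
    """SHA-256 fingerprint of a DER-encoded certificate (assumed cache key)."""
    return hashlib.sha256(cert_der).hexdigest()


class IcaCache:
    """Hypothetical bounded ICA cache with LRU eviction (an assumption)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = OrderedDict()  # fingerprint -> DER bytes, oldest first

    def lookup(self, fp):
        """Return the cached ICA, or None on a cache miss."""
        cert = self._entries.get(fp)
        if cert is not None:
            self._entries.move_to_end(fp)  # mark as most recently used
        return cert

    def learn(self, fp, cert_der):
        """Cache an ICA seen in a full (unsuppressed) handshake."""
        self._entries[fp] = cert_der
        self._entries.move_to_end(fp)
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used

    def knows_all(self, fps):
        """True if every listed ICA fingerprint is currently cached."""
        return all(fp in self._entries for fp in fps)
```

A client would call `learn` for each intermediate received the first time it talks to a peer, and could use `knows_all` on the fingerprints it recorded for that peer when deciding whether to signal suppression on a later connection.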


> I'm not suggesting online signing by roots, but rather, that this extension firmly rejects the "trust roots, discover intermediates" model of 1996, so why shouldn't we lean into this more for PQ?

Good point. I think the challenge is the significant scope change you are suggesting, because now you are supposed to vet and somehow trust many more CAs that don’t keep their keys offline like roots do. Generally, significant changes like that scare me. Also, let’s not forget the challenge of having the TLS client say “if PQ, do PKI differently, else keep the Netscape model”.


> I'm not sure I see addressed in the draft how it handles the problem previously raised on the mic [1] and on the list [2], regarding version skew problems. [...] It seems rather fundamental to the assessment of the proposal to understand the proposed client behaviours here.

To summarize, you had it right. If the client thinks it has the ICAs, it signals that. If the connection fails, it tries again without suppression. The cache is either acquired statically from CCADB or from some vendor or service, or built dynamically as explained above.
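
The client behaviour summarized above could be sketched roughly as follows. This is only an illustration under stated assumptions: `has_icas_for` and `handshake` are hypothetical callables standing in for the client's cache lookup and its TLS stack, and `HandshakeFailure` is a placeholder for whatever error that stack raises.

```python
class HandshakeFailure(Exception):
    """Placeholder for the error a TLS stack raises on a failed handshake."""


def connect_with_suppression(host, has_icas_for, handshake):
    """Try a handshake with ICA suppression, then fall back to a full chain.

    has_icas_for(host) -> bool             (hypothetical cache check)
    handshake(host, suppress_icas) -> conn (hypothetical TLS call)
    """
    if has_icas_for(host):
        try:
            # Client believes it holds the peer's ICAs, so it sends the flag.
            return handshake(host, suppress_icas=True)
        except HandshakeFailure:
            # Possibly a stale or incomplete ICA set: retry without the flag.
            pass
    # Either no ICAs were cached or the suppressed attempt failed.
    return handshake(host, suppress_icas=False)
```

If the second attempt also fails, the error propagates to the caller, since at that point the failure is not explained by suppression.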


> For example, if this is seen as the only practically viable (at Internet scale) way to deploy PQ, then we should presume that a failure for the client to have a fresh set is, effectively, a failure to communicate with a site that needs such a fresh set.

It is not the only option, but we consider it low-hanging fruit that trims the PQ data sent in the handshake by a good amount. Here is the PQ auth data (in MB) from the server without suppression:
[Attached figure: image002.png]
And this is the same data with suppression:
[Attached figure: image004.png]

There have been other proposals, such as KEMTLS, using a different algorithm (smaller signature, bigger public key) in the SCT, OCSP staple or root CA, or using CRLite to save one extra OCSP signature. These trim the data as well, but they also introduce significant changes or may not be viable depending on the standardized options. So we consider ICA suppression relatively straightforward, low-hanging fruit.

I would rephrase it to “a failure to have a fresh set means not taking advantage of the faster handshake with ICAs suppressed”.


> The choice of TBD3 is both a reflection of, and an expectation of, broader ecosystem behaviours and changes, but which are left somewhat minimally specified, and which seems rather substantial.

Agreed. We need to tighten up TBD3. We think that with CT and, say, two days' grace, a client that requires SCTs can actually be sure that its chosen set of intermediates will do. No need for CCADB for that use case. But you are right, we need to expand on TBD3.


> This is alluded to by saying, in Section 3.1, "a server could choose not to send its intermediates regardless of the flag from the client, if it has reason to believe the issuing CAs do not exist in the client ICA list", but without reference to how practically that would be determined.

Yeah, we could work on the text there. Basically, if the server knows it got a cert from a new ICA that is unlikely to be in the list, it could just send the chain. Or, in a special private PKI use case where the server changed ICAs and the clients do not know yet because they have not been reconfigured, it could do the same. I created issue https://github.com/csosto-pk/tls-suppress-intermediates/issues/7 to track this.
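
One way to picture the server-side decision described above is the sketch below. It is an assumption-laden illustration, not text from the draft: the function name, the `grace` window, and the `force_full_chain` operator switch (for the private PKI case) are all hypothetical, and how long an ICA counts as "new" is exactly the kind of value the draft still has to pin down.

```python
from datetime import datetime, timedelta, timezone


def should_send_chain(client_flag, ica_first_used, now, grace,
                      force_full_chain=False):
    """Decide whether the server includes its intermediates.

    client_flag      -- True if the client signalled ICA suppression
    ica_first_used   -- when the server started using its current ICA
    grace            -- how long an ICA is treated as "new" (assumed knob)
    force_full_chain -- operator override, e.g. private PKI with changed ICAs
    """
    if not client_flag:
        return True  # client did not ask for suppression
    if force_full_chain:
        return True  # operator knows clients are not configured yet
    # A recently introduced ICA is unlikely to be in client lists yet.
    return now - ica_first_used < grace
```

For example, with `grace=timedelta(days=2)` a server whose ICA went into service an hour ago would still send the full chain even when the client sets the flag.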


> This already had some discussion during the cross-sni-resumption, which shares a similar state overlap, but it may be worth exploring how well the amortization here works when faced under "real world" conditions (of multiple parallel connections racing, isolated between different security principals/origins, with limited lifetimes).

Yes, good comment. We discussed these with the co-authors. We are not convinced that the ticket or any of the other options is worth it; we would need more data to substantiate that. That is why we have the EDNOTE in the draft.


> But, in such a scenario, they can already account for this by just eliding intermediates today without requiring negotiation. Negotiating via flags comes in when both parties don't have perfect knowledge or consistent configuration between each other, but functionally, if they negotiate the flag, we're requiring that they do have perfect knowledge.

Anecdotally, in previous work I did on Wi-SUN and 802.15.4, we went to great lengths to pre-provision endpoints with CAs before deployment. But I see these constrained / IoT use cases as fitting easily into the dynamic cache category. These PKIs usually have 1-2 ICAs at most, so the cache can be built and maintained dynamically quite trivially, without perfect prior knowledge.

Rgs,
Panos


From: TLS <tls-bounces@ietf.org> On Behalf Of Ryan Sleevi
Sent: Monday, February 14, 2022 11:43 AM
To: Kampanakis, Panos <kpanos=40amazon.com@dmarc.ietf.org>
Cc: Bytheway, Cameron <bythewc@amazon.com>; tls@ietf.org
Subject: RE: [EXTERNAL] [TLS] New Version Notification for draft-kampanakis-tls-scas-latest-00.txt (ICA Supression)




Hi Panos,

There was additionally some discussion during IETF 105 [1], which looked a bit at the problem.

As a problem statement, it's definitely an interesting problem, and certainly, one we need to start preparing to solve practically. I think one very direct question is this: If we are confident that the client and server have negotiated some shared trust bundle reliably, why would we use intermediates at all? That is, if a given client knows a certificate path such that Root->Intermediate 1->Intermediate 2 exists, sufficient that it can be omitted within the TLS exchange and only send End Entity, then how is that functionally different than simply saying "Intermediate 2" is the Trust Anchor, using the computed outputs (RFC 5280, Section 6.1.6) for Intermediate 2 as the inputs for RFC 5280, 6.1.1's algorithm for validating End Entity? What value does "Intermediate 1", or "Root 1", serve from a protocol or conceptual layer?

This might sound too abstract, but I think highlights the core concept of the proposal, which is that certificate path building and validation become unnecessary in a scheme where we're eliding intermediates. In the original deployment model of TLS using X.509, the certificates sent were simply pointers into the X.500 Directory. Of course, we all know that didn't manifest, and we certainly don't depend on the Directory, and so we used the certificate chains provided in-band to communicate trust in a world where the Directory doesn't exist. The flags proposal, in effect, is introducing the notion of a "static Directory" - such as the examples you pointed to of Mozilla's CSV output or Filippo's consolidation of that output.

I realize this suggests a much larger change to the trust model we've inherited from Netscape's snap decisions 25 years ago, but given PQ, does it make sense to re-examine whether or not we need such intermediates at all? That is, if we're willing to say "This is signed by Intermediate 2, and you're expected to know about it and its limitations", how is this different from saying "This is signed by Trust Anchor Foo, and you're expected to know about it and its limitations". I'm not suggesting online signing by roots, but rather, that this extension firmly rejects the "trust roots, discover intermediates" model of 1996, so why shouldn't we lean into this more for PQ?

I'm not sure I see addressed in the draft how it handles the problem previously raised on the mic [1] and on the list [2], regarding version skew problems. Section 3.1 of the draft states "To prevent a failed TLS connection, a client MAY choose not to send the flag if its list of ICAs hasn't been updated in TBD3 time or has any other reason to believe it does not include the ICAs for the peer", but this leaves a lot open to interpretation. For example, is it expected that clients will retry the connection, as suggested in [3]? That seems to be the suggestion from 3.1's "If the connection still fails ... the client MUST NOT send the flag in a subsequent connection to the server". Or is this meant to be left unspecified, similar to Section 3's "It is beyond the scope of this document to define how CA certificates are identified and stored"? It seems rather fundamental to the assessment of the proposal to understand the proposed client behaviours here.

A significant amount of practicality for this proposal rests on what the value for TBD3 is. For example, if this is seen as the only practically viable (at Internet scale) way to deploy PQ, then we should presume that a failure for the client to have a fresh set is, effectively, a failure to communicate with a site that needs such a fresh set. Have I misunderstood the conclusions of your research [4]?

If this is functionally necessary for certain deployments, then the choice of TBD3 is a reflection of "How long do servers/CAs need to wait, before a newly provisioned intermediate becomes practically deployable"? When Certificate Transparency (RFC 6962) examined a similar question, the suggestion by CAs then was that 24 hours (the proposed CT MMD for direct integration into the log) was untenable, and thus SCTs, the "immediate promise to log, without proof of having been integrated into the Merkle tree", were introduced. Ilari's point about technically constrained subordinates is related to this; even for those not constrained, today's ecosystem has them functionally usable immediately, and disclosure is not required for week(s) after by policies, if at all. The choice of TBD3 is both a reflection of, and an expectation of, broader ecosystem behaviours and changes, but which are left somewhat minimally specified, and which seems rather substantial.

The current draft doesn't address the question about how "The Web PKI" is not a comprehensive set of all CAs, but rather, an overlapping/intersecting set from independent vendors. This raises practical questions about deployability, because it speaks a bit to trust anchor agility within these applications. For example, Vendor A supports Root 1 -> Intermediate 1 -> Intermediate 2, with Root 1 in their trust store. Vendor B supports Root 2 -> (Cross-Signed) Root 1 -> Intermediate 1 -> Intermediate 2, with Root 2 in their trust store, and Root 1 not included. The semantics of "omit intermediates", and what precisely is being elided, effectively rests on unknown client configuration. The Section 3 [EDNOTE] seems to acknowledge this, in that there are a host of trade-offs at play, all of which require some degree of client state to be communicated to the server. This is alluded to by saying, in Section 3.1, "a server could choose not to send its intermediates regardless of the flag from the client, if it has reason to believe the issuing CAs do not exist in the client ICA list", but without reference to how practically that would be determined.

During the IETF 105 discussion, there was some discussion about "CCADB version" during the mic discussion, as one way of approaching an agreed-upon definition/registry for a set of certificates, but which would only be applicable to a particular user community. The draft here highlights three variations that all serve as a form of cookie (Fingerprint, HMAC, Ticket). While these might be viable for some use cases, some of the large-scale deployment concerns would be the increasing effort to segment and isolate forms of network traffic between origins within browsers, and to limit the lifetime of tracking tickets. This already had some discussion during the cross-sni-resumption, which shares a similar state overlap, but it may be worth exploring how well the amortization here works when faced under "real world" conditions (of multiple parallel connections racing, isolated between different security principals/origins, with limited lifetimes).

Within a wholly constrained environment, such as IoT, in which the operator of both TLS peers functionally has perfect knowledge of the PKI in use, these issues are not a consideration, and so it's understandably compelling there. But, in such a scenario, they can already account for this by just eliding intermediates today without requiring negotiation. Negotiating via flags comes in when both parties don't have perfect knowledge or consistent configuration between each other, but functionally, if they negotiate the flag, we're requiring that they do have perfect knowledge. That's why I'm still a little uneasy about this proposal, not because I think it's inherently unworkable, but because there's a lot of devil in the details here that are much broader, in practice, than just specifying a flag. I'd love to make sure we've got some understanding for practical deployments of this, or else I can see this being an interop nightmare in deployment.

[1] https://datatracker.ietf.org/meeting/105/materials/minutes-105-tls-00
[2] https://mailarchive.ietf.org/arch/msg/tls/KAAKiEki36gL8g40ZimNHk39OXY/
[3] https://mailarchive.ietf.org/arch/msg/tls/sJ4vlchFfKtKqYDADdxwAWoR2ug/
[4] https://www.amazon.science/publications/speeding-up-post-quantum-tls-handshakes-by-suppressing-intermediate-ca-certificates