Re: [SCITT] Internet-Draft draft-ietf-scitt-architecture-05.txt

Orie Steele <orie@transmute.industries> Mon, 12 February 2024 14:23 UTC

References: <LO4P265MB424709B41AC8D1146DDB7707E4482@LO4P265MB4247.GBRP265.PROD.OUTLOOK.COM>
In-Reply-To: <LO4P265MB424709B41AC8D1146DDB7707E4482@LO4P265MB4247.GBRP265.PROD.OUTLOOK.COM>
From: Orie Steele <orie@transmute.industries>
Date: Mon, 12 Feb 2024 08:23:30 -0600
Message-ID: <CAN8C-_LkXS_TdUqVrhdUBA46fHotbdkaeoizt6o3mFdzyj_kGQ@mail.gmail.com>
To: Robin Bryce <robin.bryce@datatrails.ai>
Cc: "scitt@ietf.org" <scitt@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/scitt/ZU08EZi7FnsjBIiCUSMxl6YsG3g>
Subject: Re: [SCITT] Internet-Draft draft-ietf-scitt-architecture-05.txt

Hey Robin,

I agree with much of what you wrote. As we try to get the draft to a
publishable state, we will need to start cutting content where there is not
rough consensus.

Inline for the rest:


On Mon, Feb 12, 2024 at 2:06 AM Robin Bryce <robin.bryce@datatrails.ai>
wrote:

> Hi all,
>
>
>
> Thanks for all the work that went into the new draft. I have some feedback
> & questions.
>
>
>
> 4.1.1 Initialization
>
>
>
> This appears to prescribe the precise contents of the initial log entries
> in the name of solving a *verification* problem - that the statement
> verification requires a trust anchor in order that verification can be
> successful.
>
> I don't understand why the trust anchor needs to be in the log in a
> particular position. Verification is just contingent on its availability. I
> also don't understand why it needs to be in the same log or any particular
> log at all.
>
> It creates significant operational problems inserting these sorts of
> requirements, and it’s very intrusive with regards the log itself.
>
> I think this section implies that an Issuer can submit a Registration
> Policy to self-attest its keys and scope of use and that becomes the thing
> the Transparency Service later verifies against? If that really is desired,
> then I think the last paragraph "This specification leaves implementation,
> encoding and documentation of Registration Policies to the operator of the
> Transparency Service" leaves scope for implementors to support that.
>
> Precisely how registration policies are established ought to be the
> business of the individual log implementation.
>

The purpose of this section is to establish how a registration policy is
added to the log, since such a policy is required before the log can accept
anything.
This section could instead say: a new empty log is created, and the
transparency service can add whatever it likes, without applying any
specific registration policy.
Because we previously had text saying that the minimum registration policy
requires the transparency service to __verify__ signed statements, it is
logical to assume that the TS would at least have the keys to do that.
Making those keys transparent makes the log self-authenticating, which is a
valuable property for auditors, who might want to know the exact keys that
were used to decide which content to admit.

It sounds to me like your suggestion is to remove any requirement to
include any specific content in the log, including keys or policies? (I
would be ok with these being operator details).


> 4.1.2 Registration
>
> Re: 2.
>
> "Issuer Verification: The Transparency Service MUST perform resolution of
> the Issuer's identity. This step may require that the service retrieves the
> Issuer ID in real-time, or rely on a cache of recent resolutions. For
> auditing, during Registration, the Transparency Service MUST store evidence
> of the lookup, including if it was resolved from a cache."
>
> A requirement to verify issuer identities in real time does not guarantee
> that maliciously or mistakenly signed items will not reach the log.
> Compromise of the identity may be discovered *after* the statement has
> been accepted on the log, and then we are no better off than had we simply
> relied on cryptographic verification. It is always going to be down to
> auditors, verifiers, and log consumers in general, to use the attested data
> to reach their own conclusions. And those conclusions may well change over
> time.
>
> Having this as a requirement does not help solve these problems, it
> complicates the implementations, and it will be a significant performance
> bottleneck.
>
> Deferring id verification to the point at which a receipt is
> *asynchronously* requested could be a workable compromise.
>

This would allow anonymous users to poison the log (which stores content
forever) with potentially toxic content.
As a general rule, we would expect both write-side and read-side requests
to be gated by authorization.
In addition to gating which clients can register content, the content
itself could be gated: for example, only employees can add data to the log,
and they can only add data that is signed by a trusted partner.
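
As a rough sketch of that two-level gating (every name below is
hypothetical, nothing here comes from the draft, and signature verification
is assumed to happen elsewhere):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    client_id: str         # authenticated caller (hypothetical field)
    issuer: str            # issuer named in the signed statement
    signature_valid: bool  # result of cryptographic verification, done elsewhere


# Hypothetical operator configuration: who may write, and whose statements count.
AUTHORIZED_CLIENTS = {"employee-1", "employee-2"}
TRUSTED_ISSUERS = {"partner.example"}


def may_register(sub: Submission) -> bool:
    """Return True only if both the client and the content pass the gate."""
    if sub.client_id not in AUTHORIZED_CLIENTS:
        return False  # write side: the caller is not allowed to add entries
    if sub.issuer not in TRUSTED_ISSUERS:
        return False  # content side: the statement is not from a trusted partner
    return sub.signature_valid


assert may_register(Submission("employee-1", "partner.example", True))
assert not may_register(Submission("anonymous", "partner.example", True))
assert not may_register(Submission("employee-1", "unknown.example", True))
```

Which checks belong in the registration policy would still be an operator
decision; the point is only that both the caller and the statement can be
checked before anything is appended.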

> Secondly, it is far from clear what "evidence of the lookup, including if
> it was resolved from a cache" would look like in the real world. I don't
> think it is feasible for SCITT to specify that in detail. So the
> requirement is very ambiguous.
>

Strongly agree; it's leftover DID material and should be removed.

> Re: 7
>
> Receipts should be an optional part of the registration process.
>

Agreed.

> There exist verifiable data structures which enable receipts to be
> generated in the future which are provably identical to the receipt that
> would be issued at the time of registration. Decoupling these allows for
> higher volume use cases where receipts aren’t desired for all statements,
> or at least not desired immediately. This rests on the ability of the log
> implementation to provide tamper evidence and provable log consistency.
> There are efficient ways to batch proof (and hence receipt generation),
> that would be un-usable if a receipt is a required part of the registration
> process.
>
> The wording in the Registration section isn't super clear about if and how
> this sort of batching is accommodated.
>
I think that is ok for an architecture document; batch processing and
asynchronous responses are probably more detailed than the architecture
should cover.
However, if the text reads as though this is forbidden, we should fix that.

>
> "A Transparency Service MUST ensure that a Signed Statement is registered
> before releasing its Receipt" this means the log entry for the signed
> statement must be committed to the log before the receipt can be created.
>
It's debatable what this actually means... it could be interpreted as "the
client got an HTTP 202".


> That log entry itself cannot be a transparent statement because the
> receipt is its inclusion proof in the log. I think this is probably
> obvious, but some of the wording could imply the transparent statement is a
> log entry.
>
Any bytes that go into a log can be a COSE_Sign1 that has receipts in the
unprotected header... Yes, that is recursive, but we need to address it, or
we need to say you cannot include unprotected headers in the logs.
A use case worth thinking about here is vendor A providing a receipt that
proves vendor B had a receipt.
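
A toy model of the two options, just to pin down what "recursive" means
here. This is not real COSE: there is no CBOR and no signing, and the
`Envelope` type and the `"receipts"` key are invented for illustration.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class Envelope:
    """Toy stand-in for a COSE_Sign1 message (no CBOR, no signatures)."""
    protected: dict
    payload: bytes
    unprotected: dict = field(default_factory=dict)


def attach_receipt(statement: Envelope, service: str) -> Envelope:
    """Return a copy of the statement with one more receipt in its unprotected header."""
    receipt = Envelope(protected={"service": service}, payload=b"inclusion-proof")
    receipts = list(statement.unprotected.get("receipts", [])) + [receipt]
    return replace(statement, unprotected={**statement.unprotected, "receipts": receipts})


def strip_unprotected(statement: Envelope) -> Envelope:
    """Option 2: forbid unprotected headers in the log by dropping them first."""
    return replace(statement, unprotected={})


signed = Envelope(protected={"iss": "issuer.example"}, payload=b"sbom")
transparent_b = attach_receipt(signed, "vendor-b")

# Option 1: vendor A logs the statement including vendor B's receipt, so
# vendor A's receipt covers bytes that already contain a receipt.
transparent_ab = attach_receipt(transparent_b, "vendor-a")
assert len(transparent_ab.unprotected["receipts"]) == 2

# Option 2: vendor A strips unprotected headers before logging, so its
# receipt says nothing about whether vendor B ever issued one.
assert strip_unprotected(transparent_b).unprotected == {}
```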

>
> The wording at the start of 4.3 is much clearer in this regard.
>
> 4.2 Signed statements
>
> "Support for x5t is mandatory to implement" Why is this helpful ?
>
Any time a standard offers "multiple ways to do something", there is a
chance that a set of vendors will each choose a different way, and there
will be no interoperability.
We have this problem with logs today: RFC 9162 vs CCF vs Ethereum, etc.
We have the same problem for COSE_Sign1: we can sign them with `iss` and
`kid`, or we can sign them with `x5t`.
It's very common to have a single mandatory-to-implement option for cases
like this.
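
For anyone less familiar with the two identification styles, a small
sketch of what each header carries. I believe the labels are right (`kid`
is COSE header label 4; `x5t` is label 34 from RFC 9360, holding a hash
algorithm and the hash of the DER certificate; -16 is the COSE identifier
for SHA-256), but treat them as assumptions to check against the
registries. The certificate bytes and key id are placeholders, and no CBOR
encoding is done.

```python
import hashlib

COSE_HEADER_KID = 4     # key identifier
COSE_HEADER_X5T = 34    # X.509 certificate thumbprint (RFC 9360)
COSE_ALG_SHA_256 = -16  # COSE hash algorithm identifier for SHA-256


def x5t_header(cert_der: bytes) -> dict:
    """Identify the signer by the hash of its end-entity certificate."""
    return {COSE_HEADER_X5T: [COSE_ALG_SHA_256, hashlib.sha256(cert_der).digest()]}


def kid_header(kid: bytes) -> dict:
    """Identify the signer by an opaque key id, resolved through the issuer."""
    return {COSE_HEADER_KID: kid}


placeholder_cert = b"not a real DER certificate"
print(x5t_header(placeholder_cert))
print(kid_header(b"key-2024-02"))
```

A verifier that only understands one of these two shapes cannot process a
statement that only carries the other, which is the interop concern.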

> If I use a cnf claim and an iss to fetch an openid connect style well
> known document over https
>

side note here, this is why the comment about "evidence of the lookup,
including if it was resolved from a cache" exists.

> and then check the kid and public key match what is in the claim, don't I,
> as a transparency service implementor, get all the benefits of TLS
> certificates without exposing the implementation to the burden of x509
> verification ?
>
I would not say you get the benefits of certificates from that... Signing
keys rotate frequently in OpenID, and they may or may not have certificate
chains behind them (x5c / x5u).

> And that TLS certificate & its verification may well be backed by a
> transparency service in the fullness of time, but it’s actually more
> robust if it's independent and opaque.
>

I assume this comment is about the https connection you used to pull keys?
This is also why the comment about "evidence of the lookup" exists, it's a
nod to aTLS... I tend to think we should leave those details out of scope
for the architecture.

>
> Surely it is sufficient to say implementations MUST implement *either*
> x5t or kid/cnf ?
>
See the comment above: assuming uniform random adoption, it would be a 50%
coin flip whether two implementations interoperate with this as the
normative requirement.
Mandatory to implement also does not imply exclusivity; you could implement
both.
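
The coin flip can be checked by enumeration. This is a simplified model
with assumptions of my own: each vendor picks its supported set uniformly
at random from the sets the requirement allows, and two vendors
interoperate if they share at least one method.

```python
from itertools import product


def interop_rate(allowed_sets: list) -> float:
    """Fraction of ordered vendor pairs that share at least one method."""
    pairs = list(product(allowed_sets, repeat=2))
    compatible = sum(1 for a, b in pairs if a & b)
    return compatible / len(pairs)


# "MUST implement either x5t or kid": each vendor picks exactly one.
either = [frozenset({"x5t"}), frozenset({"kid"})]

# "x5t is mandatory to implement": kid is optional on top of x5t.
mandatory_x5t = [frozenset({"x5t"}), frozenset({"x5t", "kid"})]

assert interop_rate(either) == 0.5
assert interop_rate(mandatory_x5t) == 1.0
```

Real adoption is not uniform or random, so the 50% is only illustrative,
but the second case holds regardless of how vendors choose.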


> Is there a thread or reference somewhere I can read to catch up with the
> arguments either way on this ?
>
https://mailarchive.ietf.org/arch/msg/spasm/YDqJA8xiolEsFngDnSyG1_Ej57w/

(You can also search the scitt archive for key words like "x5t" or "kid".)

>
> 4.3 Transparent Statements
>
> "Receipts are based on Signed Inclusion Proofs as described in COSE Signed
> Merkle Tree Proofs" I understood merkle based logs were one of the options
> and that the standard wasn't fixing on merkle based logs. Is this still the
> intent ?
>

The draft is still called "COSE Signed Merkle Tree Proofs":

https://datatracker.ietf.org/doc/draft-ietf-cose-merkle-tree-proofs/

The draft uses generic language now:

"This document describes how to convey verifiable data structures, and
associated proof types in COSE envelopes."

> 4.3.1 Validation
>
> "In order to verify the inclusion proof that is included in the Receipt,
> the verification process for the inclusion proof MUST be performed as
> described in the document that registers corresponding Verifiable Data
> Structure Parameters
> https://www.ietf.org/archive/id/draft-ietf-scitt-architecture-05.html#I-D.draft-steele-cose-merkle-tree-proofs
> "
>
> This is a useful and helpful draft, but it only contains RFC9162_SHA256 in
> its registry.
>
That's correct: it establishes a registry, which cannot be updated with
other content until it exists... Technically the registry won't exist at
all until after the document is published (it blocks the architecture).


> What should implementations do in advance of being able to register here ?
>
You can't do anything in advance... you could pick a number (like 2 or 5)
and hope that it will not be taken by the time you are ready to follow the
registration advice in section 4.1:
https://datatracker.ietf.org/doc/html/draft-ietf-cose-merkle-tree-proofs-03#section-4.1

But as a general rule, I would never assume you can claim code points in a
registry that is not yet established.

We could add drafts to the initialization of the registry, but I am not a
fan of doing that, because drafts expire, and are only meant to be cited as
"work in progress"...

This is a detailed topic, you may want to read:
https://datatracker.ietf.org/doc/html/rfc8126#section-4.6

>
> 7.2.6 Impersonation
>
> "It is up to the Issuer to notify Transparency Services of credential
> revocation to stop Verifiers from accepting Signed Statements signed with
> compromised credentials" this means the guarantee isn't terribly useful.
>

Security considerations are often written in a pleading tone : )

The alternative text would be to say nothing.

From the point of compromise to the time at which a Transparency Service
adds a new entry that addresses the compromise, there is a window of time
that is controlled by "disclosure".

We would hope to see "responsible disclosure as soon as possible", but we
may never see any disclosure at all.

I would say this guidance should apply to "all timely information that
could be essential to mitigate harm to a supply chain, and its consumers".

> It’s always going to require after the fact retrospective analysis to
> figure out which statements were recorded while the identity was
> unknowingly compromised. The guarantee is misleading. It is sufficient to
> guarantee that the identity presented at the time of the statement is
> evident in the log.
>
Sorry for using a dramatic analogy, but I find drama can bring clarity to
security and privacy issues.

Saying a driver's license was valid when a firearm was purchased is one
message to the log.
Saying a driver's license was stolen and used to purchase a firearm
(producing a valid receipt) by an entity that was not the subject of the
credential is another message.
A third message could be the driver's license itself.

I use credentials people are familiar with here to make the point, but in
reality it is likely organization or machine identity credentials that
would be stolen.

Storing key material in the log would be like storing identity credentials
in the log: an auditor might want to know which driver's license purchased
which serial number.
If all they can see in the log is that "the license was valid", that could
be either a feature or a bug (and regional laws would certainly apply to
storing ANY PII in an append-only log).
There is utility in storing information about the identity that is
registering content, because it can help with investigations, and it can
protect or harm the operator of the transparency service.
In the case of x5t, iss, and kid, that information could contain PII,
because a certificate, JWK or COSE Key can contain PII... consider KYC
requirements vs the "right to be forgotten".
The mDoc digital driver's license format uses COSE Keys to enable proof of
possession for data subjects, and uses certificates to identify issuers
(state DMVs in the US).

Bringing the example back to software: Vendor A's signing keys were stolen;
have you seen any attempts to publish binaries since we learned they were
stolen?
You might find that some transparency vendors stored just the messages,
with x5t pointing to that vendor's cert, while other transparency vendors
stored all those messages and a copy of the cert itself.
The reason you might encounter both is that how much information about
issuers to store in transparency logs is an implementation and operator
decision.

>
> Hope this is useful and constructive. Thanks again for the great work!
>

Thanks for all your thoughtful comments.
If after hearing from the authors you feel there are essential changes that
MUST be made, you should file issues, and keep complaining until they are
fixed : )



-- 


ORIE STEELE
Chief Technology Officer
www.transmute.industries
