[Wimse] Re: What is an identity and section 3.2.1 of draft-ietf-wimse-arch (or must identity be the compositum of all attributes?)

Mingliang Pei <mingliang.pei@broadcom.com> Mon, 08 July 2024 04:50 UTC

From: Mingliang Pei <mingliang.pei@broadcom.com>
Date: Sun, 07 Jul 2024 21:49:42 -0700
Message-ID: <CABDGos7E6T+7gKPoh57ei4fNxmU-zfKoNsVYCF-3=EUA63wEJw@mail.gmail.com>
To: Pieter Kasselman <pieter.kasselman=40microsoft.com@dmarc.ietf.org>
CC: Joseph Salowey <joe@salowey.net>, Justin Richer <jricher@mit.edu>, Watson Ladd <watsonbladd@gmail.com>, "wimse@ietf.org" <wimse@ietf.org>
List-Id: WIMSE Workload Identity in Multi-Service Environment <wimse.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/wimse/pJROXniCFT2T5p-Lilv5zSYf3eo>

I just read PR #36
<https://github.com/ietf-wg-wimse/draft-ietf-wimse-arch/pull/36> against the
Arch draft and found that it includes a "Workload Identifier":

A workload identity may be composed of many attributes
...

The Workload Identifier consists of a concise string allocated within a
namespace defined by a Trust Domain.

Thanks. It is thus a matter of revising the paragraph to emphasize further
that an "identifier" is a mandatory attribute of a Workload Identity.
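
To make the point concrete, here is a minimal sketch of how I read the
relationship. The class names, the "wimse://" scheme, and the URI-like layout
are my own assumptions for illustration; the PR text only says the identifier
is a concise string allocated within a namespace defined by a Trust Domain and
does not fix a syntax.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass(frozen=True)
class WorkloadIdentifier:
    """Concise string allocated within a trust domain's namespace."""
    trust_domain: str
    name: str

    @classmethod
    def parse(cls, value: str) -> "WorkloadIdentifier":
        # ASSUMPTION: a URI-like "scheme://trust-domain/name" layout.
        # The draft text quoted above does not define a concrete syntax.
        parts = urlsplit(value)
        name = parts.path.lstrip("/")
        if not parts.netloc or not name:
            raise ValueError(f"not a trust-domain-scoped identifier: {value!r}")
        return cls(trust_domain=parts.netloc, name=name)


@dataclass(frozen=True)
class WorkloadIdentity:
    # The identifier is the one attribute that must always be present.
    identifier: WorkloadIdentifier
    # Any other attributes are optional and platform-defined (hypothetical).
    attributes: dict = field(default_factory=dict)


# Hypothetical identifier string, used only to exercise the sketch.
ident = WorkloadIdentifier.parse("wimse://example.org/payments/api")
identity = WorkloadIdentity(identifier=ident, attributes={"region": "us-west"})
```

Constructing a WorkloadIdentity without an identifier fails, while the
attribute set may be empty, which is the "mandatory attribute" reading I had
in mind.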

Ming


On Sun, Jul 7, 2024 at 9:40 PM Mingliang Pei <mingliang.pei@broadcom.com>
wrote:

> I am new to this working group and have just read the arch draft. This is a
> very interesting topic. Thanks for the work. I have a quick question that
> largely repeats what Pieter said; please let me know if this has already
> been well discussed in meetings or emails.
>
> >> 1. Identifiers: Identifiers are the basic building blocks of any
> identity system. It is only possible to manage identities if there is a way
> to identify them uniquely within and across trust domains.
>
> I agree, and thought there would be some unique identifier associated
> with a workload, with the spec defining patterns or standards for how
> workload providers may assign and register identifiers for their workloads.
> Currently, the "workload identity" is defined as a set of attributes whose
> names are open for a platform to define; at least, no specific set has
> been proposed yet. It sounds to me as though further work is needed to
> support interoperability and interpretation (in validation, authorization,
> etc.).
>
> Thanks,
>
> Ming
>
>
> On Tue, Jul 2, 2024 at 12:03 PM Pieter Kasselman <pieter.kasselman=
> 40microsoft.com@dmarc.ietf.org> wrote:
>
>> Working group chair hat off, identity enthusiast hat on...
>>
>>
>>
>> Thanks for working on this PR Joe, this is a tricky subject ;)
>>
>>
>>
>> Someone once told me that the first trap to avoid when working in
>> identity is to try and define it... It is hugely overloaded and we often
>> end up using it as shorthand for a set of other more burdensome activities
>> we try to describe.
>>
>>
>>
>> Another way to look at it is to focus on the outcome you expect from your
>> use of "identity" and then define the building block to achieve the
>> outcome. You never end up defining identity, but rather what you want to
>> use identity for (the outcome), which in turn helps scoping down the
>> building blocks.
>>
>>
>>
>> So, when it comes to things like using identity to decide access, a
>> useful way to think about the reason for having identity is that you want
>> to ensure that the right entity (person/workload/device) has access to the
>> right resource at the right time for the right reason. To build such a
>> system you end up needing a couple of building blocks. These include:
>>
>>
>>
>> 1. Identifiers: Identifiers are the basic building blocks of any identity
>> system. It is only possible to manage identities if there is a way to
>> identify them uniquely within and across trust domains.
>>
>> 2. Attestation: Attestation includes the presentation of proof to
>> establish the provenance of an entity. This assertion may include a
>> cryptographic binding (use of a cryptographic key or shared secret), but
>> may also include the collection of other information to assert the
>> provenance of the entity.
>>
>> 3. Secrets Management: Some entities use long-lived secrets such as
>> cryptographic keys (symmetric and asymmetric) as part of the attestation
>> process. In the case of asymmetric cryptographic keys, the public key may
>> be included as an attribute in a credential (see next point on credential
>> formats).
>>
>> 4. Credential Formats: Identifiers are often shared with other system
>> participants by encapsulating them in a credential with a specific format.
>> The credential may include additional information or attributes to identify
>> the entity and is often bound to this additional information using
>> cryptographic techniques (e.g. digital signatures). Examples of credential
>> formats include X.509 certificates, JSON Web Tokens, and Verifiable
>> Credentials.
>>
>> 5. Provisioning: Identifiers, attestation metadata, secrets and
>> credentials follow a lifecycle that includes the usual create, read,
>> update, and delete operations that are recorded in a system of record (e.g.
>> a directory).
>>
>> 6. Authentication: Proving that an identifier and a set of additional
>> attributes is associated with an entity. This is done by presenting a
>> credential which may include attributes that the recipient can verify,
>> including proof of possession of a private key matching the public key
>> included as an attribute in the credential.
>>
>> 7. Authorization: Answering the question of “does this entity have access
>> to that resource, at this time, under these conditions?” based on the
>> previous 6 building blocks. It answers the original question, but there are
>> a few more components needed beyond just making the decision around access.
>>
>> 8. Federation: Maintaining trust boundaries within a system is a common
>> part of controlling access to resources. As entities cross trust
>> boundaries, they need to be able to establish trust relationships to both
>> accept credentials from other domains as well as have their own credentials
>> accepted.
>>
>> 9. Monitoring and Remediation: Once an entity is authenticated and
>> authorization decisions are made, it is necessary to monitor their activity
>> to detect anomalies that may indicate compromise. If a compromise is
>> detected or suspected, it should trigger remediation to mitigate the impact
>> of a compromise.
>>
>> 10. Policy and Configuration: Policy and configuration is required for
>> each of the machine identity building blocks. Policy defines the specific
>> rules that should be applied, ranging from credential format through to
>> authentication method and authorization rules.
>>
>> 11. Compliance: Compliance for each building block needs to be proven
>> against the policies for that building block. Compliance often takes the
>> form of reports and the process is simplified through the use of structured
>> data in both the policy and the event logging.
>>
>>
>>
>> Looking at the PR, I wonder whether, instead of defining a workload identity,
>> you could define a workload identity management system that ensures the
>> right workload has access to the right resources at the right time. In that
>> case you could define a workload identifier, credential format and some of
>> the attributes included in the credential format (including a public key
>> perhaps). Instead of having a “workload obtain its identity”, perhaps we
>> might say that “a workload should be provisioned with a credential that
>> includes an identifier and other attributes, along with any secrets it
>> might need, as early as possible in its lifecycle”. Instead of “a workload
>> presents its identity”, we can be more specific by saying “a workload
>> presents its credential and authenticates itself using the private key
>> bound to the credential to generate a signature and prove control of the
>> key”. It’s way more burdensome and time consuming to say, but it could help
>> avoid overloading the term “identity” (this may be a bit of an extreme
>> example).
>>
>>
>>
>> Cheers
>>
>>
>>
>> Pieter
>>
>>
>>
>>
>>
>> *From:* Joseph Salowey <joe@salowey.net>
>> *Sent:* Tuesday, July 2, 2024 1:52 AM
>> *To:* Justin Richer <jricher@mit.edu>
>> *Cc:* Watson Ladd <watsonbladd@gmail.com>; wimse@ietf.org
>> *Subject:* [Wimse] Re: What is an identity and section 3.2.1 of
>> draft-ietf-wimse-arch (or must identity be the compositum of all
>> attributes?)
>>
>>
>>
>> I put together a PR (
>> https://github.com/ietf-wg-wimse/draft-ietf-wimse-arch/pull/36/files)
>> for the architecture draft to try to better describe identity in a new
>> section.  Please comment and let me know if it helps. Other sections will
>> probably need to be modified once we tighten down this section.
>>
>>
>>
>> Cheers,
>>
>>
>>
>> Joe
>>
>>
>>
>> On Tue, Jun 18, 2024 at 10:11 AM Justin Richer <jricher@mit.edu> wrote:
>>
>> Thanks, Watson.
>>
>>
>>
>> [ Chair hat off ]
>>
>>
>>
>> Ah, monads, I should have known they’d crop up here eventually. :)
>>
>>
>>
>> I agree that there’s a fundamental difference between "what something is"
>> and "what something can do", and on top of that there’s also "what we call
>> something". In each case, the "something" is a workload in its context.
>>
>>
>>
>> In many cases, it’s tempting or even convenient to wrap these up
>> together. After all, if I know who you are, then I should know what you can
>> do based on that. And if your identifier — the thing I call you — gives me
>> some of that information in a structured format, then all the better,
>> right? This is a large part of the thinking behind the SVID concept - the
>> name has some internal semantics that guide me towards understanding what
>> to do with the thing that has been given that name. And this pattern has
>> been hugely useful, since it lets you carry information about authorization
>> through a system using something that I guarantee will be there, the
>> identifier. That’s what I’m reading from that section in the WIMSE Arch
>> draft — here’s a larger identity question that gets encoded into an
>> identifier and used for authorization decisions.
>>
>>
>>
>> But for me, WIMSE does represent an opportunity to tease this apart
>> better than it has been in the past, while still allowing us to
>> deliberately collapse things in cases where it makes sense.  I should be
>> able to figure out that Jessie is cow #2 without naming her
>> "Jessie.cow[2]". But then the question becomes, how do I carry that
>> information? And more importantly for us here, how do I do that in a way
>> that’s interoperable on some level?
>>
>>
>>
>> And the question goes further, because I don’t believe that identifying
>> the workload is where the security decisions stop. In fact, I think that’s
>> just one input, and not even necessarily a required input, into an
>> authorization decision. Much more important is the context of the request
>> in which the workload is running. This goes beyond the identity of the
>> workload itself and into the world that the identified workload finds
>> itself in at runtime.
>>
>>
>>
>> I am not going to pretend to have clear answers, but I do think there’s
>> merit in pursuing these questions to find them. But that said, I do believe
>> that there are several concepts that are separable here, which can
>> sometimes be expressed together using the same element:
>>
>>
>>
>> - an identifier for a workload that is readable by other workloads and
>> systems
>>
>> - the collection of attributes that uniquely identify a workload (the
>> identifier being one of them), its "identity"
>>
>> - the runtime context around a workload (request context being a big part
>> of this)
>>
>> - the set of computed access rights for a workload in a context
>>
>>
>>
>> In my view, OAuth access tokens are just one input to the runtime
>> context, but today they’re being used to express all of these things in
>> different ways in different systems. I really do think that if we’re going
>> to stop always conflating things, we need to work on fundamental
>> differentiation.
>>
>>
>>
>>
>>
>> — Justin
>>
>>
>>
>> On Jun 17, 2024, at 7:59 PM, Watson Ladd <watsonbladd@gmail.com> wrote:
>>
>>
>>
>> Dear wimsyists,
>>
>> I must confess to never having read Leibniz, but I think section
>> 3.2.1 will force me to. I hope this long rambling email explains why
>> and elucidates some of what confuses me in WIMSE, and hopefully others
>> find out they are likewise confused.
>>
>> Section 3.2.1 implies the identity must include a bunch of information
>> that's relevant to the problem of authorizing a request, that is
>> conveyed by whatever ticket brings it alongside a request.
>>
>> There are two notions of identity at play here. The first is the
>> concept that makes one thing not like the other. The other is the name
>> by which we might call a thing not like the other. E.g., a farmer might
>> have cow 1, cow 2, cow 3, or Bessie, Jessie, and Daisy. Bessie does
>> not convey any of the relevant information about that cow, but does
>> identify it uniquely among the herd. To me this is also identity, and
>> indeed is the identity that would be most convenient to convey. This
>> is where Leibniz comes in.
>>
>> Now given a request we might want to evaluate the fundamental question
>> of "who gets to do what to whom". And here I think the distinction
>> between authorization and authentication has been elided a bit.
>> Authentication only determines the who. Authorization is about
>> answering the question of if the request is in fact authorized. A
>> token could speak only to authentication, but it could also provide
>> authorization information. The draft is pretty ambivalent about which:
>> while there is a lean towards authentication it seems the token
>> issuance is also supposed to be a gating function, and the token
>> needing to carry more than a simple identifier means that it starts to
>> blur the line.
>>
>> Lastly there's a huge gap for me around what sort of policies make
>> sense. Many systems I've seen and worked with had a broad class of
>> services provided by various elements, and the users of such services
>> might be a very dynamic set. Others had a self-serve partitioning of a
>> class of resources allocated to each service calling in. Other times
>> we might have systemic policies, e.g. "Infosec says all services must
>> do X, which we can determine at build time, and those that don't can't
>> run". Expressing these policies in human-understandable form argues
>> for a simple identifier, as a human-readable policy depends on such an
>> identifier. I don't think a complex identifier works for that.
>>
>> In conclusion I think we need to make an explicit choice. Either the
>> token carries a workload identifier (akin to a Service in K8S or
>> something else?) or it carries a grab bag of attested environmental
>> attributes over which policies run. And I think we need to figure out
>> what policies should be expressible and which shouldn't be.
>>
>> Sincerely,
>> Watson Ladd
>>
>> --
>> Astra mortemque praestare gradatim
>>
>> --
>> Wimse mailing list -- wimse@ietf.org
>> To unsubscribe send an email to wimse-leave@ietf.org
>
