Re: [Id-event] Subject Identifiers - Working Group Last Call
Aaron Parecki <aaron@parecki.com> Tue, 15 March 2022 15:11 UTC
MIME-Version: 1.0
References: <CAD9ie-uSbNHq=Mt3ohA=URf5rv2hz7YUdUMhOf80C_f=XBrGLA@mail.gmail.com> <36D66A89-D178-6047-B270-73AD540E7FAD@hxcore.ol> <81b58b05-97b6-d910-6b58-4b565ae6ea57@free.fr> <8330bc8a-d0f5-686e-1073-84e8bf83a294@free.fr> <BD4BD998-171C-49C8-B495-E5A7B3CE2448@amazon.com> <99121270-5eac-fab6-9aa6-d4e0ed0b734d@free.fr> <57676B78-2301-431A-A068-C60CCBED2338@amazon.com> <82673afd-ee5d-50b9-48d7-496b072e7927@free.fr> <D0596A7A-C838-40E6-93EF-6D4A1CB46F05@amazon.com> <56dac4b6-75b9-08fe-46d1-bd8a7e883c76@free.fr>
In-Reply-To: <56dac4b6-75b9-08fe-46d1-bd8a7e883c76@free.fr>
From: Aaron Parecki <aaron@parecki.com>
Date: Tue, 15 Mar 2022 15:10:50 +0000
X-Gmail-Original-Message-ID: <CAGBSGjoph8dvdd+2zSAGKrnoP4VLYZStAUNvEza5BZ7x9_Y-=Q@mail.gmail.com>
Message-ID: <CAGBSGjoph8dvdd+2zSAGKrnoP4VLYZStAUNvEza5BZ7x9_Y-=Q@mail.gmail.com>
To: Denis <denis.ietf@free.fr>
Cc: "Backman, Annabelle" <richanna@amazon.com>, Dick Hardt <dick.hardt@gmail.com>, Marius Scurtescu <marius.scurtescu@coinbase.com>, "Richard Backman, Annabelle" <richanna=40amazon.com@dmarc.ietf.org>, Roman Danyliw <rdd@cert.org>, SecEvent <id-event@ietf.org>, Yaron Sheffer <yaronf.ietf@gmail.com>
Content-Type: multipart/alternative; boundary="0000000000007fd09505da43344f"
Archived-At: <https://mailarchive.ietf.org/arch/msg/id-event/x0i5DprG5O7xfAG9nCTTJo_fbNY>
Subject: Re: [Id-event] Subject Identifiers - Working Group Last Call
The abstract states

> and named formats that define the syntax and semantics *for encoding* subject identifiers as JSON objects

(emphasis mine)

"semantics for encoding a subject identifier" is very different from "semantics of a subject identifier" which is where your confusion is coming from. There is no contradiction.

Aaron

On Tue, Mar 15, 2022 at 2:56 PM Denis <denis.ietf@free.fr> wrote:

> Hello Annabelle,
>
> Hi Denis,
>
> We may be talking past one another a bit here, so let me step back and try to state some things clearly. Within this draft, the word "subject" includes many, many different kinds of things. It is not a synonym for "user". Subjects may not have any clear relationship to an individual person or a group of people. Of course, a subject may be a user, or person, or group, or some other thing related to people.
>
> Agreed.
>
> But as stated in the draft, the *subject type* is out of scope.
>
> The text states in section 3.1:
>
>    Identifier Formats define how to encode identifying information for a subject.
>    They do not define the type or nature of the subject itself.
>
> However, the abstract states:
>
>    This specification formalizes the notion of subject identifiers
>    as structured information that describe a subject, and named formats
>    that define the *syntax* and *semantics* for encoding subject
>    identifiers as JSON objects.
>
> Two words are important in that sentence: "*semantics*" and "*syntax*".
>
> This means that what is called a "format" should define on one side:
>
> - the semantics of the object identifier and on another side
> - the syntax of the object identifier.
>
> The text states later on within section 3:
>
>    A Subject Identifier MUST conform to a specific Identifier Format, (...).
>
> Section 3.1 states:
>
>    Identifier Formats define how to encode identifying information for a subject.
>    They do not define the type or nature of the subject itself.
>
> We are now in the core of the debate.
> > The abstract states that subject identifiers define both *the semantics* > and the syntax of a subject identifier > while section 3.1 states that they do not define the semantics of a > subject identifier, i.e. the type or nature > of the subject identifier itself. This is contradictory. > > Some subject identifiers (and subject identifier formats) may be > appropriate for some types of subjects but not others. > For example, while I agree that a latitude and longitude coordinate > doesn't make sense as an identifier for a user, > it does make sense as an identifier for a plot of land. Likewise, an IP > address is a perfectly reasonable identifier to use > when the subject is the IP address itself (e.g., in an audit log of DHCP > lease changes), or for a node on an IP-based network. > > If subject identifiers were fully opaque, they would be of little use. > Hence they need to be structured to be able to make a difference > at a first level of granularity between: > > - those associated with a single individual, and > - those that relate to a group (i.e. to one or more individuals). > > Whether or not a subject represents an individual or a group is a property > of the *subject itself*, and is not generally required in order to > identify the subject. > Since your example use case uses JWTs, this information can and should be > included as a separate claim within the JWT. There are several benefits to > this: > > > 1. It is semantically correct, as JWTs are intended to encapsulate > claims about a subject. > 2. It provides privacy advantages, as protocols like OIDC provide > mechanisms for clients to request access to different claims. > Embedding this information within the subject identifier implies it is > necessary to understanding the identifier. > 3. It avoids adding something to subject identifiers that only really > applies to a subset of subject types. > 4. 
It avoids the complicated question of how to interpret these values > in the context of different subject types. > For example, one implementer might consider a subject identifier that > identifies a single POSIX group itself > (i.e., the *group*, not the* members of* the group) to be "individual" > because it represents one single thing, > while another implementer might consider it to be "group" because the > one single thing it represents is literally a group. > > Section 2.4.of the GNAP draft (Identifying the User) states: > > If the client instance knows the identity of the end user through one > or more identifiers or assertions, the client instance MAY send that > information to the AS in the "user" field. The client instance MAY > pass this information by value or by reference. > > sub_ids (array of objects): An array of subject identifiers for the > end user, as defined by [I-D.ietf-secevent-subject-identifiers]. > OPTIONAL. > > assertions (array of objects) An array containing assertions as > objects each containing the assertion format and the assertion > value as the JSON string serialization of the assertion. > OPTIONAL. > > Assertions could certainly be used, but at the moment I am not aware of an > IETF RFC that would allow to support some form of interoperability > for subject identifiers or for group memberships. > > The SEC-EVENT draft is an opportunity to standardize some of them so that > they can be used in an interoperable fashion. > > One sentence is that section is currently: > > For example, the entity to which the identifiers are presented now > knows that both identifiers relate to the same subject, > and may be able to correlate additional data based on that. > > Such sentence is inaccurate because it depends upon the type of subject > identifier that is being received. 
> > Section 6.1 addresses correlation risks specifically arising from > including multiple subject identifiers within the same context, as this is > something the draft enables > (via the `aliases` format, and by introducing the `sub_id` JWT claim and > permitting its use alongside `sub`). While including a subject identifier > with other information > more generally may introduce correlation risks, those risks are highly > context-dependent, and I am not sure that there is any sensible advice to > be given in this draft. > However, I can add something to the effect of "implementers must consider > such risks, and specs that use subject identifiers must provide appropriate > privacy considerations of their own." > > Your argumentation would be valid if the semantics of the subject > identifier would be out of the scope of the draft, but unfortunately this > is not the case. > > Another valuable feature will be, simply, by looking at the structure of a > subject identifier to know whether correlation of user accounts will > or will not be possible. An auditor of an audit trail would be in a > position to know it easily and hence to assess under which conditions a RS > will or will not be in a position to correlate user accounts with another > RS. At the moment, it is impossible. > > This assumes that the issuer of the subject identifier is willing to > reveal that information to the consumers of that subject identifier. > > This assumes that the client is willing to ask to the AS to use or > generate such a subject identifier (if may or may not be able to support > it). > > Further, since subjects may be correlated via information that is not part > of the subject identifier, any "non-correlatable" flag within the subject > identifier > would be insufficient to answer the question of whether the subject can be > correlated. 
> > The question is not whether other information can allow such correlation, > but whether such correlation will or will not be possible > only by taking advantage of the content of a SINGLE subject identifier. > > I have identified two classes of user accounts: long term and temporary. > > The processing made by the RS will be different whether the subject > identifier is a long term or a temporary subject identifier. > Hence a distinction first needs to be made in the structure of the subject > identifier to distinguish between these two classes. > > My individual draft provides an example for that processing. > > If the intent is to indicate whether or not the *user account* is long > term or temporary, then we are again talking about a property of the > *subject*, > which is better represented as a separate JWT claim, for all the reasons > stated above. > > In the JWT profile (RFC 9068), there is a a claim called "sub". > > The "Privacy considerations" section states: > > This profile mandates the presence of the "sub" claim in every JWT > access token, making it possible for resource servers to rely on that > information for correlating incoming requests with data stored locally > for the authenticated principal. > > We cannot change any more the semantics of the "sub" claim which is very > general and which, by itself, does not allow to know whether or not > some correlation will be possible. > > On the contrary, the "sub-id" claim would be able to let the client > control whether or not some correlation will be possible by the RSa. > > I was not able to find any specific examples of how a processor might > change its behavior based on this information. > > If you take a look at my draft, the processing by a RS of a sub-claim > which contains a long term or a temporary subject identifier will be rather > different. 
> > I think in most cases, they would operate the same way – operationally > speaking, a short-term account is little different from a dormant long-term > account. > > From what you've written, it seems your goal is to allow users to control > whether or not an RS can correlate the end user's activity across multiple > sessions. > > This is one of the goals, but not the single goal. Introducing the support > of group memberships is another goal. > > That problem is much larger than subject identifiers and cannot be solved > at that level. > > It could be solved in this draft. > > Many different kinds of claims may be used to correlate activity beyond > those used to identify the subject. > > As said earlier, the question is not whether other information can allow > such correlation, but whether such correlation will or will not be possible > simply by looking at the content of a SINGLE subject identifier. > > > The RS needs to be denied access to these claims or provided with masked > or surrogate values. > > ?!? > > It is also likely that the end user does not want the RS to know that they > are providing non-correlated values, > as that would allow the RS to modify its behavior to attempt to block > access or force the user to provide legitimate values. > In such a case, including a flag in the subject identifier would undermine > the user's control. > > If the subject identifier contains an email address, the RS will indeed > know that correlation is likely to be possible. > When the subject identifier contains a Type 1 identifier, the RS will need > to recognize it otherwise, it cannot to process the request correctly. > > At present, I don't see any justification for including individual/group > or correlatability information as part of the subject identifier structure. > > The following "Formats" are currently being defined in the draft: > > 3.2.1. Account Identifier Format > 3.2.2. Aliases Identifier Format > 3.2.3. 
Decentralized Identifier (DID) Format
>    3.2.4. Email Identifier Format
>    3.2.5. Issuer and Subject Identifier Format
>    3.2.6. Opaque Identifier Format
>    3.2.7. Phone Number Identifier Format
>
> Some other useful "formats" should be added, like functional group memberships, roles and hierarchical group memberships.
> Is there a rational for not adding these ?
>
> Let us now focus on section 3.2.5 about "Issuer and Subject Identifier Formats". The text states:
>
>    The Issuer and Subject Identifier Format identifies a subject using
>    a pair of "iss" and "sub" members, analogous to how subjects
>    are identified using the "iss" and "sub" claims in OpenID Connect
>    [OpenID.Core] ID Tokens.
>
> The *syntax* is currently : a pair of "iss" and "sub" members. *Such syntax may be associated with different semantics*.
>
> A client should be able to ask to an AS to deliver (if it can do it) into a JWT a subject identifier associated with a user among five possibilities:
>
> (1) a user identifier unique for each user/ RS pair
> (2) a user identifier unique for each AS / RS pair
> (3) a user identifier unique for the AS whatever RS being involved,
> (4) a short-term user identifier unique for the AS,
> (5) a globally unique user identifier (where the uniqueness is independent from the AS).
>
> Let me give an example for each of them:
>
> (1) a user identifier unique for each user/ RS pair
>
>    {
>      "format": "UID-type 1",
>      "iss": "http://issuer.example.com/",
>      "sub": "145234573"
>    }
>
> (2) a user identifier unique for each AS / RS pair
>
>    {
>      "format": "UID-type 2",
>      "iss": "http://issuer.example.com/",
>      "sub": "145234573"
>    }
>
> (3) a user identifier unique for an AS whatever RS being involved,
>
>    {
>      "format": "UID-type 3",
>      "iss": "http://issuer.example.com/",
>      "sub": "145234573"
>    }
>
> (4) a short-term user unique identifier (dependent from the AS)
>
>    {
>      "format": "UID-type 4",
>      "iss": "http://issuer.example.com/",
>      "sub": "145234573"
>    }
>
> (5) a globally unique user identifier (independent from the AS),
>
>    {
>      "format": "GUID-email",
>      "syntax": "email",
>      "email": "user@example.com"
>    }
>
> I will however publish an update with the additional privacy language I mentioned above.
>
> Before doing it, please take a look at RFC 9068: JWT Profile for OAuth 2.0 Access Tokens, on page 11 at the "Privacy Considerations" section.
>
> Denis
>
> --
> Annabelle Backman (she/her)
> richanna@amazon.com
>
> On Mar 12, 2022, at 5:52 AM, Denis <denis.ietf@free.fr> wrote:
>
> *CAUTION*: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
>
> Annabelle,
>
> Thank you for your second email.
>
> Rather than responding between the lines, I have constructed a global reply taking into consideration your argumentation.
>
> You wrote:
>
>    Can you provide examples where it is critical to have this information encoded within the identifier data structure itself?
>    Under what circumstances would a consumer of a subject identifier change their behavior based on this information?
> > The rational of the proposal is related to the case where JWTs are > exchanged between a client and a RS (i.e. not between an AS and a RS). > This is certainly not the single case to be considered, but it is an > important case. > > When an access token is received by a RS, it may contain one or more > subject identifiers. > > They allow, when necessary, to trace the actions that have been performed > while using a JWT that contained these subject identifiers. > These subject identifiers may be placed into an audit trail, for example, > in order to be associated with an action that has taken place. > If subject identifiers were fully opaque, they would be of little use. > Hence they need to be structured to be able to make a difference > at a first level of granularity between: > > - those associated with a single individual, and > - those that relate to a group (i.e. to one or more individuals). > > Such a difference relates to what I wrote in my original email from last > year: > > In order to be able to make the difference, an *optional class* attribute > should be defined which may take one out of two values: > > > - "*ind*" to indicate an individual identifier or > - "*grp" *to indicate a group identifier. > > > > *Subject identifiers associated with a single individual * > As explained in my individual draft, a subject identifier that relates to > an individual may disclose more or less information > that allows RSs to link their user accounts. The choice between these > various types of identifiers may be done by the end user > or/and by the client. > > Depending upon the level of concern (or knowledge) of the individual as > regard to his/her privacy and what is supported > by the underlying technology, I have identified up to five possible > choices for the individual. > > Now, let us come to your question: > > Under what circumstances would a consumer of a subject identifier > change their behavior based on this information? 
> > I have identified two classes of user accounts: long term and temporary. > > The processing made by the RS will be different whether the subject > identifier is a long term or a temporary subject identifier. > Hence a distinction first needs to be made in the structure of the subject > identifier to distinguish between these two classes. > > My individual draft provides an example for that processing. > > Secondly, when a match needs to be done by a RS between subject > identifiers received in different JWTs, it needs to be done > using the same type of subject identifiers, hence including that type in > the structure is necessary. > > Let us now consider the point of view of the client and raise the > following question: > > Under what circumstances would a client change its behavior based > on this information? > > If a client has been asking for a type X of subject identifier in a JWT > and is then able to discover that the JWT contains instead > a type Y of subject identifier, then the client SHALL NOT transmit the JWT > to the RS, because the privacy of the end user might be impacted. > > You wrote: > > The Privacy Considerations section is intended to address the > correlation risk generally. > The JWT case is mentioned only as an example. Any suggestions on > how to make that more clear? > > It is not a matter to make it more clear based on the current content of > the current draft. > One sentence is that section is currently: > > For example, the entity to which the identifiers are presented now > knows that both identifiers relate to the same subject, > and may be able to correlate additional data based on that. > > Such sentence is inaccurate because it depends upon the type of subject > identifier that is being received. > > It is a matter of exposing the privacy concerns of the end-user and how > they may be addressed using one of the five types of subject identifiers. 
> > I mean that it will be possible to revise this section once the five types > of subject identifiers will have been incorporated into the document. > > Another valuable feature will be, simply, by looking at the structure of a > subject identifier to know whether correlation of user accounts will > or will not be possible. An auditor of an audit trail would be in a > position to know it easily and hence to assess under which conditions a RS > will or will not be in a position to correlate user accounts with another > RS. At the moment, it is impossible. > > > > > *Subject identifiers that relate to a group * > It will be valuable, simply by looking at the structure of a subject > identifier to know that is it related to a group (and not to a single > individual) > and to which kind of group (e.g. hierarchical, functional, or a role). > > This relates to my original email from last year where I wrote: > > It would be useful to define one format for these common > groups: one or more character strings separated by the character slash > for both hierarchical (*hgrp*) and functional group > memberships (*fgrp*) and roles (*role*): > > > > *Other replies related to your original email * > You wrote: > > From what you've shared so far, a few things jump out that make me > think it would not be appropriate to include this as a core property within > subject identifier formats: > The description assumes the subjects being identified are > users/accounts and that the subject identifier is being exchanged between > an AS and RS. > This is by far not the only use case for subject identifiers. > > As said earlier, this is certainly not the single case to be considered, > but it is an important case. > > You wrote: > > Non-correlation only really makes sense for opaque, surrogate > identifiers like UUIDs. > > Non-correlation does not only apply to fully opaque identifiers. 
> > You wrote: > > How do you prevent correlation if your identifier format is an IP > address, phone number, government-issued ID number, domain name, > latitude/longitude, street address, etc.? (Note that if the local > part of an email address can be understood as an opaque, surrogate > identifier > if the issuer of the subject identifier controls the email address > domain, for example, emails generated by Apple's Hide my Email feature). > > The standardization community is working taking into consideration roughly > the ISO model, where the application layer is addressed independently > from the transport or network layer. Hiding an IP address is not a concern > for the application layer and hence for the content of a JWT. > This concern can be addressed using specific techniques. > > There exist use cases where an individual can use his user account on a RS > without disclosing a phone number, a government-issued ID number, > a domain name, a latitude/longitude (!) or a street address. The list you > provide is an example of such "end user attributes". > > Note that an email address may be used as a subject identifier if the AS > incorporates it in a "globally unique user identifier", i.e. a type 4 > subject identifier > in my contribution ... and if the end-user is indeed accepting or willing > to use such "globally unique user identifier". > > Let us now finish to address the arguments raised in your first email. > > Certainly there are systems out there that issue identifiers for > individuals (i.e., users) and groups that may collide with one another > (e.g., any system that uses 0-based SQL auto-incrementing integers > for its identifiers). However, I think it is rare for such identifiers > to be used in cases where interoperability is important, and *where > the type of the subject is not made clear from context* > (e.g., a "GetGroupMembers" API would expect an identifier for a > group, not a user). 
Further, I suspect any such system that did need > to disambiguate between individuals and groups within the identifier > itself would likely need to disambiguate between other types of subjects > as well, e.g., hosts, documents, various resources provided by the > service, etc. As such, this proposal would not solve their problem, and > they would be better off defining their own subject identifier > format for their use case. > > You say that "there are systems out there that issue identifiers for > individuals (i.e., users) and groups that may collide with one another". > > It is not because that there may exist some systems that are badly > designed that we should not encourage to build system on good foundations. > The type of the subject identifier cannot always made clear from context > but it can be clear from the content of the JWT. > > Note also that, when an auditor takes a look at the content of an audit > trail, the "context" has been lost and hence he/she may only understand > the semantics of a subject identifier by looking at its internal structure. > > You also wrote "they would be better off defining their own subject > identifier format for their use case." > > Within the IETF, one of the objectives is interoperability and as such it > can only be achieved using standard track RFCs > rather than by defining subject identifier formats for > application-specific use cases, in a non-interoperable way. > > Denis > > It would be valuable to be able to make a difference between these five > types of user identifiers. > > Can you provide examples where it is critical to have this information > encoded within the identifier data structure itself? Under what > circumstances would a consumer of a subject identifier change their > behavior based on this information? > > From what you've shared so far, a few things jump out that make me think > it would not be appropriate to include this as a core property within > subject identifier formats: > > > 1. 
The description assumes the subjects being identified are > users/accounts and that the subject identifier is being exchanged between > an AS and RS. This is by far not the only use case for subject identifiers. > > 2. Non-correlation only really makes sense for opaque, surrogate > identifiers like UUIDs. How do you prevent correlation if your identifier > format is an IP address, phone number, government-issued ID number, domain > name, latitude/longitude, street address, etc.? (Note that if the local > part of an email address can be understood as an opaque, surrogate > identifier if the issuer of the subject identifier controls the email > address domain, for example, emails generated by Apple's Hide my Email > feature). > > > — > Annabelle Backman (she/her) > richanna@amazon.com > > > > > On Mar 10, 2022, at 3:27 AM, Denis <denis.ietf@free.fr> wrote: > > *CAUTION*: This email originated from outside of the organization. Do not > click links or open attachments unless you can confirm the sender and know > the content is safe. > > Hello Annabelle, > > I am glad to be able to exchange with you for the very first time. > > I am currently rather busy and I don't have the time available for a > detailed response. > > Nevertheless, I browsed through your comments and I picked one of them: > > > A subject identifier *type* attribute would be able to support four > values: *guid*, *shared*, *unique* and *tmp*. > > I'm not sure I'm following what you're intending to represent with this, > and what problem you're trying to solve. > > Last August, I have posted a draft, that has expired, but that you can > still find at: > > > https://datatracker.ietf.org/doc/html/draft-pinkas-gnap-core-protocol-00.html > > If you have some time available, please take a look at section 1.7. called > "Short term and long term user accounts", > where you will get some information. In particular, the following text. 
>
> The four types used in the context of long-term user accounts managed by a RS are:
>
> (1) a unique user identifier used to identify a user for each User / RS pair, or
>
> Note: this option cannot be implemented in the context of a "software-only" solution.
> It requires the use, by the end-user, of a secure element with specific security
> properties. [This option is not detailed any further at the moment].
>
> (2) a unique user identifier used to identify a user for each AS / RS pair, or
>
> (3) a locally unique user identifier used to identify a user whatever RS is being involved, or
>
> (4) a globally unique user identifier.
>
> The last type used in the context of short-term user accounts managed by a RS is:
>
> (5) a short-term user unique identifier.
>
> It would be valuable to be able to make a difference between these five
> types of user identifiers.
>
> *Note*: the draft was posted a few months after my original comment,
> hence at the time I made my original post
> my ideas were not yet fully stabilized. Now, they are!
>
> Denis
>
> On Mar 9, 2022, at 9:31 AM, Denis <denis.ietf@free.fr> wrote:
>
> ...
>
> While this statement is correct, it should be remembered that the title of
> this document is:
>
> "Subject Identifiers for Security Event Tokens"
>
> and is not:
>
> "Subject Identifiers *Formats* for Security Event Tokens".
>
>
> The draft formalizes the concept of a "Subject Identifier", and provides a
> standard way to represent those as structured data. Therefore I think the
> current name remains appropriate.
>
> *1. Granularity of the identification*
>
> A subject identifier may be able to identify an entity either individually
> or as a member of a group.
>
> In order to be able to make the difference, an *optional class* attribute
> should be defined which may take one out of two values:
>
> - "*ind*" to indicate an individual identifier or
> - "*grp*" to indicate a group identifier.
>
> Whether or not a subject is an individual or group is a property of the
> subject itself, not (generally speaking) a property of the subject
> identifier. Subject identifiers are not general purpose containers for
> claims about a subject – we already have JWTs for that. 😀
>
> Certainly there are systems out there that issue identifiers for
> individuals (i.e., users) and groups that may collide with one another
> (e.g., any system that uses 0-based SQL auto-incrementing integers for its
> identifiers). However, I think it is rare for such identifiers to be used
> in cases where interoperability is important, and where the type of the
> subject is not made clear from context (e.g., a "GetGroupMembers" API would
> expect an identifier for a group, not a user). Further, I suspect any such
> system that did need to disambiguate between individuals and groups within
> the identifier itself would likely need to disambiguate between other types
> of subjects as well, e.g., hosts, documents, various resources provided by
> the service, etc. As such, this proposal would not solve their problem, and
> they would be better off defining their own subject identifier format for
> their use case.
>
>
> *2. Correlation operations that may be either performed or prevented*
>
> Currently, the Privacy Considerations section only addresses one
> correlation case where a JWT would have both "sub" and "sub_id" JWT claims.
> While it is appropriate to mention such a case, other correlation cases
> exist.
>
>
> The Privacy Considerations section is intended to address the correlation
> risk generally. The JWT case is mentioned only as an example. Any
> suggestions on how to make that more clear?
>
> These cases become visible if the "sub_id" contains an *optional* subject
> identifier *type* attribute.
>
> A subject identifier *type* attribute would be able to support four
> values: *guid*, *shared*, *unique* and *tmp*.
>
>
> I'm not sure I'm following what you're intending to represent with this,
> and what problem you're trying to solve. Any subject identifier transmitted
> from one party to another is by definition "shared". Once the recipient
> receives that identifier, the transmitter has no programmatic control over
> how it is used – that's the realm of *legal* contracts, not API
> contracts.
>
> A transmitter that wishes to prevent Recipient A from using the
> transmitter's subject identifiers to correlate records with Recipient B may
> be able to do so by issuing directed identifiers that are unique per
> subject+recipient pair. A system that does so is essentially immune to this
> correlation risk; it is not clear to me what value there is in advertising
> this fact within the subject identifier.
>
> *3. Hierarchical group memberships, functional group memberships and
> roles.*
>
> Examples of group identifiers are: hierarchical group memberships,
> functional group memberships and roles.
> It would be useful to define one format for these common groups: one or
> more character strings separated by the slash character,
> for both hierarchical (*hgrp*) and functional group memberships (*fgrp*)
> and roles (*role*):
>
>
> Hierarchical groups and functional groups are both subject types, not
> subject identifier types. I can imagine something like a `path` subject
> identifier format that contains an ordered list of scalar values describing
> a path within a graph. That graph could be a filesystem directory tree, an
> org chart, a computer network, etc.
>
> It might look something like:
>
> {
>   "format": "path",
>   "path": ["usr", "local", "bin", "sha512sum"]
> }
>
> or
>
> {
>   "format": "path",
>   "path": ["Example University", "Faculty", "Computer Science", "Ada Lovelace"]
> }
>
>
> Note that the nature of the graph, and whether or not the subject
> identifier identifies the entire path itself or just the final node would
> depend on the context in which the identifier appears.
>
> While it is an interesting thought exercise, unless someone has a use case
> for this kind of subject identifier format I don't think it should go in
> this draft. It can always be defined by someone later, if needed.
>
> --
> Annabelle Backman (she/her)
> richanna@amazon.com
>
>
> On Mar 9, 2022, at 9:31 AM, Denis <denis.ietf@free.fr> wrote:
>
> *CAUTION*: This email originated from outside of the organization. Do not
> click links or open attachments unless you can confirm the sender and know
> the content is safe.
>
> Dick,
>
> As a response to your inquiry, I repost the original email sent on
> 27/05/2021 at 19:42 (Paris local time) to the same recipients.
>
> Denis
>
>
> I believe that this document should be enhanced on three aspects that are
> currently not addressed.
>
> Section 3.1 (Identifier Formats versus Principal Types) states:
>
> Identifier Formats define how to encode identifying information for a
> subject. They do not define the type or nature of the subject itself.
>
> While this statement is correct, it should be remembered that the title
> of this document is:
>
> "Subject Identifiers for Security Event Tokens"
>
> and is not:
>
> "Subject Identifiers *Formats* for Security Event Tokens".
>
> Therefore it would be possible to add *optional* attributes/characteristics
> to Subject Identifiers which relate to two different topics:
>
> - the granularity of the identification and
> - the correlation operations that may be either performed or
>   prevented using some values contained in a specific format.
>
> *1. Granularity of the identification*
>
> A subject identifier may be able to identify an entity either individually
> or as a member of a group.
>
> In order to be able to make the difference, an *optional class* attribute
> should be defined which may take one out of two values:
>
> - "*ind*" to indicate an individual identifier or
> - "*grp*" to indicate a group identifier.
>
> Examples:
>
> "format": "email",
> "class": "ind",
> "email": "tom.jones@example.com"
>
> "format": "email",
> "class": "grp",
> "email": "marketing@example.com"
>
>
> *2. Correlation operations that may be either performed or prevented*
>
> Currently, the Privacy Considerations section only addresses one
> correlation case where a JWT would have both "sub" and "sub_id" JWT claims.
>
> While it is appropriate to mention such a case, other correlation cases
> exist.
>
> These cases become visible if the "sub_id" contains an *optional* subject
> identifier *type* attribute.
>
> A subject identifier *type* attribute would be able to support four
> values: *guid*, *shared*, *unique* and *tmp*.
>
>

--
---
Aaron Parecki
https://aaronparecki.com
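
For readers following the thread, the "directed identifiers that are unique per subject+recipient pair" mentioned in the quoted discussion could be sketched roughly as below. This is only an illustration under stated assumptions, not something taken from the draft or from any participant's implementation: the function name `directed_identifier`, the secret key, the use of HMAC-SHA-256, and the choice of an "opaque"-style output with an "id" member are all hypothetical.

```python
import hashlib
import hmac


def directed_identifier(secret: bytes, subject: str, recipient: str) -> dict:
    """Derive an identifier that is stable for one (subject, recipient) pair.

    Hypothetical helper: the transmitter keeps `secret` private, so two
    recipients cannot link their identifiers for the same subject.
    """
    parts = (subject.encode("utf-8"), recipient.encode("utf-8"))
    # Length-prefix each field so ("ab", "c") and ("a", "bc") never collide.
    message = b"".join(len(p).to_bytes(4, "big") + p for p in parts)
    digest = hmac.new(secret, message, hashlib.sha256).hexdigest()
    # Assumed output shape: an opaque identifier carrying no class/type hints.
    return {"format": "opaque", "id": digest}


if __name__ == "__main__":
    key = b"transmitter-private-key"  # placeholder value for the example
    print(directed_identifier(key, "user-1234", "rs-a.example"))
    print(directed_identifier(key, "user-1234", "rs-b.example"))
```

The two printed identifiers differ even though the subject is the same, which is the property the thread describes; nothing in the identifier itself needs to advertise that fact.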