Re: [scim] SCIM Protocol - 3 suggestions for improvement

prateek mishra <prateek.mishra@oracle.com> Thu, 09 August 2012 21:41 UTC

Message-ID: <50242E5B.40500@oracle.com>
Date: Thu, 09 Aug 2012 17:40:43 -0400
From: prateek mishra <prateek.mishra@oracle.com>
Organization: Oracle Corporation
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:14.0) Gecko/20120713 Thunderbird/14.0
MIME-Version: 1.0
To: scim@ietf.org
References: <CAOEeopgkEs9Z8WT_3kNw=owhL+g6JM8jmkS2f50pFFPrLt4Fbw@mail.gmail.com> <56C3C758F9D6534CA3778EAA1E0C34373302493F@BY2PRD0410MB354.namprd04.prod.outlook.com> <CAOEeopgVDq4L_fefJO0h+AeJxRNAdyL6QKxK=ewRGwX-OqeA+A@mail.gmail.com> <DF63ACC82673DB40A7AAC08FFA71DFBD27416E0B@AMXPRD0610MB353.eurprd06.prod.outlook.com> <56C3C758F9D6534CA3778EAA1E0C343733024BFD@BY2PRD0410MB354.namprd04.prod.outlook.com> <CAOEeopji6-x_58PG+vaXWkQUJPiq8aFVX0ApXya0dxKGa0P4qQ@mail.gmail.com> <20120809212548.BFA4D21F8605@ietfa.amsl.com>
In-Reply-To: <20120809212548.BFA4D21F8605@ietfa.amsl.com>
Content-Type: multipart/alternative; boundary="------------050002080609040807060102"
X-Source-IP: acsinet21.oracle.com [141.146.126.237]
Cc: "Diodati,Mark" <Mark.Diodati@gartner.com>
Subject: Re: [scim] SCIM Protocol - 3 suggestions for improvement

Mark -

What is the basis of these strong opinions? I don't see any evidence or 
support for your statements in your message below.

I am a member of this working group, and I don't agree with you.

This is an IETF mailing list meant for discussion of an emerging 
technology; it's possible you are not familiar with this type of discussion.
In that case, I encourage you to educate yourself in this area.

Thanks,
prateek
>
> Hi Ganesh,
>
> --It would be good to hear the spectrum of opinions ...
>
> My assessment is that your proposal adds complexity, strays from the 
> initial simplicity goals of SCIM, and hinders service provider 
> adoption (which is crucial to the success of SCIM). I speak only for 
> myself, but in my conversations with others in the WG, no one 
> supported your proposal.
>
> Mark
>
> Mark Diodati
>
> Research Vice President | Gartner
>
> phone: +1 312.238.9877
>
> blog: http://blogs.gartner.com/mark-diodati
>
> twitter: mark_diodati
>
> *From:*Ganesh and Sashi Prasad [mailto:g.c.prasad@gmail.com]
> *Sent:* Thursday, August 09, 2012 4:15 PM
> *To:* Kelly Grizzle
> *Cc:* scim@ietf.org; Emmanuel Dreux
> *Subject:* Re: [scim] SCIM Protocol - 3 suggestions for improvement
>
> > storing this information in a mapping table outside of the SCIM spec 
> > is a great way to enable this solution.  Part of the key here is that 
> > SCIM is just a piece of the architecture for this solution, and is 
> > only responsible for the transport layer between domains.
>
> I wasn't suggesting that the mapping table be part of the SCIM spec. I 
> provided that example to illustrate that splitting and merging 
> identities is a common requirement, and that decoupling local 
> identifiers within a domain from shared identifiers between domains 
> was the best way to facilitate it.
>
> I'm suggesting that the spec do less, not more.
>
> What the SCIM spec needs to do there is just refrain from introducing 
> tight coupling. I would like to see a single identifier exposed 
> through the API, with the implication (and perhaps the recommendation) 
> that it be the shared one. Allowing one domain to expose its internal 
> identifier to the other creates tight coupling and ensures that both 
> domains need to split or merge identities simultaneously, which is not 
> desirable. So I recommend _taking out_ the "external id" field from 
> the API. The spec shouldn't encourage tight coupling. If clients want 
> to pass in their internal ids as part of the resource body, no one can 
> stop them, and they can always do a search on that attribute to 
> retrieve the URI exactly as you visualise they will with the "external 
> id", but let's not elevate an anti-pattern to a recommendation by 
> enshrining the "external id" as an acceptable attribute.
>
> Am I making sense?
>
> > Regarding unique identifiers for multi-valued attributes there is a 
> > trade-off involved.  On one hand this makes PATCH semantics easier.  
> > On the other hand it puts extra burden on service providers.
>
>
> Precisely. The spec has to strike the right balance. It would be 
> interesting to hear from the other members of the spec mailing list. 
> You know where I stand on this. It would be good to hear the spectrum 
> of opinions.
>
> Regards,
>
> Ganesh
>
> On 10 August 2012 00:28, Kelly Grizzle <kelly.grizzle@sailpoint.com> wrote:
>
> Thanks Emmanuel.  I had started writing up a similar response.  As you 
> suggest, storing this information in a mapping table outside of the 
> SCIM spec is a great way to enable this solution.  Part of the key 
> here is that SCIM is just a piece of the architecture for this 
> solution, and is only responsible for the transport layer between domains.
>
> You could also model these ID mappings in the SCIM user as an 
> extension but would probably not want to expose these externally.  
> Here is an example of how to model the end state of the false positive 
> scenario (splitting a user):
>
> | Internal Entity ID | External Domain ID | External Entity ID | Primary flag |
> | 9caf78aac3d6       | D2                 | ff487230b3a0       | true         |
> | a99a5feba839       | D2                 | 7a87f27c1dd8       | true         |
>
> This could be represented as two SCIM users that contain information 
> about the entities on other domains.
>
> {
>   "schemas": ["urn:scim:schemas:core:1.0",
>               "urn:scim:schemas:extension:federation:1.0"],
>   "id": "9caf78aac3d6",
>   "userName": "John Smith",
>   "urn:scim:schemas:extension:federation:1.0": {
>     "linkedUsers": [
>       {
>         "domain": "D2",
>         "externalEntityId": "ff487230b3a0"
>       }
>     ]
>   }
> }
>
> {
>   "schemas": ["urn:scim:schemas:core:1.0",
>               "urn:scim:schemas:extension:federation:1.0"],
>   "id": "a99a5feba839",
>   "userName": "John Smith",
>   "urn:scim:schemas:extension:federation:1.0": {
>     "linkedUsers": [
>       {
>         "domain": "D2",
>         "externalEntityId": "7a87f27c1dd8"
>       }
>     ]
>   }
> }
>
> In the second user, the linkedUsers attribute would be empty until the 
> split user was synced to domain 2.
>
> Similarly, the false negative use case (merging two users) looked like 
> this at the end:
>
> | Internal Entity ID | External Domain ID | External Entity ID | Primary flag |
> | 9caf78aac3d6       | D2                 | ff487230b3a0       | true         |
> | 9caf78aac3d6       | D2                 | 41206cc97c8b       | false        |
>
> This could be represented with the following SCIM user:
>
> {
>   "schemas": ["urn:scim:schemas:core:1.0",
>               "urn:scim:schemas:extension:federation:1.0"],
>   "id": "9caf78aac3d6",
>   "userName": "John Smith",
>   "urn:scim:schemas:extension:federation:1.0": {
>     "linkedUsers": [
>       {
>         "domain": "D2",
>         "externalEntityId": "ff487230b3a0"
>       },
>       {
>         "domain": "D2",
>         "externalEntityId": "41206cc97c8b",
>         "deletionRequired": true
>       }
>     ]
>   }
> }
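
A minimal Python sketch of how mapping-table rows like the ones above could be turned into this kind of SCIM user. The federation extension and its `linkedUsers`, `externalEntityId` and `deletionRequired` attributes come from Kelly's example; the function name, the row field names, and the rule that every non-primary row is flagged `deletionRequired` are assumptions made for illustration only.

```python
import json

CORE_SCHEMA = "urn:scim:schemas:core:1.0"
FEDERATION_SCHEMA = "urn:scim:schemas:extension:federation:1.0"


def to_scim_user(internal_id, user_name, mapping_rows):
    """Build a SCIM user dict whose federation extension lists linked entities."""
    linked = []
    for row in mapping_rows:
        if row["internal_id"] != internal_id:
            continue
        entry = {"domain": row["domain"], "externalEntityId": row["external_id"]}
        # Assumption: a non-primary link is one awaiting deletion in the other domain.
        if not row["primary"]:
            entry["deletionRequired"] = True
        linked.append(entry)
    return {
        "schemas": [CORE_SCHEMA, FEDERATION_SCHEMA],
        "id": internal_id,
        "userName": user_name,
        FEDERATION_SCHEMA: {"linkedUsers": linked},
    }


# Mapping table after the merge (false negative) case.
rows = [
    {"internal_id": "9caf78aac3d6", "domain": "D2",
     "external_id": "ff487230b3a0", "primary": True},
    {"internal_id": "9caf78aac3d6", "domain": "D2",
     "external_id": "41206cc97c8b", "primary": False},
]
merged_user = to_scim_user("9caf78aac3d6", "John Smith", rows)
print(json.dumps(merged_user, indent=2))
```

Whether such an extension should ever be exposed to other domains is a separate question; as noted above, it probably should not be.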
>
> Regarding unique identifiers for multi-valued attributes there is a 
> trade-off involved.  On one hand this makes PATCH semantics easier.  
> On the other hand it puts extra burden on service providers.  Since 
> the inception of SCIM, a key goal has been to foster adoption by 
> service providers by making things fit easily onto existing systems.  
> IMO the value gained by unique identifiers for multi-valued attributes 
> is not worth the demands put on a service provider.  I also think that 
> vendors that have a non-SCIM-compliant API will choose to keep things 
> that way if the spec is too hard for them to implement.  In a green 
> field environment we do have the luxury of mandating a model to make 
> certain operations more elegant. However, we can't ignore legacy systems.
>
> --Kelly
>
> *From:* scim-bounces@ietf.org [mailto:scim-bounces@ietf.org] *On Behalf Of* Emmanuel Dreux
> *Sent:* Thursday, August 09, 2012 3:18 AM
> *To:* Ganesh and Sashi Prasad; Kelly Grizzle
> *Cc:* scim@ietf.org
> *Subject:* Re: [scim] SCIM Protocol - 3 suggestions for improvement
>
> Hi Ganesh,
>
> Nothing prevents you, in your SCIM implementation (client or server), from 
> generating a unique identifier for each synchronized object and maintaining 
> an internal mapping table (you would have to map group membership as 
> well).
>
> This is what we are doing with Active Directory sources or targets:
>
> We didn't find an immutable unique ID in AD systems: DN, samAccountName 
> and UPN are all subject to change, and even objectGuid can change if an 
> AD domain is migrated. So we decided to generate and maintain an 
> internal table of ids. This fits your requirements as it hides 
> internal ids.
>
> This was written in .NET, and we have started a project to rewrite 
> our SCIM stack in PHP, which we will give to the Open Source community. 
> This implementation will have a parameter: AllocateIds versus 
> UseExistingIDs.
>
> This will give the choice of "hiding" internal IDs or using them as 
> unique IDs.
>
> You can also implement such a feature without violating the SCIM specs, 
> and without asking for it to be included in the specs.
>
> --
>
> Regards,
>
> Emmanuel Dreux
>
> http://www.cloudiway.com
>
> Tel: +33 4 26 78 17 58
>
> Mobile: +33 6 47 81 26 70
>
> skype: Emmanuel.Dreux
>
> *From:* Ganesh and Sashi Prasad [mailto:g.c.prasad@gmail.com]
> *Sent:* Thursday, August 9, 2012 03:35
> *To:* Kelly Grizzle
> *Cc:* scim@ietf.org
> *Subject:* Re: [scim] SCIM Protocol - 3 suggestions for improvement
>
> Hi Kelly,
>
> Thanks for your response. Let me first respond in brief to the two 
> main points you have made, and then elaborate on the first.
>
> 1. Why should domains not expose their internal identifiers to other 
> domains?
>
> a. We are designing a protocol for a federated system of domains, where 
> all domains are co-equal peers. (In physics too, N-body problems are 
> much harder than 2-body problems. :-) Therefore, assuming that there 
> are only two players in the interaction makes this tightly coupled in 
> a number of ways. We should rely on messaging and notification, with 
> encapsulation of domain-specific data.
>
> b. In any non-trivial data store, there will always be the ongoing need 
> to merge and split identities as and when "false negatives" and "false 
> positives" are discovered. A domain should be able to handle this 
> internal housekeeping freely, only notifying other domains when 
> convenient. Mapping of internal identifiers to external ones and 
> maintaining this mapping internally allows this loosely-coupled 
> housekeeping to take place. Sharing internal identifiers (or otherwise 
> outsourcing the mapping of internal to external identifiers) forces 
> housekeeping activities to be done in lock-step across domains.
>
> c. Asynchronous interaction is not just a matter of a suitable wire 
> protocol which can be designed later. The data model plays a crucial 
> role in enabling or constraining such interaction. A tightly-coupled 
> data model will force the use of synchronous interactions, and the 
> exposure of internal identifiers is a key part of this tight coupling.
>
> 2. The difficulty of assigning unique identifiers to the individual 
> values of multi-valued attributes:
>
> a. I'm not belittling the effort involved in migrating legacy data stores 
> to such a model. However, in the larger historical context of 
> cross-domain identity management, we are really at the very early 
> stages. If a relatively new discipline and a brand new spec are held 
> captive to legacy considerations, we are losing an opportunity to 
> provide a clean and elegant model to subsequent users of the spec, and 
> this will have repercussions over many years or even decades.
>
> b. If incumbent cloud providers find it hard to immediately adopt the 
> dictionary model for existing multi-valued attributes, they can 
> transition to this model by offering both "SCIM-compliant" and 
> "non-SCIM-compliant" APIs to their customers and encouraging new 
> customers to adopt the "SCIM-compliant" API. Legacy customers can be 
> supported using a "non-SCIM-compliant" API for an arbitrarily long 
> period and gradually migrated to the SCIM-compliant API. The logistics 
> are not insurmountable, and shouldn't prevent the adoption of a 
> dictionary model for multi-valued attributes.
>
> Elaboration of Point 1:
>
> When we consider federated identity across more than one domain, we 
> have to assume that domains are not necessarily master-slave in their 
> interaction. The most generic interaction model is peer-to-peer, where 
> entity lifecycle events within a domain are notified to other domains 
> (when necessary) in an asynchronous manner (i.e., through messaging) 
> and the other domains are free to respond to these events in an 
> appropriate manner and at a time of their convenience.
>
> A key set of lifecycle events for an entity is the merging and 
> splitting of identity that is often required.
>
> The question "Is this one entity?" can be answered either yes 
> (positive) or no (negative). But sometimes, we can discover false 
> positives and false negatives in our data stores.
>
> Consider a case where customers sign up online, and two customers who 
> are privacy-conscious enter fake IDs such as "John Smith", and also 
> use the same date of birth (say, 1 Jan 1970) or similar attributes. 
> The front-end application may make an intelligent (but incorrect) 
> guess that these two persons are the same, and re-assign the same 
> identifier to the second person. This is a false positive. They appear 
> to be the same entity, but they're actually different. When the error 
> is discovered, the identities will need to be split, with a new 
> identifier generated for one of them.
>
> Consider the opposite case where a customer signs up through two 
> different portals or in two different sessions, using the names 
> "JSmith" and "JohnS". It is very likely that they will be treated as 
> two different customers and assigned two unique identifiers. This is a 
> false negative. They appear to be two entities, but are actually the 
> same. At a later stage, when the error is discovered, the identities 
> will have to be merged, and one of the identifiers will have to be 
> dropped.
>
> These are not theoretical use cases. They form a significant 
> proportion of the user base in most large Web-facing applications. 
> Let's see how these can be managed in a federated way by mapping 
> internal identifiers to external ones and only exposing external 
> identifiers to other domains.
>
> a. False positives:
>
> Domain 1 has the following information about a customer in its data store:
>
> Internal ID: 9caf78aac3d6
>
> Attributes: {name: "John Smith", dob: "01-Jan-1970"}
>
> When requesting the provisioning of this entity in Domain 2, the 
> following ID is returned by Domain 2: ff487230b3a0.
>
> Domain 1 then maintains the following in a mapping table and uses it 
> for translation when talking to Domain 2, taking care never to expose 
> its internal identifier:
>
> | Internal Entity ID | External Domain ID | External Entity ID | Primary flag |
> | 9caf78aac3d6       | D2                 | ff487230b3a0       | true         |
>
> When the false positive is discovered and the entity is split, Domain 
> 1 creates a new internal identifier and now has the following entity 
> information.
>
> Internal ID: 9caf78aac3d6
>
> Attributes: {name: "John Smith", dob: "01-Jan-1970"}
>
> Internal ID: a99a5feba839
>
> Attributes: {name: "John Smith", dob: "01-Jan-1970"}
>
> This second entity with its own internal identifier is invisible to 
> Domain 2, and this is by design. Communication about the original 
> entity takes place as before by mapping "9caf78aac3d6" to 
> "ff487230b3a0" and vice-versa. At some convenient time (importantly, 
> this doesn't have to be at the time the split happens), Domain 2 can 
> be requested to provision a second entity, and when it responds with 
> an identifier of "7a87f27c1dd8", this can go into the mapping table as 
> a new record associated with the second entity's internal identifier.
>
> The mapping table now contains the following entries:
>
> | Internal Entity ID | External Domain ID | External Entity ID | Primary flag |
> | 9caf78aac3d6       | D2                 | ff487230b3a0       | true         |
> | a99a5feba839       | D2                 | 7a87f27c1dd8       | true         |
>
> Domain 2 is not even aware that a split has happened, and the 
> provisioning that it does is not in lockstep with the split in 
> identity that occurred in Domain 1.
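
The split scenario above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the `Mapping` and `MappingTable` names, their methods, and the 12-hex-character ID format are hypothetical and not part of SCIM or of any proposal in this thread.

```python
import uuid
from dataclasses import dataclass


@dataclass
class Mapping:
    internal_id: str
    domain: str
    external_id: str
    primary: bool = True


class MappingTable:
    """Domain 1's private table translating internal IDs to per-domain external IDs."""

    def __init__(self):
        self.rows = []

    def link(self, internal_id, domain, external_id, primary=True):
        self.rows.append(Mapping(internal_id, domain, external_id, primary))

    def outbound(self, internal_id, domain):
        """Return the primary external ID for this entity in a domain, or None."""
        for row in self.rows:
            if row.internal_id == internal_id and row.domain == domain and row.primary:
                return row.external_id
        return None


def split_entity(new_id_factory=lambda: uuid.uuid4().hex[:12]):
    """Allocate an internal ID for the split-off entity; no other domain is contacted."""
    return new_id_factory()


table = MappingTable()
table.link("9caf78aac3d6", "D2", "ff487230b3a0")

# The split happens purely inside Domain 1 (fixed ID used here to match the example).
new_internal = split_entity(lambda: "a99a5feba839")
assert table.outbound(new_internal, "D2") is None  # invisible to Domain 2 for now

# Later, at a convenient time, Domain 2 provisions the entity and returns its own ID.
table.link(new_internal, "D2", "7a87f27c1dd8")
```

The point of the sketch is the gap between `split_entity` and the second `link` call: the two steps do not have to happen together.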
>
> (What is the "Primary flag" used for? We'll see when we cover the 
> treatment of false negatives.)
>
> b. False negatives:
>
> Domain 1 has the following information about what it thinks are two 
> distinct customers in its data store:
>
> Internal ID: 9caf78aac3d6
>
> Attributes: {name: "JSmith", dob: "01-Jan-1970"}
>
> Internal ID: 273d36e30d09
>
> Attributes: {name: "JohnS", dob: "01-Jan-1970"}
>
> When requesting the provisioning of these entities in Domain 2, the 
> following IDs are returned by Domain 2: ff487230b3a0 and 41206cc97c8b.
>
> Domain 1 then maintains the following in a mapping table and uses it 
> for translation when talking to Domain 2, taking care never to expose 
> its internal identifiers:
>
> | Internal Entity ID | External Domain ID | External Entity ID | Primary flag |
> | 9caf78aac3d6       | D2                 | ff487230b3a0       | true         |
> | 273d36e30d09       | D2                 | 41206cc97c8b       | true         |
>
> When the false negative is discovered and the two entities are merged, 
> Domain 1 drops one of the internal identifiers and rationalises the 
> name of the customer (say, to "John Smith"). Let's say it retains the 
> first ID "9caf78aac3d6" and drops the second "273d36e30d09".
>
> The mapping table now looks like this:
>
> | Internal Entity ID | External Domain ID | External Entity ID | Primary flag |
> | 9caf78aac3d6       | D2                 | ff487230b3a0       | true         |
> | 9caf78aac3d6       | D2                 | 41206cc97c8b       | false        |
>
> Now two external identifiers map to the same internal one, so inbound 
> communication from Domain 2 can be unambiguously translated to the 
> same entity internally. However, when going outwards, Domain 1 will 
> have to look up the translation table to determine the "primary" 
> external ID for this entity in Domain 2, which was decided to be 
> "ff487230b3a0". That's where the "Primary flag" comes in. The second 
> external ID "41206cc97c8b" is never used thereafter in outbound 
> communication.
>
> At some stage (importantly, not in lockstep with the identity merge), 
> Domain 2 can be requested to delete the customer record identified by 
> "41206cc97c8b", and the second entry in the mapping table can be 
> removed once this is acknowledged.
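
A rough Python sketch of the merge flow just described, with the role of the "Primary flag" made explicit. The function names and the dict-based row layout are assumptions for illustration. The sketch also simplifies one thing: it always demotes the dropped entity's rows, which is only right when the kept entity already has a primary row for that domain.

```python
def merge(rows, keep_id, drop_id):
    """Re-point the dropped entity's rows at the kept one and demote them."""
    for row in rows:
        if row["internal"] == drop_id:
            row["internal"] = keep_id
            row["primary"] = False


def inbound(rows, domain, external_id):
    """Translate an external ID received from a domain to the internal ID."""
    for row in rows:
        if row["domain"] == domain and row["external"] == external_id:
            return row["internal"]
    return None


def outbound(rows, domain, internal_id):
    """Translate an internal ID to the primary external ID for a domain."""
    for row in rows:
        if (row["domain"] == domain and row["internal"] == internal_id
                and row["primary"]):
            return row["external"]
    return None


def acknowledge_deletion(rows, domain, external_id):
    """Drop a non-primary row in place once the other domain confirms the delete."""
    rows[:] = [
        r for r in rows
        if not (r["domain"] == domain and r["external"] == external_id
                and not r["primary"])
    ]


rows = [
    {"internal": "9caf78aac3d6", "domain": "D2",
     "external": "ff487230b3a0", "primary": True},
    {"internal": "273d36e30d09", "domain": "D2",
     "external": "41206cc97c8b", "primary": True},
]
merge(rows, keep_id="9caf78aac3d6", drop_id="273d36e30d09")
```

After the merge, both external IDs resolve inbound to the same internal entity, while outbound traffic uses only the primary one.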
>
> This scheme will scale up to multiple domains, because the "External 
> Domain ID" column helps to keep track of which external ID is shared 
> with which Domain. (Why don't we use just one external ID for an 
> entity and share it with all external domains? Tight coupling again. 
> Just as OAuth allows an access token given to a third party to be 
> invalidated without affecting the access of other third parties, the 
> use of separate external identifiers for different domains allows 
> fine-grained control of identity federation.)
>
> The scheme also allows the splitting of an entity into more than two 
> entities, and the merging of more than two entities into a single one. 
> (Any organisation with a web-facing application will tell you how many 
> John Smiths there are who were born on 1 Jan 1970!)
>
> This is a fairly long-winded explanation, but this is why we need to 
> hide internal identifiers from other domains, and why mappings need to 
> be managed internally in each domain. Such a data model also allows us 
> to choose asynchronous protocols for propagation of identity events, 
> since there is no consistency requirement to update multiple domains 
> concurrently.
>
> Regards,
>
> Ganesh Prasad
>
> On 9 August 2012 04:55, Kelly Grizzle <kelly.grizzle@sailpoint.com> wrote:
>
> Thanks for the feedback, Ganesh.  I read through this and your InfoQ 
> article (http://www.infoq.com/articles/scim-data-model-limitations) 
> and have some thoughts.
>
> > The rest of the protocol does not meaningfully use the enterprise 
> > client's identifier, the "external ID" at all, even though it was 
> > ostensibly introduced to make things friendlier for the client.
>
> The usage pattern for an external ID would be to search for a user by 
> externalId and use the ID of the returned user in any desired 
> operation. For example:
>
> GET /Users?filter=externalId eq "bjensen"&attributes=id
>
> {
>   "totalResults": 1,
>   "Resources": [
>     {
>       "id": "2819c223-7f76-453a-919d-413861904646"
>     }
>   ]
> }
>
> Retrieve the ID from the response and use it.
>
> DELETE /Users/2819c223-7f76-453a-919d-413861904646
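
The two-step pattern above (search by externalId, then operate on the returned id) might look like this on the client side. This is a sketch only: the helper names are made up, no HTTP call is made, and the filter value is inserted naively without escaping embedded quotes.

```python
import json
from urllib.parse import quote


def lookup_path(external_id):
    """Build the query path that finds a user by externalId and returns only its id."""
    scim_filter = f'externalId eq "{external_id}"'
    # Percent-encode spaces and quotes so the filter is safe in a URL.
    return "/Users?filter=" + quote(scim_filter) + "&attributes=id"


def extract_id(response_body):
    """Return the server's id from a list response, or None unless exactly one match."""
    data = json.loads(response_body)
    resources = data.get("Resources", [])
    if data.get("totalResults") != 1 or len(resources) != 1:
        return None
    return resources[0]["id"]


# Response body copied from the example above.
body = '{"totalResults": 1, "Resources": [{"id": "2819c223-7f76-453a-919d-413861904646"}]}'
server_id = extract_id(body)
delete_path = f"/Users/{server_id}"
```

A client that stores the server's id after the first lookup can skip the search on later operations.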
>
> This does introduce an additional HTTP request if the client chooses 
> not to store the server's id.  An issue was created to consider 
> allowing operations to use the externalId 
> (http://code.google.com/p/scim/issues/detail?id=35), but I believe the 
> general consensus has been to not include this in the spec.  One main 
> point of contention is that much of the rest of the spec (eg -- group 
> membership references, manager references, etc...) requires knowledge 
> of the server's identifier. Continuing this discussion on the IETF 
> list would be a good thing, though.
>
> > the cloud provider's ID and the enterprise client's ID are both 
> > "Internal IDs" with respect to their domains
>
> I think this comes down to a nomenclature problem.  The server's ID 
> does not necessarily have to be the unique identifier that the 
> underlying identity store uses, it just has to be stable and unique.  
> In many cases, the underlying identity store will provide identifiers 
> with these properties already (eg -- a uuid) and it can be used by the 
> SCIM interface.  The "externalId" is referring to the fact that the id 
> is maintained external to the SCIM server.  As long as the server's 
> identifiers are stable and unique (which is mandated by the spec), I 
> don't see a problem.
>
> > The secret is that /every value needs a key/, and multi-valued 
> > attributes lack that. So our solution is quite simple - turn every 
> > list or array (of values) into a dictionary (of key-value pairs) by 
> > providing each value with a unique and meaning-free identifier.
>
> I agree that this would be useful, especially in the PATCH operation.  
> One reason that this wasn't included in the spec originally is that it 
> can put undue burden on the service provider.  Many service providers 
> are putting SCIM interfaces in front of their existing identity stores 
> (eg -- directory servers, SaaS application databases, etc...).  Many 
> of these do not have a unique identifier for multi-valued attributes. 
> By requiring this, a majority of the service providers would have to 
> start maintaining a unique key for each multi-valued attribute.  I 
> believe this would be a roadblock for many implementers.
>
> > When the SCIM protocol uses PATCH, there are areas where it seems a 
> > bit clumsy.
>
> I like the thoughts here. Your example reminds me of unified diffs 
> (http://en.wikipedia.org/wiki/Diff#Unified_format), which are commonly 
> used with a patch program (pretty much the equivalent of the PATCH 
> verb).  However, the three proposals seem to largely hinge on being 
> able to uniquely address each element within an object.  Without these 
> it is not so easy to address each patch sub-operation (REPLACE, 
> INCLUDE, etc...) or provide a multi-status.
>
>
> The 207 response would be interesting to consider for the bulk 
> endpoint 
> (http://www.simplecloud.info/specs/draft-scim-api-00.html#bulk-resources), 
> however.
>
> > There are other, non-data aspects of SCIM which may require review, 
> > such as its synchronous request-response interaction model, which is 
> > a form of tight coupling and could prove to be a source of 
> > brittleness.
>
> I agree that we should explore optional asynchronous requests in 2.0.
>
> Thanks again for your thoughts.  I hope you stay involved in the 
> discussion as work on SCIM 2.0 goes forward.
>
> --Kelly
>
> *From:* scim-bounces@ietf.org [mailto:scim-bounces@ietf.org] *On Behalf Of* Ganesh and Sashi Prasad
> *Sent:* Wednesday, August 01, 2012 4:24 PM
> *To:* scim@ietf.org
> *Subject:* [scim] SCIM Protocol - 3 suggestions for improvement
>
> (I posted this on the SCIM Google Group, and I was advised to 
> subscribe to the mailing list and post it here instead, so here goes.)
>
> Hi,
>
> My name is Ganesh Prasad, and my experience in Identity and Access 
> Management is mainly through a 3-year project at an Australian 
> insurance company, an experience I have written about as an eBook on 
> InfoQ (http://www.infoq.com/minibooks/Identity-Management-Shoestring).
>
> I have been following the SCIM spec off and on, and based on my 
> experience with a loosely-coupled architecture that I found to be 
> successful, I have the following 3 suggestions to make.
>
> 1. The enterprise client and the cloud provider should maintain their 
> own internal IDs for a resource, which they should not reveal to each 
> other. Both of them should map their internal IDs to a shared External 
> ID, and this is the only ID that should be exposed through the API. 
> The current specification's provision of an id (which is the external 
> ID and the only one to be transferred through the API) and an 
> "external ID" (which is the client's internal ID and should be hidden) 
> is diametrically opposite to this.
>
> 2. When dealing with multi-valued attributes of a resource (expressed 
> as arrays in JSON), they must be converted from an array into a 
> dictionary with unique keys (UUIDs generated by the cloud provider 
> when the attribute is created). Without unique keys for every 
> attribute value of a resource, manipulating it will be clumsy and 
> inelegant.
>
> 3. The PATCH command can be improved in 3 significant ways:
>
> 3a. Leverage the fact (from 2 above) that every value has a key, to 
> greatly simplify the API
>
> 3b. Use special verbs as nested operations of the PATCH command to 
> add, modify and delete attributes at any level
>
> 3c. Use the WebDAV status code of "207 Multi-Status" instead of "200 
> OK" as the response to a PATCH (or BULK) command.
>
> To elaborate,
>
> 1. Revealing private IDs externally is a form of tight coupling. A 
> major requirement with Identity Management is to split (or merge) 
> identities when false positives (or false negatives) are detected, 
> i.e., when a resource is discovered to be more than one, or when 
> multiple resources are detected to be the same. If internal 
> identifiers are revealed to external domains, such clean-ups become 
> difficult, hence every domain that wants to expose references to a 
> resource must map its internal ID to an external one created for this 
> explicit purpose, and only reveal this.
>
> In the SCIM case, when an enterprise client POSTs a resource creation 
> request, the cloud provider must generate its own internal UUID as 
> well as an external UUID, map them together, and only return the 
> external UUID in the "Location:" header. The enterprise client should 
> map this external UUID to a newly-generated internal ID of its own. In 
> case the resource already has an identifier within the enterprise 
> client's domain, then this is the internal ID that must be mapped to 
> the external UUID returned through the POST response.
>
> 2. If a resource is to be created, and one of its attributes is 
> multi-valued, e.g.,
>
> "email-addrs" :
>     [
>         "john_smith@yahoo.com",
>         "john.smith@gmail.com",
>         "jsmith1970@hotmail.com"
>     ]
>
> then on successful creation, the server response should include the 
> representation of the resource, and this attribute should look like this:
>
> "email-addrs" :
>     [
>         { "7dfcb444-74d8-4f17-aa66-daf9ea3bd902" : "john_smith@yahoo.com" },
>         { "3bd10085-c474-43b9-9cda-8646c3085bbf" : "john.smith@gmail.com" },
>         { "581da5c7-c6e1-4cca-9db7-7a6d1de664e1" : "jsmith1970@hotmail.com" }
>     ]
>
> The client now knows what each value is labelled. This now provides an 
> unambiguous way to reference a value to add, modify and delete it:
>
> Add:
>
> POST /Users/2819c223-7f76-453a-919d-413861904646/email-addrs
>
> value="js70@easy.com.au"
>
> Modify:
>
> PUT /Users/2819c223-7f76-453a-919d-413861904646/email-addrs/3bd10085-c474-43b9-9cda-8646c3085bbf
>
> value="john.r.smith@gmail.com"
>
> Delete:
>
> DELETE /Users/2819c223-7f76-453a-919d-413861904646/email-addrs/581da5c7-c6e1-4cca-9db7-7a6d1de664e1
>
> One can even delete all email addresses like this:
>
> DELETE /Users/2819c223-7f76-453a-919d-413861904646/email-addrs
>
> I believe this is more elegant than what the spec recommends.
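
The four operations on a labelled attribute could be modelled roughly as below. This is an in-memory stand-in, not a server: the class name `LabelledAttribute` is hypothetical, and rejecting a modify on an unknown label is an assumption, since the proposal does not say what PUT should do in that case.

```python
import uuid


class LabelledAttribute:
    """In-memory stand-in for one multi-valued attribute such as email-addrs."""

    def __init__(self):
        self._values = {}  # label -> value, in insertion order

    def add(self, value):
        """POST .../email-addrs : store the value under a new label."""
        label = str(uuid.uuid4())
        self._values[label] = value
        return label

    def modify(self, label, value):
        """PUT .../email-addrs/<label> : overwrite an existing value.
        Raising on an unknown label is an assumption of this sketch."""
        if label not in self._values:
            raise KeyError(label)
        self._values[label] = value

    def delete(self, label):
        """DELETE .../email-addrs/<label> : remove one value."""
        del self._values[label]

    def clear(self):
        """DELETE .../email-addrs : remove every value."""
        self._values.clear()

    def values(self):
        return list(self._values.values())
```

Because every call names a label rather than a value or a position, two identical values can still be addressed separately.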
>
> 3. It's possible to think of the operations POST, PUT and DELETE as 
> nested operations inside a PATCH. PATCH itself need not be nested 
> because its semantics apply throughout the "tree" of a resource.
>
> However, the semantics of PUT are a little messy. Also, reusing HTTP 
> verbs at a nested level could be confusing. That's why I would 
> recommend six separate verbs whose meanings are less ambiguous:
>
> 1. INCLUDE (equivalent to POST): Add this resource to a collection and 
> return a generated URI
>
> 2. PLACE (equivalent to one form of PUT): Add this resource at the 
> location specified by the accompanying URI. (If there's already a 
> value at that location, return an error status.)
>
> 3. REPLACE (equivalent to another form of PUT): Replace the value at 
> the location specified by the accompanying URI with this value. (If 
> there's no such URI, return an error status.)
>
> 4. FORCE (equivalent to a third form of PUT): This means PLACE or 
> REPLACE. (At the end of this operation, we want the specified URI to 
> hold the accompanying value whether the URI already existed or not.)
>
> 5. RETIRE (equivalent to DELETE): Delete, deactivate or otherwise 
> render inaccessible the resource at the specified URI.
>
> 6. AMEND (equivalent to PATCH): (This verb is just listed for 
> completeness. We probably don't need a nested PATCH since PATCH 
> cascades to every level of the tree.)
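
To make the difference between PLACE, REPLACE and FORCE concrete, here is a minimal sketch over a flat dictionary. The function name `apply_verb` is hypothetical, and the specific error codes (409 and 404) are assumptions; the proposal only says "return an error status". Success statuses follow the "200 OK" used in the sample response. INCLUDE and AMEND are left out to keep the sketch short.

```python
def apply_verb(resource, verb, key, value=None):
    """Apply one of the proposed verbs to a flat dict and return a status string."""
    exists = key in resource
    if verb == "PLACE":
        if exists:
            return "409 Conflict"  # assumed error code: location already holds a value
        resource[key] = value
        return "200 OK"
    if verb == "REPLACE":
        if not exists:
            return "404 Not Found"  # assumed error code: nothing to replace
        resource[key] = value
        return "200 OK"
    if verb == "FORCE":
        resource[key] = value  # PLACE or REPLACE, whichever applies
        return "200 OK"
    if verb == "RETIRE":
        if not exists:
            return "404 Not Found"  # assumed error code: nothing to retire
        del resource[key]
        return "200 OK"
    raise ValueError(f"unsupported verb: {verb}")
```

FORCE is the only one of the three write verbs that cannot fail on the existence check, which is what distinguishes it from the other two forms of PUT.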
>
> A PATCH request could therefore look like this:
>
> PATCH /Users/2819c223-7f76-453a-919d-413861904646 HTTP/1.1
> Host: example.com
> Accept: application/json
> Authorization: Bearer h480djs93hd8
> Content-length: ...
>
> {
>     REPLACE : {
>         "key" : "first-name",
>         "value" : "Jack"
>     },
>     PLACE : {
>         "key" : "middle-name",
>         "value" : "Richard"
>     },
>     FORCE : {
>         "key" : "dob",
>         "value" : "01-Jan-1971"
>     },
>     REPLACE : {
>         "key" : "address.unit-number",
>         "value" : "12"
>     },
>     PLACE : {
>         "key" : "address.state",
>         "value" : "SA"
>     },
>     FORCE : {
>         "key" : "address.country",
>         "value" : "Australia"
>     },
>     INCLUDE : {
>         "key" : "email-addrs",
>         "value" : "js70@easy.com.au"
>     },
>     REPLACE : {
>         "key" : "email-addrs/3bd10085-c474-43b9-9cda-8646c3085bbf",
>         "value" : "john.r.smith@gmail.com"
>     },
>     RETIRE : {
>         "key" : "email-addrs/581da5c7-c6e1-4cca-9db7-7a6d1de664e1"
>     }
> }
>
> The PATCH response should utilise the status code "207 Multi-Status" 
> because the nested operations could have varying status codes. A 
> sample response is below:
>
> HTTP/1.1 207 Multi-Status
> Content-Type: application/json
> ETag: W/"b431af54f0671a2"
> Location: "https://example.com/v1/Users/2819c223-7f76-453a-919d-413861904646"
>
> {
>     "schemas":["urn:scim:schemas:core:1.0"],
>     "external-id":"2819c223-7f76-453a-919d-413861904646",
>     REPLACE : {
>         "status" : "200 OK",
>         "key" : "first-name",
>         "value" : "Jack"
>     },
>     PLACE : {
>         "status" : "200 OK",
>         "key" : "middle-name",
>         "value" : "Richard"
>     },
>     FORCE : {
>         "status" : "200 OK",
>         "key" : "dob",
>         "value" : "01-Jan-1971"
>     },
>     REPLACE : {
>         "status" : "200 OK",
>         "key" : "address.unit-number",
>         "value" : "12"
>     },
>     PLACE : {
>         "status" : "200 OK",
>         "key" : "address.state",
>         "value" : "SA"
>     },
>     FORCE : {
>         "status" : "200 OK",
>         "key" : "address.country",
>         "value" : "Australia"
>     },
>     INCLUDE : {
>         "status" : "201 Created",
>         "key" : "email-addrs/11f664ec-898b-4f6f-8948-ecfda74deff0",
>         "value" : "js70@easy.com.au"
>     },
>     REPLACE : {
>         "status" : "200 OK",
>         "key" : "email-addrs/3bd10085-c474-43b9-9cda-8646c3085bbf",
>         "value" : "john.r.smith@gmail.com"
>     },
>     RETIRE : {
>         "status" : "200 OK",
>         "key" : "email-addrs/581da5c7-c6e1-4cca-9db7-7a6d1de664e1"
>     },
>     "meta": {
>         "created":"2011-08-08T04:56:22Z",
>         "lastModified":"2011-08-08T08:00:12Z",
>         "location":"https://example.com/v1/Users/2819c223-7f76-453a-919d-413861904646",
>         "version":"W\/\"b431af54f0671a2\""
>     }
> }
>
> If there are errors, they will take the place of the "200 OK" or "201 
> Created" status codes in the above successful case. But the outer 
> status will remain "207 Multi-Status".
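
A minimal sketch of how per-operation statuses could be collected under a fixed outer 207, assuming only REPLACE and RETIRE for brevity. The function name `apply_operations` and the error codes are assumptions. The sketch also takes the operations as a list rather than as members of one object: Python's `json` module keeps only the last of any repeated member names, so a body that repeats REPLACE as an object key would lose operations when parsed that way.

```python
import json


def apply_operations(resource, operations):
    """Apply REPLACE/RETIRE operations in order and collect one result
    entry per operation, each carrying its own status."""
    results = []
    for op in operations:
        verb, key = op["verb"], op["key"]
        if verb not in ("REPLACE", "RETIRE"):
            status = "400 Bad Request"  # assumed error code
        elif key not in resource:
            status = "404 Not Found"  # assumed error code
        elif verb == "REPLACE":
            resource[key] = op["value"]
            status = "200 OK"
        else:  # RETIRE
            del resource[key]
            status = "200 OK"
        results.append({"verb": verb, "status": status, "key": key})
    # The outer status stays 207 regardless of the individual outcomes.
    return "207 Multi-Status", results


# Repeated member names collapse when the body is parsed as a JSON object:
collapsed = json.loads('{"REPLACE": {"key": "a"}, "REPLACE": {"key": "b"}}')
```

A failed operation changes only its own entry in the results; the other entries and the outer status are unaffected.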
>
> The same scheme can be used to deal with operations on members of a 
> group, and for bulk operations.
>
> I hope you find these suggestions useful.
>
> I read the SCIM spec afresh last week and these ideas came flooding 
> into my head because I have been working at another organisation (a 
> telco) for the last 5 months, also in Identity and Access Management, 
> and my thoughts have moved further along the direction of evolving a 
> specialised data model based on specific principles, especially for IAM.
>
> I am planning to write about this and also the data-related principles 
> soon and am in negotiations with InfoQ regarding publication.
>
> Regards,
>
> Ganesh Prasad
>
>