Re: [scim] SCIM Synchronization Problem

Danny Mayer <mayer@pdmconsulting.net> Tue, 24 August 2021 15:35 UTC

To: "Matt Peterson (mpeterso)" <Matt.Peterson@oneidentity.com>, SCIM WG <scim@ietf.org>
References: <e8f9d66c-f356-61b8-d38a-b5288fb9c518@pdmconsulting.net> <MWHPR19MB0957823E52255520BD4DB305E1C09@MWHPR19MB0957.namprd19.prod.outlook.com> <0f21255e-4ea9-b42d-353d-5c0661e1be5e@pdmconsulting.net> <MWHPR19MB0957723BC8FE476F68EBDEC6E1C49@MWHPR19MB0957.namprd19.prod.outlook.com>
From: Danny Mayer <mayer@pdmconsulting.net>
Message-ID: <6f80bf2c-f27e-dbf1-b5f1-fb9ff6a398f1@pdmconsulting.net>
Date: Tue, 24 Aug 2021 11:35:04 -0400
Archived-At: <https://mailarchive.ietf.org/arch/msg/scim/CmlGSXr06_YJZdXYis41Yqdfj-I>

Exactly right.

On 8/23/21 7:49 PM, Matt Peterson (mpeterso) wrote:
> The management server in your scenario is a SCIM client.
>
> If an application administrator goes into the Application Server and adds, removes, or modifies an account, the Management Server has one option when using SCIM: query the Application Server (the SCIM server) for all accounts and then resolve the differences between this result and a previous call where all accounts were fetched.   This is very inefficient.  Especially when it comes to detecting deleted accounts, there is no other way in standard SCIM except to have your Management Server reload all the accounts from your Application Server.
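
To illustrate the "fetch everything and resolve the differences" approach described above, here is a minimal sketch. It assumes each full fetch has already been turned into a dictionary keyed by account id; the function name `diff_snapshots` and the sample accounts are hypothetical and not part of SCIM.

```python
from typing import Dict, List, Tuple

Account = Dict[str, str]


def diff_snapshots(
    previous: Dict[str, Account], current: Dict[str, Account]
) -> Tuple[List[str], List[str], List[str]]:
    """Return (added, removed, modified) account ids between two full fetches."""
    # Ids present only in the new fetch were created out-of-band.
    added = sorted(current.keys() - previous.keys())
    # Ids present only in the old fetch were deleted out-of-band.
    removed = sorted(previous.keys() - current.keys())
    # Ids present in both fetches but with different attributes were modified.
    modified = sorted(
        account_id
        for account_id in current.keys() & previous.keys()
        if current[account_id] != previous[account_id]
    )
    return added, removed, modified


# Hypothetical sample data standing in for two complete account listings.
previous = {
    "1": {"userName": "alice", "department": "Sales"},
    "2": {"userName": "bob", "department": "Sales"},
    "3": {"userName": "carol", "department": "Sales"},
}
current = {
    "1": {"userName": "alice", "department": "Sales"},
    "3": {"userName": "carol", "department": "Support"},
    "4": {"userName": "dave", "department": "Sales"},
}

added, removed, modified = diff_snapshots(previous, current)
print(added, removed, modified)
```

Note that the deleted account ("2") can only be found because the whole account list was fetched again, which is the cost being pointed out.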
>
> What you've implemented in your Management Server *is* a pure client-side cache scenario as I described it.  Except that you've recognized that the SCIM spec makes it very inefficient to keep your Management Server up-to-date with changes happening "out-of-band" on the Application Servers.
>
> So... you implemented a notification system whereby your Management Server (SCIM Client) can be notified of changes that just happened on the Application Server (the SCIM server).
>   
> Your implementation, where the Application Server (SCIM server) notifies your Management Server (SCIM client) that a change has occurred *is* one of the proposed technical approaches identified by @Phil in a previous post to this list and was also mentioned in the BOF.
>
> I believe that an approach (similar to your implementation) whereby a SCIM client can subscribe to be notified of changes on the SCIM server is an intuitive approach for solving the "Synchronization problem".
>
> --
> Matt
>
>
> -----Original Message-----
> From: Danny Mayer <mayer@pdmconsulting.net>
> Sent: Friday, August 20, 2021 8:46 AM
> To: Matt Peterson (mpeterso) <Matt.Peterson@oneidentity.com>; SCIM WG <scim@ietf.org>
> Subject: Re: [scim] SCIM Synchronization Problem
>
> It's not that simple. If an administrator goes in and adds a new account, or updates an existing account and adds permissions to it, how will the management side know? The way I did it was to have the application monitor for those changes and notify the management server of the changes. I did this via a message queue for our internal applications.
> The management server needs to decide how to handle these violations.
>
> Danny
>
> On 8/19/21 1:33 PM, Matt Peterson (mpeterso) wrote:
>> Thank you, Danny, for creating a new thread.  The following are my comments from meeting minutes (some edits):
>>
>> There appear to be two use cases for "synchronization":
>>
>>     1.  Constructing and enforcing Application authorization models -  An application (acting as a SCIM client) caches users/groups data from the IdP in order to present "user/group pickers" when presenting screens used to configure the application  authorization rules (RBAC, Policy, or ACL).  Also, when the application enforces authorization, user and group data needs to be immediately available so that authorization decisions can be made quickly.
>>
>>     2.  Identity Management and Governance systems - implement a "canonical identity model" where all accounts and groups are represented.  This model is used to build provisioning rules and calculate separation of duty violations, attestations, approvals, etc.   Management-time evaluation of the model needs to be done efficiently without external calls to the SCIM service provider.
>>
>> Both of the above use cases are "client-cache" use cases that need only a "one-way" sync (from the SCIM server to the SCIM client).   To accomplish this, there are two distinct steps:  a) download the initial results (users/groups), and b) keep the cached copy of these initial results (users/groups) up to date with changes that are being made on the SCIM service provider.
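
The two steps of a one-way sync can be sketched roughly as follows. This is only an illustration: the `ClientCache` class, the `op`/`id`/`resource` fields of a change record, and the sample users are assumptions made for the example, since standard SCIM does not define a change feed.

```python
from typing import Dict, Iterable


class ClientCache:
    """Client-side copy of users/groups held by the SCIM server (one-way sync)."""

    def __init__(self) -> None:
        self.resources: Dict[str, dict] = {}

    def load_initial(self, resources: Iterable[dict]) -> None:
        # Step a) download the initial results and index them by id.
        self.resources = {r["id"]: dict(r) for r in resources}

    def apply_change(self, change: dict) -> None:
        # Step b) keep the cached copy up to date with a server-side change.
        op = change["op"]
        resource_id = change["id"]
        if op == "delete":
            self.resources.pop(resource_id, None)
        elif op in ("create", "update"):
            self.resources[resource_id] = dict(change["resource"])
        else:
            raise ValueError(f"unknown change operation: {op!r}")


cache = ClientCache()
cache.load_initial([
    {"id": "1", "userName": "alice"},
    {"id": "2", "userName": "bob"},
])
cache.apply_change({"op": "create", "id": "3", "resource": {"id": "3", "userName": "carol"}})
cache.apply_change({"op": "update", "id": "1", "resource": {"id": "1", "userName": "alice.smith"}})
cache.apply_change({"op": "delete", "id": "2"})
print(sorted(cache.resources))
```

Data only flows from the server to the cache here; nothing in the cache is ever written back, which is what makes the sync "one-way".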
>>
>> I think it could help us narrow down the list of suitable approaches if we could agree that "one-way" sync (keeping a client-side cache up to date) is our target.
>>
>> --
>> Matt Peterson
>>
>> -----Original Message-----
>> From: scim <scim-bounces@ietf.org> On Behalf Of Danny Mayer
>> Sent: Wednesday, August 18, 2021 8:34 AM
>> To: SCIM WG <scim@ietf.org>
>> Subject: [scim] SCIM Synchronization Problem
>>
>> I decided that this needs its own thread and should not be part of the meeting minutes.
>>
>> I have had a great deal of experience dealing with the user account synchronization problem. Here's my view of the problems.
>>
>> I will be calling one system the Management Server and the other system the Application Server. I found client/server labels confusing. The Management Server is what I am defining to be the server that sends updates to add/update/remove users and groups to the Application Server, whose accounts, groups and access permissions are being managed.
>>
>> First some definitions of user accounts. There is usually more than one of each of these:
>> 1. Builtin accounts
>> 2. Special-purpose accounts
>> 3. Employee
>> 4. Contractor
>> 5. Agent
>> 6. Customer
>>
>> There may be more.
>>
>> 1. Builtin accounts: These are accounts that applications have and there may be more than one. There is always an admin account which can do anything, for example the administrator account in Active Directory or a database admin account. The application may have more accounts for other purposes.
>>
>> 2. Special-purpose accounts: These may be set up to provide access to other applications. For example, a SCIM request to a SCIM REST API should be handled by a special account which cannot be used to log in via a UI and can only perform certain functions. In addition there may be accounts set up to listen for topics or queues on a message queue, among other possibilities. Keeping separate accounts like this is important for tracking in logs and applications.
>>
>> 3. Employee: These are accounts that employees use to log in to the application.
>>
>> 4. Contractors: These are accounts that a contractor performing work for the company may use to log into an application. Unlike Employee accounts these would have an expiration date.
>>
>> 5. Agent: Accounts like this are for external users who may need to manage information for their own customers. An example of this is an insurance agent logging in to handle an insurance policy for their clients.
>>
>> 6. Customers: These are where the customers are using the application directly. For a bank it's likely to be millions of customers. The management platform should not be involved in managing these accounts.
>>
>> Let's now look at a few example applications.
>>
>> 1. Helpdesk
>> All employees and contractors will need to be able to log into a helpdesk application and enter tickets. This means loading information about all employees and contractors. For a company with only 1000 employees that's manageable. For a company with 100K employees, it's a bigger challenge.
>>
>> 2. Customer Support
>> Only employees or contractors in the department providing customer support need access plus a few other employees. In addition identified customers may need accounts.
>>
>> 3. Expenses
>> Not all employees or contractors will be submitting expenses so it may not be necessary to have accounts for all possible users. This is something that the application owner needs to decide.
>>
>> Now let's look at logistics.
>>
>> Bulk load:
>> Each application will need an initial set of accounts set up, and for something like a helpdesk this could involve loading 1000-100,000 accounts.
>> The information needed could come from either the management server or separately, say from an HR system. Many servers that I have encountered limit the number of records returned to something like 1000, so pagination is needed for this. Even when dealing with a limited subset of employees or contractors you can run into this need.
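
As a rough sketch of paging through a bulk load, the loop below uses the `startIndex` (1-based) and `count` query parameters and the `totalResults`/`Resources` response fields from SCIM list responses. The in-memory `make_fake_server` helper, the 1000-record cap, and the 2500 sample users are stand-ins invented for the example; a real client would issue HTTP requests instead.

```python
from typing import Callable, Dict, List


def make_fake_server(total: int, server_max: int = 1000) -> Callable[[int, int], Dict]:
    """Build a hypothetical in-memory stand-in for GET /Users?startIndex=..&count=.."""
    users = [{"id": str(i), "userName": f"user{i}"} for i in range(1, total + 1)]

    def fetch_page(start_index: int, count: int) -> Dict:
        # The server may return fewer records than requested (here: at most server_max).
        count = min(count, server_max)
        batch = users[start_index - 1 : start_index - 1 + count]
        return {
            "totalResults": total,
            "startIndex": start_index,
            "itemsPerPage": len(batch),
            "Resources": batch,
        }

    return fetch_page


def fetch_all(fetch_page: Callable[[int, int], Dict], page_size: int = 1000) -> List[Dict]:
    """Collect every resource by advancing startIndex until totalResults is reached."""
    resources: List[Dict] = []
    start_index = 1  # SCIM paging is 1-based
    while True:
        page = fetch_page(start_index, page_size)
        batch = page.get("Resources", [])
        resources.extend(batch)
        # Advance by what was actually returned, not by what was requested.
        start_index += len(batch)
        if not batch or start_index > page["totalResults"]:
            break
    return resources


all_users = fetch_all(make_fake_server(2500), page_size=5000)
print(len(all_users))
```

Advancing by the number of records actually returned keeps the loop correct when the server silently caps the page size below what the client asked for.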
>>
>> Synchronization
>> An application that is bulk-loaded above may need to be synchronized to the management server if the data did not come from the management server.
>>
>> Change Management
>> This is really a synchronization issue as well. Changes happen all the time: new employees/contractors need to be added, terminated ones removed, and existing accounts updated. The best way of dealing with this may be to set up a message queue that each application can subscribe to, so they can take the needed action when it's convenient for that application. It's not the only method but it's the one I found to be the most helpful. There are two ways of doing that. 1. Send the complete user information for new accounts, just the change for updated accounts, and the ID for terminated accounts, along with some meta information. 2. Send just the ID and whether it's new, updated or terminated, which is the method I have used.
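
The two message styles described above might look something like the sketch below. It uses Python's standard `queue.Queue` as a stand-in for a real message broker; the message fields (`id`, `action`, `data`), the `lookup` callback, and the sample accounts are all hypothetical names chosen for the illustration.

```python
import queue
from typing import Callable, Dict, Optional

Account = Dict[str, str]


def apply_message(
    accounts: Dict[str, Account],
    message: dict,
    lookup: Callable[[str], Optional[Account]],
) -> None:
    """Apply one change message to the application's local account table."""
    account_id = message["id"]
    action = message["action"]  # "new", "updated" or "terminated"
    if action == "terminated":
        accounts.pop(account_id, None)
    elif "data" in message:
        # Style 1: the message carries the full record (new) or just the change (updated).
        if action == "new":
            accounts[account_id] = dict(message["data"])
        else:
            accounts.setdefault(account_id, {}).update(message["data"])
    else:
        # Style 2: only the ID and action were sent, so fetch the current record.
        record = lookup(account_id)
        if record is not None:
            accounts[account_id] = dict(record)


def drain(
    q: "queue.Queue[dict]",
    accounts: Dict[str, Account],
    lookup: Callable[[str], Optional[Account]],
) -> int:
    """Process everything currently queued; return the number of messages handled."""
    handled = 0
    while True:
        try:
            message = q.get_nowait()
        except queue.Empty:
            return handled
        apply_message(accounts, message, lookup)
        handled += 1


# Hypothetical source of truth consulted for ID-only messages.
source = {"7": {"userName": "carol", "title": "Engineer"}}
accounts: Dict[str, Account] = {}
q: "queue.Queue[dict]" = queue.Queue()
q.put({"id": "5", "action": "new", "data": {"userName": "alice"}})
q.put({"id": "5", "action": "updated", "data": {"title": "Manager"}})
q.put({"id": "7", "action": "new"})
q.put({"id": "5", "action": "terminated"})
handled = drain(q, accounts, source.get)
print(handled, accounts)
```

The ID-only style keeps messages small and avoids putting personal data on the queue, at the price of one lookup per change; the full-payload style avoids the lookup but requires every subscriber to trust the message contents.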
>>
>> I hope this is helpful to the discussion.
>>
>> Danny
>>
>>
>> _______________________________________________
>> scim mailing list
>> scim@ietf.org
>> https://www.ietf.org/mailman/listinfo/scim
>>