Re: [scim] SCIM Synchronization Problem

Phillip Hunt <phil.hunt@independentid.com> Fri, 20 August 2021 15:31 UTC

From: Phillip Hunt <phil.hunt@independentid.com>
Mime-Version: 1.0 (1.0)
Date: Fri, 20 Aug 2021 08:31:43 -0700
Message-Id: <6E006C03-06E0-4488-983E-0E0357B14363@independentid.com>
References: <b332ab98-b86d-a72c-a3c4-24e3abbddf76@pdmconsulting.net>
Cc: "Matt Peterson (mpeterso)" <Matt.Peterson=40oneidentity.com@dmarc.ietf.org>, SCIM WG <scim@ietf.org>
In-Reply-To: <b332ab98-b86d-a72c-a3c4-24e3abbddf76@pdmconsulting.net>
To: Danny Mayer <mayer@pdmconsulting.net>
X-Mailer: iPhone Mail (18G82)
Archived-At: <https://mailarchive.ietf.org/arch/msg/scim/cbre5YRzo05hk0QaTy1m8fVGH6I>
Subject: Re: [scim] SCIM Synchronization Problem

To be clear, the specs are written as a protocol profiling HTTP. The terms client and server stem from HTTP’s notion that a client initiates a request and a server responds.

It is OK to develop a higher level in the information layer. For example, we had to adopt publish and subscribe for the flow of security events, but the delivery specs talk about whether subscribers act as HTTP clients or servers depending on whether events are pushed to them or whether they pull events from the publisher.

Some of Danny’s requirements strike me as access control needs in a SCIM service. For example, a client that authenticates with the correct scope may be allowed to retrieve unlimited results in a request or may be allowed to poll frequently. Another client may only be allowed to update or read specific claims at a certain rate. Access control is something we can talk about in the charter (I have a proposal I can put together quickly in a draft), but I would not support conflating access control with synchronization.
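To illustrate the kind of per-client policy described above, here is a minimal sketch. The scope names, the limits, and the `policy_for` and `page_size` helpers are all hypothetical; nothing here is taken from a SCIM specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClientPolicy:
    max_results: Optional[int]       # None means no cap on results per request
    min_poll_interval_s: float       # minimum seconds between polling requests
    writable_attributes: frozenset   # attributes the client may update


# Hypothetical scope names and limits, for illustration only.
POLICIES = {
    "scim.sync": ClientPolicy(None, 1.0, frozenset()),
    "scim.profile.write": ClientPolicy(100, 60.0, frozenset({"displayName", "emails"})),
}
DEFAULT_POLICY = ClientPolicy(100, 300.0, frozenset())


def policy_for(scopes):
    """Return the policy of the first recognized scope, else the default."""
    for scope in scopes:
        if scope in POLICIES:
            return POLICIES[scope]
    return DEFAULT_POLICY


def page_size(policy, requested):
    """Clamp the requested result count to what the policy allows."""
    if policy.max_results is None:
        return requested
    return min(requested, policy.max_results)
```

The point of the sketch is that these limits hang off the authenticated client, not off the synchronization mechanism.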

The cases Danny is describing sound like a hybrid (of replica vs coordinated) where two domains are unequal but under a common authority.  This of course should be supported. 

The event model helps responsiveness in that subscribers can register to listen for changes as they occur. As I mentioned previously, I am happy to give a presentation on synchronous commands (i.e., REST) vs. async events and why events aid cross-domain synchronization performance and accuracy.
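As a rough sketch of the event model, the following shows a subscriber that registers for changes and translates each event into its own local terms. The `Publisher` and `LocalDomain` classes, the event type strings, and the group mapping are assumptions made up for this example; they are not part of SCIM or of the security events specs.

```python
class Publisher:
    """In-memory stand-in for an event publisher (hypothetical)."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, handler):
        self._subscribers.append(handler)

    def publish(self, event):
        for handler in self._subscribers:
            handler(event)


class LocalDomain:
    """Receiver that decides how each event maps onto its own store."""

    def __init__(self, group_map):
        self.users = {}
        self.group_map = group_map  # publisher's group name -> local group name

    def handle(self, event):
        kind, uid = event["type"], event["id"]
        if kind in ("user.created", "user.updated"):
            data = dict(event.get("data", {}))
            if "groups" in data:
                # Translate group names; drop groups with no local equivalent.
                data["groups"] = [self.group_map[g] for g in data["groups"]
                                  if g in self.group_map]
            self.users.setdefault(uid, {}).update(data)
        elif kind == "user.deleted":
            self.users.pop(uid, None)
        # Unknown event types are ignored rather than rejected.
```

The publisher never needs to know whether a given group exists in the receiving domain; the receiver has the context to decide.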

Phil

> On Aug 20, 2021, at 8:04 AM, Danny Mayer <mayer@pdmconsulting.net> wrote:
> 
> You cannot assume that the groups are the same across different application domains. In our case each group was totally different and had different permissions associated with it. In addition you can have different permissions associated individually to a user account. Microsoft's NTFS access controls are a good example of this.
> 
> The SCIM Config information should include information on what commands are available on that system. In addition, if something goes wrong, the application needs to tell the requestor why. For example, in my case, if a request to add a group to a user cannot be fulfilled because the group does not exist, then the failure needs to be reported back. Also, for our builtin and special accounts, the requestor cannot make any changes.
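A minimal sketch of reporting such failures back, using the error message schema from RFC 7644 (`schemas`, `status`, `detail`, and the optional `scimType`). The `add_user_to_group` function, the in-memory data, and the choice of 403 and 400/`invalidValue` for these two cases are assumptions for illustration, not a mapping the spec prescribes.

```python
ERROR_SCHEMA = "urn:ietf:params:scim:api:messages:2.0:Error"


def scim_error(status, detail, scim_type=None):
    """Build a SCIM error body; status is carried as a string."""
    body = {"schemas": [ERROR_SCHEMA], "status": str(status), "detail": detail}
    if scim_type is not None:
        body["scimType"] = scim_type
    return body


def add_user_to_group(groups, protected_users, user_id, group_id):
    """Return (http_status, body). groups maps group id -> set of member ids."""
    if user_id in protected_users:
        # Builtin and special accounts may not be changed by the requestor.
        return 403, scim_error(403, f"Account '{user_id}' cannot be modified")
    if group_id not in groups:
        # Say why the request failed instead of failing silently.
        return 400, scim_error(400, f"Group '{group_id}' does not exist", "invalidValue")
    groups[group_id].add(user_id)
    return 204, None
```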
> 
> Does this help?
> 
> Danny
> 
>> On 8/19/21 2:05 PM, Phillip Hunt wrote:
>> This is good high level case information.
>> 
>> Matt, you make a good case for access control standards. It brings out some questions…
>> 
>> Is syncing users and groups enough or must access policy rules/data be synchronized (assuming a common standard for policy)?  How close must separate domains be to work this well together? Is it true that access control is the same across domains?
>> 
>> These issues are what led me to believe client/server RESTful “commands” like POST/PUT/PATCH are insufficient. Instead, what if an event in one domain is published and a subscriber decides how to process the event? In command mode the requestor has to know the command will work. In event mode, the receiver has context to interpret and translate into local commands.
>> 
>> Maybe we can do a conference call and go through security events and how they support arm’s-length cross-domain relationships.
>> 
>> Phil
>> 
>>>> On Aug 19, 2021, at 10:33 AM, Matt Peterson (mpeterso) <Matt.Peterson=40oneidentity.com@dmarc.ietf.org> wrote:
>>> 
>>> Thank you, Danny, for creating a new thread.  The following are my comments from meeting minutes (some edits):
>>> 
>>> There appear to be two use cases for "synchronization":
>>> 
>>>  1.  Constructing and enforcing Application authorization models -  An application (acting as a SCIM client) caches users/groups data from the IdP in order to present "user/group pickers" when presenting screens used to configure the application  authorization rules (RBAC, Policy, or ACL).  Also, when the application enforces authorization, user and group data needs to be immediately available so that authorization decisions can be made quickly.
>>> 
>>>  2.  Identity Management and Governance systems - implement a "canonical identity model" where all accounts and groups are represented.  This model is used to build provisioning rules and calculate separation of duty violations, attestations, approvals, etc.  Management-time evaluation of the model needs to be done efficiently without external calls to the SCIM service provider.
>>> 
>>> Both of the above use cases are "client-cache" use cases that need only a "one-way" sync (from the SCIM server to the SCIM client).  To accomplish this there are two distinct steps:  a) download the initial results (users/groups), and b) keep the cached copy of these initial results (users/groups) up to date with changes that are being made on the SCIM service provider.
>>> 
>>> I think it could help us narrow down the list of suitable approaches if we could agree that "one-way" sync (keeping a client-side cache up to date) is our target.
>>> 
>>> --
>>> Matt Peterson
>>> 
>>> -----Original Message-----
>>> From: scim <scim-bounces@ietf.org> On Behalf Of Danny Mayer
>>> Sent: Wednesday, August 18, 2021 8:34 AM
>>> To: SCIM WG <scim@ietf.org>
>>> Subject: [scim] SCIM Synchronization Problem
>>> 
>>> 
>>> I decided that this needs its own thread and should not be part of the meeting minutes.
>>> 
>>> I have had a great deal of experience dealing with the user account synchronization problem. Here's my view of the problems.
>>> 
>>> I will be calling one system Management Server and the other system Application Server. I found client/server labels confusing. The Management Server is what I am defining to be the server that sends updates to add/update/remove users and groups to the Application Server, whose accounts, groups and access permissions are being managed.
>>> 
>>> First some definitions of user accounts. There is usually more than one of each of these:
>>> 1. Builtin accounts
>>> 2. Special-purpose accounts
>>> 3. Employee
>>> 4. Contractor
>>> 5. Agent
>>> 6. Customer
>>> 
>>> There may be more.
>>> 
>>> 1. Builtin accounts: These are accounts that applications have and there may be more than one. There is always an admin account which can do anything, for example the administrator account in Active Directory or a database admin account. The application may have more accounts for other purposes.
>>> 
>>> 2. Special-purpose accounts: These may be set up to provide access to other applications, for example a SCIM request to a SCIM REST API should be handled by a special account which cannot be used to log in via a UI and is only able to perform certain functions. In addition there may be accounts set up to listen for topics or queues on a message queue, among other possibilities. Keeping separate accounts like this is important for tracking in logs and applications.
>>> 
>>> 3. Employee: These are accounts that employees may use to log in to the application.
>>> 
>>> 4. Contractors: These are accounts that a contractor performing work for the company may use to log in to an application. Unlike Employee accounts these would have an expiration date.
>>> 
>>> 5. Agent: Accounts like this are for external users who may need to manage information for their own customers. An example of this is an insurance agent logging in to handle an insurance policy for their clients.
>>> 
>>> 6. Customers: These are where the customers are using the application directly. For a bank it's likely to be millions of customers. The management platform should not be involved in managing these accounts.
>>> 
>>> Let's now look at a few example applications.
>>> 
>>> 1. Helpdesk
>>> All employees and contractors will need to be able to log into a helpdesk application and enter tickets. This means loading information about all employees and contractors. For a company with only 1000 employees that's manageable. For a company with 100K employees, it's a bigger challenge.
>>> 
>>> 2. Customer Support
>>> Only employees or contractors in the department providing customer support need access plus a few other employees. In addition identified customers may need accounts.
>>> 
>>> 3. Expenses
>>> Not all employees or contractors will be submitting expenses so it may not be necessary to have accounts for all possible users. This is something that the application owner needs to decide.
>>> 
>>> Now let's look at logistics.
>>> 
>>> Bulk load:
>>> Each application will need an initial set of accounts set up, and for something like a helpdesk this could involve loading 1,000 to 100,000 accounts.
>>> The information needed could come from either the management server or separately, say from an HR system. Many servers that I have encountered limit the number of records returned to something like 1000, so pagination is required for this. Even when dealing with a limited subset of employees or contractors you can run into this need.
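A minimal sketch of the paging loop this implies, using the `startIndex` and `count` request parameters and the `totalResults` and `Resources` response fields from SCIM (RFC 7644). `fetch_page` and `fake_server` are hypothetical stand-ins for a real HTTP call, so the example runs without a network.

```python
def fetch_all(fetch_page, page_size=1000):
    """Collect every resource by walking startIndex/count pages.

    fetch_page(start_index, count) must return a dict shaped like a SCIM
    ListResponse: {"totalResults": int, "Resources": [...]}.
    """
    resources = []
    start_index = 1  # SCIM paging is 1-based
    while True:
        page = fetch_page(start_index, page_size)
        items = page.get("Resources", [])
        resources.extend(items)
        # Advance by what was actually returned, since the server may
        # return fewer items than requested.
        start_index += len(items)
        if not items or start_index > page["totalResults"]:
            return resources


def fake_server(total, server_limit):
    """Hypothetical server that caps each page at server_limit records."""
    users = [{"id": str(i)} for i in range(1, total + 1)]

    def fetch_page(start_index, count):
        n = min(count, server_limit)
        first = start_index - 1
        return {"totalResults": total, "Resources": users[first:first + n]}

    return fetch_page
```

Note that a plain paging loop like this does not protect against records being added or removed between pages; that is part of the synchronization problem itself.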
>>> 
>>> Synchronization
>>> An application that is bulk-loaded above may need to be synchronized to the management server if the data did not come from the management server.
>>> 
>>> Change Management
>>> This is really a synchronization issue as well. Changes happen all the time: new employees/contractors need to be added, terminated ones removed, and existing ones updated. The best way of dealing with this may be to set up a message queue that each application can subscribe to, so they can take the needed action when it's convenient for that application. It's not the only method but it's the one I found to be the most helpful. There are two ways of doing that. The first is to send the complete user information for new accounts, just the change for updated accounts, and the ID for terminated accounts, along with some meta information. The other method, which I have used, is just to send the ID and whether it's new, updated or terminated.
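The two message styles described above could look roughly like this. The field names (`op`, `id`, `user`, `changes`) and the `lookup` callback are assumptions made for illustration; none of them is defined by SCIM.

```python
def apply_full_message(cache, msg):
    """Style 1: the message carries the data needed to apply the change."""
    op, uid = msg["op"], msg["id"]
    if op == "new":
        cache[uid] = dict(msg["user"])
    elif op == "updated":
        cache.setdefault(uid, {}).update(msg["changes"])
    elif op == "terminated":
        cache.pop(uid, None)


def apply_id_only_message(cache, msg, lookup):
    """Style 2: the message carries only the ID and the change type.

    The subscriber calls lookup(uid) to read the current state from the
    source when it is ready to process the change.
    """
    op, uid = msg["op"], msg["id"]
    if op == "terminated":
        cache.pop(uid, None)
    else:  # "new" or "updated"
        cache[uid] = dict(lookup(uid))
```

The second style keeps messages small and avoids putting user data on the queue, at the cost of one read back to the source per change.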
>>> 
>>> I hope this is helpful to the discussion.
>>> 
>>> Danny
>>> 
>>> 
>>> _______________________________________________
>>> scim mailing list
>>> scim@ietf.org
>>> https://www.ietf.org/mailman/listinfo/scim
>