Re: [scim] SCIM Synchronization Problem

Craig McClanahan <craigmcc@gmail.com> Tue, 24 August 2021 01:44 UTC

From: Craig McClanahan <craigmcc@gmail.com>
Date: Mon, 23 Aug 2021 18:43:42 -0700
Message-ID: <CANgkmLBneyCYFK4Hn0CG4pXDKW5Nfm+Z77dbnhqrGpotaM8EkQ@mail.gmail.com>
To: "Matt Peterson (mpeterso)" <Matt.Peterson=40oneidentity.com@dmarc.ietf.org>
Cc: Danny Mayer <mayer@pdmconsulting.net>, SCIM WG <scim@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/scim/VKqZ6vZ447jYHnPfvouruSzoc9c>
Subject: Re: [scim] SCIM Synchronization Problem

Several years ago (early in the lifetime of the SCIM specification), I was
involved in exactly this kind of situation.  I was trying to extract the
Identity portion of a monolith out into a separate service.  Naturally, as
you'll see in lots of monoliths, the User database information was used for
*lots* of things, not just authentication and authorization, so we couldn't
just remove it from the monolith's database.

From a SCIM terminology perspective, life was actually pretty clear -- the
monolith was the SCIM server (the source of truth), and the new Identity
Service was the SCIM client.  Creating the initial download mechanism was
no big deal -- one humongous batch transfer (or done in pieces if need be)
and the Identity Service is up to date with the initial snapshot.  But what
happens when:

   - User changes their name (or a bunch of other profile fields)?  Not
   security sensitive, but definitely UI sensitive.  If they are part of the
   data transferred from the SCIM server to the SCIM client, it's definitely
   relevant.
   - User logs on or logs off?  (Groan ... the monolith cared a lot about
   this, because it affects UI like "is this person currently logged in" ...
   even worse, the monolith updated a field in the actual user database row,
   which triggered a gazillion "user change" events (originally internal to
   the monolith) documenting a change that had no other impact than to change
   the last logged in timestamp ... sigh ... performance ... sigh.)
   - User changes their password?  We probably could have absorbed that
   function, but would have required modifications to a bunch of apps out of
   my team's purview, so we didn't.
   - For that matter, *any* change happens between the time the snapshot was
   taken and the time the Identity Service caught up?

Anyone who has ever tried this trick has most likely run into the same
kinds of issues -- how do you deal with incremental changes that need to be
communicated from the server to the client?  Some of those changes
(passwords, ACLs, account deactivations, etc.) are very much time-critical.
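
For illustration only: the in-band option SCIM itself offers here is for the
client to poll with a filter on meta.lastModified (RFC 7644 uses that
attribute in its filter examples).  The sketch below just builds such a query
URL.  The base URL, the helper name changed_since_url and the page size are
assumptions for the example, not taken from the system described above.  Note
that polling like this does not surface deletions and is only as timely as
the polling interval, so it does not solve the time-critical cases.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def changed_since_url(base_url: str, since: datetime, count: int = 100) -> str:
    """Build a SCIM query URL for Users modified after `since`.

    `base_url` and `count` are illustrative assumptions.
    """
    # SCIM dateTime values are xsd:dateTime strings; use UTC with a Z suffix.
    stamp = since.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    query = urlencode({
        "filter": f'meta.lastModified gt "{stamp}"',
        "startIndex": 1,   # SCIM pagination is 1-based
        "count": count,
    })
    return f"{base_url}/Users?{query}"


# Hypothetical server address; the timestamp is when the last sync finished.
last_sync = datetime(2021, 8, 23, 18, 43, 42, tzinfo=timezone.utc)
url = changed_since_url("https://monolith.example.com/scim/v2", last_sync)
print(url)
```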

Our team chose to try to use webhooks from the monolith (SCIM server) back
to the Identity Service (SCIM client), totally out-of-band to anything
defined by SCIM.  This would have worked OK if the webhook technology was
actually reliable and incorporated things like guaranteed forwarding of the
change event messages.  It doesn't work very well, of course, when the
Identity Service hasn't fully processed the initial snapshot yet but
receives a realtime update from the monolith.  In retrospect, this approach
was probably a mistake -- a different messaging technology would have been
better for a narrow point-to-point requirement like this, but would not
have addressed all of the issues.  And what about a more broad-based
notification requirement?
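
To make the "update arrives before the snapshot is processed" race concrete,
here is a minimal sketch of one common way to handle it: buffer change events
until the snapshot load finishes, then replay them, and never let an older
version overwrite a newer one.  It is not what our team built.  The names
ChangeEvent and IdentityStore, and especially the per-user integer version,
are assumptions; SCIM's meta.version is an opaque ETag and cannot be compared
this way, so a real system would need its own ordering (a sequence number or
a trustworthy timestamp).

```python
from dataclasses import dataclass, field


@dataclass
class ChangeEvent:
    user_id: str
    version: int        # hypothetical, increases with every change to a user
    attributes: dict


@dataclass
class IdentityStore:
    users: dict = field(default_factory=dict)    # user_id -> (version, attrs)
    snapshot_done: bool = False
    pending: list = field(default_factory=list)  # events seen during the load

    def _apply(self, user_id, version, attributes):
        # Keep whichever record carries the higher version.
        current = self.users.get(user_id)
        if current is None or version > current[0]:
            self.users[user_id] = (version, attributes)

    def on_event(self, event: ChangeEvent):
        # Real-time updates are parked until the snapshot has been loaded.
        if not self.snapshot_done:
            self.pending.append(event)
        else:
            self._apply(event.user_id, event.version, event.attributes)

    def load_snapshot(self, rows):
        # rows: iterable of (user_id, version, attributes) from the bulk load.
        for user_id, version, attributes in rows:
            self._apply(user_id, version, attributes)
        self.snapshot_done = True
        # Replay whatever arrived while the snapshot was being processed.
        for event in self.pending:
            self._apply(event.user_id, event.version, event.attributes)
        self.pending.clear()


store = IdentityStore()
store.on_event(ChangeEvent("u1", 2, {"displayName": "New Name"}))  # early
store.load_snapshot([
    ("u1", 1, {"displayName": "Old Name"}),
    ("u2", 1, {"displayName": "Someone Else"}),
])
store.on_event(ChangeEvent("u2", 1, {"displayName": "Stale"}))     # ignored
```

This only covers ordering on the receiving side; it does nothing about events
that the delivery mechanism drops, which was the other half of the problem.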

Could we have turned the whole thing around, and made the new Identity
Service the SCIM server, and the monolith the SCIM client?  I suppose, but
it would have required the Identity Service to be involved in a gazillion
things that were not authentication or authorization related, and the need
for some sort of out-of-band "incremental change" event notifications would
have remained, just going in the other direction.

For the SCIM specification, it's the "incremental change" thing that is the
hard nut to crack, IMHO.  But it's actually a bigger problem than that,
endemic to any scenario where you are trying to tear apart a monolith.

Craig McClanahan



On Mon, Aug 23, 2021 at 4:30 PM Matt Peterson (mpeterso)
<Matt.Peterson=40oneidentity.com@dmarc.ietf.org> wrote:

> Thanks for the clarification.
>
> "Client" and "Server" are used dozens of times in the existing RFCs,
> which makes them as useful in understanding the SCIM protocol as they are
> for understanding the HTTP protocol.  In a SCIM protocol exchange there is
> a thing that provides a resource (the server) and a thing that makes an
> HTTP request (GET, POST, PUT, PATCH, DELETE) on the resource (the client).
>
> I think it would be helpful to try and use this generally accepted HTTP
> terminology if we can -- especially when talking about this synchronization
> topic.
>
> Management server <-- SCIM Client
> Application Server <-- SCIM Server
>
> With this clarification, seems to me like your Management Server is using
> SCIM to manage accounts on the Application Server, and that the Management
> Server would benefit from having an up-to-date cache of accounts that are
> on the Application Server(s) even for cases where account
> changes/additions/deletions are made "out of band"  (e.g. directly on the
> Application Server, not initiated by the Management server.)
>
> Did I understand correctly?
>
>
>
> -----Original Message-----
> From: Danny Mayer <mayer@pdmconsulting.net>
> Sent: Friday, August 20, 2021 8:54 AM
> To: Matt Peterson (mpeterso) <Matt.Peterson@oneidentity.com>; SCIM WG
> <scim@ietf.org>
> Subject: Re: [scim] SCIM Synchronization Problem
>
> I decided to avoid the client/server naming convention because I find it
> confusing. I have declared the system responsible for managing the user
> accounts and groups to be the Management server. I believe you describe
> this as the "SCIM Client"; it issues GET/POST/PATCH, etc. requests to the
> Application server to maintain the user accounts and groups. The
> Application server is what I believe you call the "SCIM Server". It
> provides the SCIM APIs needed.
>
> I find the use of client/server not helpful for understanding their roles
> in the protocol.
>
> Danny
>
> On 8/19/21 1:42 PM, Matt Peterson (mpeterso) wrote:
> > Danny,
> >
> > To help me understand which component in your post is the "SCIM Client"
> > and which is the "SCIM Server", can you tell me which of your components
> > (the Management Server or the Application Server) implements SCIM
> > endpoints?
> >
> > The component that acts as a "SCIM Server" is the component that
> > provides the Web API endpoints that conform to the SCIM spec, for
> > example the /Users and /Groups SCIM endpoints.
> >
> > The component(s) that use these endpoints to query (GET) users/groups
> > or to create (POST) users/groups act as the "SCIM Client".
> >
> > --
> > Matt
> >
> > -----Original Message-----
> > From: scim <scim-bounces@ietf.org> On Behalf Of Danny Mayer
> > Sent: Wednesday, August 18, 2021 8:34 AM
> > To: SCIM WG <scim@ietf.org>
> > Subject: [scim] SCIM Synchronization Problem
> >
> > I decided that this needs its own thread and should not be part of the
> > meeting minutes.
> >
> > I have had a great deal of experience dealing with the user account
> > synchronization problem. Here's my view of the problems.
> >
> > I will be calling one system the Management Server and the other system
> > the Application Server. I found client/server labels confusing. I am
> > defining the Management Server to be the server that sends updates to
> > add/update/remove users and groups to the Application Server, whose
> > accounts, groups and access permissions are being managed.
> >
> > First some definitions of user accounts. There is usually more than one
> > of each of these:
> > 1. Builtin accounts
> > 2. Special-purpose accounts
> > 3. Employee
> > 4. Contractor
> > 5. Agent
> > 6. Customer
> >
> > There may be more.
> >
> > 1. Builtin accounts: These are accounts that applications have and there
> > may be more than one. There is always an admin account which can do
> > anything, for example the administrator account in Active Directory or a
> > database admin account. The application may have more accounts for other
> > purposes.
> >
> > 2. Special-purpose accounts: These may be set up to provide access to
> > other applications. For example, a SCIM request to a SCIM REST API should
> > be handled by a special account which cannot be used to log in via a UI
> > and is only able to perform certain functions. In addition, there may be
> > accounts set up to listen for topics or queues on a message queue, among
> > other possibilities. Keeping separate accounts like this is important for
> > tracking in logs and applications.
> >
> > 3. Employee: These are accounts that employees use to log in to the
> > application.
> >
> > 4. Contractors: These are accounts that a contractor performing work for
> > the company may use to log into an application. Unlike Employee accounts,
> > these would have an expiration date.
> >
> > 5. Agent: Accounts like this are for external users who may need to
> > manage information for their own customers. An example of this is an
> > insurance agent logging in to handle an insurance policy for their
> > clients.
> >
> > 6. Customers: These are accounts where the customers use the application
> > directly. For a bank it's likely to be millions of customers. The
> > management platform should not be involved in managing these accounts.
> >
> > Let's now look at a few example applications.
> >
> > 1. Helpdesk
> > All employees and contractors will need to be able to log into a
> > helpdesk application and enter tickets. This means loading information
> > about all employees and contractors. For a company with only 1000
> > employees that's manageable. For a company with 100K employees, it's a
> > bigger challenge.
> >
> > 2. Customer Support
> > Only employees or contractors in the department providing customer
> > support need access, plus a few other employees. In addition, identified
> > customers may need accounts.
> >
> > 3. Expenses
> > Not all employees or contractors will be submitting expenses, so it may
> > not be necessary to have accounts for all possible users. This is
> > something that the application owner needs to decide.
> >
> > Now let's look at logistics.
> >
> > Bulk load:
> > Each application will need an initial set of accounts set up, and for
> > something like a helpdesk this could involve loading 1,000-100,000
> > accounts. The information needed could come from either the management
> > server or separately, say from an HR system. Many servers that I have
> > encountered limit the number of records returned to something like 1000,
> > so pagination is needed for this. Even when dealing with a limited
> > subset of employees or contractors you can run into this need.
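
As a sketch of the pagination loop described in the quoted text: SCIM list
responses (RFC 7644) carry totalResults, startIndex, itemsPerPage and
Resources, and the client advances startIndex by however many resources it
actually received, since the server may return fewer than requested.  The
helper names fetch_all and fake_server, the 1000-record cap and the in-memory
"server" are assumptions made so the example is self-contained.

```python
def fetch_all(fetch_page, page_size=1000):
    """Collect every resource by walking SCIM-style pages.

    fetch_page(start_index, count) must return a dict shaped like a SCIM
    ListResponse; it stands in for the HTTP GET a real client would make.
    """
    resources = []
    start_index = 1  # SCIM pagination is 1-based
    while True:
        page = fetch_page(start_index, page_size)
        batch = page.get("Resources", [])
        resources.extend(batch)
        # Advance by what was received, not by what was asked for.
        start_index += len(batch)
        if not batch or start_index > page["totalResults"]:
            return resources


def fake_server(total, server_cap=1000):
    """Hypothetical in-memory stand-in for a server that caps page size."""
    users = [{"id": str(i), "userName": f"user{i}"} for i in range(1, total + 1)]

    def fetch_page(start_index, count):
        count = min(count, server_cap)
        chunk = users[start_index - 1:start_index - 1 + count]
        return {
            "totalResults": total,
            "startIndex": start_index,
            "itemsPerPage": len(chunk),
            "Resources": chunk,
        }

    return fetch_page


all_users = fetch_all(fake_server(2500), page_size=5000)
print(len(all_users))
```
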
> >
> > Synchronization
> > An application that is bulk-loaded as above may need to be synchronized
> > with the management server if the data did not come from the management
> > server.
> >
> > Change Management
> > This is really a synchronization issue as well. Changes happen all the
> > time: new employees/contractors need to be added, terminated ones
> > removed, and existing accounts updated. The best way of dealing with this
> > may be to set up a message queue that each application can subscribe to,
> > so that each can take the needed action when it's convenient for that
> > application. It's not the only method but it's the one I found to be the
> > most helpful. There are two ways of doing that. The first is to send the
> > complete user information for new accounts, just the change for updated
> > accounts, and the ID for terminated accounts, along with some meta
> > information. The other method, which I have used, is to send just the ID
> > and whether it's new, updated or terminated.
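
The two message styles in the quoted text can be sketched as follows.  This
is only an illustration under assumptions: the field names (id, action,
payload), the function apply_message and the fetch_user callback are made up
for the example and come from neither SCIM nor the systems discussed.  A
message with a payload is applied directly; an ID-only notification makes the
subscriber fetch the current record itself.

```python
def apply_message(message, cache, fetch_user):
    """Apply one change message to a subscriber's local cache of users.

    message: dict with "id", "action" ("new", "updated" or "terminated")
             and, in the full-data style only, "payload".
    fetch_user: callback used by the ID-only style to read the current
                record from the source system (hypothetical).
    """
    user_id = message["id"]
    if message["action"] == "terminated":
        cache.pop(user_id, None)          # both styles send just the ID here
    elif "payload" in message:
        if message["action"] == "new":
            cache[user_id] = dict(message["payload"])   # complete record
        else:
            cache.setdefault(user_id, {}).update(message["payload"])  # delta
    else:
        cache[user_id] = fetch_user(user_id)  # ID-only: go and get the record


source = {"42": {"userName": "asmith", "title": "Engineer"}}
cache = {}

# Style 1: the message carries the data.
apply_message({"id": "7", "action": "new",
               "payload": {"userName": "bjones", "title": "Analyst"}},
              cache, source.get)
apply_message({"id": "7", "action": "updated",
               "payload": {"title": "Manager"}}, cache, source.get)

# Style 2: the message carries only the ID and the kind of change.
apply_message({"id": "42", "action": "new"}, cache, source.get)
apply_message({"id": "7", "action": "terminated"}, cache, source.get)
```

The ID-only style costs an extra read per change but always sees the latest
state, which makes out-of-order or duplicated messages less of a concern.
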
> >
> > I hope this is helpful to the discussion.
> >
> > Danny
> >
> >
> > _______________________________________________
> > scim mailing list
> > scim@ietf.org
> > https://www.ietf.org/mailman/listinfo/scim
> >
>