Re: [dispatch] A State Synchronization Working Group
worley@ariadne.com Tue, 07 November 2023 03:39 UTC
From: worley@ariadne.com
To: Michael Toomim <toomim@gmail.com>
Cc: dispatch@ietf.org
In-Reply-To: <a3b09c5b-bffe-c0b8-a648-fe5e85786994@gmail.com> (toomim@gmail.com)
Sender: worley@ariadne.com
Date: Mon, 06 Nov 2023 22:37:23 -0500
Message-ID: <87fs1is07g.fsf@hobgoblin.ariadne.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dispatch/21ju2qNjaVevaYjuZ28avnzirKc>
Subject: Re: [dispatch] A State Synchronization Working Group
Sorry, there are a lot of details here. I'm not objecting to the Braid concept, but I think a lot of details in the drafts need to be tightened up.

Comments on draft-toomim-httpbis-braid-http-03

As far as I can tell, this I-D specifies a group of HTTP facilities that can be used for state synchronization. From where I sit, the two obvious uses are: (1) Collaborative editing of a document by a set of clients sending updates to a master copy on a server, and (2) A server maintaining a state object logically consisting of a set of name/value pairs, represented as an XML structure, sending incremental state changes to a set of client/subscribers.

1. What is lacking in the exposition is an example of usage. Indeed, the drafts don't include a sufficient set of facilities to present an example. They appear to be a framework which can be extended to various usages. This is going to work against their adoption, as the IETF is engineering-oriented and usually only advances work that can be applied to a current need.

In particular, it seems that any use of Braid depends on the particular Merge-Type that is specified; the Merge-Type defines how all of the tricky cases are resolved. Without a defined Merge-Type, no examples can be given. Ideally, there would be several defined Merge-Types, allowing us to see how the framework nature of Braid plays out in practice in a number of different usages.

2. The discussion of the history DAG should be clarified, as that seems to be a central concept.

    (1) It needs to be stated that the "version history" is the same thing as the DAG that is the "full graph of parents".

    (2) There must be some initial version, which inherently has *zero* parents, which isn't allowed by the current text.

    (3) It should be stated whether there can be more than one initial version in the history, i.e., whether an edit can somehow combine two parents which themselves share no ancestors.

3.
It is unclear what this passage could mean, since each version presumably has a unique ID:

    Parallel edits can merge into a single version with multiple IDs:

        Version: "dkn7ov2vwg", "v2vwgdkn7o"

4. This seems to guarantee bad outcomes:

    If a client or server does not specify a Parents header when transferring a new version, the recipient MAY presume that the most recent versions it has seen are the parents of the new version.

Specifically, if the recipient does not know of the parents of the new version, it should be able to determine that it cannot accept the new version.

5. In section 2.1 quoted below, "a unique point in time" needs to be rephrased, as clearly two independent processors could create distinct versions at exactly the same time. Perhaps "A Version marks a specific point in the version history and consequently a specific content -- not just a specific content alone."

    2. A Version marks a unique point in time -- not unique content. If a resource is changed from version A to B, and then to C, such that the contents at A are the same as the contents at C, then it is possible versions A and C to have the same ETag, even though they have different Versions. This can break a CRDT or OT merge algorithm.

6. Braid seems to assume that the resources it is versionizing are identified by URLs, which URLs are used to access the resource on a server. This works for the two examples I've mentioned at the beginning, which have "star-shaped" data flow, with a set of clients accessing one server. But for truly distributed state, "definitive" copies of the resource live on more than one server, and perforce the copies have distinct URLs. There needs to be some mechanism to separate the URLs for accessing copies from the "identity" of the resource.

7. Braid seems to permit using different Merge-Types for different changes. It's not at all clear to me how this can be made to work in general, as e.g.
a server maintaining a "definitive" copy of a resource needs to be able to merge multiple versions. If different versions could have different Merge-Types attached, there's no defined way for the server to determine what to do. It seems that a single resource, for its entire history, must have a single Merge-Type associated with it.

8. This is phrased poorly:

    When a PUT request changes the state of a resource, it can specify the new version of the resource, the parent version IDs that existed when it was created, [...]

This should say something like "the parent version IDs from which the new version is derived". After all, hundreds of version IDs for the resource may exist.

9. In regard to:

    2.2. PUT a new version

    We call the set of data that updates a resource from one version to another an "update". An update consists of a set of headers and a body. In this example, the update includes a snapshot of the entire new value of the resource. However, one can also specify the update as a set of patches.

This would be a good place to explain how an update is marked regarding what sort of update it is. E.g., if it is a complete new copy of the resource, or a set of patches. (This may be explained later in the I-D, but it should be at least mentioned here.)

Also, what specifies what possible update types (e.g. full/patch) are applicable? The Merge-Type doesn't seem to specify that. Are the possible update types inherent to the resource's Content-Type?

A bit more subtly, does the Merge-Type algorithm take into account the particular type of PUT used to generate the versions, or does it only take into account the contents of the versions?

10. In regard to:

    2.5. Rules for Version and Parents headers

    If a GET request contains a Version header:
    ...
    If a GET request contains a Parents header:
    ...
    If a GET request contains both a Version and Parents header:

Note that the third case overlaps with the first and second as they are written. You should split these out more cleanly.
Also, this text should be the introduction to the text now in secs. 2.3 and 2.4, rather than following them.

11. In regard to:

    - If the server does not support historical versions, it MAY ignore the Version header and respond as usual, but MUST NOT include the Version header in its response.

This isn't really a specificational part of this I-D, since by definition a server that does not support historical versions isn't conformant to this I-D. What it is is an analysis of what happens if a version-aware client attempts to access a non-version-aware server. Of course, the important point is that the client can determine that the server is not version-aware because of the absence of the Version header in the response. All of the points like this should be grouped into a section on upward compatibility, rather than being written specificationally.

12. It's not clear how to obtain the entire history of a resource with a GET, or even "all of the ancestors of version X", if all you have is the current version (which is what is returned by a plain GET). Since you don't know what the initial version ID is, there's no way to construct a Parents header for such a request.

13. In regard to getting a range of versions, what is the exact rule for what versions are returned? Naively, I would expect "all versions strictly descended from at least one of the Parents and weakly ancestral to at least one of the Versions". That would ensure that if the sequence of versions was linear, the responses to

    GET /foo
    Parents: "a"
    Version: "b"

    GET /foo
    Parents: "b"
    Version: "c"

    GET /foo
    Parents: "c"
    Version: "d"

would neatly dovetail to be the same as the response to

    GET /foo
    Parents: "a"
    Version: "d"

But GET doesn't allow multiple Version values. I can also propose other plausible rules. The rule given in the I-D clearly isn't the one I stated above, and it's not entirely clear what it is, making it hard to tell whether it would work nicely in practice.

14.
The structure for "the response body contains a sequence of updates; each with its own content-length" seems to be new, whereas it seems that a multipart MIME type (RFC 1872) would be the natural mechanism to use. I doubt that a mechanism that is logically the same as multipart/related but syntactically different would get approved.

15. In regard to:

    A server MAY refactor or rebase the version history that it provides to a client, so long as it does not affect the resulting state, or the result of the patch-type's merges.

This probably should be phrased more carefully, as "refactor" and "rebase" have a lot of meanings. My expectation is that the server is required to send the contents of the specified versions in the DAG, but it is free to send responses which represent those versions however it chooses. E.g. even if a version is created from a previous version with a non-patch PUT, the server is allowed to send to the client a patch from the previous version to the given version instead of the entire given version, as long as the patch creates the same content as was PUT.

OTOH, you might intend that the server can rearrange the versions in the DAG in some way. But if so, cleanly specifying exactly what is permitted gets really complicated.

16. Section 2.5 states:

    A server does not need to honor historical version requests for all documents, for all history. If a server no longer has the historical context needed to honor a request, it may respond using an error code that will be defined in a subsequent version of this draft.

but draft-toomim-httpbis-merge-types-00 states:

    To compute the result, a Merge Type has access to the entire version history that preceded each of the parents, but it cannot depend on information outside of that version history.

Which is it to be? If in general a Merge-Type can see the entire version history, then any processor that can do merges must possess the entire version history.

17.
It would be useful to start blocking out what the Braid-specific 4xx responses are for the operations the I-D is defining.

18. In sec. 3.1 there is "Content-Range: json .messages[1:1]". Is this defined anywhere? OK, that's in draft-toomim-httpbis-range-patch-00 sec. 4.1 but it should be footnoted clearly.

19. Sec. 3.3 describes combining a set of patches into a single patch for update purposes. This is another place where a multipart body should be used.

20. Sec. 4 discusses subscriptions. It would be worth coordinating this mechanism with other work on subscriptions; my understanding is that a general subscription mechanism is not technically simple. In particular, long-lived TCP connections that idle are often subject to being cut off by middleboxes. Also, each subscription requires a separate TCP connection be kept open. Better to piggy-back off work that others have done than spend the effort to re-do it.

Note in particular sec. 6.2:

    A cache supporting the Braid extensions, however, will automatically update whenever a change occurs. If a client starts a GET Subscription with a proxy, the proxy will then start and maintain a GET Subscription with the origin server.

This doesn't mention that the cache isn't a single resource, but a group of resources, each one of which must be subscribed to separately in order to receive updates. -- Unless this is implicitly proposing some sort of "cache" Content-Type which contains a number of sub-resources. But as far as I know, the machinery for that has not been defined.

21. What exactly is the subscriber receiving? Implicitly, it seems to be "all new versions that are created", but beware that since versions form a DAG, the new versions aren't necessarily a linear chain of updates.

22. What sets of versions get merged, and by whom?
If a server contains version X, client 1 sends an update X -> Y, and client 2 sends an update X -> Z, it seems likely that we want the server to merge Y and Z to produce W as a descendant of both Y and Z:

          X
         / \
        /   \
       /     \
      Y       Z
       \     /
        \   /
         \ /
          W

But maybe we don't. Some processor needs to be configured to merge the right sets of versions, and I don't see a specification of that mechanism.

Comments on draft-toomim-httpbis-merge-types-00:

23. In regard to sec. 4.1.4:

    Merge Type registrations may not be deleted; Merge Types that are no longer believed appropriate for use can be declared OBSOLETE;

However, no column is specified in sec. 4.1.1 to record obsolete/non-obsolete status.

24. This part of sec. 5 seems useful but not at all part of "security considerations". More deeply, how is this to be acted upon when a Merge-Type is defined? Is the set of possible Validators known when a Merge-Type is defined? Or is this really a consideration that can only be done when the system that *uses* Merge-Type is designed?

    o A Merge Type should be considered in concert with a Validator. For instance, it is possible for two transactions A and B to both be valid individually, but when merged, the result is not valid. For instance, if a bank account has $1, then two simultaneous debits of $1 are both valid individually, but the resulting merger, yielding -$2, drains the bank account balance below $0. This is the double-spend problem. A system using Merge Types SHOULD be aware of how its Merge Types interact with its Validators.

Dale
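
P.S. To make the range rule suggested in item 13 concrete, here is a minimal sketch of it in Python. Everything in it is an assumption for illustration: the history is modeled as a plain mapping from version ID to the set of its parent IDs, and the names `ancestors` and `version_range` are hypothetical and appear in none of the drafts. It implements only the naive rule quoted in item 13 ("strictly descended from at least one of the Parents and weakly ancestral to at least one of the Versions"), not whatever rule the I-D intends.

```python
from typing import Dict, Iterable, Set

# Hypothetical history model: version ID -> set of parent version IDs.
History = Dict[str, Set[str]]


def ancestors(history: History, version: str) -> Set[str]:
    """Return the strict ancestors of `version` (the version itself excluded)."""
    seen: Set[str] = set()
    stack = list(history[version])
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(history[v])
    return seen


def version_range(history: History,
                  parents: Iterable[str],
                  versions: Iterable[str]) -> Set[str]:
    """Versions strictly descended from at least one of `parents` and
    weakly ancestral to at least one of `versions`."""
    parent_set = set(parents)
    # Weakly ancestral: a requested version itself, or any of its ancestors.
    candidates: Set[str] = set()
    for v in versions:
        candidates |= ancestors(history, v) | {v}
    # Strictly descended: some requested parent is a strict ancestor.
    return {v for v in candidates if ancestors(history, v) & parent_set}


# Linear history a <- b <- c <- d, as in the GET /foo example of item 13.
linear: History = {"a": set(), "b": {"a"}, "c": {"b"}, "d": {"c"}}
pieces = (version_range(linear, ["a"], ["b"])
          | version_range(linear, ["b"], ["c"])
          | version_range(linear, ["c"], ["d"]))
whole = version_range(linear, ["a"], ["d"])
```

Under this rule the three piecewise requests and the single request from "a" to "d" select the same set of versions, which is the dovetailing property I was after. Note that the function accepts several Versions, which the GET defined in the I-D does not allow.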