Re: [Storagesync] Storagesync Digest, Vol 5, Issue 1

Ted Lemon <mellon@fugue.com> Mon, 07 December 2015 17:57 UTC

From: Ted Lemon <mellon@fugue.com>
To: storagesync@ietf.org
In-Reply-To: <CAO_Yprbct8wFbS1WFnZZENSp-OruRUk2nRyBv4tNeKv9_CGuCg@mail.gmail.com>
References: <20151204181110.GA2418@localhost.localdomain> <1449255654746-36498631-5591108f-793d865a@fugue.com> <8F085EBA-F6A4-4FBD-8B8E-1F9AE114FD05@unterwaditzer.net> <CAO_YpraJsDKbOXD9MdxHqeAYTMoiZFyViHX+P2PtD=9hpRz9MQ@mail.gmail.com> <20151206173646.GA6290@localhost.localdomain> <1449447450498-61af5a96-1c461047-3019ac1e@gmail.com> <20151207002020.GA5002@localhost.localdomain> <1449448362292-7d42d496-109559e8-4177b3f9@gmail.com> <20151207003810.GA24130@localhost.localdomain> <1449449404474-72724227-c54ecf87-7d18f3b0@gmail.com> <20151207005426.GA29483@localhost.localdomain> <CAO_YpramyzAZ8hS6aphmBNw2FiKTpesb9uW7uGHtjRH_YkPAJg@mail.gmail.com> <1449452139832-4f314827-a7ecd596-c5312339@fugue.com> <1449454580239-1fd59d90-52f0231b-370f2ef5@gmail.com,> <> <1449455245871-cb7e86e1-1a0160c5-aa6acce3@fugue.com> <2015120711170621874681@bjtu.edu.cn> <1449459616112-6043cb32-cd69a1f9-1399f1c0@fugue.com> <CAO_Yprbct8wFbS1WFnZZENSp-OruRUk2nRyBv4tNeKv9_CGuCg@mail.gmail.com>
Date: Mon, 07 Dec 2015 17:57:42 +0000
Message-Id: <1449511062426-94cdee34-064ef498-327458b6@fugue.com>
MIME-Version: 1.0
Archived-At: <http://mailarchive.ietf.org/arch/msg/storagesync/-Ajf9DF3RWMgCQE7mTj8auO8aqU>
Subject: Re: [Storagesync] Storagesync Digest, Vol 5, Issue 1
List-Id: Mechanisms to synchronize client file systems with Internet-based data storage services <storagesync.ietf.org>

Monday, Dec 7, 2015 1:45 AM Linhui Sun wrote:
> 2015-12-07 11:40 GMT+08:00 Ted Lemon <mellon@fugue.com>:
>> I think C is the only right answer.   We want conflict resolution to be
>> automatic whenever possible, but we do want to do conflict resolution, and
>> not just wipe out changes, because those changes may contain real work that
>> the user on client A did.
> Do you mean you want conflict resolution to be performed every time? If
> so, I think this might be a little unnecessary, since version conflicts
> do not happen very frequently (especially for some personal users). The
> case you mentioned should definitely be resolved, and such an
> offline-to-online switch could be treated as a concurrent conflict in my view.

It really depends on the use case.

> But a more frequent case is that people update their file just to replace
> the previous one, even though there are multiple people working on the same
> file. In this case, replacing the file according to modification time seems
> reasonable. So the key point is how to decide which two or more versions
> should trigger conflict resolution, to avoid wiping out real work.

This is a common use case for folders that are used informally as a way to transfer files between users. Typically each copy of the file has some kind of version in the filename -- e.g., "SoW 2015/11/1 10am Ted" -- and we just expect the users to manage versions. In this use model, you definitely don't need to do anything fancy.

However, this is a really broken use model, which exists because the tools don't work well enough to do something sensible. They don't allow you to track versions, they don't allow multiple committers, and they don't have a mechanism for resolving conflicts when two people change the same version of the file and try to upload the change.
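
As a rough sketch of the missing piece: if every upload records the version it was based on, the store can tell a plain update from a conflict. The `Upload` and `Store` names and the integer version counter below are assumptions made for illustration, not taken from any existing tool or draft.

```python
from dataclasses import dataclass


@dataclass
class Upload:
    path: str
    base_version: int  # version the client started editing from (assumed metadata)
    author: str


class Store:
    """Hypothetical store that tracks one version counter per path."""

    def __init__(self):
        self.versions = {}  # path -> current version number

    def submit(self, upload):
        current = self.versions.get(upload.path, 0)
        if upload.base_version != current:
            # Someone else committed since this client last synced.
            return "conflict"
        self.versions[upload.path] = current + 1
        return "accepted"
```

Two clients that both edit version 0 of the same file would see the first upload accepted and the second reported as a conflict, rather than the second silently replacing the first.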

> As for the conflict resolution itself, it is very hard to achieve since the
> system needs to handle different types of files. Google Docs performs well
> since it only focuses on documents. But for a storage service, we don't
> know what other types of files will be stored. A popular way I've seen is
> just to keep all the conflicted versions (named by different peers) in the
> storage.

I think this is a valid conservative way of approaching the problem. What makes sense to me, though, is for the sync service to be able to identify conflicts, and for there to be a conflict resolution process at a layer above the sync service. The conflict resolution process could do something trivial, like resolving the conflict by making a new file with the same name plus a well-understood notation like "conflict Ted/Sun 15/11/7".

This would be similar to what people would do if there were no conflict resolution layer. But if you have a file type for which there is already an automatic conflict resolution process, then the conflict resolution layer can just use it. There's no reason to pick a one-size-fits-all approach to this problem. What matters is the ability to know that a conflict has occurred, based on the versioning information in the metadata.
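
To make the layering concrete, here is a minimal sketch of a resolution layer sitting above the sync service. Everything in it is an assumption for illustration: the `RESOLVERS` registry keyed by file extension, the toy append-only `.log` merge, and the exact conflict-copy naming. The name uses "-" instead of "/" because "/" is a path separator on most file systems.

```python
import os

RESOLVERS = {}  # file extension -> merge function (hypothetical registry)


def register(ext):
    """Decorator that registers an automatic resolver for one file type."""
    def wrap(fn):
        RESOLVERS[ext] = fn
        return fn
    return wrap


@register(".log")
def merge_append_only(base, mine, theirs):
    """Toy resolver: if both sides only appended to base, keep both tails."""
    if mine.startswith(base) and theirs.startswith(base):
        return base + mine[len(base):] + theirs[len(base):]
    return None  # cannot merge automatically


def conflict_copy_name(path, authors, date):
    stem, ext = os.path.splitext(path)
    return f"{stem} (conflict {'-'.join(authors)} {date}){ext}"


def resolve(path, base, mine, theirs, authors, date):
    """Return {filename: content} to store once a conflict is detected."""
    resolver = RESOLVERS.get(os.path.splitext(path)[1])
    merged = resolver(base, mine, theirs) if resolver else None
    if merged is not None:
        return {path: merged}
    # Conservative fallback: keep both versions, one under a conflict name.
    return {path: mine, conflict_copy_name(path, authors, date): theirs}
```

A file type with no registered resolver, or one whose resolver gives up, falls through to the conservative keep-both behaviour, so no one's work is wiped out.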


--
My PGP key: https://keys.whiteout.io/mellon@fugue.com