Re: [Storagesync] Storagesync Digest, Vol 5, Issue 1

Ted Lemon <mellon@fugue.com> Wed, 09 December 2015 14:35 UTC

From: Ted Lemon <mellon@fugue.com>
To: Jakub.Moscicki@cern.ch
In-Reply-To: <506D291C-4F0B-40F3-8848-97DAAF41CAAE@cern.ch>
References: <1449452139832-4f314827-a7ecd596-c5312339@fugue.com> <1449454580239-1fd59d90-52f0231b-370f2ef5@gmail.com,> <1449455245871-cb7e86e1-1a0160c5-aa6acce3@fugue.com> <2015120711170621874681@bjtu.edu.cn> <1449459616112-6043cb32-cd69a1f9-1399f1c0@fugue.com> <CAO_Yprbct8wFbS1WFnZZENSp-OruRUk2nRyBv4tNeKv9_CGuCg@mail.gmail.com> <1449511062426-94cdee34-064ef498-327458b6@fugue.com> <CAO_YprZjqs_OFC3RybVvJ4GHWb3spKMMkkFTZO=YDustp825iw@mail.gmail.com> <1449593642163-c107ebb4-0f6d1c5a-a3f1c5e0@fugue.com> <20151208185922.GA9531@localhost.localdomain> <1449609937865-6dbdad8f-eb44d945-cd684f34@fugue.com> <AE0CE9F1-3968-4229-925B-75AA37EDC327@unterwaditzer.net> <1449670262769-e440b1e3-b960232c-260b9165@fugue.com> <506D291C-4F0B-40F3-8848-97DAAF41CAAE@cern.ch>
Date: Wed, 09 Dec 2015 14:35:33 +0000
Message-Id: <1449671733322-9f72a594-b1d5700c-d3631253@fugue.com>
MIME-Version: 1.0
Archived-At: <http://mailarchive.ietf.org/arch/msg/storagesync/HxTq3sx43ZSg5xhFEtzqWaGdISI>
Cc: storagesync@ietf.org
Subject: Re: [Storagesync] Storagesync Digest, Vol 5, Issue 1
List-Id: Mechanisms to synchronize client file systems with Internet-based data storage services <storagesync.ietf.org>

Wednesday, Dec 9, 2015 9:21 AM Jakub Moscicki wrote:
> Unless you want to dig into the file's type and explore its internal structure in order to merge, the best you can do is make sure that a version is generated on the server for each conflicting file, and that these versions are easily accessible to interested users for manual merging. For this, ETags are sufficient. From this point of view there is no difference between a conflict generated because two clients update the file at the same time (conflicting uploads that overlap in time) and the case of an offline client that edits a locally outdated version (a newer version having meanwhile become available on the server) and uploads it when it is online again (conflicting uploads that do not overlap in time). In both cases it seems good enough to me if they end up as versions on the server for the user to merge or revert manually.
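
For readers following the thread, the ETag-based approach described in the quoted text can be sketched roughly as follows. This is only an illustration, not something posted to the list: the class `VersionedStore`, its `put` method, and the `if_match` parameter are hypothetical names, and the in-memory dictionaries stand in for a real server. Every upload is kept as a version; an upload whose `if_match` value does not match the current head is recorded as a conflict version instead of replacing the head.

```python
import hashlib


class VersionedStore:
    """Toy in-memory server (hypothetical): conflicting uploads become extra versions."""

    def __init__(self):
        self.versions = {}  # path -> list of (etag, content, is_conflict)
        self.current = {}   # path -> etag of the current head version

    @staticmethod
    def _etag(content: bytes) -> str:
        # Assumption: the ETag is derived from the content hash.
        return hashlib.sha256(content).hexdigest()[:12]

    def put(self, path: str, content: bytes, if_match=None):
        """Store an upload; return (etag, is_conflict)."""
        etag = self._etag(content)
        head = self.current.get(path)
        # The upload conflicts if a head exists and the client based its
        # edit on some other version than that head.
        conflict = head is not None and if_match != head
        self.versions.setdefault(path, []).append((etag, content, conflict))
        if not conflict:
            self.current[path] = etag
        return etag, conflict


store = VersionedStore()
e1, _ = store.put("notes.txt", b"v1")
e2, conflict_a = store.put("notes.txt", b"v2 from A", if_match=e1)
# Client B was offline and still bases its edit on e1.
e3, conflict_b = store.put("notes.txt", b"v2 from B", if_match=e1)
```

In this sketch the simultaneous case and the offline case are handled identically, which is the point made in the quoted text: the store records that a conflict happened but knows nothing about how to resolve it.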

Most files are text files, which are easy to merge, so yes, I do want to be able to dig into the structure of the file. Even when the file isn't "easy to merge," e.g. a sound file or a drawing, knowing the order of the changes can be very helpful when someone is doing a manual merge. And you can build merge tools for commonly-merged files; if conflict detection is a feature of the data store, then there's an incentive to build such tools, whereas if everybody is accustomed to dumb data stores that don't provide this feature, there's no point in making merge easy.
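
As an illustration of why text files are comparatively easy to merge once the store can supply the common ancestor, here is a minimal three-way line merge sketch. It is not from the thread: `changes` and `merge3` are hypothetical helper names, it relies on Python's standard `difflib.SequenceMatcher`, and it is deliberately conservative, giving up (returning `None`) whenever the two sides touch overlapping regions of the ancestor instead of emitting conflict markers.

```python
from difflib import SequenceMatcher


def changes(base, side):
    """Return the edits that turn base into side as (start, end, replacement)."""
    sm = SequenceMatcher(a=base, b=side, autojunk=False)
    return [
        (i1, i2, tuple(side[j1:j2]))
        for tag, i1, i2, j1, j2 in sm.get_opcodes()
        if tag != "equal"
    ]


def merge3(base, a, b):
    """Merge line lists a and b against their common ancestor base.

    Returns the merged list, or None if the two sides conflict.
    """
    ca, cb = changes(base, a), changes(base, b)
    for a1, a2, ra in ca:
        for b1, b2, rb in cb:
            if (a1, a2, ra) == (b1, b2, rb):
                continue  # both sides made the identical change
            if (a1 < b2 and b1 < a2) or a1 == b1:
                return None  # overlapping edits: leave it to a human
    merged = list(base)
    # Apply the union of edits from the end backwards so that earlier
    # indices into base stay valid.
    for i1, i2, repl in sorted(set(ca) | set(cb), reverse=True):
        merged[i1:i2] = list(repl)
    return merged


base = ["a", "b", "c", "d"]
clean = merge3(base, ["a", "B", "c", "d"], ["a", "b", "c", "D", "e"])
clash = merge3(base, ["a", "X", "c", "d"], ["a", "Y", "c", "d"])
```

The merge is only possible because `base` is known; a store that keeps nothing but the two conflicting versions cannot tell which side changed which line.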

The absence of this feature is a serious problem in my workflows; I don't know if your experience is similar. I don't really see any point in inventing "yet another" storage sync mechanism that doesn't provide this feature. E.g., do you really want a calendaring system that automatically re-adds files that have been deleted on one client when a second client connects that hasn't seen the deletion? An address book that backs out updates? A mail system that undeletes deleted messages, or deletes messages on the client that were lost on the server due to a crash?
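
The re-added deletion can be made concrete with a small sketch. It is not from the thread: `naive_sync` and `tombstone_sync` are hypothetical names, and sets of strings stand in for calendar entries. Tombstones (remembered deletions) are one common technique for avoiding the problem, shown here purely as an assumption about how a store might track change history.

```python
def naive_sync(server_items: set, client_items: set) -> set:
    """State-only sync: anything either side has is kept."""
    return server_items | client_items


def tombstone_sync(server_items, server_tombstones, client_items, client_tombstones):
    """Sync that remembers deletions, so a stale replica cannot resurrect them."""
    tombstones = server_tombstones | client_tombstones
    items = (server_items | client_items) - tombstones
    return items, tombstones


# Client 1 deleted "lunch" and synced; client 2 was offline and still has it.
server_items = {"dentist"}
server_tombstones = {"lunch"}
stale_client = {"dentist", "lunch"}

resurrected = naive_sync(server_items, stale_client)
items, tombstones = tombstone_sync(server_items, server_tombstones, stale_client, set())
```

With `naive_sync` the deleted entry comes back as soon as the stale client connects; with `tombstone_sync` it stays deleted, because the store knows that a deletion happened and not merely that the two replicas differ.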

The use case of a small collection of large files used entirely sequentially is easy to address, but (a) not very interesting and (b) not actually a common use model. It _seems_ common because we don't provide anything better, and so people make do with it, but it doesn't actually fit with what they are doing. If it did, we wouldn't see the proliferation of manually-labeled versions of the same file that is so common in such data stores.


--
Sent from Whiteout Mail - https://whiteout.io

My PGP key: https://keys.whiteout.io/mellon@fugue.com