[Storagesync] recent issues discussed (html)

"Fei Song" <fsong@bjtu.edu.cn> Mon, 14 December 2015 01:49 UTC


Here is the latest version of the list of recently discussed issues. Please email me if anything is missing:
 
1.         What are the design targets of WebDAV, rsync, and other existing approaches?
2.         The potential use cases of ISS, such as client/server, Git-like patterns, SVN, etc.
3.         Efficiency improvement might be a second goal of standardizing the ISS protocol
4.         CORS headers on storage sync APIs
5.         What is needed for the ISS: a sync protocol or a generalized API?
6.         remoteStorage draft discussion
a)         Relationship to WebDAV
b)         Whether a MOVE action (for synchronization) should be added
c)         Besides web browsers, desktop apps (in a hacky way)
d)         Comics about new standards
e)         ETag issues vs. metadata
                         i.              An ETag is mainly for identifying whether a document has changed
                       ii.              Whether it is easier to implement than the WebDAV sync protocol
                      iii.              Whether the metadata file contains the ETags of all files on both the client and the server side
f)          The distributed peer model (no server) and the client/server (C/S) model
g)         A nice illustrated example of OfflineIMAP’s sync process at the following URL:
http://blog.ezyang.com/2012/08/how-offlineimap-works/
7.         A GitHub repository (instead of email messages) has been created:
https://github.com/labkode/Internet-Storage-Sync
a)         What is the topic? 
                         i.              Whether it is suitable to build on WebDAV
                       ii.              WebDAV vs remoteStorage
                      iii.              Advantages vs disadvantages of WebDAV
8.         Schemes and platforms that separate metadata from data for synchronization
a)         ownCloud, when configured to use an object store as the primary user storage. (Metadata is handled by a MySQL database.)
b)         CERN EOS Storage System. (Metadata is handled in high-performance in-memory structures.)
c)         Dropbox. (As far as I know, it uses S3 as the back end. Where the metadata is stored is unknown, but probably not on S3.)
A paper analyzing Dropbox:
http://annasperotto.org/papers/2012/imc140-drago.pdf
d)         ClawIO will have an implementation of this approach in the next phase using Swift.
9.         Whether we should keep metadata history, modification history, or action history on the server side (or elsewhere)
10.     How to handle the “conflict discovery (resolution)” issues
a)         A good demonstration of this issue was given in detail (File F, Client A, Client B, etc.)
b)         Should it be a layer above syncing?
c)         Should it depend on the use case?
d)         Should it be a one-size-fits-all approach?
e)         How many different kinds of conflict are there?
                         i.              File level conflicts, e.g. both remote (server) and local (client) have changed since last sync.
                       ii.              Interleaved level conflicts, e.g. one client makes a change based on version X of the file, a second makes a change based on version X+1, the second one commits, and then the first one commits.
                      iii.              …
f)          Whether the sequence (order) of changes to a file is important (quite similar to item 9 in this list) perhaps depends on the file type (or use case):
                         i.              It may be important for: sound files, drawing files, LaTeX source files of articles, iWork files, Office files, images, PDFs, source code, etc.
                       ii.              It may not be important for: physics data.
g)         Is enabling automatic merging a research problem or a further use case?
h)         If conflict resolution is so hard even for the simple and well-structured “vCard case”, should we handle this issue by treating files separately according to their properties?
11.     A description of the de facto sync protocol used by ownCloud sync clients (with some footnote discussion of future development directions). This document comes with a test suite (partially implemented) that verifies whether a server adheres to this specification. The URL is as follows:
https://github.com/cernbox/smashbox/blob/master/protocol/protocol.md
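
To make the two kinds of conflict in item 10 e) more concrete, here is a rough Python sketch. It is only an illustration under my own assumptions, not something taken from the remoteStorage draft, WebDAV, or the ownCloud protocol description. The names `SyncAction`, `classify`, `VersionedFile`, and `StaleVersionError` are hypothetical, and the version tokens are assumed to be opaque values (such as ETags) that can only be compared for equality.

```python
from enum import Enum


class SyncAction(Enum):
    NOTHING = "nothing"
    UPLOAD = "upload"
    DOWNLOAD = "download"
    CONFLICT = "conflict"


def classify(base: str, local: str, remote: str) -> SyncAction:
    """File-level check (item 10 e) i).

    `base` is the version token recorded at the last successful sync,
    `local` and `remote` are the current tokens on each side.
    All three are assumed to be opaque, equality-comparable strings.
    """
    local_changed = local != base
    remote_changed = remote != base
    if local_changed and remote_changed:
        return SyncAction.CONFLICT   # both sides changed since last sync
    if local_changed:
        return SyncAction.UPLOAD     # only the client changed
    if remote_changed:
        return SyncAction.DOWNLOAD   # only the server changed
    return SyncAction.NOTHING


class StaleVersionError(Exception):
    """Raised when a commit is based on a version that is no longer current."""


class VersionedFile:
    """Toy in-memory stand-in for a server-side file (hypothetical).

    A commit must name the version it was based on, similar in spirit
    to a conditional write. This catches the interleaved case in
    item 10 e) ii, where a client commits on top of an outdated version.
    """

    def __init__(self) -> None:
        self.version = 0
        self.content = b""

    def put(self, content: bytes, based_on: int) -> int:
        if based_on != self.version:
            raise StaleVersionError(
                f"based on {based_on}, server is at {self.version}"
            )
        self.content = content
        self.version += 1
        return self.version


if __name__ == "__main__":
    print(classify("v1", "v2", "v3"))    # both sides changed

    f = VersionedFile()
    x = f.put(b"first", based_on=0)      # file is now at version X
    x1 = f.put(b"other", based_on=x)     # someone else produces X+1
    f.put(b"second client", based_on=x1) # second client commits first
    try:
        f.put(b"first client", based_on=x)  # first client is stale
    except StaleVersionError as err:
        print("rejected:", err)
```

The sketch only detects conflicts. What to do after detection (keep both copies, merge, ask the user) is exactly the open question in items 10 b) to 10 h).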
 
 
Some related organizations, events, projects, etc.: 
GEANT community, OpenCloudMesh, ownCloud, CS3, remoteStorage, ClawIO, crosscloud, Dropbox, CERN EOS Storage System, to be added…