[Storagesync] recent issues discussed (plain text)

"Fei Song" <fsong@bjtu.edu.cn> Mon, 14 December 2015 01:59 UTC

Date: Mon, 14 Dec 2015 09:59:57 +0800
From: Fei Song <fsong@bjtu.edu.cn>
To: storagesync <storagesync@ietf.org>
Archived-At: <http://mailarchive.ietf.org/arch/msg/storagesync/OZOlxzT6dz_aRHtLT4Z6b_-J3Gs>

Here is the latest version. Please email me if anything is missing:

1.What are the design targets of WebDAV, rsync, and other existing approaches?
2.The potential use cases of ISS, such as client/server, git-like patterns, svn, etc.
3.Efficiency improvement might be a second goal of standardizing the ISS protocol
4.CORS headers on storage sync APIs
5.What is needed for ISS: a sync protocol or a generalized API?
6.remoteStorage draft discussion
  a)relationship to WebDAV
  b)whether a MOVE action (synchronization) should be added
  c)besides the web browser, desktop apps (in a hacky way)
  d)comics of new standard
  e)etag issues vs metadata
    i.whether the etag is mainly for identifying that a document has changed
    ii.whether it is easier to implement than the WebDAV sync protocol
    iii.whether the metadata file contains the etags of all files at both the client and server sides
  f)the distributed peer model (no server) and the C/S model
  g)a fancy example (with pictures) of OfflineIMAP’s sync process at the following URL:
    http://blog.ezyang.com/2012/08/how-offlineimap-works/
7.A GitHub repository (to be used instead of email messages) has been created:
  https://github.com/labkode/Internet-Storage-Sync
  a)What is the topic? 
    i.Whether it is suitable to build on WebDAV
    ii.WebDAV vs remoteStorage
    iii.Advantages vs disadvantages of WebDAV
8.Schemes and platforms that separate metadata from data for synchronization
  a)ownCloud, when configured to use object storage as the primary user storage. (Metadata is handled by a MySQL database.)
  b)CERN EOS Storage System. (Metadata is handled in high-performance in-memory structures.)
  c)Dropbox. (As far as I know, it uses S3 as its back end. Where the metadata is kept is unknown, but probably not on S3.)
    A paper analyzing Dropbox:
    http://annasperotto.org/papers/2012/imc140-drago.pdf
  d)ClawIO will have an implementation of this approach in the next phase using Swift.
9.Whether we should keep metadata history, modification history, or action history at the server side (or elsewhere)
10.How to handle the “conflict discovery (resolution)” issues
  a)A good demonstration of this issue was given in detail (File F, Client A, Client B, etc.)
  b)Should it be a layer above syncing?
  c)Should it depend on the use case?
  d)Should it be a one-size-fits-all approach?
  e)How many different kinds of conflict are there?
    i.File-level conflicts, e.g. both remote (server) and local (client) have changed since the last sync.
    ii.Interleaved-level conflicts, e.g. one client makes a change based on version X of the file, a second makes a change based on version X+1, the second one commits, and then the first one commits.
    iii.…
  f)Whether the sequence (order) of changes in one file is important (quite similar to “item 9” in this list) perhaps depends on the file type (or use case):
    i.It may be important for: sound files, drawing files, LaTeX source files of articles, iWork files, Office files, images, PDFs, source code, etc.
    ii.It may not be important for: physics data.
  g)Is enabling automatic merging a research problem or a further use case?
  h)If it is so hard to do conflict resolution even for the simple and well-structured “vCard case”, should we handle this issue by separating files based on their properties?
11.A de-facto sync protocol description used by ownCloud sync clients (and some footnote discussion on future development directions). This document comes with a test suite (partially implemented) that verifies whether a server adheres to this specification. The URL is as follows:
  https://github.com/cernbox/smashbox/blob/master/protocol/protocol.md
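
To make the two kinds of conflict in item 10 e) a bit more concrete, here is a minimal Python sketch. It is only an illustration under assumptions and is not taken from any draft or existing implementation: the names SyncState, classify, and Server are hypothetical, etags are treated as opaque strings that change whenever the content changes (as in item 6 e), and the server is assumed to keep a simple integer version per file. The rejected write is similar in spirit to an HTTP If-Match precondition.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class SyncState:
    """Hypothetical per-file record kept by a sync client."""
    base: Optional[str]    # etag recorded at the last successful sync
    local: Optional[str]   # etag of the client's current copy
    remote: Optional[str]  # etag currently reported by the server


def classify(state: SyncState) -> str:
    """File-level check: compare both sides against the last-synced etag."""
    local_changed = state.local != state.base
    remote_changed = state.remote != state.base
    if local_changed and remote_changed:
        # Both sides changed since the last sync. If they ended up with the
        # same etag there is nothing to resolve; otherwise it is a conflict.
        return "in_sync" if state.local == state.remote else "conflict"
    if local_changed:
        return "upload"
    if remote_changed:
        return "download"
    return "in_sync"


class Server:
    """Toy in-memory store that rejects writes based on a stale version."""

    def __init__(self) -> None:
        # path -> (version, content); a missing path counts as version 0
        self.files: Dict[str, Tuple[int, str]] = {}

    def put(self, path: str, content: str, based_on: int) -> bool:
        version, _ = self.files.get(path, (0, ""))
        if based_on != version:
            # Interleaved case: the caller edited an older version, so the
            # write is refused instead of silently overwriting newer data.
            return False
        self.files[path] = (version + 1, content)
        return True
```

With this sketch, a client whose local and remote etags both differ from the base gets "conflict" from classify, and a client that commits a change based on version X after another client has already committed on top of X+1 gets False from Server.put. What to do after a conflict is discovered (merge, keep both copies, ask the user) is left open here, in line with questions 10 b) to d).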


Some related organizations, events, projects, etc.: 
GEANT community, OpenCloudMesh, ownCloud, CS3, remoteStorage, ClawIO, crosscloud, Dropbox, CERN EOS Storage System, to be added…