Re: [Storagesync] Some preliminary investigations on ownCloud

Linhui Sun <lh.sunlinh@gmail.com> Thu, 26 November 2015 09:38 UTC

From: Linhui Sun <lh.sunlinh@gmail.com>
Date: Thu, 26 Nov 2015 17:38:04 +0800
Message-ID: <CAO_YprbXEPJWi55sRTXnGxWHTHzFKWRkUY4Hg-bEshCtvzpdkQ@mail.gmail.com>
To: Jakub Moscicki <Jakub.Moscicki@cern.ch>
Content-Type: multipart/alternative; boundary="001a11422fa80f064005256e56b6"
Archived-At: <http://mailarchive.ietf.org/arch/msg/storagesync/Onefqvy0HvWFDRT-DohakQuuqg4>
Cc: fsong <fsong@bjtu.edu.cn>, storagesync <storagesync@ietf.org>
Subject: Re: [Storagesync] Some preliminary investigations on ownCloud

2015-11-26 15:29 GMT+08:00 Jakub Moscicki <Jakub.Moscicki@cern.ch>:

> 2015-11-26 11:28 GMT+08:00 Fei Song <fsong@bjtu.edu.cn>:
>
>> BTW, Based on the last sentence of last email:"The intention of this
>> message is to investigate the current state of using WebDAV for sync
>> purposes to see what needs to be improved here and whether we need new
>> protocols"
>>
>> The outcome he/she wanted might be just the links like
>> http://cs3.ethz.ch/program.html :)
>>
> In other words, I think what I ultimately want is a discussion about
> "What is needed for the ISS: a sync protocol or a generalized API". Sorry
> for the poor wording : )
>
>
> I think this is a relevant question indeed (as is, actually, whether any
> standard is needed at all in the first place). By the generalized API
> do you mean an HTTP-style API (like REST)? In that case you may consider
> WebDAV quite close, and the difference between the protocol and the API
> blurs for practical purposes.
>
Yes. And maybe, for application-layer applications, an API is more
realistic and practical as a first stage? I'm not saying we should not
standardize a sync protocol...

>
> While I agree that WebDAV may have disadvantages, there is a good number of
> installations already using it in our community (research labs mainly in,
> but not limited to, Europe). I think the interesting point about
> WebDAV/HTTP is extensibility, and maybe it is worthwhile for these
> extensions to follow a "standard". However, you should consider that for a
>
Actually, I don't know what "extensibility" means here. Are you saying
that we could extend WebDAV?

> reasonably efficient sync scenario the server should also exhibit certain
> behaviour. That is, in the case of ownCloud for example, a server should be
> able to efficiently propagate ETag changes up the directory tree (like
> a Merkle tree) so that the client can use PROPFIND efficiently. This is
> not a hard spec requirement, but otherwise running PROPFIND on the entire
> remote tree each time would be impractically inefficient. So this really
> goes a little bit beyond just an API.
>
And that is what I mean by "applicability". IMO, WebDAV does not seem
suitable for sync purposes. As a result, developers compromise by running
PROPFIND on the entire file tree at intervals.
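
To make the ETag-propagation idea quoted above more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the Node class and the helper functions are hypothetical, the ETag is computed as a SHA-1 hash in Merkle-tree fashion, and this is not necessarily how ownCloud derives its ETags. It shows why a client that caches ETags only needs to descend into directories whose ETag has changed.

```python
import hashlib


class Node:
    """Hypothetical in-memory stand-in for a server-side file or directory."""

    def __init__(self, name, content=None):
        self.name = name
        self.content = content  # bytes for a file, None for a directory
        self.children = {}      # name -> Node (directories only)
        self.parent = None
        self.etag = None

    def add(self, child):
        child.parent = self
        self.children[child.name] = child
        return child


def compute_etag(node):
    """Assumed scheme: hash file bytes, or child names plus child ETags."""
    h = hashlib.sha1()
    if node.content is not None:
        h.update(node.content)
    else:
        for name in sorted(node.children):
            h.update(name.encode())
            h.update(node.children[name].etag.encode())
    return h.hexdigest()


def refresh(node):
    """Compute ETags bottom-up for a whole subtree (initial build)."""
    for child in node.children.values():
        refresh(child)
    node.etag = compute_etag(node)


def touch(node, new_content):
    """Change one file and propagate new ETags up to the root."""
    node.content = new_content
    while node is not None:
        node.etag = compute_etag(node)
        node = node.parent


def snapshot(node, path="/"):
    """Return {path: etag} for the subtree; stands in for the client cache."""
    out = {path: node.etag}
    for name, child in node.children.items():
        out.update(snapshot(child, path.rstrip("/") + "/" + name))
    return out


def changed_files(node, cache, path="/"):
    """Descend only into subtrees whose ETag differs from the cached one."""
    if cache.get(path) == node.etag:
        return []  # unchanged subtree: no further lookup needed below here
    if node.content is not None:
        return [path]
    found = []
    for name, child in sorted(node.children.items()):
        found += changed_files(child, cache, path.rstrip("/") + "/" + name)
    return found
```

With such a scheme, one changed file alters the ETag of each ancestor directory only, so the client compares the root first and skips every sibling subtree whose ETag still matches its cache.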

>
> You should also consider that OpenCloudMesh (under the GEANT umbrella) is
> an initiative whose intent is to make cross-service sharing very easy.
> These shares may also be synchronized automatically. I currently have no
> evidence, however, that other software providers, with the exception of
> ownCloud, are interested in developing such a standard. If there is no
> bottom-up interest from the users then it won’t happen (in my opinion).
>
That's cool. I'm just wondering whether the ISS could get some input from
that...

>
> With the link I wanted to point out that what you discuss on
> this mailing list will also be discussed at the upcoming CS3 event I linked
> to. The intent is to hold this discussion among service providers,
> developers and researchers together, so that it does not end up only as an
> academic exercise but is backed by a critical mass if there is some
> potential in standardisation, at least in our community. I hope this could
> be of interest to the IETF community, as mentioned on the program page.
>
> Selected papers will be published in FGCS after the event, so please stand
> by, or attend the event if you want to be part of this discussion.
> BTW, the abstract submission deadline has passed, but one exceptionally good
> contribution could still possibly be accommodated if submitted rapidly.
>
> I would be nonetheless happy to continue contributing to the discussion in
> this mailing list.
>
Thanks for that!

>
> Best regards,
>
> Jakub Moscicki
>
> --
>
>
>
> Regards,
> Linhui
>
>>
>>
>> --------------
>> Fei Song
>> >Hello,
>> >
>> >What kind of outcome are you looking for with this analysis? Some
>> research in this area has already been done or is being done as we speak
>> >
>> >e.g. "A study of delta-sync and other optimisation in HTTP/Webdav
>> synchronisation protocols"
>> >
>> >see "Technology and Research":
>> >
>> >http://cs3.ethz.ch/program.html
>> >
>> >It would be interesting to see if there is a potential for
>> collaboration. Or maybe we already have some information you are looking
>> for.
>> >
>> >Best regards,
>> >
>> >Jakub Moscicki
>> >
>> >—
>> >
>> >
>> >On 25 Nov 2015, at 11:45, Linhui Sun <lh.sunlinh@gmail.com> wrote:
>> >
>> >Hi all,
>> >
>> >As I mentioned before, I think developers could benefit from
>> IETF standards. ownCloud (https://owncloud.org/) is just one example.
>> It is developed for those who do not trust commercial storage services and
>> want to build their own network-based storage services. ownCloud
>> uses WebDAV (RFC 4918) for data sync. IMO, WebDAV was
>> designed for distributed authoring, not for sync. Thus, I made some
>> preliminary investigations into how ownCloud uses WebDAV for sync
>> purposes. A brief summary of what I've found follows; please
>> correct me if I am wrong.
>> >
>> >I installed the ownCloud server (v8.2.1) on CentOS 7, and the client
>> is the desktop client on Windows.
>> >
>> >1. To find out whether the synced directory has changed, the client
>> sends a PROPFIND to the server at regular intervals (around 34 seconds
>> in my observation). The server responds with a 207 Multi-Status
>> response indicating whether the main directory has changed. For each
>> check, the client opens a new TCP connection to send the PROPFIND, and
>> the server closes that connection after sending the 207 Multi-Status
>> response. For the next check, the client opens another new TCP
>> connection.
>> >
>> >2. Every time I add (or create) a new file in the local folder, the
>> client opens a new TCP connection (if none exists) to send the file
>> as soon as possible. The client first sends several PROPFINDs to find
>> out which sub-directory has changed, and then sends the file using
>> PUT. The server responds with a 201 Created response and then terminates
>> the connection. So far I haven't found any application-layer chunking;
>> all segmentation is performed by TCP.
>> >
>> >3. Every time I delete (or rename) a file locally, the client also
>> opens a new TCP connection and sends several PROPFINDs to find out which
>> file has been removed (or renamed). Then it sends DELETE (or MOVE). The
>> server responds with a 204 No Content response (or a 201 Created
>> response) and then terminates the connection.
>> >
>> >4. I open a file and frequently edit and save it (this is actually what
>> I usually do with Dropbox). The client sends the whole file to the
>> server every time I save it.
>> >
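The periodic check in item 1 above could be sketched roughly as follows. This is an illustration only, not the ownCloud client's code: the function names are hypothetical, the sample 207 body and its ETag value are made up rather than captured from a server, the /remote.php/webdav/ path is an assumption about the server setup, and authentication is omitted. The request asks for the DAV: getetag property (RFC 4918) with a Depth: 0 header, so only the top directory is examined.

```python
import xml.etree.ElementTree as ET

DAV = "DAV:"  # WebDAV XML namespace defined by RFC 4918


def build_propfind_body():
    """Build a PROPFIND body that asks only for the getetag property."""
    root = ET.Element(f"{{{DAV}}}propfind")
    prop = ET.SubElement(root, f"{{{DAV}}}prop")
    ET.SubElement(prop, f"{{{DAV}}}getetag")
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def parse_etags(multistatus_xml):
    """Map each href in a 207 Multi-Status body to its ETag (or None)."""
    ns = {"d": DAV}
    result = {}
    for resp in ET.fromstring(multistatus_xml).findall("d:response", ns):
        href = resp.findtext("d:href", namespaces=ns)
        etag = resp.findtext("d:propstat/d:prop/d:getetag", namespaces=ns)
        result[href] = etag.strip('"') if etag else None
    return result


def has_changed(cached_etag, multistatus_xml, href):
    """True if the ETag reported for href differs from the cached one."""
    return parse_etags(multistatus_xml).get(href) != cached_etag


def send_propfind(conn, path):
    """Send one Depth: 0 PROPFIND over an existing http.client connection.

    Authentication headers are left out of this sketch.
    """
    conn.request(
        "PROPFIND",
        path,
        body=build_propfind_body(),
        headers={"Depth": "0", "Content-Type": "application/xml"},
    )
    resp = conn.getresponse()
    return resp.status, resp.read().decode("utf-8")


# Illustrative 207 body; the href and ETag value are invented for the example.
SAMPLE_207 = """<d:multistatus xmlns:d="DAV:">
  <d:response>
    <d:href>/remote.php/webdav/</d:href>
    <d:propstat>
      <d:prop><d:getetag>"5656d1c4a7b3e"</d:getetag></d:prop>
      <d:status>HTTP/1.1 200 OK</d:status>
    </d:propstat>
  </d:response>
</d:multistatus>"""
```

A client that keeps the last ETag it saw can call has_changed() on each 207 body and start a deeper comparison only when the value differs.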
>> >To summarize, it seems that ownCloud makes heavy use of PROPFIND
>> in the sync process. Each sync operation (e.g. upload, modify, etc.)
>> starts with one or more PROPFINDs. Also, currently, if I
>> add a file on the server (directly on the server side via the web
>> interface), the client does not detect the change. I need to interrupt the
>> sync and restart it to make the client aware of the change and download
>> the newly added file. I'm not sure whether this is caused by the sync
>> mechanism or by an improper server configuration. I need to investigate
>> this further, and also how ownCloud works with multiple clients (or devices).
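
As a compact restatement of the observations above, the request sequences can be written down as a small lookup table. This is only a sketch: the names Step, OBSERVED and plan are hypothetical, the single PROPFIND step stands in for "one or more PROPFINDs", and the status for a modified file is left as None because it was not reported above.

```python
from typing import List, NamedTuple, Optional


class Step(NamedTuple):
    method: str                     # WebDAV/HTTP method sent by the client
    expected_status: Optional[int]  # status observed, or None if not reported


# Local event -> (method, status observed in the message above).
OBSERVED = {
    "create": ("PUT", 201),     # 201 Created
    "delete": ("DELETE", 204),  # 204 No Content
    "rename": ("MOVE", 201),    # 201 Created
    "modify": ("PUT", None),    # whole file re-sent; status not reported
}


def plan(event: str) -> List[Step]:
    """Return the request sequence for one local event.

    Every sequence starts with PROPFIND (answered with 207 Multi-Status),
    which here represents the one or more PROPFINDs seen on the wire.
    """
    method, status = OBSERVED[event]
    return [Step("PROPFIND", 207), Step(method, status)]
```

For example, plan("rename") gives a PROPFIND step followed by a MOVE step with expected status 201.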
>> >
>> >For ISS, I think ownCloud has demonstrated to some extent that similar
>> IETF protocols could be deployed and employed. The intention of this
>> message is to investigate the current state of using WebDAV for sync
>> purposes to see what needs to be improved here and whether we need new
>> protocols.
>> >
>> >Comments are welcome : )
>> >
>> >Regards,
>> >Linhui
>> >
>> >
>> >_______________________________________________
>> >Storagesync mailing list
>> >Storagesync@ietf.org
>> >https://www.ietf.org/mailman/listinfo/storagesync
>> >