Re: [Storagesync] Storagesync Digest, Vol 5, Issue 1

"Fei Song" <fsong@bjtu.edu.cn> Sat, 12 December 2015 13:12 UTC

Date: Sat, 12 Dec 2015 21:13:35 +0800
From: Fei Song <fsong@bjtu.edu.cn>
To: mellon <mellon@fugue.com>, storagesync <storagesync@ietf.org>
References: <1449452139832-4f314827-a7ecd596-c5312339@fugue.com> <1449454580239-1fd59d90-52f0231b-370f2ef5@gmail.com, > <1449455245871-cb7e86e1-1a0160c5-aa6acce3@fugue.com> <2015120711170621874681@bjtu.edu.cn> <1449459616112-6043cb32-cd69a1f9-1399f1c0@fugue.com> <CAO_Yprbct8wFbS1WFnZZENSp-OruRUk2nRyBv4tNeKv9_CGuCg@mail.gmail.com> <1449511062426-94cdee34-064ef498-327458b6@fugue.com> <CAO_YprZjqs_OFC3RybVvJ4GHWb3spKMMkkFTZO=YDustp825iw@mail.gmail.com> <1449593642163-c107ebb4-0f6d1c5a-a3f1c5e0@fugue.com> <20151208185922.GA9531@localhost.localdomain> <1449609937865-6dbdad8f-eb44d945-cd684f34@fugue.com> <AE0CE9F1-3968-4229-925B-75AA37EDC327@unterwaditzer.net> <1449670262769-e440b1e3-b960232c-260b9165@fugue.com> <506D291C-4F0B-40F3-8848-97DAAF41CAAE@cern.ch> <1449671733322-9f72a594-b1d5700c-d3631253@fugue.com>, <1449672190209-97dbcf5a-4802eeae-a6a33a55@fugue.com>
Message-ID: <2015121221133476519222@bjtu.edu.cn>
Archived-At: <http://mailarchive.ietf.org/arch/msg/storagesync/1I77lCKhmo2IA71RqJbemdp9nsA>
Subject: Re: [Storagesync] Storagesync Digest, Vol 5, Issue 1



--------------
Fei Song
>BTW, it occurs to me that at CERN you deal a lot in very large data sets which are created once and never modified, because they are records of physical events.  So the idea of merging might not seem all that important, because you will never do it on these data sets.
>
>However, one thing that a good versioning system allows for that I think you should consider important is the ability to avoid accidentally losing data.   When you have really big data archives, one of the things that you want to do for redundancy is keep multiple copies.   If one instance gets corrupted, you want to be able to detect that, and you want to be able to avoid losing other instances of the file if the file is lost from an instance of a folder.   The most reliable way to avoid that is to keep versioning metadata.   Multiple copies of versioning metadata are very useful for forensic analysis when something goes wrong, as well.
>
>Additionally, while the bulk of the actual _data_ that CERN stores is really big files, the little files matter just as much--the work researchers are doing, particularly collaboratively.   Enabling efficient collaboration on articles in progress, enabling effective sharing of code, and so on, all are very important despite representing a tiny percentage of the total data stored.

I agree that efficient collaboration, and similar work that depends on sync, is important.
I have a question, though. If the data only records physical events and is never modified,
could we simply mark every copy "read only" and prohibit updates once the first sync has finished?
What is the benefit of using a versioning metadata system in this case?
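
To make the case under discussion concrete, here is a minimal sketch of the corruption-detection point from the quoted text. It is only an illustration under stated assumptions, not a proposed design: the "metadata" is reduced to a single SHA-256 digest recorded at first sync, and the replica names (site-a, site-b, site-c) and the helper find_corrupted are hypothetical.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a replica's content."""
    return hashlib.sha256(data).hexdigest()


def find_corrupted(replicas: dict[str, bytes], recorded: str) -> list[str]:
    """Return the names of replicas whose content no longer matches
    the digest recorded when the file was first synced."""
    return sorted(
        name for name, data in replicas.items() if digest(data) != recorded
    )


# Hypothetical write-once data set: the digest is stored as metadata
# at first sync, and every copy is then treated as read only.
original = b"event record 0001"
recorded = digest(original)

# Three hypothetical copies; site-c has silently changed on disk
# even though no update was ever permitted.
replicas = {
    "site-a": original,
    "site-b": original,
    "site-c": b"event record 0O01",
}

corrupted = find_corrupted(replicas, recorded)
print(corrupted)  # ['site-c']
```

In this sketch the read-only flag stops intentional updates, while the recorded digest is what identifies which copy has been damaged. Whether that justifies a full versioning metadata system for write-once data is the open question above.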
