Re: [Storagesync] Storagesync Digest, Vol 5, Issue 1

Jakub Moscicki <Jakub.Moscicki@cern.ch> Wed, 09 December 2015 21:26 UTC

From: Jakub Moscicki <Jakub.Moscicki@cern.ch>
To: Ted Lemon <mellon@fugue.com>
Date: Wed, 09 Dec 2015 21:25:00 +0000
Message-ID: <BD79DC4A-1803-49FF-A779-30844167C0DF@cern.ch>
References: <1449452139832-4f314827-a7ecd596-c5312339@fugue.com> <1449454580239-1fd59d90-52f0231b-370f2ef5@gmail.com,> <1449455245871-cb7e86e1-1a0160c5-aa6acce3@fugue.com> <2015120711170621874681@bjtu.edu.cn> <1449459616112-6043cb32-cd69a1f9-1399f1c0@fugue.com> <CAO_Yprbct8wFbS1WFnZZENSp-OruRUk2nRyBv4tNeKv9_CGuCg@mail.gmail.com> <1449511062426-94cdee34-064ef498-327458b6@fugue.com> <CAO_YprZjqs_OFC3RybVvJ4GHWb3spKMMkkFTZO=YDustp825iw@mail.gmail.com> <1449593642163-c107ebb4-0f6d1c5a-a3f1c5e0@fugue.com> <20151208185922.GA9531@localhost.localdomain> <1449609937865-6dbdad8f-eb44d945-cd684f34@fugue.com> <AE0CE9F1-3968-4229-925B-75AA37EDC327@unterwaditzer.net> <1449670262769-e440b1e3-b960232c-260b9165@fugue.com> <506D291C-4F0B-40F3-8848-97DAAF41CAAE@cern.ch> <1449671733322-9f72a594-b1d5700c-d3631253@fugue.com> <1449672190209-97dbcf5a-4802eeae-a6a33a55@fugue.com>
In-Reply-To: <1449672190209-97dbcf5a-4802eeae-a6a33a55@fugue.com>
Archived-At: <http://mailarchive.ietf.org/arch/msg/storagesync/7zXVQqBFQ5CB7t9C_14VOxIzgXk>
Cc: "storagesync@ietf.org" <storagesync@ietf.org>
Subject: Re: [Storagesync] Storagesync Digest, Vol 5, Issue 1
List-Id: Mechanisms to synchronize client file systems with Internet-based data storage services <storagesync.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/storagesync>, <mailto:storagesync-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/storagesync/>
List-Post: <mailto:storagesync@ietf.org>
List-Help: <mailto:storagesync-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/storagesync>, <mailto:storagesync-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 09 Dec 2015 21:26:04 -0000

Hello,

The last time I looked at this in CERNBox, we had more than 1000 distinct file extensions, and quite a large fraction of the files were of the “typical Dropbox” kind: documents of various sorts, LaTeX sources of articles, iWork files, Office files, images, PDFs, source code (!), etc. We also have an increasing number of physics datasets, and we will have many more as we provide a direct connection via sync/share to the huge existing collections of these datasets. For the data files at CERN, yes, you are right: there is no point in merging them, and quite often they are never modified. But I would also tend to disagree with your statement that most files are text files. As Marc Blanchet explained, in general that is not the case, and not only at CERN.

Best regards,

kuba

--

> On 09 Dec 2015, at 15:43, Ted Lemon <mellon@fugue.com> wrote:
> 
> BTW, it occurs to me that at CERN you deal a lot in very large data sets which are created once and never modified, because they are records of physical events.  So the idea of merging might not seem all that important, because you will never do it on these data sets.
> 
> However, one thing that a good versioning system allows for that I think you should consider important is the ability to avoid accidentally losing data.   When you have really big data archives, one of the things that you want to do for redundancy is keep multiple copies.   If one instance gets corrupted, you want to be able to detect that, and you want to be able to avoid losing other instances of the file if the file is lost from an instance of a folder.   The most reliable way to avoid that is to keep versioning metadata.   Multiple copies of versioning metadata are very useful for forensic analysis when something goes wrong, as well.
> 
> Additionally, while the bulk of the actual _data_ that CERN stores is really big files, the little files matter just as much--the work researchers are doing, particularly collaboratively.   Enabling efficient collaboration on articles in progress, enabling effective sharing of code, and so on, all are very important despite representing a tiny percentage of the total data stored.
> 
> 
> --
> Sent from Whiteout Mail - https://whiteout.io
> 
> My PGP key: https://keys.whiteout.io/mellon@fugue.com
> _______________________________________________
> Storagesync mailing list
> Storagesync@ietf.org
> https://www.ietf.org/mailman/listinfo/storagesync