Re: [ire] CSV and RFC4180

Francisco Obispo <fobispo@isc.org> Thu, 13 December 2012 18:35 UTC

From: Francisco Obispo <fobispo@isc.org>
In-Reply-To: <CCEF5CEF.69E5%gustavo.lozano@icann.org>
Date: Thu, 13 Dec 2012 10:35:24 -0800
Content-Transfer-Encoding: quoted-printable
Message-Id: <41D8E8EE-AC91-4080-9B23-0BEA36BFFFEA@isc.org>
References: <CCEF5CEF.69E5%gustavo.lozano@icann.org>
To: Gustavo Lozano <gustavo.lozano@icann.org>
X-Mailer: Apple Mail (2.1499)
Cc: "ire@ietf.org" <ire@ietf.org>
Subject: Re: [ire] CSV and RFC4180

One thing I have been thinking is that the escrow dump is rather like a ship's manifest: it needs to be as internationally 'readable' as possible, so that any port in the world can parse it.

Loading into and unloading from the database should be secondary to all of this; we should be focusing on portability and recoverability.

Francisco

On Dec 13, 2012, at 10:26 AM, Gustavo Lozano <gustavo.lozano@icann.org> wrote:

> James,
> 
> I understand that we can produce text files encoded in UTF-8.
> 
> My concern is the TEXTDATA ABNF grammar defined in RFC4180: %x20-21 /
> %x23-2B / %x2D-7E, which supports only a subset of printable US-ASCII.
> 
> After pre-delegation testing, ICANN will have evidence that the escrow
> deposit file is correct, but this is only a snapshot in time. Registry
> platforms are updated, RDBMSs are updated, libraries are updated, and in
> general all the components of the SRS evolve over time. My concern is that
> in the future we find the escrow deposit file of a registry operator
> corrupted because some library now follows the ABNF grammar in
> RFC4180 or other validation rules. The same applies to the EBERO system.
> 
> XML being a well-defined standard makes me more comfortable in this regard.
> 
> Regards,
> 
> Gustavo Lozano
> 
> On 12/13/12 10:04 AM, "Gould, James" <JGould@verisign.com> wrote:
> 
>> Gustavo,
>> 
>> I don't believe that the file encoding is the key determinant of the file
>> format decision.  The CSV draft includes an encoding attribute with the
>> default of UTF-8.  It's up to the producer and consumer to support the
>> appropriate encoding of the data in any case whether we're talking about
>> XML as the file format or CSV.  I don't believe we would have any issue
>> producing or consuming UTF-8 encoded CSV files.
>> 
>> -- 
>> 
>> JG
>> 
>> 
>> 
>> James Gould
>> Principal Software Engineer
>> jgould@verisign.com
>> 
>> 703-948-3271 (Office)
>> 12061 Bluemont Way
>> Reston, VA 20190
>> VerisignInc.com
>> 
>> On 12/13/12 1:00 PM, "Gustavo Lozano" <gustavo.lozano@icann.org> wrote:
>> 
>>> Colleagues,
>>> 
>>> I find the proposal of Comma-Separated Values (CSV) Objects Mapping
>>> interesting and I think that both approaches, CSV and XML, for escrow
>>> data have their advantages and disadvantages.
>>> 
>>> The only RFC related to CSV that I have found is RFC4180.
>>> 
>>> This text from RFC4180 concerns me:
>>> TEXTDATA =  %x20-21 / %x23-2B / %x2D-7E
>>> 
>>> The escrow deposit will contain non-US-ASCII data.
>>> 
>>> How can we be sure that the libraries/database tools used to implement
>>> the export/import of CSV will work adequately with non-US-ASCII data?
>>> There are different platforms and architectures used by different players
>>> (EBEROs, registry operators and data escrow agents) that will be upgraded
>>> and will evolve over time.
>>> 
>>> In this regard I feel more comfortable with XML because Unicode support
>>> has been present since the beginning.
>>> 
>>> Thoughts?
>>> 
>>> Regards,
>>> Gustavo Lozano
>>> 
>>> _______________________________________________
>>> ire mailing list
>>> ire@ietf.org
>>> https://www.ietf.org/mailman/listinfo/ire
>> 
> 

Francisco Obispo 
Director of Applications and Services - ISC
email: fobispo@isc.org
Phone: +1 650 423 1374 || INOC-DBA *3557* NOC
PGP KeyID = B38DB1BE
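
To make the TEXTDATA concern from the thread concrete, here is a minimal Python sketch. It is an illustration under assumptions, not anything from the CSV draft: the helper names and the sample rows are hypothetical. It checks a field against the RFC4180 TEXTDATA ranges quoted above, and separately shows the standard csv module round-tripping UTF-8 data. The strict check is the kind of validation that would reject non-US-ASCII fields, while a tolerant reader and writer pass them through.

```python
import csv
import io
import re

# TEXTDATA as quoted from RFC4180: %x20-21 / %x23-2B / %x2D-7E
# (printable US-ASCII without the double quote and the comma).
TEXTDATA = re.compile(r"[\x20-\x21\x23-\x2B\x2D-\x7E]*")


def is_rfc4180_textdata(field: str) -> bool:
    """Return True if every character of `field` is inside the TEXTDATA ranges."""
    return TEXTDATA.fullmatch(field) is not None


def roundtrip_utf8(rows):
    """Serialize rows to UTF-8 encoded CSV bytes, then parse them back."""
    buf = io.StringIO(newline="")
    csv.writer(buf).writerows(rows)
    data = buf.getvalue().encode("utf-8")
    return list(csv.reader(io.StringIO(data.decode("utf-8"), newline="")))


# Hypothetical sample rows, made up for illustration only.
sample = [
    ["example.com", "José Núñez"],
    ["пример.test", "Contact, with comma"],
]

if __name__ == "__main__":
    for row in sample:
        for field in row:
            print(repr(field), "TEXTDATA only:", is_rfc4180_textdata(field))
    print("round trip intact:", roundtrip_utf8(sample) == sample)
```

A library that enforced the grammar literally would behave like is_rfc4180_textdata and refuse the non-ASCII fields, which is the future breakage Gustavo describes. Whether any given CSV library does so is something each implementer would have to test.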