Re: [ire] CSV and RFC4180

Gustavo Lozano <gustavo.lozano@icann.org> Thu, 13 December 2012 18:26 UTC

From: Gustavo Lozano <gustavo.lozano@icann.org>
To: "Gould, James" <JGould@verisign.com>, "ire@ietf.org" <ire@ietf.org>
Date: Thu, 13 Dec 2012 10:26:56 -0800
Thread-Topic: [ire] CSV and RFC4180
Thread-Index: Ac3ZX23q333dBbsYT9eLF5hZG9bPrQ==
Message-ID: <CCEF5CEF.69E5%gustavo.lozano@icann.org>
In-Reply-To: <C41D7AF7FCECBE44940E9477E8E70D7A0D7420E4@BRN1WNEXMBX02.vcorp.ad.vrsn.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
user-agent: Microsoft-MacOutlook/14.2.5.121010
acceptlanguage: en-US
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Subject: Re: [ire] CSV and RFC4180
X-BeenThere: ire@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: "Internet Registration Escrow discussion list." <ire.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/ire>, <mailto:ire-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/ire>
List-Post: <mailto:ire@ietf.org>
List-Help: <mailto:ire-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/ire>, <mailto:ire-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 13 Dec 2012 18:26:56 -0000

James,

I understand that we can produce text files encoded in UTF-8.

My concern is the TEXTDATA ABNF rule defined in RFC4180: %x20-21 /
%x23-2B / %x2D-7E, which covers only a subset of US-ASCII (the printable
characters, excluding the double quote and the comma).
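
To illustrate the rule, here is a minimal sketch in Python of what a
validator that applies TEXTDATA literally would accept and reject. The
function name and the sample values are hypothetical and used only for
illustration; this is not taken from any particular CSV library.

```python
import re

# TEXTDATA as written in RFC 4180: %x20-21 / %x23-2B / %x2D-7E
TEXTDATA = re.compile(r"[\x20-\x21\x23-\x2B\x2D-\x7E]*")


def is_strict_textdata(field: str) -> bool:
    """Return True if every character of an unquoted field matches TEXTDATA."""
    return TEXTDATA.fullmatch(field) is not None


# Hypothetical sample values
print(is_strict_textdata("example.com"))      # True: plain US-ASCII
print(is_strict_textdata("münchen.example"))  # False: "ü" is outside %x20-7E
print(is_strict_textdata("a,b"))              # False: comma is excluded
```

Note that the RFC4180 grammar for quoted fields is also built on
TEXTDATA, so quoting a field would not make non-US-ASCII content valid
under a literal reading of the grammar.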

After pre-delegation testing, ICANN will have evidence that the escrow
deposit file is correct, but this is only a snapshot in time. Registry
platforms are updated, RDBMSs are updated, libraries are updated and, in
general, all the components of the SRS evolve over time. My concern is
that in the future we find the escrow deposit file of a registry operator
corrupted because some library now follows the ABNF grammar in RFC4180 or
other validation rules. The same applies to the EBERO system.
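
One way to detect such a regression would be a round-trip check that is
re-run whenever a component is upgraded. The following is only a sketch,
assuming Python's standard csv module stands in for whatever
export/import tooling a registry or escrow agent actually uses; the
function name and the sample rows are hypothetical.

```python
import csv
import io


def roundtrips(rows, encoding="utf-8"):
    """Write rows as CSV, encode to bytes, decode, parse, and compare."""
    out = io.StringIO(newline="")
    csv.writer(out).writerows(rows)
    data = out.getvalue().encode(encoding)  # bytes as they would sit on disk

    parsed = list(csv.reader(io.StringIO(data.decode(encoding), newline="")))
    return parsed == [[str(value) for value in row] for row in rows]


# Hypothetical rows containing non-US-ASCII data, a comma and quotes
sample = [
    ["xn--mnchen-3ya.example", "münchen.example"],
    ["registrant", 'José, "Ñandú"'],
]
print(roundtrips(sample))
```

A check like this only covers the toolchain it is run against, so each
party (registry operator, escrow agent, EBERO) would need its own.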

XML, being a well-defined standard, makes me more comfortable in this
regard.

Regards,

Gustavo Lozano

On 12/13/12 10:04 AM, "Gould, James" <JGould@verisign.com> wrote:

>Gustavo,
>
>I don't believe that the file encoding is the key determinant of the file
>format decision.  The CSV draft includes an encoding attribute with the
>default of UTF-8.  It's up to the producer and consumer to support the
>appropriate encoding of the data in any case whether we're talking about
>XML as the file format or CSV.  I don't believe we would have any issue
>producing or consuming UTF-8 encoded CSV files.
>
>-- 
>
>JG
> 
>
> 
>James Gould
>Principal Software Engineer
>jgould@verisign.com
> 
>703-948-3271 (Office)
>12061 Bluemont Way
>Reston, VA 20190
>VerisignInc.com
>
>On 12/13/12 1:00 PM, "Gustavo Lozano" <gustavo.lozano@icann.org> wrote:
>
>>Colleagues,
>>
>>I find the proposal of Comma-Separated Values (CSV) Objects Mapping
>>interesting, and I think that both approaches to escrow data, CSV and
>>XML, have their advantages and disadvantages.
>>
>>The only RFC related to CSV that I have found is RFC4180.
>>
>>This text from RFC4180 concerns me:
>>TEXTDATA =  %x20-21 / %x23-2B / %x2D-7E
>>
>>The escrow deposit will contain non-US-ASCII data.
>>
>>How can we be sure that the libraries/database tools used to implement
>>the export/import of CSV will work adequately with non-US-ASCII data?
>>There are different platforms and architectures used by different players
>>(EBEROs, registry operators and data escrow agents) that will be upgraded
>>and will evolve over time.
>>
>>In this regard I feel more comfortable with XML because Unicode support
>>has been present since the beginning.
>>
>>Thoughts?
>>
>>Regards,
>>Gustavo Lozano
>>