Re: [ire] CSV and RFC4180
Gustavo Lozano <gustavo.lozano@icann.org> Thu, 13 December 2012 19:03 UTC
From: Gustavo Lozano <gustavo.lozano@icann.org>
To: "Gould, James" <JGould@verisign.com>, "ire@ietf.org" <ire@ietf.org>
Date: Thu, 13 Dec 2012 11:03:46 -0800
Thread-Topic: [ire] CSV and RFC4180
Thread-Index: Ac3ZZJLbg0azj2qpR/WVaEMbmhYG0w==
Message-ID: <CCEF6231.69F6%gustavo.lozano@icann.org>
In-Reply-To: <C41D7AF7FCECBE44940E9477E8E70D7A0D742264@BRN1WNEXMBX02.vcorp.ad.vrsn.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
user-agent: Microsoft-MacOutlook/14.2.5.121010
acceptlanguage: en-US
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Subject: Re: [ire] CSV and RFC4180
X-BeenThere: ire@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: "Internet Registration Escrow discussion list." <ire.ietf.org>
James,

The encoding of the file could be UTF-8, but my reading of the ABNF
grammar (Section 2 of RFC 4180) is that the code point repertoire is
limited to %x20-21 / %x23-2B / %x2D-7E.

I created a text file on my computer containing non-US-ASCII code points,
with fields separated by ",", and saved it as UTF-8. I opened the file
with different applications without problems, but I am not sure that this
file can be considered compliant with the CSV format.

I am not trying to be picky here; the escrow deposit is a fundamental
piece of the registry transition process, and RFC 4180 made me feel that
CSV is not as well defined as XML.

Regards,
Gustavo

On 12/13/12 10:32 AM, "Gould, James" <JGould@verisign.com> wrote:

>Gustavo,
>
>Doesn't it state in RFC 4180 the following?
>
>   Common usage of CSV is US-ASCII, but other character sets defined
>   by IANA for the "text" tree may be used in conjunction with the
>   "charset" parameter.
>
>It does not preclude the use of UTF-8 or any other character set
>(http://www.iana.org/assignments/character-sets/character-sets.xml), so
>we should be good.
>
>--
>
>JG
>
>James Gould
>Principal Software Engineer
>jgould@verisign.com
>
>703-948-3271 (Office)
>12061 Bluemont Way
>Reston, VA 20190
>VerisignInc.com
>
>On 12/13/12 1:26 PM, "Gustavo Lozano" <gustavo.lozano@icann.org> wrote:
>
>>James,
>>
>>I understand that we can produce text files encoded in UTF-8.
>>
>>My concern is the TEXTDATA ABNF grammar defined in RFC4180: %x20-21 /
>>%x23-2B / %x2D-7E, which supports only a subset of US-ASCII.
>>
>>After pre-delegation testing, ICANN will have evidence that the escrow
>>deposit file is correct, but this is only a snapshot in time. Registry
>>platforms are updated, RDBMSs are updated, libraries are updated, and in
>>general all the components of the SRS evolve over time. My concern is to
>>find in the future that the escrow deposit file of a registry operator is
>>corrupted because some library is now following the ABNF grammar in
>>RFC4180 or other validation rules. The same applies to the EBERO system.
>>
>>XML, being a well-defined standard, makes me more comfortable in this
>>regard.
>>
>>Regards,
>>
>>Gustavo Lozano
>>
>>On 12/13/12 10:04 AM, "Gould, James" <JGould@verisign.com> wrote:
>>
>>>Gustavo,
>>>
>>>I don't believe that the file encoding is the key determinant of the
>>>file format decision. The CSV draft includes an encoding attribute with
>>>the default of UTF-8. It's up to the producer and consumer to support
>>>the appropriate encoding of the data in any case, whether we're talking
>>>about XML as the file format or CSV. I don't believe we would have any
>>>issue producing or consuming UTF-8 encoded CSV files.
>>>
>>>--
>>>
>>>JG
>>>
>>>James Gould
>>>Principal Software Engineer
>>>jgould@verisign.com
>>>
>>>703-948-3271 (Office)
>>>12061 Bluemont Way
>>>Reston, VA 20190
>>>VerisignInc.com
>>>
>>>On 12/13/12 1:00 PM, "Gustavo Lozano" <gustavo.lozano@icann.org> wrote:
>>>
>>>>Colleagues,
>>>>
>>>>I find the proposal of Comma-Separated Values (CSV) Objects Mapping
>>>>interesting, and I think that both approaches, CSV or XML, for escrow
>>>>data have their advantages and disadvantages.
>>>>
>>>>The only RFC related to CSV that I have found is RFC4180.
>>>>
>>>>This text from RFC4180 concerns me:
>>>>TEXTDATA = %x20-21 / %x23-2B / %x2D-7E
>>>>
>>>>The escrow deposit will contain non-US-ASCII data.
>>>>
>>>>How can we be sure that the libraries/database tools used to implement
>>>>the export/import of CSV will work adequately with non-US-ASCII data?
>>>>There are different platforms and architectures used by different
>>>>players (EBEROs, registry operators and data escrow agents) that will
>>>>be upgraded and will evolve over time.
>>>>
>>>>In this regard I feel more comfortable with XML because Unicode
>>>>support has been present since the beginning.
>>>>
>>>>Thoughts?
>>>>
>>>>Regards,
>>>>Gustavo Lozano
>>>>
>>>>_______________________________________________
>>>>ire mailing list
>>>>ire@ietf.org
>>>>https://www.ietf.org/mailman/listinfo/ire
>>>
>>
>
>_______________________________________________
>ire mailing list
>ire@ietf.org
>https://www.ietf.org/mailman/listinfo/ire
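The gap discussed in this thread, between what the RFC 4180 grammar admits and what tools accept in practice, can be illustrated with a small sketch. It is not part of the original message. It assumes Python and its standard csv module as a stand-in for "some library"; the helper names is_strict_textdata and roundtrip_utf8 and the sample rows are hypothetical. The character class is a literal reading of the TEXTDATA rule quoted above.

```python
import csv
import io
import re

# Literal reading of the TEXTDATA rule in RFC 4180, Section 2:
#   TEXTDATA = %x20-21 / %x23-2B / %x2D-7E
# (printable US-ASCII without the double quote %x22 and the comma %x2C).
TEXTDATA_RUN = re.compile(r"[\x20-\x21\x23-\x2B\x2D-\x7E]*")


def is_strict_textdata(field: str) -> bool:
    """Return True if every character of an unquoted field is TEXTDATA."""
    return TEXTDATA_RUN.fullmatch(field) is not None


def roundtrip_utf8(rows):
    """Serialize rows to UTF-8 encoded CSV bytes, then parse them back."""
    buf = io.StringIO(newline="")
    csv.writer(buf).writerows(rows)
    data = buf.getvalue().encode("utf-8")
    return list(csv.reader(io.StringIO(data.decode("utf-8"), newline="")))


# Hypothetical sample data containing non-US-ASCII code points.
rows = [["example.com", "Registrant"], ["mañana.example", "José Núñez"]]

# A tolerant library round-trips the UTF-8 data without loss ...
assert roundtrip_utf8(rows) == rows

# ... while a validator that applies the ABNF literally rejects some fields.
for row in rows:
    for field in row:
        print(repr(field), is_strict_textdata(field))
```

A consumer that enforces the grammar literally would flag the second row even though the bytes are valid UTF-8, which is the scenario described above for a future library upgrade.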
- [ire] CSV and RFC4180 Gustavo Lozano
- Re: [ire] CSV and RFC4180 Gould, James
- Re: [ire] CSV and RFC4180 Gustavo Lozano
- Re: [ire] CSV and RFC4180 Gould, James
- Re: [ire] CSV and RFC4180 Francisco Obispo
- Re: [ire] CSV and RFC4180 Gustavo Lozano
- Re: [ire] CSV and RFC4180 Gould, James