Re: [ire] CSV and RFC4180

"Gould, James" <JGould@verisign.com> Thu, 13 December 2012 18:32 UTC

From: "Gould, James" <JGould@verisign.com>
To: Gustavo Lozano <gustavo.lozano@icann.org>, "ire@ietf.org" <ire@ietf.org>

Gustavo,

Doesn't RFC 4180 state the following?

   Common usage of CSV is US-ASCII, but other character sets defined
   by IANA for the "text" tree may be used in conjunction with the
   "charset" parameter.

It does not preclude the use of UTF-8 or any other character set
(http://www.iana.org/assignments/character-sets/character-sets.xml), so we
should be good.
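
As a rough illustration of the point above (not taken from the CSV draft), here is a minimal Python sketch. It assumes the standard-library csv module and made-up sample rows; the function name round_trip is hypothetical. It writes fields containing non-US-ASCII characters, encodes the result as UTF-8, and reads it back.

```python
import csv
import io


def round_trip(rows, encoding="utf-8"):
    """Serialize rows to CSV bytes in the given encoding, then parse them back.

    Returns a tuple (encoded_bytes, parsed_rows).
    """
    # newline="" keeps the csv module's own CRLF line endings untouched.
    text_out = io.StringIO(newline="")
    csv.writer(text_out).writerows(rows)
    encoded = text_out.getvalue().encode(encoding)

    # Decode with the same encoding the producer declared, then parse.
    text_in = io.StringIO(encoded.decode(encoding), newline="")
    return encoded, list(csv.reader(text_in))


# Hypothetical escrow-style rows with non-US-ASCII data and an embedded comma.
sample_rows = [
    ["domain", "registrant"],
    ["example.test", "José Muñoz"],
    ["example2.test", "Bücher, GmbH"],
]

encoded, parsed = round_trip(sample_rows)
assert parsed == sample_rows
```

The round trip only works if producer and consumer agree on the encoding, which is what the encoding attribute in the draft is meant to convey.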



-- 

JG

James Gould
Principal Software Engineer
jgould@verisign.com
 
703-948-3271 (Office)
12061 Bluemont Way
Reston, VA 20190
VerisignInc.com

On 12/13/12 1:26 PM, "Gustavo Lozano" <gustavo.lozano@icann.org> wrote:

>James,
>
>I understand that we can produce text files encoded in UTF-8.
>
>My concern is the TEXTDATA ABNF grammar defined in RFC4180: %x20-21 /
>%x23-2B / %x2D-7E, which supports only a subset of US-ASCII.
>
>After pre-delegation testing, ICANN will have evidence that the escrow
>deposit file is correct, but this is only a snapshot in time. Registry
>platforms, RDBMSs, and libraries are updated, and in general all the
>components of the SRS evolve over time. My concern is that we find in the
>future that a registry operator's escrow deposit file is corrupted
>because some library now follows the ABNF grammar in RFC4180 or other
>validation rules. The same applies to the EBERO system.
>
>XML, being a well-defined standard, makes me more comfortable in this regard.
>
>Regards,
>
>Gustavo Lozano
>
>On 12/13/12 10:04 AM, "Gould, James" <JGould@verisign.com> wrote:
>
>>Gustavo,
>>
>>I don't believe that the file encoding is the key determinant of the file
>>format decision.  The CSV draft includes an encoding attribute with the
>>default of UTF-8.  It's up to the producer and consumer to support the
>>appropriate encoding of the data in any case whether we're talking about
>>XML as the file format or CSV.  I don't believe we would have any issue
>>producing or consuming UTF-8 encoded CSV files.
>>
>>On 12/13/12 1:00 PM, "Gustavo Lozano" <gustavo.lozano@icann.org> wrote:
>>
>>>Colleagues,
>>>
>>>I find the proposal of Comma-Separated Values (CSV) Objects Mapping
>>>interesting, and I think that both approaches for escrow data, CSV and
>>>XML, have their advantages and disadvantages.
>>>
>>>The only RFC related to CSV that I have found is RFC4180.
>>>
>>>This text from RFC4180 concerns me:
>>>TEXTDATA =  %x20-21 / %x23-2B / %x2D-7E
>>>
>>>The escrow deposit will contain non-US-ASCII data.
>>>
>>>How can we be sure that the libraries/database tools used to implement
>>>the export/import of CSV will work adequately with non-US-ASCII data?
>>>There are different platforms and architectures used by different
>>>players (EBEROs, registry operators and data escrow agents) that will
>>>be upgraded and will evolve over time.
>>>
>>>In this regard I feel more comfortable with XML because Unicode support
>>>has been present since the beginning.
>>>
>>>Thoughts?
>>>
>>>Regards,
>>>Gustavo Lozano
>>>
>>
>
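
For reference, the TEXTDATA rule quoted in this thread can be written as a character class. The following is a minimal Python sketch, not any real library's behavior: the helper name is_strict_textdata is hypothetical, and it checks only the TEXTDATA rule, not the full RFC 4180 field grammar (which also allows commas, line breaks and doubled quotes inside quoted fields).

```python
import re

# TEXTDATA = %x20-21 / %x23-2B / %x2D-7E  (RFC 4180)
# Printable US-ASCII except the double quote (0x22) and the comma (0x2C).
TEXTDATA_RUN = re.compile(r"[\x20-\x21\x23-\x2B\x2D-\x7E]*")


def is_strict_textdata(field: str) -> bool:
    """Return True if every character of field matches the TEXTDATA rule."""
    return TEXTDATA_RUN.fullmatch(field) is not None


assert is_strict_textdata("example.test")
assert not is_strict_textdata("José")  # 'é' is outside %x20-7E
assert not is_strict_textdata("a,b")   # comma is excluded from TEXTDATA
```

A tool that applied the rule this literally would reject any non-US-ASCII field, which is the scenario raised in the thread; whether any particular library does so is not established here.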