[ldapext] UTF-8 full support in LDIF / LDIF v2

Yves Dorfsman <yves@zioup.com> Sat, 21 March 2009 07:31 UTC

Hi,

I'd like to re-start this thread.

I have been re-reading the earlier messages and have put some thought into
it as well. I have also been in touch with the RFC-editor and with Alexey
Melnikov (one of the Area Directors for LDAP), and I have taken a stab at
the new text to jump-start the discussion.


Scope:
I personally would prefer to address UTF-8 support only, and possibly
other minor issues, but make no other major changes. My reasoning is that:
-the change is relatively simple
-the lack of UTF-8 support is a problem that affects me directly
-I understand the problem
-adding other major functionality will probably delay the RFC, whereas we
could get this one out relatively fast and work on the other
functionality for a later version.
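
To make the problem concrete, here is a minimal sketch of what RFC 2849 (LDIF version 1) requires today: any value containing an octet outside the SAFE-CHAR range, which includes every non-ASCII UTF-8 character, has to be written base64-encoded after "::". The function name ldif_v1_attrval is hypothetical and only for illustration; line folding and URL values are left out.

```python
import base64


def ldif_v1_attrval(attr: str, value: str) -> str:
    """Format one attribute line following the RFC 2849 (version 1) rules.

    Hypothetical helper for illustration; it does not fold long lines.
    """
    raw = value.encode("utf-8")
    # SAFE-CHAR: any octet up to 0x7F except NUL, LF and CR.
    all_safe = all(b < 0x80 and b not in (0x00, 0x0A, 0x0D) for b in raw)
    # SAFE-INIT-CHAR additionally excludes SPACE, ':' and '<' as first octet.
    safe_start = not raw or raw[0] not in (0x20, 0x3A, 0x3C)
    # RFC 2849 also recommends base64 for values ending in SPACE.
    safe_end = not raw.endswith(b" ")
    if all_safe and safe_start and safe_end:
        return f"{attr}: {value}"
    # Anything else, including all non-ASCII text, must be base64-encoded.
    return f"{attr}:: {base64.b64encode(raw).decode('ascii')}"
```

A plain ASCII name comes out as "cn: Yves", while a name with accented letters comes out as an unreadable "sn:: ..." base64 line, which is the inconvenience this proposal is about.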

Alexey said that he'd like to have all the options on the table.


Version scheme:
Last year I wrote that the version number is limited to one digit. That was
a mistake: I re-read RFC 2234 (the specification of Augmented Backus-Naur
Form, ABNF), and what the current RFC specifies is one or more digits, so
this is a non-issue.
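
As a quick check of that reading, RFC 2849 defines version-spec as "version:" FILL version-number, with FILL = *SPACE and version-number = 1*DIGIT. In ABNF, 1*DIGIT means one or more digits. A minimal sketch of a parser for that production follows; parse_version_spec is a hypothetical name used only for illustration.

```python
import re

# version-spec   = "version:" FILL version-number
# FILL           = *SPACE
# version-number = 1*DIGIT   (one or more ASCII digits)
VERSION_SPEC = re.compile(r"version:[ ]*([0-9]+)")


def parse_version_spec(line: str) -> int:
    """Return the version number from an LDIF version-spec line."""
    match = VERSION_SPEC.fullmatch(line)
    if match is None:
        raise ValueError(f"not a valid version-spec: {line!r}")
    return int(match.group(1))
```

Under this grammar both "version: 1" and "version: 12" are syntactically valid, so a second LDIF version does not run out of digits.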


Escaping UTF-8 characters:
RFC 2253 (UTF-8 String Representation of Distinguished Names) allows
characters to be escaped with a backslash followed by two hex digits. I can
see the point when working from the command line, say when your terminal is
not set up properly or you don't have the appropriate keyboard, but I am not
sure about files... What do you guys think?
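
For reference, here is a minimal sketch of the RFC 2253 style of escaping: each octet of a character's UTF-8 encoding is written as a backslash followed by two hex digits. The function name escape_non_ascii is hypothetical, and the sketch deliberately handles only non-ASCII characters; it does not escape the special characters (such as "," or "+") that RFC 2253 also covers.

```python
def escape_non_ascii(value: str) -> str:
    """Escape each non-ASCII character as backslash-hex pairs of its UTF-8 octets.

    Illustration only: ASCII characters, including DN special characters,
    are passed through unchanged.
    """
    pieces = []
    for ch in value:
        if ord(ch) < 0x80:
            pieces.append(ch)
        else:
            # One "\XX" pair per UTF-8 octet of the character.
            pieces.extend(f"\\{octet:02X}" for octet in ch.encode("utf-8"))
    return "".join(pieces)


# The surname from the RFC 2253 examples, written with Unicode escapes
# so this snippet itself stays ASCII-only.
print(escape_non_ascii("Lu\u010di\u0107"))  # prints Lu\C4\8Di\C4\87
```

The escaped form is ASCII-safe but no more readable than base64, which is part of why its value inside a file is debatable.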


Authorship:
I have tried to contact Gordon Good, the original author of RFC 2849,
but have not heard from him yet (could be due to spam filters). The 
RFC-Editor and Alexey say there are procedures around that, and we can leave 
that as a last minute item.

Example in UTF-8:
The RFC-Editor is very clear on this: RFCs are ASCII only, but we can add a
PostScript file with examples containing UTF-8 if we want. I am not sure
there is much value in that, since this is pretty trivial.

Removing some paragraphs:
Should we remove some paragraphs that no longer seem relevant,
such as:
"   The application/directory MIME content-type [RFC2425] is a general
    framework and format for conveying directory information, and is
    independent of any particular directory service.  The LDIF format is
    a simpler format which is perhaps easier to create, and may also be
    used, as noted, to describe a set of changes to be applied to a
    directory.
"


Expired draft:
Reference [Armijo00] is a draft that expired in 2001. It is used in
example 7. Is this still relevant?

RFC 4525:
I noticed that RFC 4525 has updated the LDIF definition. Should this be
included in the new RFC? I have created a separate file that includes it.
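
For context, the update in RFC 4525 (the Modify-Increment extension) adds an "increment:" alternative to the LDIF mod-spec production, next to "add:", "delete:" and "replace:". A minimal sketch that builds such a change record is below; increment_record is a hypothetical helper, and it assumes the DN and attribute name are plain ASCII that needs no base64 encoding.

```python
def increment_record(dn: str, attr: str, amount: int) -> str:
    """Build an LDIF modify record that increments `attr` by `amount`.

    Illustration only: assumes `dn` and `attr` are ASCII-safe.
    """
    lines = [
        f"dn: {dn}",
        "changetype: modify",
        f"increment: {attr}",   # mod-spec alternative added by RFC 4525
        f"{attr}: {amount}",    # the amount to add to the current value
        "-",                    # terminates the mod-spec
    ]
    return "\n".join(lines) + "\n"


print(increment_record("cn=max-assigned uidNumber,dc=example,dc=com",
                       "uidNumber", 1))
```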


Here are the two versions I have created:
http://www.sollers.ca/hg/ldif-utf8/file/d307d875966f/proposal.txt#l1
and a side by side diff:
http://www.sollers.ca/projects/ldif-utf8/files/proposal-diff.html

and with the addition of RFC 4525:
http://www.sollers.ca/hg/ldif-utf8/file/213f3b5dcd86/proposal.txt#l1
side by side diff:
http://www.sollers.ca/projects/ldif-utf8/files/rfc4525addon.html


-- 
Yves.
http://www.sollers.ca/