Re: [VCARDDAV] updating vCard 3.0 to vCard 4.0

"Javier Godoy" <rjgodoy@fich.unl.edu.ar> Fri, 16 July 2010 10:10 UTC

Message-ID: <B4A58BAF4AA84006B6B30B5E6015CB70@Javier2>
From: Javier Godoy <rjgodoy@fich.unl.edu.ar>
To: Daisuke Miyakawa <d.miyakawa@gmail.com>
References: <85421D6B55DD4BC18C9FB8105F4D66DD@Javier2> <AANLkTik07FG0DP21XS85ZIflhfVEQjXnlKDmEureV09c@mail.gmail.com>
Date: Fri, 16 Jul 2010 07:09:30 -0300
Cc: vcarddav@ietf.org
Subject: Re: [VCARDDAV] updating vCard 3.0 to vCard 4.0

Daisuke Miyakawa wrote:

> Interesting. I'd like to have some documentation describing the relation
> between vCard 3.0 and 4.0, though I don't know of any draft for it.
> As for the mapping from 3.0 to 4.0, however, I have some concerns.

> vCard 3.0 just recommends UTF-8; it is not a MUST. Actually, I've seen some
> devices emit vCards in a Japanese local charset called Shift_JIS.
> One-to-one mapping from Shift_JIS to UTF-8 is sometimes impossible, as far as
> I remember.

I didn't know that some Shift-JIS characters cannot be represented in UTF-8,
and I couldn't find any reference about that either. Despite some
ambiguities [1], it seems that Unicode is a superset of Shift-JIS [2]:

[[
How is JIS X0213 related to Unicode / ISO/IEC 10646?
 Almost all characters in JIS X0213 have corresponding characters in Unicode /
ISO/IEC 10646. Only a few non-Kanji characters are represented by composite
sequences in Unicode / ISO/IEC 10646. Kanji characters are mapped to one of
the blocks of CJK Unified Ideographs, CJK Compatibility Ideographs, CJK
Unified Ideographs Extension A, or CJK Unified Ideographs Extension B in
Unicode 4.0 (or later versions) and corresponding versions of ISO/IEC 10646;
or are mapped to CJK Compatibility Ideographs.
]]

[1] http://www.w3.org/TR/japanese-xml/#ambiguity_of_yen
[2] http://unicode.org/faq/han_cjk.html#8
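
As a rough illustration of the transcoding step, here is a minimal Python
sketch. It assumes the legacy vCard arrives as raw bytes and that the sender
really used Shift_JIS; the function name transcode_vcard is only for
illustration. Strict error handling is used so that any byte sequence the
codec cannot map raises an error instead of being replaced silently. It does
not address the yen sign ambiguity described in [1].

```python
def transcode_vcard(raw: bytes, source_charset: str = "shift_jis") -> bytes:
    """Decode a legacy vCard byte stream and re-encode it as UTF-8.

    Hypothetical helper: the name and the default charset are assumptions.
    """
    # errors="strict" raises UnicodeDecodeError on unmappable input,
    # so a lossy conversion never goes unnoticed.
    text = raw.decode(source_charset, errors="strict")
    return text.encode("utf-8")


# A vCard 3.0 line as a Shift_JIS device might emit it.
legacy = "FN:山田太郎\r\n".encode("shift_jis")
converted = transcode_vcard(legacy)
```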


> There are some other examples I can find which make the migration difficult.
> From the actual usage, I personally want such documentation anyway.
> I think we don't need to have a strict mapping but some guideline, though I
> agree it won't be a formal "specification".

In some cases, the mapping is strict (such as removing separators from date
values), but for the general case there are some situations that cannot be
solved "by the given algorithm". For instance, vCard 4.0 does not allow inline
entities, which were allowed for the RFC 2426 AGENT property, now subsumed
into RELATED; one should use a stream of vCards for that purpose, but the
nature of this "stream" depends on the context in which the vCard is used. And
there are other cases where several solutions exist (should I map TZ as text
values, or try heuristics for matching it against Olson database entries?).
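
The strict case (removing separators from date values) could be sketched as
follows. This is only an illustration under some assumptions: the input is a
complete date or date-time in ISO 8601 basic or extended form, without
fractional seconds, and the function name date_3_to_4 is hypothetical.

```python
import re

# Complete dates only; truncated forms are out of scope for this sketch.
_DATE_RE = re.compile(r"^(\d{4})-?(\d{2})-?(\d{2})$")
# hh:mm:ss with an optional "Z" or numeric UTC offset; no fractional seconds.
_TIME_RE = re.compile(r"^(\d{2}):?(\d{2}):?(\d{2})(Z|[+-]\d{2}:?\d{2})?$")


def date_3_to_4(value: str) -> str:
    """Rewrite a date or date-time value without "-" and ":" separators."""
    date_part, sep, time_part = value.partition("T")
    d = _DATE_RE.match(date_part)
    if d is None:
        raise ValueError(f"unsupported date value: {value!r}")
    result = "".join(d.groups())
    if sep:
        t = _TIME_RE.match(time_part)
        if t is None:
            raise ValueError(f"unsupported time value: {value!r}")
        hh, mm, ss, zone = t.groups()
        # The offset keeps its sign but loses the colon.
        result += "T" + hh + mm + ss + (zone or "").replace(":", "")
    return result
```

For example, date_3_to_4("1996-04-15") gives "19960415", and a value that is
already in basic form passes through unchanged.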

At least we should call attention to some changes, in order to facilitate the
migration. I think this task falls to us because we already know what has
changed, and (more importantly) we know why we changed it that way. As the WG, we can
authoritatively review the guidelines, even though they are not part of the
standard.


> For example, now we have the SORT-AS parameter instead of vCard 3.0's
> SORT-STRING, so strings for sorting should be composed differently.

That is tricky. SORT-STRING represents the national-language-specific sorting
of the FN property, while SORT-AS applies to N and other properties, but FN is
not among them.

The issue is that we had no per-component sorting information in vCard 3.0:
 FN:Oscar del Pozo
 N:del Pozo Triscon;Oscar
 SORT-STRING:Pozo

In this example, the preferred vCard 4.0 form would be
 FN:Oscar del Pozo
 N;SORT-AS="Pozo;Oscar":del Pozo Triscon;Oscar;;


But this mapping is not always straightforward:

 - SORT-STRING allows LANGUAGE param. SORT-AS assumes the same LANGUAGE as the
N property where it occurs. Language matching should be applied.

 - What would happen if SORT-STRING had x-params?

 - There was no sorting information for the given-name "Oscar". Should we
assume that the given-name from N is the sorting given-name? (In Japanese,
that would mean that SORT-AS would contain kanji instead of kana, since N is
written in kanji.) Or should it be left empty?

 - The definition of SORT-STRING is confusing ("the sort string is used to
provide family name or given name text that is to be used in sorting", but the
examples in RFC 2426 provide SORT-STRINGs where only the family name is
given). The example above could have been SORT-STRING:Pozo Triscon Oscar, from
which a naive algorithm could derive SORT-AS="Pozo Triscon Oscar;" or
SORT-AS="Pozo Triscon Oscar;Oscar". One could try removing the given-name from
the SORT-STRING, but the given-name is not guaranteed to be contained verbatim
in the SORT-STRING.
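
To make the heuristics above concrete, here is a small Python sketch. It is
not a proposed algorithm: the function names and the reuse_given flag are
hypothetical, it ignores LANGUAGE and x-params, it does not escape quotes in
the sort string, and it splits N naively (an escaped backslash directly
before a separator is not handled). It reuses the given-name from N unless
told otherwise, and strips the given-name from SORT-STRING only when it
appears verbatim at the end.

```python
import re


def n_components(n_value: str) -> list:
    """Split an N value on semicolons that are not backslash-escaped."""
    return re.split(r"(?<!\\);", n_value)


def sort_as_from_sort_string(n_value: str, sort_string: str,
                             reuse_given: bool = True) -> str:
    """Derive a "family;given" SORT-AS value from a vCard 3.0 SORT-STRING."""
    parts = n_components(n_value)
    given = parts[1] if len(parts) > 1 else ""
    family_key = sort_string.strip()
    # Heuristic: drop the given-name if SORT-STRING ends with it verbatim.
    if given and family_key.endswith(" " + given):
        family_key = family_key[: -len(given)].rstrip()
    # Open question from the list above: reuse N's given-name or leave empty.
    given_key = given if reuse_given else ""
    return f"{family_key};{given_key}"


def n_line_4(n_value: str, sort_string: str, reuse_given: bool = True) -> str:
    """Build an N line carrying SORT-AS; the N value is passed through as is."""
    sort_as = sort_as_from_sort_string(n_value, sort_string, reuse_given)
    return f'N;SORT-AS="{sort_as}":{n_value}'
```

With the example above, n_line_4("del Pozo Triscon;Oscar;;", "Pozo") yields
the N line shown earlier, and SORT-STRING:Pozo Triscon Oscar yields
SORT-AS="Pozo Triscon;Oscar". When the given-name is not contained verbatim,
the sort string is used unchanged as the family key.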


Best Regards

Javier


2010/7/15 Javier Godoy <rjgodoy@fich.unl.edu.ar>


Will there be a WG draft about updating from vCard 3.0 to vCard 4.0?

I'm concerned that, after the publication of vcardrev, systems supporting
vCard 4.0 will have to interoperate with systems that support only vCard 3.0.

Besides, since RFC 2425 and RFC 2426 will be obsolete, the requirement of
"supporting both vCard 3.0 and vCard 4.0" will imply references to the
obsolete and current standards, on an equal footing. Instead, I would like to
require "mapping from vCard 3.0 to vCard 4.0", and have such mapping reviewed
by the same people who participated in the discussion about vcardrev.

Interoperability with older systems may also require exporting vCard 4.0
instances as vCard 3.0. Later, the downgraded instance might be converted back
to vCard 4.0. I would like to provide a mechanism for preserving as much
information as possible during this mapping.

I think the community would benefit and feel more confident if some
broadly-discussed guidelines are provided, instead of asking each implementor
to reinvent the wheel.

If this is not in the schedule, and if you think it might be useful, I would
like to take on this responsibility.


Best Regards

Javier