Re: [VCARDDAV] Questions about text handling in vCard 4.0 (rev 11)

Daisuke Miyakawa <d.miyakawa@gmail.com> Tue, 06 July 2010 14:25 UTC


On 6 July 2010 at 22:46, Simon Perreault <simon.perreault@viagenie.ca> wrote:

> On 2010-07-06 00:53, Daisuke Miyakawa wrote:
> >     Can you please suggest text for what you have in mind? Probably a
> >     sentence or two to be added at the end of 3.3...?
> >
> > Here's an arbitrary example.
>
> I meant text that we should add to the vCard 4 specification and that
> would address your concern...
>
> > The problem here is we cannot estimate potential Unicode which harm
> > actual readability/edit-ability/xxx-ability caused by receiver/editor
> > side's limitation. Even readability is part of my concern.
>
> What we're talking about here is how Unicode characters get encoded in
> vCard. The current method is to encode them as-is. You are suggesting to
> use \xNNNN, and your argument is readability. I think this argument is
> weak because:
>
> - Unicode characters may actually be *more* readable than \xNNNN notation.
>

Please see the font issue I describe below. When no font is available, the
character cannot be read at all. Worse, it may corrupt the way other
characters are displayed.


> - Do we really care about readability of vCard data that is going to be
> displayed exactly the same to the user regardless of the encoding?



> > Another example: a final fallback method users will use during reading
> > and editing vCard would be just opening the file and editing it
> > manually. Then how can they edit unknown characters like Chinese,
> > Japanese, Korean, V...?
>
> I don't see a problem. What do you mean by "unknown"? I can just use vi
> on any UTF-8 file and edit it without any difficulty. One can do the
> same with Microsoft Word or any other editor that understands UTF-8 I
> would assume.


Hmm, "unknown" was unclear.

If no font for the characters is available, they are "unknown" to the
machine. For example, Unicode 5.2 allows Mahjong tiles in text:
http://www.unicode.org/charts/PDF/Unicode-5.1/U51-1F000.pdf

I don't think a typical PC is able to render them.

You can try the tiles at http://0xcc.net/jsescape/

On my machine, \U0001F002 appears as just a square (🀂 looks almost the
same as □).

Unicode 6.0 will support emoji, the pictographs used in Japan.
[?] can then be sent as plain text. I don't think typical editors will
support it at first, so for a while it will be displayed as something
unreadable.

My PC, I guess, has no Hebrew font, even though Unicode includes Hebrew.
Fonts for every character in Unicode cannot be expected even on typical PCs.

Unicode's code space already needs 21 bits (up to U+10FFFF), and it may
grow. Relying on the assumption that every typical computer can display all
of these characters when showing a vCard is a bit difficult.

I agree that \xNNNN (or \xNNNNNNNN, \UNNNNNNNN) does not solve this issue
completely, but without taking care of it, I think usability will not
improve enough.

One proposal I can make: composers are allowed (but not recommended) to use
this format only when they cannot enter the character directly (for example,
when a user wants to edit a foreign friend's name without an appropriate
IME) but do know the code point of that character.
Receivers MUST be able to decode \xNNNN to the corresponding Unicode
character.
This is a somewhat dirty compromise, but there is no technical difficulty
and no theoretical gap. I expect I could implement both sender and receiver
easily.
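
To make the proposal above concrete, here is a rough Python sketch of both
sides. It is only an illustration under my own assumptions, not proposed
spec text: the escape forms (\xNNNN with exactly 4 hex digits, \UNNNNNNNN
with exactly 8) and the function names decode_escapes and encode_non_ascii
are hypothetical, and the only existing vCard escaping it takes into
account is the escaped backslash (\\).

```python
import re

# Hypothetical escape forms taken from the notation discussed in this thread:
#   \xNNNN      (exactly 4 hex digits)
#   \UNNNNNNNN  (exactly 8 hex digits)
# The first alternative matches an escaped backslash (\\) so that the text
# following it is never mistaken for the start of an escape.
_ESCAPE = re.compile(r"\\\\|\\x([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def decode_escapes(text: str) -> str:
    """Receiver side: replace each escape with the code point it names."""

    def _sub(match: "re.Match[str]") -> str:
        digits = match.group(1) or match.group(2)
        if digits is None:
            # Escaped backslash: leave it for the normal vCard unescaping.
            return match.group(0)
        code = int(digits, 16)
        # Keep the escape as-is if it names a surrogate or lies past U+10FFFF.
        if code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
            return match.group(0)
        return chr(code)

    return _ESCAPE.sub(_sub, text)


def encode_non_ascii(text: str) -> str:
    """Composer side: write every non-ASCII character as an escape."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 0x80:
            out.append(ch)
        elif code <= 0xFFFF:
            out.append("\\x%04X" % code)
        else:
            out.append("\\U%08X" % code)
    return "".join(out)
```

A receiver would run decode_escapes on a property value before the usual
unescaping. Under the proposal, a composer would escape only the characters
it cannot enter directly; encode_non_ascii escapes everything outside ASCII
just to keep the sketch short.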


> Simon
> --
> NAT64/DNS64 open-source --> http://ecdysis.viagenie.ca
> STUN/TURN server        --> http://numb.viagenie.ca
> vCard 4.0               --> http://www.vcarddav.org
>



-- 
Daisuke Miyakawa (宮川大輔)
d.miyakawa@gmail.com