Re: [VCARDDAV] Proposal around escape character handling (2nd round)

Daisuke Miyakawa <d.miyakawa@gmail.com> Wed, 14 July 2010 03:43 UTC

From: Daisuke Miyakawa <d.miyakawa@gmail.com>
To: Cyrus Daboo <cyrus@daboo.name>
Cc: vcarddav@ietf.org
Subject: Re: [VCARDDAV] Proposal around escape character handling (2nd round)

As for proposals 2, 3, and 4, I'll hold back my opinion for now: I have no
strong objection to what you guys suggest =)

I'd like to focus on proposal 1: escaping the semicolon, a field delimiter.

2010/7/13 Cyrus Daboo <cyrus@daboo.name>
>
> Hi Daisuke,
>
> --On July 13, 2010 10:35:32 PM +0900 Daisuke Miyakawa
> <d.miyakawa@gmail.com> wrote:
>
>> ****** Proposal 1 (new):
>> The one-to-one rules above MUST be applied to "all" the properties, even
>> including X- properties, for uniformity between properties.
>> In other words, semicolons MUST be escaped even when the property does
>> not allow multiple values (like (0, 1)).
>>
>>
>> In this proposal, how readers must/should act when ';' is given without
>> escape in those properties is undefined. I don't think "undefined" is a
>> good idea,
>> but I cannot think up better idea for mentioning it as a formal
>> specification.
>
> ; is used as a separator for "compound" values not multiple values (i.e.
> the "cardinality" (0, 1) has no bearing on that). In compound values it is
> vitally important to know where the field delimiters are (';') vs normal
> text occurrences of the field delimiter ('\;').
>
> So there is a real problem here. For example, if in the future I define a
> new property FOO that uses a ;-delimited compound value, then I want to be
> sure that clients not aware of this property will properly "round-trip" it.
> So if I send such a client this:
>
> FOO:delimited;text\;string
>
> I want to be sure that when it parses and re-generates the vcard, the
> exact same value (octet-by-octet) comes back. In that case the client must
> not unescape - if it did then it would generate:
>
> FOO:delimited;text;string
>
> alternatively, if the client left the \; as-is and then re-generated it
> could end up with:
>
> FOO:delimited;text\\;string
>
> and that would be wrong! Now a smart client might spot the use of \; in
> the original value and somehow "mark" that as being compound and then apply
> compound text generation rules when re-generating (in which case it would
> hopefully generate the correct result). However, I am not sure we can rely
> on that - if we want to then we need text clearly explaining what has to
> happen. Failing that, there is no way to define a new property that uses a
> compound TEXT value and ensure that it is backwards compatible (without
> resorting to using something other than ; and , as delimiters).

That makes sense.
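
To make the two outcomes concrete for myself, here is a minimal Python
sketch. Both functions are hypothetical stand-ins for how a client unaware
of FOO might behave; they are not taken from any real vCard library.

```python
# The value a FOO-unaware client receives (raw text after "FOO:").
RAW = r"delimited;text\;string"


def unescape_then_emit(raw: str) -> str:
    """Hypothetical client: unescapes "\\;" on read, writes the result verbatim."""
    return raw.replace("\\;", ";")


def keep_then_escape(raw: str) -> str:
    """Hypothetical client: keeps the raw text, escapes backslashes on write."""
    return raw.replace("\\", "\\\\")


print(unescape_then_emit(RAW))  # delimited;text;string
print(keep_then_escape(RAW))    # delimited;text\\;string
```

Neither output equals the original value, so the escaped semicolon is lost
or doubled after one round trip.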

I think adding a rule might be helpful:
- vCard readers MUST handle "\;" as the character ';' even when the property
name is unknown.
  That is, FOO, X-EXTRA-PROPERTY, or any other property name MUST NOT
affect how property values are treated.
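
A rough Python sketch of what such a reader/writer pair could look like,
independent of the property name. The function names are made up for
illustration, and as a simplification only "\;" is decoded; every other
backslash sequence is carried through untouched so that the value can be
written back unchanged.

```python
def split_compound(value: str) -> list[str]:
    """Split a raw value on unescaped ';' and decode "\\;" to ';' in each field."""
    fields, current = [], []
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            nxt = value[i + 1]
            if nxt == ";":
                current.append(";")       # escaped semicolon: literal text
            else:
                current.append(ch + nxt)  # other sequences stay encoded
            i += 2
        elif ch == ";":
            fields.append("".join(current))  # bare semicolon: field delimiter
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    fields.append("".join(current))
    return fields


def join_compound(fields: list[str]) -> str:
    """Re-escape literal ';' in each field and join the fields with ';'."""
    return ";".join(field.replace(";", "\\;") for field in fields)


fields = split_compound(r"delimited;text\;string")
print(fields)                 # ['delimited', 'text;string']
print(join_compound(fields))  # delimited;text\;string
```

With this pairing the FOO example comes back octet-for-octet, because the
reader remembers which semicolons were delimiters (the list boundaries) and
which were text.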

From the viewpoint of backward compatibility (2.1, 3.0), however, I agree that
this is not sufficient.
I guess a lot of vCard libraries would reuse implementations for vCard 3.0,
which does not have a strict rule around the delimiter (;).
If they simply reuse those implementations, what you mention above would happen
("delimited;text\;string" -> "delimited;text;string" OR
"delimited;text\\;string").


> Given all of that, I think we need text stating that new properties
> (registered, X-, vendor etc) MUST NOT use COMMA or SEMI-COLON text
> delimiting as existing clients may not roundtrip them. If a "structured"
> value is required, then a different delimiter has to be used. We may want to
> pick a specific character for that for IANA registered properties if we
> care.

Feasible enough. If we do so, it would be better to clearly define what
"new" properties are.

We have several properties.

name = "SOURCE" / "NAME" / "KIND" / "FN" / "N" / "NICKNAME"
     / "PHOTO" / "BDAY" / "DDAY" / "BIRTH" / "DEATH"
     / "ANNIVERSARY" / "SEX" / "ADR" / "LABEL" / "TEL" / "EMAIL"
     / "IMPP" / "LANG" / "TZ" / "GEO" / "TITLE" / "ROLE" / "LOGO"
     / "ORG" / "MEMBER" / "RELATED" / "CATEGORIES" / "NOTE"
     / "PRODID" / "REV" / "SOUND" / "UID" / "CLIENTPIDMAP" / "URL"
     / "VERSION" / "CLASS" / "KEY" / "FBURL" / "CALADRURI"
     / "CALURI" / "XML" / iana-token / x-name

I think N, ORG, and ADR must be considered "old" registered properties,
which are allowed to use the delimiter.
Then, does it make sense for us to prohibit the use of the delimiter in all
other properties, including iana-token and x-name?
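
If we went that way, a writer-side check might look roughly like the Python
sketch below. The set of allowed properties is only my suggestion above (N,
ORG, ADR), not something the draft states, and the function names are
invented for illustration.

```python
# Assumption: only these "old" properties may use ';' as a field delimiter.
COMPOUND_PROPERTIES = {"N", "ORG", "ADR"}


def has_unescaped_semicolon(value: str) -> bool:
    """Return True if the raw value contains a ';' not preceded by an escape."""
    i = 0
    while i < len(value):
        if value[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if value[i] == ";":
            return True
        i += 1
    return False


def violates_delimiter_rule(name: str, value: str) -> bool:
    """True if a property outside COMPOUND_PROPERTIES carries a bare ';'."""
    return name.upper() not in COMPOUND_PROPERTIES and has_unescaped_semicolon(value)


print(violates_delimiter_rule("N", "Koura;Osamu;;"))              # False
print(violates_delimiter_rule("FOO", r"delimited;text\;string"))  # True
print(violates_delimiter_rule("X-EXTRA-PROPERTY", r"text\;string"))  # False
```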

Also, we will need to take care of the SORT-AS parameter too.

We no longer have SORT-STRING as a property name, but rev12 has SORT-AS,
which uses the delimiter (semicolon).
I think we also need to define its behavior:

N;SORT-AS="Koura;Osamu":Koura;Osamu;;
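
To show why this needs defining, here is a small Python sketch of a reader
that keeps a quoted parameter value intact while splitting the line. The
helper names are hypothetical, and stopping at the quoted string (rather
than splitting "Koura;Osamu" further) is my assumption; how to interpret the
semicolon inside the quotes is exactly the undefined part.

```python
def split_outside_quotes(text: str, sep: str) -> list[str]:
    """Split text on sep, ignoring separators inside double quotes."""
    parts, current, in_quotes = [], [], False
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == sep and not in_quotes:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def parse_content_line(line: str) -> tuple[str, dict[str, str], str]:
    """Return (property name, parameters, raw value) for one content line."""
    pieces = split_outside_quotes(line, ":")
    head, value = pieces[0], ":".join(pieces[1:])  # value may contain ':'
    items = split_outside_quotes(head, ";")
    params = {}
    for item in items[1:]:
        key, _, val = item.partition("=")
        params[key.upper()] = val.strip('"')  # drop the surrounding quotes
    return items[0], params, value


name, params, value = parse_content_line('N;SORT-AS="Koura;Osamu":Koura;Osamu;;')
print(name)    # N
print(params)  # {'SORT-AS': 'Koura;Osamu'}
print(value)   # Koura;Osamu;;
```

A reader that splits parameters on every ';' without tracking quotes would
instead see a parameter `SORT-AS="Koura` followed by a stray `Osamu"`.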


Thanks,

-- 
Daisuke Miyakawa (宮川大輔)
d.miyakawa@gmail.com