[VCARDDAV] Questions, Concerns, and Errata concerning vcardrev-13
Rohit Khare <Rohit@Khare.org> Tue, 12 October 2010 06:44 UTC
The very least I owe our colleagues here on the list is a detailed reading of the current draft-ietf-vcarddav-vcardrev-13.txt. I wanted to share some fairly low-level notes I took as I read through the spec. Some are errata, which may have already been posted to a wiki page [1]. Others are observations that, while frank, cranky, or (hopefully) humorous, are intended to reflect what an outsider might question when their boss first assigns them to “add vCard4 to our product by next week!” and they start reading from page 1, without any of the WG history or archives at hand.

At a high level, I don’t believe there’s such a beast as a “file format that humans aren’t exposed to.” Short of taking extreme measures to prevent comprehensibility (e.g. ASN.1 BER), developers, advanced users, and administrators who have to put such things in envelopes (e.g. DB fields) all have to encounter and puzzle out the meaning of fields and work around bugs. Few will be English speakers, so I’m not appealing to semantically workable meanings for words, but many will be cutting and pasting instructions from the Interwebs to get the job done, and they deserve our sympathy, not our scorn.

Finally, I don’t like many of my own ‘editorial recommendations,’ but I wanted to respect the spirit of the group’s request for ‘alternate text’ by October 11. I am solely responsible for the text that follows, which means I’m glad to take the blame and even happier to share the credit :)

- Rohit

[1] https://wiki.mozilla.org/VCard4#draft_13_section_by_section_review

* case-insensitivity (page 7). Raised a red flag for me about the XML mapping I knew was coming up in association with this spec, since the XML will necessarily be case-sensitive. Is there evidence of case-mixing being interoperable at present, in the wild?
  Editorial recommendation: If the data supports this, recommend that implementers prefer one case style as canonical, whether or not they are “liberal in what they accept.”

* The group concept arrives early on (page 9). Only much later would the revision history clue me in that it’s been a controversial idea that’s made it in and out of the drafts. In any case, the “group.” prefix came out of nowhere as I was reading the syntax, without reference to its roots in the HOME/WORK distinction. Whenever I introduce a new degree of freedom, I prefer to ground it with specific, evocative examples as soon as possible. At this point, even after reviewing the whole doc, I’m unclear on why it exists: are there other cultures that have lots of evidence of a taxonomy with higher valence than Home/Work? Are implementations incorrectly assuming the group prefix correlates to a user-facing label (fields should never determine display, imo)?
  Editorial recommendation: a better explanation inline, or at least a specific forward reference. As it stands, the early arrival of such a weak claim (SHOULD/MAY) reduced my confidence that this would be an interoperable spec.

* “space-saving reasons” (page 8). Caught my eye, because it contradicted the (debatable) point I’ve heard repeatedly of late, that vCard is not a human-readable format. In which case, saving space is ZIP’s job (or some other content-transfer-encoding), not a legitimate reason to make any tradeoff that makes writing a parser harder than it should be…

* … specifically because it makes comma and semicolon handling worryingly complex (§3.3). My spidey-senses start tingling when I have to scan over special handling for escaping.
  I always hope for something so mandatory that I can pre-process my input with a regex, but I usually don’t get that moment of satori :) Short of that, I became concerned: if I’m going to be ABNF-driven, that’s fine, but having input processing rules that aren’t context-sensitive seemed surprising (every comma, everywhere).
  Editorial recommendation: MUST always unescape all input (obviously); MUST escape commas in fields that repeat; SHOULD escape commas in fields that don’t. In the end, implementers must be prepared to read in all commas permitted by the ABNF in §3.2 (the actual ABNF, not the warning text about “some value types” at the end; otherwise, exclude it from SAFE-CHAR), and not those separately forbidden by the side-agreement in §3.3. Confidence: Low. I may have misunderstood the mechanics here.

* 4.2 - data: URIs are very handy here (RFC 2397). Other URL schemes would also be helpful, and I recognize that can apply by virtue of 6.2.4. data: URIs also support BASE64 (which is what Apple Address Book already preferred: http://markmail.org/message/s6rcxdh6u5ylqtm4 ).
  Editorial consideration: Since data: URIs specify the MIME types of their payload, it may be worth illustrating this in the reference so that implementers are reminded of the potential for multiple, conflicting image types, and to (at least) prefer the one found in the data: URI. In general, though, I suspect FMTTYPE should not have any normative meaning for a URI. The MIME type specified when retrieving a representation of that resource, whether over the network or by decoding data: URIs, is controlling.
  Editorial recommendation: remove fmttype-param from this rule: refer-param = "VALUE=uri" / fmttype-param

* 4.4 boolean is misspelled (third time it occurs on the first line). Also, I don’t understand why so much case conversion is permitted here.
  Are there lots of examples of interoperability failures, or can we just remind authors the ABNF is the controlling legal language (all CAPS per page 10, not at all “case insensitive”!)? They may accept “TrUE” at their own risk, but must never generate it.
  Editorial recommendation: mandate all-caps case.

* 4.6 FLOAT does not permit scientific notation. Is that worth warning implementers, who may just use their favorite language’s %f formatting?

* 5.0 (top of page 17) Implementers MUST ignore undocumented or private use parameters, but MAY ignore group prefixes (3.2). Given the risk of merge loops, especially when a PID is specified, dropping fields seems worse than round-tripping intact. I wonder why that was ruled out? Evidence of propagation failures in the wild might help decide this issue.

* Case-insensitivity arises throughout the doc, again in 5.2 (‘b’ and ‘B’). As an implementer, I wish I didn’t have to read the text to discover gotchas: either the ABNF should be mixed-case, or declare outright that uppercase is preferred and MUST NOT be propagated.

* 5.3 The motivational scenario for non-vCard-aware consumers of vCard data seems quite made-up. For that matter, the hypothetical search engine just got some witness killed because it indexed the location as TEXT but didn’t obey CLASS ;) And what’s with the sudden preference for lower-case symbols? I was half expecting to be told elsewhere in the doc that TYPEs would be case-insensitive too.

* PREF reminds me of content-negotiation in HTTP. It was a beautiful idea that was convoluted too far to really ever work. The similarities begin with the arbitrary 0-100 scale (oops! 1-100, see? and 1 > 100, don’t forget). This kind of scaling jiggery-pokery is unlikely to interoperate, much less compose (concatenating records from multiple sources). Where is the actual reference to a running system with that much subtlety of expression?
  Editorial recommendation: By contrast, I can bet you that users care immensely about the *exact order* that fields were typed in. Why not make this ORDER, if we don’t want the original sequential occurrence of lines in the file to be controlling law? If anything, I’d wish we ruled out the use of PREF precisely because the same four characters exist in RFC 2426 with a completely different meaning (and it’s a presence/absence token to boot, not a parameter).

* Aside: one new function that’s become integral to my use of an ‘address book’ in the decades since RFC 2426 is the ‘call log’. I can imagine that it might be a useful illustrative example for how to use private extensions and propose new IANA registries to show how a cellphone developer could extend vCard to round-trip their address book more completely…

* 5.5. Is ALTID a number or an entity tag or what? I mean, if every single example uses a digit, I was inclined to believe it was close to an ordinal position referent system, until I was rudely awakened at the very last line of the section that it was text (and hence a string-equivalence index referent system). What do real implementations do when marshaling these graph structures?

* 5.7 “act like tags” is devoid of meaning. What definition or citation are we providing for this? And the cross-dependency with the setting of KIND to ‘individual’ is exactly the kind of English-language lawyering no implementer wants to read up on after getting the ABNF working. In general, I’d strike KIND entirely, until some braver working group wants to pursue “vThing” on its merits, rather than glomming it onto the interchange format for a class of software (address books) that has never aspired to catalog anything more general than methods of addressing communications. In addition, TYPE is one of the more vacuous/contentious four-letter strings in computing.
  Look no further than RFC 2426, where Type exists as a descriptive mixed-case string in every other section header, because then it was being used the way ‘properties’ is in the current spec. “multiple, different uses” is the brightest of bright red flags that a spec will interoperate poorly in the field. I already have nightmarish visions of having to compile truth tables into my validation code that correspond to the listings in this section. To say nothing of making future extensibility a nightmare to maintain, since I’d have to find and recompile all of those lookup tables when a new property comes along…

* 5.7 also contains a typo: FIBURL. Unless that’s a new way to lie to your boss that you aren’t going to be available for that meeting, ever :)

* 5.8 c’mon, really? You’re going to justify *@&# like DEATH on genealogical bases, and I can’t even create a stock vCard for Jesus H. Christ? Where’s B.C.E.? And next I suppose I’ll be told I can’t use http://en.wikipedia.org/wiki/French_Republican_Calendar for cataloging notable guillotinings. Not to mention offending followers of http://en.wikipedia.org/wiki/Hebrew_calendar , http://en.wikipedia.org/wiki/Japanese_era_name , or http://en.wikipedia.org/wiki/Stardate . And don’t forget TIME while you’re at it: http://en.wikipedia.org/wiki/Swatch_Internet_Time
  Punting by saying, “well, there’s only one true defined meaning, but you can plug in anything else you want as x- or ask IANA” is not a decision that supports diversity; it’s a decision to favor exactly one point of view. This kind of Eurocentric dead-white-male crap is inexcusable in my opinion. Either there’s a legitimate degree of freedom required here, with documented, interoperable use cases, or it’s just useless (no, scratch that, *offensive*) preening. Now, of course, it happens to be that every single use of *address book* software that I’ve been exposed to uses Gregorian dates.
  Every other CALSCALE use that might make sense for people, organizations, or things in general is appealing to the existence of software yet unwritten, and if you can use it to create a concordance of Star Trek characters, I’ll bet it won’t be called an “address book” or a “contact card” either.

* 5.10 should cite RFC 5870 ( http://geouri.org/ ) as the recommended URI scheme, or give a forward reference to 6.5.2. Chasing this down was the first time I became aware that “GEO” is *both* a parameter and a property, but I suppose that ship has sailed. No implementer should be burdened by caring about which word-beginning-with-p to use…

* 5.12 solves only one narrow problem. Was there no other way to indicate XML versions without this additional layer of out-of-band signaling? Of course, I’m in favor of dropping the XML property entirely, whether or not there’s ever a standard XML mapping of vCard itself.
  Editorial recommendation: VERSION directly conflicts with the older description of §3.6.9 in RFC 2426 to boot, so I would drop it entirely.

===========

I stopped editing as closely at this point. It’s extremely late tonight on October 11. Here’s the fairly raw dump of the remaining bullet points in my scratchpad buffer. Many may be wrong, of course!

===========

* 5.13 doesn’t permit full MIME types? MIME types have parameters too, so might be worth noting. Or it might be too pedantic for words...

* 6.1.4 Q: where the hell does KIND come from? A: the impulse to model things no address book has ever modeled before.
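To make the §3.3 escaping worry from earlier in these notes concrete, here is a minimal Python sketch of what a reader ends up writing instead of a one-line regex. It assumes the escape set is backslash, comma, semicolon, and newline (written \n or \N); the helper names split_unescaped, unescape, and parse_list are hypothetical and not taken from the draft. Treat it as an illustration of the mechanics, not a conforming parser.

```python
def split_unescaped(value: str, sep: str) -> list[str]:
    """Split value on sep, skipping any separator that is backslash-escaped."""
    parts: list[str] = []
    buf: list[str] = []
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            # Keep the escape pair intact; unescape() resolves it later.
            buf.append(value[i:i + 2])
            i += 2
        elif ch == sep:
            parts.append("".join(buf))
            buf = []
            i += 1
        else:
            buf.append(ch)
            i += 1
    parts.append("".join(buf))
    return parts


def unescape(text: str) -> str:
    """Resolve backslash escapes; \\n and \\N become a newline (assumed set)."""
    out: list[str] = []
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            nxt = text[i + 1]
            out.append("\n" if nxt in "nN" else nxt)
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def parse_list(value: str, sep: str = ",") -> list[str]:
    """Split first, unescape second; the order matters."""
    return [unescape(part) for part in split_unescaped(value, sep)]
```

The scan has to walk escape pairs left to right because an escaped backslash followed by a comma (`a\\,b`) is two items, which a simple "comma not preceded by a backslash" lookbehind gets wrong. For example, `parse_list(r"Smith\, Jr.,Doe")` yields two items, the first containing a literal comma.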
* 6.2.3 I’ve seen nickname used to store screenname. There are a lot more custom services and social networks that need to dump member listings, and I can imagine more will go with the shortcut “username is like a kind of alias or nickname, right?” The 3rd example of nick=boss is especially perverse, since that’s a job title or role relationship and, once again, a level of detail I’ve never imagined any real user typing into a contact system to remind themselves how to refer to a colleague, client, or relative. When examples lack credibility, specifications lack credibility.

* 6.2.4 PHOTO - bare subtypes? Why not just describe it as *incorrect* (but found in the wild; you have to cull the herd at some point)? Having a nearly-mandatory Encoding with a single possible value bothers me, but can’t easily be ruled out. At least we got the option of URIs for this field in return for that complexity.

* 6.2.5 calscale is just getting in the way again here. Why are we so sure of what it can modify if we have no examples of even a second type of calscale? I can make up hypothetical use cases that say even a bare calscale would be helpful, without any date or datetime at all…

* 6.2.10 is absence supposed to be inferred as 0 or 9? Because let me tell you, the politically freighted “not known” is not at all the same as “not included”. Properties ought to have default values, and the debate this one would provoke is worth watching. Furthermore, it’s utterly unreasonable to use a scalar unit (integer) to represent a limited group: it’s a %d and will be used as such. So why not at least mandate a testable rule for interoperability by claiming that unrecognized integers should survive round-trips?
  http://en.wikipedia.org/wiki/Third_gender
  Interestingly, HR-XML mentions that gender specification may be illegal in certain contexts: http://ns.hr-xml.org/2_5/HR-XML-2_5/CPO/GuidelinesForISOUtilities.html section 3 (and yes, I did find http://www.ietf.org/mail-archive/web/vcarddav/current/msg00997.html and the thread at http://lists.w3.org/Archives/Public/public-contacts-coord/2010JulSep/0010.html ).
  Too bad this is only in German: http://translate.google.com/translate?hl=en&sl=de&u=http://de.wikipedia.org/wiki/Datenstandards_zur_Beschreibung_des_Geschlechts&ei=bvqzTPfZFoy2vQPhv7mbCg&sa=X&oi=translate&ct=result&resnum=1&ved=0CBYQ7gEwAA&prev=/search%3Fq%3D%2522ISO%2B5218%2522%2B%252Bintersex%26hl%3Den%26safe%3Doff%26client%3Dsafari%26rls%3Den%26prmd%3Div
  It’s the best reference I could find on the topic, and it shows that ISO 5218 was extended by the central cancer registry (NACCR) with 3/Other (hermaphrodite) and 4/Transsexual. There’s another half-dozen standards in there. Not to mention a telling point that the UK Government Data Standards includes a birth-sex and a current-sex field…

* 6.3.1 positional notation isn’t that much better than CSV, is it really? Even if it is mandated by some other standard (and a citation at this point in the doc would be helpful), why not make the ABNF more developer-friendly by actually marking tokens, in order, for each of the levels? I don’t want to see devs counting ;s…

* 6.3.2 I never thought much about a pragma (escape hatch) just to print a preformatted label. Seems far from DRY, but not worth fighting.

* 6.4.1 Editorial suggestion: make a table? It’s a wall of text and as a dev, I’d appreciate the Least You Have To Know (TM).
  Also, what happened to Telex? :) http://www.cio.com/article/598628/10_Technologies_That_Should_Be_Extinct_but_Aren_t_ http://www.economist.com/node/10609367 (telex: a faint ping; also at http://edwardlucas.blogspot.com/2008/02/telex-lives-on.html ). Telex is still used for ships at sea. This company was advertising on a Google search for the term today: http://plainsailing.org/telex.html And for the final entertainment from the time capsule: how hip ad firms use MCI Mail to send telexes! 1988 story on “THE EXECUTIVE COMPUTER; Sending a Telex From Your Desk” http://select.nytimes.com/gst/abstract.html?res=FB0710FE3F550C778DDDA10894D0484D81

* 6.4.2 interaction with IDNs? non-Internet email addrs? Twitter handles? PEOPLE WILL PUT WHATEVER THEY NEED TO IN THOSE FIELDS, even deadlines shoved into anniversary b/c they need the client name to pop up on their calendar in 6 weeks to sell again. So be aware of that whenever we finally mature to a standard that’s expected to interoperate in the real world, not the lab.

* 6.5.1 Typo: “dailight”

* 6.5.2 We’re not shy about mandating formats for values that allow actual interoperability; I wish I knew from the doc (cf. Mabbett) why there’s a MAY here for other formats. Perhaps we can have vCards for stars, when someone invents a celestial: scheme for right ascension and declination :)

* I found 6.6.2 sufficiently confusing I had to consult the X.520 text. I found a freely-available copy at http://www.itu.int/rec/dologin.asp?lang=e&id=T-REC-X.520-200508-I!!PDF-E&type=items , http://www.x500standard.com/index.php?n=Ig.LatestAvail that confused me further until I hit this phrase: “people sharing the same occupation.” That is, in colloquial English, Role == “occupation” or perhaps “trade”/“profession”. But anyway, I’ll drop it. That this field even exists is a sign of how much Versit, in turn, signed over decision-making to ITU… and we don’t have any statistical data in the wild about how it’s even used, and by whom.

* 6.6.6.
  is aptly numbered. What the heck was going on when the spec wandered into the territory of social applications without a single citation to an authority about this oddly-curated collection of potential relationships? At least XFN can point to an ‘installed base’ of tens of millions of example social graphs in the wild and a field-tested universe of link relations. “private extensions may be used” is no excuse for a lack of discretion. What language even has “supervisee”? You have to escalate from the free Merriam-Webster Online Dictionary because it’s “only available in our premium Merriam-Webster Unabridged Dictionary.”

* 6.7.1 Comparing it to §3.6.1 in RFC 2426, I think it’s nifty to drag the field into the 21st century by recasting it as “tags”, but if we really do mean to accommodate tags, in the sense of the rel=tag microformat, then the values have to be URLs (the visible tag is the final base pathname component). So either the non-normative statement “Also known as "tags".” has some other defensible meaning, or it probably should avoid that four-letter word IMHO. Since it’s a legacy field, I know we can’t change the spelling, but I also find it amusing that it’s the only plural field name in the entire spec…

* Another bug going back to §3.6.2 of RFC 2426 is basing NOTE on the “X.520 Description”: the actual X.520 spec is “text that describes the associated object” rather than “supplemental information.” More like an acronym expansion than a warning that the fax machine will be off-hook after 5:15PM. And, of course, neither definition reflects what really goes into a NOTE field: CRM, cases, birthdays of your client’s children, and so on…

* 6.8.1 When upgrading CLASS with additional informative text about the basically-useless PRIVATE/PUBLIC/CONFIDENTIAL distinction dating back to 3.7.1 in RFC 2426, has anyone ever field tested what distinction a developer would naively draw between PRIVATE and CONFIDENTIAL?
  Pop quiz time… I think the explanatory text is a valiant effort to imply some behavior never seen enforced by interoperable running code instead of putting a stake through the heart of unused fields. As stated in §9, it is merely “desired,” but by whom? After all, “That policy is not enforced in any way” (as it was in 2426, where they chose not to dignify CLASS with definitions; CLASS is absent from vCard 2.1, http://www.imc.org/pdi/vcard-21.txt ). This also makes a mockery of the template in 10.2.6. for “TOP SECRET”. How about actually useful examples, such as perhaps a “NOINDEX” classification to warn other services not to “crawl” that card for searching…?

* 6.8.2 are there examples of KEY in the wild? Extending its definition from vCard3 to use FMTTYPE permits the addition of MIME typing information, which from my humble knowledge of the field, is still inadequate to the task of identifying how the key would be used. But I also can’t recommend opening up this can of worms to permit a common, easily used non-“b” encoding of key as text (e.g. SSH, in RFC 4716; or PEM, in RFC 1421; or Asymmetric Key Packages, RFC 5958; see http://www.cryptosys.net/pki/rsakeyformats.html for even more).
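A footnote to the 4.6 FLOAT point earlier in these notes: the hazard is easy to show in Python. The sketch below assumes the FLOAT grammar is an optional sign, one or more digits, and an optional fractional part; the names FLOAT_RE, is_vcard_float, and format_vcard_float are hypothetical. It is one possible way to avoid exponents, not a recommendation from the draft.

```python
import math
import re
from decimal import Decimal

# Assumed grammar shape: optional sign, digits, optional fractional part.
FLOAT_RE = re.compile(r"[+-]?[0-9]+(\.[0-9]+)?")


def is_vcard_float(text: str) -> bool:
    """True if text fits the assumed exponent-free FLOAT shape."""
    return FLOAT_RE.fullmatch(text) is not None


def format_vcard_float(x: float) -> str:
    """Render x in plain positional notation, never with an exponent."""
    if not math.isfinite(x):
        raise ValueError("no positional form for NaN or infinity")
    # repr() gives the shortest digits that round-trip; Decimal's "f"
    # format then spells them out without an exponent.
    return format(Decimal(repr(x)), "f")
```

In Python the default conversions fail in two different ways: `repr(1e-07)` produces an exponent (`'1e-07'`), while `"%f" % 1e-07` silently truncates to `'0.000000'`. Going through `Decimal` keeps all the digits and stays within the assumed grammar.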