Re: Please comment on draft-duerst-mailto-bis-04.txt
"Frank Ellermann" <nobody@xyzzy.claranet.de> Sun, 20 January 2008 11:57 UTC
To: ietf-822@imc.org
From: Frank Ellermann <nobody@xyzzy.claranet.de>
Subject: Re: Please comment on draft-duerst-mailto-bis-04.txt
Date: Sun, 20 Jan 2008 12:57:09 +0100
Organization: <http://purl.net/xyzzy>
Lines: 262
Message-ID: <fmvcto$njp$1@ger.gmane.org>
References: <6.0.0.20.2.20080105153859.07b87790@localhost>
Reply-To: Frank Ellermann <hmdmhdfmhdjmzdtjmzdtzktdkztdjz@gmail.com>
Cc: uri@w3.org

Martin Duerst wrote:

> I expect to submit it to the IESG soon.

The draft doesn't depend on the improvements in 2822upd or
4234bis, referencing RFC 2822 and 4234 is no issue.

> [RFC 2368]
> contains some advice against using a bcc field in a
> mailto: URI, but this doesn't seem to be followed,
> and we were unable to find any reason, so we removed it.

MUAs might not support Bcc, might not display it in their
default configuration, or might reject attempts to preset Bcc.
Better keep the advice; in an ordinary public mailto: URL
Cc: has more or less the same effect.

> I'm not subscribed to ietf-822@imc.org, so please
> keep me (and my coauthors) in the cc.

Can't do, will send a separate copy from the Reply-To address,
URI list added to the Cc:

==================================================
= Review of mailto-bis-04 (top down single pass) =

You use %2C to separate <addr-spec>s derived from the RFC 2368
syntax #mailbox based on RFC 822 2.7. Among other # oddities
that forbids runs of commas. How does %2C match the overall
STD 66 syntax?

If xxx in a mailto:xxx?yyy pattern is a <hier-part>, and that's
a <path-rootless>, finally arriving at <segment-nz>, then an
unencoded comma is a <pchar> matching <sub-delims>, so why do you
use %2C here? My old browser had issues with commas in URLs,
but it was implemented years before STD 66 was written.

The difference between your <some-delims> and the <sub-delims>
plus ":" and "@" in STD 66 appears to be "&" and "=" (you need
them for &hname=hvalue) PLUS "/" and "?". What's the reason
to exclude "/" and "?" from <some-delims>? Whatever you end
up with, please explain it in the draft, comparing obscure
ASCII subsets is a PITA.

A note about <addr-spec> says that some characters have to be
percent-encoded because they are not allowed in an STD 66 URL.
It took me about a year to understand that that's beside the
point for the purpose of news: URLs. Some characters have to
be percent-encoded because they otherwise don't match <pchar>
in STD 66.
That's a subtle difference.

You write that of 'the characters in sub-delims, at least the
following also have to be percent-encoded: "&", ";", and "="'

I don't see why: you need "&" and "=" only after the "?", not
before it in the address list, and you don't do anything special
with ";". Percent-encoding "," *within* <addr-spec> would make
sense if you used an unencoded comma as delimiter for the address
list, but at the moment the draft uses %2C as the delimiter.

Testing mail to "co,ma"@example + "am,oc"@example

1: mailto:%22co,ma%22@example%2C%22am,oc%22@example

That would be IMO strange, cleaner versions could be

2: mailto:%22co,ma%22@example,%22am,oc%22@example
3: mailto:%22co%2Cma%22@example,%22am%2Coc%22@example

(2) keeps the comma as is no matter what its purpose is,
(3) uses an unencoded comma to separate addresses.

You forbid NO-WS-CTL and <obs-local-part>, please add
<obs-domain>. That's crap like user@example(oops).com with
comments, folding, and white space on both sides of the dots
separating domain labels.

IMO you need a MUST NOT for all obs-cenities mirroring the
same MUST NOT in RFC 2822 and 2822upd. If 2822upd adds
NO-WS-CTL to "obs" and you upgrade the normative reference,
remove the then redundant MUST NOT NO-WS-CTL.

Note (3) about comments and whitespace in a local part is
ambiguous, you want to forbid CFWS, but likely not
<quoted-pair> horrors like "sp\ ce"@example - I think that
works, mailto:%22sp%5C%20ce%22@example is okay (?)

BTW, for reasons unknown to me "bare space" is not allowed in
<quoted-string>, recently discussed on the SMTP list. This
could be a bug in RFC 2822 not yet fixed in 2822upd, maybe a
missing SP in <qtext> (?!?)

Note (4) is odd, there are no "non-ASCII" characters in
domains used for (non-EAI) e-mail, so why discuss it, or limit
it to domains? You could just say that all percent-encoded
characters that are not ASCII are supposed to be UTF-8 for
compatibility with RFC 3987 and ongoing I18N work (EAI, IDN,
IRI).
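
To make the two representations of a non-ASCII domain concrete,
here is a minimal sketch, assuming Python's standard library.
The domain "bücher.example" and the local part "user" are
hypothetical, and the built-in "idna" codec implements IDNA 2003
(RFC 3490), which may differ from whatever a given UA does.

```python
from urllib.parse import quote

# Hypothetical internationalized domain, used only for illustration.
u_label_domain = "bücher.example"

# Percent-encoded UTF-8 form: each UTF-8 octet of "ü" becomes %XX.
percent_encoded = quote(u_label_domain)

# A-label form via Python's built-in "idna" codec (IDNA 2003, RFC 3490).
a_label_domain = u_label_domain.encode("idna").decode("ascii")

print("mailto:user@" + percent_encoded)
print("mailto:user@" + a_label_domain)
```

Only the second form consists purely of LDH labels, so a consumer
that knows nothing about U-labels can hand it to SMTP unchanged.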

While you're at it please note that this won't work as expected
with many UAs and so SHOULD be avoided at the moment.

All mailto: URI producers SHOULD use the A-label form of
domains, URI consumers might have no idea what U-labels are,
and percent-encoding doesn't help with this issue. It only helps
in non-UTF8 documents for native mailto: IRIs not in the
document charset.

Or maybe for *all* mailto: IRIs in non-UTF8 documents, the
Firefox 2 bug for your non-UTF-8 IRI tests likely also
affects mailto:, not only http:

| When the internationalized domain name is used to
| compose a message, the name must be transformed to
| the IDNA encoding where appropriate [RFC3490].

The appropriate place is IMO the mailto URI producer.

| The considerations for reg-name in [STD66] apply.

I'm not aware of a registry permitting %-characters in their
<reg-name>s. Please let's focus on the LDH-labels as required
for SMTP in mailto: URLs. We can do UTF8SMTP etc. later,
mailto is complex enough. :-(

Okay, you have "should A-label" at the end of (4), I proposed
a kind of "temporary" SHOULD above.

Notes 4 and 5 are far too long and confusing, chapter 2 is for
folks trying to understand the mailto: syntax, incl. users not
familiar with EAI / IDN / IRI / I18N. Notes 4 and 5 could be
subsections of a section with "I18N considerations".

IMO any I18N in 2368bis is irrelevant as long as the draft
doesn't at least work for ASCII and STD 66.

| Percent-encoding is needed for the same characters
| as listed above for "addr-spec".

Stupid question, why? Behind the "?" you don't need to worry
about "/" and "?" anymore (just an example).

| The "body" hname should contain the content for the
| first text/plain body part of the message.

s/hname/hvalue/

The body= concept is weird. It's not clear what the charset
is: if you assume UTF-8 it might not fly for old UAs, assuming
the document charset is worse, and assuming the local charset of
the UA also makes no sense. Better deprecate it.
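
The charset ambiguity for body= can be shown with a minimal
sketch, assuming Python's standard library. The URI and the helper
decode_hvalues are hypothetical; the point is only that the same
percent-encoded octets yield different text depending on which
charset the consumer assumes.

```python
from urllib.parse import parse_qs

def decode_hvalues(mailto_uri, charset):
    """Return {hname: [hvalue, ...]} with hvalues decoded as `charset`."""
    _, _, query = mailto_uri.partition("?")
    return parse_qs(query, encoding=charset, errors="replace")

# Hypothetical URI; the producer percent-encoded "café" as UTF-8 octets.
uri = "mailto:user@example.com?body=caf%C3%A9"

as_utf8 = decode_hvalues(uri, "utf-8")["body"][0]
as_latin1 = decode_hvalues(uri, "iso-8859-1")["body"][0]

print(as_utf8)    # what a UTF-8 aware consumer sees
print(as_latin1)  # what a consumer assuming ISO-8859-1 sees
```

Nothing in the URI itself tells the consumer which of the two
readings the producer intended.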
Who implemented body= anyway, and how bad was it?

| Non-ASCII characters can be encoded in hvalue as follows:

Indeed, it works like a charm, but not for a body= hvalue.

| Non-ASCII characters can be encoded according to UTF-8
| [STD63], and then each octet of the corresponding UTF-8
| sequence is percent-encoded to be represented as URI
| characters.

But that doesn't fly. The MUA started by the browser (if
that's what happens) can assume that it's running in the local
charset of the operating system, it doesn't need to support
UTF-8 at all.

The only interoperable solution is what you have as (1),
RFC 2047 + 2231. (2) doesn't work, not before UTF-8 is the
only charset worldwide outside of museums. You cannot decree
this beyond what RFC 2277 did.

It's the job of the URI producer to get it right, mailto: URLs
prepare a message/rfc822 with a US-ASCII header. The URI
consumer is the weaker part - you have to protect consumers for
interoperability, not force them to upgrade when they're not
ready for it.

Shift all "UTF-8 and beyond" issues into "I18N considerations" -
they muddy the water for the job at hand: define a STD 66
compatible mailto: URL preparing a 2822upd + MIME compatible
message/rfc822.

| mailto:?to=addr1@an.example%2C%20addr2@an.example

If my <segment-nz> theory has merits this is not "nz", i.e.
syntactically invalid. I'm too lazy to check this against
the regular expression in STD 66.

An hname "to", like "bcc", let alone "body", is a bad idea and
best avoided. The "to" function belongs to "mailto", not into
the query part - a very simple mailto: approach could be to
ignore query parts.

| A mailto URI designates an "internet resource", which
| is the mailbox specified in the address.

I'm not sure about this, the "resource" appears to be a
"proto"-message/rfc822, which can be sent to one or more
mailboxes with SMTP (or whatever, but IIRC 2822upd only
mentions SMTP, not UUCP / LMTP / ...) when it's ready.
Other URI schemes and a MIME access-method deal with a
"mailbox" as a "resource". Maybe I'm confused, or maybe it
would help if you say "one or more".

| The operation of how any URI scheme is resolved is not
| mandated by the URI specifications.

Depends, the nntp: URL scheme is designed for NNTP, it would
be slightly more complex if it were designed for article
numbers on "non-NNTP" news servers (example). The NNTP
details are not mandated, but the design fits. JFTR, I guess
we agree.

You have a good In-Reply-To example in 7.1, please add this
hname to the "safe and useful" list in section 4. Please
remove body= from "safe and useful", it's neither safe nor
useful, it's a bad idea to start with.

While keywords= is safe, it is rarely used. IMO cc= is a
more realistic candidate for "safe and useful". Other bad
ideas (in addition to body=, bcc=, and to=) are date= or
message-id= for obvious reasons.

Maybe enumerate all potentially "safe and useful" header
fields: subject=, cc=, in-reply-to=, keywords=, is that
really all?

| When producing mailto: URIs, all spaces SHOULD be
| encoded as %20.

The given reasons are compelling, justifying MUST. The "+"
or "_" hacks are for Google or Wikipedia. Not too bad in
subject= or keywords=, but not in addresses etc.

| The mailto URI scheme is limited in that it does not
| provide for substitution of variables.

That's not a specific mailto: limitation, you could say
"URI schemes are"...

Section 6 is where you could put all UTF-8 notes, maybe rename
it to "Internationalization considerations" as proposed in
RFC 2277. As is, section 6 is a lame excuse for breaking
existing software and annoying poor users.

7.1 + 7.2 are excellent.

| Applicable protocol:
| None. This registration is made to assure that
| this header field name is not used at all, in order
| to not create any problems for mailto: URIs.

It suffices to reserve it for "mail", I don't see how it could
affect "http" or "news".
SIP apparently has its own way to register header fields
=> not your problem.

Frank
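
Addendum to the comma test in the review (variants 1 to 3): a
minimal sketch, assuming Python's standard library and a
hypothetical consumer that splits the address part on unencoded
commas first and percent-decodes each piece afterwards. That
processing order is an assumption for illustration, not something
the draft specifies.

```python
from urllib.parse import unquote

def split_then_decode(to_part):
    """Split on unencoded commas first, then percent-decode each piece."""
    return [unquote(piece) for piece in to_part.split(",")]

# The three encodings of "co,ma"@example + "am,oc"@example from the review.
variant_1 = "%22co,ma%22@example%2C%22am,oc%22@example"
variant_2 = "%22co,ma%22@example,%22am,oc%22@example"
variant_3 = "%22co%2Cma%22@example,%22am%2Coc%22@example"

for name, value in [("1", variant_1), ("2", variant_2), ("3", variant_3)]:
    print(name, split_then_decode(value))
```

Under this assumption only variant 3 yields exactly the two
intended addresses; variants 1 and 2 are cut apart inside the
quoted local parts.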