Re: Gen-ART LC review of draft-ietf-eai-utf8headers-09.txt

Harald Tveit Alvestrand <harald@alvestrand.no> Mon, 24 March 2008 11:47 UTC

Message-ID: <47E79499.3060302@alvestrand.no>
Date: Mon, 24 Mar 2008 12:46:33 +0100
From: Harald Tveit Alvestrand <harald@alvestrand.no>
User-Agent: Thunderbird 2.0.0.12 (X11/20080227)
MIME-Version: 1.0
To: Spencer Dawkins <spencer@wonderhamster.org>
Subject: Re: Gen-ART LC review of draft-ietf-eai-utf8headers-09.txt
References: <001001c88c18$a841afc0$6501a8c0@china.huawei.com> <47E6B2EF.50109@alvestrand.no> <002801c88d34$93bde290$6401a8c0@china.huawei.com>
In-Reply-To: <002801c88d34$93bde290$6401a8c0@china.huawei.com>
X-Virus-Scanned: by amavisd-new at alvestrand.no
Cc: General Area Review Team <gen-art@ietf.org>, ietf@ietf.org, "Abel Yang (editor)" <abelyang@twnic.net.tw>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: ietf-bounces@ietf.org
Errors-To: ietf-bounces@ietf.org

Spencer Dawkins wrote:
> Hi, Harald,
>
> Thanks for the quick feedback (Gen-ART reviewers like this because we 
> can remember writing the review, and at least part of what we were 
> thinking about :-)
>
> Looks like mostly goodness. If we're in synch, I dropped it from this 
> e-mail.
>
> Spencer
>
>
>>> 1.2.  Relation to other standards
>>>
>>>   This document also updates [RFC2822] and MIME, and the fact that an
>>>   experimental specification updates a standards-track spec means that
>>>   people who participate in the experiment have to consider those
>>>   standards updated.
>>>
>>> Process: The ID Tracker is showing this draft in Last Call status, but I
>>> can't find (in the archive or in my personal folders) any Last Call
>>> announcement, which I was looking for, in order to check how Chris explained
>>> the downref at Last Call time - I'm expecting that it will be quite
>>> entertaining. Has anyone else seen such an announcement on IETF Announce?
>> Note: Intended status is Experimental.
>>
>> The subject line of the Last Call was
>>
>> Last Call: draft-ietf-eai-smtpext (SMTP extension for 
>> internationalized email address) to Experimental RFC
>>
>> and covered 2 drafts; this may be why you did not find it.
>
> Exactly right (I was scanning by subject). While I'm amazed that the 
> downref isn't being called out in the Last Call announcement, I think 
> RFC tracks and standards levels are so arbitrary that they are 
> useless, so I'm not complaining - I was trying to figure out if there 
> really had been a Last Call announcement sent, that's all.
I actually don't see a downref here - this is an Experimental updating a 
Draft Standard (or Full; I don't remember current status well). If 
anything, this is unusual as an upref, not a downref....
>
>>> 4.  Changes on Message Header Fields
>>>
>>>   This protocol does NOT change the definition of header field names.
>>>
>>> technical: I'm confused here. Is this text saying "does not change header
>>> field names"? I would have thought this specification is exactly changing
>>> the definition of header field names...
>> It does not change the definition of header field NAMES (which remain 
>> ASCII), but changes the definition of header field BODIES (which used 
>> to be ASCII, but are now UTF-8).
>>>
>>>   That is, only the bodies of header fields are allowed to have UTF-8
>>>   characters; the rules in [RFC2822] for header field names are not
>>>   changed.
>> And this sentence is saying that. How can we express this more clearly?
>
> Ah. You filled in the missing piece for me here. Perhaps something like
>
> "This protocol does NOT change the [RFC2822] rules for defining header 
> field names. The bodies of header fields are allowed to contain UTF-8 
> characters, but the header field names themselves must contain ASCII 
> characters."
That seems like a good editorial suggestion to me. Thanks!
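
To make the name/body distinction concrete, here is a minimal sketch in
Python. The helper name check_header_field is hypothetical and appears in
neither the draft nor [RFC2822]; it assumes a single, already unfolded
header line. The field name is held to the printable US-ASCII range that
[RFC2822] permits (octets 33-126, excluding the colon), while the field
body only has to be valid UTF-8.

```python
from typing import Tuple


def check_header_field(raw: bytes) -> Tuple[str, str]:
    """Split one unfolded header line into (name, body).

    Hypothetical helper for illustration only; not taken from the draft.
    """
    # Split at the first colon, so the name can never contain one.
    name_bytes, sep, body_bytes = raw.partition(b":")
    if not sep:
        raise ValueError("missing ':' separator")
    # Field names stay ASCII: printable US-ASCII octets 33-126 only.
    if not name_bytes or any(b < 33 or b > 126 for b in name_bytes):
        raise ValueError("header field name must be printable ASCII")
    # Field bodies may carry UTF-8; decode() raises UnicodeDecodeError
    # (a subclass of ValueError) if the octets are not valid UTF-8.
    body = body_bytes.decode("utf-8").strip()
    return name_bytes.decode("ascii"), body


# A UTF-8 body is accepted; a non-ASCII name would raise ValueError.
name, body = check_header_field("Subject: Gr\u00fc\u00dfe".encode("utf-8"))
```

A real parser would also have to deal with folding and with structured
header fields; the sketch only shows where the ASCII rule ends and the
UTF-8 rule begins.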
>
>>>   Interoperability considerations:  The media type provides
>>>      functionality similar to the message/rfc822 content type for email
>>>      messages with international email headers.  When there is a need
>>>      to embed or return such content in another message, there is
>>>      generally an option to use this media type and leave the content
>>>      unchanged or downconvert the content to message/rfc822.  Both of
>>>      these choices will interoperate with the installed base, but with
>>>      different properties.  Systems unaware of international headers
>>>      will typically treat a message/global body part as an unknown
>>>      attachment, while they will understand the structure of a message/
>>>      rfc822.  However, systems which understand message/global will
>>>      provide functionality superior to the result of a down-conversion
>>>      to message/rfc822.  The most interoperable choice depends on the
>>>      deployed software.
>>>
>>> technical: not sure what the last sentence actually means. "We don't know
>>> what the most interoperable choice will be"? Text in the same paragraph says
>>> both choices are interoperable. If that text is correct, I don't understand
>>> what you're saying here.
>> Would it be better to say "the most useful choice"? It's likely to be 
>> the difference between a compliant MUA offering to dump the message 
>> to a file and displaying it as a message...
>
> "The most useful choice" seems very reasonable. The current text seems 
> to contradict other text in the same paragraph.
>
>>> 5.  Security Considerations
>>>
>>>   Because UTF-8 often requires several octets to encode a single
>>>   character, internationalized local parts may cause mail addresses to
>>>   become longer.  As specified in [RFC2822], each line of characters
>>>   MUST be no more 998 octets, excluding the CRLF.
>>>
>>> clarity: s/CRLF/CRLF, even when UTF-8 characters are being used/
>>>
>>>   Because internationalized local parts may cause email addresses to be
>>>   longer, processes which parse, store, or handle email addresses or
>>>   local parts must take extra care not to overflow buffers, truncate
>>>   addresses, exceed storage allotments, or, when comparing, fail to use
>>>   the entire length.
>>>
>>> technical: this is great advice, but I don't understand how UTF-8 changes
>>> the situation. If you aren't changing the 998-octet requirement, software
>>> that breaks for UTF-8 would also break for ASCII headers with the same octet
>>> length.
>> If someone uses another representation internally (for instance 
>> UTF-16), and has a 998-character buffer, that will sometimes fit into 
>> 998 octets of UTF-8, and sometimes not. The same goes in the other 
>> direction.... I'm sure others will think of other cases.
>
> Thanks for the clear explanation here. This is headed in the right 
> direction - I wasn't impressed with guidance that says "take extra 
> care", but saying "must accommodate 998 characters (which may require 
> more than 998 octets, depending on the character set in use), and must 
> not overflow buffers, ..." seems clear enough to me.
I think it's more like "must accommodate 998 octets, and not send more 
than 998 octets, even though the relationship between this number and 
the number of UTF-8 characters is not a simple one". I see that Klensin 
has picked up on this for 2821, too.
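
A small Python sketch of why the relationship is not a simple one. The
function names sizes and fits_line_limit are made up for this example;
only the 998-octet limit comes from [RFC2822]. It counts the same string
three ways: as code points, as UTF-8 octets, and as UTF-16 code units
(the internal representation mentioned earlier in the thread).

```python
MAX_LINE_OCTETS = 998  # line limit from RFC 2822, excluding the CRLF


def sizes(text: str) -> dict:
    """Measure one string as code points, UTF-8 octets, UTF-16 units."""
    return {
        "characters": len(text),
        "utf8_octets": len(text.encode("utf-8")),
        # utf-16-le adds no byte order mark; each code unit is 2 bytes.
        "utf16_units": len(text.encode("utf-16-le")) // 2,
    }


def fits_line_limit(line: str) -> bool:
    """True if the line, encoded as UTF-8, stays within 998 octets."""
    return len(line.encode("utf-8")) <= MAX_LINE_OCTETS


ascii_line = "a" * 998          # 998 characters, 998 octets
nordic_line = "\u00e5" * 998    # 998 characters, 2 octets each in UTF-8
emoji_line = "\U0001F600" * 300  # 4 octets / 2 UTF-16 units each

for line in (ascii_line, nordic_line, emoji_line):
    print(sizes(line), fits_line_limit(line))
```

The second line fits a 998-unit UTF-16 buffer but needs 1996 octets on
the wire, so a check against the limit has to be made on the encoded
octets, not on the character count.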

Thanks for the review!

                Harald