Re: [EAI] Question about draft-ietf-eai-rfc5335bis-12.txt

Frank Ellermann <hmdmhdfmhdjmzdtjmzdtzktdkztdjz@gmail.com> Wed, 05 October 2011 21:15 UTC

On 5 October 2011 21:45, Julien ÉLIE wrote:

Hi, I'll pick your easiest question:

> * Shouldn't a way to distinguish currently suggested UTF-8 from
>   a possible upcoming suggested encoding (UTF-32, UTF-128 or
>   anything that could be standardized in a few years) be provided?

UTFs that are unsuited for MIME, including those with NUL octets
such as UTF-32, will never be used for Internet messages.  All new
Internet protocols already MUST support at least UTF-8; this has
been IETF policy since 1998, as stated in RFC 2277 (= BCP 18).
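
To illustrate the NUL-octet point, here is a minimal Python sketch.
The sample string and the helper name has_nul are chosen only for
illustration; the sketch simply checks which encoded forms contain
a 0x00 octet.

```python
# Illustrative sketch: which UTF encodings produce NUL octets?
sample = "Julien ÉLIE"  # sample text, chosen only for illustration


def has_nul(text: str, encoding: str) -> bool:
    """Return True if the encoded form of text contains a 0x00 octet."""
    return b"\x00" in text.encode(encoding)


for enc in ("utf-8", "utf-16-le", "utf-32-le"):
    # UTF-8 never emits 0x00 unless the text itself contains U+0000;
    # UTF-16 and UTF-32 pad every ASCII character with zero octets.
    print(enc, has_nul(sample, enc))
```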

In theory UTF-7 and BOCU-1 are suited for MIME, but if you MUST
support UTF-8, and the protocols or document formats (= here the
message header fields) offer no way to negotiate something else,
it will be UTF-8.  It can be limited to subsets such as NFC; see
RFC 5198 for the details affecting all protocols remotely related
to Telnet (incl. WHOIS, SMTP, FTP, HTTP, NNTP).  Actually HTTP is
a special case:  it embraced Latin-1 before UTF-8 and RFC 2277
existed.  The HTTPbis WG has "fun" with this issue, but they are
certainly also trying to adopt UTF-8, not something else.
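
As a rough illustration of what "limited to NFC" means in practice,
the following Python sketch normalizes a decomposed string to NFC
using the standard unicodedata module before encoding it as UTF-8.
The variable names and the sample text are assumptions made for the
example, not anything taken from RFC 5198.

```python
import unicodedata

# 'E' followed by U+0301 COMBINING ACUTE ACCENT (decomposed form)
decomposed = "E\u0301LIE"

# NFC composes the pair into the single code point U+00C9
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed), len(composed))
print(composed.encode("utf-8"))
```

Both strings render identically, but only the NFC form is what a
protocol restricted to NFC would expect to see on the wire.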

In other words, as far as the Internet is concerned we will get
UTF-8 everywhere; anything else would be science fiction.  In a
bold statement RFC 2277 predicts that this transition will take
at least 50 years.

> Is "message/global" the way to know UTF-8 is expected?  Isn't
> it necessary to mention UTF-8 somewhere (in a "charset" value)?

The message/global business is about the message header fields;
for message/rfc822 these are ASCII.  The content (message body)
is still ASCII by default unless a MIME header field says
something else.  Something else can be anything (not limited to
UTF-8), either encoded for 7bit (base64 or quoted-printable) or
sent as 8bit; with UTF8SMTP, 8bit needs no transfer encoding for
MIME-compatible charsets such as UTF-8, BOCU-1, Windows-1252, etc.
UTF8SMTP does not allow "unencoded binary" or charsets such as
UTF-16, UTF-32, or SCSU in the message body.
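
A small sketch of the body side, using Python's standard email
package:  the charset is declared in Content-Type, and the same
UTF-8 text is either encoded for 7bit transport (quoted-printable)
or passed through as 8bit.  The helper name build_body and the
sample text are made up for this example, and the sketch says
nothing about header fields or message/global itself.

```python
from email.message import EmailMessage

TEXT = "Café à la carte\n"  # sample body, chosen only for illustration


def build_body(cte: str) -> EmailMessage:
    """Build a text/plain UTF-8 message with the given transfer encoding."""
    msg = EmailMessage()
    msg.set_content(TEXT, charset="utf-8", cte=cte)
    return msg


qp = build_body("quoted-printable")   # safe for 7bit transport
raw = build_body("8bit")              # needs an 8bit-clean path

for msg in (qp, raw):
    print(msg["Content-Type"])
    print(msg["Content-Transfer-Encoding"])

# The quoted-printable payload carries the UTF-8 octets as =XX escapes
print(qp.get_payload())
```

In both cases the receiver learns the charset from the MIME header
field, not from the transfer encoding.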

> * Can an internationalized message contain a multipart
>   content-type using an UTF-8 "boundary" value?

Good question.  Without cheating (= looking into the EAI drafts
to see whether they cover MIME Content-* header fields somewhere)
I don't know the answer.

-Frank