Re: [openpgp] Character encodings

Tim Bray <tbray@textuality.com> Tue, 17 March 2015 19:01 UTC

Date: Wed, 18 Mar 2015 08:00:56 +1300
Message-ID: <CAHBU6isuaGx_=0hBQUJ6LNMdSGJDJ8t0s0jhiZCVOe6znB7G2g@mail.gmail.com>
From: Tim Bray <tbray@textuality.com>
To: Wyllys Ingersoll <wyllys@gmail.com>
Archived-At: <http://mailarchive.ietf.org/arch/msg/openpgp/84jk7y16xXafLUXl-1PVNFvKZeI>
Cc: openpgp@ietf.org
Subject: Re: [openpgp] Character encodings

This would be a huge step backward. The proportion of text on the internet
that is UTF-8 is monotonically increasing toward 100%. Thank goodness.

On Mar 18, 2015 4:38 AM, "Wyllys Ingersoll" <wyllys@gmail.com> wrote:

> One area that I think needs some attention is the character encoding and
> charsets for encrypted text messages.
>
> RFC 4880 says that everything should be UTF-8.  However, the reality is that
> UTF-8 is not used everywhere and there are lots of clients that compose
> messages in their native preferred character set (Latin5, Greek, Kanji,
> etc.) and it's very difficult as an implementor to figure it out after the
> fact without some indication from the sender.
>
> The literal packet format only specifies 3 possible values - binary, UTF-8,
> or plain text.  The ASCII Armor header may specify a different charset (though
> unfortunately very few agents add the "Charset" PGP header).
> Additionally, if the message had MIME headers, there may be yet another
> charset indicated in MIME that differs from the ASCII Armor charset and the
> literal packet data format byte.
>
> If the encrypting PGP software knows what character encoding was used to
> compose the original message, there should be some way to communicate this
> in the message that would be definitive so that the decrypting software can
> present it the way it was originally intended.  As an implementor, this is
> one of the trickiest areas to get right so that the end user sees the
> message as it was originally intended.
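
For readers following the thread, the three charset hints named in the quoted
message (the literal packet format byte, the armor "Charset" header, and a MIME
charset) could be reconciled roughly as sketched below. This is only an
illustration, not anything RFC 4880 prescribes: the function name
decode_literal, its parameters, and the order in which hints are tried are all
assumptions. The format octets 'b', 't', and 'u' are the ones RFC 4880 defines
for literal data. The sketch tries strict UTF-8 first because text in a legacy
8-bit charset rarely happens to be valid UTF-8.

```python
from typing import Optional

# Data format octets for the Literal Data packet (RFC 4880, section 5.9).
FORMAT_BINARY = ord("b")
FORMAT_TEXT = ord("t")
FORMAT_UTF8 = ord("u")


def decode_literal(data: bytes, fmt: int,
                   armor_charset: Optional[str] = None,
                   mime_charset: Optional[str] = None) -> str:
    """Hypothetical helper: choose a charset for decrypted literal data."""
    if fmt == FORMAT_BINARY:
        raise ValueError("binary literal data is not text")
    if fmt == FORMAT_UTF8:
        # The sender declared UTF-8, so decode strictly and let errors surface.
        return data.decode("utf-8")
    # Format 't': try strict UTF-8 first, then the sender's hints.
    # This ordering is an assumption made for illustration only.
    for charset in ("utf-8", armor_charset, mime_charset):
        if not charset:
            continue
        try:
            return data.decode(charset)
        except (UnicodeDecodeError, LookupError):
            # Bytes do not fit this charset, or its name is unknown to Python.
            continue
    # Last resort: keep what is readable and mark the rest as U+FFFD.
    return data.decode("utf-8", errors="replace")
```

With Greek text encoded as ISO-8859-7 and an armor hint of "iso-8859-7", the
strict UTF-8 attempt fails and the hint is used. Without any hint the sketch
can only fall back to replacement characters, which is the difficulty the
quoted message describes.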