Re: [openpgp] A way to securely define cleartext signature charset

Andre Heinecke <aheinecke@intevation.de> Tue, 11 September 2018 08:36 UTC

From: Andre Heinecke <aheinecke@intevation.de>
To: openpgp@ietf.org
Cc: Jon Callas <joncallas@icloud.com>
Date: Tue, 11 Sep 2018 10:36:32 +0200
Archived-At: <https://mailarchive.ietf.org/arch/msg/openpgp/wI6FqRJC0EGGEpd2DbYOZu7Mdfs>

Hi Jon, Neil,

thanks for your comments!

On Monday, September 10, 2018 4:53:03 PM CEST Jon Callas wrote:
> > On Sep 10, 2018, at 11:23 AM, Neil Hunsperger <Neil_Hunsperger=40symantec.com@dmarc.ietf.org> wrote:
> > I'll add a data point. Some years back, the PGP Desktop product added an
> > unsigned "Charset" header to its ASCII armor. The result looked like this:

That would also be an option. It is not my preference because the header 
would be unsigned, but it would already help with the usability issues.
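
To illustrate what reading such an unsigned header could look like, here is a 
minimal sketch. It assumes the "Key: Value" armor header layout from RFC 4880 
section 6.2 (headers end at the first blank line). The function names and the 
sample block are made up for illustration; the sample is not actual PGP 
Desktop output, and a missing header falls back to the UTF-8 default.

```python
from typing import Dict, List


def parse_armor_headers(lines: List[str]) -> Dict[str, str]:
    """Collect 'Key: Value' armor headers that follow a BEGIN line.

    Assumes lines[0] is the '-----BEGIN ...-----' line and that the
    headers end at the first blank line.
    """
    headers: Dict[str, str] = {}
    for line in lines[1:]:
        if not line.strip():      # blank line terminates the headers
            break
        key, sep, value = line.partition(": ")
        if sep:                   # skip anything that is not 'Key: Value'
            headers[key] = value.strip()
    return headers


def armor_charset(lines: List[str], default: str = "utf-8") -> str:
    """Return the declared Charset, or the UTF-8 default if none is given."""
    return parse_armor_headers(lines).get("Charset", default)


# Hypothetical sample block, invented for this sketch.
sample = [
    "-----BEGIN PGP SIGNED MESSAGE-----",
    "Hash: SHA256",
    "Charset: ISO-8859-1",
    "",
    "Grüße aus Osnabrück",
]
print(armor_charset(sample))
```

Because the header is not covered by the signature, the value it returns is 
only a hint: anyone in transit can change it without invalidating the 
signature.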

> And for what it’s worth, section 6.2 of RFC 4880 says:
> 
>      - "Charset", a description of the character set that the plaintext
>        is in.  Please note that OpenPGP defines text to be in UTF-8.  An
>        implementation will get best results by translating into and out
>        of UTF-8.  However, there are many instances where this is easier
>        said than done.  Also, there are communities of users who have no
>        need for UTF-8 because they are all happy with a character set
>        like ISO Latin-5 or a Japanese character set.  In such instances,
>        an implementation MAY override the UTF-8 default by using this
>        header key.  An implementation MAY implement this key and any
>        translations it cares to; an implementation MAY ignore it and
>        assume all text is UTF-8.

There is indeed very little that is actually defined in this section...

>  However, there are many instances where this is easier
>  said than done. 

And that is the problem. For example, take a webmailer into which you paste 
UTF-8 text: the webmailer sees that it can encode the message as Latin-1 and 
sends it as Latin-1. On the receiving side you now have a Content-Type saying 
"Latin-1", but the message was actually signed as UTF-8. So you have to "try" 
multiple charsets if you wish to verify the message (as it could also have 
been signed as Latin-1).
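
The guessing game described above can be sketched roughly as follows. This is 
only an illustration under stated assumptions: the candidate list is an 
arbitrary choice, find_signed_charset is a made-up name, and fake_verify is a 
stand-in that compares SHA-256 digests instead of checking a real OpenPGP 
signature. It also ignores the canonicalization of line endings and trailing 
whitespace that cleartext signatures require.

```python
import hashlib
from typing import Callable, Iterable, Optional

# Arbitrary candidate list for this sketch; a real client would pick its own.
CANDIDATES = ("utf-8", "iso-8859-1", "iso-8859-15", "windows-1252")


def find_signed_charset(
    text: str,
    verify: Callable[[bytes], bool],
    candidates: Iterable[str] = CANDIDATES,
) -> Optional[str]:
    """Re-encode the displayed text in each candidate charset and return
    the first one for which verification succeeds, or None."""
    for charset in candidates:
        try:
            data = text.encode(charset)
        except UnicodeEncodeError:
            continue              # text cannot be represented in this charset
        if verify(data):
            return charset
    return None


# Stand-in for signature verification: the "signature" is just the
# SHA-256 digest of the bytes that were signed (here: UTF-8).
signed_digest = hashlib.sha256("Grüße".encode("utf-8")).digest()


def fake_verify(data: bytes) -> bool:
    return hashlib.sha256(data).digest() == signed_digest


# The webmailer transcoded the message to Latin-1 before sending it.
received = "Grüße".encode("iso-8859-1").decode("iso-8859-1")
print(find_signed_charset(received, fake_verify))  # prints "utf-8"
```

Every additional candidate costs one more verification attempt, and a failure 
after exhausting the list does not tell the user whether the signature is bad 
or the charset was simply not in the list.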
 
> All those MAYs are there because of the real world considerations. 

Well, my proposed change would be optional. Without the "Charset" in the 
message an implementation would fall back to the "do what you want" guessing 
game with all these MAYs :-)

> People still use JIS all over the place, for example, and this allows them
> to mark their text and have it work correctly. (That’s why we put it in both
> the standard and software. The examples of Latin-5 and JIS were real.) On 
> the other hand, there was a completely reasonable objection that there are
> not only silly character sets that one could make up (nods to the computer
> language “Whitespace”), and real-world issues of what happens when the
> diehard Latin-5 people start sending messages to the diehard JIS people, and
> the resulting N^2 testing matrix.

I'm not sure whether you are saying that we should not add a standardized way 
to define the charset for cleartext signatures, or that we should?

I don't really see the problem with either silly character sets or Latin-5 / 
JIS messages. As long as the text can be converted to the display charset and 
for passage through the OpenPGP engine, it should be OK.

> Thus, this section lets an implementation throw its hands up in the air and
> scream wherever and whenever it wants, while giving a decent way to
> clearsign Japanese text.

Yeah, but from a usability standpoint I do not like guessing, screaming and 
failing if it can be avoided at all :-).


Best Regards,
Andre

-- 
Andre Heinecke |  ++49-541-335083-262  | http://www.intevation.de/
Intevation GmbH, Neuer Graben 17, 49074 Osnabrück | AG Osnabrück, HR B 18998
Geschäftsführer: Frank Koormann, Bernhard Reiter, Dr. Jan-Oliver Wagner