Re: [openpgp] review of the SOP draft

Daniel Kahn Gillmor <dkg@fifthhorseman.net> Thu, 14 November 2019 00:18 UTC

From: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
To: Antoine Beaupré <anarcat@torproject.org>, openpgp@ietf.org
In-Reply-To: <87r22dm2fr.fsf@curie.anarc.at>
References: <87mud28fds.fsf@curie.anarc.at> <87h83arpby.fsf@fifthhorseman.net> <87r22dm2fr.fsf@curie.anarc.at>
Date: Thu, 14 Nov 2019 03:18:07 +0300
Message-ID: <87tv77nwe8.fsf@fifthhorseman.net>
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha256; protocol="application/pgp-signature"
Archived-At: <https://mailarchive.ietf.org/arch/msg/openpgp/SjvoplI9tG6S9b7bvT8J-NuAYWA>
Subject: Re: [openpgp] review of the SOP draft

I've incorporated some of your suggestions from this e-mail directly in
the draft now.  Thanks a lot for them!  more discussion below…

On Tue 2019-11-12 12:26:00 -0500, Antoine Beaupré wrote:
> I guess what I'm wondering is how I would make this work with my yubikey
> at all. Or maybe I got this backwards and the yubikey interface is what
> should implement sop directly?

I have no idea about how to make it work with hardware tokens.  I remain
frankly unconvinced about the security tradeoffs for most uses of
hardware tokens (something i guess i need to actually write down more
formally), and i'm unlikely to spend a lot of time developing how to
integrate them with `sop`.

I would only assume that some `sop` implementation would choose to carve
out `@MYHSM:xxx` as an input space for `KEY`-style inputs, where `xxx`
is some form of addressing scheme that indicates which secret key on
which device should be used to interact with the device.

But sorting out the addressing scheme alone is problematic enough that I
don't plan to incorporate any of that detail in this draft.  A well-formed,
thorough yet compact merge request might convince me otherwise, but
color me skeptical at the moment.

>>> I find those examples confusing. Multiple arguments, in particular,
>>> seems ambiguous. Is it "CERT DATA"? or "CERT DATA"?
>>
>> ???  i think those are the same thing, but i'll just assume you meant
>> "DATA CERTS" at the end.  The answer is that there must be exactly one
>> SIGNATURE object to verify and there may be multiple certs, so the only
>> possible way to do it is SIGNATURE first, then CERTS.
>
> Ah, yes, sorry about this. I was specifically referring to:
>
>     sop verify announcement.txt.asc alice.pgp < announcement.txt
>
> And `"CERT DATA" or "DATA CERT"`?
>
> And I guess where we differ is I am not sure it's that clear that the
> first argument of a series can be different from the rest...

well, the middle arguments of a series are *definitely* hard to
distinguish, so the only plausible distinctions when you've got
one-vs-many positional arguments is whether the "one" goes first or
last.  But let's follow up on that over at

    https://gitlab.com/dkg/openpgp-stateless-cli/issues/7

and in particular, on your ongoing merge request at:

    https://gitlab.com/dkg/openpgp-stateless-cli/merge_requests/13

If anyone else has strong feelings about this choice, please take a look
over there and follow up, either here on list, or on those tickets.
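To make the one-vs-many point concrete, here is a minimal sketch (not taken from any `sop` implementation) of how a command line with exactly one SIGNATURE followed by one or more CERTS can be split without ambiguity.  The function name and the `bob.pgp` filename are hypothetical, chosen only for illustration:

```python
from typing import List, Tuple


def split_verify_args(args: List[str]) -> Tuple[str, List[str]]:
    """Split positional arguments into (SIGNATURE, CERTS).

    The single SIGNATURE goes first; everything after it is a cert.
    Putting the "one" in the middle would leave no way to tell it apart.
    """
    if len(args) < 2:
        raise ValueError("need one SIGNATURE followed by at least one CERTS")
    signature, *certs = args
    return signature, certs


if __name__ == "__main__":
    # bob.pgp is a made-up second certificate for the example.
    sig, certs = split_verify_args(["announcement.txt.asc", "alice.pgp", "bob.pgp"])
    print(sig, certs)
```

The same split works if the "one" goes last instead (`*certs, signature = args`); the only thing that cannot work is placing it somewhere in the middle.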

>>> How do we generate purpose-specific subkeys?
>>
>> With `sop`, you do not ;)
>
> Sad.

I be

>> If you want to do fancy OpenPGP certificate generation, you do that with
>> your toolkit's own fancy features.
>>
>> I've opened https://gitlab.com/dkg/openpgp-stateless-cli/issues/2 to
>> track that maybe we do want some rough guidance about what kinds of
>> secret key capabilities we want any `sop` to be able to generate here
>> though.
>
> Commented on that. Would still love to see a more decent way to handle
> subkeys because that's a really hard thing to do in existing
> implementations.
>
> At least creating split subkeys by default would be a great start, IMHO.

This is exactly the sort of decision that i want to see implementers
make, so we can document their choices.  `sop` is not about fancy key
management, nor should it be.  As i wrote in
https://gitlab.com/dkg/openpgp-stateless-cli/issues/2, i do not want
`sop` to place any detailed constraints here, i just want the generated
key to be functional for use with `sop`.

>> We don't mandate UTF-8 unless the signer claims that the thing being
>> signed is text.  If so, it really does need to be UTF-8.  I have no
>> patience for non-UTF-8-encoded text in 2019.
>>
>> OpenPGP embeds UTF-8 explicitly in its User ID formatting.  Any OpenPGP
>> implementation must already handle UTF-8.
>>
>> if anyone thinks that dealing with different character encodings is a
>> good idea, please consider that the character encoding is not recorded
>> in the signature itself, leading to charset-switching attacks like those in
>> https://dkg.fifthhorseman.net/notes/inline-pgp-harmful/
>>
>> Do you think this information belongs in this document?
>
> Absolutely, otherwise it looks like an arbitrary decision.

I've just added the following subsection with "Guidance for
Implementers":

    Text is always UTF-8 {#utf8}
    --------------------

    Various places in this specification require UTF-8 {{RFC3629}} when encoding text. `sop` implementations SHOULD NOT consider textual data in any other character encoding.

    OpenPGP Implementations MUST already handle UTF-8, because various parts of {{RFC4880}} require it, including:

     - User ID
     - Notation name
     - Reason for revocation
     - ASCII-armor Comment: header

    Dealing with messages in other charsets leads to weird security failures like {{Charset-Switching}}, especially when the charset indication is not covered by any sort of cryptographic integrity check.
    Restricting textual data to `UTF-8` universally across the OpenPGP ecosystem eliminates any such risk without losing functionality, since `UTF-8` can encode all known characters.


If anyone thinks that's either wrong or insufficient, please send
corrections/improvements!
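As a rough illustration of the reasoning in that subsection, the sketch below shows a strict UTF-8 check and why an unrecorded charset is risky: the same bytes read as different text depending on which charset the reader assumes.  The helper name is hypothetical and this is not code from the draft:

```python
def is_utf8_text(data: bytes) -> bool:
    """Return True only if data decodes as strict UTF-8."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    utf8_bytes = "Beaupré".encode("utf-8")
    latin1_bytes = "Beaupré".encode("latin-1")
    print(is_utf8_text(utf8_bytes))    # valid UTF-8
    print(is_utf8_text(latin1_bytes))  # a lone 0xE9 byte is not valid UTF-8
    # Identical signed bytes, two different readings:
    print(utf8_bytes.decode("utf-8"), "vs", utf8_bytes.decode("latin-1"))
```

A signature covers the bytes, not the reader's interpretation of them, which is why pinning the interpretation to UTF-8 closes the gap.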

> I wish we didn't have to deal with this distinction, but if so, maybe we
> should clarify the source of it here. Otherwise it comes as a surprise
> to me, an experienced OpenPGP user.

As a user, you shouldn't ever need to see it.  As an implementer, you
do need to think about it.

I've added the following text to the discussion of `sop sign`:

    `--as=binary` SHOULD result in an OpenPGP signature of type 0x00 ("Signature of a binary document").
    `--as=text` SHOULD result in an OpenPGP signature of type 0x01 ("Signature of a canonical text document").
    See section 5.2.1 of {{RFC4880}} for more details.
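For implementers wondering what the distinction buys, here is a simplified sketch of the document-normalization step only.  A real OpenPGP signature also hashes signature metadata, which is left out here, and the handling of a lone CR is an assumption of this sketch rather than something the text above specifies:

```python
import hashlib

# Signature type octets named in the draft text above.
SIG_TYPE = {"binary": 0x00, "text": 0x01}


def canonicalize_text(data: bytes) -> bytes:
    """Convert line endings to CRLF, as a text signature does before hashing.

    Assumption: only LF and CRLF are treated as line endings; a lone CR is
    left untouched.
    """
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


def document_digest(data: bytes, as_mode: str) -> str:
    """Hash the document part the way the chosen signature type would see it."""
    if as_mode not in SIG_TYPE:
        raise ValueError("as_mode must be 'binary' or 'text'")
    if as_mode == "text":
        data = canonicalize_text(data)
    return hashlib.sha256(data).hexdigest()


if __name__ == "__main__":
    unix, dos = b"hello\nworld\n", b"hello\r\nworld\r\n"
    print(document_digest(unix, "text") == document_digest(dos, "text"))
    print(document_digest(unix, "binary") == document_digest(dos, "binary"))
```

With `text`, the two line-ending variants hash identically; with `binary`, any change to the bytes changes the digest.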

And i've added a new subsection in "Guidance for Consumers":

    Choosing between `--as=text` and `--as=binary`
    ------------------------------------------------------

    A program that invokes `sop` to generate an OpenPGP signature typically needs to decide whether it is making a text or binary signature.

    By default, `sop` will make a binary signature.
    The caller of `sop sign` should choose `--as=text` only when it knows that:
     - the data being signed is in fact textual, and encoded in `UTF-8`, and
     - the signed data might be transmitted to the recipient (the verifier of the signature) over a channel that has the propensity to transform line-endings.

    Examples of such channels include FTP ({{RFC959}}) and SMTP ({{RFC5321}}).
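A caller could encode that guidance roughly as follows.  This is only a sketch: the function and parameter names are hypothetical, and whether the data is "in fact textual" is something only the caller can know, so it is passed in rather than guessed:

```python
def choose_as_mode(data: bytes,
                   known_textual: bool,
                   channel_may_rewrite_line_endings: bool) -> str:
    """Pick the value for `--as=` following the two conditions above.

    Defaults to "binary"; returns "text" only when both conditions hold.
    """
    if not (known_textual and channel_may_rewrite_line_endings):
        return "binary"
    try:
        # Being valid UTF-8 is necessary but not sufficient for "textual",
        # hence the separate known_textual flag.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return "binary"
    return "text"


if __name__ == "__main__":
    body = "Hello, world\n".encode("utf-8")
    print(choose_as_mode(body, known_textual=True,
                         channel_may_rewrite_line_endings=True))
```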

> What I'm saying is the `sop sign` example is error prone. Forget the `<`
> and the mandated order and you might reverse the signing key and the
> message.

sure. if you screw up any API, you can screw up any API :)

>>>> If `sop decrypt` fails for any reason and the identified `--session-key-out`
>>>> file already exists in the filesystem, the file will be unlinked.
>>>  
>>> This seems dangerous! Why do we delete a file we haven't created?
>>> Explain.
>>
>> We don't want the user to run `sop`, and then inspect a file that was
>> already in the filesystem thinking that it is `sop`s output.  If you
>> think that's a bad decision, please suggest what we should do
>> differently.
>
> Maybe we should not overwrite existing files at all and fail earlier?

I think you're proposing that if the `--session-key-out` file already
exists in the filesystem, that should be an error in the first place.
I'd be happy to entertain that idea, if anyone wants to provide text for
it.

If you decide to try to write it up, please think about how it works for
the other scenarios where `sop` can produce output on more than stdout.
it would be nice if these mechanisms all had the same behavior.
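For anyone drafting that text, one way an implementation might get "fail if the output already exists" is exclusive creation, sketched below.  This is an assumption about a possible approach, not behavior the draft currently specifies, and the names are hypothetical:

```python
import os
import tempfile


def create_output_exclusively(path: str):
    """Open path for binary writing, refusing to touch an existing file.

    Mode "x" raises FileExistsError instead of truncating, so a stale file
    is never overwritten and never mistaken for fresh output.
    """
    return open(path, "xb")


def demo() -> bool:
    """Return True if a second exclusive creation of the same path is refused."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "session.key")
        with create_output_exclusively(target) as f:
            f.write(b"placeholder")
        try:
            create_output_exclusively(target)
        except FileExistsError:
            return True
        return False


if __name__ == "__main__":
    print(demo())
```

The same helper could be used for every output that does not go to stdout, which would give those mechanisms the uniform behavior asked for above.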

>>>> [`--with-session-key`] enables decryption of the `CIPHERTEXT` using the session key directly against the `SEIPD` packet.
>>>> This option can be used multiple times if several possible session keys should be tried.
>>>
>>> What happens if both "in" and "out" are provided? I can venture a guess,
>>> but it would be important to make that explicit as there can be horrible
>>> bugs there.
>>
>> Please do venture a guess, in the form of proposed text! I'd also love
>> to hear what the horrible bugs are.  I don't see them.
>
> I would argue that both options should not be provided at once. One
> implementation that could come up would be that the program attempts to
> read the file as it's writing it, truncating the precious key before it
> has time to read it.

Ah, you're not talking about providing both options -- you're talking
about providing both options pointing *at the same file*.  i agree, that
sounds like a bad idea, but it's a bad idea for *any* pair of input and
output fields.
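A generic guard against that, covering any pair of input and output, could look like the sketch below.  It is illustrative only; the function names are hypothetical and nothing here is required by the draft:

```python
import os
import tempfile
from typing import Iterable


def same_file(path_a: str, path_b: str) -> bool:
    """True if both paths name the same existing file, however they are spelled."""
    try:
        return os.path.samefile(path_a, path_b)
    except FileNotFoundError:
        # An output that does not exist yet cannot alias an input.
        return False


def check_distinct(inputs: Iterable[str], outputs: Iterable[str]) -> None:
    """Raise ValueError if any output would clobber any input."""
    outputs = list(outputs)
    for src in inputs:
        for dst in outputs:
            if same_file(src, dst):
                raise ValueError(f"{dst!r} is both an input and an output")


def demo() -> bool:
    """Return True if an aliased input/output pair is rejected."""
    with tempfile.TemporaryDirectory() as tmp:
        key = os.path.join(tmp, "key")
        with open(key, "wb") as f:
            f.write(b"placeholder")
        alias = os.path.join(tmp, ".", "key")  # same file, different spelling
        try:
            check_distinct([key], [alias])
        except ValueError:
            return True
        return False


if __name__ == "__main__":
    print(demo())
```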

> We can continue the discussion in issue #13, but the TL;DR: is that I
> agree that stripping trailing control characters is a good idea, but
> disagree about whitespace in general.

I hope other folks will weigh in on #13.  There's interesting discussion
going on there about what properties it's reasonable to expect from a
"well-formed" password.

> I don't know how OpenPGP packets are built. Can't we show the signature
> on the output of decrypt?

Absolutely not.  Mixing the cleartext output with the signature
verification stream is a classic cause of failures.  What if the
cleartext data happens to "look like" a signature verification?  how is
the consumer supposed to distinguish between them?

It is critical to keep them separate.

>> But if the primary operation is decryption, i don't think we should fail
>> on signature validity for reasons outlined above.
>
> But that assumes decryption is the primary operation.

The subcommand is "sop decrypt".  By definition, "decrypt" is the
primary operation.

> In the context where all my email traffic is encrypted with OpenPGP,
> for example, decryption is not the primary operation anymore. I *do*
> want to fail properly on signature validity, it becomes a primary
> operation when encryption is "default"...

You want a *failure* in the sense that you think that an MUA shouldn't
show the user the cleartext of the message if no valid signature can be
found?

This is surprising to me, and i know of no MUA that does this.

>>> File descriptors could be passable as distinct options, like
>>> --sign-with-fd for --sign-with.
>>
>> This is an interesting proposal, though i don't see how --sign-with=@FD:3
>> is much different from --sign-with-fd=3  -- i guess it lets you use
>> files that are literally named @FD:3 ?  Is that important?
>
> It's less magic, more explicit, and correlates better with other
> commandline APIs I have encountered.

it looks to me like it would make the description of the command line
significantly more verbose, but i'm willing to consider it if someone
wants to propose a specific textual change.

> Say you think you are in a trusted directory with "CERTS" that you want
> to encrypt to. You call:
>
>   sop encrypt * < /tmp/file > /tmp/file.pgp
>
> Except you made a mistake and the attacker has control of the current
> directory, and injects a file named (say) @ENV:SOMETHING. Assuming they
> have control over the SOMETHING environment, they can now add an
> encryption key to the message.

if the attacker has control of the directory, they can inject an
encryption key in the first place, right?  i don't think @ENV makes this
any worse...

> Control of the environment is kind of a stretch, I must admit, but in
> certain environments (most notably web servers), a *lot* of stuff can
> end up there and it shouldn't be completely trusted this way.

perhaps when an `@`-prefixed argument is supplied, if a file with a
matching literal name exists, `sop` should fail with an error because of
the ambiguity?  This seems like an unlikely and unusual situation, but i
can see how it might be worth thinking about it.
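To give a sense of what such a check might involve, here is a rough sketch.  It assumes the special prefixes are `@ENV:` and `@FD:` as discussed in this thread; the function name and error wording are made up, and the demo assumes a filesystem that allows `:` in filenames:

```python
import os
import tempfile

# Prefixes discussed in this thread; anything else is treated as a plain path.
SPECIAL_PREFIXES = ("@ENV:", "@FD:")


def check_unambiguous(arg: str, directory: str = ".") -> str:
    """Return arg unchanged, or raise if it is both special and a real file."""
    if arg.startswith(SPECIAL_PREFIXES) and os.path.lexists(os.path.join(directory, arg)):
        raise ValueError(f"ambiguous argument {arg!r}: a file with that literal name exists")
    return arg


def demo() -> bool:
    """Return True if a planted file named like a designator is detected."""
    with tempfile.TemporaryDirectory() as tmp:
        planted = "@ENV:SOMETHING"
        with open(os.path.join(tmp, planted), "wb") as f:
            f.write(b"")
        try:
            check_unambiguous(planted, directory=tmp)
        except ValueError:
            return True
        return False


if __name__ == "__main__":
    print(demo())
```

This turns the `sop encrypt *` scenario above into a hard error rather than a silent read from the environment.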

Perhaps it would be interesting to contrast a MR that contains this
guidance with a MR that switches over to the --with-$foo-fd= approach
you've suggested.

>> patches welcome, particularly for this kind of editorial cleanup :)
>
> https://gitlab.com/dkg/openpgp-stateless-cli/merge_requests/12

thanks, merged ;)

>>> It would also be great if we could explain where those magic numbers
>>> come from in the first place. I suspect they were chosen to not overlap
>>> with existing error codes, but that's just a guess.
>>
>> Justus picked 69 in his OpenPGP Interoperability Test Suite.  I chose
>> the others as "reasonable-sized primes" just for fun.  I don't think
>> this information belongs in this document, as it doesn't matter.
>
> I love this kind of information in text, it makes it less dull. :p

i think we have a difference of opinion here.  maybe this is ok in an
acknowledgements section, but i definitely don't want to read discursive
stories when i'm trying to extract technical information.

If you want to supply a patch for the acknowledgements section, i'd be
willing to consider merging it.

> `sop probe` would do the minimal amount of work required to determine
> which keys ("signers") to consider when decrypting, then call `decrypt`
> properly.

I don't think this is "stateless", and i don't think it's
well-specified.

I also don't think it's particularly useful to know *that* a thing was
signed (by some arbitrary certificates) if you haven't already made some
determination that the certificate is meaningful in the current context.

> `sop probe` could also do the general task of parsing OpenPGP messages
> into packets and stuff like that.

this doesn't sound like a subcommand at the same level of abstraction
that `sop` is aiming for.

So i'm still not convinced.  But if you (or anyone) wants to make a
merge request that proposes new subcommands, i'm definitely up for
reviewing them.

>>>> Compression {#compression}
>> […]
>>> How about decryption? Do we attempt decompression during decrypt?
>>
>> It will be interesting to see what implementers do!  I've left `sop`
>> deliberately agnostic there, and i would like to learn from test suites
>> what the answer is.
>
> Should we make that decision clearer in the document?

it's not really a decision :) Hopefully my description of how it would
work as part of an interop suite will give a hint of this kind of
approach.

Thanks for the discussion!

        --dkg