Re: Section 5.2.3 of latest draft: bis14.

hal@finney.org ("Hal Finney") Sat, 16 July 2005 00:51 UTC

To: ietf-openpgp@imc.org, lpb@ece.cmu.edu
Subject: Re: Section 5.2.3 of latest draft: bis14.
Message-Id: <20050715234725.0293757E8C@finney.org>
Date: Fri, 15 Jul 2005 16:47:25 -0700
From: hal@finney.org
Sender: owner-ietf-openpgp@mail.imc.org
Precedence: bulk

Levi Broderick writes:
> I noticed that the following bullet is missing from the latest draft.
> It used to appear between 'One-octet hash algorithm' and 'Hashed
> subpacket data set' in section 5.2.3.
>
>       - Two-octet scalar octet count for following hashed subpacket
>         data. Note that this is the length in octets of all of the hashed
>         subpackets; a pointer incremented by this number will skip over
>         the hashed subpackets.

This is definitely an error and needs to be fixed.

A couple of other relatively minor points relating to this section.

We now use the term "data set" for the hashed and unhashed subpackets:

      - Hashed subpacket data set. (zero or more subpackets)

      - Two-octet scalar octet count for the following unhashed
        subpacket data. Note that this is the length in octets of all of
        the unhashed subpackets; a pointer incremented by this number
        will skip over the unhashed subpackets.

      - Unhashed subpacket data set. (zero or more subpackets)

"Data set" is defined in the next section, 5.2.3.1:

    A subpacket data set consists of zero or more signature subpackets,
    preceded by a two-octet scalar count of the length in octets of all
    the subpackets; a pointer incremented by this number will skip over
    the subpacket data set.

This definition could be interpreted to mean that the data set includes
the two-octet scalar count.  In fact, in the layout in 5.2.3 the data
set does not include the scalar count.  5.2.3.1 could be reworded to say
"A subpacket data set consists of zero or more signature subpackets,
AND IS preceded by a two-octet scalar count..."
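
To illustrate the reading in which the count precedes the data set rather than belonging to it, here is a minimal parsing sketch. The function name and the sample bytes are hypothetical and not taken from the draft; it only assumes the two-octet count is a big-endian scalar, as the draft's "two-octet scalar" implies.

```python
import struct

def split_subpacket_area(buf, offset):
    """Return (subpacket_data_set, next_offset).

    The two-octet count is read first and is NOT part of the returned
    data set; it only says how many octets the data set occupies.
    """
    if offset + 2 > len(buf):
        raise ValueError("truncated octet count")
    (count,) = struct.unpack_from(">H", buf, offset)  # big-endian scalar
    start = offset + 2
    end = start + count  # "a pointer incremented by this number"
    if end > len(buf):
        raise ValueError("truncated subpacket data")
    return buf[start:end], end

# Made-up example: a 3-octet hashed area followed by an empty unhashed area.
body = bytes([0x00, 0x03, 0xAA, 0xBB, 0xCC, 0x00, 0x00])
hashed, pos = split_subpacket_area(body, 0)
unhashed, pos = split_subpacket_area(body, pos)
```

Under this reading an empty data set is still preceded by a count of zero, which matches "zero or more subpackets" in the layout.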

Another slight wording inconsistency is in 5.2.3:

    The data being signed is hashed, and then the signature data from
    the version number through the hashed subpacket data (inclusive) is
    hashed. The resulting hash value is what is signed.

This "x is hashed, and then y is hashed" business has caused confusion
for implementors.  5.2.2 fixed the wording for V3 packets:

    The concatenation of the data to be signed, the signature type and
    creation time from the signature packet (5 additional octets) is
    hashed. The resulting hash value is used in the signature algorithm.

We should make the same change for V4 packets in 5.2.3.  I don't know
if there are any other places where we talk about hashing X and then
Y and then Z instead of, as we should, hashing the concatenation of
X and Y and Z.
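
The difference between the two readings can be sketched as follows. The byte strings are placeholders rather than real signature data, and SHA-256 is an arbitrary choice for the illustration; the "nested" variant is one hypothetical way an implementor might misread "x is hashed, and then y is hashed".

```python
import hashlib

data = b"document being signed"      # placeholder for the signed data
sig_fields = b"\x04\x00\x01\x02"     # placeholder for the signature fields

# Intended meaning: one hash over the concatenation of the inputs.
one_shot = hashlib.sha256(data + sig_fields).digest()

# Feeding the pieces to one hash context in order is the same thing.
ctx = hashlib.sha256()
ctx.update(data)
ctx.update(sig_fields)
incremental = ctx.digest()

# A possible misreading: hash the data, then hash that digest together
# with the signature fields. This yields a different value.
nested = hashlib.sha256(hashlib.sha256(data).digest() + sig_fields).digest()
```

The first two values agree and the third does not, which is why wording in terms of a concatenation leaves less room for error.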

I am diffing against bis-12 which is the only old version I have here.
Another change I notice is that the preferred algorithm signature
subpackets in 5.2.3.7, 5.2.3.8 and 5.2.3.9 have their contents changed
from a "sequence" of one-octet values to an "array" of one-octet values.
However, we do not otherwise define "array".  Is that word really
better than "sequence" here?  To me, a sequence of values is a plainer
description while an array perhaps connotes a somewhat more complex
data structure.  Of course in C an array is simply bytes in memory so if
that is how it is being read, OK.  I'm just worried that an implementor
is going to look for a definition of array.
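
As a small illustration of how little structure is involved either way, here is a sketch of reading such a subpacket body. The sample bytes are made up, and the function name is hypothetical.

```python
def preferred_algorithms(subpacket_body):
    """Return the one-octet algorithm identifiers in the order given.

    The body is just consecutive octets with no count, padding, or
    terminator of its own; its length comes from the enclosing subpacket.
    """
    return list(subpacket_body)

# Made-up body listing three identifiers, most preferred first.
prefs = preferred_algorithms(bytes([9, 8, 7]))
```

Whether the text says "sequence" or "array", the parsing is the same walk over the octets.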

Section 5.5.2:

    V2 keys are identical to V3 keys except for the deprecated V3 keys
    except for the version number. An implementation MUST NOT generate
    them and may accept or reject them as it sees fit.

Two "except for"s here, it doesn't look right.

Section 5.9 on literal packets:

      - File name as a string (one-octet length, followed by a file
        name). This may be a zero-length string. Commonly, if the source
        of the encrypted data is a file, this will be the name of the
        encrypted file. An implementation MAY consider the file name in
        the literal packet to be a more authoritative name than the
        actual file name.

I know we discussed this here, but I'm not sure this is right yet.
What is the "actual file name"?  And what does it mean for a name to
be authoritative?  This is making some assumptions about processing flow
which may not be correct.  I think "actual file name" means the name of
the file being decrypted, assuming that the encrypted data actually came
from a file.  But then, usually the encrypted file name is not used for
the decrypted data, rather some modification of that file name is used,
so perhaps that is the "actual file name"?

Maybe we could change the last sentence to "When decrypting, an
implementation MAY use this name as the name of an output file."
That would hint at what we mean it to be used for.  Or maybe just leave
the last sentence off entirely and just say that this is commonly the name
of the encrypted file, and let the implementor figure out what, if anything,
he wants to do with it.
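
A sketch of what the suggested wording would amount to in practice. Both function names are hypothetical, the sample packet body is made up, and stripping directory components from the embedded name is an added precaution that the draft does not mention.

```python
import os

def read_file_name(body, offset):
    """Read the file name field: a one-octet length, then that many octets."""
    if offset >= len(body):
        raise ValueError("missing file name length")
    length = body[offset]
    start = offset + 1
    end = start + length
    if end > len(body):
        raise ValueError("truncated file name")
    return body[start:end], end

def suggest_output_name(embedded, fallback):
    """Use the embedded name if there is one, else the caller's fallback."""
    # Assumption: drop any directory part so the name cannot pick a location.
    name = os.path.basename(embedded.decode("utf-8", errors="replace"))
    return name if name else fallback

# Made-up literal packet body: format octet, file name field,
# four-octet date, then the literal data.
body = b"b" + bytes([9]) + b"notes.txt" + bytes(4) + b"hello"
name, pos = read_file_name(body, 1)
out_name = suggest_output_name(name, "decrypted.out")
```

A zero-length name simply falls through to whatever the implementation would have used anyway.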

We refer to RFC 822 in two places, but that's been superseded by
RFC 2822.

The only reason I noticed this in my diffing was because we changed
to put a space after RFC.  But in the references we have no space,
e.g. [RFC 2045] became [RFC2045].  I guess it's OK to use a space in
the text and no space in the references, but why not do it the same
in both contexts?  I would vote for no space, it looks better to me,
but your eyes may differ.

Section 13, Security Considerations:

     * In winter 2005, Serge Mister and Robert Zuccherato from Entrust
       released a paper describing a way that the "quick check" in
       OpenPGP CFB mode can be used with a random oracle to decrypt two
       octets of every cipher block [MZ05]. They recommend as
       prevention not using the quick check at all.

       Many implementers have taken this advice to heart for any data
       that is both symmetrically encrypted, but also the session key
       is public-key encrypted. In this case, the quick check is not
       needed as the public key encryption of the session key should
       guarantee that it is the right session key. In other cases, the
       implementation should use the quick check with care. On the one
       hand, there is a danger to using it if there is a random oracle
       that can leak information to an attacker. On the other hand, it
       is inconvenient to the user to be informed that they typed in
       the wrong passphrase only after a petabyte of data is decrypted.
       There are many cases in cryptographic engineering where the
       implementer must use care and wisdom, and this is another.

This is good but I think some of the wording could be smoothed.   The
first sentence of the second paragraph should not have a comma after
the first part of the "both" clause, and "but" doesn't seem like the
right connective.  I suggest,

       Many implementers have taken this advice to heart for any data
       that is symmetrically encrypted and for which the session key
       is public-key encrypted.

I also have a problem with "there is a danger to using it if there is
a random oracle that can leak information".  This makes it sound like
the random oracle is some other entity independent of the implementation.
I would prefer to avoid the word "oracle" as not all implementors may be
familiar with the technical meaning, and in common use it has mystical
or religious connotations.

I think what we want is something like "there is a danger to using it
if timing information about the check can be exposed to an attacker,
particularly via an automated service that allows rapidly repeated
queries".
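
For readers unfamiliar with the check itself, here is a sketch of what it tests and of the policy described in the quoted paragraph. It assumes the OpenPGP CFB prefix layout of one block of random octets followed by a repeat of its last two octets; the function names and sample bytes are hypothetical.

```python
def quick_check_passes(prefix_plain, block_size):
    """Test the decrypted prefix: octets BS-1 and BS repeated as BS+1, BS+2."""
    if len(prefix_plain) < block_size + 2:
        raise ValueError("need block_size + 2 octets of decrypted prefix")
    last_two = prefix_plain[block_size - 2:block_size]
    repeat = prefix_plain[block_size:block_size + 2]
    return last_two == repeat

def should_run_quick_check(session_key_was_public_key_encrypted):
    """Policy from the quoted text: skip the check when the session key
    came from public-key decryption, since the key is then already known
    to be the right one."""
    return not session_key_was_public_key_encrypted

# Made-up decrypted prefixes for an 8-octet block cipher.
good = bytes(range(8)) + bytes([6, 7])
bad = bytes(range(8)) + bytes([0, 0])
```

The exposure comes from an attacker being able to observe, many times over, whether this test passed, so the sketch says nothing about how the result should be reported.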

Finally I think the last clause should say "and this is one" rather
than "and this is another".

I do have to add that I think this paragraph is perhaps a little informal
or even poetic for a security document.  Implementors "take things to
heart" and use their "care and wisdom".  I could see an implementor
wondering whether he was reading a spec or beginning a study of Zen.
Maybe we should think about changing this to be a little more cool and
just warn them that if they are going to use the check bits, they need
to be aware of the danger of leaking timing data.  The content of the
paragraph is good, it's just the style which struck me as being a bit off.
Again, your taste may differ.

Everything else looked good as far as I could see.

Hal Finney
