Re: [kitten] review of draft-ietf-kitten-krb-spake-preauth-00

Benjamin Kaduk <kaduk@mit.edu> Tue, 29 August 2017 01:43 UTC

Date: Mon, 28 Aug 2017 20:43:03 -0500
From: Benjamin Kaduk <kaduk@mit.edu>
To: Greg Hudson <ghudson@mit.edu>
Cc: kitten@ietf.org, draft-ietf-kitten-krb-spake-preauth@ietf.org
Message-ID: <20170829014303.GN96685@kduck.kaduk.org>
References: <20170818181043.GC35188@kduck.kaduk.org> <59e6271c-5970-5cb7-209a-73a1e02cc5f8@mit.edu>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <59e6271c-5970-5cb7-209a-73a1e02cc5f8@mit.edu>
User-Agent: Mutt/1.8.3 (2017-05-23)
Archived-At: <https://mailarchive.ietf.org/arch/msg/kitten/G6xmy16Gv6KNgaISWR3kHOO_Orc>
Subject: Re: [kitten] review of draft-ietf-kitten-krb-spake-preauth-00
Precedence: list

Thanks for the detailed reply.  I'll try to trim some uncontroversial
bits.

On Sat, Aug 19, 2017 at 02:17:44PM -0400, Greg Hudson wrote:
> On 08/18/2017 02:10 PM, Benjamin Kaduk wrote:
> > Our transcript hash covers just the SPAKE negotiation messages (and
> > excludes the rest of the AS-REQ/AS-REP bodies, even the SPAKE second
> > factor messages as well?!), though the final KDC-REQ-BODY does get
> > included in the key derivation.  I haven't thought too hard about
> > whether this is potentially problematic, but are there reasons why
> > it would be difficult to hash everything for the transcript?
> > Oh, or is the claim that since the KDC-REQ-BODY goes into the K'
> > calculation that we get confirmation of "everything" anyway?
> 
> The transcript checksum is a vehicle for key derivation.  The primary
> requirement for key derivation (from the SPAKE2 algorithm) is that it
> must take into account the identities of both parties, both public
> values, the initial secret, and the computed shared group element.  The
> final KDC-REQ-BODY gives us the party identities (with the ancillary
> bonus of including other request parameters), and the transcript
> checksum includes the public values (with the ancillary bonus of
> including the advertised group numbers from the client if a SPAKESupport
> message was part of the exchange, in case there was a downgrade attack).

Ah, yes, I was making assumptions that the checksum would keep going,
but I now see that it's supposed to stop after a fixed number of
inputs (some of which are skipped depending on which optimizations are
in use).
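
To check my own understanding, here is a minimal sketch of a checksum that
stops after a fixed set of inputs, some of which may be absent.  Everything
in it is an assumption made for illustration: SHA-256 chaining stands in for
the real checksum operation, and the function names, argument names and input
ordering are hypothetical rather than taken from the draft.

```python
import hashlib
from typing import Optional


def update_transcript(checksum: bytes, message: bytes) -> bytes:
    """Fold one message into the running value (stand-in for the real checksum)."""
    return hashlib.sha256(checksum + message).digest()


def transcript_checksum(challenge: bytes,
                        client_pubkey: bytes,
                        support: Optional[bytes] = None) -> bytes:
    """Checksum over a fixed, ordered set of inputs.

    `support` models the optional SPAKESupport message; it is skipped
    when an optimization means it was never sent.
    """
    checksum = bytes(32)  # assumed all-zero starting value
    for message in (support, challenge, client_pubkey):
        if message is None:
            continue  # input skipped by an optimization
        checksum = update_transcript(checksum, message)
    return checksum  # final: nothing later in the exchange is folded in
```

With this shape, an exchange that advertised groups and one that did not give
different final values, but in both cases the value is fixed once the client
public value has been folded in.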

> Things we could include in the transcript checksum but currently do not
> include:
> 
> * Other pa-data values in the request.  Including these would be
> difficult to implement, and would present a chicken-and-egg problem if
> other pa-data types want to do the same thing.
> 
> * KDC-REQ-BODY encodings other than the final one.  I don't see a
> problem with including these in the transcript hash, but I don't see an
> advantage either.
> 
> * More parts of intermediate KRB-ERRORs than the SPAKE pa-data.  The
> only part of an intermediate KRB-ERROR used by the client is the error
> code (PREAUTH_REQUIRED or MORE_PREAUTH_DATA_REQUIRED), and I don't see
> any wiggle room for the attacker in manipulating that.
> 
> * Second-factor messages.  In the current design, the same transcript
> checksum is used for all K'[n] derivations, and the first derivation is
> used to encrypt the first SPAKESecondFactor, before any other
> second-factor messages are created.  Changing the design to include a
> variable transcript hash would, I think, make the standard harder to
> understand and implement.
> 
> * Parts of the AS-REP.  No SPAKE message accompanies the AS-REP, so this
> could only be used for the derivation of the final reply key used to
> encrypt the enc-part.  As for the previous bullet, including any part of
> the AS-REP would require changing the transcript checksum for different
> key derivations.  Most of the AS-REP contents (everything but the
> ticket) are redundant or couldn't be included in the derivation for one
> reason or another anyway.

Thanks for listing these out; I'm now convinced that I agree with your
analysis, and we don't gain much by expanding the scope of the transcript.

> > We do reply key strengthening with K'[0] at present.  It seems like
> > using K'[last-n-of-proper-parity] would include more transcript
> > checksum and thus nominally be "better"; is that flawed reasoning?
> 
> As noted above, the transcript checksum does not depend on n.  n is just
> a numeric parameter that feeds into the key derivation function.
> 
> Obviously this aspect of the design needs to be more explicit.  I
> propose to add this sentence to section 6 after the sentence beginning
> "It therefore incorporates...":
> 
>     Once the transcript checksum is finalized, it is used without
>     change for all key derivations (section 7).

That would probably help, thanks.
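
For anyone else who made the same mistake I did, a rough sketch of "n is just
a numeric parameter" follows.  It is not the derivation from section 7:
HMAC-SHA256 is a stand-in for the real key derivation function, and the
function name, argument names and input layout are hypothetical.

```python
import hashlib
import hmac


def derive_key(shared_secret: bytes, transcript: bytes,
               req_body: bytes, n: int) -> bytes:
    """Stand-in for K'[n]: only n varies between derivations.

    `transcript` is the finalized transcript checksum and is passed in
    unchanged for every value of n.
    """
    info = transcript + req_body + n.to_bytes(4, "big")
    return hmac.new(shared_secret, info, hashlib.sha256).digest()


# Hypothetical placeholder inputs, for illustration only.
secret = b"shared-group-element"
transcript = bytes(32)
body = b"final KDC-REQ-BODY encoding"

k0 = derive_key(secret, transcript, body, 0)
k1 = derive_key(secret, transcript, body, 1)
```

Here k0 and k1 differ only because n differs; neither one "includes more
transcript" than the other.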

> > We should have test vectors before final publication.
> 
> Agreed.  I have the ability to generate test vectors (using a separate
> Python implementation, not the C implementation for MIT krb5), but they
> will change when key usage values are assigned, so I haven't included
> any in the draft yet.

It seems likely that this document will advance before
draft-ietf-kitten-kerberos-iana-registries.  Do I understand correctly
that the Kerberos Registrar role has been transferred to you?  Any
comment on when you would feel comfortable making an official assignment?

> > I'm not sure that the registry policies make sense, most notably
> > with respect to marking things as Required (to implement).
> 
> I tend to agree that any mandatory-to-implement policy should be written
> into this draft, and not be part of the registry.
> 
> > In a weaker sense, anything adding values
> > in these registries could be seen as adding to the ASN.1 module
> 
> To my mind, the ASN.1 module only defines the wire encoding of
> particular data inputs.  That encoding does not change when new groups
> are added.

That's the interpretation I'm inclined to use; I just mentioned the other
for discussion.

> > In section 1.1, item (4) (either side can store password or equivalent)
> > makes it sound like only one side needs to store anything.  Maybe it's
> > supposed to say that "Each side has freedom to pick whether to store
> > a password or password-equivalent"?
> 
> I am not personally sure what this bullet point means; I will make a
> note to discuss it.

Thanks.

> > I'll also note that OpenSSL has recently changed its documentation to
> > refer to the more-vague "randomness" since it can be hard to use "entropy"
> > in a technically correct way.  But that decision seems to fall squarely
> > within editorial discretion, even if I had a strong opinion about it
> > (which I don't).
> 
> I have seen people quibble with using "entropy" the way that some
> cryptographic documents do (very roughly, to mean the number of equally
> likely values a state variable could have given all of the information
> available to an attacker), but I never saw how "randomness" was better.

It's definitely not clear-cut; leaving "entropy" seems fine to me.

> > I wonder to some extent whether all of section 1.2 is needed for a final
> > document.
> 
> This section hasn't necessarily aged well, as there are other PAKE
> algorithms not described within.  I would be okay with shortening it,
> perhaps not to name any specific alternatives.

Not naming any specific alternatives is probably the right thing to
do, given the advances you mention.

> > Section 1.2 should probably compare the single-round-trip nature of
> > SPAKE against the round-trip count of ENC_TIMESTAMP.
> 
> I am not sure there's any concise comparison to be made for this
> section.  The way we are using SPAKE (with the KDC presenting the
> initial public value) means we might use one more round trip than
> encrypted timestamp, or we might use the same number if one of the
> described optimizations is used.

I was imagining something like:

[...] single round trip, allowing SPAKE preauthentication to occur in
the same number of round trips as encrypted timestamp, if either
optimization from section 4.6 is used.  If neither optimization is used,
an additional round trip is incurred, but in consideration of all the
above properties, SPAKE remains an ideal PAKE for use in Kerberos
pre-authentication.

> > Section 1.3 notes that we allow secure transfer of material from client
> > to KDC for verification; while reviewing I noted that (the initial
> > challenge) from KDC to client remains unauthenticated; do we want to
> > mention that limitation explicitly here?
> 
> It is pointed out in the security considerations.  I can see how section
> 1.3 as currently written could be a little deceptive; however, at this
> point the text is speaking generally about how a PAKE can be used in the
> design of Kerberos two-factor authentication.  As we haven't even
> started talking concretely about the SPAKE preauth mech, I'm not
> comfortable adding that caveat here.

Okay.

> > It seems like we're mostly just justifying the scheme here
> 
> We currently refer normatively to the CFRG SPAKE2 document.  Someone who
> has read that document needs to know how this protocol relates to it.
> Here we are saying that we use a custom key derivation function, as
> allowed by that draft.

That seems clear to me now; maybe it was just the "also" that got me
confused.

> (As the CFRG draft hasn't advanced for many months, Nathaniel and I have
> discussed the possibility of not using it normatively, and describing
> the algorithm in this document instead.  But that's a wider topic.)

I can put my chair hat on and ask the CFRG chairs for a status update.

> > I thought a little bit about proposing to exclude the ASN.1 extension
> > marker from PA-SPAKE, but the argument for doing so is fairly weak
> > and it doesn't really have a downside, so I guess it should stay.
> > (We could perhaps be clear that an "empty" value is a zero-length
> > OCTET STRING.)
> 
> I'm a bit paranoid about tricking implementors into putting 04 00 into
> the padata-value, rather than using 04 00 as the padata-value.  In my
> mental model, RFC 4120 already specifies that padata-value is a
> non-optional OCTET STRING, and that's not really within the purview of
> this document.  I know that other people have their own models, but I am
> not really worried about an implementor leaving out the padata-value
> entirely as that would quickly be discovered in interop testing.

Seems reasonable enough to me.
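
The distinction is easy to see in the DER bytes.  The sketch below is only an
illustration: the helper name is hypothetical, it handles short-form lengths
only, and it ignores the context tag that wraps padata-value inside PA-DATA.

```python
def der_octet_string(content: bytes) -> bytes:
    """DER-encode an OCTET STRING (tag 0x04), short-form length only."""
    if len(content) >= 128:
        raise ValueError("sketch handles short-form lengths only")
    return bytes([0x04, len(content)]) + content


# An empty padata-value: the field itself encodes as 04 00.
empty = der_octet_string(b"")

# The mistake: treating 04 00 as the *contents* of padata-value.
mistaken = der_octet_string(bytes([0x04, 0x00]))

print(empty.hex())     # 0400
print(mistaken.hex())  # 04020400
```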

> > Section 4.1 seems like it could be read as saying that the client
> > should send an AS-REQ with no PA-DATA, and then the KDC responds with
> > a KRB-ERROR and only the PA-SPAKE METHOD-DATA (and no others)
> 
> I will add a parenthetical "(possibly in addition to other PA-DATA
> elements)".

Thanks.

> > Section 4.2 lets ("MAY") the KDC pick a group not listed by the client;
> > do we ever expect this to result in a working connection?
> 
> I can bring that up again for discussion; I know I've talked about it
> with Nathaniel before, but I can't remember the details.  One possible
> use is to communicate to the client what it would have to
> implement/enable to interop with the KDC, even if we don't expect it to
> work for this particular exchange.

Okay.  I don't object to keeping the current text.

> > In section 4.6, a forward reference to section 6 when mentioning the
> > transcript hash could be helpful.
> 
> We have a forward reference in the first use of "transcript checksum" in
> section 4.2, and not for the second use in section 4.2 or the use in
> section 4.3.  As section 4.6 describes a modification of 4.1/4.2/4.3,
> I'm not sure we need another forward reference.

Okay.

> I did note that we inconsistently use "transcript hash" and "transcript
> checksum".  I will standardize on "transcript checksum".

Thanks!

> > I would consider moving the note that the PRF+ used here is the
> > RFC6113 one earlier, perhaps even to the introduction if not the
> > start of section 7 where we talk about "PRF+ input".
> 
> We use PRF+ in two places (section 5 and section 7).  In both places, we
> refer to RFC 6113 right after we use the function.  It is true that in
> section 7 we talk at some length about the PRF+ input string before
> actually invoking PRF+ and including the reference, but I don't see that
> as a problem--it would be pretty hard for an implementor to miss the
> reference and use the wrong PRF+.  I would be okay with just defining
> the "input string" without referencing PRF+ until afterwards, if that
> would be clearer.

I think that would alleviate my concern, yes.

> > Section 9 fourth paragraph could perhaps say a bit more about why
> > the client cannot be a signing oracle.
> 
> Can you be more specific or propose text?  I don't know how to act on
> this feedback item.

Well, I'm not sure I completely understand it myself, so it's hard to
propose text.  Namely, the client is a "signing oracle" in that it
will happily sign whatever is put in front of it ... but this is
believed to not be a problem, because the key used for the signing
depends on the message being signed in a way that is (hoped to be)
specific to this protocol and would not be usable in a different
context.  If that's correct, it's probably worth calling out the
message-dependence of the signing key and consequent non-transferability
of the signature.

> > Also in section 9, now on page 14, second paragraph, I'm not
> > entirely sure what is at risk of compromise, which also leaves me
> > confused as to whether the "non-" part of "non-negligible" is
> > correct.
> 
> That paragraph seems vague and could possibly be removed.  If an
> implementation doesn't derive the right encryption keys, it won't
> interop or match the (forthcoming) test vectors anyway.

I think it is probably safe to remove.

> > Just below, for the paragraph after the list of forbidden checksum
> > types, we talk of the EncryptedData messages having potential side
> > channels.  It seems that this may apply to both the encdata arm of
> > the PA-SPAKE CHOICE and the SpakeResponse factor; should we make
> > this more explicit?
> 
> Both of those use EncryptedData (and are the only uses of EncryptedData
> in the spec), so yes, it's intended to apply to both.  I'll propose this
> text:
> 
>     Both the size of the EncryptedData and the number of
>     EncryptedData messages used for second-factor data (including the
>     factor field of the SPAKEResponse message and messages using the
>     encdata PA-SPAKE choice) may reveal information about the second
>     factor used in an authentication.

Sounds good.

> > The next paragraph talks of an attacker being able to replay the
> > final message to any of the realm's KDCs, but does not comment
> > on whether multiple replays are possible at a given KDC.  (As I
> > understand it, MIT's lookaside cache would be expected to trigger
> > and not incur additional authentication being logged, but I don't
> > know that that's universal.)
> 
> I think this was my text, and my intent was to include both the same KDC
> and other KDCs.  I wouldn't expect the lookaside cache to be much
> protection, as the attacker could probably make small manipulations to
> the request so that it doesn't match byte-for-byte.  I'm not sure the
> text needs to make that any more explicit.

Okay.  (I do agree that the current text does apply to the same KDC
as well as other KDCs.)


> > Later in the same paragraph, maybe
> > s/instead/in contrast/, and specify that the key exchange is an
> > asymmetric one, since we do a lot of symmetric key exchange in Kerberos.
> 
> I will use "in contrast", but I'm not sure it would add clarity to say
> that the SPAKE key exchange is asymmetric--it does, after all, result in
> a symmetric key.

Fair enough.

> > The first paragraph/sentence of section 5 is rather ungainly; could
> > it be reworded and/or split in twain?
> 
> I will just remove the middle part, so that it says:
> 
>     Group elements are converted to octet strings using the
>     serialization method defined in the IANA "Kerberos SPAKE Groups"
>     registry created by this document.

That works for me!

> > Item 2 should probably also clarify that there is no trailing 0 included.
> 
> Disagree.  It would be very noisy to explicitly disclaim trailing 0
> bytes every time IETF standards talk about strings.

This is true, though I seem to see it fairly often.

> > Page 15, second-to-last paragraph, "is weaker than the secret key"
> > could include a secret key derived from a weak password, whose
> > brute-force resistance is quite low (and part of the justification
> > for this behavior).  It's probably better to talk about the key size
> > of the secret key and the strength attributed to keys of that size,
> > than just the strength of the key itself.
> 
> I propose:
> 
>     The selected group's resistance to offline brute-force attacks
>     may not correspond to the size of the reply key. For performance
>     reasons, a KDC MAY select a group whose brute-force work factor is
>     less than the reply key length. [...]

That addresses my concern, thanks.


Is there anything else we need to discuss before an -01 can be issued?

Thanks again,

Ben
