Re: [tcpinc] Genart last call review of draft-ietf-tcpinc-tcpcrypt-07

Daniel B Giffin <dbg@scs.stanford.edu> Sun, 22 October 2017 04:19 UTC

Date: Sat, 21 Oct 2017 21:18:55 -0700
From: Daniel B Giffin <dbg@scs.stanford.edu>
To: Dale Worley <worley@ariadne.com>
Cc: gen-art@ietf.org, draft-ietf-tcpinc-tcpcrypt.all@ietf.org, tcpinc@ietf.org, ietf@ietf.org, David Mazieres expires 2018-01-14 PST <mazieres-ddragqirgwht7ezx2d39a3jw72@temporary-address.scs.stanford.edu>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpinc/kvoGSM-yche4Y9IiLX9gXYLBI8U>

Thanks for the thorough review, Dale!

So that others can see how the draft will be changed to
address your comments, I'll respond to them inline below.

Dale Worley wrote:
> Reviewer: Dale Worley
> Review result: Ready with Nits
> 
> I am the assigned Gen-ART reviewer for this draft.  The General Area
> Review Team (Gen-ART) reviews all IETF documents being processed
> by the IESG for the IETF Chair.  Please treat these comments just
> like any other last call comments.
> 
> For more information, please see the FAQ at
> <https://wiki.tools.ietf.org/area/gen/wiki/GenArtfaq>.
> 
> Document:  review-draft-ietf-tcpinc-tcpcrypt-07
> Reviewer:  Dale R. Worley
> Review Date:  2017-10-18
> IETF LC End Date:  2017-10-19
> IESG Telechat date:  2017-10-26
> 
> Summary:
> 
>        This draft is basically ready for publication, but has nits
>        that should be addressed before publication.
> 
> * Major/global items:
> 
> 1. The construction _phrase_ is used in many places.  The construction
> "phrase" is also used frequently.  It's not clear if these have
> specific semantics, though _..._ seems to be used for defining
> instances of a term, and "..."  seems to be used around mathematical
> notation.  What syntax(es) are intended to be used in the RFC, and
> with what meaning(s)?  The RFC Editor can probably make recommendations
> here.

Yes, underscores are used to mark terms that are being
introduced for the first time and defined.

And quotation marks are used where logical names (generally
mathematical values or protocol field names) are embedded in
normal text, in order that they can be more easily parsed as
identifiers instead of English.  (I'm going through now and
making this slightly more consistent than it was.)

Note that our method of composing this document (mmark2rfc)
produces XML output (which we submit alongside the text
version of each draft) that can be used to set off these
identifiers in a different typeface if the format supports
it.  For example, the xml2rfc tool produces HTML output with
these identifiers in a "typewriter" face ... but I don't
know what the future of such formats is in the IETF.  The
quotation marks that are produced for plain text output look
a little busy, but do seem to achieve the goal of
disambiguation.  We're absolutely open to suggestions here.

> 
> 2. Section 1 should be updated to use the language of BCP 14 (RFC 8174)
> section 2.

Thanks, done!

> 
> 3. The term "key agreement scheme" doesn't seem to be used consistently.
> In a narrow sense, it seems to be used for the initial phases of the
> encryption.  In a broad sense, it seems to be used for the set of
> algorithm selections, key lengths, and magic numbers that are used by
> the tcpcrypt algorithm, a set identified by a particular TEP
> identifier.  The two can be confused, because it seems that only a few
> items in the set can be varied using the 4 defined TEP identifiers.
> But I reflexively assume that all of these parameters can be varied
> within the overall scheme of "tcpcrypt".
> 
> Is it the intention that the TEP identifier *only* specifies the key
> agreement scheme in the narrow sense, and we are *committing* to never
> varying the other parameters?  Or are we taking the more natural path
> that the TEP identifier specifies all of these parameters, but the
> currently defined values all specify the same values for all but one
> parameter?  In either case, we need to make the overall scheme clear
> early on and use the terminology consistently.

Thanks for pointing this out; it's true that there is some
confusing conflation between "key-agreement schemes" and
TEPs.

I've gone through and used something like "negotiated TEP"
in a couple places where the document said that a parameter
depended on the "negotiated key-agreement scheme", and also
added various phrases to make clear that the TEP dictates
all the parameters.

I've fixed the first paragraph of "3.2 Protocol
negotiation"; you'll find the new text below, under the
edit requests where you refer to that paragraph.

Lastly, section "5. Key agreement schemes" now begins like
this:

   The TEP negotiated via TCP-ENO indicates the use of one of the key-
   agreement schemes named in Table 2.  For example,
   "TCPCRYPT_ECDHE_P256" names the tcpcrypt protocol with key-agreement
   scheme ECDHE-P256 and the associated length parameters below.

   All the TEPs specified in this document require the use of HKDF-
   Expand-SHA256 as the CPRF, and these lengths for nonces and session
   keys:

                             N_A_LEN: 32 bytes
                             N_B_LEN: 32 bytes
                             K_LEN:   32 bytes

   If future documents assign additional TEPs for use with tcpcrypt,
   they may specify different values for the lengths above.  Note that
   the minimum session ID length required by TCP-ENO, together with the
   way tcpcrypt constructs session IDs, implies that "K_LEN" must have
   length at least 32 bytes.
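
The length implication in the last sentence can be checked with a
short Python sketch.  It assumes that the TCP-ENO minimum session ID
length is 33 bytes and that a session ID is one TEP byte followed by
CPRF(ss, CONST_SESSID, K_LEN), as in section 3.4 of the draft.  The
value of CONST_SESSID, the TEP byte in the usage note, and the helper
names are placeholders chosen for illustration.

```python
import hashlib
import hmac

MIN_SESSION_ID_LEN = 33   # assumed TCP-ENO minimum, in bytes
CONST_SESSID = b"\x02"    # placeholder constant (assumption)
K_LEN = 32                # from the length parameters above


def cprf(key: bytes, const: bytes, length: int) -> bytes:
    """HKDF-Expand-SHA256: first `length` bytes of T(1) | T(2) | ..."""
    if length > 255 * 32:
        raise ValueError("length exceeds 255 * hash length")
    out, t, n = b"", b"", 1
    while len(out) < length:
        t = hmac.new(key, t + const + bytes([n]), hashlib.sha256).digest()
        out += t
        n += 1
    return out[:length]


def session_id(tep_byte: int, ss: bytes, k_len: int = K_LEN) -> bytes:
    """session_id = TEP-byte | CPRF(ss, CONST_SESSID, K_LEN)."""
    sid = bytes([tep_byte]) + cprf(ss, CONST_SESSID, k_len)
    if len(sid) < MIN_SESSION_ID_LEN:
        # 1 + K_LEN >= 33 is only satisfied when K_LEN >= 32.
        raise ValueError("session ID shorter than the TCP-ENO minimum")
    return sid
```

With K_LEN = 32 the result is exactly 33 bytes long, for example
len(session_id(0x21, ss)) == 33 for any 32-byte ss (0x21 is only a
sample TEP byte here); any smaller K_LEN is rejected.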

> 
> 4. The positioning of the tables seems to be poor relative to the
> sections which refer to them.  Presumably the RFC Editor will clean
> that up.

I'm not sure what you mean here, as tables 2 and 3 probably
need to be in "IANA considerations", which immediately
follows the sections most closely related to those tables.

> 
> 5. Does draft-ietf-tcpinc-tcpeno require that the application can
> query the stack to find out whether encryption was established vs. the
> connection has fallen back to being unencrypted?

Yes (in section 5.1):

   Each TEP MUST define a session ID that is computable by both
   endpoints and uniquely identifies each encrypted TCP connection.
   Implementations MUST expose the session ID to applications via an API
   extension.  The API extension MUST return an error when no session ID
   is available because ENO has failed to negotiate encryption or
   because no connection is yet established.

> 
> 6. It might be worth adjusting the rules for how the A and B roles are
> carried forward during session resumption.  Of course, each host
> should compute the resumption identifier that it expects to receive
> based on the role it had in the previous session.  But it's not clear
> to me why a host that used k_ab for encryption (i.e., had the A role)
> in the previous session must also use k_ab for encryption in the
> resumed session, since the two sequences of k_ab/k_ba are generated
> from the different session keys of the two sessions.  If you made the
> choice of k_ab/k_ba be dependent on the A/B roles established by
> TCP-ENO for *this* session, it seems like the specification of the
> protocol would be a bit simpler.

I'll explain this design choice under separate cover, when I
have a moment ...

> 
> 7. In the encryption frame, it seems to me that the (unencrypted)
> control byte could be eliminated and the rekey flag put into the
> (encrypted) flags byte, if we define that rekey=1 means that rekeying
> takes effect on the *next* frame rather than the current one.
> However, that would eliminate the 7 reserved unencrypted flags the
> frame format now has, which might be useful in the future.  (I suspect
> that the usefulness of an unencrypted field in the frame is something
> that cryptographers understand but I don't.)

I suppose you're right, that would be a clever economy.
However, if it has been some time since you last sent data
and your security policy is not to encrypt any data under
sufficiently-stale keys, then it's good to have the ability
to go ahead with the new key immediately.

> 
> * Minor/editorial items:
> 
> Table of Contents
> 
>      11.1.  Normative References . . . . . . . . . . . . . . . . . .  24
>      11.2.  Informative References . . . . . . . . . . . . . . . . .  25
>    Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  25
> 
> The names of these three sections aren't capitalized like those of
> other sections.

Dang, yeah ... I prefer "Capitalize only first letter" for
readability, but the several RFCs I just peeked at capitalize
every word, so we'll assimilate.

> 
> 3.1.  Cryptographic algorithms
> 
>    o  A _collision-resistant pseudo-random function (CPRF)_ is used to
>       generate multiple cryptographic keys from a pseudo-random key,
>       typically the output of the extract function.  The CPRF is defined
>       to produce an arbitrary amount of Output Keying Material (OKM),
>       and we use the notation CPRF(K, CONST, L) to designate the first L
>       bytes of the OKM produced by the pseudo-random function identified
>       by key K on CONST.
> 
> It is unclear what "the pseudo-random function identified by key K"
> means, as only three functions have been identified to this point, and
> none of them seem to have identifiers.
> 
> It sounds like CPRF is defined to produce an endless stream of OKM
> based on two inputs, K and CONST -- T(1) | T(2) | T(3) | ... -- and
> CPRF(K, CONST, L) is the first L bytes of the stream.  If so, it seems
> to me that it would be clearer to say it in those terms:
> 
>    o  A _collision-resistant pseudo-random function (CPRF)_ is used to
>       generate multiple cryptographic keys from a pseudo-random key.
>       The CPRF produces an endless stream of Output Keying
>       Material (OKM), and we use the notation CPRF(K, CONST, L) to
>       designate the first L bytes of the OKM produced by the
>       CPRF when keyed with K and CONST.
> 
> --

Yep, thanks ... I've gone with something very similar:

   o  A _collision-resistant pseudo-random function (CPRF)_ is used to
      generate multiple cryptographic keys from a pseudo-random key,
      typically the output of the extract function.  The CPRF produces
      an arbitrary amount of Output Keying Material (OKM), and we use
      the notation CPRF(K, CONST, L) to designate the first L bytes of
      the OKM produced by the CPRF when parameterized by key K and the
      constant CONST.

> 
>    The Extract and CPRF functions used by default are the Extract and
>    Expand functions of HKDF [RFC5869].  
> 
> These functions don't have these roles "by default", but rather, these
> functions are specified for these roles by the four defined TEP
> identifiers, and indeed, there is no way to specify that other
> functions are to be used in these roles.  It seems more sensible to
> say something like
> 
>    The Extract and CPRF functions used with the tcpcrypt variants
>    defined in this document are the Extract and Expand functions of
>    HKDF [RFC5869].
> 
> Since you expand on what is in RFC 5869, it might be worth providing a
> reference for HMAC-Hash to RFC 2104.

Good points, both; I've changed that paragraph to:

   The Extract and CPRF functions used by the tcpcrypt variants defined
   in this document are the Extract and Expand functions of HKDF
   [RFC5869], which is built on HMAC [RFC2104].  These are defined as
   follows in terms of the function "HMAC-Hash(key, value)" for a
   negotiated "Hash" function such as SHA-256; the symbol | denotes
   concatenation, and the counter concatenated to the right of CONST
   occupies a single octet.

> 
> It doesn't seem to be stated here or in RFC 5869 that the value of the
> counter in the calculation of T(n) is n reduced modulo 256 -- there's
> no statement that after 0xFF is used to generate T(255), T(256) is
> generated using 0x00.  (Should that be specified here or put in an
> erratum to RFC 5869?)

Oh, good catch ... RFC 5869 actually addresses this with
the input constraint "L <= 255*HashLen".

So I've added a line to the bottom of our HKDF figure:

           HKDF-Extract(salt, IKM) -> PRK
              PRK = HMAC-Hash(salt, IKM)

           HKDF-Expand(PRK, CONST, L) -> OKM
              T(0) = empty string (zero length)
              T(1) = HMAC-Hash(PRK, T(0) | CONST | 0x01)
              T(2) = HMAC-Hash(PRK, T(1) | CONST | 0x02)
              T(3) = HMAC-Hash(PRK, T(2) | CONST | 0x03)
              ...

              OKM  = first L octets of T(1) | T(2) | T(3) | ...
              where L <= 255*OutputLength(Hash)
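
For readers who want to check the figure against running code, here
is a minimal Python sketch of the two functions using only the
standard library.  It assumes SHA-256 as the negotiated "Hash"; the
function names hkdf_extract and hkdf_expand are illustrative and not
taken from the draft.  The length check follows RFC 5869's input
constraint "L <= 255*HashLen", which is what keeps the counter within
a single octet.

```python
import hashlib
import hmac

HASH = hashlib.sha256            # assumed choice of "Hash"
HASH_LEN = HASH().digest_size    # 32 octets for SHA-256


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """PRK = HMAC-Hash(salt, IKM)."""
    return hmac.new(salt, ikm, HASH).digest()


def hkdf_expand(prk: bytes, const: bytes, length: int) -> bytes:
    """Return the first `length` octets of T(1) | T(2) | T(3) | ..."""
    if not 0 <= length <= 255 * HASH_LEN:
        raise ValueError("L must not exceed 255*OutputLength(Hash)")
    okm = b""
    t = b""        # T(0) is the empty string
    counter = 1
    while len(okm) < length:
        # T(n) = HMAC-Hash(PRK, T(n-1) | CONST | n), with n in one octet
        t = hmac.new(prk, t + const + bytes([counter]), HASH).digest()
        okm += t
        counter += 1
    return okm[:length]
```

In these terms, CPRF(K, CONST, L) corresponds to
hkdf_expand(K, CONST, L).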

> 
> 3.2.  Protocol negotiation
> 
>    Tcpcrypt depends on TCP-ENO [I-D.ietf-tcpinc-tcpeno] to negotiate
>    whether encryption will be enabled for a connection, and also which
>    key agreement scheme to use.
> 
> This doesn't really classify things correctly.  It should be something
> like
> 
>    Tcpcrypt depends on TCP-ENO [I-D.ietf-tcpinc-tcpeno] to negotiate
>    that encryption will be enabled for a connection, that tcpcrypt
>    will be used, and which cryptographic algorithms and parameters
>    tcpcrypt will use.

Right ... I've changed this paragraph so that it leaves
"key-agreement scheme" as a shorthand for all the associated
parameters (because the ECDH and KDF steps could be
considered two parts of key-agreement), but enumerates those
explicitly in a later sentence:

   Tcpcrypt depends on TCP-ENO [I-D.ietf-tcpinc-tcpeno] to negotiate
   whether encryption will be enabled for a connection, and also which
   key-agreement scheme to use.  TCP-ENO negotiates the use of a
   particular TCP encryption protocol or _TEP_ by including protocol
   identifiers in ENO suboptions.  This document associates four TEP
   identifiers with the tcpcrypt protocol, as listed in Table 2.  Each
   identifier indicates the use of a particular key-agreement scheme,
   with an associated CPRF and length parameters.  Future standards may
   associate additional TEP identifiers with tcpcrypt, following the
   assignment policy specified by TCP-ENO.

> 
> --
> 
>    This document adopts the terms
>    "host A" and "host B" to identify each end of a connection uniquely,
>    following TCP-ENO's designation.
> 
> You don't actually say that this document's use of A and B matches the
> A and B roles assigned by TCP-ENO.  If you mean it to, say
> 
>    This document uses the terms "host A" and "host B" to identify the
>    hosts that TCP-ENO designates as the A role and B role.

Okay.  The TCP-ENO document uses "host A" and "host B" too,
so I've gone with:

  TCP-ENO uses the terms "host A" and "host B" to identify
  each end of a connection uniquely, and this document
  employs those terms in the same way.

> 
> --
> 
>    ENO suboptions include a flag "v" ...
> 
> Might be better to phrase it "The ENO suboptions ..." to connect with
> the negotiation described in the preceding paragraph.

We really mean "every ENO suboption", so I've changed this
sentence to "An ENO suboption includes a flag `v` ..."

> 
>    In order to
>    propose session resumption (described further below) with a
>    particular TEP, a host sends a variable-length suboption containing
>    the TEP identifier, the flag "v = 1", and an identifier for a session
>    previously negotiated with the same host and the same TEP.
> 
> Probably better to say "an identifier derived from a session previously
> negotiated...".

I've changed this to, "an identifier derived from a session
secret previously negotiated ..."

> 
> 3.3.  Key exchange
> 
>    o  "PK_A", "PK_B": ephemeral public keys for hosts A and B,
>       respectively.
> 
> The use of "PK" for a public key seems to be poorly mnemonic, as it is
> also the acronym of "private key".  There ought to be standard (and
> distinct!) abbreviations for these phrases, but I can't find any...

Yeah ... at least in this document we don't name private
keys, as they are never transmitted.

> 
>    The particular master key in use is advanced as described in
>    Section 3.8.
> 
> Presumably, "The first master key used is mk[0], and use advances to
> successive master keys as described in section 3.8." -- we have a
> series of master keys, so the keys are numbers, and so a key *itself*
> cannot "advance", what advances is something which selects/uses one of
> the series of master keys.
> 

Right.  I'll use, "The process of advancing through the
series of master keys is described in section 3.8."

> You probably want to index k_ab and k_ba by the index of the mk they
> are generated from:
> 
>                   k_ab[i] = CPRF(mk[i], CONST_KEY_A, ae_keylen)
>                   k_ba[i] = CPRF(mk[i], CONST_KEY_B, ae_keylen)
> 
> and similarly for all uses of k_ab and k_ba.

Yes, the hairy bit is that there is a series of session
secrets `ss[i]` that you walk through by session resumption,
and within each session there are master keys `mk[j]` that
you walk through by re-keying.

We once *doubly* indexed the keys.  But you've inspired me
to try the middle way, which is to make the session index
implicit for keys but always explicitly give the "generation
number", `j`.  I think this works reasonably well and is
less mysterious.

> 
> 3.4.  Session ID
> 
>    As required, a tcpcrypt session ID begins with the negotiated TEP
>    identifier along with the "v" bit as transmitted by host B.  The
>    remainder of the ID is derived from the session secret, as follows:
> 
>         session_id[i] = TEP-byte | CPRF(ss[i], CONST_SESSID, K_LEN)
> 
> This might be better phrased
> 
>    As required, a tcpcrypt session ID begins with the byte transmitted
>    by host B that contained the negotiated TEP identifier along with
>    the "v" bit.  The remainder of the ID is derived from the session
>    secret for this session, ss:
> 
>         session_id = TEP-byte | CPRF(ss, CONST_SESSID, K_LEN)
> 
> Exactly how you describe the TEP-byte depends on the terminology
> established in draft-ietf-tcpinc-tcpeno, but it seems that that draft
> doesn't define a term for "the byte that carries v and the TEP
> identifier".
> 
>    Finally, each master key "mk" is used to generate keys for
>    authenticated encryption for the "A" and "B" roles.  Key "k_ab" is
>    used by host A to encrypt and host B to decrypt, while "k_ba" is used
>    by host B to encrypt and host A to decrypt.
> 
>                   k_ab = CPRF(mk, CONST_KEY_A, ae_keylen)
>                   k_ba = CPRF(mk, CONST_KEY_B, ae_keylen)
> 
> Though this needs to be written more carefully:  Which key is used by
> each host is not determined by its A/B role in *this* connection, but
> by the role it had in the first session in the resumption-sequence of
> which this session is a part.  See the second-to-last paragraph in
> section 3.5.  (You may want to introduce terms for those two roles.)

Good catch.  I've changed these paragraphs to:

   Finally, each master key "mk[j]" is used to generate keys for
   authenticated encryption:

               k_ab[j] = CPRF(mk[j], CONST_KEY_A, ae_keylen)
               k_ba[j] = CPRF(mk[j], CONST_KEY_B, ae_keylen)

   In the first session derived from fresh key-agreement, keys "k_ab[j]"
   are used by host A to encrypt and host B to decrypt, while keys
   "k_ba[j]" are used by host B to encrypt and host A to decrypt.  In a
   resumed session, as described more thoroughly below in Section 3.5,
   each host uses the keys in the same way as it did in the original
   session, regardless of its role in the current session: for example,
   if a host played role "A" in the first session, it will use keys
   "k_ab[j]" to encrypt in each derived session.

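The key-selection rule in the revised paragraphs can be sketched in
Python as follows.  This is only an illustration under stated
assumptions: the byte values of CONST_KEY_A and CONST_KEY_B and the
16-byte ae_keylen are placeholders, and frame_keys is a hypothetical
helper name.  The point it shows is that the choice between k_ab[j]
and k_ba[j] follows the role a host had in the session derived from
fresh key agreement, not its role in a resumed session.

```python
import hashlib
import hmac
from typing import Tuple

CONST_KEY_A = b"\x04"   # placeholder constant (assumption)
CONST_KEY_B = b"\x05"   # placeholder constant (assumption)
AE_KEYLEN = 16          # assumed AEAD key length in bytes


def cprf(key: bytes, const: bytes, length: int) -> bytes:
    """HKDF-Expand-SHA256: first `length` bytes of T(1) | T(2) | ..."""
    if length > 255 * 32:
        raise ValueError("length exceeds 255 * hash length")
    out, t, n = b"", b"", 1
    while len(out) < length:
        t = hmac.new(key, t + const + bytes([n]), hashlib.sha256).digest()
        out += t
        n += 1
    return out[:length]


def frame_keys(mk_j: bytes, original_role: str) -> Tuple[bytes, bytes]:
    """Return (encrypt_key, decrypt_key) for master key mk[j].

    original_role is "A" or "B": the role this host played in the
    first session, which may differ from its role in a resumed one.
    """
    k_ab = cprf(mk_j, CONST_KEY_A, AE_KEYLEN)
    k_ba = cprf(mk_j, CONST_KEY_B, AE_KEYLEN)
    if original_role == "A":
        return k_ab, k_ba
    if original_role == "B":
        return k_ba, k_ab
    raise ValueError("original_role must be 'A' or 'B'")
```

A host that was "A" originally encrypts with the key its peer uses to
decrypt, whichever side opens the resumed connection.
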
> 
> 3.5.  Session resumption
> 
>    When two hosts have already negotiated session secret "ss[i-1]", they
>    can establish a new connection without public-key operations using
>    "ss[i]".  A host signals willingness to resume with a particular
>    session secret by sending a SYN segment with a resumption suboption:
>    that is, an ENO suboption containing the negotiated TEP identifier
>    from the original session and part of an identifier for the session.
> 
>    The resumption identifier is calculated from a session secret "ss[i]"
>    as follows:
> 
>                  resume[i] = CPRF(ss[i], CONST_RESUME, 18)
> 
> I don't like the phrasing here because it depends on ss[i] being
> within a larger sequence of session secrets without ever describing it
> as such.
> 
> I don't think you mean "negotiated TEP identifier from the original
> session [when PRK was computed]".  You might mean "negotiated TEP
> identifier from the previous session", but it seems from later
> paragraphs, you mean "negotiated TEP identifier of the new session",
> because the later paragraphs seem to show v = 1 as mandatory, which is
> true of the new session but not necessarily true of the previous
> session.
> 
> Paragraph 1 mentions "an identifier for the session" but paragraph 2
> says "The resumption identifier".
> 
> I think you want to phrase this paragraph something like this:
> 
>    When two hosts have already negotiated a session with a particular
>    session secret, they can establish a new connection without
>    public-key operations using the next session secret in the sequence
>    derived from the original PRK.  A host signals willingness to
>    resume with a particular new session secret by sending a SYN
>    segment with a resumption suboption:  that is, an ENO suboption
>    whose value is the negotiated TEP identifier of the session
>    concatenated with half of the "resumption identifier" for the
>    session.
> 
>    The resumption identifier is calculated from a session secret "ss"
>    as follows:
> 
>                  resume = CPRF(ss, CONST_RESUME, 18)
> --

Looks great, I've adopted this language almost verbatim.

> 
>    If a passive opener recognizes the identifier-half in a resumption
>    suboption it has received and knows "ss[i]"
> 
> It seems like "and knows ss[i]" is redundant.  This could be more
> clearly stated:
> 
>    If a passive opener recognizes the identifier-half as being derived
>    from a session secret and PRK that it has cached, 

Good idea; I've used:

   If a passive opener receives a resumption suboption containing an
   identifier-half it recognizes as being derived from a session secret
   that it has cached, it SHOULD (with exceptions specified below) agree
   to resume from the cached session by sending its own resumption
   suboption, which will contain the other half of the identifier.

> 
> --
> 
>    If it does not agree to resumption with a particular TEP
> 
> It's best not to start a paragraph with "it" as a subject.  And what
> is the significance of "with a particular TEP"?  It seems better to
> say
> 
>    If the passive opener does not agree to resumption, it may either
>    ...
> 

Okay, I've put the noun before the pronoun, but had to leave
"with a particular TEP" in order to make clear that you can
request "same TEP but fresh key-exchange".

   If the passive opener does not agree to resumption with a particular
   TEP, it may either request fresh key exchange by responding with a
   non-resumption suboption using the same TEP, or else respond to any
   other received suboption.

> --
> 
>    Implementations that perform session caching MUST provide a means for
>    applications to control session caching, including flushing cached
>    session secrets associated with an ESTABLISHED connection or
>    disabling the use of caching for a particular connection.
> 
> What is "session caching"?  What is the significance of the term
> "ESTABLISHED"?  And "disabling the use of caching" seems to be
> ambiguous -- does it mean that nothing will be read from the cache
> (session resumption will not be accepted for this session) or that
> nothing will be written to the cache (no later session can be a
> resumption of this session)?  I suspect this paragraph hasn't been
> updated from using the terminology of an earlier version.

Good questions.  I've changed this paragraph to:

   Implementations that cache session secrets MUST provide a means for
   applications to control that caching.  In particular, when an
   application requests a new TCP connection, it must be able to specify
   that during the connection no session secrets will be cached and all
   resumption requests will be ignored in favor of fresh key exchange.
   And for an established connection, an application must be able to
   cause any cache state that was used in or resulted from establishing
   the connection to be flushed.
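
The passive-opener lookup and the flush requirement might be sketched
together like this.  It is a rough Python illustration, not the
draft's API: the class and method names are hypothetical, the value of
CONST_RESUME is a placeholder, and which half a given host sends is
left as a parameter because this message does not spell it out.

```python
import hashlib
import hmac
from typing import Dict, Optional, Tuple

CONST_RESUME = b"\x06"   # placeholder constant (assumption)
RESUME_LEN = 18          # from resume = CPRF(ss, CONST_RESUME, 18)
HALF = RESUME_LEN // 2   # each host sends one 9-byte half


def cprf(key: bytes, const: bytes, length: int) -> bytes:
    """HKDF-Expand-SHA256: first `length` bytes of T(1) | T(2) | ..."""
    if length > 255 * 32:
        raise ValueError("length exceeds 255 * hash length")
    out, t, n = b"", b"", 1
    while len(out) < length:
        t = hmac.new(key, t + const + bytes([n]), hashlib.sha256).digest()
        out += t
        n += 1
    return out[:length]


def resume_halves(ss: bytes) -> Tuple[bytes, bytes]:
    """Split the 18-byte resumption identifier into its two halves."""
    resume = cprf(ss, CONST_RESUME, RESUME_LEN)
    return resume[:HALF], resume[HALF:]


class ResumptionCache:
    """Index cached session secrets by the half the peer should send."""

    def __init__(self) -> None:
        self._by_half: Dict[bytes, Tuple[bytes, bytes]] = {}

    def remember(self, ss: bytes, peer_sends_first_half: bool) -> None:
        first, second = resume_halves(ss)
        if peer_sends_first_half:
            expected, reply = first, second
        else:
            expected, reply = second, first
        self._by_half[expected] = (ss, reply)

    def lookup(self, received_half: bytes) -> Optional[Tuple[bytes, bytes]]:
        """Return (ss, reply_half) on a hit, else None (fresh key exchange)."""
        return self._by_half.get(received_half)

    def flush(self, ss: bytes) -> None:
        """Drop every entry derived from `ss`, e.g. on application request."""
        self._by_half = {h: v for h, v in self._by_half.items() if v[0] != ss}
```

A hit gives the passive opener both the cached secret and the other
half to send back; a miss, or a flushed entry, leaves it to request
fresh key exchange.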

> 
> 3.8.  Re-keying
> 
>    A host SHOULD NOT initiate more than one concurrent re-key operation
>    if it has no data to send; that is, it should not initiate re-keying
>    with an empty encryption frame more than once while its record of the
>    remote generation number is less than its own.
> 
> I think you meant "consecutive" here instead of "concurrent".  But that
> still isn't the rule you want, since a host may have to perform two
> consecutive keepalives without sending any data between them.  I'm not
> sure how you want to state this condition.  Perhaps something like
> 
>    A host SHOULD NOT initiate a re-key operation if it has sent no
>    data since the last re-key operation unless sufficient time has
>    passed to require a keep-alive as described in Section 3.9.

The phrase "concurrent re-key operation" is perhaps
heavy-handed.  It refers to the process of sending "re-key"
and awaiting the "re-key" flag in response, after which the
two hosts have the same generation number.

The idea is that you shouldn't re-key again (i.e.,
"concurrently" with the first request) until your peer has
responded in kind, which it is required to do as soon as it
receives your message.

And if there has been no response -- whether your primary
intention was to advance the key schedule or to probe for
liveness -- then you have your answer about liveness and
there should be no need to perform a following keep-alive.

So ... I *think* ... the paragraph works as written.
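
To make the rule concrete, here is a small Python model of the
generation-number bookkeeping.  It is a sketch only: the class and
method names are hypothetical, and it models nothing beyond the
"no second empty re-key while one is outstanding" rule and the
obligation to answer a re-key in kind.

```python
class RekeyState:
    """Tracks local and remote generation numbers for one connection."""

    def __init__(self) -> None:
        self.local_gen = 0
        self.remote_gen = 0

    def rekey_outstanding(self) -> bool:
        # The peer has not yet answered our last re-key in kind.
        return self.remote_gen < self.local_gen

    def try_send_rekey(self, has_data: bool) -> bool:
        """Advance our generation unless this would be a second empty re-key."""
        if not has_data and self.rekey_outstanding():
            return False
        self.local_gen += 1
        return True

    def on_rekey_received(self) -> bool:
        """Record the peer's re-key; return True if we owe a re-key in reply."""
        self.remote_gen += 1
        return self.local_gen < self.remote_gen
```

A host that sends one empty re-key frame and hears nothing back is
refused a second one by try_send_rekey(has_data=False); once the
peer's re-key arrives, the generations match and it may re-key again.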

> 
> 4.1.  Key exchange messages
> 
>                   8
>               +--------+-------+-------+---...---+-------+
>               |nciphers|sym-   |sym-   |         |sym-   |
>               | =K+1   |cipher0|cipher1|         |cipherK|
>               +--------+-------+-------+---...---+-------+
> 
> Generally when a sequence is 0-indexed, you would identify the count
> (nciphers) as "K" and the items as "sym-cipher0" through "sym-cipherK-1".
> Or probably better, "sym-cipher[0]" through "sym-cipher[K-1]", giving
> 
>                   8
>               +--------+---------+---------+---...---+-----------+
>               |nciphers|sym-     |sym-     |         |sym-       |
>               | =K     |cipher[0]|cipher[1]|         |cipher[K-1]|
>               +--------+---------+---------+---...---+-----------+

Duh, of course.

And bless you for drawing out the ascii for me!
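
With the count named K and the items indexed 0 through K-1, encoding
and decoding the list is straightforward.  The Python sketch below
assumes each sym-cipher identifier occupies one byte (the diagram only
labels the width of "nciphers"), and the function names are made up
for illustration.

```python
from typing import List


def encode_cipher_list(sym_ciphers: List[int]) -> bytes:
    """Emit nciphers (= K, one byte) followed by sym-cipher[0..K-1]."""
    k = len(sym_ciphers)
    if k > 255:
        raise ValueError("nciphers must fit in one byte")
    # bytes() rejects any identifier outside 0..255 (one-byte assumption).
    return bytes([k]) + bytes(sym_ciphers)


def decode_cipher_list(buf: bytes) -> List[int]:
    """Parse nciphers and return sym-cipher[0..K-1] as integers."""
    if not buf:
        raise ValueError("missing nciphers")
    k = buf[0]
    if len(buf) < 1 + k:
        raise ValueError("truncated cipher list")
    return list(buf[1:1 + k])
```

For instance, three sample identifiers [1, 2, 16] encode to the four
bytes 03 01 02 10, and decoding those bytes returns the same list.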

> 
>    When sending "Init1", implementations of this protocol MUST omit the
>    field "ignored"; that is, they must construct the message such that
>    its end, as determined by "message_len", coincides with the end of
>    the field "PK_A".
> 
> Maybe better to say
> 
>    Implementations of this protocol MUST construct "Init1" with the
>    field "ignored" of zero length.
> 
> Ditto for Init2.

Great, yes.

> 
> 8.  Security considerations
> 
>    If it can be
>    established that the session IDs computed at each end of the
>    connection match, then tcpcrypt guarantees that no man-in-the-middle
>    attacks occurred unless the attacker has broken the underlying
>    cryptographic primitives (e.g., ECDH).  A proof of this property for
>    an earlier version of the protocol has been published [tcpcrypt].
> 
> Is there a known/defined/standard way to perform such a comparison?
> If this is valuable enough to be mentioned, it seems like tcpcrypt
> should incorporate a way of doing it.
> 
> [END]
> 
> 

The idea is that, instead of specifying a particular set of
authentication schemes as part of this protocol (as TLS does
for example), applications can use arbitrary means to
compare session IDs and thus achieve communication security.
As examples, a public-key infrastructure can be used to sign
session IDs; or session IDs can be compared out-of-band and
even after-the-fact in order to audit the security of past
communications.

This property is shared with any protocol that uses TCP-ENO,
and should probably be expounded on a little more clearly
there.  It looks like we streamlined things a little too
far.
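
As one illustration of "arbitrary means", an application could
authenticate the session ID with a secret the two parties already
share.  The Python sketch below swaps in an HMAC over a pre-shared key
in place of the public-key signature mentioned above; the function
names and the role labels are hypothetical, and nothing here is
specified by tcpcrypt or TCP-ENO.

```python
import hashlib
import hmac


def auth_tag(psk: bytes, label: bytes, session_id: bytes) -> bytes:
    """Tag binding a party's label to the session ID it computed."""
    return hmac.new(psk, label + session_id, hashlib.sha256).digest()


def verify_peer(psk: bytes, peer_label: bytes,
                local_session_id: bytes, received_tag: bytes) -> bool:
    """Check the peer's tag against the session ID computed locally.

    If an attacker sat in the middle, the two ends hold different
    session IDs, so the expected and received tags differ.
    """
    expected = auth_tag(psk, peer_label, local_session_id)
    return hmac.compare_digest(expected, received_tag)
```

Each side sends auth_tag(psk, its_own_label, its_session_id) over the
connection and checks the other's tag with verify_peer; distinct
labels keep a tag from simply being reflected back to its sender.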

Thanks again for the very close reading of this document,
Dale.  You turned up a bunch of misleading wording and even
a couple outright mistakes.

daniel