Re: [tcpm] Benjamin Kaduk's Discuss on draft-ietf-tcpm-rfc793bis-25: (with DISCUSS and COMMENT)

Benjamin Kaduk <kaduk@mit.edu> Tue, 22 March 2022 10:57 UTC

Date: Tue, 22 Mar 2022 03:57:31 -0700
From: Benjamin Kaduk <kaduk@mit.edu>
To: Wesley Eddy <wes@mti-systems.com>
Cc: The IESG <iesg@ietf.org>, draft-ietf-tcpm-rfc793bis@ietf.org, tcpm-chairs@ietf.org, tcpm@ietf.org, Michael Scharf <michael.scharf@hs-esslingen.de>
Message-ID: <20220322105731.GB13021@mit.edu>
References: <163236803976.28405.5643771942452620510@ietfa.amsl.com> <6f6e7b90-081a-74b4-b329-8879addcb8c4@mti-systems.com> <20220225210031.GV12881@kduck.mit.edu> <96c51cc2-512e-f126-1022-b84ce896d08b@mti-systems.com>
In-Reply-To: <96c51cc2-512e-f126-1022-b84ce896d08b@mti-systems.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/Tz9l_OqAvK8CJz3P0nFHiMJAh3w>
Subject: Re: [tcpm] Benjamin Kaduk's Discuss on draft-ietf-tcpm-rfc793bis-25: (with DISCUSS and COMMENT)

Hi Wes,

Sorry for the slow reply here; I took some time to gather my thoughts
about the core topic that bothered me, and wrote some proposed text on
which I tried to solicit informal feedback before sending it here (but
that solicitation timed out, incurring a bit more delay).

On Mon, Feb 28, 2022 at 09:52:29PM -0500, Wesley Eddy wrote:
> First, on the one DISCUSS point remaining:
> 
> On 2/25/2022 4:00 PM, Benjamin Kaduk wrote:
> 
> > In essence, I think that we require a fairly strong justification to
> > publish an Internet Standard in 2022 that says it's okay to adopt a data
> > model where a host has a global piece of state that it freely sends to
> > anyone who asks, where that piece of state can be used to attack/disrupt
> > all new connections that host makes, as opposed to just connections on the
> > 5-tuple that asked.
> 
> I agree with Joe's explanation on the applicability of the concern being 
> a bit more nuanced, since it's important for some things (like BGP 
> sessions, which 5961 was written in response to), but less so for 
> shorter-lived connections, hosts that aren't servers, etc.

I'll quote and reply to Joe (and Michael) later, and while I agree that
there is more nuance than "randomized ISN MUST be used, always", I don't
think that we want to be in the habit of making IETF protocols provide
weaker security properties based on the type of traffic we think they'll be
conveying.  (Experience has shown that we typically guess wrong, or at
least incompletely, when we guess what types of traffic we'll convey.)

> 
> > The actual scope of utility of an ISN is local to an individual 5-tuple,
> > not global to a host, and false sharing of ISNs across connections adds
> > risk.  My stance as SEC AD is that the IETF should produce protocols that
> > are as secure as possible *subject to any constraints on them*.  Here, we
> > have the constraint of retaining compatibility with the existing deployment
> > base (which is a very compelling constraint!), so you do not see me
> > advocating for mandatory tcpcrypt (for example).  But I want to see some
> > explanation of what harm or risk there is in saying that the ISN is always
> > produced with the PRF of the 5-tuple (okay, 4-tuple since the protocol is
> > always TCP), to motivate a divergence from the more-secure behavior and
> > thereby justify retaining the behavior currently in the draft.  What
> > constraint are we subject to that prevents doing the right (security-wise)
> > thing?
> >
> I think that the WG would have to decide on this, since it's more than 
> just an editorial matter.
> 
> A fair change might be to leave SHOULD, but be more clear about why it 
> isn't felt to justify a MUST presently, if that seems to make sense to you?

That could be a promising approach.  I'll note that the recent IESG has
been more vocal about asking for "SHOULD" to be treated as "MUST with
escape clause", that is, we should be able to identify in some manner
(not necessarily an explicit enumeration) when the "not-SHOULD" route would
be appropriate to take.  For example, I quote from
https://datatracker.ietf.org/doc/draft-ietf-dnsop-svcb-https/ballot/#draft-ietf-dnsop-svcb-https_francesca-palombini
(the DISCUSS portion in particular):

  If SHOULD is used, then it must be accompanied by at least one of:
  (1) A general description of the character of the exceptions and/or in what
  areas exceptions are likely to arise.  Examples are fine but, except in
  plausible and rare cases, not enumerated lists.
  (2) A statement about what should be done, or what the considerations are,
  if the "SHOULD" requirement is not met.
  (3) A statement about why it is not a MUST.

Unfortunately, I'm not very well placed to provide
that answer myself, so it's something of a gap in the text I propose below.


Upon further reflection, my main issue with the current text is that
(especially within the historical context of implementation behavior) it
is very hazy about whether the "clock" for ISN is global within a TCP
implementation or scoped to the connection it is to be used on (recalling
that §3.4.1 starts with "a connection is defined by a pair of sockets"),
and, arguably more importantly, about the security properties when the
clock is global to the implementation.  In particular, in "When new
connections are
created, an initial sequence number (ISN) generator is employed that
selects a new 32 bit ISN", it's clear that "when new connections are
created" refers to multiple distinct connections, but "an [ISN] generator
is employed" discusses only a single generator, as if it is shared across
those connections.

I like the way that the first paragraph of §3.4.1 clearly frames the
problem (or at least the part of it that the subsequent discussion focuses
on) as relating to assigning segments within a connection to a particular
incarnation of that connection (i.e., rejecting segments from a previous
incarnation).  Its last sentence is a good lead-in to the problems that ISN
selection needs to address.  But the next few paragraphs leave some things
lacking in terms of accurately reflecting the security properties of the
choices available.  I do understand that if the constraint is to prevent
"stale" segments from being processed even when a TCP endpoint "loses all
knowledge of the sequence numbers it has been using", then practical
implementation considerations force the use of such global state, and that
if the overhead of a PRF evaluation is too high, a single global "clock" is
basically the only option.  But that does not have to make it the default
option that we present, and we can do better about clearly explaining the
risk of ISN guessability when the single global "clock" is used.

So I would propose to replace the second through fourth paragraphs of
the section (counting the formula from RFC 6528 as being inside of the
fourth paragraph) with something more like the following (further edits
welcome):

   To avoid confusion we must prevent segments from one incarnation of a
   connection from being used while the same sequence numbers may still
   be present in the network from an earlier incarnation.  However, the
   design of the TIME-WAIT state means that actual information about
   previously used sequence numbers will be discarded, and a robust
   system will provide some protection against processing segments from
   a previous incarnation of the connection even when such
   sequence-number state has been discarded or lost (e.g., due to restart).
   The procedure for selecting an initial sequence number (ISN) for an
   instance of a connection needs to be designed so as to minimize the
   likelihood of sequence number reuse for that connection.  This
   procedure also needs to account for the security issues that result
   if an off-path attacker is able to predict or guess ISN values [43],
   as a special case of the attacks possible by guessing in-window
   sequence number values at any point in a connection's lifecycle.
   The logical independence of sequence numbers across connections,
   combined with the ease with which an attacker can probe the ISN
   generation algorithm on an individual connection and data
   minimization best practices, implies that ISN generation should
   produce unrelated values for different connections at any given point
   in time.

   Within a given connection, the likelihood of sequence number reuse is
   minimized by ensuring that the full space of potential ISN values is
   exhausted before reusing any individual ISN value.  In keeping with
   RFC 793, this specification achieves that behavior using a number
   sequence that monotonically increases until it wraps, known loosely
   as a "clock".  This clock is a 32-bit counter that typically
   increments at least once every roughly 4 microseconds, although it is
   neither assumed to be realtime nor precise, and need not persist
   across reboots.  The clock component is intended to ensure that within
   a Maximum Segment Lifetime (MSL), generated ISNs will be unique,
   since it cycles approximately every 4.55 hours, which is much longer
   than the MSL.

   The combined properties of connection independence and clock
   dependence of ISNs can be achieved with only a small amount of
   per-implementation state by combining a global clock with a
   pseudorandom function of the connection identifiers and a secret key;
   the following procedure from [43] SHOULD be used for ISN selection:

   ISN = M + F(secretkey, localip, localport, remoteip, remoteport)

   where M is the 4 microsecond timer, and F() is a pseudorandom
   function (PRF) of the connection's identifying parameters ("localip,
   localport, remoteip, remoteport") and a secret key ("secretkey")
   (SHLD-1).  F() MUST NOT be computable from the outside (MUST-9), or
   an attacker could still guess at sequence numbers from the ISN used
   for some other connection.  The PRF could be implemented as a
   cryptographic hash of the concatenation of the TCP connection
   parameters and some secret data.  For discussion of the selection of
   a specific hash algorithm and management of the secret key data,
   please see Section 3 of [43].

   In cases where the above procedure is not used, the selection of
   initial sequence numbers for a connection still MUST incorporate a
   clock-driven update as described above into its generation of initial
   sequence numbers (MUST-8).
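
To make the formula above concrete, here is a minimal sketch in Python.
It is an illustration only, not proposed draft text.  HMAC-SHA-256
truncated to 32 bits stands in for F(); that choice, the function names,
and the use of a monotonic timer for M are assumptions made for the
example and do not come from [43].

```python
import hashlib
import hmac
import ipaddress
import struct
import time

MASK32 = 0xFFFFFFFF  # ISNs live in a 32-bit space and wrap


def clock_4us(now=None):
    """M: a 32-bit counter that ticks about once every 4 microseconds."""
    if now is None:
        now = time.monotonic()  # assumption: any steadily advancing timer works
    return int(now * 250_000) & MASK32


def prf_offset(secret_key, local_ip, local_port, remote_ip, remote_port):
    """F(): keyed pseudorandom function of the 4-tuple (HMAC stand-in)."""
    msg = (
        ipaddress.ip_address(local_ip).packed
        + struct.pack("!H", local_port)
        + ipaddress.ip_address(remote_ip).packed
        + struct.pack("!H", remote_port)
    )
    digest = hmac.new(secret_key, msg, hashlib.sha256).digest()
    return struct.unpack("!I", digest[:4])[0]


def select_isn(secret_key, local_ip, local_port, remote_ip, remote_port, now=None):
    """ISN = M + F(secretkey, localip, localport, remoteip, remoteport)."""
    m = clock_4us(now)
    f = prf_offset(secret_key, local_ip, local_port, remote_ip, remote_port)
    return (m + f) & MASK32
```

For a fixed key and 4-tuple, successive calls advance with the clock, so
the per-connection sequence still behaves like the RFC 793 "clock" while
the offset differs between 4-tuples.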

> 
> And now, closing the loop on a few COMMENTs:
> 
> >
> > Section 3.8.4
> >
> >      An implementation SHOULD send a keep-alive segment with no data
> >      (SHLD-12); however, it MAY be configurable to send a keep-alive
> >      segment containing one garbage octet (MAY-6), for compatibility with
> >      erroneous TCP implementations.
> >
> > Such misbehaved TCP implementations were misbehaved even in 1989 when
> > RFC 1122 was published -- do we have a sense for whether they are still
> > around to any significant degree?
> >> That's a great question; I don't know, but would highly doubt they are
> >> around anymore (at least not in production on the Internet).
> > Might be an interesting experiment, but it sounds like we don't have enough
> > data to justify changing the text here (yet).
> 
> As a follow-up action, we can bring this to attention in TCPM and see if 
> anyone is interested in helping to gather data on whether any 
> implementations still need this.  That would then let us see if this can 
> be removed in an update.  But anyways, probably shouldn't block this doc 
> (which I think you agree with, since this was just a COMMENT, not 
> DISCUSS topic).

Right, no need to block this document on this topic.

> 
> >>> Section 4
> >>>
> >>>      Destination Address
> >>>              The network layer address of the remote endpoint.
> >>>      [...]
> >>>      Source Address
> >>>              The network layer address of the sending endpoint.
> >>>
> >>> These definitions don't seem to work in the context of a receiver
> >>> validating the TCP checksum, where the destination address is the local
> >>> endpoint's address and the source address is the remote endpoint's
> >>> address.  (I note that these definitions are different from what RFC 793
> >>> itself used.)
> >> Interesting point on the destination (though the source one seems fine,
> >> it's always the sending endpoint).  Maybe we should just change "remote"
> >> to "receiving" in the destination one?
> > That's an easy fix; good idea.
> >
> In rev -27, I changed this to:
> 
>     Destination Address
>             The network layer address of the endpoint intended to receive
>             a segment.
> 
> 
> 
> >>>      The more secure Initial Sequence Number generation algorithm from RFC
> >>>      6528 was incorporated.  See RFC 6528 for discussion of the attacks
> >>>      that this mitigates, as well as advice on selecting PRF algorithms
> >>>      and managing secret key data.
> >>>
> >>> (As I mentioned up in §3.4, that guidance is no longer current.)
> >> Is updating 6528 agreed to be possible future work?
> > Yes.  (Is this something where I should be expecting to take the action
> > item to write a draft, or do you think someone else might take up the pen?)
> 
> It might be worthwhile to succinctly describe the proposed update to 
> TCPM, and see what the interest level is.  If you have bandwidth to do 
> so, you'd probably be best suited, but otherwise, I'd also be happy to 
> bring it up on-list and at the next meeting (per chairs approval), as 
> part of closing-out this 793bis work.

Okay, I'll make a note in my local todo list that there could be some work
in this space.
The main headline concerns would be something like switching from MD5 to
siphash, which covers both the "more modern hash function" aspect and
"incorporate the secret key in a cryptographically robust manner": 6528
suggests "a cryptographic hash of the concatenation of the connection-id
and some secret data" but best practices have evolved so as to put the
secret data as the first input digested by the hash function to avoid the
internal hash state being controlled by attacker-supplied data.
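
As a rough sketch of the difference in Python: the first function below
follows the 6528-style ordering (connection-id first, secret last), and
the second uses a hash with a dedicated key input, so the secret is
absorbed before any attacker-influenced bytes.  Python's standard library
has no SipHash, so keyed BLAKE2s is used as a stand-in, and SHA-256
replaces MD5 in the first function; both substitutions and the function
names are assumptions for illustration only.

```python
import hashlib
import struct


def offset_concat(conn_id: bytes, secret: bytes) -> int:
    """hash(connection-id || secret), truncated to 32 bits."""
    digest = hashlib.sha256(conn_id + secret).digest()
    return struct.unpack("!I", digest[:4])[0]


def offset_keyed(conn_id: bytes, secret: bytes) -> int:
    """Keyed hash: the secret is processed first via the key parameter.

    BLAKE2s accepts keys of at most 32 bytes.
    """
    digest = hashlib.blake2s(conn_id, key=secret, digest_size=4).digest()
    return struct.unpack("!I", digest)[0]
```

Either function yields a 32-bit offset that could play the role of F();
the point of the second form is only the order in which key and
connection data reach the hash.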

> 
> 
> >>> Section 3.8.6.3
> >>>
> >>>      Note that there are several current practices that further lead to a
> >>>      reduced number of ACKs, including generic receive offload (GRO), ACK
> >>>      compression, and ACK decimation [26].
> >>>
> >>> Reference [26] seems reasonable for ACK decimation and ACK compression,
> >>> but doesn't seem to cover GRO at all.
> >> True; do you think we should try to find a proper GRO reference? I'm not
> >> aware of there being a real standard or academic type of reference.
> > I wouldn't put much effort into looking.
> > It's a bit outside my normal field, so I don't have any immediate insight.
> > Google finds the DPDK docs, which might be stable and well-known, and
> > Wikipedia's https://en.wikipedia.org/wiki/TCP_offload_engine entry has a
> > brief mention, with links to an LWN article and
> > https://books.google.com/books?id=C3wQBwAAQBAJ  .  If we know someone with
> > a copy of the latter, it might be worth checking if it would be a useful
> > reference.
> 
> This one looks good to me, and I can include it in the next revision, if 
> you agree:
> 
> https://www.kernel.org/doc/html/latest/networking/segmentation-offloads.html

That looks reasonable to me.  I suspect the RFC Editor would prefer a more
stable reference than "latest", but will let them worry about it.

> 
> 
> >>> Section 3.9.2.2
> >>>
> >>>      Soft Errors
> >>>        For ICMP these include: Destination Unreachable -- codes 0, 1, 5,
> >>>        Time Exceeded -- codes 0, 1, and Parameter Problem.
> >>>
> >>>        For ICMPv6 these include: Destination Unreachable -- codes 0 and 3,
> >>>        Time Exceeded -- codes 0, 1, and Parameter Problem -- codes 0, 1,
> >>>        2.
> >>>
> >>>        Since these Unreachable messages indicate soft error conditions,
> >>>
> >>> I'm not entirely sure that I'd classify "parameter problem" as an
> >>> "unreachable" message per se.
> >> I'm not sure I understand this comment.  This is the list from RFC 5461.
> > It looks like I was misparsing on first read, since there were commas
> > serving different roles.  ("Parameter Problem" is an element in the outer
> > list, matched with "Destination Unreachable" and "Time Exceeded", not one
> > of the codes for Time Exceeded.)  Using semicolons as delimiters for the
> > "outer" list, with commas for the lists of type codes, would help
> > distinguish them.
> >
> I see; semicolons will be in the next revision.
> 
> 
> 
> >>> Section 3.10.1
> >>>
> >>>            the parameters of the incoming SYN segment.  Verify the
> >>>            security and DiffServ value requested are allowed for this
> >>>            user, if not return "error: precedence not allowed" or "error:
> >>>            security/compartment not allowed."  If passive enter the LISTEN
> >>>
> >>> It's surprising for the error string to mention "precedence" when the
> >>> predicate is DiffServ value.
> >> This is a good point.  When the definition of those bits was changed, it
> >> doesn't seem like the error message indicated here was correspondingly
> >> changed.
> > Is this something we need to leave alone for backwards compatibility, or do
> > we have leeway to change the error message here?  (It looks like this text
> > is unchanged in the -27.)
> 
> I don't think there's much of a compatibility issue, since the socket 
> API works differently than this anyways.  We could change "precedence" 
> to "DSCP".

OK.

> 
> >>>               *  If this connection was initiated with a passive OPEN
> >>>                  (i.e., came from the LISTEN state), then return this
> >>>                  connection to LISTEN state and return.  The user need
> >>>                  not be informed.  If this connection was initiated
> >>>                  with an active OPEN (i.e., came from SYN-SENT state)
> >>>                  then the connection was refused, signal the user
> >>>                  "connection refused".  In either case, all segments on
> >>>                  the retransmission queue should be removed.  And in
> >>>
> >>> IIUC, what's described here as "removed" is described elsewhere as
> >>> "flushed"; it would be good to use consistent terminology when possible.
> >> Do you have a strong preference?
> > No strong preference, just a general desire to have consistent terminology
> > unless there is a difference in meaning intended.
> >
> >
> Ok, I can change this to say that the retransmission queue should be 
> flushed in the next revision.
> 
> 
> 
> >>> Section 4
> >>>
> >>>      internet datagram
> >>>              The unit of data exchanged between an internet module and the
> >>>              higher level protocol together with the internet header.
> >>>
> >>> "exchanged between an internet module and the higher level protocol"
> >>> sounds like a local operation; I would have expected the definition of
> >>> an *internet* datagram to involve transfer over the (inter)network.
> >> Ok, can you suggest an alternative definition?  I agree this one from
> >> the original 793 isn't terrific.
> > I'm somewhat reluctant to concoct a new definition from scratch at this
> > point in the process.  I see that RFC 1983 has "IP datagram" refer to just
> > "datagram", on which it says:
> >
> >     datagram
> >        A self-contained, independent entity of data carrying sufficient
> >        information to be routed from the source to the destination
> >        computer without reliance on earlier exchanges between this source
> >        and destination computer and the transporting network.  See also:
> >        frame, packet.
> >        [Source: J. Postel]
> >
> > If I was to overcome my reluctance and try to revise the existing text here
> > into something that makes more sense, it would be:
> >
> > internet datagram
> >    A unit of data [from some higher layer protocol] exchanged between
> >    internet hosts, together with the internet header that allows the
> >    datagram to be routed from source to destination.
> >
> > (I'm not sure whether I'd include the bit in square brackets.)
> > Feel free to use mine if you like, or go with Postel's (per RFC 1983) or
> > something else.
> Yours looks good to me, and I'll queue it for the next revision.

Okay, thanks.


Let me pull in bits from Joe and Michael now.


On Fri, Feb 25, 2022 at 02:40:14PM -0800, touch@strayalpha.com wrote:
> 
> > On Feb 25, 2022, at 1:00 PM, Benjamin Kaduk <kaduk@mit.edu> wrote:
> > 
> > On Sun, Jan 09, 2022 at 11:02:12PM -0500, Wesley Eddy wrote:
> >> I think your question would be a good one to bring up in TCPM for future 
> >> work, but the working group was trying to avoid such changes in this 
> >> document.
> > 
> > I would like to see some stronger justification for pushing this change to
> > future work rather than incorporating it now.
> 
> The primary issue is IOT devices. Requiring a clock or strong randomization simply isn’t always warranted.

Er, hasn't a "clock" (~4us timer update) always been MUST-level required?

Are you saying that best-practice security protections are not warranted
for IOT devices simply because they are IOT devices, or saying that the
computational burden of randomization is excessive for (certain) IOT
devices (or something else)?  I would be interested in hearing details of
scenarios for which (e.g.) SipHash is an excessive burden, noting that the
paper (http://cr.yp.to/siphash/siphash-20120918.pdf) explicitly considers
8-bit microcontrollers in the design discussion.

> > In essence, I think that we require a fairly strong justification to
> > publish an Internet Standard in 2022 that says it's okay to adopt a data
> > model where a host has a global piece of state that it freely sends to
> > anyone who asks, where that piece of state can be used to attack/disrupt
> > all new connections that host makes, as opposed to just connections on the
> > 5-tuple that asked.
> 
> That’s not quite what’s happening. Just because the ISN doesn’t use either a clock or strong randomization doesn’t mean it can be guessed for new connections.

Again, I am assuming that a clock is used, since that's a MUST-level
requirement.  Are you saying that MUST-8 is ignored in practice?  My
comparisons have been between clock-without-randomization and
clock-with-randomization.

I'm assuming that "clock-without-randomization" means "one single clock
shared across all 4-tuples used by the host"; is that incorrect?

> > The actual scope of utility of an ISN is local to an individual 5-tuple,
> > not global to a host, and false sharing of ISNs across connections adds
> > risk.  
> 
> You need to consider the overall risk, which is:
> 
> a) that you open a connection and learn an ISN
> AND
> b) that you can attack other connections to that host (i.e., as an off-path attacker), notably by knowing the entire 4-tuple
> 
> (On-path attackers can simply watch whatever ISN you PRF anyway).

Yes, I agree that describes the overall risk.
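
A toy model of that risk, as a hedged Python sketch: it compares a single
clock shared across all 4-tuples with the same clock plus a keyed
per-connection offset.  The addresses, key, tick counts, and function
names are hypothetical, and HMAC-SHA-256 is only a stand-in PRF.

```python
import hashlib
import hmac
import struct

MASK = 0xFFFFFFFF  # ISNs are 32-bit values


def isn_clock_only(clock, four_tuple):
    """One global clock shared by every connection; the 4-tuple is ignored."""
    return clock & MASK


def isn_clock_plus_prf(clock, four_tuple, key):
    """Global clock plus a keyed per-connection offset."""
    digest = hmac.new(key, repr(four_tuple).encode(), hashlib.sha256).digest()
    return (clock + struct.unpack("!I", digest[:4])[0]) & MASK


def attacker_guess(probe_isn, ticks_elapsed):
    """Extrapolate from an ISN the attacker saw on its own connection."""
    return (probe_isn + ticks_elapsed) & MASK


# Hypothetical 4-tuples using documentation address ranges.
attacker_conn = ("192.0.2.1", 80, "198.51.100.7", 40000)
victim_conn = ("192.0.2.1", 80, "203.0.113.9", 51515)
key = b"host-secret-for-illustration"
t0, elapsed = 123_456_789, 500

# Clock only: step (a) reveals the ISN of any other new connection exactly.
probe = isn_clock_only(t0, attacker_conn)
assert attacker_guess(probe, elapsed) == isn_clock_only(t0 + elapsed, victim_conn)

# Clock plus PRF: the extrapolation only tracks the attacker's own 4-tuple.
probe = isn_clock_plus_prf(t0, attacker_conn, key)
guess = attacker_guess(probe, elapsed)
assert guess == isn_clock_plus_prf(t0 + elapsed, attacker_conn, key)
print("guess matches victim ISN:",
      guess == isn_clock_plus_prf(t0 + elapsed, victim_conn, key))
```

In the clock-only case the attacker still needs the victim's 4-tuple for
step (b), but the sequence number itself is no longer a secret.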

> The latter still needs to be “in window” at least, so the vulnerability is already somewhat mitigated.
> 
> Finally, the vulnerability is one that can be mitigated by the end system, e.g., by using a clock. So at best, it shoots itself in the foot.
> 
> I see no good reason to raise the bar for IoT implementations simply to prevent them from doing something with such local impact.

I'm not proposing to block IoT implementations from doing things that open
themselves to attack -- I'm proposing to be careful about what we give as
baseline expectations in an Internet Standard.

> > My stance as SEC AD is that the IETF should produce protocols that
> > are as secure as possible *subject to any constraints on them*.
> 
> Let’s please be clear that this is NOT *security*. Security would be TCP-AO or IPsec (as noted in RFC6528, sec 4).

"security" is not a boolean; it's a (multi-dimensional!) spectrum of risks
and tradeoffs.  There can still be value in increasing the cost to an
attacker for achieving a certain result, even if the undesirable result is
not fully prevented. My goal here is to make Internet Standard TCP as
secure as possible without breaking compatibility or violating other
constraints on the protocol/ecosystem, and to provide justifications for
when we do need to diverge from best-possible security properties.

(As an aside, "IoT" as a term is also about as vague as "security"; more
concrete representations of the constrained-device scenarios believed to be
relevant tend to be more actionable than just saying "IoT".  Even a
reference to C0/C1/C2 from RFC 7228 is more useful than just "IoT".)

> > Here, we
> > have the constraint of retaining compatibility with the existing deployment
> > base (which is a very compelling constraint!), so you do not see me
> > advocating for mandatory tcpcrypt (for example).  But I want to see some
> > explanation of what harm or risk there is in saying that the ISN is always
> > produced with the PRF of the 5-tuple (okay, 4-tuple since the protocol is
> > always TCP), to motivate a divergence from the more-secure behavior and
> > thereby justify retaining the behavior currently in the draft.  What
> > constraint are we subject to that prevents doing the right (security-wise)
> > thing?
> 
> We already have RFC6528 that specifies this as a SHOULD.
> 
> My question to you is “what has changed since 2012 that makes you think this is now appropriate for a MUST”.

Per above, I'm no longer pushing strongly for random ISN as MUST in this
document.  (Part of why is that, on further reflection, the 6528 expression
is not the only way to achieve the desired properties, and there's no need
to mandate a single mechanism for interoperability.)  But I do think that
BCP 188 (from 2014) has changed our understanding of the ways in which we
need to defend Internet protocols and what our baseline expectations should
be for the properties our protocols provide.

> AFAICT, that’s where the burden of proof is, not on us to defend keeping a SHOULD.

I think there is still some burden to document why the SHOULD is not a
MUST, i.e., in what cases the non-recommended behavior is appropriate.



On Mon, Feb 28, 2022 at 11:54:48AM +0000, Scharf, Michael wrote:
> Chiming in as TCPM co-chair and document shepherd...
> 
> > > Here, we
> > > have the constraint of retaining compatibility with the existing deployment
> > > base (which is a very compelling constraint!), so you do not see me
> > > advocating for mandatory tcpcrypt (for example).  But I want to see some
> > > explanation of what harm or risk there is in saying that the ISN is always
> > > produced with the PRF of the 5-tuple (okay, 4-tuple since the protocol is
> > > always TCP), to motivate a divergence from the more-secure behavior and
> > > thereby justify retaining the behavior currently in the draft.  What
> > > constraint are we subject to that prevents doing the right (security-wise)
> > > thing?
> > 
> > We already have RFC6528 that specifies this as a SHOULD.
> >
> > My question to you is “what has changed since 2012 that makes you think
> > this is now appropriate for a MUST”.
> 
> It was clear consensus in TCPM *not* to realize SHOULD/MUST modifications of the TCP protocol in 793bis, but instead to limit the document to protocol mechanisms that already have IETF consensus.
> 
> It was also TCPM consensus that stacks that correctly implement existing TCP standards prior to 793bis should not become incompatible to the TCP standards once 793bis is published.

IMHO it would be valuable to concretely document these assumptions in
793bis itself, to appropriately set the expectation of the reader as to
what content is being presented.  It's unfortunate that these facts were
not documented in the shepherd writeup and/or last-call announcement, as
they would have been relevant in shaping the IETF review of the document.
In other words, 793bis currently calls out that it consolidates the many
updates to the protocol into a single comprehensive specification, but does
not specifically call out that it attempts to exclude other changes and
modernization such that it should be considered "state of the art as of
2012" rather than "state of the art as of 2022".

Thanks,

Ben