[tcpm] Security concerns with relative timestamp exposure through TCP ISNs

Aaron Rainbolt <arraybolt3@gmail.com> Tue, 29 October 2024 22:00 UTC

Date: Tue, 29 Oct 2024 17:00:23 -0500
From: Aaron Rainbolt <arraybolt3@gmail.com>
To: tcpm@ietf.org
Message-ID: <20241029170023.2ed90044@kf-ir16>
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="Sig_/jhA5.7gIOZ1x5CDGcEYueo+"; protocol="application/pgp-signature"; micalg="pgp-sha512"
Message-ID-Hash: K6CZXAD53CQR3NKOIKCG3OLYDAOULY62
CC: protocol-vulnerability@ietf.org, adrelanos@kicksecure.com
Precedence: list
Subject: [tcpm] Security concerns with relative timestamp exposure through TCP ISNs
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/FzXQv-QmlQyMi-hbvGpg9uCB5Zs>

Greetings, and thanks for your time.

I am a developer contributing to the Kicksecure and Whonix projects. As
part of my work, I did some research into potential security issues
related to how TCP Initial Sequence Numbers (ISNs) are generated
according to RFC9293. This is a report of my findings, potential
concerns, and ways in which these concerns may be able to be
alleviated. I've copied the lead Kicksecure developer on this email.

According to RFC9293 section 3.4.1, TCP ISNs must be generated as
follows:

> TCP initial sequence numbers are generated from a number sequence
> that monotonically increases until it wraps, known loosely as a
> "clock". This clock is a 32-bit counter that typically increments at
> least once every roughly 4 microseconds, although it is neither
> assumed to be realtime nor precise, and need not persist across
> reboots. The clock component is intended to ensure that with a
> Maximum Segment Lifetime (MSL), generated ISNs will be unique since
> it cycles approximately every 4.55 hours, which is much longer than
> the MSL. Please note that for modern networks that support high data
> rates where the connection might start and quickly advance sequence
> numbers to overlap within the MSL, it is recommended to implement the
> Timestamp Option as mentioned later in Section 3.4.3.
>
> A TCP implementation MUST use the above type of "clock" for
> clock-driven selection of initial sequence numbers (MUST-8), and
> SHOULD generate its initial sequence numbers with the expression:
>
> ISN = M + F(localip, localport, remoteip, remoteport, secretkey)
>
> where M is the 4 microsecond timer, and F() is a pseudorandom
> function (PRF) of the connection's identifying parameters ("localip,
> localport, remoteip, remoteport") and a secret key ("secretkey")
> (SHLD-1). F() MUST NOT be computable from the outside (MUST-9), or an
> attacker could still guess at sequence numbers from the ISN used for
> some other connection. The PRF could be implemented as a
> cryptographic hash of the concatenation of the TCP connection
> parameters and some secret data.

The algorithm described here provides good reliability without making
it easy for an attacker to guess TCP ISNs. However, it exposes a
relative timestamp to both sides of the connection, data which may seem
non-sensitive but which is likely exploitable in attacks.
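
To make the exposure concrete, here is a minimal sketch of the RFC's expression and of how two ISNs observed for the same 4-tuple reveal elapsed time on the sender's clock. The HMAC-SHA256 PRF, the helper names (`prf`, `make_isn`, `elapsed_us`) and the example addresses are illustrative assumptions, not any real stack's implementation.

```python
import hashlib
import hmac
import os

TICK_US = 4            # "roughly 4 microseconds" per tick, per the RFC text
MASK32 = 0xFFFFFFFF    # ISNs are 32-bit values and wrap

def prf(secret_key, local_ip, local_port, remote_ip, remote_port):
    """F(): one possible PRF, a keyed hash of the 4-tuple cut to 32 bits."""
    msg = f"{local_ip}|{local_port}|{remote_ip}|{remote_port}".encode()
    digest = hmac.new(secret_key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")

def make_isn(clock_us, secret_key, four_tuple):
    """ISN = M + F(...), modulo 2**32, with M counted in 4 us ticks."""
    m = (clock_us // TICK_US) & MASK32
    return (m + prf(secret_key, *four_tuple)) & MASK32

def elapsed_us(isn_a, isn_b):
    """Sender-clock time between two ISNs of the same 4-tuple (wrap-aware)."""
    return ((isn_b - isn_a) & MASK32) * TICK_US

key = os.urandom(32)                                 # hypothetical secret key
conn = ("192.0.2.1", 443, "198.51.100.7", 50000)     # documentation addresses
first = make_isn(1_000_000_000, key, conn)           # sender clock: 1000.0 s
second = make_isn(1_002_500_000, key, conn)          # sender clock: 1002.5 s
print(elapsed_us(first, second))                     # microseconds elapsed
```

Because F() is constant for a fixed 4-tuple and key, it cancels in the subtraction, so an observer who knows neither the key nor F() still recovers the clock difference.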

Relative timestamp data can, among other things, be used to determine
the relative temperature and system load of a remote machine based on
clock skew measurements. This is described in Steven J. Murdoch's "Hot
or Not: Revealing Hidden Services by their Clock Skew".[1] The
referenced paper describes several ways in which this capability can
be used to an attacker's advantage. The primary issue covered is the
ability to de-anonymize Tor hidden services, but it also covers
geolocation and data transmission via a covert channel. The paper
relies on TCP timestamps. However, as each peer in a connection sends
its ISN to the other, and a fresh ISN can be obtained simply by opening
and closing a new connection, I believe ISNs are a suitable alternative
to TCP timestamps. TCP ISNs are particularly useful in this context as
TCP timestamps can be disabled, but TCP ISNs cannot.
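
As a rough illustration of how skew might be estimated from ISNs, the sketch below unwraps a series of 32-bit ISNs collected on one 4-tuple and fits a line against the observer's own clock. It uses ordinary least squares for brevity rather than any method from the paper, ignores network jitter, and assumes a 4 microsecond tick and samples spaced less than one counter wrap apart. The function names and the synthetic data are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def unwrap(isns):
    """Turn wrapping 32-bit ISNs into a running tick count from the first one."""
    total, prev, out = 0, isns[0], []
    for value in isns:
        total += (value - prev) & MASK32   # forward distance modulo 2**32
        prev = value
        out.append(total)
    return out

def skew_ppm(local_times_s, isns, tick_s=4e-6):
    """Least-squares slope of remote vs. local elapsed time, minus 1, in ppm."""
    remote = [ticks * tick_s for ticks in unwrap(isns)]
    local = [t - local_times_s[0] for t in local_times_s]
    n = len(local)
    mean_x = sum(local) / n
    mean_y = sum(remote) / n
    sxx = sum((x - mean_x) ** 2 for x in local)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(local, remote))
    return (sxy / sxx - 1.0) * 1e6

# Synthetic data: a remote clock running 50 ppm fast, sampled once a minute
# for an hour, starting near the top of the 32-bit range to force a wrap.
local_times = [60.0 * i for i in range(61)]
samples = [(0xFFFFF000 + round(t * (1 + 50e-6) / 4e-6)) & MASK32
           for t in local_times]
estimate = skew_ppm(local_times, samples)
print(estimate)
```

On real traffic the samples would be noisy, so far more of them (and a more robust fit) would be needed than in this idealized case.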

The most obvious way I can think to exploit this is to use it to
distinguish computationally intensive operations from computationally
cheap operations, even if both take the same approximate amount of
time. For instance, an authentication system for a server may run a
computationally intensive constant-time password hashing algorithm if
a provided username is present on the system, but simply sleep for the
duration of time needed to run that algorithm if a provided username
does not exist on the system. In theory, this would be immune to
timing attacks while reducing resource consumption. In practice,
however, the ability to extract relative timestamps via TCP ISNs could
be used to subvert this by analyzing the remote system's clock skew.
The skew
would be noticeably different for intensive operations than for cheap
operations, which could then be used to determine if the username in
question existed on the remote system. Obviously, a single login
attempt would probably not be enough to change the clock skew of the
victim system enough to matter, so the attacker would likely have to
make many login requests in parallel over a relatively long period of
time to achieve the desired effect. The "hot or not" paper mentions that
the capacity of the covert channel offered by the attack is
approximately 2 to 8 bits per hour. Since each username probe yields
one bit (present or absent), an attacker could use this technique to
enumerate usernames at a rate of about 2 to 8 usernames per hour
(possibly higher due to the clock speed used for ISN generation),
assuming the remote server was not put under heavy load by another
influence.[2]

Another way this flaw could be used is for low-speed, hidden data
transfer. A "sender" machine can run any arbitrary TCP server (even one
that does not send or receive any visible data), while the "receiver"
machine can run a simple TCP client that connects to the sender using a
particular localip/localport/remoteip/remoteport combo repeatedly. The
receiver need only record the sender's ISN from each connection, after
which it can close the connection. The sender can now
transmit data (albeit very slowly and inefficiently) to the receiver by
modulating its system load. The receiver will be able to observe the
system load patterns based on clock skew, and extract a binary sequence
from that pattern corresponding to the modulated system load of the
sender. In this way a machine could transfer sensitive data to another
machine without having to send visible data through the TCP connection.
Again, the speed of the transfer would only be about 2 to 8 bits per
hour, so this would mostly be useful for exfiltrating things such as
encryption keys from an already compromised server.
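
For a sense of scale, the arithmetic below estimates how long such a channel would take to move a key at the quoted 2 to 8 bits per hour. The 256-bit key size is an assumption chosen for illustration, and error correction overhead is ignored.

```python
def hours_to_send(bits, bits_per_hour):
    """Time needed to move `bits` over a channel of the given capacity."""
    return bits / bits_per_hour

KEY_BITS = 256  # assumed key size, for illustration only

slowest = hours_to_send(KEY_BITS, 2)   # at 2 bits per hour
fastest = hours_to_send(KEY_BITS, 8)   # at 8 bits per hour
print(f"{fastest:.0f} to {slowest:.0f} hours")
```

That works out to between 32 and 128 hours, which is slow but plausible for a long-lived compromise.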

Note that I have not actually tested the feasibility of these attacks
yet; however, they seem like they would work in theory. I believe I have
the necessary equipment to test these in practice, and can try this if
it would be helpful.

Ultimately, to mitigate this attack method, it is necessary to avoid
embedding any recognizable clock (relative or otherwise) in TCP
transmissions. However, this poses a challenge, as the monotonic nature
of TCP ISN generation is very useful for avoiding confusion between old
and new incarnations of a TCP connection. There are a couple of
possible solutions I can think of that may help here:

* RFC6528 mentions that "[s]imple random selection of the TCP ISNs
  would mitigate those attacks that require an attacker to guess valid
  sequence numbers." This would naturally make it unsafe to quickly
  start a new connection after one was just closed, however, as the new
  ISN has a significant chance of being too close to an already-used
  sequence number to be handled properly (data from old and new
  connections could get mixed up). That said, this problem would likely
  vanish if TCP sequence numbers used significantly more bits than they
  do currently. With 256-bit sequence numbers, the chance of randomly
  generating an ISN in the same 32-bit neighborhood as another sequence
  number is 1 in 2**224, which
  would make collisions very unlikely. Even if a machine established a
  new connection to a remote machine 1000 times a second for 1000
  years, the chances of even a single sequence number "close call"
  would only be about 1 in 8.55 × 10**53 (in reality it would be a bit
  less than that if there were multiple packets circulating in the
  network at any given time, but this would not be of any practical
  worry
  as far as I can tell). This would incur an additional 28 bytes of
  overhead per packet, however.
* The fact that the ISN generation algorithm necessarily involves a
  secret key means that the first ISN used for any particular
  connection upon device reboot will be effectively equivalent to a
  random 32-bit value. With this in mind, the ISN clock (4 microsecond
  ticks in the RFC, 64 ns in Linux) could be
  replaced with a simple counter, which would increment by one for
  every packet sent in a connection that used a particular
  localip/localport/remoteip/remoteport combo. This counter would not
  be reset after a connection was disconnected, so that in the event of
  rapid reconnections, the new ISNs would still increase monotonically.
  In the event of a reboot, the counter will reset, but so will the
  secret key, so the first ISN generated for each connection will
  appear entirely randomized, just like it is with the current
  specification. To my awareness, a counter-based scheme such as this
  wouldn't be usable for malicious purposes beyond what is already
  possible, as both sides of a TCP connection can simply count packets
  if they want a counter value. Depending on the implementation
  however, it may be possible for some counter values to be skipped,
  which could potentially expose network issues or be used as a covert
  channel. This would also consume additional memory between reboots,
  a problem which could be rectified by occasionally resetting the
  secret key and freeing the memory used by all counters.
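
Both proposals can be sketched briefly. The first part below evaluates the "close call" odds for random 256-bit ISNs, assuming 365-day years and a simple union bound against a single other sequence number. The second part is a hypothetical counter-based generator in which the clock term M is replaced by a per-4-tuple packet counter. The class name, method names and HMAC-SHA256 PRF are illustrative assumptions, not an existing implementation.

```python
import hashlib
import hmac
import os
from collections import defaultdict
from fractions import Fraction

MASK32 = 0xFFFFFFFF

# Part 1: odds of a random 256-bit ISN landing in a given 32-bit neighborhood.
NEIGHBORHOOD_ODDS = Fraction(2**32, 2**256)        # 1 in 2**224
CONNECTIONS = 1000 * 60 * 60 * 24 * 365 * 1000     # 1000 per second, 1000 years
close_call_odds = CONNECTIONS * NEIGHBORHOOD_ODDS  # union-bound estimate
print(f"about 1 in {float(1 / close_call_odds):.3g}")

# Part 2: counter-based ISNs, ISN = counter + F(4-tuple, key) mod 2**32.
class CounterIsnGenerator:
    def __init__(self):
        self.secret_key = os.urandom(32)   # fresh key at every "boot"
        self.counters = defaultdict(int)   # kept after connections close

    def _prf(self, four_tuple):
        msg = "|".join(map(str, four_tuple)).encode()
        digest = hmac.new(self.secret_key, msg, hashlib.sha256).digest()
        return int.from_bytes(digest[:4], "big")

    def packet_sent(self, four_tuple):
        """Advance the counter once per packet sent on this 4-tuple."""
        self.counters[four_tuple] += 1

    def next_isn(self, four_tuple):
        return (self.counters[four_tuple] + self._prf(four_tuple)) & MASK32

    def rekey(self):
        """Occasional reset: new key, and all counter memory is freed."""
        self.secret_key = os.urandom(32)
        self.counters.clear()

gen = CounterIsnGenerator()
conn = ("192.0.2.1", 443, "198.51.100.7", 50000)   # documentation addresses
isn_before = gen.next_isn(conn)
for _ in range(10):                                 # ten packets on the old connection
    gen.packet_sent(conn)
isn_after = gen.next_isn(conn)
print((isn_after - isn_before) & MASK32)            # advance equals packets sent
```

Under these assumptions the first part comes to roughly 1 in 8.5 × 10**53. In the second part, successive ISNs for a 4-tuple differ only by the number of packets sent, which both peers can already count.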

Worthy of note, there already exists a Linux kernel module, tirdad[3],
which replaces the TCP ISN generation routines in the Linux kernel with
random number generators. The author of tirdad has their own article on
the rationale behind this.[4] At least on the relatively slow
connection I had when I tested it, and with the very lightweight
workload I used for testing (web browsing), this didn't seem to cause
any serious issues over short periods of time. It is almost certainly
not suitable for all TCP use cases, but it did seem to be "close
enough" for brief basic desktop use.

Thank you for taking the time to read this, and I hope this is helpful.

Sincerely,
Aaron

References:
[1] https://murdoch.is/papers/ccs06hotornot.pdf
[2] The "hot or not" paper mentions the possibility of using TCP
sequence numbers for attacks, but then states that "as the
cryptographic function is re-keyed every 5 minutes, maintaining long
term clock skew figures is non-trivial." I have not been able to find
this rekeying happening for ISN generation in the Linux source code
(looking at the tip of the master branch at the time of this writing),
and when doing testing on Debian 12 with a kernel patched to print ISNs
to the kernel log as they were generated, I did not see any resetting
happening. I therefore believe this is a problem with Linux's
implementation
at the very least. Even if an implementation did do rekeying for ISN
generation, I believe this is still a spec issue as the spec does not
mandate occasional rotation of the key used in the ISN generation
algorithm.
[3] https://github.com/0xsirus/tirdad/
[4] https://bitguard.wordpress.com/2019/09/03/an-analysis-of-tcp-secure-sn-generation-in-linux-and-its-privacy-issues/
