Re: QUIC-LB

Mikkel Fahnøe Jørgensen <mikkelfj@gmail.com> Sat, 18 May 2019 11:44 UTC

From: Mikkel Fahnøe Jørgensen <mikkelfj@gmail.com>
In-Reply-To: <CAM4esxR9J01j_yXCzfaMyZrm-yfDSxi6nh=TSh5viWYSHNFgMQ@mail.gmail.com>
References: <CAM4esxR9J01j_yXCzfaMyZrm-yfDSxi6nh=TSh5viWYSHNFgMQ@mail.gmail.com>
Date: Sat, 18 May 2019 04:44:19 -0700
Message-ID: <CAN1APdd+PuA17Gf7a78HySZ37QFs1C+L=mKjaTKN+7W+d7v22A@mail.gmail.com>
Subject: Re: QUIC-LB
To: IETF QUIC WG <quic@ietf.org>, Martin Duke <martin.h.duke@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/r2fbeLtXLgpnvbQ4SukA8kyJ0_M>

Thanks for making this work. On a first pass over the text I was originally
concerned about insider off-path attackers, but the auth token deals with
that. I’m not sure that a further complication of a fully handshaked LB
protocol would be beneficial.

Note that a load balancer might have many different users that do not trust
each other and which are potentially off-path attackers towards one
another. The load balancer might have to deal with many auth tokens at
once. This is not necessarily a problem, but something to keep in mind. If
the load balancer decides on the token, it is probably easy to manage.

If the off-path insiders can see each other’s traffic, they can also see the
auth tokens and steal them. This could be very bad. Well, technically
off-path attackers cannot see traffic, but that is a simplistic assumption
since switches can easily be manipulated to divert traffic temporarily. I
believe there needs to be some additional signature validation on the LB
messages, like an HMAC key or similar.

I see the point of setting a QUIC version for LBv1, but it defeats the
stated design goal of making it hide among QUIC packets. I’d remove that
design goal. The real purpose of the packaging appears to be to make the
packets compatible with the traffic on the same network interfaces.

I have not carefully read the section on message types, but the overall
design seems simple enough and easy to add as an extension to an existing
QUIC server implementation.

Summarising what I write below: I recommend replacing the obfuscated
divisor-based algorithm with an efficient reversible integer hash
algorithm. For the ciphers I recommend merging them into a single construct
controlled by configuration parameters, and I suggest encrypting the entire
CID, possibly with an extra block that only the server needs to deal with.
This will automatically protect additional server info bits. This makes the
encryption a padded one or two block cipher with optional auth tag encoded
as encrypted 0 padding. The 0 padding is simple and useful, but relatively
weak. It would be simple to design something stronger that would allow a
load balancer to filter more bad traffic before it hits a server.

Mikkel


Some minor and major feedback from reading through
https://tools.ietf.org/html/draft-duke-quic-load-balancers-04

Sect. 1. :
"QUIC allows servers (or load balancers) to designate an initial
connection ID"


This could be misread as the original connection ID. I think the text
needs to clarify that this ID is assigned after a server endpoint has
been chosen randomly, and that the first assigned ID ensures that the
load balancer maintains that path.


Sect. 1. :

"In the absence of any shared state between load balancer and server,
the load balancer must maintain a relatively expensive table of
server-generated connection IDs, and will not route packets correctly
if they use a connection ID that was originally communicated in a
protected NEW_CONNECTION_ID frame."


I’m not sure you can say in general that such packets would not be routed
correctly; rather, the load balancer cannot guarantee correct routing in
all cases.


Very minor:
Sect. 3.: Terms not explained: SCID (explained later), middlebox, 0/1-RTT.
The dash is used inconsistently (0RTT vs. 1-RTT). A reference to the
invariants could clarify SCID.

A discussion on QUIC versions, or at least variants, would be useful before
assuming CID placement at offset 1 in 3.1.1.

Sect. 3.2.1.: Typo: "routing mask that with more than".
Sect. 3.2.1.: Complex formulation; I suggest dropping the first of these
two paragraphs:

"The load balancer MUST NOT select a routing mask that with more than
126 routing bits set to 1, which allows at least 2 bits for config
rotation (see Section 5
<https://tools.ietf.org/html/draft-duke-quic-load-balancers-04#section-5>)
and 16 for server purposes in a maximum-length connection ID.

The first two bits of an SCID MUST NOT be routing bits; these are
reserved for config rotation."

The latter paragraph fully captures the design and is much easier to understand.
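
The two constraints in the quoted text could be checked when a routing mask
is configured. The following is a minimal sketch in C, not taken from the
draft: the function name routing_mask_is_valid is hypothetical, the mask is
assumed to be a plain byte array, and "the first two bits" is assumed to mean
the two most significant bits of the first byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Limit quoted from Sect. 3.2.1 of the draft. */
#define MAX_ROUTING_BITS 126u

/*
 * Returns 1 if the mask respects both quoted constraints, 0 otherwise.
 * Assumption: the config rotation bits are the two most significant
 * bits of mask[0].
 */
static int routing_mask_is_valid(const uint8_t *mask, size_t len)
{
    unsigned bits = 0;
    size_t i;

    assert(mask != NULL);
    if (len == 0 || (mask[0] & 0xC0) != 0)
        return 0; /* the first two bits must not be routing bits */

    for (i = 0; i < len; i++) {
        uint8_t b = mask[i];
        /* Clear the lowest set bit until none remain, counting as we go. */
        while (b != 0) {
            b &= (uint8_t)(b - 1);
            bits++;
        }
    }
    return bits <= MAX_ROUTING_BITS;
}
```

With the first-two-bits rule enforced, the 126-bit limit only matters for
long masks, which is one way to see why the second quoted paragraph carries
most of the design.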


Sect. 3.2.: I was initially concerned about performance. The division can
be converted into a multiplication and a few other operations, but that
will typically require some preselected divisors that are statically
compiled and which an attacker will quickly learn. See the alternative
proposal below.
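
To illustrate what converting the division into a multiplication looks like,
here is a minimal sketch in C of the well-known precomputed-reciprocal
technique for 32-bit values. It is not from the draft; the names fastdiv_t,
fastdiv_init, fastdiv_quot and fastdiv_rem are hypothetical, and the
reciprocal is computed once per divisor at run time purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Precomputed state for dividing 32-bit values by a fixed divisor. */
typedef struct {
    uint32_t d;     /* the divisor, d >= 2 */
    uint64_t magic; /* ceil(2^64 / d) */
} fastdiv_t;

static fastdiv_t fastdiv_init(uint32_t d)
{
    fastdiv_t f;

    assert(d >= 2); /* d == 1 would overflow the 64-bit magic value */
    f.d = d;
    f.magic = UINT64_MAX / d + 1;
    return f;
}

/*
 * floor(a / d), taken as the high bits of the 96-bit product magic * a.
 * The product is split into two 64-bit partial products so that no
 * 128-bit integer type is needed.
 */
static uint32_t fastdiv_quot(const fastdiv_t *f, uint32_t a)
{
    uint64_t hi = (f->magic >> 32) * a;
    uint64_t lo = (f->magic & UINT32_C(0xffffffff)) * a;

    return (uint32_t)((hi + (lo >> 32)) >> 32);
}

/* a mod d, derived from the quotient with one multiply and one subtract. */
static uint32_t fastdiv_rem(const fastdiv_t *f, uint32_t a)
{
    return a - fastdiv_quot(f, a) * f->d;
}
```

A load balancer would call fastdiv_init once per configured divisor and
fastdiv_rem once per packet.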

Sect. 3.3.1.: Typo: "using the as many of the first octets".
Sect. 3.3.1.: Referencing QUIC header protection is not a good idea, since
it is version dependent and as such an unnecessary dependency.
Sect. 3.3.1.: The text is not clear about what is meant by "the token". In
fact I don’t understand it. Is it a configuration parameter to make the
nonce specific to a configuration cycle? (Reading further, I guess this
refers to the authentication token. If so, I’m not sure it is a good idea
to use it here, since it might help crack the auth token without adding
significant value to CID protection. You could just zero-pad instead.)
Sect. 3.3.1.: Why not encrypt the additional server-chosen CID "info" bits
using the same key, rather than requiring them to appear random? It is hard
to squeeze something cryptographically safe into a small space. This can be
handled by using 1 bit of the nonce for CIDs longer than 16 bytes. The LB
need not decrypt the second block of the CID if there is one.
Sect. 3.3.3.: Formulation: "AES-ECB encrypts this nonce". Technically the
nonce isn’t encrypted but used to encrypt the server_id. Also, ECB cannot
encrypt, as it is an algorithm and not a process. This text is better: "The
server decrypts the server ID using 128-bit AES Electronic Codebook (ECB)
mode,". But why would the server want to decrypt its own server ID? I can
see that it wants to encrypt it to generate a CID.


Sect. 3.3. Overall: Why is this a stream cipher? It looks like a block
cipher to me. You could perhaps distinguish it from 3.4 by calling it a
padded block cipher, or just a padded cipher. To become a stream cipher you
need a counter or similar progression. My suggestion to use one nonce bit
for a second CID block is essentially that.

Sect. 3.4.1.: Uses the word "octets" instead of "bytes", which is used
elsewhere and in QUIC. If my suggestion of encrypting a second block
beyond 16 bytes is followed, the zero padding should be placed in the first
block so the load balancer need not decrypt the second block.

Sect. 3.4.: The 0 padding adds some protection against random attackers,
but near zero protection against middleboxes that flip a bit to route
traffic to the wrong server. A previously learned CID could also be used by
non-middleboxes to flood a specific server without the load balancer
being able to filter this. A checksum/tag of some sort would improve this.

I don’t see why you would want a separate algorithm for 3.3 and 3.4,
especially if you follow my suggestion to also encrypt the additional
"info" bits. If you want more security, you just make the CID longer. Then
3.4 becomes a special case of 3.3 and you just need to mention it in the
security considerations. The zero padding length of 3.4 would then be
allowed to be 0 to support both cases, 3.3 and 3.4, but recommended to be
at least 4 bytes for secure operation.




I’d suggest an alternative to algorithm 3.2 which is simpler and faster
than generic division. You can get fast division by using a set of static
divisors, but these are quickly learned by attackers.

The following hash is a perfect bijective mapping and also has an efficient
reverse mapping, making it easy to generate CIDs that map to a specific
server after the modulo. There is also a 64-bit version. The hash seed can
be rotated. See also the discussion on Stack Overflow:
http://stackoverflow.com/a/12996028

My version with hash seed added:

#include <stddef.h>
#include <stdint.h>

/* This assumes the key points to a 32-bit aligned value. */
/* HT_HASH_SEED must be defined by the including code. */
static inline size_t ht_uint32_hash_function(const void *key, size_t len)
{
    uint32_t x = *(const uint32_t *)key + (uint32_t)(HT_HASH_SEED);

    (void)len;

    /* http://stackoverflow.com/a/12996028 */
    x = ((x >> 16) ^ x) * UINT32_C(0x45d9f3b);
    x = ((x >> 16) ^ x) * UINT32_C(0x45d9f3b);
    x = ((x >> 16) ^ x);
    return x;
}

Inverse function without hash seed:

/* This assumes unsigned int is 32 bits wide. */
unsigned int unhash(unsigned int x)
{
    /* 0x119de1f3 is the multiplicative inverse of 0x45d9f3b modulo 2^32. */
    x = ((x >> 16) ^ x) * 0x119de1f3;
    x = ((x >> 16) ^ x) * 0x119de1f3;
    x = (x >> 16) ^ x;
    return x;
}
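
To show how the reverse mapping could be used to generate CIDs for a specific
server, here is a minimal, self-contained sketch in C. It is only an
illustration of the idea: the names hash32, unhash32, make_routing_value and
route, the SEED value, and the use of a 32-bit routing value are all
assumptions and not part of the draft.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical seed; it would be rotated together with the LB config. */
#define SEED UINT32_C(0x9e3779b9)

/* Forward hash with seed, same mixing steps as in the function above. */
static uint32_t hash32(uint32_t x)
{
    x += SEED;
    x = ((x >> 16) ^ x) * UINT32_C(0x45d9f3b);
    x = ((x >> 16) ^ x) * UINT32_C(0x45d9f3b);
    return (x >> 16) ^ x;
}

/* Inverse hash with seed: undo the mixing, then subtract the seed. */
static uint32_t unhash32(uint32_t h)
{
    h = ((h >> 16) ^ h) * UINT32_C(0x119de1f3);
    h = ((h >> 16) ^ h) * UINT32_C(0x119de1f3);
    h = (h >> 16) ^ h;
    return h - SEED;
}

/* Load balancer side: map a routing value taken from the CID to a server. */
static uint32_t route(uint32_t routing_value, uint32_t n_servers)
{
    return hash32(routing_value) % n_servers;
}

/*
 * Server side: turn arbitrary random bits into a routing value that the
 * load balancer will map to server_id.
 */
static uint32_t make_routing_value(uint32_t entropy, uint32_t server_id,
                                   uint32_t n_servers)
{
    uint32_t base;

    assert(n_servers >= 1 && server_id < n_servers);
    /* Round down to a multiple of n_servers, then add the server id. */
    base = entropy - entropy % n_servers;
    if (base > UINT32_MAX - server_id)
        base -= n_servers; /* avoid wrapping past 2^32 */
    return unhash32(base + server_id);
}
```

The server needs no search or table: it picks the hash output it wants and
runs the inverse, while the load balancer only runs the forward hash and a
modulo.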



On 18 May 2019 at 02.40.42, Martin Duke (martin.h.duke@gmail.com) wrote:

As many of you know, the QUIC Load Balancers Working Group has been
plugging away for close to a year.

Here's a snapshot of our current progress:
https://datatracker.ietf.org/doc/draft-duke-quic-load-balancers/

We are reaching the point of diminishing returns. The draft is deliberately
expansive, providing lots of options in anticipation of the WG eliminating
some as unimportant use cases.

While there are a handful of open issues, we hope to obtain WG adoption in
the near future. I would certainly hope that, when QUIC ships, there is a
QUIC-LB draft in a reasonably mature state (i.e., vetted by the working group
as a whole) so that equipment and software vendors can converge on a
standard solution.

If you're interested in the working group taking over this work sooner
rather than later, I encourage you to make this known to the chairs.

Thanks
Martin
The other one. No, the *other* other one.
