Re: Use of zero-length connection IDs and NAT rebinding resilience

Martin Duke <martin.h.duke@gmail.com> Tue, 06 August 2024 14:29 UTC

From: Martin Duke <martin.h.duke@gmail.com>
Date: Tue, 06 Aug 2024 07:29:42 -0700
Message-ID: <CAM4esxTTNPUpEfQHwu73b34y5Od+pOQme_MN9cg6MRZzwLr4XA@mail.gmail.com>
Subject: Re: Use of zero-length connection IDs and NAT rebinding resilience
To: Cameron Steel <ietfquic=40tugzrida.xyz@dmarc.ietf.org>
CC: Christian Huitema <huitema@huitema.net>, quic@ietf.org
List-Id: Main mailing list of the IETF QUIC working group <quic.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/jI1hDcUKQlFqjWShxNtGkmUADRw>

As Christian said, it's the length of the connection ID used in the
client-to-server direction that matters here. Almost all servers use
non-zero-length connection IDs, so if you fix your servers this should
work fine.
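
To illustrate why the client-to-server connection ID is the one that matters, here is a minimal Python sketch of how a server might match incoming packets to connections. It is not taken from any real QUIC stack: the `Demux` and `Connection` names, the addresses, and the ports are all hypothetical, and a real server would validate the new path before switching to it.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Addr = Tuple[str, int]  # (source IP, source UDP port) as seen by the server


@dataclass
class Connection:
    name: str
    peer: Addr  # address the server currently sends to


class Demux:
    """Hypothetical server-side lookup of a connection for an incoming packet."""

    def __init__(self) -> None:
        self.by_cid: Dict[bytes, Connection] = {}
        self.by_addr: Dict[Addr, Connection] = {}

    def add(self, conn: Connection, server_cid: bytes) -> None:
        # server_cid is the ID the server chose; the client puts it in the
        # Destination Connection ID field of every packet it sends.
        if server_cid:
            self.by_cid[server_cid] = conn
        else:
            # With a zero-length ID, the source address is the only key left.
            self.by_addr[conn.peer] = conn

    def lookup(self, dcid: bytes, src: Addr) -> Optional[Connection]:
        if dcid:
            conn = self.by_cid.get(dcid)
            if conn is not None and conn.peer != src:
                # Simplification: a real server would validate the new path
                # before sending significant data to it.
                conn.peer = src
            return conn
        return self.by_addr.get(src)


demux = Demux()
with_cid = Connection("server uses 8-byte CID", ("203.0.113.7", 40000))
no_cid = Connection("server uses zero-length CID", ("203.0.113.7", 40001))
demux.add(with_cid, b"\x01" * 8)
demux.add(no_cid, b"")

# After a NAT rebinding, both clients show up from new source ports.
found = demux.lookup(b"\x01" * 8, ("203.0.113.7", 51000))
lost = demux.lookup(b"", ("203.0.113.7", 51001))
```

In this sketch `found` is the first connection, now pointing at the new source port, while `lost` is `None`: without a connection ID in the client-to-server direction, the packet from the new address cannot be tied to the old connection.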

On Sun, Aug 4, 2024 at 8:07 PM Cameron Steel
<ietfquic=40tugzrida.xyz@dmarc.ietf.org> wrote:

> Hi Christian,
> You are of course correct; I had been conflating terminology in my mind.
> Apologies.
>
> The situation in which I experienced the issue is as you describe: the NAT
> mapping times out, and the first packet after the timeout is an HTTP GET,
> which receives a new NAT mapping.
>
> You are also correct that Chrome does use zero-length IDs in the
> server-to-client direction, and the servers I've been using in my testing
> do use non-zero-length IDs in the other direction.
>
> Given the error nginx logs when experiencing the issue ("quic no available
> client ids for new path while handling decrypted packet"), I had put the
> issue down to a suboptimal interpretation of RFC 9000 section 9.5 ("an
> endpoint MUST NOT reuse a connection ID when sending to more than one
> destination address").
>
> I have also seen this behaviour when the server is Caddy, and with a site
> behind Cloudflare, so the interpretation seems to be somewhat widespread on
> the server side.
>
> I've attached a pcap from the internal side of a NAT and a Chrome
> net-export of the issue; let me know if more details would be helpful.
>
> Cameron Steel.
>
> On Mon, Aug 5, 2024, at 09:34, Christian Huitema wrote:
>
>
>
> On 8/4/2024 3:37 PM, Cameron Steel wrote:
> > Hi QUIC experts,
> >
> > I've just completed a writeup of an issue I was experiencing with
> > websites using QUIC through my ISP's CGNAT. In short, the issue was due to
> > the CGNAT having a rather short UDP timeout of 20 seconds, in combination
> > with the fact that Google Chrome seems to use zero-length connection IDs,
> > which prevents connection migration.
> >
> > In the process of checking the behaviour I was observing against the
> > QUIC RFCs, I came across a few oddities that I'd like to bring up:
> >
> > Both RFC 9000 and 9308 fairly plainly state that connections using
> > zero-length IDs will not be resilient to NAT rebinding, however RFC 9000
> > section 5.1.1 does have this passage which vaguely implies that multiple
> > network paths are possible with zero-length IDs:
> >
> >> An endpoint that selects a zero-length connection ID during the
> >> handshake cannot issue a new connection ID. A zero-length Destination
> >> Connection ID field is used in all packets sent toward such an endpoint
> >> over any network path.
> >
> > As this is only implied the once that I can find, I'm assuming it's just
> > ambiguous wording and that the intended behaviour is what I observed, that
> > connection migration is not permitted when using a zero-length connection
> > ID.
>
> It is a bit more complicated than that. First, let's get the naming
> right. "Connection migration" describes a voluntary action in which the
> client tries to reach the server using a different 5-tuple and a
> different connection ID. What you are encountering here is "NAT
> Rebinding", i.e., the effect of an uncoordinated decision by the NAT to
> forget the binding between the 5-tuple used by the client and the
> "external" 5-tuple.
>
> After the NAT rebinding, all packets sent by the server to the old
> 5-tuple will be lost: either there is no mapping for them and the packets
> are dropped by the NAT, or the mapping has been reused for a new client
> and the packets are dropped by that client because they cannot be
> decrypted.
>
> The solution is for the server to somehow learn the new value of the
> client's 5-tuple. It can only do that by receiving packets from the
> client. All the packets the server sends after the NAT rebinding and
> before it receives a new packet from the client will be lost, whether
> connection IDs are used or not. For example, if the application pattern
> is to send a request, then wait some long time before the server replies,
> the long wait will increase the risk of NAT rebinding, and the eventual
> response of the server will be lost.
>
> If the traffic is a series of HTTP GETs triggering immediate responses,
> there is hope. The server could learn the new 5-tuple when receiving the
> GET command. But it needs to associate the arriving packet with the old
> connection, and it can only do that if the arriving packet carries a
> connection ID.
>
> >
> > Given that, I'd be very curious to hear any insight into why Chrome has
> > chosen not to use connection IDs.
>
> NAT traversal will work if connection IDs are used in the client-to-server
> direction. I was under the impression that Chrome uses zero-length CIDs
> in the server-to-client direction, but Google servers use 8-byte CIDs in
> the client-to-server direction. If that's the case, NAT rebinding should
> work.
>
> > If anyone is interested in reading my full writeup, you can find it
> > here:
> > https://blog.tugzrida.xyz/2024/08/04/too-quic-for-chrome-troubleshooting-udp-nat-rebinding/
>
> Can you attach some kind of packet log so we can see what is really
> happening? QLOG would be great.
>
> -- Christian Huitema
>
>
>
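
For readers who want to reproduce the rebinding condition discussed in this thread, below is a rough Python sketch of a NAT with an idle timeout. The `Nat` class, the sequential port allocation, and the addresses are assumptions made for illustration; real CGNATs allocate ports differently and may also refresh mappings on inbound traffic.

```python
from typing import Dict, Tuple

Addr = Tuple[str, int]  # (internal IP, internal UDP port)


class Nat:
    """Toy NAT: one external port per internal address, expired after idling."""

    def __init__(self, idle_timeout: float, first_port: int = 40000) -> None:
        self.idle_timeout = idle_timeout
        self.next_port = first_port
        # internal address -> (external port, time of last outbound packet)
        self.mappings: Dict[Addr, Tuple[int, float]] = {}

    def outbound(self, internal: Addr, now: float) -> int:
        """Return the external source port used for a packet sent at `now`."""
        entry = self.mappings.get(internal)
        if entry is not None and now - entry[1] <= self.idle_timeout:
            port = entry[0]  # mapping still alive, keep the same port
        else:
            port = self.next_port  # mapping expired or never existed: rebind
            self.next_port += 1
        self.mappings[internal] = (port, now)
        return port


nat = Nat(idle_timeout=20.0)  # 20 seconds, as reported for the CGNAT above
client = ("192.168.1.10", 5000)

p1 = nat.outbound(client, now=0.0)   # first packet creates the mapping
p2 = nat.outbound(client, now=15.0)  # 15 s idle: same external port
p3 = nat.outbound(client, now=40.0)  # 25 s idle: the server sees a new port
```

Here `p1` and `p2` are equal while `p3` differs, which is the situation in which the server sees the first HTTP GET after the pause arrive from a new 5-tuple.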