Re: [tsvwg] L4S dual-queue re-ordering and VPNs

Pete Heist <pete@heistp.net> Sat, 08 May 2021 06:45 UTC

Message-ID: <e15d732f64bf983975dbe507092b39f0744f7f74.camel@heistp.net>
From: Pete Heist <pete@heistp.net>
To: "Black, David" <David.Black@dell.com>, Sebastian Moeller <moeller0@gmx.de>, Greg White <g.white@CableLabs.com>
Cc: TSVWG <tsvwg@ietf.org>
Date: Sat, 08 May 2021 08:45:30 +0200
In-Reply-To: <MN2PR19MB40452C9DD1164609A005139583569@MN2PR19MB4045.namprd19.prod.outlook.com>
References: <68F275F9-8512-4CD9-9E81-FE9BEECD59B3@cablelabs.com> <1DB719E5-55B5-4CE2-A790-C110DB4A1626@gmx.de> <MN2PR19MB40452C9DD1164609A005139583569@MN2PR19MB4045.namprd19.prod.outlook.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/NzYD3L9UotXBZLcRgSpNOPWsFzc>
Subject: Re: [tsvwg] L4S dual-queue re-ordering and VPNs

I've added some additional tests at 10 and 20 Mbps, and re-worked the
writeup to include a table of the results:

https://github.com/heistp/l4s-tests/#dropped-packets-for-tunnels-with-replay-protection-enabled

I noticed that this issue seems to affect tunnels with replay window
sizes of 32 and 64 packets regardless of the bottleneck bandwidth,
likely because the peak C sojourn times also increase as the bandwidth
decreases. IMO this is a safety concern, in that deploying DualPI2 can
harm conventional traffic, particularly in IPsec tunnels using common
defaults, beyond whatever harm DualPI2's AQM causes by itself.

It can be worked around by increasing the replay window size or by
disabling replay protection, but it may not be easy for admins or
users to identify the source of the problem when it occurs, or to know
whom to contact about it.
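
As a rough guide to sizing, here is a back-of-the-envelope calculator
(a sketch in Python; the 1538-byte frame size and the assumption that
the window must cover exactly one peak C-queue sojourn time are mine):

    import math

    def replay_window_packets(rate_bps, sojourn_s, frame_bytes=1538):
        # Packets that can traverse the bottleneck during one peak
        # C-queue sojourn time, i.e. how far in sequence-number space
        # an L-queue packet can jump ahead of a C-queue packet.
        pps = rate_bps / (frame_bytes * 8)
        return math.ceil(pps * sojourn_s)

    for mbps in (10, 20, 1000):
        need = replay_window_packets(mbps * 1e6, 0.020)
        pow2 = 1 << (need - 1).bit_length()   # round up for bitmaps
        print(f"{mbps:5} Mbps: {need:5} packets (power of two: {pow2})")

At 10 and 20 Mbps with a 20 ms sojourn this already lands at 32 and 64
packets, i.e. right at the common defaults, even before accounting for
the higher sojourn peaks at low rates.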

Pete

On Sat, 2021-05-08 at 02:01 +0000, Black, David wrote:
> [posting as an individual, not a WG chair]
> Linking together a couple of related points:
> 
> > [SM] Current Linux kernels seem to use a window of ~8K packets, while
> > OpenVPN defaults to 64 packets, Linux ipsec seems to default to
> > either 32 or 64. 8K should be reasonably safe, but 64 seems less
> > safe.
> 
> Common VPN design practice here appears to be picking a plausible
> default size for the accounting window used to detect replays (a size
> that can be reconfigured, and that may change from release to
> release), hence this:
> 
> > >  But, in any case, it seems to me that protocols that need to be
> > > robust to out-of-order delivery would need to consider being robust
> > > to re-ordering in time units anyway, and so would naturally need to
> > > scale that functionality as packet rates increase.
> 
> may not happen in a smooth fashion.  As Sebastian writes:
> 
> > [SM] The thing is these methods aim to prevent Mallory from fiddling
> > with the secure connection between Alice and Bob (not ours), and need
> > to track packet by packet; that is not easily solved efficiently with
> > a simple time-out
> 
> That's correct, and use of a simple time-out by itself is prohibited
> for obvious security reasons.  For more details on a specific example,
> see Section 3.4.3 of RFC 4303 (ESP), which specifies the ESP anti-
> replay mechanism (could be used as a reference in writing text on how
> L4S interacts with anti-replay)  ... and the observant reader will
> notice that this section is a likely source of the anti-replay 32 and
> 64 packet values for Linux IPsec:
> https://datatracker.ietf.org/doc/html/rfc4303#section-3.4.3 .
> 
> Thanks, --David
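
For readers who don't want to chase the RFC 4303 reference David gives
above, here is a minimal Python sketch of that style of sliding-window
check (an illustration only, not the RFC's pseudocode: real
implementations use a fixed-size bitmap, handle sequence-number
wrap/ESN, and only slide the window after integrity verification):

    class AntiReplayWindow:
        def __init__(self, size=64):
            self.size = size     # window width in packets
            self.highest = 0     # right edge: highest seq accepted
            self.seen = set()    # seqs accepted within the window

        def check(self, seq):
            # Returns True if the packet passes the anti-replay check.
            if seq > self.highest:
                self.highest = seq        # advance the right edge
            elif seq <= self.highest - self.size:
                return False              # behind the window: drop
            elif seq in self.seen:
                return False              # duplicate in window: drop
            self.seen.add(seq)            # (a real bitmap implicitly
            return True                   # forgets seqs that fall off)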
> 
> -----Original Message-----
> From: tsvwg <tsvwg-bounces@ietf.org> On Behalf Of Sebastian Moeller
> Sent: Wednesday, May 5, 2021 5:21 PM
> To: Greg White
> Cc: TSVWG
> Subject: Re: [tsvwg] L4S dual-queue re-ordering and VPNs
> 
> 
> [EXTERNAL EMAIL] 
> 
> Hi Greg,
> 
> thanks for your response, more below prefixed [SM].
> 
> > On May 3, 2021, at 19:35, Greg White <g.white@CableLabs.com> wrote:
> > 
> > I'm not familiar with the replay attack mitigations used by VPNs, so
> > can't comment on whether this would indeed be an issue for some VPN
> > implementations.
> 
> [SM] I believe this to be an issue for at least those VPNs that use
> UDP and defend against replay attacks (including ipsec, wireguard,
> OpenVPN). All more or less seem to use the same approach: a limited
> accounting window that allows out-of-order delivery of packets. The
> head of the window typically seems to be advanced to the packet with
> the highest "sequence" number, hence all of these are sensitive to the
> kind of packet re-ordering the L4S ecn-id draft argues is benign...
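
Feeding the toy window sketched after David's note above a dual-queue
style arrival pattern shows the failure mode Sebastian describes (the
numbers are made up for illustration):

    w = AntiReplayWindow(size=64)
    assert all(w.check(s) for s in range(1, 101))    # in order so far
    assert all(w.check(s) for s in range(181, 201))  # L-queue burst
    # The 80 overtaken C-queue packets now trail the head (200):
    drops = [s for s in range(101, 181) if not w.check(s)]
    print(len(drops))  # -> 36: seqs 101..136 are behind the window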
> 
> 
> >  A quick search revealed (https://www.wireguard.com/protocol/) that
> > Wireguard apparently has a window of about 2000 packets, so perhaps
> > it isn't an immediate issue for that VPN software?
> 
> [SM] Current Linux kernels seem to use a window of ~8K packets, while
> OpenVPN defaults to 64 packets; Linux ipsec seems to default to either
> 32 or 64. 8K should be reasonably safe, but 64 seems less safe.
> 
> > But, if it is an issue for a particular algorithm, perhaps another
> > solution to address condition b would be to use a different "head of
> > window" for ECT1 packets compared to ECT(0)/NotECT packets?  
> 
> [SM] Without arguing whether that might or might not be a good idea,
> it is not what is done today, so all deployed end-points will treat
> all packets the same; but at least wireguard and linux ipsec will
> propagate the ECN value at en- and decapsulation, so they are probably
> affected by the issue.
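
To make Greg's suggestion concrete, it might look something like the
purely hypothetical sketch below (as Sebastian notes, nothing deployed
does this; note also that the outer ECN field is not
integrity-protected, so a replayed packet could be re-marked to target
the other head, and the idea would need real security analysis):

    class DualHeadWindow:
        # Hypothetical: one replay window per ECN class, so an
        # ECT(1)/CE burst cannot drag the head past ECT(0)/Not-ECT
        # packets still in flight.
        def __init__(self, size=64):
            self.l = AntiReplayWindow(size)   # ECT(1) and CE
            self.c = AntiReplayWindow(size)   # ECT(0) and Not-ECT

        def check(self, seq, ecn):
            win = self.l if ecn in ("ECT(1)", "CE") else self.c
            return win.check(seq)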
> 
> > In your 100 Gbps case, I guess you are assuming that A) the
> > bottleneck between the two tunnel endpoints is 100 Gbps, B) a single
> > VPN tunnel is consuming the entirety of that 100 Gbps link, and C)
> > that there is a PI2 AQM targeting 20ms of buffering delay in that 100
> > Gbps link?  If so, I'm not sure that I agree that this is likely in
> > the near term.
> 
> [SM] Yes, the back-of-the-envelope worst-case estimate is not terribly
> concerning, I agree, but the point remains that a fixed 20ms delay
> target will potentially cause the issue as link speeds increase...
> 
> 
> >  But, in any case, it seems to me that protocols that need to be
> > robust to out-of-order delivery would need to consider being robust
> > to re-ordering in time units anyway, and so would naturally need to
> > scale that functionality as packet rates increase.
> 
> [SM] The thing is these methods aim to prevent Mallory from fiddling
> with the secure connection between Alice and Bob (not ours), and need
> to track packet by packet; that is not easily solved efficiently with
> a simple time-out (at least not as far as I can see, but I do not
> claim expertise in cryptology or security engineering). But I am
> certain, if you have a decent new algorithm to enhance RFC2401 and/or
> RFC6479, the crypto community might be delighted to hear about it. ;)
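
Since RFC 6479 comes up: its contribution is replacing the shifted
bitmap with a ring of bitmap blocks, so advancing the window only
zeroes whole blocks rather than shifting bits. A compressed sketch of
the idea (constants illustrative, not taken from any kernel):

    BITS, BLOCKS = 32, 8   # usable window = (BLOCKS-1)*BITS packets

    class Rfc6479Window:
        def __init__(self):
            self.bitmap = [0] * BLOCKS
            self.highest = 0

        def check(self, seq):
            if seq + (BLOCKS - 1) * BITS <= self.highest:
                return False                  # too old: drop
            if seq > self.highest:
                # Zero each block the head skips over; no shifting.
                # (A real implementation caps this at BLOCKS rounds.)
                for b in range(self.highest // BITS + 1,
                               seq // BITS + 1):
                    self.bitmap[b % BLOCKS] = 0
                self.highest = seq
            block, bit = (seq // BITS) % BLOCKS, seq % BITS
            if self.bitmap[block] & (1 << bit):
                return False                  # replay: drop
            self.bitmap[block] |= 1 << bit
            return True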
> 
> > I'm happy to include text in the L4Sops draft on this if the WG
> > agrees it is useful to include it, and someone provides text that
> > would fit the bill.
> 
> [SM] I wonder whether a section in the L4S OPs draft a la "make sure
> to configure a sufficiently large replay window to allow for ~20ms of
> reordering" would be enough, or whether the whole discussion would not
> also be needed in
> https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-ecn-l4s-id-14#appendix-B.1,
> widening the re-ordering scope from the existing "Risk of reordering
> Classic CE packets" subpoint 3?
> 
> Regards
>         Sebastian
> 
> 
> > 
> > -Greg
> > 
> > 
> > On 5/3/21, 1:44 AM, "tsvwg on behalf of Sebastian Moeller"
> > <tsvwg-bounces@ietf.org on behalf of moeller0@gmx.de> wrote:
> > 
> >    Dear All,
> > 
> >    we had a few discussions in the past about L4S' dual queue design
> > and the consequences of packets of a single flow being accidentally
> > steered into the wrong queue.
> >    So far we mostly discussed the consequence of steering all packets
> > marked CE into the LL-queue (and
> > https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-ecn-l4s-id-14#appendix-B.1,
> > "Risk of reordering Classic CE packets", only discusses this point);
> > there the argument is that this condition should be rare and should
> > also be relatively benign, as an occasional packet arriving too early
> > should not trigger the 3-DupACK mechanism. While I would have liked
> > to see hard data confirming these two hypotheses, let's accept that
> > argument for the time being.
> > 
> >    BUT, there is a traffic class that is actually sensitive to
> > packets arriving out-of-order and too early: VPNs. Most VPNs try to
> > protect against replay attacks by maintaining a replay window and
> > only accepting packets that fall within that window. Now, as far as
> > I can see, most replay window algorithms use a bounded window and use
> > the highest received sequence number to set the "head" of the window,
> > and hence will trigger replay attack mitigation if the too-early
> > packets move the replay window forward such that in-order packets
> > from the slower queue fall behind the replay window.
> > 
> >    Wireguard is an example of a modern VPN affected by this issue,
> > since it supports ECN and propagates ECN bits between inner and outer
> > headers on en- and decapsulation. 
> > 
> >    I can see two conditions that trigger this:
> >    a) the arguably relatively rare case of an already CE-marked
> > packet hitting an L4S AQM (but we have no real numbers on the
> > likelihood of that happening)
> >    b) the arguably more and more common situation (if L4S actually
> > succeeds in the field) of an ECT(1) sub-flow zipping past
> > ECT(0)/NotECT sub-flows, all within the same tunnel outer flow (see
> > the classifier sketch below)
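
For reference, the dual-queue classification that creates condition b)
works roughly as below (a sketch; per the ecn-l4s-id and dualq drafts,
ECT(1) and CE select the low-latency queue, and a real qdisc may also
honor other identifiers such as DSCP):

    def queue_for(ecn_codepoint):
        # ECT(1) and CE go to the low-latency (L) queue; ECT(0) and
        # Not-ECT go to the Classic (C) queue.
        return "L" if ecn_codepoint in ("ECT(1)", "CE") else "C"

    for cp in ("ECT(0)", "ECT(1)", "CE", "Not-ECT"):
        print(cp, "->", queue_for(cp))   # one tunnel outer flow, split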
> > 
> >    I note that neither single-queue rfc3168 AQMs nor FQ AQMs
> > (rfc3168 or not) are affected by this issue, since they do not cause
> > similar re-ordering.
> > 
> > 
> >    QUESTIONS @ALL:
> > 
> >    1)  Are we all happy with that and do we consider this to be
> > acceptable collateral damage?
> > 
> >    2) If yes, should the L4S OPs draft contain text recommending to
> > end-points how to cope with this new situation?
> >         If yes, how? Available options are IMHO to eschew the use of
> > ECN on tunnels, or to recommend increased replay window sizes; but
> > with a Gigabit link and an L4S classic target of around 20ms, we
> > would need to recommend a replay window of:
> >         >= ((1000^3 [b/s]) / (1538 [B/packet] * 8 [b/B]))
> >            * (20 [ms] / 1000 [ms/s]) ≈ 1625.49 [packets]
> >    or, with a power-of-two algorithm, 2048, which is quite a bit
> > larger than the old default of 64...
> >         But what if the L4S AQM is located on a backbone link with
> > considerably higher bandwidth, like 10 Gbps or even 100 Gbps? IMHO a
> > replay window of 1625 * 100 = 162500 seems a bit excessive.
> > 
> > 
> >    Also the following text in
> > https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-ecn-l4s-id-14#appendix-A.1.7
> > 
> >    "  Should work in tunnels:  Unlike Diffserv, ECN is defined to
> > always
> >          work across tunnels.  This scheme works within a tunnel that
> >          propagates the ECN field in any of the variant ways it has
> > been
> >          defined, from the year 2001 [RFC3168] onwards.  However, it
> > is
> >          likely that some tunnels still do not implement ECN
> > propagation at
> >          all."
> > 
> >    Seems like it could need additions to reflect the issue just
> > described.
> > 
> > 
> > 
> >    Best Regards
> >         Sebastian
> > 
> > 
>