Re: [tsvwg] L4S dual-queue re-ordering and VPNs

Pete Heist <pete@heistp.net> Thu, 06 May 2021 15:31 UTC

From: Pete Heist <pete@heistp.net>
To: Sebastian Moeller <moeller0@gmx.de>, Greg White <g.white@CableLabs.com>
Cc: TSVWG <tsvwg@ietf.org>
Date: Thu, 06 May 2021 17:31:26 +0200
In-Reply-To: <1DB719E5-55B5-4CE2-A790-C110DB4A1626@gmx.de>
References: <68F275F9-8512-4CD9-9E81-FE9BEECD59B3@cablelabs.com> <1DB719E5-55B5-4CE2-A790-C110DB4A1626@gmx.de>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/EkOev7mBYSo-4cbfO7Z1N-7Itvk>

Hi,

To test this out, I added a scenario to the l4s-tests repo with two
flows competing in an IPsec tunnel through DualPI2:

https://github.com/heistp/l4s-tests/#dropped-packets-for-tunnels-with-replay-protection-enabled

It shows that non-L4S traffic through the C queue can see drops when
packets arrive outside the replay window, essentially the concern that
Sebastian raised. The effect here is reduced throughput for a CUBIC
flow competing with Prague, depending on the replay window in use.
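To illustrate the mechanism, here's a simplified model (my sketch, not
the actual IPsec code; real implementations also track a per-packet
bitmap per RFC 6479): once L-queue packets have advanced the window
head, delayed C-queue packets fall outside it and are dropped:

```python
# Simplified anti-replay model: accept a packet only if its sequence
# number is within `window` of the highest sequence number seen so far.
# This only models the window-advance behavior that causes the drops.
def deliver(arrival_order, window):
    highest = 0
    delivered = []
    for seq in arrival_order:
        if seq > highest:
            highest = seq          # head of window advances
        if seq > highest - window:
            delivered.append(seq)  # within window: accepted
        # else: dropped as a suspected replay
    return delivered

# 100 L-queue packets (seqs 101..200) overtake 100 C-queue packets
# (seqs 1..100) still sitting in the deeper classic queue.
arrivals = list(range(101, 201)) + list(range(1, 101))

print(len(deliver(arrivals, 64)))    # -> 100: every C-queue packet dropped
print(len(deliver(arrivals, 512)))   # -> 200: everything accepted
```

With a 64-packet window, all 100 delayed C-queue packets land behind
the head and are discarded; widening the window to 512 accepts them all.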

For this 100 Mbps case, I tried five values for replay-window
(0, 32, 64, 128, 256). It looks like it's good to account for peak
sojourn times through the C queue, perhaps 50 ms? If so, a replay
window of 512 packets might be better here. As I understand it, the
default for IPsec is typically 32 or 64 packets, so some existing
deployed tunnels may need to be reconfigured, depending on their
bandwidths.

The default, compiled-in value for WireGuard appears to now be 8192
packets(?), so there I wouldn't expect this to be a problem until the
tunneled traffic is at least ~2 Gbps, but that's just a guess, made by
estimating what bandwidth can exceed 8192 packets in 50 ms.
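The arithmetic behind both of these estimates (assuming full-size
1538-byte frames and the 50 ms peak C-queue sojourn above; both are
assumptions, not measurements) is just:

```python
PKT_BITS = 1538 * 8   # bits per full-size Ethernet frame (assumption)
SOJOURN = 0.050       # assumed peak C-queue sojourn time, seconds

def window_for(bandwidth_bps):
    """Packets that can arrive within one sojourn time, rounded up to
    a power of two, as replay windows typically are sized."""
    pkts = bandwidth_bps / PKT_BITS * SOJOURN
    size = 1
    while size < pkts:
        size *= 2
    return size

def bandwidth_for(window_pkts):
    """Bandwidth at which `window_pkts` packets fit in one sojourn time."""
    return window_pkts / SOJOURN * PKT_BITS

print(window_for(100e6))          # -> 512 packets at 100 Mbps
print(bandwidth_for(8192) / 1e9)  # ~2.0 Gbps for an 8192-packet window
```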

Thanks Sebastian for bringing this up...

Pete

On Wed, 2021-05-05 at 23:20 +0200, Sebastian Moeller wrote:
> Hi Greg,
> 
> thanks for your response, more below prefixed [SM].
> 
> > On May 3, 2021, at 19:35, Greg White <g.white@CableLabs.com> wrote:
> > 
> > I'm not familiar with the replay attack mitigations used by VPNs, so
> > can't comment on whether this would indeed be an issue for some VPN
> > implementations.
> 
> [SM] I believe this to be an issue for at least those VPNs that use
> UDP and defend against replay attacks (including IPsec, WireGuard, and
> OpenVPN). All of them seem to use more or less the same approach, with
> a limited accounting window to allow out-of-order delivery of packets.
> The head of the window is typically advanced to the packet with the
> highest "sequence" number, hence all of these are sensitive to the
> kind of packet re-ordering the L4S ECN id draft argues is benign...
> 
> 
> >  A quick search revealed (https://www.wireguard.com/protocol/ ) that
> > Wireguard apparently has a window of about 2000 packets, so perhaps
> > it isn't an immediate issue for that VPN software?
> 
> [SM] Current Linux kernels seem to use a window of ~8K packets, while
> OpenVPN defaults to 64 packets, and Linux IPsec seems to default to
> either 32 or 64. 8K should be reasonably safe, but 64 seems less so.
> 
> > But, if it is an issue for a particular algorithm, perhaps another
> > solution to address condition b would be to use a different "head of
> > window" for ECT1 packets compared to ECT(0)/NotECT packets?  
> 
> [SM] Without arguing whether that might or might not be a good idea,
> it is not what is done today, so all deployed end-points will treat
> all packets the same; but at least WireGuard and Linux IPsec do
> propagate the ECN value at en- and decapsulation, so they are probably
> affected by the issue.
> 
> > In your 100 Gbps case, I guess you are assuming that A) the
> > bottleneck between the two tunnel endpoints is 100 Gbps, B) a single
> > VPN tunnel is consuming the entirety of that 100 Gbps link, and C)
> > that there is a PI2 AQM targeting 20ms of buffering delay in that 100
> > Gbps link?  If so, I'm not sure that I agree that this is likely in
> > the near term.
> 
> [SM] Yes, the back-of-the-envelope worst-case estimate is not terribly
> concerning, I agree, but the point remains that a fixed 20 ms delay
> target will potentially cause the issue at increasing link speeds...
> 
> 
> >  But, in any case, it seems to me that protocols that need to be
> > robust to out-of-order delivery would need to consider being robust
> > to re-ordering in time units anyway, and so would naturally need to
> > scale that functionality as packet rates increase.
> 
> [SM] The thing is, these methods aim to keep Mallory from messing
> with the secure connection between Alice and Bob (not ours), and need
> to track traffic packet by packet; that is not easily solved
> efficiently with a simple time-out (at least not as far as I can see,
> but I do not claim expertise in cryptology or security engineering).
> But I am certain that if you have a decent new algorithm to enhance
> RFC 2401 and/or RFC 6479, the crypto community would be delighted to
> hear about it. ;)
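For reference, the RFC 6479 bitmap approach these implementations
roughly follow looks like the sketch below (my simplification for
illustration, not any kernel's actual code):

```python
# Simplified RFC 6479-style anti-replay check: a bitmap covering the
# last WINDOW sequence numbers below the highest one seen so far.
WINDOW = 64

class ReplayWindow:
    def __init__(self):
        self.highest = 0
        self.seen = 0  # bitmap: bit i set => (highest - i) was received

    def accept(self, seq):
        if seq + WINDOW <= self.highest:
            return False              # behind the window: reject
        if seq > self.highest:
            shift = seq - self.highest
            self.seen = (self.seen << shift) & ((1 << WINDOW) - 1)
            self.seen |= 1            # bit 0 = this packet
            self.highest = seq
            return True
        bit = 1 << (self.highest - seq)
        if self.seen & bit:
            return False              # duplicate: replay
        self.seen |= bit
        return True

w = ReplayWindow()
assert w.accept(1) and w.accept(3) and w.accept(2)  # mild reordering OK
assert not w.accept(3)                              # replay rejected
assert w.accept(1000)                               # head jumps forward
assert not w.accept(900)                            # now behind the window
```

The last two lines are exactly the dual-queue failure mode: one packet
with a much higher sequence number drags the head forward, and in-order
packets that were merely delayed then look like replays.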
> 
> > I'm happy to include text in the L4Sops draft on this if the WG
> > agrees it is useful to include it, and someone provides text that
> > would fit the bill.
> 
> [SM] I wonder whether a section in L4S-ops along the lines of "make
> sure to configure a sufficiently large replay window to allow for
> ~20ms of reordering" would be enough, or whether the whole discussion
> would not also be needed in
> https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-ecn-l4s-id-14#appendix-B.1
> widening the re-ordering scope from the existing "Risk of reordering
> Classic CE packets" subpoint 3?
> 
> Regards
>         Sebastian
> 
> 
> > 
> > -Greg
> > 
> > 
> > On 5/3/21, 1:44 AM, "tsvwg on behalf of Sebastian Moeller"
> > <tsvwg-bounces@ietf.org on behalf of moeller0@gmx.de> wrote:
> > 
> >    Dear All,
> > 
> >    we had a few discussions in the past about L4S' dual queue design
> > and the consequences of packets of a single flow being accidentally
> > steered into the wrong queue.
> >    So far we have mostly discussed the consequence of steering all
> > packets marked CE into the LL queue (and
> > https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-ecn-l4s-id-14#appendix-B.1
> > "Risk of reordering Classic CE packets" only discusses this point);
> > there the argument is that this condition should be rare and also
> > relatively benign, as an occasional packet arriving too early should
> > not trigger the 3-DupACK mechanism. While I would have liked to see
> > hard data confirming these two hypotheses, let's accept the argument
> > for the time being.
> > 
> >    BUT, there is a traffic class that is actually sensitive to
> > packets arriving out-of-order and too early: VPNs. Most VPNs try to
> > secure against replay attacks by maintaining a replay window and only
> > accept packets that fall within that window. Now, as far as I can
> > see, most replay-window algorithms use a bounded window, use the
> > highest received sequence number to set the "head" of the window,
> > and hence will trigger replay-attack mitigation if the too-early
> > packets move the replay window forward such that in-order packets
> > from the longer (classic) queue fall behind the replay window.
> > 
> >    Wireguard is an example of a modern VPN affected by this issue,
> > since it supports ECN and propagates ECN bits between inner and outer
> > headers on en- and decapsulation. 
> > 
> >    I can see two conditions that trigger this:
> >    a) the arguably relatively rare case of an already CE-marked
> > packet hitting an L4S AQM (but we have no real number on the
> > likelihood of that happening)
> >    b) the arguably more and more common situation (if L4S actually
> > succeeds in the field) of an ECT(1) sub-flow zipping past
> > ECT(0)/NotECT sub-flows (all within the same tunnel outer flow)
> > 
> >    I note that neither single-queue RFC 3168 AQMs nor FQ AQMs
> > (RFC 3168 or not) are affected by this issue, since they do not
> > cause similar re-ordering.
> > 
> > 
> >    QUESTIONS @ALL:
> > 
> >    1)  Are we all happy with that and do we consider this to be
> > acceptable collateral damage?
> > 
> >    2) If yes, should the L4S OPs draft contain text to recommend end-
> > points how to cope with that new situation? 
> >         If yes, how? Available options are IMHO to eschew the use
> > of ECN on tunnels, or to recommend increased replay window sizes;
> > but with a Gigabit link and an L4S classic target of around 20 ms,
> > we would need to recommend a replay window of:
> >    >= ((1000^3 [b/s]) / (1538 [B/packet] * 8 [b/B]))
> >       * (20 [ms] / 1000 [ms]) = 1625.48764629 [packets]
> >    or, with a power-of-two algorithm, 2048, which is quite a bit
> > larger than the old default of 64...
> >         But what if the L4S AQM is located on a backbone link with
> > considerably higher bandwidth, like 10 Gbps or even 100 Gbps? IMHO
> > a replay window of 1625 * 100 = 162500 seems a bit excessive.
> > 
> > 
> >    Also the following text in
> > https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-ecn-l4s-id-14#appendix-A.1.7
> > 
> >    "  Should work in tunnels:  Unlike Diffserv, ECN is defined to
> > always
> >          work across tunnels.  This scheme works within a tunnel that
> >          propagates the ECN field in any of the variant ways it has
> > been
> >          defined, from the year 2001 [RFC3168] onwards.  However, it
> > is
> >          likely that some tunnels still do not implement ECN
> > propagation at
> >          all."
> > 
> >    It seems like this could need additions to reflect the issue
> > just described.
> > 
> > 
> > 
> >    Best Regards
> >         Sebastian
> > 
> > 
>