Re: What CSRs Do

Willy Tarreau <w@1wt.eu> Tue, 19 November 2019 07:32 UTC

Date: Tue, 19 Nov 2019 08:32:20 +0100
From: Willy Tarreau <w@1wt.eu>
To: Frode Kileng <frodeki@gmail.com>
Cc: quic@ietf.org
Subject: Re: What CSRs Do
Message-ID: <20191119073220.GA5824@1wt.eu>
References: <BL0PR11MB3394D769128DEFEEFBC3CA71904C0@BL0PR11MB3394.namprd11.prod.outlook.com> <0cf4d3ad-0f64-0ae4-2d95-77a39f40639d@tele.no> <20191119042904.GC5602@1wt.eu> <4003626d-8966-6772-7df0-e76549320430@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <4003626d-8966-6772-7df0-e76549320430@gmail.com>
User-Agent: Mutt/1.6.1 (2016-04-27)
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/xhxWF7rqZ9gNi-VaKuiHBpiJ788>
Precedence: list

On Tue, Nov 19, 2019 at 06:57:07AM +0100, Frode Kileng wrote:
> On 19/11/2019 05:29, Willy Tarreau wrote:
> > 
> > The only known alternative is to claim "we have no loss here" and point the
> > finger at the next step in the chain operated by someone else.
> 
> What additional value can "loss bit(s)" provide beyond confirming that the
> source is within your network domain or somewhere else, i.e. information
> that is available from other "loss sources". How would network operators use
> this loss bit(s) signal when providing such "excellent customer support"
> when the problem source is external?

I find your condescending tone inappropriate for discussing technical matters.

But since you asked, when I'm invited to troubleshooting sessions when
people experience issues ranging from "seems slow" to "sometimes hangs"
on TCP, I look at IP+ports, sequence numbers, ACK numbers, flags, window
sizes, timestamps and SACK fields, to figure out what's happening. It just
happens that in QUIC all of them except IP+ports are encrypted, which
basically means my only solution will be to ask people "does it still
happen when you block UDP port 443?". And that's really sad. Network
issues are never black-and-white. Large buffers in VM environments cause
retransmits without losses, resulting in apparent loss on one side and
duplicates on the other. Packets can get corrupted on their way from
one side to the other, and guess what? Some devices even manage to
reconstruct a valid checksum, so from inspection you cannot tell whether
the packet is OK or not. Some packets can be dropped past a certain size
(a typical problem with PPPoE, whose encapsulation takes 8 bytes), so
that only full-size packets fail to be delivered. On the side experiencing
the miss, you usually spot this immediately on TCP thanks to the sequence
numbers, as you notice a hole of 1460 bytes. With QUIC you still see
packets coming but you don't know anything about them, so you can only
guess. Did you know that you can even lose packets between two processes
on the loopback under sustained load?
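
To illustrate the "hole of 1460 bytes" case, here is a minimal sketch of
how such a gap shows up once sequence numbers are visible. It assumes the
capture has already been reduced to (sequence number, payload length)
pairs for one direction of a TCP flow; the function name find_holes and
the sample values are made up for the example, and sequence wraparound
is ignored.

```python
from typing import List, Tuple


def find_holes(segments: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return (start_seq, missing_bytes) for each gap in the byte stream.

    segments: (sequence_number, payload_length) pairs from one direction
    of a flow, in any order. Wraparound at 2**32 is not handled.
    """
    holes = []
    next_expected = None
    for seq, length in sorted(segments):
        if length == 0:
            continue  # pure ACKs carry no payload
        if next_expected is not None and seq > next_expected:
            # Bytes between next_expected and seq were never observed.
            holes.append((next_expected, seq - next_expected))
        end = seq + length
        if next_expected is None or end > next_expected:
            next_expected = end  # retransmits do not move this backwards
    return holes


# Hypothetical capture: one full-size (1460-byte) segment never arrived.
capture = [(1000, 500), (1500, 300), (3260, 200), (3460, 1460)]
print(find_holes(capture))
```

Nothing equivalent can be done by an on-path observer when the packet
numbers and offsets are encrypted, which is the point being made above.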

Debuggability is key to any protocol's health. The vast majority of
the fixes that go into any network stack come from observations in the
field resulting in captures exhibiting bad behavior. With QUIC such
captures will never exist, so there will be nothing to fix. This is what
I still really dislike about this way of designing a protocol.

Just my two cents,
Willy
