Re: [aqm] [Bloat] ping loss "considered harmful"

Jonathan Morton <chromatix99@gmail.com> Mon, 02 March 2015 10:55 UTC

Content-Type: text/plain; charset="utf-8"
Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2070.6\))
From: Jonathan Morton <chromatix99@gmail.com>
In-Reply-To: <alpine.DEB.2.02.1503021108270.20507@uplift.swm.pp.se>
Date: Mon, 02 Mar 2015 12:54:45 +0200
Content-Transfer-Encoding: quoted-printable
Message-Id: <802AFC8C-B59B-4971-A4ED-5C0375E683B1@gmail.com>
References: <CAA93jw7KW=9PH002d3Via5ks6+mHScz5VDhpPVqLUGK2K=Mhew@mail.gmail.com> <7B3E53F5-2112-4A50-A777-B76F928CE8F2@trammell.ch> <alpine.DEB.2.02.1503021108270.20507@uplift.swm.pp.se>
To: Mikael Abrahamsson <swmike@swm.pp.se>
X-Mailer: Apple Mail (2.2070.6)
Archived-At: <http://mailarchive.ietf.org/arch/msg/aqm/F2LN3_ahIFH0vRXtR7yrmSL9ick>
X-Mailman-Approved-At: Mon, 02 Mar 2015 05:54:52 -0800
Cc: Brian Trammell <ietf@trammell.ch>, "aqm@ietf.org" <aqm@ietf.org>, "cerowrt-devel@lists.bufferbloat.net" <cerowrt-devel@lists.bufferbloat.net>, bloat <bloat@lists.bufferbloat.net>
Subject: Re: [aqm] [Bloat] ping loss "considered harmful"

> On 2 Mar, 2015, at 12:17, Mikael Abrahamsson <swmike@swm.pp.se> wrote:
> 
> On Mon, 2 Mar 2015, Brian Trammell wrote:
> 
>> Gaming protocols do this right - latency measurement is built into the protocol.
> 
> I believe this is the only way to do it properly, and the most likely easiest way to get this deployed would be to use the TCP stack.
> 
> We need to give users an easy-to-understand metric on how well their Internet traffic is working. So the problem here is that the users can't tell how well it's working without resorting to ICMP PING to try to figure out what's going on.
> 
> For instance, if their web browser had insight into what the TCP stack was doing then it could present information a lot better to the user. Instead of telling the user "time to first byte" (which is L4 information), it could tell the less novice user about packet loss, PDV, reordering, RTT, how well concurrent connections to the same IP address are doing, and tell more about *why* some connections are slow instead of just saying "it took 5.3 seconds to load this webpage and here are the connections and how long each took". For the novice user there should be some kind of expert system that collects data that you can send to the ISP, which also has an expert system to say "it seems your local connection delays packets, please connect to a wired connection and try again". It would know if the problem was excessive delay, excessive delay that varied a lot, packet loss, reordering, or whatever.
> 
> We have a huge amount of information in our TCP stacks that is locked in there and not used properly to help users figure out what's going on, and there is basically zero information flow between the applications using TCP and the TCP stack itself. Each just tries to do its best on its own layer.

This actually seems like a good idea.  Several of those statistics, at least, could be exposed to userspace without incurring any additional overhead in the stack (except for the queries themselves), which is important for high-performance server users.  TCP stacks already track RTT, and sometimes MinRTT - the difference between the current RTT and MinRTT is a reasonable lower-bound estimate of induced latency.
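
As a rough illustration of the RTT/MinRTT point, the arithmetic can be sketched in a few lines of Python.  This is only a sketch under stated assumptions: the `RttTracker` class and its method names are hypothetical, the smoothing gain of 1/8 is borrowed from RFC 6298, and no real stack is claimed to store its state this way.  It is only a lower bound because the smallest RTT seen so far may itself include some queueing.

```python
from dataclasses import dataclass
from typing import Optional

ALPHA = 1 / 8  # SRTT smoothing gain, as in RFC 6298


@dataclass
class RttTracker:
    """Hypothetical per-connection tracker (illustrative, not a real stack's layout)."""

    srtt_ms: Optional[float] = None     # smoothed RTT
    min_rtt_ms: Optional[float] = None  # lowest RTT sample seen so far

    def add_sample(self, rtt_ms: float) -> None:
        """Fold one RTT measurement into the smoothed and minimum values."""
        if self.srtt_ms is None:
            # First measurement initialises both values.
            self.srtt_ms = rtt_ms
            self.min_rtt_ms = rtt_ms
            return
        self.srtt_ms = (1 - ALPHA) * self.srtt_ms + ALPHA * rtt_ms
        self.min_rtt_ms = min(self.min_rtt_ms, rtt_ms)

    def induced_latency_ms(self) -> float:
        """Lower-bound estimate of induced latency: smoothed RTT minus minimum RTT."""
        if self.srtt_ms is None or self.min_rtt_ms is None:
            raise ValueError("no RTT samples yet")
        # Clamp at zero to guard against floating-point rounding.
        return max(0.0, self.srtt_ms - self.min_rtt_ms)


if __name__ == "__main__":
    tracker = RttTracker()
    for sample in (100.0, 100.0, 180.0):
        tracker.add_sample(sample)
    print(tracker.induced_latency_ms())
```

With the three samples above (100, 100 and 180 ms) the smoothed RTT ends at 110 ms against a minimum of 100 ms, so the estimate is 10 ms.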

For stacks which don’t already track all the desirable data, a socket option could be used to turn that tracking on, allocating the extra space it needs.  To maximise portability, therefore, it might be necessary to require that option before statistics requests will be valid, even on stacks which do collect it all anyway.
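
The opt-in behaviour described here might look something like the following toy model.  Everything in it is an assumption made for illustration: `ToyConnection`, `enable_stats`, `get_stats` and `StatsNotEnabled` are invented names, not any existing socket API, and a real implementation would live in the stack behind a socket option rather than in a Python class.

```python
class StatsNotEnabled(Exception):
    """Raised when statistics are requested before being switched on."""


class ToyConnection:
    """Hypothetical stand-in for a TCP connection with opt-in statistics."""

    def __init__(self) -> None:
        self.srtt_ms = None  # always tracked; the stack needs it anyway
        self._extra = None   # extra counters, allocated only on request

    def enable_stats(self) -> None:
        """Plays the role of the socket option: allocate the extra state."""
        if self._extra is None:
            self._extra = {"min_rtt_ms": None, "rtt_samples": 0}

    def on_rtt_sample(self, rtt_ms: float) -> None:
        """Called for each RTT measurement the connection makes."""
        if self.srtt_ms is None:
            self.srtt_ms = rtt_ms
        else:
            self.srtt_ms = 0.875 * self.srtt_ms + 0.125 * rtt_ms
        if self._extra is not None:
            # The additional bookkeeping is paid for only by connections that asked.
            prev = self._extra["min_rtt_ms"]
            self._extra["min_rtt_ms"] = rtt_ms if prev is None else min(prev, rtt_ms)
            self._extra["rtt_samples"] += 1

    def get_stats(self) -> dict:
        """Valid only after enable_stats(), even if the data happens to exist."""
        if self._extra is None:
            raise StatsNotEnabled("call enable_stats() first")
        return {"srtt_ms": self.srtt_ms, **self._extra}
```

Rejecting the query unless the option has been set, even where the data is collected regardless, is what would keep an application written against one stack working unchanged on another.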

Recent versions of Windows, even, have a semi-magic system which gives a little indicator of whether your connection has functioning Internet connectivity or not.  This could be extended, if Microsoft saw fit, to interpret these statistics and notify the user that their connection was behaving badly in the ways we now find interesting.  Whether Microsoft will do such a thing (which would undoubtedly piss off every major ISP on the planet) is another matter, but it’s a concept that can be used by Linux desktops as well, and with less political fallout.

Now, who’s going to knuckle down and implement it?

 - Jonathan Morton