Re: Spin bit discussion - where we're at - use case exsist!!!

Benjamin Kaduk <bkaduk@akamai.com> Tue, 28 November 2017 16:22 UTC

Subject: Re: Spin bit discussion - where we're at - use case exsist!!!
To: Roni Even <roni.even@huawei.com>, QUIC WG <quic@ietf.org>
References: <6E58094ECC8D8344914996DAD28F1CCD8464CB@DGGEMM506-MBX.china.huawei.com>
From: Benjamin Kaduk <bkaduk@akamai.com>
Message-ID: <c88953b3-61c3-2605-a2a5-38848c0f319f@akamai.com>
Date: Tue, 28 Nov 2017 10:21:52 -0600
In-Reply-To: <6E58094ECC8D8344914996DAD28F1CCD8464CB@DGGEMM506-MBX.china.huawei.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/mVN6BMrh_rkiHfGyH5XWtfAZ1Mc>

On 11/23/2017 12:13 AM, Roni Even wrote:
> <snip>
>> 1) A description of the use case(s) that motivate this proposal. We
>> understand that the goal is to measure RTT, but some people are still unclear
>> as to why that's necessary to operate a network. Detailed scenarios and
>> ideally real-world examples (e.g., from TCP) would help tremendously.
>> Saying "I need to debug the network" is not enough detail.
>>
> [Roni Even] I submitted such a document but I assume that nobody read it since I would expect that this bullet will at least mention it !!!
>
> See https://tools.ietf.org/id/draft-even-quic-troubleshooting-video-delivery-00.tx 
>
> Please read the document 
>

Okay, I went and read the document.  I'm not really sure how much it
helps me understand the scenarios in question.

The high-level takeaways from my read of the document seem to be:

o Streaming video is common, and a common source of end-user complaints
when it doesn't work well.  (It also uses a lot of bandwidth.)

o A measurement point in the middle of a TCP flow can use TCP segment
numbers to estimate RTT for the upstream and downstream halves of the
flow.  A video streaming flow that is largely server-to-client gives a
more reliable downstream RTT estimate (i.e., between measurement point
and client) than upstream one.  RTT estimates can be noisy due to
delayed acks and packet loss.

o Such a measurement point can also estimate upstream/downstream loss by
examining the sliding window and observing retransmits. For the
server-to-client video flow, the downstream loss estimate is again the
more reliable one.

o Home wifi is a common source of problems, and wifi has a unique
signature of high RTT combined with low loss.  I guess the idea is
supposed to be that a measurement point close to the home network can
"blame" the wifi if it sees high RTT but not much loss on the flow in
question?

o Network operators can use big data analytics to detect deviations from
normal network metrics and isolate faulty network segments.  The above
RTT and loss metrics at measurement points are used as examples but no
claim is made that other metrics would not provide equally functional
input for analytics.

o The network operator can use RTT/loss estimates from a measurement
point near the server as evidence to try to convince the server operator
that the server('s network) is misbehaving.
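
To make the RTT and loss bullets above concrete, here is a rough sketch
of what a passive measurement point might compute for the downstream
half of a server-to-client flow.  This is my own illustration, not code
from the draft: the Packet record, its field names, and the
downstream_stats() function are all hypothetical, sequence-number
wraparound and SACK are ignored, and a repeated segment is simply
assumed to mean a loss downstream of the observer.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    """One packet as seen at the measurement point (hypothetical record)."""
    t: float            # observation time in seconds
    from_server: bool   # True: server-to-client data, False: client ACK
    seq: int = 0        # first sequence number carried (data packets)
    length: int = 0     # payload length in bytes (data packets)
    ack: int = 0        # cumulative acknowledgment number (ACK packets)

def downstream_stats(packets):
    """Estimate downstream RTT samples and loss rate from observed packets."""
    pending = {}        # expected ack number -> time the segment was first seen
    seen_ends = set()   # end sequence numbers of segments already observed
    rtts = []
    data_segments = 0
    retransmits = 0
    for p in packets:
        if p.from_server and p.length > 0:
            data_segments += 1
            end = p.seq + p.length
            if end in seen_ends:
                # Same segment seen twice: assume it was lost downstream.
                retransmits += 1
                # The eventual ACK is ambiguous, so take no RTT sample.
                pending.pop(end, None)
            else:
                seen_ends.add(end)
                pending[end] = p.t
        elif not p.from_server:
            t_seen = pending.pop(p.ack, None)
            if t_seen is not None:
                rtts.append(p.t - t_seen)
            # A cumulative (possibly delayed) ACK also covers older segments;
            # drop them without sampling to avoid inflated estimates.
            for end in [e for e in pending if e <= p.ack]:
                del pending[end]
    loss_rate = retransmits / data_segments if data_segments else 0.0
    return {"rtt_samples": rtts, "loss_rate": loss_rate}

trace = [
    Packet(t=0.00, from_server=True, seq=1000, length=100),
    Packet(t=0.01, from_server=True, seq=1100, length=100),
    Packet(t=0.05, from_server=False, ack=1200),   # delayed ACK for both
    Packet(t=0.06, from_server=True, seq=1200, length=100),
    Packet(t=0.30, from_server=True, seq=1200, length=100),  # retransmit
    Packet(t=0.34, from_server=False, ack=1300),
]
stats = downstream_stats(trace)
```

Skipping RTT samples for retransmitted segments and for segments covered
only by a later cumulative ACK is one way to limit the noise from packet
loss and delayed acks mentioned above; a real measurement point may well
do something different.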

While the discussion of how a measurement point can use segment numbers
and window information to estimate RTT and loss is useful, I'm not
overly concerned about that discussion and am willing to take it as a given.

I think what might help me understand the utility of these measurements
is a detailed walkthrough of what would happen in some (hypothetical)
case where I, as a user, call up my ISP and complain that YouTube is
misbehaving, assuming I can get to a point where I'm talking to a human
at the ISP who can perform/view this sort of measurement.  What
information does the ISP need from me?  What's the first thing to do --
insert a measurement point as close to my home router as possible to
rule (out/in) my wifi?  Or do we start off by engaging multiple
measurement points?  Does the "big data" analytics at the network
measurement center come into play at all for this walkthrough or is it
completely separate?

It's also still unclear to me to what extent RTT/loss is used as input
to the network measurement center's analytics simply because it's easy
and works, vs. what research has been done on using other metrics as
input to the analytics.

Thanks,

Ben