RE: [ippm] New QUIC Packet Loss Measurement (draft-cfb-ippm-spinbit-measurements)

"Lubashev, Igor" <ilubashe@akamai.com> Thu, 23 April 2020 16:41 UTC

From: "Lubashev, Igor" <ilubashe@akamai.com>
To: "Bulgarella Fabio (Guest)" <fabio.bulgarella=40guest.telecomitalia.it@dmarc.ietf.org>, Ian Swett <ianswett@google.com>, Cociglio Mauro <mauro.cociglio@telecomitalia.it>
CC: Riccardo Sisto <riccardo.sisto@polito.it>, "isabelle.hamchaoui@orange.com" <isabelle.hamchaoui@orange.com>, "quic@ietf.org" <quic@ietf.org>, "alexandre.ferrieux@orange.com" <alexandre.ferrieux@orange.com>, "IETF IPPM WG (ippm@ietf.org)" <ippm@ietf.org>
Subject: RE: [ippm] New QUIC Packet Loss Measurement (draft-cfb-ippm-spinbit-measurements)
Date: Thu, 23 Apr 2020 16:41:27 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/9KX3V_adQDmxr59gOL7B9eskI-4>

Fabio, thanks for the reply.


  *   "Moreover, with a strong amount of reordering the Qbit (and the SpinBit) would not work either."

Strong reordering exists.  That's why we are recommending the minimum Q to be 64.  This comes from real data, not lab data.  If 5 packets were enough to tell the end of marking periods, we would have recommended minimum Q to be 8.



  *   R-bit signal delay.

R signal is still going to be significantly delayed, especially with slower connections (not every connection is going to be streaming media at a sustained rate of 15Mbps).  So if for a 15Mbps average-rate connection you can expect to see R-signal with a delay between 8-RTTs (1:2 ACK) and 38-RTTs (1:10 ACK), the delay increases to 24-RTTs and 114-RTTs for a 5Mbps average-rate connection.
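
As a rough check of these figures, here is a minimal sketch that redoes the arithmetic. It assumes the inputs of the estimate quoted further down in this thread (1400-byte packets, Q=64, a factor of 2, and a 25ms link); the function and constant names are hypothetical and exist only for this illustration.

```python
# Rough R-signal lag estimate. All names here are illustrative assumptions.
PACKET_BYTES = 1400  # packet size used in the quoted estimate
Q_PERIOD = 64        # packets per Q-bit marking period
FACTOR = 2           # the factor of 2 from the quoted estimate


def r_signal_lag_rtts(rate_mbps: float, rtt_ms: float, ack_ratio: int) -> float:
    """Return the R-signal lag in RTTs for an average rate and an ACK ratio of 1:ack_ratio."""
    lag_bytes = PACKET_BYTES * Q_PERIOD * FACTOR * ack_ratio
    bytes_per_rtt = rate_mbps * 1e6 / 8 * rtt_ms / 1e3
    return lag_bytes / bytes_per_rtt


if __name__ == "__main__":
    for rate_mbps in (15, 5):
        for ack_ratio in (2, 10):
            lag = r_signal_lag_rtts(rate_mbps, rtt_ms=25, ack_ratio=ack_ratio)
            print(f"{rate_mbps} Mbps, 1:{ack_ratio} ACK: about {lag:.1f} RTTs")
```

The results land close to the rounded numbers above: roughly 8 and 38 RTTs at 15Mbps, and about three times that at 5Mbps.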


  *   R-bit signal quality.

L-bit signal provides a crisp shape of the loss - whether it is random loss, or packets are lost in batches (and how many packets are in a loss batch).  By contrast, R-signal only indicates an average loss over 2 to 10 Q-bit marking periods.  The shape of the loss is erased.

I think for that alone, L-signal is more useful for troubleshooting.



  *   L-bit signal depends on each implementation's determination of when to "declare a packet lost" (which may differ for each implementation), while R-signal depends only on more generic behaviors.

This is an important difference, and it seems accurate on its surface.  In reality, every QUIC implementation will keep track of packets and declare them lost for the purposes of data retransmission and/or congestion control.  The timing and details of when packets are declared lost (L-bit signal) will vary.  Likewise, every QUIC implementation will send some ACKs, but their timing and frequency (R-signal) will vary.  R-signal quality will also heavily depend on the "marking period detection" implementation quality and "Q-signal averaging" implementation details.



  *   End-to-end loss: "We can also place two independent observers wherever on the two directions and they will still produce coherent E2E measures."
  *   End-to-end loss: "by observing only one direction we can measure the [...] the end-to-end in the opposite direction (Rbit)."

Correct, as long as both endpoints implement the entire QR scheme (even if this is an unusual setup - measuring path A requires observing path B).

Also, accurate accounting with QR scheme requires correlation of both directions, with the R-bit direction significantly lagging the Q-bit direction.  It should be understood how this scheme behaves in cases of changing Connection IDs.

Best wishes,


  *   Igor


From: Bulgarella Fabio (Guest) <fabio.bulgarella=40guest.telecomitalia.it@dmarc.ietf.org>
Sent: Wednesday, April 22, 2020 3:12 PM
To: Lubashev, Igor <ilubashe@akamai.com>; Ian Swett <ianswett@google.com>; Cociglio Mauro <mauro.cociglio@telecomitalia.it>
Cc: Riccardo Sisto <riccardo.sisto@polito.it>; isabelle.hamchaoui@orange.com; quic@ietf.org; alexandre.ferrieux@orange.com; IETF IPPM WG (ippm@ietf.org) <ippm@ietf.org>
Subject: Re: [ippm] New QUIC Packet Loss Measurement (draft-cfb-ippm-spinbit-measurements)

Hello Igor,
here are some thoughts about your concerns.

1. If big differences in delay are present between two parallel paths on which the same connection is divided, this is certainly an unwanted behavior, because it could create problems for the protocol itself even before damaging the performance measurements. As a network operator, we try to avoid such scenarios. However, our mechanism works correctly in most cases and is still resistant to normal reordering. Moreover, with a strong amount of reordering the Qbit (and the SpinBit) would not work either.


2. It's true that your mechanism allows the measurement of E2E and upstream loss of the observed direction. However, these measurement methodologies are born with the idea of providing information on the performance of the entire connection, as was done with the SpinBit. Actually with our mechanism, when observing a single direction, we still are able to measure the E2E loss of the opposite channel (using the R-bit).


3. The ACK ratio of 1:10 should only occur in the absence of loss. By introducing loss, the ratio between ACK and data is certainly better. So the quality of our signal improves just when it is needed; we don't care if we get delayed measurements when everything is working great. This behavior is confirmed by our laboratory tests where we have seen that loss can be easily detected in an extremely accurate manner even in the case of highly unbalanced flows.

L-Bit has an RTO delay, but to evaluate an accurate loss ratio you still have to observe a significant number of packets.


4. First of all, our solution is not dependent on any specific protocol implementation unlike the QL and therefore there shouldn't be problems related to the quality of the implementation. In other words, our portion of code is always the same, in every protocol implementation.
Secondly, as anticipated in point 2, by observing only one direction we can measure the losses in the operator's domain (Qbit) and the end-to-end in the opposite direction (Rbit). So, our solution does work also for asymmetric path segments. We can also place two independent observers wherever on the two directions and they will still produce coherent E2E measures.

The QL solution does not give any information about the opposite direction, it only measures what it observes, that is the current direction. Having two independent measures, one per direction, can be an advantage in some aspects, but a disadvantage for others.


Best regards,
Fabio B.




________________________________
From: ippm <ippm-bounces@ietf.org> on behalf of Lubashev, Igor <ilubashe=40akamai.com@dmarc.ietf.org>
Sent: Tuesday, April 21, 2020 18:31
To: Ian Swett; Cociglio Mauro
Cc: Riccardo Sisto; isabelle.hamchaoui@orange.com; quic@ietf.org; alexandre.ferrieux@orange.com; IETF IPPM WG (ippm@ietf.org)
Subject: [EXT] Re: [ippm] New QUIC Packet Loss Measurement (draft-cfb-ippm-spinbit-measurements)

I like the QR draft from the perspective that it mostly works (and lots of credit is due for things that work).

I have a few concerns, though.

1. The most trivial one is that QR is actually harder to implement correctly for endpoints, given the tricky processing required to identify when a marking interval has ended.  Experimental data that Akamai & Orange compiled shows that there are networks that do ECMP among links with significantly different latencies.  They will cause a lot more reordering to a burst of packets than the recommended "observe 5 packets (in a row?) of the opposite markings" rule can address.  You might need to potentially defer the decision on where exactly the marking period boundary is until you've seen almost half of the next marking period.  Requiring every endpoint to implement this right is tough.

2. Since the most important direction (at least for downloads and media streaming) is server-to-client, to get this signal with QR, you need both endpoints implementing the logic, with the heaviest lift on the clients (see above).  For QL, a client that does not want to implement QL logic but is ok with the loss data reporting can just update the TP and header protection mask - all the lift is on the server side.

3. I am concerned with the quality of the signal you get from QR.  While the upstream loss signal is strong (same Q-bit), the downstream/end-to-end loss signal is weak and is likely to result in a very delayed and inconclusive signal.  Google's experience (as I understand it) and protocol research (https://erg.abdn.ac.uk/~downloads/ackscaling.pdf) point to the benefit of ACK-ing only 1:10 packets after the initial slow start (draft-fairhurst-quic-ack-scaling).  Since such behavior has a beneficial impact on networks, mobile devices, and endpoint CPU, and no negative impact on the protocol performance, I expect it to be in common use.  With 1:10 ACK ratio, R-bit signal is significantly delayed as it is lagging the loss by at least 1400*(Q=64)*2*10=1.8MB of data transferred, which is at least 38 RTTs for a 15Mbps stream over a 25ms link (L-bit signal is delayed by an RTO).

4. QR scheme only works for symmetric path segments, when the observer can capture both directions of the flow. You also cannot compute end-to-end loss in any direction, unless both endpoints implement the entire QR algorithm.  In contrast, QL works for single-direction observers, and whichever endpoint implements QL algorithm ensures end-to-end loss signal for the packets it sends, and the quality of that loss signal solely depends on the quality of that endpoint's implementation (not the quality of the other endpoint's implementation).  QL seems more deployable.
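
To make the Q-bit part of the points above concrete, here is a minimal sketch of alternate marking with a fixed period and of how an on-path observer could count upstream loss per marking period. It assumes in-order arrival and fewer than one period's worth of losses per period, so it deliberately ignores the reordering problem raised in point 1. The function names and the period constant are hypothetical, not taken from either draft.

```python
from typing import Iterable, List

Q_PERIOD = 64  # packets per marking period (the minimum Q mentioned in this thread)


def mark_q_bits(num_packets: int, period: int = Q_PERIOD) -> List[int]:
    """Sender side: Q-bit value of each outgoing packet, toggled every `period` packets."""
    return [(i // period) % 2 for i in range(num_packets)]


def count_upstream_loss(observed: Iterable[int], period: int = Q_PERIOD) -> List[int]:
    """Observer side: number of packets missing from each completed marking period.

    Assumes no reordering; the final run is not reported because its end is not yet seen.
    """
    losses: List[int] = []
    current = None
    run = 0
    for q in observed:
        if current is None:
            current, run = q, 1
        elif q == current:
            run += 1
        else:
            # The Q-bit flipped: the previous marking period is complete.
            losses.append(period - run)
            current, run = q, 1
    return losses


if __name__ == "__main__":
    sent = mark_q_bits(256)
    dropped = {10, 70, 71}  # indices of packets lost upstream of the observer
    seen = [q for i, q in enumerate(sent) if i not in dropped]
    print(count_upstream_loss(seen))  # prints [1, 2, 0]
```

With reordering, the `else` branch above would close a period too early, which is exactly why deciding where a marking period ends is the hard part.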


  *   Igor


From: Ian Swett <ianswett@google.com>
Sent: Wednesday, April 8, 2020 3:44 PM
To: Cociglio Mauro <mauro.cociglio=40telecomitalia.it@dmarc.ietf.org>
Cc: quic@ietf.org; Lubashev, Igor <ilubashe@akamai.com>; alexandre.ferrieux@orange.com; isabelle.hamchaoui@orange.com; Riccardo Sisto <riccardo.sisto@polito.it>; IETF IPPM WG (ippm@ietf.org) <ippm@ietf.org>
Subject: Re: [ippm] New QUIC Packet Loss Measurement (draft-cfb-ippm-spinbit-measurements)

Thanks for sharing this.  From my perspective, this is an improvement upon the previous proposal in terms of robustness.  This design feels very similar to the spin bit, and seems trivial to implement, which are also nice properties.

Also, QUIC doesn't really have retransmissions, so the 'R bit' name always made me a bit uncomfortable.

I'm not saying QUIC should adopt this yet, but I'd be interested in seeing a privacy analysis completed for it.

Ian

On Wed, Apr 8, 2020 at 4:25 AM Cociglio Mauro <mauro.cociglio=40telecomitalia.it@dmarc.ietf.org> wrote:
Dear QUIC WG Members.

We submitted to the IPPM WG a draft describing a new 2-bit packet loss methodology (https://tools.ietf.org/html/draft-cfb-ippm-spinbit-measurements-01).
The measurement of packet loss is under discussion in the QUIC WG and our draft introduces two alternatives for it.
The first one is a spin-bit dependent signal and uses a single bit. The second one, described in the slides linked below, is a standalone solution based on a two-bit loss signal and on alternate marking (RFC 8321).
This last methodology improves, in our opinion, on the algorithm proposed by Orange and Akamai described in https://tools.ietf.org/html/draft-ferrieuxhamchaoui-quic-lossbits-03.

The 2 bits are the sQuare bit (Q-bit) and the Reflection square bit (R-bit).
The Q-bit doesn't change from the Ferrieux-Hamchaoui draft but the R-bit replaces the L-bit.
This avoids the L-bit's dependence on an internal protocol variable, a problem raised in the last QUIC interim meeting.

You can find the slides we prepared for the IPPM interim meeting at the following link: https://github.com/ietf-ippm/meeting-materials/blob/aa486f7ead28b9f55c5ef499e5c9ed33ab06daf9/ietf107-virtual/Slides/07-spinbit-measurements.pdf

Comments and suggestions are always welcome.

Of course we are available to present our proposals in the next QUIC WG meeting or to arrange a Webex side meeting if needed.

Best regards.

Mauro, Fabio, Giuseppe, Massimo and Riccardo


_____________________
Mauro Cociglio
TIM - CT.TA.EI
Via G. Reiss Romoli, 274
10148 - Torino (Italy)
Tel.: +390112285028
Mobile: +393357669751
_____________________

