Re: [alto] Comment on draft-ietf-alto-performance-metric-09

"Y. Richard Yang" <> Fri, 17 April 2020 20:50 UTC

From: "Y. Richard Yang" <>
Date: Fri, 17 Apr 2020 16:47:11 -0400
To: Martin Duke <>

Hi Martin,

Thank you so much for the review! The authors (Luis, Sabine, ...) just had
a review meeting this morning and your review is very timely.

Please see below.

On Fri, Apr 17, 2020 at 3:45 PM Martin Duke <> wrote:

> This is not a full review. I'm late to this and I apologize if I'm
> covering ground which has been fully resolved before. With all hats off...
> It has come up in the working group before that some of the cost metrics
> can vary quite a lot over time, much more quickly than the scale at which
> ALTO would update them.
> Have we considered changing the metrics to have more long-lived properties
> while still being useful? For example
> 1. we could split the various delay metrics into min and max versions. Min
> and Max delay should be relatively constant over a given topology. Max
> delay maps quite nicely to something like an SLA.

Indeed. We discussed more stable statistics (e.g., the min can be stable if
it reflects propagation delay and there is no major reroute, the 95th
percentile can be more stable, ...).
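
A minimal sketch of why such statistics tend to be more stable, assuming
hypothetical delay samples in milliseconds over a path with a 10 ms
propagation floor. The sample values, the function names, and the choice of
the nearest-rank percentile method are illustrative assumptions, not
anything defined in the draft.

```python
import math


def nearest_rank_percentile(samples, p):
    """Return the p-th percentile (integer 0 < p <= 100), nearest-rank method."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    # 1-based rank; multiply before dividing so exact multiples stay exact.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def summarize(samples):
    """Summarize one measurement window with min, 95th percentile, and max."""
    return {
        "min": min(samples),
        "p95": nearest_rank_percentile(samples, 95),
        "max": max(samples),
    }


# Hypothetical window with almost no queueing.
quiet = [10.0, 10.2, 10.1, 10.4, 10.3] * 4

# Hypothetical congested window on the same path, with one large outlier.
busy = [10.0, 14.0, 22.5, 11.0, 35.0, 12.5, 18.0, 10.1, 27.0, 90.0,
        10.3, 13.0, 21.0, 11.5, 30.0, 12.0, 17.0, 10.2, 25.0, 16.0]

for name, window in (("quiet", quiet), ("busy", busy)):
    print(name, summarize(window))
```

In this made-up data the min is the same in both windows because it tracks
the propagation floor, the max is dominated by the single outlier, and the
95th percentile moves with congestion but ignores that outlier.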

> 2. packet loss -- can we focus this on link layer losses rather than queue
> drops? Most simply, this could just be a boolean that says "this path
> frequently has losses not related to congestion" but could also be expanded
> into a link packet error rate, if valuable. Congestion based losses are of
> course dependent on the application.

This is an interesting point! I am not an expert on many link layer
technologies, but I assume that the loss (more like error) rate is more
stable in many settings. Earlier this week during the NSF Huge Data
workshop, one discussion item was that we are approaching a new state where
many more files can be corrupted without detection by the TCP checksum in
large-scale data transfers. It is not clear where the corruption happens
(the link layer, a buggy switch, ...), but I assume that it is more stable.
It is a good

Backing up to a slightly higher level, one key issue that we want to
resolve before we are happy to wrap up this document is whether we can
reduce the number of metrics, instead of having a plethora of them:

set of metrics (delay, loss, bandwidth, ...) X set of statistics (min, max,
percentile, avg) X set of measurement settings (ping each x sec)
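
To make the size of this cross product concrete, here is a rough sketch
that enumerates it and then filters a smaller subset. The labels, the two
measurement settings, and the choice of which statistics count as "stable"
are all hypothetical placeholders, not identifiers from the draft.

```python
from itertools import product

# Hypothetical labels for the three dimensions named above.
METRICS = ["delay", "loss", "bandwidth"]
STATISTICS = ["min", "max", "percentile", "avg"]
MEASUREMENT_SETTINGS = ["ping-1s", "ping-60s"]  # assumed example settings

# Full cross product: every (metric, statistic, setting) combination.
full_space = list(product(METRICS, STATISTICS, MEASUREMENT_SETTINGS))

# Assumption for illustration only: treat min and percentile as the
# "more stable" statistics and keep just those combinations.
STABLE_STATISTICS = {"min", "percentile"}
first_step = [combo for combo in full_space if combo[1] in STABLE_STATISTICS]

print(len(full_space), "combinations in the full space")
print(len(first_step), "combinations in the reduced subset")
```

Even with these few placeholder values the full space grows
multiplicatively, which is why restricting one dimension shrinks it quickly.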

From your comments, one direction I see is to define a small subset of the
above cross-product space, as a first step. Or should we define a larger set
but make sure to include the more stable, more-likely-quickly-usable
settings?

Thanks again for the timely feedback!


> Thanks
> Martin