Re: [alto] AD review of draft-ietf-alto-performance-metrics-15

Qin Wu <bill.wu@huawei.com> Mon, 05 July 2021 02:33 UTC

From: Qin Wu <bill.wu@huawei.com>
To: Martin Duke <martin.h.duke@gmail.com>
CC: "Y. Richard Yang" <yry@cs.yale.edu>, IETF ALTO <alto@ietf.org>, "sabine.randriamasy@nokia-bell-labs.com" <sabine.randriamasy@nokia-bell-labs.com>, LUIS MIGUEL CONTRERAS MURILLO <luismiguel.contrerasmurillo@telefonica.com>
Thread-Topic: [alto] AD review of draft-ietf-alto-performance-metrics-15
Thread-Index: AddxRZYiyM73WYTpQvOQy8ivmg43DQ==
Date: Mon, 05 Jul 2021 02:33:04 +0000
Message-ID: <87b7ba4b93b4492b9f4b240502da43e2@huawei.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/alto/EqmkM3WCmbKVMRKWx3FjrDb6dAU>
Subject: Re: [alto] AD review of draft-ietf-alto-performance-metrics-15
Precedence: list

Hi, Martin:
I have filed two open issues related to this draft:
https://github.com/ietf-wg-alto/draft-ietf-alto-performance-metrics/issues/1
https://github.com/ietf-wg-alto/draft-ietf-alto-performance-metrics/issues/2
I have asked the authors to address them in v-16. The new version will come soon.

-Qin
From: Martin Duke [mailto:martin.h.duke@gmail.com]
Sent: July 3, 2021 6:23
To: Qin Wu <bill.wu@huawei.com>
Cc: Y. Richard Yang <yry@cs.yale.edu>; IETF ALTO <alto@ietf.org>
Subject: Re: [alto] AD review of draft-ietf-alto-performance-metrics-15

This has been in "Revised I-D needed" for 95 days. Is there an issue I should be aware of, or will I see a revision soon?

On Wed, Jun 9, 2021 at 5:36 AM Qin Wu <bill.wu@huawei.com> wrote:
Hi, Richard and Martin:

From: alto [mailto:alto-bounces@ietf.org] On Behalf Of Martin Duke
Sent: May 4, 2021 2:01
To: Y. Richard Yang <yry@cs.yale.edu>
Cc: Brian Trammell <ietf@trammell.ch>; IETF ALTO <alto@ietf.org>
Subject: Re: [alto] AD review of draft-ietf-alto-performance-metrics-15

Hi Richard,

Replies inline.

On Thu, Apr 8, 2021 at 1:04 PM Y. Richard Yang <yry@cs.yale.edu> wrote:
If the intent is that it be machine-readable, then there are several places where this standard is going to need more standardization (i.e. precise definition of text fields).

Some of the authors discussed the issue and feel that going down the path of making the content machine-readable, in a systematic way, adds substantial complexity.

OK, let's just make the intent clear in the draft.

But zooming out: I understand that the point is that "the client knows more about the context," which is pretty much what 5.1 says. But I don't understand if the "client" is the user or a user agent, and what either one would actually do with the information. Would the application execute a policy based on the source? Why would it use a latency that came from an sla, but not from measurement? etc.

This is a good comment. Here is an example (https://ipnetwork.bgtmo.ip.att.net/pws/averages.html) that motivated some early discussions. In this example, AT&T posts both target values (aka sla; should we change the name from sla to service-level-target?) and actual measurements. In this sense, ALTO can be considered a standard way of providing and updating the information. Both target and actual values can be useful. Make sense?

So this use case is about allowing users to query performance metrics for consumption by humans, rather than by automated clients trying to pick a server? This seems like a lot of machinery for that purpose, but if this is important to the WG then OK. Let's just write down how this context info can be used, if this is the strongest use case.


[Qin] I think operators may need to monitor performance where there is an SLA and proactively detect when the service is degrading. Regulators are also interested in monitoring the performance of broadband service and comparing performance between ISPs or between different countries. All of these use cases have been documented in RFC 7536, which the performance metrics draft could reference.
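
As a minimal sketch of the operator use case above (comparing a measured value against an SLA target to detect degradation), assuming the client has already retrieved both values for each source/destination pair: the PID names, delay values, and the 10% tolerance are hypothetical and do not come from the draft.

```python
# Hypothetical one-way delay values in milliseconds, keyed by (source, destination).
# In practice these would come from two ALTO cost maps for the same metric,
# one with cost-source "sla" and one carrying actual measurements.
sla_targets = {("pid1", "pid2"): 20.0, ("pid1", "pid3"): 35.0}
measured = {("pid1", "pid2"): 18.5, ("pid1", "pid3"): 41.2}


def degraded_pairs(targets, observed, tolerance=0.10):
    """Return the pairs whose observed delay exceeds the target by more than `tolerance`."""
    result = []
    for pair, target in targets.items():
        value = observed.get(pair)
        # Pairs without a measurement are skipped rather than flagged.
        if value is not None and value > target * (1 + tolerance):
            result.append(pair)
    return sorted(result)


print(degraded_pairs(sla_targets, measured))
```

Only a comparison of this kind needs to know which value is the target and which is the measurement, which is the distinction the cost-source field is meant to carry.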




- Sec 2.1 I am confused about the meaning of the "sla" cost-source. Does this refer to an SLA the ALTO client has with the network? Between the target IP and the network? Or something else? If the first, does this link to client authentication in some way? If the second, what are the privacy implications of exposing these SLAs?

It is between the target IP and the network. Here is some text in the current version
on the authentication and privacy aspects (Sec. 6):
"Indeed, TE
   performance is a highly sensitive ISP information; therefore, sharing
   TE metric values in numerical mode requires full mutual confidence
   between the entities managing the ALTO server and the ALTO client.
   ALTO servers will most likely distribute numerical TE performance to
   ALTO clients under strict and formal mutual trust agreements.  On the
   other hand, ALTO clients must be cognizant on the risks attached to
   such information that they would have acquired outside formal
   conditions of mutual trust."

Will this be OK?

That privacy information is alright, but exposing the details of third-party SLAs deserves special attention.

Yes.

But to follow up your answer: if the client has a better SLA than the target, this won't show up in the metrics at all?

Now I see that I need to clarify. The metric is end-to-end, from src IP to dst (target) IP. Does this clarify? I can add a sentence to clarify this.

Yes, that clarifies it. It also spurs my first followup question: does this link to client authentication in some way, or can anyone impersonate a client to get its SLA?

[Qin]: I agree with Martin that we should prevent excess disclosure of the ALTO service provider's data to unauthorized clients or third parties. RFC 7285 defines protection strategies in its security considerations section to address these risks, such as adopting HTTP Digest authentication or TLS client authentication.

I suggest quoting some text from RFC 7285 to address this issue.






- Sec 2.1. Related to the above, the text suggests that any cost-source expressed as "import" could also be expressed as "estimation". Why would the server do this? The text should say, or perhaps it would be conceptually cleaner if "estimation" and "import" were mutually exclusive sources by definition.

In the early WG discussion, they were considered separate, and then the agreement was that import is a special case of estimation, with more specific dependency tracking. Consider the data provenance of how the ALTO data are computed. Estimation means that the server does not want to indicate the specific details, and import gives a precise indication of the exact protocols.

OK, I now understand that "import" implies a specific set of parameters. I can't understand what value this distinction has, but that just circles around to me not understanding the cost-source information at all.


I see. Let me try to clarify slightly differently. Import means that the ALTO server can provide a precise source of information using specific parameters, and estimation means that the value comes from a black box (the server does not reveal the source). Thinking about it a bit more, we can go down the path of specifying a precise format (rfc/section, just as ippm does) when specifying import. Will this be a direction that you want to go?

I don't have a strong opinion on the outcome, except that it should be strongly motivated by a use case and have semantics consistent with that use case. There can be loosely defined contexts that are designed to be human readable, which allow end users to see the QoS they're getting. Or, it can be machine readable, allowing policy to execute off of it, but I'm not 100% sure why policy would care about the context.
[Qin]: My impression is that distinguishing the measured value from the SLA value makes sense, given the use case provided above. As for distinguishing import from estimate, my feeling is that the client doesn't need to care whether it is a one-hop metric or a multi-hop (i.e., end-to-end) metric, as long as we have already specified the source address and destination address in the request/response. Distinguishing a measured value from an estimated value does make sense to me: the key difference is that a measured value is measured directly, while an estimated value is a compound, derived value. Maybe we should reference RFC 6792 for the definitions of direct, cumulative, and sampled metrics:
“
   Direct metrics

      Metrics that can be directly measured or calculated and are not
      dependent on other metrics.

   Cumulative metrics

      Metrics measured over several reporting intervals for accumulating
      statistics.  The time period over which measurements are
      accumulated can be the complete RTP session, or some other
      interval signaled using an RTCP Measurement Information XR Block
      [RFC6776<https://datatracker.ietf.org/doc/html/rfc6776>].  An example cumulative metric is the total number of
      RTP packets lost since the start of the RTP session.

   Sampled metrics

      Metrics measured at a particular time instant and sampled from the
      values of a continuously measured or calculated metric within a
      reporting interval (generally, the value of some measurement as
      taken at the end of the reporting interval).  An example is the
      inter-arrival jitter reported in RTCP SR and RR packets, which is
”
Make sense?
If this makes sense, I think distinguishing nominal and sla from estimated makes sense; without introducing the cost context, we may get different results for the performance metrics.
Maybe the issue is that we need better terminology, since the current terms such as estimation and import may be a little misleading. Based on the clarification above, import should go away in my opinion.
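
To make the cost-source distinction discussed above concrete, here is a minimal sketch of how a client might select cost types by their declared source. The JSON layout (a "cost-context" object with a "cost-source" field), the cost type names, and the "delay-ow" metric name are assumptions based on this thread and may not match the draft's final encoding.

```python
import json

# Hypothetical cost types as they might appear in an ALTO information
# resource directory. Field names and values are illustrative assumptions.
cost_types = {
    "delay-sla": {
        "cost-mode": "numerical",
        "cost-metric": "delay-ow",
        "cost-context": {"cost-source": "sla"},
    },
    "delay-est": {
        "cost-mode": "numerical",
        "cost-metric": "delay-ow",
        "cost-context": {"cost-source": "estimation"},
    },
}


def names_by_source(types, source):
    """Return the names of cost types whose cost-context declares `source`."""
    return sorted(
        name
        for name, cost_type in types.items()
        if cost_type.get("cost-context", {}).get("cost-source") == source
    )


print(json.dumps(names_by_source(cost_types, "sla")))
```

Both cost types report the same metric; only the context tells the client whether a value is a target or an estimate.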


- Sec 5.4.1: "...the ALTO server may provide the client with the validity period of the exposed metric values."

Shouldn't there be a standard format for this? Or are you implying the use of cost-calendar?

Good catch. The decision of the WG at the time was to use HTTP whenever possible. For example, freshness is indicated by an HTTP timestamp (see Sec. 5.2); for consistency, then, we should use HTTP Expires. We can add this. Agree?

Agreed.
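
A minimal sketch of how a client could derive the validity period from the HTTP Expires header, as agreed above. The header values are made up for illustration, and `remaining_validity_seconds` is a hypothetical helper, not something defined by the draft.

```python
from email.utils import parsedate_to_datetime


def remaining_validity_seconds(headers, now):
    """Return the seconds until the Expires time, or None if the header is absent."""
    expires = headers.get("Expires")
    if expires is None:
        return None
    expiry = parsedate_to_datetime(expires)
    # Values that have already expired are reported as 0 rather than negative.
    return max(0.0, (expiry - now).total_seconds())


# Hypothetical response headers from an ALTO server.
headers = {
    "Date": "Mon, 05 Jul 2021 02:33:04 GMT",
    "Expires": "Mon, 05 Jul 2021 02:38:04 GMT",
}
now = parsedate_to_datetime(headers["Date"])
print(remaining_validity_seconds(headers, now))
```

Using the server's Date header as the reference time, rather than the local clock, avoids clock skew between client and server in this sketch; a real client might prefer standard HTTP cache freshness calculations.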
