Re: [ippm] Call for Adoption of draft-mhmcsfh-ippm-pam

Alexander L Clemm <ludwig@clemm.org> Thu, 17 November 2022 00:32 UTC

To: Greg Mirsky <gregimirsky@gmail.com>, Benoit Claise <benoit.claise=40huawei.com@dmarc.ietf.org>
Cc: Tommy Pauly <tpauly=40apple.com@dmarc.ietf.org>, IETF IPPM WG <ippm@ietf.org>
References: <1BCD27D1-4A44-4FF1-BD91-C6B78F0F03A3@apple.com> <b108d198-f24f-bb8e-6782-05ffe95e2888@huawei.com> <CA+RyBmVprYT8vy2Hz+3Hap66u=4DaUh-7YAHVEfPmNzN3dtqLg@mail.gmail.com>
From: Alexander L Clemm <ludwig@clemm.org>
In-Reply-To: <CA+RyBmVprYT8vy2Hz+3Hap66u=4DaUh-7YAHVEfPmNzN3dtqLg@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ippm/d8cexZlueDIR3QCE1amLfUfK0yo>

Hi all,

I wanted to respond to this thread to address and hopefully get closure 
on Benoit's comments, inline, delimited by <ALEX>. My apologies for the 
belated response.  In summary, I do think the scope and the problem are 
clear enough for this work to be adopted, recognizing that there is 
clearly work that remains to be done.  One area certainly for discussion 
concerns the actual set of "second order" metrics to include, but again 
I am wondering whether this determination needs to be made now versus 
once adopted.

--- Alex

On 9/14/2022 1:50 AM, Greg Mirsky wrote:
> Hi Benoit,
> thank you for your comments and questions. Please find my notes 
> in-lined below under the GIM>> tag. I am looking forward to continuing 
> our discussion.
>
> Regards,
> Greg
>
> On Mon, Sep 5, 2022 at 9:21 AM Benoit Claise 
> <benoit.claise=40huawei.com@dmarc.ietf.org> wrote:
>
>     Dear all,
>
>     I don't dispute the importance of this work. However, the scope of
>     this work is not clear yet IMO.
>
>      - SLO, sure,  but what's not clear to me is: SLO per customer,
>     per service, per class of service, per flow, per application
>     I found "Precision Availability Metrics (PAM), aimed at capturing
>     end-to-end service levels for a flow, specifically the degree to
>     which flows comply with the SLOs that are in effect".
>     So OK, we speak about flow. So what is your flow definition?
>
> GIM>> The scope of monitoring is the same as the scope of SLA that is 
> composed of the set of SLOs.

<ALEX> To Benoit: The SLO will apply at the level of a service instance,
which will generally be at the level of the flow. Performing metering at
that level is straightforward; at the end of the day, metrics such as
Violated Intervals are simply additional metrics that could be
maintained as part of flow statistics. That said, clearly you can
construct more complex service models, and the availability metrics will
apply equally regardless of the particular scope with which you define
the SLO.  Whatever the scope, the metrics answer the question of whether
the service that was being delivered was in fact available at all times,
i.e. compliant with the terms of the SLO.
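
To illustrate the point about flow statistics, here is a minimal sketch
of a per-flow record that keeps a violated-interval counter next to an
ordinary packet counter. It is not taken from the draft: the names
FlowStats, update and slo_latency_ms are hypothetical, and it assumes
for simplicity a single latency objective where an interval counts as
violated if any sample in it exceeds the bound.

```python
from dataclasses import dataclass


@dataclass
class FlowStats:
    """Per-flow counters; the field names are illustrative, not from the draft."""
    packets: int = 0
    intervals: int = 0
    violated_intervals: int = 0


def update(stats, interval_latencies_ms, slo_latency_ms):
    """Account for one measurement interval of a flow.

    Assumption: the interval is violated if any latency sample in it
    exceeds the SLO bound.
    """
    stats.packets += len(interval_latencies_ms)
    stats.intervals += 1
    if any(sample > slo_latency_ms for sample in interval_latencies_ms):
        stats.violated_intervals += 1
    return stats


stats = FlowStats()
# Three made-up intervals of latency samples (ms) for one flow.
for samples in ([10.0, 12.0], [11.0, 35.0], [9.0]):
    update(stats, samples, slo_latency_ms=20.0)

print(stats.violated_intervals, "of", stats.intervals, "intervals violated")
```

The same counter would work unchanged if the record were keyed by
service or customer rather than by flow, which is the sense in which the
metrics do not depend on the scope of the SLO.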

We can refine the text, but at the end of the day I think the intent is 
pretty clear and this is one of the items that can be refined as the 
work is adopted.

</ALEX>

>     - Btw, based on the previous quoted sentence, I don't understand
>     this PAM name. No mention of SLA, no mention of flow, no notion of
>     service.
>      Basically, you report a service level indicator (SLI). You
>     confused me with PAM
>
> GIM>> The intention is to report not raw SLI, i.e., measurable metric, 
> but rather how the SLI is conforming to its SLO.

<ALEX> PAM stands for Precision Availability Metric.  I do not think 
there is a need to get into SLAs (which contain a lot more stuff than is 
relevant here); let's focus just on the compliance of the service being 
delivered with its stated service level objective(s).  The reason we 
chose the term "availability" is because of its analogy to system 
availability.  We consider the service as "available" when it is 
delivered per the agreed-upon quality (i.e. SLO).

I am not sure what you refer to with "service level indicator". If you
consider "system availability" a service level indicator, then sure, you
can consider it an SLI.  To me, however, the term SLI would be better
reserved for things like latency or loss.  What makes the Precision
Availability Metrics "special" is the fact that they are not "absolute"
(e.g. RTT=276msec or such), but relative to an SLO, capturing
violations.  Regular SLIs don't do that - they simply say what was
delivered, without indicating whether or not that was in compliance,
which would require additional postprocessing.
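
The distinction between an absolute SLI and a value relative to an SLO
can be sketched as follows. This is only an illustration under stated
assumptions: the Slo class and its is_met method are hypothetical names,
the 250 ms objective is made up, and the two forms of bound follow the
"SLI <= target" and "lower bound <= SLI <= upper bound" wording quoted
further down in this thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Slo:
    """An SLO expressed as bounds on an SLI; either bound may be absent."""
    lower: Optional[float] = None
    upper: Optional[float] = None

    def is_met(self, sli: float) -> bool:
        """True if the measured SLI lies within the configured bounds."""
        if self.lower is not None and sli < self.lower:
            return False
        if self.upper is not None and sli > self.upper:
            return False
        return True


rtt_slo = Slo(upper=250.0)  # hypothetical objective: RTT <= 250 ms
raw_sli = 276.0             # absolute measurement, e.g. RTT=276msec

# The raw SLI says what was delivered; the violation flag says whether
# that was in compliance with the objective.
violation = not rtt_slo.is_met(raw_sli)
print(raw_sli, violation)
```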

</ALEX>

>     - How are you going to report this flow definition, along with the
>     SLI? IPFIX key fields? With a YANG model?
>     This section 6 content is key to understand how to use those SLIs
>     in an operational environment
>
>             The following is a list of items for which further discussion is
>             needed as to whether they should be included in the scope of this
>             specification:
>
>             *  A YANG data model.
>
>             *  A set of IPFIX Information Elements.
>
>             *  Statistical metrics: e.g., histograms/buckets.
>
> GIM>> We welcome collaboration on all or any of these problems.
>
>     - I am not a big fan of specifying some level of thresholding in
>     specifications.
>
>         *  VI is a time interval during which at least one of the performance
>            parameters degraded below its pre-defined optimal level threshold.
>
>         *  SVI is a time interval during which at least one of the performance
>            parameters degraded below its pre-defined critical threshold.
>
>     Based on my experience, most of the time, we don't get the
>     threshold values/names right, and we don't get the number of them
>     right.
>         ex: violated, severely violated ... why not extremely
>     violated, catastrophically violated?
>
<ALEX> I do agree that we need to keep thresholding and such (which I
consider secondary metrics) separate from the primary metrics.  The
primary metrics are IMHO the most important here; secondary metrics are
nice to have (but an add-on which brings its own complexity, may depend
on individual operational policies, etc).  We can certainly discuss
which of the metrics to include, but again this is a determination that
IMHO should not make or break adoption. Re: thresholding, this is in no
way central to this draft; we are aware of the associated issues and
this may be one of the items that could be taken out or isolated from
the other aspects.
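
One way to picture this separation is to treat the number and names of
severity levels as operator policy passed in as data, rather than
something fixed by the metric itself. The sketch below is an assumption
for illustration only, not a proposal from the draft: classify, the
policy lists and the latency thresholds are all hypothetical, and it
assumes a single latency parameter where higher values are worse.

```python
def classify(worst_latency_ms, policy):
    """Return the label of the highest threshold exceeded, or None if compliant.

    `policy` is a list of (label, threshold_ms) pairs supplied by the
    operator; it is an illustrative structure, not defined by the draft.
    """
    label = None
    # Walk thresholds in ascending order so the last match is the most severe.
    for name, threshold_ms in sorted(policy, key=lambda pair: pair[1]):
        if worst_latency_ms > threshold_ms:
            label = name
    return label


# Two operators with different policies over the same measurement logic.
two_level = [("violated", 20.0), ("severely violated", 50.0)]
three_level = two_level + [("extremely violated", 100.0)]

print(classify(60.0, two_level))
print(classify(120.0, three_level))
```

With a structure along these lines, adding or renaming a level changes
only the policy list, not the way intervals are measured.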

</ALEX>

> GIM>> Agree that it might take several iterations to set thresholds 
> right. Would note that draft-ietf-teas-ietf-network-slices 
> <https://datatracker.ietf.org/doc/draft-ietf-teas-ietf-network-slices/> gives 
> an example of SLO in Section 4.1 using target/bound values, i.e., 
> thresholds, as follows:
>    *  A Service Level Objective (SLO) is a target value or range for the
>       measurements returned by observation of an SLI.  For example, an
>       SLO may be expressed as "SLI <= target", or "lower bound <= SLI <=
>       upper bound".  A customer can determine whether the provider is
>       meeting the SLOs by performing measurements on the traffic.
>
>     Trying to express, from the measurement aspects, whether the
>     observations are SEVERELY impacting (that's the way I read SVI) is
>     not the right approach IMO.
>     This is maybe your open issue in section 6:
>         * Policies regarding the definition of "violated" and
>     "severely violated" time interval.
>
> GIM>> Yes, it is our intention to work further on improving these 
> definitions.
>
>
>     Bottom line:
>     Granted, IPPM is about performance metrics but specifying metrics
>     without specifying how they will be used in an operational
>     environment is not the right way IMO.
>     I believe the scope of this document is NOT clear enough to be
>     adopted. In other words, I don't know what I'm signing for...
>
<ALEX> I hope this does provide clarification.  I do think that what the 
document is trying to accomplish is sufficiently clear, the fact that 
some additional work remains to be done after adoption notwithstanding.  
But we are not asking for Last Call, just for adoption at this point.

</ALEX>


>     Regards, Benoit
>
>     On 9/1/2022 7:25 PM, Tommy Pauly wrote:
>>     Hello IPPM,
>>
>>     As discussed at IETF 114, we’re starting an adoption call for
>>     Precision Availability Metrics for SLO-Governed End-to-End
>>     Services, draft-mhmcsfh-ippm-pam.
>>
>>     https://datatracker.ietf.org/doc/draft-mhmcsfh-ippm-pam/
>>
>>     The current version is here:
>>
>>     https://www.ietf.org/archive/id/draft-mhmcsfh-ippm-pam-02.html
>>
>>     Please reply to this email by *Thursday, September 15*, to
>>     indicate whether you support adoption of this draft.
>>
>>     Best,
>>     Tommy & Marcus
>>
>>
>>     _______________________________________________
>>     ippm mailing list
>>     ippm@ietf.org
>>     https://www.ietf.org/mailman/listinfo/ippm