Re: [ippm] John Scudder's Discuss on draft-ietf-ippm-rfc8889bis-02: (with DISCUSS and COMMENT)

Giuseppe Fioccola <giuseppe.fioccola@huawei.com> Tue, 12 July 2022 17:41 UTC

From: Giuseppe Fioccola <giuseppe.fioccola@huawei.com>
To: John Scudder <jgs@juniper.net>, The IESG <iesg@ietf.org>
CC: "draft-ietf-ippm-rfc8889bis@ietf.org" <draft-ietf-ippm-rfc8889bis@ietf.org>, "ippm-chairs@ietf.org" <ippm-chairs@ietf.org>, "ippm@ietf.org" <ippm@ietf.org>, "tpauly@apple.com" <tpauly@apple.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ippm/PakhLJQUWxZRo3L0wyCb5C9m08E>

Hi John,
Thanks again for your review.
Please find my answers inline tagged as [GF].
I plan to address your comments in the next revision.

Best Regards,

Giuseppe

-----Original Message-----
From: John Scudder via Datatracker <noreply@ietf.org> 
Sent: Tuesday, July 12, 2022 5:36 PM
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-ippm-rfc8889bis@ietf.org; ippm-chairs@ietf.org; ippm@ietf.org; tpauly@apple.com; tpauly@apple.com
Subject: John Scudder's Discuss on draft-ietf-ippm-rfc8889bis-02: (with DISCUSS and COMMENT)

John Scudder has entered the following ballot position for
draft-ietf-ippm-rfc8889bis-02: Discuss

When responding, please keep the subject line intact and reply to all email addresses included in the To and CC lines. (Feel free to cut this introductory paragraph, however.)


Please refer to https://www.ietf.org/about/groups/iesg/statements/handling-ballot-positions/
for more information about how to handle DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-ippm-rfc8889bis/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

Thanks for this document. As you may have noticed I had considerable difficulty
with the definition of "cluster". Once I completed an end-to-end read-through,
this was resolved (good!) but because RFCs are often consumed piecemeal (e.g.
someone may just dip into a portion of a document rather than settling down
with a nice cup of tea to read it end-to-end), I think it's important to fix
this problem, on the assumption I'm not the only person who might be thrown off.

[GF]: No problem. I agree with you. It is better to improve the readability.

I'll leave the details in the COMMENT, but I will repeat one observation from
my comment #3, which is that I count at least four separate (re-)definitions of
"cluster" in the document. With so many, it's no wonder that they're
inconsistent, and quite possibly the simplest solution would involve cutting
the number of definitions down to as close to 1 as possible.

[GF]: Your suggestion to keep only one definition of cluster makes sense to me. I can probably keep only the definition in the Terminology section and either align the others with it or refer to Section 2 where necessary.

----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

I support Roman's DISCUSS.

1. In §1, this

                                  While this document and its Clustered
   Alternate-Marking method is valid for multipoint-to-multipoint
   unicast flows, anycast, and ECMP flows.

This is a sentence fragment (a naked subordinate clause). I'd suggest a rewrite
but I can't make out what you are trying to say.

[GF]: This is about the scope. Maybe I can replace "is valid for" with "apply to".

2. You have multiple definitions of "cluster" in the document, it seems. In §2
you have,

      Cluster: Smallest identifiable subnetwork of the entire monitoring
      network graph that still satisfies the condition that the number
      of packets that go in is the same as the number that go out.  A
      cluster partition algorithm can be applied to split the monitoring
      network into clusters.

As we have discussed in the past, according to this definition, a cluster is
always a single router. I think in an earlier discussion you suggested that it
could be corrected by saying something like "smallest subnetwork larger than a
singleton router", which I think would work, "smallest non-trivial" would too
(although it's less explicit).

[GF]: I already revised the sentence in my local version and in draft-ietf-6man-ipv6-alt-mark as well.

But then in §5 you have,

   In addition, it is also possible to leverage the data provided by the
   other counters in the network to converge on the smallest
   identifiable subnetworks where the losses occur.  These subnetworks
   are named "clusters".

As written, this makes nonzero losses a definitional attribute of a cluster.
Now that I've taken in the entire document, I guess this is the incorrect
definition. Possibly a fix would be to just drop the final sentence, which
(incorrectly?) implies that losses are an essential element of the definition.

[GF]: I see your point. I will delete that sentence and refer to the definition in the Terminology section.

For what it's worth, the algorithm of Section 5.1 is perfectly clear. I see the
desire to have a prose description of what a cluster is trying to do, but I
wonder if in the end it would be best to make the algorithm the canonical
definition, referenced from the prose definition as in

      Cluster: Smallest identifiable non-trivial
      subnetwork of the entire monitoring
      network graph that still satisfies the condition that the number
      of packets that go in is the same as the number that go out.  A
      cluster partition algorithm, such as that found in Section 5.1,
      can be applied to split the monitoring
      network into clusters.

[GF]: Good suggestion. In this case I could probably move the prose description and the related example to an Appendix.
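
To make the idea of a cluster partition algorithm concrete, here is a minimal sketch in Python. It is not quoted from the draft. It assumes a two-step approach (group the links that share a starting node, then join groups that have at least one ending node in common), and the function name, the link representation as (start, end) pairs, and the node names are all hypothetical.

```python
from collections import defaultdict


def partition_clusters(links):
    """Partition directed links (start, end) into clusters.

    Assumed two-step approach (illustrative only):
      1. group links that share the same starting node;
      2. join groups that have at least one ending node in common.
    Returns a list of (start_nodes, end_nodes) pairs, one per cluster.
    """
    # Step 1: one group of ending nodes per starting node.
    groups = defaultdict(set)
    for start, end in links:
        groups[start].add(end)

    # Step 2: merge each group into any existing cluster whose ending
    # nodes overlap with it; clusters stay pairwise disjoint in end nodes.
    clusters = []
    for start, ends in groups.items():
        merged_starts, merged_ends = {start}, set(ends)
        remaining = []
        for c_starts, c_ends in clusters:
            if c_ends & merged_ends:
                merged_starts |= c_starts
                merged_ends |= c_ends
            else:
                remaining.append((c_starts, c_ends))
        clusters = remaining + [(merged_starts, merged_ends)]
    return clusters


# Hypothetical topology: R1 fans out to R2 and R3, which both feed R4,
# which feeds R5.
example_links = [("R1", "R2"), ("R1", "R3"), ("R2", "R4"),
                 ("R3", "R4"), ("R4", "R5")]
example_clusters = partition_clusters(example_links)
```

With this hypothetical topology the sketch yields three clusters, and the links from R2 and R3 end up in the same cluster because they share the ending node R4. Note that the partition depends only on topology, not on observed traffic.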

3. In §5, you have

   A cluster graph is a subnetwork of the entire monitoring network
   graph that still satisfies the packet loss equation (introduced in
   the previous section), where PL in this case is the number of packets
   lost in the cluster.

I'd previously pointed out that this is problematic since the PL equation you
mention is in the nature of a definition; since you don't supply a condition
there's no way of saying whether it's "satisfied" or not. In response you said
(https://mailarchive.ietf.org/arch/msg/ippm/iXSBuXrQaETl6MyeDH8DBaPiJ9Q/),

```
We can modify the wording in Section 5 accordingly:

"A cluster graph is a subnetwork of the entire monitoring network graph that
still satisfies the condition that the number of packets that go in is the same
as the number that go out, if no packet loss occurs."

```

If indeed that is the right definition (see my earlier point, so probably
inserting "non-trivial" or "with more than a single router") then I do think
you should make that change, or in any case you should correct the current text
somehow.

[GF]: Already done in my local version.
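
The condition in the reworded sentence above can be illustrated with a small Python sketch. It assumes that each input and output measurement point of a cluster reports a packet counter for the same marking period; the function names and counter values are hypothetical and only serve to show the arithmetic.

```python
def cluster_packet_loss(input_counters, output_counters):
    """Packets lost inside a cluster during one marking period.

    Both arguments are per-measurement-point packet counts for the
    same period (assumed to be already aligned by the collector).
    """
    return sum(input_counters) - sum(output_counters)


def satisfies_cluster_condition(input_counters, output_counters):
    """True when packets in equal packets out, i.e. no loss occurred."""
    return cluster_packet_loss(input_counters, output_counters) == 0


# Hypothetical cluster with two input and two output measurement points.
no_loss = cluster_packet_loss([100, 50], [90, 60])    # 150 in, 150 out
some_loss = cluster_packet_loss([100, 50], [90, 55])  # 150 in, 145 out
```

In the first case the counters balance and the loss is 0; in the second the cluster has lost 5 packets in that period.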

By the way, do you mean something subtly different by "cluster graph" than you
do by "cluster"? Or do you just mean "the graph that represents the cluster"? I
think my confusion over just what a "cluster" is would have been mitigated by
fewer re-definitions of it. If "cluster graph" is indeed just another way of
saying "cluster" it's the third of four definitions! (The two I mention in my
point #2, this one, and the algorithm in Section 5.1.) With four definitions of
the same thing, in different words, no wonder some inconsistency crept in!

[GF]: Yes, by "cluster graph" I only meant the graph that represents the cluster. It is not a new definition. For this reason, I will probably remove some of these definitions and simply refer to the Terminology section.

4. In §5.1,

   In summary, once a flow is defined, the algorithm to build the
   clusters partition is based on topological information; therefore, it
   considers all the possible links and nodes crossed by the given flow,
   even if there is no traffic.

If a flow has no traffic, can we call it a flow? It seems counter to the normal
English meaning of the word. Perhaps you mean possible links and nodes that
could potentially be crossed by the given flow?

[GF]: Yes, I will fix this.

5. Thanks for reporting the outcome of the earlier experiment, in §9. As with
my review of rfc8321bis, I think the RFC 2119 keywords need to be in a
"Deployment Considerations" section or similar, which could be achieved by
retitling this section or by separating the material into reporting the
experiment outcome (this section) and a new subsection for deployment advice.

[GF]: Agreed. As with RFC8321bis, I can retitle the section.

6. Again similarly to my review of the companion document, I think §9's "the
Multipoint Alternate Marking Method is RECOMMENDED only for controlled domains"
needs to be fixed. Please bring it into line with whatever language you choose
for rfc8321bis.

[GF]: Sure. Similarly to RFC8321bis, I will replace it with: "The Multipoint Alternate Marking Method MUST only be applied to controlled domains."

Nits:

7. While the reference tags are in principle arbitrary strings, I wondered if
it was a typo that you used "PNPM" to tag a paper titled "AM-PM"?

[GF]: It was not a typo since PNPM stands for Packet Network Performance Measurement. But AM-PM can also be used.

8. In §7.1.1,

                 This means that, in the calculation, it is possible to
   weigh the timestamps by considering the number of packets for each
   endpoints.

s/endpoints/endpoint/

[GF]: Ok
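
As an illustration of what weighing the timestamps by packet count means, here is a minimal Python sketch. It assumes each endpoint reports a mean timestamp and a packet count for the same marking period; the function name and the sample values are hypothetical, not taken from the draft.

```python
def weighted_mean_timestamp(samples):
    """Mean timestamp across endpoints, weighted by packet count.

    samples: list of (mean_timestamp, packet_count) pairs, one per
    endpoint, all referring to the same marking period.
    """
    total_packets = sum(count for _, count in samples)
    if total_packets == 0:
        raise ValueError("no packets observed in this period")
    weighted_sum = sum(ts * count for ts, count in samples)
    return weighted_sum / total_packets


# Hypothetical endpoints: one saw 100 packets (mean timestamp 10.0),
# the other saw 300 packets (mean timestamp 20.0).
combined = weighted_mean_timestamp([(10.0, 100), (20.0, 300)])
```

Here the combined value is 17.5 rather than the unweighted average of 15.0, because the endpoint that saw more packets contributes proportionally more.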