RE: John Scudder's Discuss on draft-ietf-6man-ipv6-alt-mark-14: (with DISCUSS and COMMENT)

Giuseppe Fioccola <giuseppe.fioccola@huawei.com> Mon, 27 June 2022 18:01 UTC

From: Giuseppe Fioccola <giuseppe.fioccola@huawei.com>
To: John Scudder <jgs@juniper.net>, The IESG <iesg@ietf.org>
CC: "draft-ietf-6man-ipv6-alt-mark@ietf.org" <draft-ietf-6man-ipv6-alt-mark@ietf.org>, "6man-chairs@ietf.org" <6man-chairs@ietf.org>, "ipv6@ietf.org" <ipv6@ietf.org>, "bob.hinden@gmail.com" <bob.hinden@gmail.com>, "otroan@employees.org" <otroan@employees.org>
Subject: RE: John Scudder's Discuss on draft-ietf-6man-ipv6-alt-mark-14: (with DISCUSS and COMMENT)
Date: Mon, 27 Jun 2022 18:01:44 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/ipv6/gbduWbRFxkr06z1rxU6UgtXav9E>

Dear John,
Thanks a lot for your detailed review.
I will update the document to address your comments.
Please see inline my replies tagged as [GF].

Regards,

Giuseppe

-----Original Message-----
From: John Scudder via Datatracker <noreply@ietf.org> 
Sent: Saturday, June 25, 2022 8:18 PM
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-6man-ipv6-alt-mark@ietf.org; 6man-chairs@ietf.org; ipv6@ietf.org; bob.hinden@gmail.com; otroan@employees.org
Subject: John Scudder's Discuss on draft-ietf-6man-ipv6-alt-mark-14: (with DISCUSS and COMMENT)

John Scudder has entered the following ballot position for
draft-ietf-6man-ipv6-alt-mark-14: Discuss

When responding, please keep the subject line intact and reply to all email addresses included in the To and CC lines. (Feel free to cut this introductory paragraph, however.)


Please refer to https://www.ietf.org/about/groups/iesg/statements/handling-ballot-positions/
for more information about how to handle DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-6man-ipv6-alt-mark/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

There are a few points I'd like to DISCUSS before moving this forward. The
first should be easy to address:

1. The number of authors (6) exceeds the maximum recommended by the RFC Editor
(5), see RFC 7322 §4.1.1. Has this exception already been cleared with the AD?
I see no mention of it in the shepherd writeup.

(Speaking of the shepherd writeup, and this is a comment for the shepherd and
not the authors and is only an aside, not something that needs to be addressed
in the DISCUSS, the writeup answer to question 1 is incomplete.)

The second is substantive rather than administrative, and may require more
discussion:

2. From what I understand of the rules about IPv6 extension header insertion
(viz, only end nodes can do it), plus the assumptions stated in §2.1.1 about
the extent of the "controlled domain", it would seem a natural consequence 
that ipv6-alt-mark is only applicable to networks where user traffic enters and
exits a tunnel at the perimeter of the "controlled domain". I don't see this
stated plainly anywhere in the document. If I'm correct this seems like an
important characteristic to spell out. If I'm incorrect, I'd appreciate a
discussion of where I went wrong.

[GF]: Yes, that is correct. This is a requirement based on the security concerns raised. I think we can better highlight this point in the Abstract and in the Introduction so that it becomes clearer.

I touch on various aspects of this overall point in some of my comments as well.

Finally, although I haven't placed any of the further comments in the DISCUSS
section, I note that some of them are fairly fundamental so I would appreciate
a thorough reply to the comments as well.


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

3. In §2.1.1 we have,

      the CPE (Customer Premises Equipment) is most likely to be the
      starting or ending node since it connects the user's premises with
      the service provider's network and therefore belongs to the
      operator's controlled domain.  Typically the CPE encapsulates a
      received packet in an outer IPv6 header which contains the
      Alternate Marking data.  The CPE can also be able to filter and
      drop packets from outside of the domain with inconsistent fields
      to make effective the relevant security rules at the domain
      boundaries, for example a simple security check can be to insert
      the Alternate Marking data if and only if the destination is
      within the controlled domain.

A remark -- I was surprised to see that the CPE is considered by the authors to
be within the operator's security perimeter. Even if the operator manages the
CPE (not a given), the CPE isn't physically secure, so I would assume it can't
be considered to be as trustworthy as equipment kept on the operator's premises.

[GF]: We considered the CPE since it normally imposes the tunnel on the outgoing traffic (e.g. SRv6). We can further specify in the document that the CPE can be within the controlled domain only if it is managed by the operator and if additional security measures are taken to keep it secure. Otherwise it cannot be considered fully trustworthy.

And, a question -- since in this paragraph you suggest that the controlled
domain perimeter terminates prior to the user equipment, I read your final
clause ("...destination is within the controlled domain") to mean that Alt-Mark
wouldn't be usable for any traffic destined for user equipment. Thus the
applicability would be limited to network management traffic, or overlay
traffic that will be decapsulated somewhere within the operator network
(perhaps at a CPE collocated with the user equipment which is the true
destination). Is that accurate? If not, what am I missing?

[GF]: Yes, that is correct. This is for security reasons. A future security extension might make it possible to extend it to the user equipment.

I guess it's overlay traffic, since as you also point out, only the source node
is allowed to insert the Option Header -- and for the CPE to be the source
node, it has to be doing an encapsulation operation. Right?

[GF]: Yes, correct.

(If the user equipment could be the destination, then the "simple security
check" might not be so simple, depending on whether the user premise addressing
comes from provider space or the user's own space.)

[GF]: Indeed, but considering the scope of the document, if you apply it to the overlay traffic, the user equipment cannot be the destination.
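
To illustrate the "simple security check" discussed in this item (insert the Alternate Marking data if and only if the destination is within the controlled domain), here is a minimal sketch. It is not taken from the draft: the prefix list, the function names, and the idea of representing the controlled domain as a set of IPv6 prefixes are all assumptions made for illustration.

```python
import ipaddress

# Hypothetical prefixes that the operator considers part of its
# controlled domain (documentation address space, for illustration only).
CONTROLLED_DOMAIN = [ipaddress.ip_network("2001:db8:a::/48")]


def in_controlled_domain(addr: str) -> bool:
    """Return True if addr falls inside one of the controlled-domain prefixes."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in CONTROLLED_DOMAIN)


def should_insert_altmark(outer_dst: str) -> bool:
    """Encapsulating node (e.g. a CPE): add the AltMark Option to the outer
    IPv6 header only when the tunnel endpoint is inside the controlled domain."""
    return in_controlled_domain(outer_dst)
```

In this sketch the check is applied to the outer (tunnel) destination, which matches the overlay case confirmed above; as noted in the review, a check against user equipment addresses would be harder to express as a fixed prefix list.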

4. You mention in §4 and elsewhere that the Destination Option version of the
Alt-Mark can be processed by each node in the route list if there's a Routing
Header present:

   o  Destination Option preceding a Routing Header => every destination
      node in the route list.

Suppose some flavor of segment list compression is in use, such as the
draft-ietf-spring-srv6-srh-compression NEXT-C-SID "flavor". In some cases (when
the SID list is short enough and the compression tight enough) can't you have a
packet that contains a SID list that's expressed entirely in what conventional
IPv6 would call the "destination address", and no Routing Header is present?

What would the expected processing of the Alt-Mark Destination Option be in
that case?

[GF]: In this case, we need to determine whether the Destination Options Header (DOH) is processed as it would be when a Routing Header is present. If not, the Hop-by-Hop Options Header (HBH) can be used. I think this would be a good point to address in draft-ietf-6man-sids.

5. In §2 you have

                                                 Furthermore, since the
   Flow Label is pseudo-random, there is always a finite probability of
   collision.  Those reasons make the definition of the FlowMonID
   necessary for IPv6.

This objection to the pseudo-randomness of the flow label seemed OK, until I
came to the definition of FlowMonID, which you define as... pseudo-random, and
you later talk about its possibility of collision. Maybe you should remove the
"furthermore" sentence, since apparently you don't have any objection to
collisions after all?

[GF]: Yes, right. I will remove it.

(I return to the question of the FlowMonID later on in this review.)
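
For a rough sense of the collision probability mentioned in this item, the standard birthday approximation can be sketched as below. The 20-bit identifier width is an assumption that is not stated in this thread, and the function name is hypothetical.

```python
import math

# Assumption: the FlowMonID field is 20 bits wide.
FLOWMONID_BITS = 20


def collision_probability(flows: int, bits: int = FLOWMONID_BITS) -> float:
    """Approximate probability that at least two of `flows` independently and
    uniformly chosen identifiers collide (birthday approximation)."""
    space = 2 ** bits
    if flows > space:
        return 1.0  # pigeonhole: more flows than identifiers
    return 1.0 - math.exp(-flows * (flows - 1) / (2 * space))
```

The same formula applies to any pseudo-random identifier of the same width, which is why the objection to the Flow Label on collision grounds alone does not distinguish it from the FlowMonID.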

6. In §4 you have

   The Hop-by-Hop Option defined in this document is designed to take
   advantage of the property of how Hop-by-Hop options are processed.
   Nodes that do not support this Option SHOULD ignore them.

This doesn't seem to be written quite right. If you're just stating an
expectation, based on the requirements from RFC 8200, then you're not
introducing a new requirement, and you shouldn't have the RFC 2119 keyword
there. It's also not obvious what the referent of "them" is. Do you perhaps
mean something like this?

   The Hop-by-Hop Option defined in this document is designed to take
   advantage of the property of how Hop-by-Hop options are processed.
   Nodes that do not support this Option would be expected to ignore
   it if encountered, according to the procedures of [RFC8200].

In general the paragraph that includes this sentence is repetitive, so you
could also just remove the sentence and do a little re-wording -- but at least
a rewrite to remove the SHOULD is necessary.

[GF]: Yes, good suggestion! I will revise and replace the sentence.

7. In §5.1 you say,

                                                                where B
   is the fixed time duration of the batch, which refers to the original
   marking interval at the source node considering that this interval
   could fluctuate along the path.

How can the interval fluctuate, considering that you've taken pains to point
out that only the source node is able to mark the packets?

Perhaps you mean that as the packets that make up the batch traverse the
network, their spacing would be expected to jitter based on network conditions,
and therefore at any point in the network other than the source, the observed
time duration might be different. If you think that is really important to
state, I think you need to write it out in more detail, although to me it
doesn't seem to be of the essence, so I would suggest just removing
"considering that this interval could fluctuate along the path" -- you've
already made it clear by saying it's the interval *at the source node*.

[GF]: Yes, I totally agree with you. Since draft-ietf-ippm-rfc8321bis already covers all these details, we do not need to include them here, so I would simply remove that sentence.

8. I'm curious why you require that the FlowMonID is set pseudo-randomly. (By
the way I assume a true random value would also be acceptable!) Assuming the
implementation follows this:

                                                          In particular,
   it is RECOMMENDED to consider the 3-tuple FlowMonID, source and
   destination addresses:

Then why wouldn't it be superior for the source node to produce the FlowMonID
by using a counter that it increments by one for each subsequent flow? The
source node could keep such a counter globally, or per destination, depending
on expectations of number of flows. I suppose you might be trying to keep the
process stateless, except that you've already given up on statelessness by
allowing for a central controller to assign FlowMonIDs.

[GF]: Yes, this is a possibility that can be mentioned as well, but, as you said, it requires a stateful solution. 
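
The two assignment strategies compared in this item can be sketched side by side. This is only an illustration: the 20-bit width, the class and function names, and the choice of a per-(source, destination) counter are assumptions, not text from the draft.

```python
import secrets
from collections import defaultdict

# Assumption: the FlowMonID field is 20 bits wide.
FLOWMONID_BITS = 20
FLOWMONID_SPACE = 1 << FLOWMONID_BITS


def random_flowmonid() -> int:
    """Stateless option: draw the identifier at random; collisions are possible."""
    return secrets.randbelow(FLOWMONID_SPACE)


class CounterAllocator:
    """Stateful option: one counter per (source, destination) pair, so the
    3-tuple (FlowMonID, source, destination) is unique until the counter wraps."""

    def __init__(self) -> None:
        self._next = defaultdict(int)

    def allocate(self, src: str, dst: str) -> int:
        key = (src, dst)
        value = self._next[key]
        # Wrap-around would reuse identifiers that may still be in use.
        self._next[key] = (value + 1) % FLOWMONID_SPACE
        return value
```

The counter avoids collisions at the cost of per-pair state on the source node, which is the trade-off acknowledged in the reply above.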

Also it's surprising to me that it's considered reasonable to involve a central
controller in choosing a FlowMonID per flow. It seems intuitively obvious that
this would be unscalable and/or introduce unacceptable latency unless there are
notably few flows in the network, and/or the flows are very long-lived. (I
guess this could be the case if a "flow" is all the packets exchanged by a
given ingress and egress pair of CE routers, but if that's an anticipated case
it's not explained in the document.)

[GF]: The case of the controller is considered in related drafts (e.g. draft-chen-pce-pcep-ifit, draft-ietf-idr-sr-policy-ifit) where Alternate Marking can be enabled automatically when the path is instantiated or, for example, when the SR policy is applied. I can clarify in the next version that the controller is assumed to enable Alternate Marking and set its parameters (like FlowMonID) together with the flow configuration and signaling.

9. I don't really understand §5.4 and would rather not take the time to read
draft-ietf-ippm-rfc8889bis in detail to get the necessary background. I *think*
when this section talks about clusters, these are a measurement strategy and
there's no marking that takes place at a cluster border -- correct?

[GF]: Yes, it is a measurement strategy. Indeed, RFC8889bis is the extension of RFC8321bis. RFC8321 only applies to point-to-point unicast flows, while the Clustered approach is valid for multipoint-to-multipoint unicast flows, anycast and ECMP flows. A cluster may or may not have a marking node at the border depending on its position in the network graph.

Also, in this paragraph,

   The Cluster is the smallest identifiable subnetwork of the entire
   Network graph that still satisfies the condition that the number of
   packets that goes in is the same that goes out.  With network
   clustering, it is possible to use the partition of the network into
   clusters at different levels in order to perform the needed degree of
   detail.  So, for Multipoint Alternate Marking, FlowMonID can identify
   in general a multipoint-to-multipoint flow and not only a point-to-
   point flow.

The final sentence starts with "so," which implies that the final sentence
causally follows from the previous sentences. While the statement may well be
true as far as I know, the previous sentences aren't enough to motivate it. I
think dropping the "so" would be enough to fix this, and then the curious
reader could go review draft-ietf-ippm-rfc8889bis carefully to understand the
reasons.

[GF]: I agree. I can remove the "so".

A second question related to this paragraph, if the Cluster is the "*smallest*
identifiable subnetwork", etc, doesn't that mean a Cluster is always an
individual router? Might you actually mean the *largest* identifiable
subnetwork that satisfies the property?

[GF]: Good point, but it may be better to define it as the "smallest
identifiable subnetwork composed of more than one node". I will also revise RFC8889bis accordingly.

10. In §6 you say,

                                   Alternate Marking implies
   modifications on the fly to an Option Header of IPv6 packets by the
   source node

I am surprised by this. My impression (elaborated in some of my earlier
questions) was that Option Headers would only be inserted by the packet's
source, per the requirements of RFC 8200. You also take pains to say that (§2),

                The intermediate nodes and destination node MUST only
      read the marking values of the option without modifying the Option
      Header.

What, then, are these "modifications on the fly"?

[GF]: You are right. I will revise the wording. No modification on the fly is possible by the source node.

11. Later in §6,

                                                                 As
   already discussed in Section 4, it is RECOMMENDED that the AltMark
   Option does not affect the throughput and therefore the user
   experience.

Placing requirements on router performance characteristics seems like
overreach, to say the least -- and indeed, Section 4 doesn't do this, it takes
the more sensible approach of explaining why the chosen design is expected to
be implementable without adversely affecting performance. Perhaps reword
something like,

   As is discussed in Section 4, the design of the AltMark Option has
   been chosen with throughput in mind, such that it can be implemented
   without affecting the user experience.

[GF]: It makes sense. I will reword this part.

12. Again in §6,

           In this case, nodes outside the domain MUST simply ignore
   packets with AltMark Option since they are not configured to handle
   it and should not process it.

I don't think you can make that statement with confidence -- what if domain A
adjoins a different domain B which also uses AltMark within its own boundaries?
In such a case, domain B would indeed be configured to handle it.

In such a case you would need a double-fault in containment to have a problem,
though -- domain A would have to allow outbound leakage, and domain B would
have to allow inbound leakage. But the statement as you've written it seems
wrong.

In addition to the above, I don't see how you can impose requirements on nodes
outside the domain (your use of the keyword "MUST"). After all, the nodes
outside the domain might not implement ipv6-alt-mark at all. I think the
keyword "MUST" is out of place here.

[GF]: Agreed, but this is a general problem with HBH and DOH. I can revise the sentence and remove the keyword "MUST". In this case it is just expected behavior.

NITS:

13. In §4,

                                                           A new type of
   Routing Header, referred as Segment Routing Header (SRH), has been
   defined in [RFC8754] for SRv6.

It isn't really so new, RFC 8754 is more than two years old now. In general I
suggest reviewing the use of "new" when publishing any RFC, and considering
whether it will stand the test of time -- when a reader considers your document
in ten years, will "new" still make sense?

[GF]: Yes, I will remove the "new".

14. In §5.1,

                                    In this way each marked packet can
   be assigned to the right batch by each node.  Usually the counters
   can be taken in the middle of the batch period to be sure to take
   still counters.

When you say "still counters" do you mean "still" in the sense of "unmoving,
quiescent"? I suggest "quiescent" would be less ambiguous in that case, and
maybe "read" instead of "take" -- so "read quiescent counters".

[GF]: Good suggestion. I will replace the wording as suggested.
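
The batch assignment and the "read quiescent counters" idea in this item can be sketched as follows. This is one possible interpretation, under the assumptions that nodes share a synchronized clock, that time is measured from a common origin, and that the marking bit alternates every batch of duration B; the function names are hypothetical.

```python
def batch_index(t: float, B: float) -> int:
    """Index of the batch that a packet marked at time t belongs to,
    for a fixed batch duration B (both in the same time unit)."""
    return int(t // B)


def marking_bit(t: float, B: float) -> int:
    """Marking value in use at time t: it alternates from one batch to the next."""
    return batch_index(t, B) % 2


def counter_read_time(index: int, B: float) -> float:
    """Time at which the counter for batch `index` is read: the middle of the
    following batch period, when that counter should no longer be changing."""
    return (index + 1) * B + B / 2
```

Reading in the middle of the next period leaves half a period of margin on each side for packets that are delayed or reordered in the network.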

15. In §6,

   so the only effect is the increased MTU

It isn't the MTU that's increased, it's the packet size.

[GF]: Ok

16. In §6,

   The flow identifier (FlowMonID) composes the AltMark Option together
   with the two marking bits (L and D).

No it doesn't, taken in this context the most obvious reading of "composes" is
as a synonym for "concatenates", and it doesn't do that. Probably what you mean
is this?

   The flow identifier (FlowMonID), together with the two marking bits
   (L and D), comprises the AltMark Option.

[GF]: Yes, I will replace the sentence.
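
As a sketch of how the FlowMonID and the two marking bits could together make up the option data, the following packs them into a single 32-bit word. The layout used here (20-bit FlowMonID, then L, then D, then 10 reserved bits set to zero) is an assumption that is not spelled out in this thread, and the function names are hypothetical.

```python
import struct

# Assumed layout of the 32-bit option data, most significant bit first:
#   FlowMonID (20 bits) | L (1 bit) | D (1 bit) | Reserved (10 bits, zero)
FLOWMONID_BITS = 20


def pack_altmark(flowmonid: int, l_bit: int, d_bit: int) -> bytes:
    """Encode the fields into 4 bytes in network byte order."""
    if not 0 <= flowmonid < (1 << FLOWMONID_BITS):
        raise ValueError("FlowMonID does not fit in 20 bits")
    word = (flowmonid << 12) | ((l_bit & 1) << 11) | ((d_bit & 1) << 10)
    return struct.pack("!I", word)


def unpack_altmark(data: bytes) -> tuple[int, int, int]:
    """Decode 4 bytes back into (FlowMonID, L, D); reserved bits are ignored."""
    (word,) = struct.unpack("!I", data)
    return word >> 12, (word >> 11) & 1, (word >> 10) & 1
```

Intermediate and destination nodes would only need the decoding direction, since they read the marking values without modifying the option.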

17. And immediately following, still in §6,

                                         As explained in Section 5.3,
   there is a chance of collision if the FlowMonID is set pseudo
   randomly and a solution exists.

That seems wrong. I think you've spliced incorrectly with that final "and"?
Maybe you're saying that there's a chance of collision, but that there is a
mitigation ("a solution") available for this problem?

[GF]: Yes, I will revise it.

18. Again in §6,

   Additionally, it is to be noted that the AltMark Option is carried by
   the Options Header and it may have some impact on the packet sizes

"Will have", not "may have".

[GF]: Ok