Re: [alto] Chair review of path-vector-13 (Part 1 of 2)

kaigao@scu.edu.cn Fri, 19 February 2021 16:18 UTC

Date: Sat, 20 Feb 2021 00:18:18 +0800
From: kaigao@scu.edu.cn
To: Vijay Gurbani <vijay.gurbani@gmail.com>
Cc: draft-ietf-alto-path-vector@ietf.org, IETF ALTO <alto@ietf.org>
In-Reply-To: <CAMMTW_+j7L6K3t8rU2ooDxkiGMBeZE9byaiekj7htAeFyLPxPQ@mail.gmail.com>
References: <CAMMTW_+j7L6K3t8rU2ooDxkiGMBeZE9byaiekj7htAeFyLPxPQ@mail.gmail.com>
Message-ID: <16417868.1747f.177bb15b837.Coremail.kaigao@scu.edu.cn>
Archived-At: <https://mailarchive.ietf.org/arch/msg/alto/0V-LL8dk5LRvO2zQ0F6hUYwYi-g>
Subject: Re: [alto] Chair review of path-vector-13 (Part 1 of 2)

Hi Vijay and the ALTO WG,




Thanks very much for the review!




We have made a first pass over the document to address the comments. Please see our responses inline. The latest document is also attached to this email.




Best,

Kai



-----Original Message-----
From: "Vijay Gurbani" <vijay.gurbani@gmail.com>
Sent Time: 2021-02-08 23:35:13 (Monday)
To: draft-ietf-alto-path-vector@ietf.org
Cc: "IETF ALTO" <alto@ietf.org>
Subject: Chair review of path-vector-13 (Part 1 of 2)


Chair review from beginning of document to the end of S6.6.
Part 1 of 2.

Major:


- S4.1, below Figure 2:  Note that we do not have "availbw" defined in ALTO as a current cost metric, so it is not a good idea to use it here without qualifying it further.  If used as is, it creates confusion.  My advice would be to either qualify the use of "availbw" as a hypothetical cost metric, or choose an actual cost metric from the performance-metric draft and restate the example.




[PV] Thanks for the comments. We make it clear in the new text that the metric is hypothetical.

  OLD:
   The single-node ALTO topology abstraction of the network is shown in
   Figure 2.

  NEW:
   The single-node ALTO topology abstraction of the network is shown in
   Figure 2.  Assume the cost map returns a hypothetical cost type
   representing the available bandwidth between a source and a
   destination.




- S4.1, "Case 1": I don't see how the "application will obtain 150 Mbps at most."  Consider that the bottleneck bandwidth is 100 Mbps, as that is the bandwidth of the most constrained link.  Once traffic leaves sw5, it can get no more than 100 Mbps on the remaining links.  So, I don't understand how the "application will obtain 150 Mbps at most."  Perhaps I am missing something?




[PV] We agree that the computation process should be better explained. More details are now provided to explain 1) what the objective of the application is and 2) how it computes the value.
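
As a rough illustration of the distinction raised in this comment, the following minimal sketch shows one way a 100 Mbps per-path bottleneck and a 150 Mbps aggregate bound can coexist when two flows share a link. The link names (l1, l2, l3), the capacities, and both function names are hypothetical; they are not taken from the draft's figures and this is not necessarily the computation used in the document.

```python
# All link names and capacities (in Mbps) below are hypothetical.

def path_bottleneck(path, capacity):
    """Capacity of the most constrained link on a single path."""
    return min(capacity[link] for link in path)


def two_flow_total_bound(path_a, path_b, capacity):
    """Upper bound on the combined rate of two flows.

    Each flow is limited by its own bottleneck, and the sum is further
    limited by the tightest link that both paths traverse, if any.
    """
    total = path_bottleneck(path_a, capacity) + path_bottleneck(path_b, capacity)
    shared = set(path_a) & set(path_b)
    if shared:
        total = min(total, min(capacity[link] for link in shared))
    return total


capacity = {"l1": 100, "l2": 100, "l3": 150}  # assumed values
flow_a = ["l1", "l3"]  # assumed path of the first flow
flow_b = ["l2", "l3"]  # assumed path of the second flow

print(path_bottleneck(flow_a, capacity))               # single flow: 100
print(two_flow_total_bound(flow_a, flow_b, capacity))  # both flows: 150
```

Under these assumed values, either flow alone is capped at 100 Mbps, while the two flows together are capped at 150 Mbps by the shared link.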




- S4.2.3: This paragraph, especially the second sentence onwards needs to be re-written to better flesh out the need.  Currently it says, "While both approaches...", however, it is not clear that there are two approaches being delineated from each other here.  It needs more edits so it reads better. (Some nits in this paragraph appear in the Nits section trying to tease out the language.)




[PV] We agree that the second approach is not clearly specified. The new text follows the same structure as the previous paragraphs and focuses only on what is achievable with the PV extension.
  OLD:

   ... Otherwise, the ALTO client may
   have to make multiple queries and potentially with the complete list
   of CDNs and/or service edges.  While both approaches offer the same
   information, making multiple queries introduces larger delay and more
   overhead on both the ALTO server and the ALTO client.




  NEW:
   ...  Thus, an ALTO client may leverage the
   information to better conduct CDN request routing or offload
   functionalities from the user equipment to the service edge, with
   considerations on different resource constraints.


- S5.1.3: When Section 5 begins, it says that "This section gives a non-normative overview of the Path Vector extension."  However, in S5.1.3, there is a normative "MUST".  (Same problem in S5.3, there are many "MUST"s there, and in Section 5.3.3 there are "RECOMMENDED" and "SHOULD NOT".)

Generally, I am a bit hesitant that certain subsections of Section 5 --- Section 5.3.2 in particular --- appear to contain normative behaviour, and this should be specified in a normative section, or do NOT start Section 5 by saying that this section gives a non-normative overview, and make this a normative section. I understand this is a major comment, so please think how you want to handle this carefully.





[PV] We agree that the normative behaviors should be moved to the specification sections. In particular, the normative contents of 5.3.2 are moved to 7.1.6/7.2.6 and 7.3, and the normative contents of 5.3.3 are moved to 7.1.6/7.2.6.


- S5.3.2: Not sure I follow the logic in the first paragraph.  As Fig. 4 showed, there is one PV request, and if ALTO SSE extension is being used, presumably, it will contain the "client-id".  If the response contains a Path Vector resource, shouldn't that "client-id" simply apply to it?  I am sure I am missing something here as you have thought about this more than me; perhaps you could add a simple example to make the problem more explicit.




[PV] The idea is to allow SSE to push updates for only one part of a PV response. However, we realized that the content of S5.3.2 is redundant, as RFC 8895 (SSE) already specifies how to push updates for multipart resources. We now follow the design of RFC 8895, so there is no longer a backward compatibility issue between PV and SSE.


- S6.4: Why have mini Security Considerations paragraphs in the subsections of S6.4, but not in the subsections of S6.3 and S6.5?  I am not saying that you should remove the mini Security Considerations paragraphs, but if there are security considerations worth pointing out in S6.4, I suspect that there are security considerations worth pointing out in S6.3 and S6.5?  (One such security consideration is listed below in S6.5.1.)





[PV] We have mini security consideration paragraphs in Section 6.4 because the document defines two properties there, and the Unified Property document asks for security considerations when defining a new property. For a cost type definition, such a paragraph is not formally required, so we do not include one.




- S6.4.2: "The persistent entity ID property is the entity identifier of the persistent ANE which an ephemeral ANE presents (See Section 5.1.2 for details)." ==> I am not sure what this means? Why is an ephemeral ANE presenting a persistent entity identifier?  Is it important that you are defining an ephemeral ANE and associating it with persistent entities?  If so, then please make this clear as there is a lot of ambiguity in this section.




[PV] This sentence is based on the contents of Section 5.1.2, which provides more details on ephemeral and persistent ANEs. We have added an example to Section 5.1.2 to illustrate the importance of having persistent ANEs.




NEW:

   ... For example, an ALTO server may define an ANE for each service
   edge cluster.  Once a client chooses to use a service edge, e.g., by
   deploying some user-defined functions, it may want to stick to the
   service edge to avoid the complexity of state transition or
   synchronization.  Persistent entities have a persistent ID that is
   registered in a Property Map, together with their properties.  See
   Section 6.2.4 and Section 6.4.2 for more detailed instructions on how
   to identify ephemeral ANEs and persistent ANEs.


- S6.5.1: What is the effect if the ALTO server chooses to obfuscate the path vector, causing the client to experience sub-optimal routing?  The client does not know that the server has obfuscated the path vector, so it MUST interpret the path vector as given to it by the ALTO server.  This raises the question whether such obfuscation, because it is indistinguishable from a non-obfuscated response, creates an attack on the client?  (Would a mini Security Consideration paragraph be appropriate here?)  Clearly, since ALTO assumes that the server is trusted to some degree, the issue becomes (a) can the client, by repeated querying, figure out that it is being duped on occasion?  (b) what does it then do?





[PV] We have not finalized the revision for this comment yet.




Minor:

- S1, paragraph 3: Why would "job completion time" be shared by bottleneck  network links?  On first glance, job completion time is a function of the  compute resources on the host not network links, but on further reflection,   job completion time could also be a function of the network links on the host if the data needs to be marshalled to the job (process) in order for it to complete.  If so, then perhaps reword as:
 
 OLD:
 For example, job completion time, which is an important QoE metric for a large-scale data analytics application, is impacted by shared bottleneck links inside the carrier network.

 NEW:
 For example, job completion time, which is an important QoE metric for a large-scale data analytics application, is impacted by shared bottleneck links inside the carrier network as link capacity may impact the rate of data input/output to the job.




[PV] Thanks for the comment. We adopt the new text in the revised document.




- S5.1.1: "Thus they must follow the mechanisms specified in the [I-D.ietf-alto-unified-props-new]." ==> Here, it may help to point to a specific section of the I-D you want the implementer to follow the mechanisms of.  Do you mean the naming mechanism defined in the I-D?  The inheritance mechanism defined in the I-D?




- S5.1.2: How does the client know that an ANE in a response is ephemeral versus persistent?  You answer this question in Section 6.4.2, perhaps you can put a forward reference to Section 6.4.2 as I am sure other readers will have the same question.





[PV] Thanks for the comment. We add a forward reference that points readers to Sections 6.2.4 and 6.4.2, both of which give more concrete examples of how to differentiate ephemeral and persistent ANEs.




- S6.2.4: "...their entity domain names MUST be ".ane"..." ==> MUST be .ane or MUST use the .ane prefix?  I can't tell.  Please specify this better through an example as well.  You do have an example in the last paragraph, but the writing of the example is ambiguous.  My understanding is: ".ane:NET1" is an ephemeral ANE, while "dc-props.ane:DC1" is a persistent ANE.  Is that correct?  If so, just explicitly mention this.




[PV] Thanks for the comment. The entity domain name must be ".ane" (i.e., the first case). Your understanding is correct, and we explicitly mention it in the new text.

OLD:

   For example, the defining resource of ".ane:NET1" is the Property Map
   part that contains this identifier, i.e., the ANE entity ".ane:NET1"
   is self-defined.  The defining resource of "dc-props.ane:DC1" is the
   Property Map with the resource ID "dc-props".




NEW:

   For example, the defining resource of an ephemeral ANE whose entity
   identifier is ".ane:NET1" is the Property Map part that contains this
   identifier.  The defining resource of a persistent ANE whose entity
   identifier is "dc-props.ane:DC1" is the Property Map with the
   resource ID "dc-props".
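
To make the distinction concrete, here is a minimal sketch of how a client might classify the two identifiers above. The function name `classify_ane` and its return shape are hypothetical, and the parsing simply splits on the literal ".ane:" marker; it is an illustration based only on the two examples in this text, not a complete implementation of the entity identifier grammar.

```python
def classify_ane(entity_id):
    """Classify an ANE entity identifier as ephemeral or persistent.

    Hypothetical helper: an identifier with an empty resource ID before
    ".ane:" (e.g. ".ane:NET1") is treated as ephemeral, i.e. defined by
    the Property Map part that contains it.  Otherwise the prefix is taken
    as the resource ID of the defining Property Map.
    """
    resource_id, marker, ane_name = entity_id.partition(".ane:")
    if not marker or not ane_name:
        raise ValueError(f"not an ANE entity identifier: {entity_id!r}")
    if resource_id == "":
        return ("ephemeral", None, ane_name)
    return ("persistent", resource_id, ane_name)


print(classify_ane(".ane:NET1"))         # ('ephemeral', None, 'NET1')
print(classify_ane("dc-props.ane:DC1"))  # ('persistent', 'dc-props', 'DC1')
```

This simplification does not handle every edge case, for example a resource ID that itself contains the substring ".ane:".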




Nits:




[PV] The proposed changes are adopted as is unless noted otherwise below.


- S4.1: s/the scheduling.  However,/the scheduling, however,/

- S1, paragraph 3: s/applications, however, the/applications, the/

- S1, paragraph 5: s/in a huge volume/in an increase in volume/

- S1: s/The pressure on the/The requirements on the/

- S1: s/ALTO server convey/ALTO server to convey/

- S1: s/that each identifies/that identifies/
  or
      s/that each identifies/, each element of which identifies/

- S3: s/in a cost map or for a/in a cost map, or for a/

- S4.2.1: s/Gigabytes, Terabytes, and even Petabytes/gigabytes, terabytes, and even petabytes/
(Reason: there is no need to gratuitously capitalize these.)

- S4.2.1: s/related to the completion time of the slowest data transfer./related to the data transfer time over the slowest link./





[PV] The proposed change alters the meaning, so we propose the following instead:
  s/which is related to the completion time of the slowest data transfer/which is related to the completion time of all the data transfers belonging to the job/




- S4.2.1: s/the Path Vector extension/the extension defined in this document/
(This is repeated in S4.2.2 and perhaps elsewhere, please consider it as a request for global change.)

- S4.2.2: s/It is getting important/It is important/

- S4.2.3: s/may have to make/will need/

- S4.2.3: s/and potentially with/and potentially need/


- S5.2: s/, meaning the/, this means that the/


Thanks,



- vijay

Attachment: draft-ietf-alto-path-vector.txt
