Re: [alto] Chair review of path-vector-13 (Part 1 of 2)

kaigao@scu.edu.cn Mon, 22 February 2021 15:32 UTC

From: kaigao@scu.edu.cn
To: "Vijay Gurbani" <vijay.gurbani@gmail.com>
Cc: draft-ietf-alto-path-vector@ietf.org, "IETF ALTO" <alto@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/alto/23HfNwNPoYwZqKfwWw6UaAqOLVI>

Hi Vijay and the ALTO WG,




This is a follow-up on the comments that were not fully addressed in the previous email. Please see below.




Thanks!




Best,

Kai



-----Original Messages-----
From:"Vijay Gurbani" <vijay.gurbani@gmail.com>
Sent Time:2021-02-08 23:35:13 (Monday)
To: draft-ietf-alto-path-vector@ietf.org
Cc: "IETF ALTO" <alto@ietf.org>
Subject: Chair review of path-vector-13 (Part 1 of 2)


Chair review from beginning of document to the end of S6.6.
Part 1 of 2.

Major:
- S4.1, below Figure 2:  Note that we do not have "availbw" defined in ALTO as a current cost metric, so it is not a good idea to use it here without qualifying it further.  If used as is, it creates confusion.  My advice would be to either qualify the use of "availbw" as a hypothetical cost metric, or choose an actual cost metric from the performance-metric draft and restate the example.

- S4.1, "Case 1": I don't see how the "application will obtain 150 Mbps at most."  Consider that the bottleneck bandwidth is 100 Mbps, as that is the bandwidth of the most constrained link.  Once traffic leaves sw5, it can get no more than 100 Mbps on the remaining links.  So, I don't understand how the "application will obtain 150 Mbps at most."?  Perhaps I am missing something?

- S4.2.3: This paragraph, especially from the second sentence onwards, needs to be rewritten to better flesh out the need.  Currently it says, "While both approaches...", however, it is not clear that there are two approaches being delineated from each other here.  It needs more edits so it reads better. (Some nits in this paragraph appear in the Nits section trying to tease out the language.)

- S5.1.3: When Section 5 begins, it says that "This section gives a non-normative overview of the Path Vector extension."  However, in S5.1.3, there is a normative "MUST".  (Same problem in S5.3, there are many "MUST"s there, and in Section 5.3.3 there are "RECOMMENDED" and "SHOULD NOT".)

Generally, I am a bit hesitant that certain subsections of Section 5 --- Section 5.3.2 in particular --- appear to contain normative behaviour, and this should be specified in a normative section, or do NOT start Section 5 by saying that this section gives a non-normative overview, and make this a normative section. I understand this is a major comment, so please think how you want to handle this carefully.

- S5.3.2: Not sure I follow the logic in the first paragraph.  As Fig. 4 showed, there is one PV request, and if ALTO SSE extension is being used, presumably, it will contain the "client-id".  If the response contains a Path Vector resource, shouldn't that "client-id" simply apply to it?  I am sure I am missing something here as you have thought about this more than me; perhaps you could add a simple example to make the problem more explicit.

- S6.4: Why have mini Security Considerations paragraphs in the subsections of S6.4, but not in the subsections of S6.3 and S6.5?  I am not saying that you should remove the mini Security Considerations paragraphs, but if there are security considerations worth pointing out in S6.4, I suspect that there are security considerations worth pointing out in S6.3 and S6.5 as well.  (One such security consideration is listed below in S6.5.1.)

- S6.4.2: "The persistent entity ID property is the entity identifier of the persistent ANE which an ephemeral ANE presents (See Section 5.1.2 for details)." ==> I am not sure what this means? Why is an ephemeral ANE presenting a persistent entity identifier?  Is it important that you are defining an ephemeral ANE and associating it with persistent entities?  If so, then please make this clear as there is a lot of ambiguity in this section.

- S6.5.1: What is the effect if the ALTO server chooses to obfuscate the path vector, causing the client to experience sub-optimal routing?  The client does not know that the server has obfuscated the path vector, so it MUST interpret the path vector as given to it by the ALTO server.  This raises the question whether such obfuscation, because it is indistinguishable from a non-obfuscated response, creates an attack on the client.  (Would a mini Security Consideration paragraph be appropriate here?)  Clearly, since ALTO assumes that the server is trusted to some degree, the issue becomes (a) can the client, by repeated querying, figure out that it is being duped on occasion?  (b) what does it then do?





[PV] The effects are highly implementation-specific, and it is true that
  obfuscation may create an attack on the client by compromising the integrity
  of ALTO information. As we discuss in Section 11, there are some obfuscation
  methods that can preserve the integrity of the information.

  Regarding the last two issues, the answer to (a) is also implementation- and
  network-specific. If the obfuscation is idempotent, i.e., it generates the
  same obfuscated result for the same request, a client will not be able to
  figure out that it is being duped. Even if a client sees two different
  results, the difference may still be the consequence of internal network
  changes. As for (b), we feel that it falls within the scope of Sec 15.2 in
  RFC 7285.
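
  To make the point about idempotent obfuscation concrete, here is a minimal
  sketch in Python. It is only an illustration and not taken from the draft or
  from [SC2018][JSAC2019]: the function name obfuscate_ane_names, the
  SERVER_SECRET key, and the shape of the request and path-vector dictionaries
  are all assumptions. ANE names are replaced by pseudonyms derived from the
  request and a server-side key, so the same request always produces the same
  response.

```python
import hashlib
import hmac
import json

# Assumption: a hypothetical per-server secret key; not part of the draft.
SERVER_SECRET = b"hypothetical-server-secret"


def obfuscate_ane_names(request: dict, path_vectors: dict) -> dict:
    """Rename ANEs with pseudonyms derived from the request and a server key.

    The pseudonym depends only on (key, request, ANE name), so repeating the
    same request yields the same obfuscated response.
    """
    # Canonical serialization so that equal requests map to equal bytes.
    request_bytes = json.dumps(request, sort_keys=True).encode("utf-8")

    def pseudonym(ane: str) -> str:
        message = request_bytes + b"|" + ane.encode("utf-8")
        digest = hmac.new(SERVER_SECRET, message, hashlib.sha256).hexdigest()
        return "ane-" + digest[:8]

    return {
        src: {dst: [pseudonym(ane) for ane in path] for dst, path in dsts.items()}
        for src, dsts in path_vectors.items()
    }


# Hypothetical request and path vectors, for illustration only.
request = {"srcs": ["ipv4:192.0.2.1"], "dsts": ["ipv4:198.51.100.1", "ipv4:198.51.100.2"]}
real_paths = {
    "ipv4:192.0.2.1": {
        "ipv4:198.51.100.1": ["link-a", "link-b"],
        "ipv4:198.51.100.2": ["link-a", "link-c"],
    }
}
first = obfuscate_ane_names(request, real_paths)
second = obfuscate_ane_names(request, real_paths)
```

  Here first and second are equal, and the shared element "link-a" maps to the
  same pseudonym on both paths, so the bottleneck-sharing structure is kept
  while repeated querying reveals nothing.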

  Instead of expanding the security discussion in Sec 6.5.1, the proposed change
  is to move the security consideration on the integrity to Sec 11 (security
  consideration), as reduction/obfuscation are usually introduced as mechanisms
  of protecting confidentiality.




OLD:

  To mitigate this risk, the ALTO server should consider protection
   mechanisms to reduce information exposure or obfuscate the real
   information, in particular, in settings where the network and the
   application do not belong to the same trust domain.  But the
   implementation of Path Vector extension involving reduction or
   obfuscation should guarantee the requested properties are still
   accurate, for example, by using minimal feasible region compression
   algorithms [TON2019] or obfuscation protocols [SC2018][JSAC2019].




NEW:

   To mitigate this risk, the ALTO server should consider protection
   mechanisms to reduce information exposure or obfuscate the real
   information, in particular, in settings where the network and the
   application do not belong to the same trust domain.  For example, in
   the multi-flow bandwidth reservation use case as introduced in
   Section 4, only the available bandwidth of the shared bottleneck link
   is crucial, and the ALTO server may only preserve the critical
   bottlenecks and can change the order of links appearing in the Path
   Vector response.

   However, arbitrary reduction and obfuscation of information exposure
   may introduce a risk to the integrity of the ALTO information,
   leading to infeasible or suboptimal decisions by ALTO clients.

   To mitigate this risk, if an ALTO client finds that the traffic
   distribution based on the Path Vector information is not feasible
   (e.g., causing constant congestion) or not better than a distribution
   which does not fully conform to the information (e.g., by randomly
   choosing the source/destination for certain flows), it can follow the
   protection strategies for potential undesirable guidance from
   authenticated ALTO information, specified in Section 15.2.2 of RFC
   7285 [RFC7285].  While repeatedly sending the same query can
   potentially detect the integrity problem for certain obfuscation
   methods (e.g., those based on time or randomness) under certain
   network conditions (e.g., where the routing and ANE properties are
   stable), an ALTO client must be aware that this behavior may be
   considered as a denial-of-service attack on the server and may lead
   to the rejection of further requests from the client.

   On the other hand, this risk can also be mitigated from the server
   side.  While the implementation of an ALTO server is beyond the scope
   of this document, implementations of ALTO servers involving reduction
   or obfuscation of the Path Vector information should consider
   reduction/obfuscation mechanisms that can preserve the integrity of
   ALTO information, for example, by using minimal feasible region
   compression algorithms [TON2019] or obfuscation protocols
   [SC2018][JSAC2019].
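
As an illustration of a reduction that preserves integrity (this is not part of the proposed text, and it is a much simpler rule than the algorithms in [TON2019]), the sketch below drops an ANE only when another ANE already implies its bandwidth constraint. The function name drop_dominated_anes and the input layout (a set of flows per ANE plus a capacity per ANE) are assumptions made for the example.

```python
def drop_dominated_anes(flows_on_ane: dict, capacity: dict) -> list:
    """Return the ANEs to keep after removing dominated bandwidth constraints.

    ANE a is dominated by ANE b when every flow crossing a also crosses b and
    capacity[b] <= capacity[a]. Any non-negative allocation that satisfies b
    then also satisfies a, so hiding a leaves the feasible region unchanged.
    """
    kept = []
    for a in sorted(flows_on_ane):
        dominated = False
        for b in sorted(flows_on_ane):
            if a == b:
                continue
            subset = flows_on_ane[a] <= flows_on_ane[b]  # set inclusion
            tighter = capacity[b] <= capacity[a]
            identical = (
                flows_on_ane[a] == flows_on_ane[b] and capacity[a] == capacity[b]
            )
            # For identical constraints, keep only the first name in sort order.
            if subset and tighter and not (identical and a < b):
                dominated = True
                break
        if not dominated:
            kept.append(a)
    return kept


# Hypothetical two-flow example (capacities in Mbps).
flows_on_ane = {
    "ane1": {"f1", "f2"},  # shared bottleneck
    "ane2": {"f1"},
    "ane3": {"f2"},
    "ane4": {"f1"},
}
capacity = {"ane1": 100, "ane2": 150, "ane3": 100, "ane4": 50}
kept = drop_dominated_anes(flows_on_ane, capacity)  # -> ['ane1', 'ane4']
```

In this example ane2 and ane3 are implied by the shared bottleneck ane1, so they can be hidden without changing which allocations are feasible, while ane4 must stay because it is tighter than ane1 for flow f1.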




Minor:

- S1, paragraph 3: Why would "job completion time" be shared by bottleneck network links?  On first glance, job completion time is a function of the compute resources on the host not network links, but on further reflection, job completion time could also be a function of the network links on the host if the data needs to be marshalled to the job (process) in order for it to complete.  If so, then perhaps reword as:
 
 OLD:
 For example, job completion time, which is an important QoE metric for a large-scale data analytics application, is impacted by shared bottleneck links inside the carrier network.

 NEW:
 For example, job completion time, which is an important QoE metric for a large-scale data analytics application, is impacted by shared bottleneck links inside the carrier network as link capacity may impact the rate of data input/output to the job.

- S5.1.1: "Thus they must follow the mechanisms specified in the [i-D.ietf-alto-unified-props-new]." ==> Here, it may help to point to a specific section of the I-D you want the implementer to follow the mechanisms of.  Do you mean the naming mechanism defined in the I-D?  The inheritance mechanism defined in the I-D?

- S5.1.2: How does the client know that an ANE in a response is ephemeral versus persistent?  You answer this question in Section 6.4.2, perhaps you can put a forward reference to Section 6.4.2 as I am sure other readers will have the same question.

- S6.2.4: "...their entity domain names MUST be ".ane"..." ==> MUST be .ane or MUST use the .ane prefix?  I can't tell.  Please specify this better through an example as well.  You do have an example in the last paragraph, but the writing of the example is ambiguous.  My understanding is: ".ane:NET1" is an ephemeral ANE, while "dc-props.ane:DC1" is a persistent ANE.  Is that correct?  If so, just explicitly mention this.
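
Under the reading stated in the S6.2.4 comment above (an identifier whose domain name is exactly ".ane" names an ephemeral ANE, while a domain name of the form "<resource-id>.ane" names a persistent one), the distinction could be checked mechanically. The Python sketch below assumes that reading is correct; the helper name classify_ane is hypothetical, and it further assumes that the resource ID contains no ':' character.

```python
def classify_ane(entity_id: str) -> tuple:
    """Split an ANE entity identifier and classify it.

    Returns (kind, resource_id, ane_name), where kind is "ephemeral" when the
    entity domain name is exactly ".ane" and "persistent" when a resource ID
    precedes ".ane". This follows the reading in the review comment and is
    not quoted from the draft.
    """
    # Assumption: the first ':' separates the domain name from the ANE name.
    domain_name, sep, ane_name = entity_id.partition(":")
    if not sep or not ane_name:
        raise ValueError(f"not an entity identifier: {entity_id!r}")
    # The domain type is whatever follows the last '.' in the domain name.
    resource_id, dot, domain_type = domain_name.rpartition(".")
    if not dot or domain_type != "ane":
        raise ValueError(f"not in an ANE domain: {entity_id!r}")
    kind = "persistent" if resource_id else "ephemeral"
    return kind, resource_id, ane_name


ephemeral = classify_ane(".ane:NET1")          # -> ('ephemeral', '', 'NET1')
persistent = classify_ane("dc-props.ane:DC1")  # -> ('persistent', 'dc-props', 'DC1')
```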

Nits:

- S4.1: s/the scheduling.  However,/the scheduling, however,/

- S1, paragraph 3: s/applications, however, the/applications, the/

- S1, paragraph 5: s/in a huge volume/in an increase in volume/

- S1: s/The pressure on the/The requirements on the/

- S1: s/ALTO server convey/ALTO server to convey/

- S1: s/that each identifies/that identifies/
  or
      s/that each identifies/, each element of which identifies/

- S3: s/in a cost map or for a/in a cost map, or for a/

- S4.2.1: s/Gigabytes, Terabytes, and even Petabytes/gigabytes, terabytes, and even petabytes/
(Reason: there is no need to gratuitously capitalize these.)

- S4.2.1: s/related to the completion time of the slowest data transfer./related to the data transfer time over the slowest link./

- S4.2.1: s/the Path Vector extension/the extension defined in this document/
(This is repeated in S4.2.2 and perhaps elsewhere, please consider it as a request for global change.)

- S4.2.2: s/It is getting important/It is important/

- S4.2.3: s/may have to make/will need/

- S4.2.3: s/and potentially with/and potentially need/


- S5.2: s/, meaning the/, this means that the/


Thanks,



- vijay