[Dyncast] CAN BoF issues #4 #15 #36

"liupengyjy@chinamobile.com" <liupengyjy@chinamobile.com> Tue, 21 June 2022 09:27 UTC

Date: Tue, 21 Jun 2022 17:31:46 +0800
From: "liupengyjy@chinamobile.com" <liupengyjy@chinamobile.com>
To: dyncast <dyncast@ietf.org>
Cc: rtgwg <rtgwg@ietf.org>, jgs <jgs@juniper.net>, Tony Li <tony.li@tony.li>, cjbc <cjbc@it.uc3m.es>
References: <CO1PR13MB49207A5646A942342E81B67585C99@CO1PR13MB4920.namprd13.prod.outlook.com>, <20220531000035646550674@chinamobile.com>
Message-ID: <20220621173145734665449@chinamobile.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dyncast/uigG8inRR8Jvj41AqpfGCltr3Xo>

Dear All,

Here are the responses to issues #4 #15 #36, which are related to the requirements of mobility, latency and flow affinity. Any comments are welcome.

This email is also copied to those who raised the questions (see https://datatracker.ietf.org/doc/minutes-113-can/), in the hope of further suggestions and confirmation. Thanks.

#4 Are the mobility issues and associated protocols also in scope? There are scenarios where routing alone would not be sufficient.
Combined routing and mobility solutions might be needed. From the routing side, service affinity is needed.
Supporting affinity to a particular service instance while the client moves will need a solution (which will depend on how such affinity is realized).
Supporting mobility across service instances, i.e. moving from one service instance to another in the middle of an ongoing transaction, will need extra care, likely through a solution at the application layer, possibly supported by the CAN infrastructure.
 
#15 It seems impossible to satisfy that requirement simultaneously with the latency requirement. 
Fulfilling the session persistence (or affinity) requirement together with any latency requirement may indeed be a challenge, e.g., when long-running sessions span compute conditions that vary at smaller timescales. In this case, CAN may support session mobility from the possibly overloaded serving service instance to a new, more suitable one (see also issue #4) during the ongoing session, to mitigate the otherwise negative impact on latency. The methods with which CAN may support this are in scope of CAN’s proposed work.
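
To make the trade-off concrete, here is a minimal sketch of affinity with overload-triggered session mobility. It is only an illustration and not a proposed CAN mechanism: the class name, the load metric in [0, 1] and the 0.8 threshold are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class AffinitySteering:
    """Hypothetical steering table: a session stays on its instance unless that instance is overloaded."""
    load: dict                        # instance name -> load in [0, 1] (assumed metric)
    overload_threshold: float = 0.8   # assumed value, not taken from the discussion
    sessions: dict = field(default_factory=dict)  # session id -> instance name

    def least_loaded(self):
        # Instance with the lowest reported load.
        return min(self.load, key=self.load.get)

    def steer(self, session_id):
        current = self.sessions.get(session_id)
        if current is None:
            # New session: pick the least loaded instance.
            current = self.least_loaded()
        elif self.load[current] > self.overload_threshold:
            # Overloaded: move the session; its context would have to move too.
            current = self.least_loaded()
        # Otherwise keep affinity, even if another instance is less loaded.
        self.sessions[session_id] = current
        return current
```

With instances "a" (load 0.2) and "b" (load 0.5), a new session lands on "a" and stays there while "a" is at or below the threshold. It is only moved to "b" once the load on "a" exceeds 0.8. The sketch leaves out the context transfer itself, which is the hard part discussed under issue #4.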

#36 Need to understand if there is a requirement to avoid extra messages or 1ms of latency.
Extra messages, such as those incurred by off-path systems like DNS, can add significant latency, particularly in scenarios where the possible service endpoints change frequently and the messages are therefore incurred often. The use cases attempted to cover this. More generally, avoiding extra messages in any solution CAN may develop is a standard requirement for any engineering solution, following the simplification principle.
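
As a rough illustration of why the frequency matters, the sketch below averages the cost of one off-path lookup over the requests that reuse its result. The function name and all numbers are assumptions chosen for this example, not measurements.

```python
def mean_latency_ms(service_rtt_ms, lookup_rtt_ms, requests_per_lookup):
    """Average per-request latency when one off-path lookup is paid every `requests_per_lookup` requests."""
    if requests_per_lookup < 1:
        raise ValueError("requests_per_lookup must be at least 1")
    # The lookup cost is spread evenly over the requests that reuse its answer.
    return service_rtt_ms + lookup_rtt_ms / requests_per_lookup


# Assumed figures: 5 ms to the service instance, 20 ms for the off-path lookup.
stable_endpoint = mean_latency_ms(5.0, 20.0, 100)   # endpoint rarely changes
changing_endpoint = mean_latency_ms(5.0, 20.0, 1)   # endpoint re-resolved per request
```

Under these assumed figures, the lookup adds 0.2 ms per request when its answer is reused 100 times, but the full 20 ms when the endpoint has to be re-resolved for every request.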

Detailed discussion is expected to take place only on the dyncast mailing list. You can also check the issues and add your comments to any of them (https://github.com/CAN-IETF/CAN-BoF-ietf113/issues).

Regards,
Peng



liupengyjy@chinamobile.com

From: Linda Dunbar
Date: 2022-05-11 06:11
To: dyncast@ietf.org
Subject: [Dyncast] Categories of the CAN BoF issues
CAN BoF proponents:
 
Many thanks for creating the CAN BoF issue tracker on GitHub: https://github.com/CAN-IETF/CAN-BoF-ietf113/issues/created_by/CAN-IETF?page=1&q=is%3Aopen+is%3Aissue+author%3ACAN-IETF
 
I went through the issues captured in the Github and characterized them into groups. Some issues can be lumped together for the discussion. There are quite a few issues related to the requirements, which need to be clarified.
 
Best Regards, Linda
 
 
Issues associated with Applications vs. Underlay networks:
Consider not loading the underlay network with application details. #35
We have multiple upper-layer applications. Do we have additional needs for routing (e.g. a WG?), or are we using those applications and won't need such a new WG? #30
It needs application information too, so it can't just make a decision at the network layer. #23
This does not come across as a routing problem; it's all service discovery that can be done in higher layers. #21
3GPP and URSP solve this based on UPF selection. It uses both endpoint + application. #20
One overlay plane per application. Resources/metric specific to the plane. #19
How does the application layer or the transport layer learn the network status to steer traffic? #16
 
Need clearer requirements for CAN (to be addressed by draft-liu-dyncast-ps-usecases):
Need to understand if there is a requirement to avoid extra messages or 1ms of latency #36
Regarding the flow affinity, is it from the network perspective or from the application/computation perspective? #33
How to effectively compute paths? Should we take CPUs into account? #32
What happens when the user moves? If so we also need to move application context. #25
It can only move the services around as fast as it can update the routing plane, which comes back to the point about service discovery (waiting for convergence/distribution as opposed to just updating the SD server). #24
Whether the interests of the organization deploying the application and the organization providing the network connectivity are aligned. Google doesn't worry about this because they are both. #17
The question is more what the scope and semantic of information is that will need to cross organizational boundaries. This needs further study, in particular when assuming stakeholder division between service and network provider.
It seems impossible to satisfy that requirement simultaneously with the latency requirement. #15
It wasn't clear how hard a requirement session persistence is. #13
A session usually creates ephemeral state. If execution changes from one (e.g., virtualized) service instance to another, the state/context needs to be transferred to the new instance. This required transfer makes it desirable to have session persistence (or instance affinity) as the default, removing the need for explicit context transfer, while also supporting an explicit state/context transfer (e.g., when metrics change significantly).
Should it select UPF based on the application? Steering is done per user? or per application? #9
This seems to assume conventional non-distributed applications just running at the edge. What about modern frameworks like Sapphire and Ray? #7
It would be good to understand the multi-site requirements of such frameworks, which I have understood to mainly run in single DCs.
Relation to 3GPP UPF #6
Relation to ALTO #5
Are the mobility issues and associated protocols also in scope? There are scenarios where routing alone would not be sufficient. #4
What is the position in the edge location with regard to the UPF? #3
Is there some sort of authorization model so that an edge can indicate whether or not it will provide compute services? #2
What is CNC and the relationship with CAN #1
 
Measurement of the Computing Resources (to be addressed by draft-du-computing-resource-representation):
It is hard to use existing work to measure the computation, but we can optimize the latency through the performance monitoring. We have performance/measurement matrix over there. #34
Clarifications on the computing resource, its requirements and characteristics would be helpful. #27
Each application may have a different definition of "resources"; these then have to be boiled down into a single topology. Network Aware Computing (NAC! :) does scale #14
Is computing resource measurable? #10
It is, and how to use the measurement would be solution related. See the IFIP Networking 2022 paper on how to simply expose “computing capability” and achieve better steering with such a simple measure.
Why is the compute resource different from other resources? #8
 
Load Balance based solutions:
The point is that we need a standardized LB protocol #18
The LB as part of the application itself is superior (part of the distributed application itself is to obtain and keep updating the "best" unicast location to use). #22
Is there anything missing from current LBs that would prevent their use as-is, other than that there is, for market reasons, no interop standard between different LBs? #12
For load balancing, should the load balancer learn the network’s status? #11
 
Dyncast based Solution issues:
For Dyncast, when the time is short, is it possible for the router to decide the routing? It is too fast. #31
Is dyncast proposed to encapsulate? #29
Will CAN dyncast impact each and every router? How to avoid loops? #28
What's the assumed scale of a D-router? 10^6 sessions? 100^8? What's the assumed update rate? 1Gb? 1Tb? #26