Re: [Dyncast] CAN BoF issues #9 #20 #25

"liupengyjy@chinamobile.com" <liupengyjy@chinamobile.com> Fri, 27 May 2022 06:55 UTC

Date: Fri, 27 May 2022 14:59:29 +0800
From: "liupengyjy@chinamobile.com" <liupengyjy@chinamobile.com>
To: dyncast <dyncast@ietf.org>
Cc: rtgwg <rtgwg@ietf.org>, "linda.dunbar" <linda.dunbar@futurewei.com>, "dirk.trossen" <dirk.trossen@huawei.com>
References: <CO1PR13MB49207A5646A942342E81B67585C99@CO1PR13MB4920.namprd13.prod.outlook.com>, <20220523173040895155433@chinamobile.com>, <CO1PR13MB492046DFB9684726D72D05BA85D79@CO1PR13MB4920.namprd13.prod.outlook.com>, <99326e64f8cb48ff900b01bb492652d4@huawei.com>, <CO1PR13MB4920779B79F7FFC214D2B5FE85D69@CO1PR13MB4920.namprd13.prod.outlook.com>
Message-ID: <20220527145929367127540@chinamobile.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dyncast/qbaALXaEZ01jWX_5nvlUc-EAgBo>
Subject: Re: [Dyncast] CAN BoF issues #9 #20 #25

According to the discussion, please check if the updated descriptions are appropriate:

#9 Should it select UPF based on the application? Is steering done per user or per application?
Traffic steering is done per application/service, but the metrics deciding on the steering may also consider user-specific input, if needed by the service (making this service-specific again). The application could first provide a UPF selection strategy, and the network would then have some flexibility to steer based on this strategy rather than querying the AF each time a decision is needed, which could reduce the delay in responding to users.
For distributed UPFs in 5G, UPFs could be partitioned by functionality, such as UPFs specialized for video, conferencing, or other categories, and each type of UPF could have multiple instances. In 23.548, 3GPP further defined means by which UPF (re)selection could be fine-tuned based on the DNS request. This re-selection could take place following UE mobility, and the category-specific UPF could be seen as the on-path element making the decision. The concerns, though, are the delay incurred in this selection, the amount of configuration per application (and the coupling of that information between operator and application domains), and the dependence on inspecting DNS.
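As a rough illustration of UPFs partitioned by category with several instances per category, here is a minimal Python sketch. It is a toy model only, not the procedure defined in 3GPP 23.548: the names UpfInstance and select_upf, the instance names, and the delay_ms metric are all hypothetical and chosen just for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpfInstance:
    name: str        # hypothetical instance identifier
    category: str    # functional partition, e.g. "video" or "conference"
    delay_ms: float  # assumed delay metric towards this instance


def select_upf(instances, category):
    """Return the lowest-delay instance of the requested category, or None."""
    candidates = [i for i in instances if i.category == category]
    return min(candidates, key=lambda i: i.delay_ms, default=None)


# Example data (made up for illustration).
upfs = [
    UpfInstance("upf-video-1", "video", 12.0),
    UpfInstance("upf-video-2", "video", 7.5),
    UpfInstance("upf-conf-1", "conference", 9.0),
]

chosen = select_upf(upfs, "video")
print(chosen.name if chosen else "no instance for this category")
```

The sketch only shows the per-category choice among instances; where that choice runs (underlay anycast or the SMF) and how the metric is obtained are exactly the open points discussed in this issue.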
 
#20 3GPP and URSP solve the traffic steering problem based on UPF selection. They use both endpoint and application.
The combination of endpoint and application in UPF selection is largely static. A dynamic decision to update the path has to come all the way through the application subscription procedure from the upper layer (e.g., SMF), which causes more delay. For the approach based on distributed UPFs, see the description under issue #9.


#25 What happens when the user moves? Should CAN handle the possible re-assignment to another service instance?
There are two cases here:
1) A different (or a more optimized) service instance is chosen when the client moves to a new location, which does need context/state transfer.
2) The same service instance is maintained for flow affinity when the client moves to a new location. This raises the question of how service affinity will be realized.
Some applications have their own layer for handling sessions when users move; some don’t. For CAN, the general assumption would be that a moving client maintains affinity with its previous service instance until the transaction is finished. This can be considered a value-added service offered to applications that require affinity.
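
The two cases above can be sketched as a small decision function. This is only an illustration under stated assumptions, not a proposed CAN mechanism: the function name on_client_move, its parameters, and the instance names are hypothetical, and the rule that a state transfer is needed only when the instance changes mid-transaction is an assumption of the sketch.

```python
def on_client_move(current, candidate, in_transaction, needs_affinity=True):
    """Decide which service instance serves a client after it has moved.

    current        -- instance serving the client before the move (or None)
    candidate      -- best instance as seen from the new location
    in_transaction -- True if a transaction is still in progress
    needs_affinity -- True if the application requires instance affinity

    Returns (instance_to_use, needs_context_transfer).
    """
    if current is None or candidate == current:
        # Nothing to preserve, or nothing changes.
        return candidate, False
    if in_transaction and needs_affinity:
        # Case 2: keep the previous instance until the transaction ends.
        return current, False
    # Case 1: re-assign to the better instance; assume state must be
    # transferred only if the switch happens in the middle of a transaction.
    return candidate, in_transaction


print(on_client_move("edge-a", "edge-b", in_transaction=True))
print(on_client_move("edge-a", "edge-b", in_transaction=False))
print(on_client_move("edge-a", "edge-b", in_transaction=True, needs_affinity=False))
```

The first call keeps edge-a (affinity held), the second switches to edge-b without a transfer, and the third switches mid-transaction and flags a context transfer.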

Regards,
Peng


liupengyjy@chinamobile.com
 
From: Linda Dunbar
Date: 2022-05-26 00:57
To: Dirk Trossen; liupengyjy@chinamobile.com; dyncast
CC: rtgwg
Subject: Re: [Dyncast] CAN BoF issues #9 #20 #25
Dirk, 
 
Answers to your questions are inserted below: 
 
Linda
 
 
From: rtgwg <rtgwg-bounces@ietf.org> On Behalf Of liupengyjy@chinamobile.com
Sent: Monday, May 23, 2022 4:31 AM
To: dyncast <dyncast@ietf.org>
Cc: rtgwg <rtgwg@ietf.org>
Subject: [Dyncast] CAN BoF issues #9 #20 #25
 
Dear All,
 
Based on the categories of the CAN BoF issues, here are the responses to issues #9, #20, and #25. Together with #6, posted last week, these are all the issues related to 3GPP. Any comments are welcome. Thanks!
 
#9 Should it select UPF based on the application? Steering is done by per user or per application?
Traffic steering is done per application/service, but the metrics deciding on the steering may also consider user-specific input, if needed by the service (making this service-specific again). The application could first provide a UPF selection strategy, and the network would then have some flexibility to steer based on this strategy rather than querying the AF each time a decision is needed, which could reduce the delay in responding to users.
 
[Linda] For distributed UPFs in 5G, UPFs could be partitioned by functionality, such as UPFs specialized for Video, Conference or others. For each type of UPFs, there could be multiple instances.  In 23.548, 3GPP further defined means by which UPF (re)selection could be fine-tuned based on the DNS request. And this re-selection could take place following UE mobility. The point though is the delay incurred in this selection, the amount of configuration per application /the coupling of that information between operator & application domains, and the dependence on inspecting DNS.
[DOT] So would an app/category-specific UPF be equivalent to the on-path element making the decision (where the number of UPF instances signifies the degree of distribution of the decision making)?
[Linda] I agree with your statement. 3GPP 23.548 currently has two options: 1) ANYCAST for the category-specific UPFs (letting the underlay routers choose the destination based on the network conditions); 2) the SMF choosing the individual UPF instances (which puts a very heavy load on the SMF).
 
 
#20 3GPP and URSP solve traffic steering problem based on UPF selection. It uses both endpoint and application. 
The combination of endpoint and application in UPF selection is largely static. The decision to update the path has to come all the way through the application subscription procedure from the upper layer, which causes more delay.
[DOT] I must admit I don’t parse this aspect. Does this mean that 3GPP and URSP propose to use app-level selection of the appropriate service instance? If so, this method is captured in the use case draft (the gap analysis part), so it might be useful to link to this?
[Linda] I assume this is referring to the SMF choosing individual UPF instances based on the applications and UE locations.
 
#25 What happens when the user moves? Should we also move application context?
This will depend on how service affinity will be realized. But generally, the assumption would be that a moving client maintains affinity with a previous service instance until the transaction is finished. Changing a service instance mid-transaction does need context/state transfer, indeed, but that is true for any solution (including app level).
[Linda] It depends on the application. Some applications have their own layer for handling sessions when users move; some don’t. This can be considered a value-added service offered to applications that require affinity.
[DOT] The question seems to focus on client mobility, which 3GPP handles well, I’d argue. The answer (which I partially wrote) suggests that client mobility may cause the need to re-assign the client to a different (better-fitting) service instance. So I suggest capturing this in the issue title instead of the possibly needed application context transfer, i.e., “#25 What happens when the user moves? Should CAN handle the possible re-assignment to another service instance?” I agree with Linda that this may be a value-added service, but the question may be how CAN could help here instead of relying entirely on the application (not with the transfer itself, since this is highly app-specific, but with the signaling that would lead to the transfer).
[Linda] There are two cases here: 1) a different (or a more optimized) service instance being chosen upon the client moving to a new location; 2) maintaining the same service instance for flow affinity upon the client moving to a new location.
 
You can also add your comments to any of them (https://github.com/CAN-IETF/CAN-BoF-ietf113/issues).
 
Regards,
Peng
 


liupengyjy@chinamobile.com
 
From: Linda Dunbar
Date: 2022-05-11 06:11
To: dyncast@ietf.org
Subject: [Dyncast] Categories of the CAN BoF issues
CAN BoF proponents:
 
Many thanks for creating the CAN BoF issue tracker on GitHub: https://github.com/CAN-IETF/CAN-BoF-ietf113/issues/created_by/CAN-IETF?page=1&q=is%3Aopen+is%3Aissue+author%3ACAN-IETF
 
I went through the issues captured on GitHub and categorized them into groups. Some issues can be lumped together for the discussion. There are quite a few issues related to the requirements, which need to be clarified.
 
Best Regards, Linda
 
 
Issues associated with Applications vs. Underlay networks:
·         Consider not to load underlay network with application details. #35
·         We have multiple upper-layer applications. Do we have additional needs for routing (e.g., a WG?), or are we using those applications and won't need such a new WG? #30
·         It needs application information too, so it can't just make a decision at the network layer. #23
·         This doesn't strike me as a routing problem; it's all service discovery that can be done in higher layers. #21
·         3GPP and URSP solve this based on UPF selection. It uses both endpoint + application. #20
·         One overlay plane per application. Resources/metric specific to the plane. #19
·         How does the application layer or the transport layer learn the network status to steering traffic? #16
 
Need more clear requirements for CAN (to be addressed by draft-liu-dyncast-ps-usecases):
·         Need to understand if there are requirements to avoid extra messages or 1ms of latency #36
·         Regarding the flow affinity, is it from network perspective or from application/computation perspective? #33
·         How to effectively compute paths? Shall we put CPUs into account? #32
·         What happens when the user moves? If so we also need to move application context. #25
·         It can only move the services around as fast as it can update the routing plane. which comes back to the point about service discovery (waiting for convergence/distribution as opposed to just updating the SD server) #24
·         Whether the interests of the organization deploying the application and the organization providing the network connectivity are aligned. Google doesn't worry about this because they are both. #17
o    The question is more what the scope and semantic of information is that will need to cross organizational boundaries. This needs further study, in particular when assuming stakeholder division between service and network provider.
·         It seems impossible to satisfy that requirement simultaneously with the latency requirement. #15
·         It wasn't clear that how hard of a requirement session persistence is. #13
o    A session usually creates ephemeral state. If execution changes from one (e.g., virtualized) service instance to another, state/context needs transfer to another. Such required transfer of state/context makes it desirable to have session persistence (or instance affinity) as the default, removing the need for explicit context transfer, while also supporting an explicit state/context transfer (e.g., when metrics change significantly).
·         Should it select UPF based on the application? Steering is done per user? or per application? #9
·         This seems to assume conventional non-distributed applications just running at the edge. what about modern frameworks like Sapphire? and Ray? #7
o    It would be good to understand the multi-site requirements of such framework, which I have understood to mainly run in single DCs.
·         Relation to 3GPP UPF #6
·         Relation to ALTO #5
·         Are the mobility issues and associated protocols also in scope? There are scenarios where routing alone would not be sufficient. #4
·         What is the position in the edge location regarding to UPF? #3
·         Is there some sort of authorization model so that an edge can indicate whether or not it will provide compute services? #2
·         What is CNC and the relationship with CAN #1
 
Measurement of the Computing Resources (to be addressed by draft-du-computing-resource-representation):
·         It is hard to use existing work to measure the computation, but we can optimize the latency through the performance monitoring. We have performance/measurement matrix over there. #34
·         Clarifications on the computing resource, its requirements and characteristics would be helpful. #27
·         Each application may have a different definition of "resources" these then have to be boiled down into a single topology Network Aware Computing (NAC! :) does scale #14
·         Is computing resource measurable? #10
o    It is, and how to use the measurement would be solution related. See IFIP Networking 2022 paper on how to simply expose “computing capability” and achieve better steering with such simple measure.
·         Why is a compute resource different from other resources? #8
 
Load Balance based solutions:
·         The point is that we need a standardized LB protocol #18
·         The LB as part of the application itself is superior (part of the distributed application itself is to obtain and keep updating the "best" unicast location to use). #22
·         Is there anything missing from current LBs that would prevent their use as-is, other than that, for market reasons, there is no interop standard between different LBs? #12
·         Should the load balancer learn the network’s status? #11
 
Dyncast based Solution issues:
·         For Dyncast, when the time is short, is it possible for the router to decide the routing? It is too fast. #31
·         Is dyncast proposed to encapsulate? #29
·         Will CAN dyncast impact each and every router? How to avoid loops? #28
·         What's the assumed scale of a D-router? 10^6 sessions? 100^8? What's the assumed update rate? 1Gb? 1Tb? #26