Re: [Idr] Fw: [Can] Proposed CAN WG charter for discussion

Robert Raszuk <robert@raszuk.net> Tue, 31 January 2023 10:03 UTC

From: Robert Raszuk <robert@raszuk.net>
Date: Tue, 31 Jan 2023 11:03:05 +0100
Message-ID: <CAOj+MMGnQzLhvv_PQ+5wb4hw=EUJFjbJw923nxJ28XChpUDs2Q@mail.gmail.com>
To: Fang Gao <fredagao@foxmail.com>
Cc: Liu <liupengyjy@chinamobile.com>, "linda.dunbar" <linda.dunbar@futurewei.com>, jgs <jgs@juniper.net>, can <can@ietf.org>, "idr@ietf.org" <idr@ietf.org>, Farinacci <farinacci@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/JjAUrNILTvRy7tY6KVFlGOSWv5M>
Subject: Re: [Idr] Fw: [Can] Proposed CAN WG charter for discussion

Hi Fang,

What you said in your note below is 100% sound. Thank you!

This section in particular is spot on:

1) Pre-allocate resources to create instances in multiple regions. It is
uncertain which client will access which region, but the client must go
to the nearest region or be scheduled to a region with sufficient
resources.

2) Global Acceleration service (called “GA”, already launched on AWS,
ALI, Tencent, etc.): After the client reaches the nearest POP site of the
public cloud carrier’s network, the client's traffic is sent to the real
back-end Region (the one the ECS instance belongs to) over the public
cloud's own backbone network, instead of reaching the region over
unpredictable public Internet paths.

3) Internet to anywhere (an emerging potential service, which could be
called “I2A” for now): The public cloud provides a new service that
allocates the ECS according to a network latency requirement (or other
constraints set by the tenant). When purchasing ECS instances, the tenant
selects a latency range zone for ECS allocation instead of specifying a
Region directly. The public cloud carrier then allocates the ECS to a
Region that meets the tenant's requirement (e.g., less than 500 ms), and
dynamically switches the ECS over to another region when the SLA of the
original region changes and can no longer meet the constraints.
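
To make item 3 concrete, here is a minimal Python sketch of how a cloud
operator might place an instance by latency bound and move it when the
bound is no longer met. Everything in it (the Region record, the
pick_region and maybe_switch names, the latency and capacity fields) is a
hypothetical illustration, not any provider's actual API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Region:
    name: str
    latency_ms: float  # assumed: latency measured towards the tenant's clients
    free_slots: int    # assumed: spare capacity for new instances


def pick_region(regions: List[Region], max_latency_ms: float) -> Optional[Region]:
    """Return the lowest-latency region that meets the bound and has capacity."""
    candidates = [
        r for r in regions
        if r.latency_ms <= max_latency_ms and r.free_slots > 0
    ]
    return min(candidates, key=lambda r: r.latency_ms, default=None)


def maybe_switch(current: Region, regions: List[Region],
                 max_latency_ms: float) -> Optional[Region]:
    """Stay in the current region while it meets the bound, else re-select."""
    fresh = next((r for r in regions if r.name == current.name), None)
    if fresh is not None and fresh.latency_ms <= max_latency_ms:
        return fresh
    return pick_region(regions, max_latency_ms)
```

The decision stays entirely inside the cloud operator's control plane;
nothing here needs to be visible to routers.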

But folks here want to expose cloud metrics (even if unified and/or
normalized) to network elements and let third-party network elements
choose the best region or cloud cluster for the users.

I think the approaches you highlighted are really scalable when it is the
cloud operator itself who makes the call.

And as you said, access to any hyperscaler cloud today is almost always
one IXP hop away, so from there it is the cloud operator who can decide
where to put the workload based on the packet dst or even x-tuple.

So OTT with cloud control is IMO the preferred scenario. Pushing any cloud
state to the network is not.
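
As a rough illustration of the OTT approach, the sketch below keeps load
state inside a resolver and only hands a unicast address back to the
client. The Cluster and Resolver names, the 0.8 load threshold and the
region-then-load preference are all assumptions made for the example; the
addresses come from the documentation ranges.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Cluster:
    address: str  # unicast address returned to the client
    region: str
    load: float   # 0.0 (idle) to 1.0 (saturated); known only to the resolver


class Resolver:
    """Hypothetical mapping plane: load state never leaves this object."""

    def __init__(self, clusters: List[Cluster], max_load: float = 0.8):
        self.clusters: Dict[str, Cluster] = {c.address: c for c in clusters}
        self.max_load = max_load

    def update_load(self, address: str, load: float) -> None:
        """Record a load report from a cluster (stays local to the resolver)."""
        self.clusters[address].load = load

    def resolve(self, client_region: str) -> str:
        """Return the address of the preferred cluster for this client."""
        usable = [c for c in self.clusters.values() if c.load < self.max_load]
        if not usable:
            # Every cluster is above the threshold: fall back to all of them.
            usable = list(self.clusters.values())
        # Prefer a cluster in the client's region, then the least loaded one.
        best = min(usable, key=lambda c: (c.region != client_region, c.load))
        return best.address
```

The network just forwards to whatever destination address comes back, so
no FIB entry changes when the load changes.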

Kind regards,
Robert


On Tue, Jan 31, 2023 at 10:37 AM Fang Gao <fredagao@foxmail.com> wrote:

>
>  About the Anycast IP, I think this is one of the approaches to provide a
> unified entry point for application instances or service instances.
>
>
> First, for the unified entry of application/service instances, there are
> two approaches commonly used by public clouds (OTT):
>
> 1) Method 1--Anycast IP: Using the same IP address for application
> instances that are distributed across any site or any region.
>
> 2) Method 2- Unified DNS domain name: Resolving the same domain name to
> the EIP address of the nearest region, according to the location of the
> client.
>
>
> Then, what kinds of cloud services in recent years might require a
> unified entry for application instances in different regions:
>
> 1) Pre-allocate resources to create instances in multiple regions. It is
> uncertain which client will access which region, but the client must go
> to the nearest region or be scheduled to a region with sufficient
> resources.
>
> 2) Global Acceleration service (called “GA”, already launched on AWS,
> ALI, Tencent, etc.): After the client reaches the nearest POP site of the
> public cloud carrier’s network, the client's traffic is sent to the real
> back-end Region (the one the ECS instance belongs to) over the public
> cloud's own backbone network, instead of reaching the region over
> unpredictable public Internet paths.
>
> 3) Internet to anywhere (an emerging potential service, which could be
> called “I2A” for now): The public cloud provides a new service that
> allocates the ECS according to a network latency requirement (or other
> constraints set by the tenant). When purchasing ECS instances, the tenant
> selects a latency range zone for ECS allocation instead of specifying a
> Region directly. The public cloud carrier then allocates the ECS to a
> Region that meets the tenant's requirement (e.g., less than 500 ms), and
> dynamically switches the ECS over to another region when the SLA of the
> original region changes and can no longer meet the constraints.
>
>
> Next, we could take a look at the types of EIP on public clouds. We know
> that the public IP address of an ECS on the cloud is provided by the EIP
> (Elastic IP) service:
>
> 1) Regional-level EIP Pool: Each region has its own IP address pool. This
> is the most common scenario. Tenants select the Region in which to
> allocate the ECS, and an unallocated EIP from the Regional EIP Pool is
> then obtained and bound to this ECS.
>
> 2) Global-level EIP Pool: These IP subnets do not belong to any dedicated
> Region, but to the global resources above regions. They can be considered
> “Regionless” IP resources. In this case, the ECS is allocated an address
> from the global EIP Pool regardless of which Region it belongs to. This
> may correspond to the Anycast address.
>
>
> Finally, going back to the application and the unified entry method it
> adopts. The unified entry is a requirement determined by the
> service/application itself, and does not depend on the “network-centric”
> or “application-centric” mode. About the “unified entry” of a service:
>
> 1) In “network-centric” mode, it probably tends to take “Method
> 1--Anycast IP”;
>
> 2) In “application-centric” mode, both approaches will be used. In
> scenarios where DNS resolution to the closest site is applicable, “Method
> 2- Unified DNS” could be taken (“Method 1-Anycast” is also possible here,
> depending more on the design of the service developer). In other
> scenarios, in which Method 2 is not applicable or the service is not
> accessed via DNS, “Method 1-Anycast” will be implemented.
>
> As an example for the GA service, AWS takes method-1 and Alibaba Cloud
> uses method-2, while Tencent and Huawei Cloud provide both method-1 and
> method-2.
>
>
> As I only started to catch up on CAN yesterday, I apologize if any
> misunderstanding of anycast or CAN on my part causes confusion.
>
> B.R.
> Fang Gao
>
>
> *From:* Robert Raszuk <robert@raszuk.net>
> *Date:* 2023-01-28 19:40
> *To:* Peng Liu <liupengyjy@chinamobile.com>
> *CC:* linda.dunbar <linda.dunbar@futurewei.com>; jgs <jgs@juniper.net>;
> can <can@ietf.org>; idr@ietf.org; Dino Farinacci <farinacci@gmail.com>
> *Subject:* Re: [Can] [Idr] Proposed CAN WG charter for discussion
> Hello Peng,
>
> > So CAN won't impact every router but just egress and ingress
>
> That's true. But here we are essentially talking about completely
> different directions/architectures and deciding which one to take. The
> two are vastly different and pretty orthogonal to each other.
>
> *Option 1 - network centric -* the one you are suggesting -
>
> * Use anycast /32 or /128 as destination address
> * Enable reception and installation of multiple paths for each anycast
> address
> * Push tons of very dynamic data to each ingress router from behind egress
> routers **
> * Associate that dynamic data with specific active path or subset of paths
> of subject anycast addresses
> * Pre-resolve in real time (continued FIB churn) all of the paths of
> anycast addresses with respect to the load behind them - and that must be
> done irrespective of any interest in that data
> * Make egress selection based on that state.
>
> ** - I realize that you will contest this and say that there is going to
> be a very small amount of relatively static data to start with. But I can
> assure you that even if you start with small and static inputs, this will
> grow fast, as compute selection will need to accommodate new data points
> as we go along.
>
> *Option 2 - application centric - *
>
> * Do not use anycast
> * Do not put any of the dynamic state of the compute/content load/state
> into the network
> * When an application is trying to resolve the address of the
> compute/content cluster, just be smart about which address is returned to
> it
> * No touch to the network - let it do what it is good at - take your
> packet and deliver it to the dst address in the packet
> * Load information is not broadcast anywhere - it can stay local and only
> the resolvers need to be aware of it
>
>
> Also note that while you could perhaps make option 1 work in your (say 5G)
> network for your service, it does not sound like it would be applicable in
> the same way to accessing public cloud compute clusters over the Internet
> based on their actual load.
>
> So the bottom line is that while I have been working on network centric
> services for nearly 25 years now, in this very case I do believe we should
> really focus on option 2 for addressing CAN's requirements.
>
> Kind regards,
> Robert
>
>
> On Sat, Jan 28, 2023 at 3:48 AM Peng Liu <liupengyjy@chinamobile.com>
> wrote:
>
>> Hi Robert,
>>
>> There might be OTT based solutions that don't involve ingress/egress
>> routers. But in some environments, like our 5G edge network, the OTT
>> method is more expensive than a mechanism for egress routers to
>> distribute the information to ingress routers so that path selection
>> engines can consider both. CAN aims at the case where the operator wants
>> to offer the selection service from its edge devices.
>>
>> In the charter, 'The assumed model for the CAN WG is an overlay network,
>> where an ingress routing node makes a forwarding decision based on the
>> metrics of interest, and then steers the traffic to an egress node that
>> serves the selected service instance, for example using a tunnel.
>> Architectures that require the underlay network to be service-aware are out
>> of scope.'
>>
>> So CAN won't impact every router but just egress and ingress. Before the
>> architecture is settled, it is a little early to determine which protocol
>> could be used. But as for the directions, I think the IETF is for
>> building various tools: that one person can use a knife to peel an apple
>> doesn’t mean the peeler shouldn’t be invented.
>>
>> Regards,
>> Peng
>> ------------------------------
>> liupengyjy@chinamobile.com
>>
>>
>> *From:* Robert Raszuk <robert@raszuk.net>
>> *Date:* 2023-01-28 05:35
>> *To:* Linda Dunbar <linda.dunbar@futurewei.com>
>> *CC:* John Scudder <jgs@juniper.net>; can@ietf.org; idr@ietf.org;
>> farinacci@gmail.com
>> *Subject:* Re: [Can] [Idr] Proposed CAN WG charter for discussion
>> Hi Linda,
>>
>> But why do we need to do that within the underlay network vs the Over
>> The Top (OTT) way?
>>
>> Why does the network need to be involved at all in distributing the load
>> information if we could solve it at the application level and keep the
>> network lean and as stateless as possible? A simple mapping plane will
>> work just fine for this, resulting in an OTT Compute Aware Load Balancer
>> (for lack of a better name).
>>
>> Why bring this "awareness" to BGP or IGP or even routers in general?
>>
>> Isn't the draft https://www.ietf.org/id/draft-kjsun-lisp-dyncast-03.html
>> a possible solution?
>>
>> Many thx,
>> R.
>>
>>
>> On Fri, Jan 27, 2023 at 9:43 PM Linda Dunbar <linda.dunbar@futurewei.com>
>> wrote:
>>
>>> John,
>>>
>>> Oh, I guess I have over-thought the "Architecture & framework".
>>> The proponents' desire for a mechanism for egress routers to distribute
>>> computing resources to ingress routers can be considered one rough
>>> architecture.
>>>
>>> Thank you.
>>>
>>> Linda
>>>
>>> -----Original Message-----
>>> From: John Scudder <jgs@juniper.net>
>>> Sent: Friday, January 27, 2023 12:06 PM
>>> To: Linda Dunbar <linda.dunbar@futurewei.com>
>>> Cc: can@ietf.org; idr@ietf.org; farinacci@gmail.com
>>> Subject: Re: Proposed CAN WG charter for discussion
>>>
>>> Hi Linda,
>>>
>>> I didn't mean to say that the architecture would have to be completed to
>>> the point of RFC publication before that step could be started! But of
>>> course, anyone studying the applicability of a mechanism has to be
>>> thinking, "applicable for what purpose"? So I think that studying
>>> applicability presupposes that the person doing the study has an
>>> architecture in mind.
>>>
>>> Your summary seems about right, and I think it demonstrates that those
>>> in the side discussion *do* have at least a rough architecture in mind. My
>>> point is,
>>>
>>> a. It's important to write that rough architecture down, to make the
>>> assumptions transparent to all WG participants, and
>>> b. It's important that when listing work items, we do not lose sight of
>>> the fact that this is one work item.
>>>
>>> I don't see the bullet list as comprising a strictly ordered list of
>>> tasks that have to be completed in the order listed; I'm sure some will
>>> be worked on in parallel or even out of order.
>>>
>>> I hope that helps?
>>>
>>> -John
>>
>>