Re: [Coin] Follow up to the E2E transport discussions @ IETF 115

Dirk Trossen <dirk.trossen@huawei.com> Mon, 28 November 2022 08:51 UTC

From: Dirk Trossen <dirk.trossen@huawei.com>
To: Ike Kunze <Ike.Kunze@comsys.rwth-aachen.de>, Toerless Eckert <tte@cs.fau.de>
CC: JEF <jefhe@foxmail.com>, coin <coin@irtf.org>
Thread-Topic: [Coin] Follow up to the E2E transport discussions @ IETF 115
Thread-Index: AQHY/1xZYdEjK4/ZAkSyGv7rdivpR65UB6Qg
Date: Mon, 28 Nov 2022 08:50:59 +0000
Message-ID: <142ba0a6f7814e64890c2325b4588882@huawei.com>
References: <Y3vBlqpmW82lUPax@faui48e.informatik.uni-erlangen.de> <DBFE8A14-FB76-4D33-9C9D-E485AE7717A4@comsys.rwth-aachen.de>
In-Reply-To: <DBFE8A14-FB76-4D33-9C9D-E485AE7717A4@comsys.rwth-aachen.de>
Archived-At: <https://mailarchive.ietf.org/arch/msg/coin/9zw5yCO4Bk0lTwYOHEVPWizt7HE>
Subject: Re: [Coin] Follow up to the E2E transport discussions @ IETF 115

Dear all,

Thanks for the continued discussions. Please see inline.

Best,

Dirk

From: Coin <coin-bounces@irtf.org> On Behalf Of Ike Kunze
Sent: 23 November 2022 17:54
To: Toerless Eckert <tte@cs.fau.de>
Cc: JEF <jefhe@foxmail.com>; coin <coin@irtf.org>
Subject: Re: [Coin] Follow up to the E2E transport discussions @ IETF 115

Dirk, Jeffrey, Toerless, all,

Thanks a lot for your comments!
I think those are already a lot of very insightful thoughts that could help shape future discussions.
Please find my thoughts on your comments below, marked by [IK].

Cheers,
Ike

On 16. Nov 2022, at 14:57, Dirk Trossen <dirk.trossen@huawei.com> wrote:

One discussion point in how to progress with that draft was what to do in terms of analysis. As I outlined in my email to the list (to which I have sadly received no reply yet), one suggestion made in the interim was to separate the analysis from the use case description (and terminology). In the light of this discussion here and in reference to Dirk K’s comment, such separation may indeed be useful, where the analysis could focus on applying the key aspects of the proposed principles in the COIN transport paper to those use cases. For instance, how are aspects of simplicity and transparency (which we outlined as being key when talking about ‘incomplete versions of the application function’ in the network) concretely reflected in industrial networking use cases (or those on micro-service apps)? Firstly, what is that ‘incomplete version’ (for a specific use case), and how do simplicity and transparency apply here from the application’s or use case perspective? While it is not the ‘practical’ experience Dirk called for, it comes close and relates to the (use case) discussion we are having anyway (and which is documented in the form of the draft).

Such analysis could be a practical way forward to shed some more light onto what the paper attempted to outline and it would also help us as a community to get some view (note that I am not calling for consensus per se) on the relation of COIN to the E2E principle, which is an aspect I often hear being asked in the context of COIN.

[IK] Thanks, Dirk, for bringing up the use case draft / the analysis of the use cases.
I agree that this could be one sensible way of gathering some form of “near-practical” insight and I actually had slides [1] for IETF 115 specifically probing for interest in working on the analysis part, but there was no room left on the agenda.
Similar to the considerations for the use case draft, I think it would be very helpful to get many viewpoints of different people.
From what I gathered during the hallway discussions, there are, e.g., widely differing views regarding what could be the ‘incomplete version of the function’ or rather what could be the envisioned scope of such functions, e.g., only “core networking tasks” or “bringing application logic into the network”.
Some even seem to think that “what functions might be deployed?” is the wrong question to ask (see Toerless’s comments).
Thus, having structured discussions on (some of) these aspects absolutely makes sense to me and is one way to progress this within the RG if there is broad interest by others to participate.
[DOT] Fully agree and this was the point of pushing forward the aspect of analysis more; it will require at least an initial ‘structure’ for looking at the space of ‘functions in the networks’.

[1] https://datatracker.ietf.org/meeting/115/materials/slides-115-coinrg-draft-irtf-coinrg-use-cases

On 21. Nov 2022, at 13:49, JEF <jefhe@foxmail.com> wrote:

1, About the End2End principle/argument

Where this principle is applicable: it’s more about the relationship between applications and the network, not the size of the domain (“internet-wide” or limited domain). If the network is designed for one application, for instance the AI training cluster, the end2end principle may not make sense. If the network is supposed to convey many applications, for instance the DC network, the end2end principle is still good guidance.
What does this principle suggest: "incomplete functionality not being deployed in today’s infrastructure" is one way to put it. The discussion in [1] further explains two goals of the end2end argument: application autonomy and network transparency. My personal opinion is that these two goals should be respected if we expect a COIN network to convey various applications.

[IK] I agree that this is another way to phrase these observations and I see similarities to the thoughts of Toerless, too.
In fact, the notion of “network [..] designed for one application” or "Mission specific network” (thanks Toerless, I like this phrasing!) better conveys the key characteristics of what I intended to express with “limited domain”, namely that the network and applications can be co-designed as they are controlled by a single entity.
I think a key question that we could ask ourselves at this point is whether we see COIN as a pointed solution within mission specific networks or whether we expect COIN networks to cover multiple applications and “domains”.
Already relating to Toerless’s comments, building a “robust” network infrastructure that can generally handle both versions could also be an outcome of these considerations.
[DOT] This is indeed key and relates to Jeffrey’s ROI point. The E2E argument, in my understanding, intended to outline design principles and aspects that are not limited to specific ROI angles but instead let us build systems that are general and can be skewed towards specific ROI intentions after rollout; this is the permissionless innovation that was rightly mentioned here. So while limited domains may give us a nice abstraction for the ‘mission-specific network’ part that Toerless phrased, it still leaves at least two key aspects open (not claiming the following list to be exhaustive):

  1.  How to design those limited domains so that both mission specificity and generality are achieved (unless one wants to build one's own mission-specific network and not care about generality at all)?
  2.  How to interconnect those limited domains over the ‘Internet’, in case this is a requirement? How to avoid ‘leakage’ of any assumed in-network semantics into those parts of the network that do not support them (e.g., other limited domains or the Internet)?



2, What kind of computing functions should be integrated in the network
In my view there are 3 options or approaches. The last two approaches may not have the issue of "incomplete" functionality, since they provide generic functions to any or many applications.
(1) integrate an application-specific function into the network.
The key difference between this approach and the other two below is that, in this approach, these in-network functions may have some semantics or context related to a specific application. Usually this is questionable. The applications can evolve to a new technology overnight by a software update, whereas the networks as an infrastructure are usually expected to be stable. Will the programmable network be flexible enough to change the game?
(2) integrate an execution environment (EE) inside the network that can execute nearly any function
This may be close to the "active network" in my understanding: a programmable network node provides an execution environment (EE) which is likely Turing-complete; the packet comes with a piece of code in it which is executed in this EE.
(3) integrate some "atomic" computing functions inside the network that are supposed to be used by many applications
This approach seems to follow the principles of "application autonomy and network transparency" from the end2end argument. But the question is: can we define a compact set of functions useful for most applications? Maybe we first need to have a good abstraction of the functionality of various applications. In my experience, the function set in MPI (see page 266 in [2]) could be a good reference. Historically that was defined for scientific computations. Concepts like all-reduce etc. are now used in AI clusters.

[IK] Thanks, Jeffrey, I think this is a good starting point for discussing possible types of COIN functionality.
The first group certainly fits our definition of “E2E-function-internal computation” the closest as we (possibly) directly derive the functionality from the application.
However, I also think that (2) and (3) can fit here, too.
After all, an atomic function, e.g., may very well represent an incomplete version of many different functions and I think many of the questions raised in our talk/paper/draft also apply here, e.g., how such functionality could/should be addressed, especially if there are multiple atomic functions on a single device or even a freely programmable EE.
Having a discussion on these aspects, as proposed by Dirk above, would be very interesting.
[DOT] Thanks Jeffrey for these options. Personally, I see them all as variants of the same thing, to be honest. Option 2 seems to be a realization of option 1, though equally of option 3, while option 3 is a limitation of option 1 (not ALL but only SOME well agreed application functions). But it is indeed a good starting point for looking at the problem of WHAT and HOW many functions f_1, …, f_n a COIN network MAY, SHOULD, or MUST have. But to me, it misses the point of the paper Ike presented: if we assume possible sub-functions f_1, …, f_n in the network (those n may be either app-specific or agreed ‘atomic’ functions, and may be deployed through an infra SW update or through an execution environment – thus capturing all options above), how do I build E2E applications in such an environment? The key point, also made in the paper Ike presented, is that simplicity and transparency are good guidance points in doing so. IOW, you ought to know that function f_i is being executed in the network for the endpoints to perform their functions properly, while simplicity pushes you to carefully consider the number n of in-network functions you may want.
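
To make the transparency and simplicity points a bit more tangible, here is a minimal Python sketch. Everything in it is an assumption made for illustration and not taken from the paper or the draft: the two functions `f_1` and `f_2`, the `Packet` type with its `applied` record, and the helpers `network_node` and `endpoint_receive`. The idea is only that the endpoint can complete its function correctly because the packet tells it which f_i already ran in the network, and that the set of in-network functions is kept deliberately small.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical, deliberately small set of in-network functions f_1..f_n (here n = 2).
IN_NETWORK_FUNCTIONS: Dict[str, Callable[[List[int]], List[int]]] = {
    "f_1": lambda payload: [x for x in payload if x >= 0],  # drop negative samples
    "f_2": lambda payload: [sum(payload)],                  # partial aggregation
}

@dataclass
class Packet:
    payload: List[int]
    # Transparency (assumed mechanism): a record of which f_i ran in the network.
    applied: List[str] = field(default_factory=list)

def network_node(packet: Packet, function_name: str) -> Packet:
    """Apply one in-network function and record that it ran."""
    fn = IN_NETWORK_FUNCTIONS[function_name]
    return Packet(fn(packet.payload), packet.applied + [function_name])

def endpoint_receive(packet: Packet) -> int:
    """Complete the end-to-end function, using the record to see what is left to do."""
    data = packet.payload
    if "f_1" not in packet.applied:
        data = [x for x in data if x >= 0]
    # Summing again is harmless if f_2 already pre-aggregated to a single value.
    return sum(data)

raw = Packet([3, -1, 4])
via_network = network_node(network_node(raw, "f_1"), "f_2")
direct_result = endpoint_receive(raw)            # no in-network help
assisted_result = endpoint_receive(via_network)  # f_1 and f_2 ran in the network
```

In this sketch both paths yield the same end result, which is the property the endpoint relies on: the in-network functions are incomplete helpers, and the endpoint remains responsible for the complete function.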



3, Protocols or the discussion between "TCP-like" and "distributed computing”
There seem to be two modes for the potential protocols for in-network computing.
(1) still use the notion of "end2end session" but extend the function executed on the network nodes.
One example is the paper [3] (in-network computing for all-reduce in AI). My feeling is that we can start from an analogy between in-net computing and multicast. Unlike unicast, multicast changes the data by "copying", and it's not a "wire" any more, it's point2multi-point. If we generalize the function of "copying" to "reduce" and change the traffic topology to multi-point2point, it's "convergentCast" (see page 52 in [4]). We can further extend the topology to multipoint-to-multipoint, then it goes to the collective communication in the definition of MPI [2]. We can also extend the functions inside the network nodes to those discussed above.
From the experience in multicast, we could have some ideas on how to do the naming, addressing, routing, reliability and rate control (the most difficult part) for in-net computing.
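
The analogy above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the topology `TREE`, the node names, and the helper functions are all hypothetical, and no packets, naming, reliability, or rate control are modelled. It shows "copy" at branching points (point-to-multipoint), "reduce" at branching points (multipoint-to-point, the "convergentCast" pattern), and an all-reduce built by combining the two.

```python
from functools import reduce as fold
from operator import add

# Hypothetical aggregation tree: children of each interior node.
# "root", "s1", "s2" stand in for network nodes; "h1".."h4" are end hosts.
TREE = {
    "root": ["s1", "s2"],
    "s1": ["h1", "h2"],
    "s2": ["h3", "h4"],
}

def convergecast(node, values, op):
    """Multipoint-to-point: each interior node combines its children's results with op."""
    children = TREE.get(node)
    if not children:  # leaf: an end host contributing its own value
        return values[node]
    return fold(op, (convergecast(child, values, op) for child in children))

def multicast(node, value):
    """Point-to-multipoint: each interior node copies the value to its children."""
    children = TREE.get(node)
    if not children:
        return {node: value}
    delivered = {}
    for child in children:
        delivered.update(multicast(child, value))
    return delivered

def all_reduce(values, op):
    """All-reduce as a convergecast up the tree followed by a multicast down it."""
    return multicast("root", convergecast("root", values, op))

host_values = {"h1": 1, "h2": 2, "h3": 3, "h4": 4}
total = convergecast("root", host_values, add)
result = all_reduce(host_values, add)
```

Swapping `add` for another associative operation (for example `max`) changes the in-network function without touching the tree logic, which is one way to read the step from "copying" to a more general "reduce".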

[IK] This is a nice way of looking at the problem space.
What could be interesting is to think about which layer we would see such concepts on, i.e., do we want to have this on L3 or do we include this in the transport notion?
For example, if we consider “node compute capacity” in addition to the normal forwarding capacities of networking nodes, we would somehow need to find ways to express this, potentially in the form of flow control on L4?
I don’t know, but it is definitely very interesting to think about this further.
[DOT] From a transport perspective, this is indeed a good way to look at the potential for COIN and also at the question of what the aforementioned functions f_1,…,f_n in a COIN system should be. The MPI reference is a good one since it may give us a good starting point on what operations could be done at branching points in the network rather than at the endpoints only (that MPI realizations of ‘broadcast’ are often/mostly mere endpoint replications is deeply frustrating, and this is only talking about the COPY operation; the same applies to more complex operations). In the work that David Guzman presented on DLT, we have also identified the potential role of the network to assist in ‘diffusion multicast’ (which is currently done in the DLT peers by way of complex and costly TCP operations, as the paper outlined). The general direction here is right in looking at multicast work as inspiration for COIN, while thinking of more complex-than-COPY-only operations.
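
As a rough illustration of why endpoint replication is wasteful compared to copying at branching points, the following Python sketch counts link transmissions for a single broadcast. The topology `PARENT`, the receiver list, and the cost model (one unit per link traversal) are assumptions chosen for the example, not measurements of any MPI implementation.

```python
# Hypothetical tree topology: parent of each node; the sender sits at the root "src".
PARENT = {
    "s1": "src", "s2": "src",
    "h1": "s1", "h2": "s1",
    "h3": "s2", "h4": "s2",
}
RECEIVERS = ["h1", "h2", "h3", "h4"]

def path_links(node):
    """Links on the path from the root down to node, as (parent, child) pairs."""
    links = []
    while node in PARENT:
        links.append((PARENT[node], node))
        node = PARENT[node]
    return links

def endpoint_replication_cost(receivers):
    """The source sends one unicast copy per receiver; each copy crosses its whole path."""
    return sum(len(path_links(receiver)) for receiver in receivers)

def in_network_copy_cost(receivers):
    """Branching points copy the packet, so each link of the delivery tree carries it once."""
    return len({link for receiver in receivers for link in path_links(receiver)})

unicast_cost = endpoint_replication_cost(RECEIVERS)
multicast_cost = in_network_copy_cost(RECEIVERS)
```

With this small tree the gap is modest, but under the same cost model it grows with tree depth and fan-out, since links close to the sender carry one copy per downstream receiver when replication happens at the endpoint.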


On 21. Nov 2022, at 19:21, Toerless Eckert <tte@cs.fau.de> wrote:

Aka: The challenge is not to figure out "what" enhanced functionality we want
hop-by-hop, but to build the network infrastructure, so that we can show that
we can introduce the "whatever" enhancement such that the remaining/established
services in the network are not going to be affected.

Isolation techniques such as what we might want to call "programmable slices"
and similar mechanisms are IMHO at the foundation of such network infrastructure
designs. Likewise programmatically enforced fallback / backward compatibility
mechanisms.

Of course, this would be ambitious and it is much more likely that nobody will
give this a serious amount of investment. For many functions (including Coin),
it will be easier to design all types of hybrid overlay systems where we "just" need
to figure out how to do something slightly more involved than a distributed
DataCenter to run additional compute at coordinated locations across a topology.
But that "lame" escape route is going to happen in the industry already anyhow
driven by the need to monetize edge-compute. Not much IMHO what base research can
add to those efforts.

[IK] Thanks, Toerless, for this insightful comment.
I am not sure that I would totally agree with your sentiment that “E2E” is the boring subset of COIN, but I get what you are aiming at.
Essentially, what we intended to achieve with our talk/paper/draft is to raise questions, the answers to which might point towards the network infrastructure that you mention.
From your perspective, do you rather see a need for innovation on the “network infrastructure” level, e.g., how “whatever" functionality could be inserted or exposed to applications, or on the “end-host/transport level”, i.e., focusing on how such functionality can be integrated in the overall application logic?
From my interpretation of your comments, I would assume that you probably refer to the “network infrastructure” level while we (as in the authors) come from the “end-host” perspective.
At least to me, these two ways of thinking about this problem are just two sides of the same coin (pun intended).
Maybe it would make sense to discuss a bit more the different implications of the two sides and/or whether my assessment of these as two sides of the same coin is actually something others would agree with.