Re: [icnrg] [irsg] IRSG review request draft-oran-icnrg-qosarch-04

"David R. Oran" <daveoran@orandom.net> Thu, 20 August 2020 18:08 UTC

From: "David R. Oran" <daveoran@orandom.net>
To: "Schooler, Eve M" <eve.m.schooler@intel.com>
Cc: icnrg@irtf.org, icnrg-chairs@ietf.org, "Colin Perkins" <csp@csperkins.org>, "Steering Group" <irsg@irtf.org>
Date: Thu, 20 Aug 2020 14:07:58 -0400
Archived-At: <https://mailarchive.ietf.org/arch/msg/icnrg/4RXNJuerM8cztSvlcGORdjVJmmg>

Thanks a bunch for these substantial thoughts on areas the document 
might address but currently doesn’t. This message definitely qualifies 
for TL;DR, but if it does generate a spirited discussion on the ICNRG 
list I think that would be a good thing (this message cc’s the IRSG 
since Eve’s original message with technical comments on the QoSArch 
document included the larger community, but I encourage further 
follow-up to stay on the ICNRG list).

I have a rather narrow view of what QoS belongs in the network layer 
(L3) as opposed to higher in the stack and I think that colors my 
thoughts on these issues pretty strongly.

In particular, I think there are two themes to this:

1) whether “in network computation” is appropriate to represent in 
the ICN inter-networking protocol (NDN or CCNx). One view is that its 
only real role is to deliver Interests to endpoints (or caches) holding 
the desired named content and return the corresponding data. 
Alternatively, one could consider resources for computational tasks 
beyond this forwarding/routing function to be in scope.

2) whether the resources addressable using QoS treatments are just the 
“abstract” resources represented by protocol functions, or more 
directly the underlying resources used by the protocol to achieve those 
functions. For example, PIT capacity, forwarding packet rate, and cache 
capacity would be what were manipulated by requested QoS treatments, 
while things like CPU, DRAM memory, or energy consumption would not.

It would be really interesting to get people’s views on this. My own 
view, clearly represented by the current text of the QoSArch document, is 
that for (1) it’s only the role of getting Interest and Data delivered 
that is relevant for ICN network-layer QoS, and for (2) only the 
abstract resources are what gets modeled.
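
As a rough illustration of the distinction in (2), here is a minimal 
Python sketch. The resource names come from the examples above; the 
enum, the table, and the function name are hypothetical and appear 
neither in the QoSArch draft nor in any NDN/CCNx implementation.

```python
from enum import Enum


class ResourceKind(Enum):
    ABSTRACT = "abstract"      # represented by protocol functions
    UNDERLYING = "underlying"  # used by the protocol to achieve them


# Illustrative table only; the entries are the examples named in the text.
RESOURCES = {
    "pit_capacity": ResourceKind.ABSTRACT,
    "forwarding_packet_rate": ResourceKind.ABSTRACT,
    "cache_capacity": ResourceKind.ABSTRACT,
    "cpu": ResourceKind.UNDERLYING,
    "dram": ResourceKind.UNDERLYING,
    "energy": ResourceKind.UNDERLYING,
}


def in_scope_for_l3_qos(resource: str) -> bool:
    """Under the narrow view, only abstract resources are manipulated
    by requested QoS treatments (assumption: a simple yes/no model)."""
    return RESOURCES[resource] is ResourceKind.ABSTRACT
```

Under the broader view, the underlying entries would be in scope as 
well, which is exactly the point up for discussion.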

Some more detailed thoughts are embedded.


On 10 Aug 2020, at 20:46, Schooler, Eve M wrote:

> + ICNRG (meant to cc originally, but I seem only to have cc’d the 
> chairs)
>
> Hi Dave,
>
> I think the document is sound technically, at least in terms of what 
> it discusses from the ICN perspective.
>
> However, one aspect about QoS that seems less clear from the document 
> (but that you hint at in various places, yet never seem to really sink 
> your teeth into fully) is how does one specify QoS treatments, when 
> the QoS spans multiple kinds of resources, not only the network but 
> also storage/caches, compute resources, and energy (e.g., 
> availability/excess clean energy):

Well, the document doesn’t reach the goal of being an actual specified 
QoS architecture. I’m just trying to point in a particular direction. 
So, the draft talks about the ability to specify richer QoS treatments 
than can be specified for IP, and to be able to represent tradeoffs as 
well as absolutes (e.g. sacrifice latency in order to get higher 
satisfaction rate). It doesn’t come anywhere close to saying exactly 
what treatments might be included or exactly how they are achieved 
algorithmically.

My goal is that the ICNRG and researchers in general will sink their 
teeth into this and that the QoSArch document, if published, can serve 
to point them in a productive direction.
>
>   *   section 4, Table 2: when you identify the kinds of resources 
> ICN requires, there is mention of resources for Compute capacity for 
> forwarding functions, but not for in-network computation.

I tried to clarify above that this is intentional. I think allocating 
compute resources via the L3 forwarding protocol would be a mistake and 
a pretty dramatic layer violation (although sometimes layer violations 
are called for and result in important advances, particularly if the 
violated layer boundary was poorly chosen by the architects). If you 
look at some of our current work like CFN 
(https://dl.acm.org/doi/10.1145/3357150.3357395) the resource management 
for placing and allocating compute is a protocol running **on top** of 
the ICN protocols, not inside as QoS-oriented extensions. What it 
*could* do for example (but doesn’t since current protocols don’t 
have the hooks) is use QoS machinery to ensure ongoing computations get 
higher reliability for result delivery than initiating new computations, 
or to rank less important computations to have less stringent latency 
bounds.


>   *   section 5: this discusses TCP/IP equivalence classes and Intserv 
> and Diffserv, but made me wonder if there is a similar discussion that 
> could be had about QoS from the standpoint of computation engines, 
> e.g., Kubernetes or other orchestrated workload/offload frameworks.

Given things like Kubernetes, Istio, and friends that place 
computations, there is certainly lots of work to be done to design joint 
optimization of routing, network capacity, and compute placement/load. 
The question for me is what inherent QoS treatments (and associated 
algorithms) in the L3 protocols are needed in order to enable higher 
layer resource management protocols to have the information they need, 
and the knobs to turn, in order to do that. My guess (although I can’t 
prove it) is that the needed capabilities in the L3 ICN protocols are 
quite modest, and are captured by the QoSArch document reasonably well. 
If that’s not the case then I’d really love to hear what might be 
missing or wrong in the overall approach I recommend.

One big open question that I do talk about in the document is whether 
Intserv-like traffic control is useful and if explicit admission control 
is needed. In the IP world, things worked out that supplying traffic 
control and admission control directly to consumer end hosts was a 
colossal no-op/failure, but that using those capabilities internal to 
individual AS’s to do traffic engineering turned out to be quite 
widely employed.

The provisional conclusion I put in the document is that it would not be 
too difficult to add traffic control to the protocols, but that adding 
admission control would be tricky at best and horrendously complicated 
at worst because of the multi-path/multi-destination forwarding model 
and the topological independence of application names.

>   *   Section 6.3: Interesting tradeoff questions are raised between 
> reliability vs latency vs bandwidth. Tradeoff algorithms would seem to 
> need joint-optimization expertise, and so we will likely need the kind 
> of work that folks like Edmund Yeh (Northeastern) and Andrea Goldsmith 
> (Stanford) are doing on joint-optimization of caching and routing for 
> heterogeneous wireless networks (and they’ve recently augmented 
> their research focus to include joint optimization for 
> caching-routing-and-compute).

Yes. I recommend that all those schemes operate at a higher layer than 
the L3 forwarding, but do influence the routing protocol in substantial 
ways. So, I would say that when we do work on ICN routing protocols this 
is the place to dig in rather than the data plane forwarding/caching 
primitives. Some prior work has tried to do the whole thing adaptively 
through stochastic forwarding algorithms (e.g. 
https://arxiv.org/abs/1505.05259). I’d need serious convincing that 
this direction has a lot of promise, and likewise for its partner in 
crime, machine learning running in the data plane of switches to adapt 
forwarding and caching. I could easily be wrong here of course…

>   *   Section 6.5: great discussion here of the interplay between 
> network QoS and caching, from both consumers and producers.

Thanks!

>   *   Section 7 and 7.1: the principles list does include caching as 
> something needing further specification, but I wonder if 7.1 could do 
> more to comment on the impact of computational resources for 
> in-network compute (vs forwarding functions).
>

This dives straight into the core of what it means to have 
“in-network” computing as opposed to “on-network” computing. My 
own view is that if we don’t sort this out with reasonable 
architectural choices we’ll wind up recapitulating the 
less-than-stellar history of service chaining, VNFs, and the like.

In the specific context of ICN, for me some good guideposts are:

- If you need to transform data, you don’t do it in forwarders since 
the data is hashed and immutable
- If you need to look at payloads, you don’t do it in forwarders since 
the data is encrypted and you don’t have the keys
- If you want to compute in the same hardware box as the box that’s 
doing forwarding, that’s fine, but you are “on-network” rather 
than “in-network” since you are just a consumer/producer co-located 
with the forwarder.
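
One possible reading of these guideposts, as a minimal Python sketch. 
The Task fields, the classify function, and the three labels are all 
hypothetical names chosen for illustration; they are not from the 
QoSArch draft or from any ICN forwarder codebase.

```python
from dataclasses import dataclass


@dataclass
class Task:
    transforms_data: bool           # needs to change named data
    inspects_payload: bool          # needs to read (encrypted) payloads
    colocated_with_forwarder: bool  # runs in the same hardware box


def classify(task: Task) -> str:
    """Apply the three guideposts in order (illustrative only)."""
    # Data is hashed/immutable and payloads are encrypted, so neither
    # kind of work belongs in the forwarder itself.
    needs_endpoint = task.transforms_data or task.inspects_payload
    if not needs_endpoint:
        return "forwarder"
    # Same box as the forwarder: still just a consumer/producer.
    if task.colocated_with_forwarder:
        return "on-network"
    return "endpoint"
```

For example, classify(Task(True, False, True)) yields "on-network": a 
transformation running on the forwarder's box is treated as a 
co-located consumer/producer, not as part of the forwarding path.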

Now, I actually think this is a good division, since most of the things 
people are trying to embed in the packet forwarding path of switches are 
really hard to do if they involve anything other than affecting the 
forwarding (e.g. firewall, DDoS mitigations, etc.). Now it may be that 
certain important optimizations of the computing functions running 
“on-network” can be helped by pushing expensive operations down to 
the switch hardware, just as people have been doing for years with smart 
NICs. That, however, is not what I think people are proposing when they 
talk about “in-network” computing.

>
> My thoughts on this topic are not fully formed, but I thank you for 
> getting me to wonder aloud if this is the right document in which to 
> raise these issues or if a broader QoS technical discussion and 
> proposed guidelines should appear somewhere else. The reason why ICN 
> has been an interesting place for this discussion is because of the 
> close partnership with caching; my intuition is that if we can get QoS 
> to comprehend not only network QoS but caching QoS, then it might help 
> us to specify QoS and QoS tradeoffs among a range of other resources.
>
> Best regards,
> Eve
>
I hope this can generate some good discussion!

After going through these comments and thoughts, I don’t plan on 
making further changes to the current draft, so I’m going to submit it 
after waiting a day or two more for any further feedback on my proposed 
changes from the editorial comments.

Thanks again,

DaveO.