[icnrg] Some comments on draft-li-icnrg-damc
"David R. Oran" <daveoran@orandom.net> Thu, 03 August 2023 16:01 UTC
From: "David R. Oran" <daveoran@orandom.net>
To: ICNRG <icnrg@irtf.org>
Date: Thu, 03 Aug 2023 11:57:36 -0400
Archived-At: <https://mailarchive.ietf.org/arch/msg/icnrg/IzDyjLz_5Cgrt4ugrPGBmZk3FRo>
<Chair Hat off>

I finally got around to reading this. Sorry for the delay - been really busy with NSDI and ICN conference stuff. I have some comments, I hope you find them useful:

### Template

- This doesn’t read as a Standards Track document since it doesn’t define a specific protocol or API. You probably ought to re-classify it as **Informational**.

### Introduction

- RFC8763 is probably not the best reference to use when describing ICN. Perhaps RFC7927, or better a survey paper like https://ieeexplore.ieee.org/document/6231276
- Could you supply a reference to a paper or material showing the centralized controller as a performance bottleneck? That would be helpful, especially to quantify the problem so alternative distributed approaches could be compared. A couple of the systems I’m familiar with, like Ray (https://rise.cs.berkeley.edu/projects/ray/), don’t exhibit the controller as the prime bottleneck; rather, the bottleneck is the layout of the microservices done by the controller and how that interacts with the scheduling and congestion control of the network. Other systems may of course show the controller as the bottleneck.
- Ditto for scalability and reliability. Often this is ameliorated by having the controller itself be distributed with some coordination via consensus using Paxos or Raft.
- Isn’t the *Service Gateway* you define centralized? You don’t talk about it as a distributed component. Ditto for the Service Mesh Communication Scheduling Center.

## Terminology

- It would be helpful to be more specific about what parts of ICN you are actually adopting. Specifically:
  - What naming scheme? Hierarchical? Flat? Administered how?
  - What routing? Is it directly name based or via some NRS (Name Resolution Service) translated to locators?
  - Per-object security using packet signatures or something else? You seem to base source origin authentication on Prefixes, which could be fine, but is not what the popular extant ICN protocols do.
- ICN is not “defined” in RFC7927.
  In fact, since here you are defining terminology, a more useful reference would be to RFC8793.
- Similarly, RFC9344, while an important reference for how to instrument and manage a CCNx-based network, isn’t the correct reference for a definition of FIB or RIB. In fact, in general those terms apply to just about any routing/forwarding system, and hence likely don’t need a reference, only an expansion of the acronyms.
- In contrast, you probably ought to define what you consider a “Pod”, especially what characteristics you think it has compared to, for example, a “rack” or a “failure domain” in a data center network.

## Key process / distributed service mesh architecture

- Small typo in the section title - missing space “thedistributed”
- I don’t understand why you say “service gateway linked to each of the pods of the service identification information owned by the Pod”. In general I would not think of information as being “owned” by a Pod. Do you mean stored data that is hosted on a particular Pod? Ownership, on the other hand, is an authentication/authorization concept.
- How are authorizations constructed and communicated? What’s the data structure? What’s the protocol?
- What is “topology link information”? Is that the internal topology graph of a Pod? The connectivity of the Pod to other Pods? Something else? In the same paragraph, it isn’t at all clear who is computing the FIB and how. Is this done by the Service Router, Service Gateway, both? By what routing algorithm/protocol?
- It’s not clear what you mean by “detection”. What are you detecting?

## Key communication message types & functions

- This section is very abstract. Is the intention to evolve this into an actual protocol design? If so, there are lots of open questions:
  - What is the transport for these messages? NDN? CCNx? MQTP? HTTPS? Something else?
  - How are failures of the various components detected, communicated to other components, and recovered from?
  - How is the initial state instantiated that defines the structure of a DAMC instance?
- The material on service measurement is very general, and it’s unclear what timescales it is intended to operate over. In datacenters, workloads typically evolve unpredictably and wildly over timescales as short as a few seconds. That’s why it’s particularly challenging, and it isn’t clear how your general approach deals with that, or whether the existing datacenter measurement tools are not a better choice than something bespoke.
- How does “the entire service mesh” relate to the Pod material above? Is it over a single Pod or many Pods, and how does this relate to the existing management of failure domains in a datacenter, not to mention the distribution of workloads across data centers, or even in hybrid multi-cloud scenarios? There are a ton of existing approaches for all of these (e.g. Istio - https://istio.io, Linkerd, Consul); it would really help to reference and cite them to better justify the creation of something new.

## Operation process…

- You say traffic management is a forwarding plane function. That isn’t the normal use of the term. Do you mean packet error control, flow control, and congestion control?
- I’m still befuddled by how you are using authentication/authorization terminology and functions. You need to be a whole lot more specific here, especially in how trust is managed and how things that need transitive or delegated authorization work.
- You say “service gateway, it selects the best path and the next hop according to the rules in the routing information base (RIB) and the Forwarding information base (FIB) to forward the packet to the target Microservices.” Typically individual message/packet forwarding does not require consulting a RIB. If you look at the RIB, why bother computing a FIB in the first place?
- You say “By participating in this authentication process, the service gateway ensures that only legitimate Pods with authorized service prefixes are granted access to the network.” Really? I would expect Pods to need network access for a ton of things unrelated to running a service mesh instance or instances…
- You start here talking about “Class 2” and “Class 3” signaling. Are those the numbered categories in figure 4 or something else? These descriptions confused me.
- Why do you require computing a link state database (“Link State Database (LSDB) is generated between the Service Gateway and the Service Router”)? Do you mean a topology map? That could be represented in many ways. An LSDB is one particular form, and not necessarily the best/richest form.
- Coming back to the crucial naming question, how are named entities like “Service B/4” bound to locations, and who is responsible for maintaining and updating the binding (ignoring the possibility of things like process or VM or container migration)?
- “When the service gateway (SG-4) receives a communication message, it will pass the message to the Podcast” Podcast? What’s that?

In summary, many thanks for submitting this, but it needs a lot of work in both exposition and in justifying why this approach is in fact better than what we have in production today, or even research prototypes employing ICN protocols like CFN (https://dl.acm.org/doi/10.1145/3357150.3357395)

DaveO
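
To make the RIB versus FIB comment above concrete, here is a minimal sketch, not taken from the draft. It assumes hierarchical, slash-separated names (the draft does not say which naming scheme it uses) and a simple lowest-cost route selection. The prefixes, next-hop labels, costs, and function names are all hypothetical. The point it illustrates is that the control plane reduces the RIB to a FIB once, and per-packet forwarding then consults only the FIB.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical RIB: every candidate route per name prefix, as (next_hop, cost).
Rib = Dict[str, List[Tuple[str, int]]]
# Hypothetical FIB: only the selected next hop per name prefix.
Fib = Dict[str, str]


def build_fib(rib: Rib) -> Fib:
    """Control plane: pick the lowest-cost candidate for each prefix, once."""
    return {
        prefix: min(routes, key=lambda route: route[1])[0]
        for prefix, routes in rib.items()
        if routes  # skip prefixes that currently have no candidate route
    }


def lookup(fib: Fib, name: str) -> Optional[str]:
    """Forwarding plane: longest-prefix match on name components, FIB only."""
    components = name.strip("/").split("/")
    for end in range(len(components), 0, -1):
        prefix = "/" + "/".join(components[:end])
        if prefix in fib:
            return fib[prefix]
    return None  # no matching prefix: the packet cannot be forwarded


# Assumed example data; the labels only echo the style of names in the draft.
rib: Rib = {
    "/serviceB": [("SR-1", 20), ("SR-2", 10)],
    "/serviceB/4": [("SG-4", 5), ("SR-2", 15)],
}
fib = build_fib(rib)

print(lookup(fib, "/serviceB/4/status"))  # SG-4 (longest match is /serviceB/4)
print(lookup(fib, "/serviceB/7"))         # SR-2 (falls back to /serviceB)
print(lookup(fib, "/serviceC/1"))         # None (no route)
```

In this sketch `lookup` never touches `rib`, which is the usual division of labor: if forwarding had to read the RIB per packet, precomputing the FIB would buy nothing.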