[Teas] Offline discussions of scalable IETF network slicing

Adrian Farrel <adrian@olddog.co.uk> Tue, 25 July 2023 21:13 UTC

Reply-To: adrian@olddog.co.uk
From: Adrian Farrel <adrian@olddog.co.uk>
To: 'TEAS WG' <teas@ietf.org>
Date: Tue, 25 Jul 2023 22:13:06 +0100
Organization: Old Dog Consulting
Message-ID: <04b201d9bf3c$cfcad5e0$6f6081a0$@olddog.co.uk>
Archived-At: <https://mailarchive.ietf.org/arch/msg/teas/c7EAE3DG1rQcLmFu_IaXypj5qFc>
Subject: [Teas] Offline discussions of scalable IETF network slicing
List-Id: Traffic Engineering Architecture and Signaling working group discussion list <teas.ietf.org>

Hi TEAS,

Over the last couple of months a self-selecting group has been chatting and
worrying about scalability in (IETF) network slicing solutions. What will
the impacts be on the existing routing infrastructure and protocols? Will
proposed approaches be able to support the scale required? Etc.

We came up with some text that describes the problems and guides where we
think the work should go. It's just our ideas and a first step towards the
WG agreeing what the parameters are.

As I understand it, this is not intended to specify requirements, but to
help people decide which solutions to put their energy into, and to
understand the implications of deploying solutions in different scenarios
and with different objectives.

Jie has a slot in the meeting to talk about his NRP scaling draft. He plans
to cover this text, but it is quite wordy and makes for a long presentation.
So, if you care about this stuff, please read the text (below) and bring
your thoughts to the meeting and/or the list.

Cheers,
Adrian (only from me cos I was left holding the pen)

=====

Network slicing uses a hierarchy of aggregation in order to achieve
scalability. Multiple slices are supported by an NRP; multiple
NRPs are enabled on a filtered topology; multiple filtered topologies
utilise a single underlying network. The hierarchy at any stage may be
made trivial (i.e., collapsed) according to the deployment objectives
of the operator and the capabilities of the network technology.

To recap, and in general terms:
- The network slice is an edge-to-edge service.
- The NRP (Network Resource Partition) is a set of network resources
  (e.g., buffers, bandwidth, queues) and assigned per-packet behaviors.
- The filtered topology is a set of network resources (call it a
  virtual network if it makes it easier for you to think about) on
  which path computation or traffic steering can be performed.
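
As a rough illustration of the aggregation hierarchy described above,
here is a minimal Python sketch. The class names, field names, and the
example values (node names, slice names, "low-latency") are all
hypothetical and are not taken from any draft. It only shows that many
slices can map to few NRPs, which in turn map to fewer filtered
topologies over one underlay.

```python
from dataclasses import dataclass, field


@dataclass
class FilteredTopology:
    """A subset of the underlay on which paths can be computed."""
    name: str
    links: frozenset  # of (node_a, node_b) tuples taken from the underlay


@dataclass
class NRP:
    """A set of network resources and per-packet behaviors."""
    nrp_id: int
    topology: FilteredTopology
    slices: list = field(default_factory=list)  # names of edge-to-edge slices


def build_example():
    """Build a tiny, made-up network: 5 slices, 2 NRPs, 1 filtered topology."""
    underlay = frozenset({("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")})
    low_latency = FilteredTopology(
        "low-latency", frozenset({("A", "C"), ("C", "D")})
    )
    # A filtered topology must be a subset of the underlying network.
    assert low_latency.links <= underlay
    gold = NRP(1, low_latency, ["slice-1", "slice-2", "slice-3"])
    silver = NRP(2, low_latency, ["slice-4", "slice-5"])
    return underlay, [low_latency], [gold, silver]


def state_counts(topologies, nrps):
    """Count the objects at each level of the hierarchy."""
    return {
        "slices": sum(len(n.slices) for n in nrps),
        "nrps": len(nrps),
        "filtered_topologies": len(topologies),
    }


underlay, topologies, nrps = build_example()
counts = state_counts(topologies, nrps)
```

Collapsing a level of the hierarchy would correspond to a one-to-one
mapping at that level, for example one slice per NRP.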

Scalability concerns exist at multiple points in the solution:

- The control protocols must be able to handle the distribution of
  information necessary to support the slices, NRPs, and filtered
  topologies.
- The network nodes must be able to handle the computational load of
  determining paths.
- The forwarding engines must be able to access the information in
  packets and make forwarding decisions at line speed.
- Path selection tools must be able to process network information and
  determine paths on demand.
 
Assuming that it is achievable, it is desirable for NRPs to have no
more than a small impact (zero being preferred) on the IGP information
that is propagated, and not to require additional SPF computations
beyond those that are already required.
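
One way to picture "no additional SPF computations" is a cache keyed on
the filtered topology rather than on the NRP. The following Python
sketch is only an illustration under that assumption: the SpfCache
class, the topology name "ft-1", the link costs, and the 100 NRP IDs are
made up, and a plain Dijkstra stands in for whatever SPF the IGP
actually runs.

```python
import heapq


def spf(links, source):
    """Plain Dijkstra over undirected {(u, v): cost} links.

    Returns a dict of lowest cost from source to each reachable node.
    """
    adj = {}
    for (u, v), cost in links.items():
        adj.setdefault(u, []).append((v, cost))
        adj.setdefault(v, []).append((u, cost))
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a better cost was already found
        for nbr, cost in adj.get(node, []):
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return dist


class SpfCache:
    """Runs SPF once per (filtered topology, source) and shares the result."""

    def __init__(self, topologies):
        self.topologies = topologies  # topology name -> links
        self.runs = 0                 # number of SPF computations performed
        self._results = {}

    def result_for(self, topology_name, source):
        key = (topology_name, source)
        if key not in self._results:
            self.runs += 1
            self._results[key] = spf(self.topologies[topology_name], source)
        return self._results[key]


# Hypothetical data: one filtered topology shared by 100 NRPs.
topologies = {"ft-1": {("A", "B"): 1, ("B", "C"): 1, ("A", "C"): 5}}
nrp_to_topology = {nrp_id: "ft-1" for nrp_id in range(1, 101)}

cache = SpfCache(topologies)
for nrp_id, topo in nrp_to_topology.items():
    dist = cache.result_for(topo, "A")
```

In this sketch the number of SPF runs depends on the number of filtered
topologies, not on the number of NRPs or slices.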

Assuming that external mechanisms can deal with path selection, NRP
identification should be decoupled from forwarding decisions for
packets.

Given that, we can set out the following design principles.

1) A filtered topology is a subset of the underlying physical topology.
   Thus, it defines which links (and nodes) are eligible to be used by
   the NRPs. It may be selected as a set of links with particular
   characteristics, or it may be a set of forwarding paradigms applied
   to the topology. Thus, a filtered topology may be realised through
   multi-topology techniques (such as colored links), as a virtual TE
   topology, or using flex-algo.

2) It is not envisaged that there would be many filtered topologies
   active, and running SPF per filtered topology is not a high burden.

3) Multiple NRPs can run on a single filtered topology, meaning that the
   NRPs can be associated with the same filtered topology and use that 
   topology's SPF computation results.

4) Three separate things need to be identified by information carried
   within a packet:
   - path
   - NRP
   - topology (i.e., filtered topology)
   How this information is encoded (separate fields, same field,
   overloading existing fields) forms part of the solution work.

5) NRP IDs should have domain-wide scope, and must be unique within a
   filtered topology.

6) Configuration mechanisms are used to set up packet / resource
   treatments on nodes.

7) Configuration mechanisms (such as southbound protocols from a
   controller) are used to install bindings on network nodes between
   domain-wide resource treatment identifiers (NRP IDs) and configured
   packet treatment as per (6).

8) The path selection performed by or within a traffic engineering
   process, within or external to the head-end node (in particular,
   the topology selection and path computation within that topology),
   may consider the characteristics of the filtered topology and the
   attributes of the NRP, but is agnostic to the resource treatment
   that the packets will receive within the network. Ensuring that
   the selected components of the path that are configured are capable
   of supporting the resource treatments identified by the NRP ID is
   a separate matter.

9) The selected path is indicated in the packets using existing or new
   mechanisms. Whether that is SR-Policy (for some variety of SR),
   flex-algo (for whatever flex-algo expression you like), or even
   experiments like CRH, is something out of scope for now, but it will
   obviously form part of the full set of solution specifications.

10) The components or mechanisms that are responsible for deciding what
   path to select, for deciding how to mark the packets to follow the
   selected path, and for determining what resource treatment identifier
   (NRP ID) to apply to packets are also responsible for ensuring
   sufficient consistency so that the whole solution works.

11) Different packet transport mechanisms may use different means to
   carry the NRP ID.  The writeup we need at this stage is agnostic to
   that.
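
To make principles (4) to (7) a little more concrete, here is a minimal
Python sketch of a node that keeps the forwarding decision separate from
the resource treatment. Everything named here (PacketIds, Node,
check_nrp_ids, "gold-queue", "path-42", "ft-1", the bandwidth parameter)
is hypothetical, and the sketch says nothing about how the three
identifiers are encoded in a real packet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PacketIds:
    """The three things a packet identifies, per principle (4)."""
    path_id: str      # however the selected path is indicated
    nrp_id: int       # domain-wide resource treatment identifier
    topology_id: str  # filtered topology


class Node:
    def __init__(self, name):
        self.name = name
        self.treatments = {}    # treatment name -> local parameters, per (6)
        self.nrp_bindings = {}  # NRP ID -> treatment name, per (7)
        self.next_hops = {}     # (topology_id, path_id) -> next hop

    def configure_treatment(self, treatment, **params):
        """Set up a packet / resource treatment on this node."""
        self.treatments[treatment] = params

    def bind_nrp(self, nrp_id, treatment):
        """Bind a domain-wide NRP ID to an already configured treatment."""
        if treatment not in self.treatments:
            raise ValueError(
                f"treatment {treatment!r} not configured on {self.name}"
            )
        self.nrp_bindings[nrp_id] = treatment

    def forward(self, ids):
        # The forwarding decision does not consult the NRP ID ...
        next_hop = self.next_hops[(ids.topology_id, ids.path_id)]
        # ... and the resource treatment does not consult the path.
        treatment = self.nrp_bindings[ids.nrp_id]
        return next_hop, treatment


def check_nrp_ids(assignments):
    """Principle (5): an NRP ID appears at most once per filtered topology.

    assignments is an iterable of (nrp_id, topology_id) pairs.
    """
    seen = set()
    for nrp_id, topology_id in assignments:
        if (nrp_id, topology_id) in seen:
            raise ValueError(f"duplicate NRP ID {nrp_id} in {topology_id}")
        seen.add((nrp_id, topology_id))
    return True


node = Node("P1")
node.configure_treatment("gold-queue", bandwidth_mbps=500)
node.bind_nrp(7, "gold-queue")
node.next_hops[("ft-1", "path-42")] = "P2"
result = node.forward(PacketIds("path-42", 7, "ft-1"))
ids_ok = check_nrp_ids([(7, "ft-1"), (8, "ft-1")])
```

The point of keeping two separate lookups is that adding an NRP only
adds a binding entry; it does not add forwarding entries.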
 
The result of this is that different operators can choose to deploy
things at different scales, and while we may have opinions about what
scales are sensible / workable / desirable, we do not have to get WG 
agreement on that aspect.

The routing protocols (IGP or BGP) do not need to be involved in any of
these points, and it is important to isolate them from these aspects
in order that there is no impact on scaling or stability. Furthermore,
the complexity of SPF in the control plane is unaffected by this.

Note that there is always a trade-off between optimal solutions and
scalable solutions.
- We need to achieve a scalable solution that can be deployed in all
  circumstances. We should acknowledge that:
  - We may need some extensions to the data/control/management plane
    to achieve this result. I.e., it may be that this cannot be done
    today with existing tools.
  - The scalable solution might not be optimal everywhere.
- We must understand that optimal solutions are good for specific
  environments, but:
  - Might not work in other environments
  - May have scalability issues.
We should allow for both approaches, but we need to be clear about the
costs and benefits in all cases in order that:
- We support significant optimisations.
- We do not let non-scalable solutions creep into wider deployment.
In particular, we should be open to the use of approaches that do not
require control plane extensions and that can be applied to deployments
with limited scope. Included in this are:
- resource SIDs
- L3VPN