Re: [Isis-wg] Some comments on draft-white-openfabric-02
Jeff Tantsura <jefftant.ietf@gmail.com> Wed, 12 April 2017 18:33 UTC
Date: Wed, 12 Apr 2017 11:33:54 -0700
Subject: Re: [Isis-wg] Some comments on draft-white-openfabric-02
From: Jeff Tantsura <jefftant.ietf@gmail.com>
To: Erik Auerswald <auerswald@fg-networking.de>, Russ White <7riw77@gmail.com>, rtgwg@ietf.org
Message-ID: <8C49C6EC-C129-452E-AF3B-35C74611F51E@gmail.com>
Thread-Topic: [Isis-wg] Some comments on draft-white-openfabric-02
References: <20170412085018.GA29441@fg-networking.de>
In-Reply-To: <20170412085018.GA29441@fg-networking.de>
Mime-version: 1.0
Content-type: text/plain; charset="UTF-8"
Content-transfer-encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/rtgwg/X_9H0N0SpnsSJmoLkhUD-phdTMQ>
X-BeenThere: rtgwg@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: Routing Area Working Group <rtgwg.ietf.org>
Erik,

I have added your email address to the RTGWG list, so you are now
allowed to post there.

Cheers,
Jeff

On 4/12/17, 01:50, "Isis-wg on behalf of Erik Auerswald" <isis-wg-bounces@ietf.org on behalf of auerswald@fg-networking.de> wrote:

Hi all,

I have read draft-white-openfabric-02 and would like to comment on a
few points. I'll start at the top of the draft and continue through
the text. Please keep my e-mail address in replies, because I am not
subscribed to the isis-wg and rtgwg mailing lists.

1. The abstract states "[...]topology information is extracted through
broad based connections." I do not understand that sentence.

2. Section 1.1., Goals, mentions large scale data centers. Would it be
appropriate to reference RFC 7938, Use of BGP for Routing in
Large-Scale Data Centers, here? Said RFC proposes a Clos topology for
the network, which seems to be similar to the spine and leaf topology
of openfabric.

3. In section 1.3., Simplification, I noticed a spelling mistake:
mutliaccess (should be multiaccess).

4. In section 1.5., Sample Network, a spine and leaf network is shown
in figure 1. The topology shown in that figure is different from the
5-stage Clos topology shown in RFC 7938, figure 3. The 5-stage Clos
topology from RFC 7938 represents the network topology used by
Facebook for the Altoona data center, as publicized in
https://code.facebook.com/posts/360346274145943/introducing-data-center-fabric-the-next-generation-facebook-data-center-network/

Another generalization of the 3-stage Clos network to more than three
stages, called a Beneš network, can be found on Wikipedia:
https://en.wikipedia.org/wiki/Clos_network#Clos_networks_with_more_than_three_stages

Both of these 5-stage networks differ from figure 1 of the openfabric
draft insofar as each T2 switch is connected to a proper subset of T1
switches (openfabric designation) in both the RFC 7938 "Clos" topology
and the Beneš network.
This is crucial for increasing the number of input and output ports
without using bigger switches. Since this is important for later
comments, I have adapted figure 3 from RFC 7938 into the following
drawing:

       +----+               +----+
       |L1.1|               |L1.2|         (T0)
       +----+               +----+
        |  \________________ /  |
        |  ________________\/   |
        | /                 \   |
       +----+               +----+
       |F1.1|               |F1.2|         (T1)
       +----+               +----+
        /  \                 /  \
       /    \               /    \
    +----+ +----+        +----+ +----+
    |S1.1| |S1.2|        |S2.1| |S2.2|     (T2)
    +----+ +----+        +----+ +----+
       \    /               \    /
        \  /                 \  /
       +----+               +----+
       |F2.1|               |F2.2|         (T1)
       +----+               +----+
        |  \________________ /  |
        |  ________________\/   |
        | /                 \   |
       +----+               +----+
       |L2.1|               |L2.2|         (T0)
       +----+               +----+

    Legend:
      Lx.y: Leaf switches (a.k.a. Top of Rack (ToR) switches)
      Fx.y: Fabric switches
      Sx.y: Spine switches

    Inter-switch connections:
      Lx.y is connected to Fx.*
      Fx.y is connected to Lx.* and Sy.*
      Sx.y is connected to F*.x

    Figure 2: 5-Stage Clos Topology (adapted from [RFC7938], Figure 3)

I have used the name "Fabric switch" similar to Facebook's use of that
name in the above referenced blog post, just to have distinct names
and single letter abbreviations for each tier.

A reference to RFC 7938, section 3.2, Clos Network Topology, would fit
into this section.

5. It might be appropriate to mention the use of timeouts and
exponential back-off for initial adjacency formation in section 2.
For example, a switch could try all discovered neighbors sequentially
and use exponentially increasing random timeouts for subsequent rounds
until the first adjacency is formed. A "Happy Eyeballs" (RFC 6555)
like approach of trying to form two adjacencies with a slight delay
in-between might be nice as well.

6. Section 3., Determining Location on the Fabric, relies on the
special topology from figure 1 of the openfabric draft. In both Beneš
networks and the topology shown in figure 2 (of this mail), FD == TD
and TD == 4 holds for non-T0 switches. One example is S1.1 from
figure 2. It can be easily seen from that figure that for all switches
in that topology FD == TD == 4.
Thus the algorithms from sections 3.1., Determining T0, and 3.2.,
Determining T1 and above, do not work for general fabric topologies.

7. The algorithm described in section 4, Flooding Optimization, does
not work for the 5-stage "Clos" topology (see figure 2). An example
for this is a change that pertains to just switches S1.1 and F1.1 in
figure 2 (e.g. a link between these two switches fails). Because the
T0 switches Lx.y receive the LSPs as DNR, the LSPs do not reach
switches Fx.2 and S2.y during flooding. The failure recovery mechanism
of section 4.1., Flooding Failures, is then needed to propagate the
LSPs by design, but this is clearly thought of as a backup mechanism
that is not needed for normal operation.

8. Section 5.1., Transit Link Reachability, would benefit from a
reference to RFC 5837, Extending ICMP for Interface and Next-Hop
Identification.

9. Section 6., Openfabric and Route Aggregation, should disallow route
summarization. Otherwise, without intra-tier links, the failure of a
single link will result in traffic black-holing. See e.g. RFC 7938,
sections 8.2. and 8.2.1. But intra-tier links are disallowed in
section 1.5, Sample Network. Since the reason for disallowing
intra-tier links, topology auto-detection, is not yet solved (see
comment 6. above), you might allow the combination of intra-tier links
and route summarization. I would prefer disallowing both for
openfabric, because the added complexity of route summarization and
its effects on resiliency in the case of failures seem a bad trade-off
for the reduced routing table size.

Thanks for reading this far. :-)

Best regards,
Erik
--
Dipl.-Inform.
Erik Auerswald
http://www.fg-networking.de/
auerswald@fg-networking.de
T:+49-631-4149988-0
M:+49-176-64228513

Gesellschaft für Fundamental Generic Networking mbH
Geschäftsführung: Volker Bauer, Jörg Mayer
Gerichtsstand: Amtsgericht Kaiserslautern - HRB: 3630
_______________________________________________
Isis-wg mailing list
Isis-wg@ietf.org
https://www.ietf.org/mailman/listinfo/isis-wg
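
Editorial illustration for comment 6 of the quoted message: the claim that every switch in the 5-stage topology sees the same maximum distance of 4 can be checked mechanically. The following is a minimal sketch, not part of the original mail. It builds the graph from the three "Inter-switch connections" rules in the legend and computes, for each switch, the hop count to its farthest switch. Treating that hop count as what the draft calls FD and TD is an assumption made here; the function names (build_figure2, hop_distances, eccentricity) are hypothetical.

```python
from collections import deque
from itertools import product


def build_figure2():
    """Adjacency sets for the 5-stage topology, built from the legend rules."""
    adj = {}

    def link(a, b):
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    for x, y, z in product((1, 2), repeat=3):
        link(f"L{x}.{y}", f"F{x}.{z}")  # Lx.y is connected to Fx.*
        link(f"F{x}.{y}", f"S{y}.{z}")  # Fx.y is connected to Sy.*
    return adj


def hop_distances(adj, source):
    """Breadth-first search: hop count from source to every reachable switch."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def eccentricity(adj, source):
    """Hop count from source to its farthest switch (assumed stand-in for FD/TD)."""
    return max(hop_distances(adj, source).values())


adj = build_figure2()
ecc = {node: eccentricity(adj, node) for node in sorted(adj)}
for node, value in ecc.items():
    print(node, value)
```

Under these assumptions every one of the twelve switches reports the same value, so a distance comparison alone cannot tell the tiers apart in this topology, which is the point the comment makes.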