Re: [Lsr] Thoughts on the area proxy and flood reflector drafts.

Tony Przygienda <tonysietf@gmail.com> Wed, 10 June 2020 16:41 UTC

From: Tony Przygienda <tonysietf@gmail.com>
Date: Wed, 10 Jun 2020 09:40:02 -0700
Message-ID: <CA+wi2hOPEP=xT34QVV=ZYAg3ou1=gsg1n=5x3ZQXy_p84dt65A@mail.gmail.com>
To: Christian Hopps <chopps@chopps.org>
Cc: lsr@ietf.org
Content-Type: multipart/alternative; boundary="000000000000e7c02905a7bd8184"
Archived-At: <https://mailarchive.ietf.org/arch/msg/lsr/ww_4iTEh5Sl4KdoF7R9zWPC5AO4>
Subject: Re: [Lsr] Thoughts on the area proxy and flood reflector drafts.

On Wed, Jun 10, 2020 at 4:27 AM Christian Hopps <chopps@chopps.org> wrote:

>
>
> > On Jun 9, 2020, at 10:01 PM, Tony Przygienda <tonysietf@gmail.com>
> wrote:
> >
> > Chris (addressing in WG member context you declared), I reply tersely
> since we will put more work into the draft once it's adopted (for which I
> think you saw a good amount of support in two threads already).
> >
> > I deferred from your email since the chain-terrasteam topology you're
> showing is simply not what we are dealing in any operational, successful
> networks today AFAIK frankly and I saw lots of "assume complexity" and
> "dislikes" in your email which I didn't read as technical arguments but
> mental attitudes. Likes or dislikes and assumptions are fine but we should
> probably focus on existing network/customer technical + operational
> arguments & requirements when building solutions and not what you or we
> like first.
>
> Both Area Proxy and Flood Reflector are proposing to use L1 areas as
> transit to connect L2, isn't that chaining? It seemed like a decent way to
> help visualize the proposals along with some numbers, perhaps you have
> something better...
>
> The Area Proxy draft is making everything L2 and using the L1 areas to
> redefine the advertised topology information to allow it to scale. Because
> everything is ultimately L2 nothing changes in the data plane to provide
> this transit.
>
> The flood reflector draft is keeping the L1-only abstraction so it has to
> provide for transit some other way.
>
> > So trying to extract the technical point you seem to be making in between
> all that
> >
> > a) I see how you can try to have a mental model of "virtual links". What
> we suggest are not virtual links (I implemented VL @ least once but it's so
> long ago I forgot pretty much all the details so had to look stuff up :-) ).
> Virtual links in OSPF were "magic bits on LSAs" that kind of computed "SPF
> reachability through the area to change SPF" edge-to-edge and the
> asynchronicity of all that flooding-being-tunnels-being-SPF was playing
> havoc on us @ this time of 300MHz CPUs + frankly, the complexity of that
> was not needed @ this time just as partition healing was never implemented
> in ISIS. That's why it never went anywhere much, my take, others may
> correct. Saying "virtual links are bad" and "this is virtual links to me so
> it's bad" is simply a "strawman fallacy" to me frankly. This draft suggests
> (but as I wrote Bruno as answer to his fairly deep email) to run proper
> flooding over proper tunnels (we run routing over tunnels all the time in
> all kind of scenarios be it BGP proper or SD-WAN or overlays obviously) but
> if you choose FRs to be one hop away you can get away without any "L2
> tunneled adjacencies", that's deployment choice.
>
> If you are not using tunnels, but are still trying to provide transit
> through the L1 area for L2, that is exactly what OSPF virtual links are
> doing. Part of making those work is the advertisement of reachability into
> the area from another, changes to router advertisements (to indicate if the
> area is transit capable), and changes to the SPF calculation to modify
> route choices based on whether they are based on virtual links or not.
>
> The complexity of OSPF virtual links is there to make them work, right?
> I'm not trying to make any argument, strawman or otherwise; I'm just trying
> to understand what is being proposed as a replacement for the full-mesh of
> tunnels.
>
> It's nice if it reduces to OSPF virtual links b/c we then have an example
> of how to actually implement it, and years of experience to understand it.
> If that highlights that getting the non-tunneled choice right isn't easy,
> well I guess that's important, right?
>
>
Hmmm, AFAIR the implementation of OSPF virtual links had no tunnel at all
(and that's how I remember implementing it back then). I cite RFC 2328:

"

The InterfaceUp event occurs for
    a virtual link when its corresponding routing table entry becomes
    reachable.  Conversely, the InterfaceDown event occurs when its
    routing table entry becomes unreachable.  In other words, the
    virtual link's viability is determined by the existence of an
    intra-area path, through the Transit area, between the two
    endpoints.
"

and that was largely the problem: flooding+SPF+routing table was basically
what kept the adjacency up, which made it very flappy/circular, instead of
a proper stable tunnel with hellos. Did you implement and run virtual links
in OSPF over tunnels? Which type?
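
To make the circularity in the cited text concrete, here is a minimal sketch of a virtual link whose up/down state is derived purely from intra-area reachability after each SPF run, with no hellos of its own. The names (`VirtualLink`, `on_spf_run`, `reachable`) and the adjacency-dict topology format are assumptions for illustration only; they are not taken from any OSPF implementation or from the drafts under discussion.

```python
from collections import deque


def reachable(adj, src, dst):
    """Breadth-first search: is there any intra-area path src -> dst?"""
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nbr in adj.get(node, ()):
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return False


class VirtualLink:
    """Hypothetical model: link state follows the routing table only."""

    def __init__(self, local, remote):
        self.local, self.remote = local, remote
        self.up = False

    def on_spf_run(self, transit_adj):
        """Return the interface event produced by this SPF run, or None."""
        now_up = reachable(transit_adj, self.local, self.remote)
        event = None
        if now_up and not self.up:
            event = "InterfaceUp"
        elif self.up and not now_up:
            event = "InterfaceDown"
        self.up = now_up
        return event


# Transit area A - B - C, virtual link between A and C.
area = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
vl = VirtualLink("A", "C")
print(vl.on_spf_run(area))                                   # InterfaceUp
print(vl.on_spf_run({"A": ["B"], "B": ["A"], "C": []}))      # InterfaceDown
```

Every topology change inside the transit area re-evaluates the link, and the link's own state feeds back into what gets flooded and computed next. That feedback is the circular dependency described above, in contrast to a tunnel whose liveness is tracked independently by hellos.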

And having run a good amount of routing over tunnels, I have yet to see all
those dragons you are summoning. GRE, incl. soft GRE, is very widely deployed
and works well, as do the other tunnel types I have worked with.

A valid point of discussion is how many adjacencies you can keep up to the
edge. My experience is hundreds, which for all practical purposes is a very
good scale; otherwise we can relax the draft and build a "flood reflector
hierarchy". I'm open to that if it's a real concern. Ultimately, running
more than hundreds is largely a question of relaxing timers, which is
relatively easily done.
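
As a rough illustration of the adjacency-scale point, the arithmetic below compares a full mesh of L2 adjacencies between edge routers with a design where each edge router peers only with a few flood reflectors. It assumes every edge router peers with every reflector and that reflectors do not peer with each other. The function names and the sample sizes are hypothetical, not figures from the draft.

```python
def full_mesh_adjacencies(edges):
    """Adjacencies needed if every edge router peers with every other one."""
    return edges * (edges - 1) // 2


def flood_reflector_adjacencies(edges, reflectors):
    """Adjacencies needed if every edge router peers with each reflector."""
    return edges * reflectors


# Assumed sample sizes; 3 reflectors matches the "2-3 FRs" mentioned earlier.
for n in (10, 100, 300):
    print(n, full_mesh_adjacencies(n), flood_reflector_adjacencies(n, 3))
# 10 45 30
# 100 4950 300
# 300 44850 900
```

Under these assumptions each reflector terminates one adjacency per edge router, so the "hundreds" figure above is the per-reflector load, while the total grows linearly rather than quadratically with the number of edge routers.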

> > b) Generally we may seem a bit muddled between different types of
> "tunnels" and "tunnels are bad" and "lots of tunnels". The draft talks
> about 2 types of tunnels and it seemed to be written clearly enough to
> distinguish that easily based on feedback I got so far
> >
> > i) control plane tunnels are proper L2 entities (again, if your FR is
> one hop away from leafs then you don't need any tunneling but can run
> normal L2 adjacencies which I hope are not too scary; whole thing is really
> equivalent to BGP RR, do you put it in path, do you want to run multi-hop
> and how confident are you in your lower level infra not dropping TCP
> "tunnels" under you; every day's business since years really). So, no,
> there is no magic and hidden complexity and whatever not, you may or may
> not use auto-discovery the draft provides (you can just build something
> completely statically configured and in fact several customers told me
> they'd prefer it that way just as they don't auto-discover RRs normally) to
> build bunch of tunnels towards your 2-3 FRs from your edges and you're in
> business after L2 adjacency comes up. Looks like any old ISIS + some
> optional TLVs you can ignore that indicate for some smart future folks to
> know it's a FR adjacency and not "real L2 adjacency". No fork-lifting of
> whole cluster, no fork-lifting routers outside a cluster, no single point
> of failure under some fancy new name, barely any protocol changes (in fact
> I didn't see any proposal to run anything more minimal than that except
> maybe very simple version of TTZ which however has too many L2 adjacencies
> exposed @ any reasonable cluster size to really solve the problem of amount
> of L2 information sloshed around)
>
> I think I understand the flood reflector bit; It's replacing the network
> of (now data-plane only) tunnels in flooding graph and the advertised
> topology.
>
> > ii) data plane tunnels. The draft basically explains that for the
> solution to work in a simple fashion, full-mesh of data forwarding tunnels
> can be established (which are NOT visible in L2) as shortcuts that allow to
> utilize all paths through L1 and that will work fine since it doesn't spill
> into L2. You want to run L1 adjacencies over those tunnels if you care
> whether they are up but you could do just BFD e.g. and use them as
> forwarding next-hops in the computation without them being visible in L1
> ISIS. The other option is to not use such a data plane mesh and use
> reachability instead and we can explain that further in detail after
> adoption and we get more people talking through that etc. or you can look @
> my preso @ last IETF where I kind of quickly ran through that (and it
> seemed relatively obvious to me how it works). In summary, we chose to do
> real work rather than polish optional points in individual drafts because,
> frankly, customers are not interested all that much often whether IETF WG
> feels like working on it while they have a pressing problem and need
> solutions in a timely manner. And AFAIR the chairs guided the group
> multiple times towards "ignore the problem, of no importance" and now with
> a certain urgency want to have everything @ the same time.
>
> So this is creating a full-mesh overlay network of tunnels between the
> edge routers on the L1 area. I don't think you would want to advertise that
> overlay network back into the underlay network (L1 area) so you can't just
> form up L1 adjacencies (unmodified) to determine if they are up or not,
> some other mechanism would have to be used. OSPF virtual links use the
> intra-area connectivity to "v-bit" routers to determine if the area is
> transit capable -- or perhaps BFD as you suggest.
>
> I guess the question is can there be topological control-plane
> connectivity to the flood reflector, but not data-plane overlay network
> tunnel connectivity?
>
> The draft is also saying you suppress advertising the overlay network in
> L2 as well b/c it is represented instead by the advertised topology created
> by using the flood reflector. So the overlay network is represented, but
> unadvertised itself. This suppression of advertising the overlay tunnel
> network seems similar to how the L12 topology inside an area proxy is also
> suppressed, except ... tunnels. :)
>
>
At the scale we look at, it was not a big problem to advertise the L1 tunnel
mesh back into the underlay, but as you say, BFD would work fine etc. It's
not really that different from the shortcuts we already have all over the
place in IGPs. But again, I explained in the preso @ IETF how you can run it
without any L1 data tunnel mesh, and we'll update the draft with details
once it's adopted.

You do seem, as a WG member, to be carrying a hot torch for area-proxy for
some reason, and that's fine with me. Frankly, I had extensive discussions
with customers when DriveNet (which AFAIS is basically area-proxy) was being
proposed to them. The solution is intriguing, but it did not meet a lot of
the requirements of large customers, and there are a lot of unresolved
operational issues with an approach like that. However, I'm here to get
flood reflectors adopted so they can be deployed by customers in, ideally,
multi-vendor, interoperable scenarios, not to go on crusades. So let's
adopt the draft, and then we fill in the pieces and maybe extend it with
hierarchy and no-tunnel-one-hop-away bits if people think that is
interesting/relevant and are willing to work on it. And then customers
can choose whether they run area-proxy or flood reflectors based on which
requirement set is important to them.

thanks

--- tony