Re: [Rift] [RIFT][Non equal cost anycast]

Tony Przygienda <tonysietf@gmail.com> Mon, 29 July 2019 02:48 UTC

References: <CA+wi2hMg6gx_nnHCu7iP9S3snAjL=qAWObx3Hh=bUgzF=vz+3A@mail.gmail.com> <201907291001327532177@zte.com.cn>
In-Reply-To: <201907291001327532177@zte.com.cn>
From: Tony Przygienda <tonysietf@gmail.com>
Date: Sun, 28 Jul 2019 19:47:44 -0700
Message-ID: <CA+wi2hORAFJ3uUg-NOS5BLXFwPQoi+Y9n-DyMkfmoftH2FBJtQ@mail.gmail.com>
To: xu.benchong@zte.com.cn
Cc: Antoni Przygienda <prz@juniper.net>, rift@ietf.org
Content-Type: multipart/alternative; boundary="000000000000cd5662058ec8ec52"
Archived-At: <https://mailarchive.ietf.org/arch/msg/rift/FLTdcm6yKXugqEajHdZiFJlpIys>
Subject: Re: [Rift] [RIFT][Non equal cost anycast]

Hey Benchong, note that RIFT only suggests computations and does not
prescribe them. In simpler terms, as long as we follow valley-free routing
(a packet goes up until it turns down) we can use any available nexthop
that gives reachability to the prefix in the desired direction (and in
both cases you must do LPM based on the information provided to prevent
blackholing). And then in both directions you can load-balance as you
please. This lets us saturate the whole fabric while balancing on any
metric. Northbound BAD (bandwidth adjusted distance) is one such metric: it
got a good response from people who reviewed how they would like an IP
fabric protocol to work, and that's why it's included. One could define
something like BAD in the southbound direction as well, but it's not that
clear what makes sense there. Simplest is SPF, next would be all-paths,
then balancing on some kind of bandwidth southbound towards the leaf (but
that's where it gets hard if you think about it).
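
To make the "LPM first, then any nexthop in the allowed direction" idea
concrete, here is a minimal sketch. It is not taken from the RIFT spec: the
table layout, the node names and the function names (lpm, usable_nexthops)
are all assumptions made up for illustration.

```python
import ipaddress

# Hypothetical forwarding table: prefix -> list of (next_hop, direction).
# A real RIFT implementation would derive this from N-SPF/S-SPF results.
FIB = {
    "0.0.0.0/0": [("spine1", "north"), ("spine2", "north")],
    "10.1.0.0/16": [("leaf1", "south"), ("leaf2", "south")],
    "10.1.2.0/24": [("leaf2", "south")],
}


def lpm(fib, dst):
    """Return the most specific prefix in fib covering dst, or None."""
    addr = ipaddress.ip_address(dst)
    matches = [p for p in fib if addr in ipaddress.ip_network(p)]
    if not matches:
        return None
    return max(matches, key=lambda p: ipaddress.ip_network(p).prefixlen)


def usable_nexthops(fib, dst, arrived_from_north):
    """All nexthops the packet may use without violating valley-free routing.

    A packet that arrived from the north has already turned south, so it
    must not be sent north again. Any remaining nexthop is a valid
    load-balancing candidate.
    """
    prefix = lpm(fib, dst)
    if prefix is None:
        return []
    hops = fib[prefix]
    if arrived_from_north:
        hops = [(nh, d) for nh, d in hops if d == "south"]
    return [nh for nh, _ in hops]
```

How traffic is then split over the returned nexthops (equal, BAD-weighted,
bandwidth-weighted) is a local choice and deliberately left out here.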

In any case, if we want anycast to work properly we have to build a
next-hop that includes all nodes that advertised the prefix, yes. But
obviously we can also just include the node with the shortest SPF distance
& it's still anycast, just a very badly balancing one if distances are
uneven ;-) ...
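
A small sketch of the two options, assuming we already know the distance to
every node that advertised the prefix. The function name and the
inverse-distance weighting are assumptions for illustration only; the spec
does not mandate any particular weighting, and ignoring the distance
altogether is just as valid.

```python
from fractions import Fraction


def anycast_nexthops(advertisers, shortest_only=False):
    """Build a weighted next-hop set for an anycast prefix.

    advertisers maps node name -> positive distance to that node.
    Returns node name -> share of traffic (the shares sum to 1).
    """
    if not advertisers:
        return {}
    if shortest_only:
        # Classic SPF behaviour: keep only the closest advertiser(s).
        best = min(advertisers.values())
        chosen = {n: d for n, d in advertisers.items() if d == best}
    else:
        # "Proper" anycast: every advertiser ends up in the next-hop set.
        chosen = dict(advertisers)
    # Assumed policy: weight each node by the inverse of its distance.
    inverse = {n: Fraction(1, d) for n, d in chosen.items()}
    total = sum(inverse.values())
    return {n: w / total for n, w in inverse.items()}
```

With advertisers at distance 1 and 3, the full set splits traffic 3/4 to
1/4, while shortest_only=True sends everything to the distance 1 node.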

Valley-free routing in RIFT (obviously possible only since we have rank
ordering of the nodes, i.e. a top and a bottom) frees us from the shackles
of SPF and moves things closer to
https://en.wikipedia.org/wiki/Maximum_flow_problem in a simple way.
Employing secondary techniques from traditional routing, like CSPF and so
on, presupposes much more complex machinery and source routing mechanisms
like MPLS/SR in the fabric. That would obviously make it more expensive
(the finer we want to control our commodity flow, the more state we stick
on the packet and/or into the network) and less reactive to changes (due
to the necessary state distribution into intermediate points once the
topology has converged). In plain RIFT, by contrast, we stick to
valley-free hop-by-hop routing, which does not allow much granularity
beyond the prefix but makes up for it by being pliable, simple & hence
cheap and robust (which seems very desirable in a fabric where bandwidth
substitutes for smarts [as IP did in most of its history]). And yes,
SR/MPLS can be made to work on RIFT but that's a different discussion
altogether. In a sense, if we advertise unsolicited label bindings on LIEs
(which we optionally include) we have a flavor of LDP already for free if
I'm not mistaken ...
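
For reference, the valley-free property itself is easy to state in code. A
minimal sketch, assuming each node on a path is represented only by its
level (0 = leaf, higher = closer to the ToF); the function name is made up,
and east-west links between nodes of the same level are simply treated as
neutral here, which is a simplification.

```python
def is_valley_free(levels):
    """True if a path never goes north again after it has turned south.

    levels is the sequence of node levels along the path, e.g.
    [0, 1, 2, 1, 0] for leaf -> spine -> ToF -> spine -> leaf.
    """
    turned_south = False
    for prev, cur in zip(levels, levels[1:]):
        if cur < prev:
            turned_south = True
        elif cur > prev and turned_south:
            # Going up after having gone down would be a "valley".
            return False
    return True
```

Every path that passes this check is loop-free by construction, which is
why hop-by-hop forwarding can use all of them and not only the shortest.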

--- tony

On Sun, Jul 28, 2019 at 7:01 PM <xu.benchong@zte.com.cn> wrote:

>
> Tony
>
> Does this mean that when S-SPF is calculated we cannot select only the
> best nexthop with the smallest distance, that all the nexthops must be
> added to the forwarding table, and that distance is used as the basis for
> “Non equal cost anycast” on the forwarding plane?
>
> N-SPF uses the method of 5.3.6.1 and BAD instead of the distance.
>
>
> --Benchong
>
>
>
> Original message
> *From:* Tony Przygienda <tonysietf@gmail.com>
> *To:* 徐本崇10065053;
> *Cc:* Antoni Przygienda <prz@juniper.net>; rift@ietf.org <rift@ietf.org>;
> *Date:* 2019-07-27 03:34
> *Subject:* *Re: [Rift] [RIFT][Non equal cost anycast]*
> Benchong, as always, when people start implementing they start to ask the
> real questions ;-) Yes, anycast in RIFT is much closer to what you would
> consider “true anycast” than anycast in IP, which is really just ECMP on
> the same address. In RIFT anycast on different distance nodes is a normal
> thing.
>
> First, it is important to understand the difference between mobility and
> anycast on the fabric. If a prefix moves on the fabric without using the
> mobility attributes it can appear in two locations @ once of course (if the
> new TIE floods faster than the previous location manages to purge the
> prefix). That's not a proper anycast of course, that's just an artefact. If
> the prefix properly attaches timestamps by some means (such as 6lo) it will
> be understood as having moved, otherwise it will be anycast for a bit.
>
> And then, of course, there is true anycast, which amounts to two nodes
> advertising the same prefix. RIFT is loop-free, which means that it
> doesn’t really care all that much about distance, so if a packet enters
> from ToF it can be forwarded to any leaf advertising the anycast prefix.
> That allows a true “service on anycast” architecture. When you route from
> the leaf, the packet will use default (unless an implementation does its
> own things the spec doesn’t mention but doesn’t forbid either) until it
> pops up far enough that it sees the anycast prefix, @ which point it will
> be turned south (assuming all anycast is on leafs). If balancing of
> anycast over the whole metric is desired then the packet needs to be
> pushed all the way to the ToF using tunnels or some other solution.
>
> So, basically, anycast is just a funky next-hop on a prefix that can
> point to two different nodes & the metric can be used to balance or be
> ignored, and the spec does not need to say more I think.
>
> This little diatribe should make it into the RIFT applicability statement
> in some form I think ...
>
> -- tony
>
> On Fri, Jul 26, 2019 at 5:57 AM <xu.benchong@zte.com.cn> wrote:
>
>> Hi, Tony
>>
>> Can you talk about REQ6? How does RIFT support it? Does it benefit from
>> the default route?
>>
>> "  REQ6:    Non equal cost anycast must be supported to allow for easy
>>             and robust multi-homing of services without regressing to
>>             careful balancing of link costs."
>>
>>
>> Thank you!
>>
>> Benchong