Re: [Lsr] LSR Flooding Reduction Drafts - Moving Forward

tony.li@tony.li Fri, 24 August 2018 20:36 UTC

Sender: Tony Li <tony1athome@gmail.com>
From: tony.li@tony.li
Message-Id: <F0C191B8-5888-47FA-A4D6-D9328AA675C7@tony.li>
Content-Type: multipart/alternative; boundary="Apple-Mail=_168620F0-07F4-4EDD-9E11-A0C3019FE870"
Mime-Version: 1.0 (Mac OS X Mail 11.5 \(3445.9.1\))
Date: Fri, 24 Aug 2018 13:36:07 -0700
In-Reply-To: <CA+wi2hPsPQinwL2KL+xZnTYa1dwuw=thKjq_ddHnJQNnr19ugg@mail.gmail.com>
Cc: acee=40cisco.com@dmarc.ietf.org, Peter Psenak <ppsenak@cisco.com>, lsr@ietf.org, Jeff Tantsura <jefftant.ietf@gmail.com>
To: Tony Przygienda <tonysietf@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/lsr/tUIFvJ7EWK06HlmglojtkGqmr6Q>
Subject: Re: [Lsr] LSR Flooding Reduction Drafts - Moving Forward

Tony,

> as to miscabling: yepp, the protocol has either to prevent adjacencies coming up or one has to deal with generic topologies. If you don't want to design miscabling prevention/topology ZTP into protocol (like ROLL or RIFT did) you have to deal with generic graph as you say. I think that if you have 99.99% of the time a predictable graph it's easier to restrict wrong cabling than deal with arbitrary topology when tackling this problem but that's my read. 


Well, you can take that approach, but you are at risk of ignoring the one link that you needed to make the entire topology work better.  ;-)

And if you don’t like the extra link problem, there’s also the missing link: what do you do when you have a leaf-spine topology, but there’s one link missing?  It’s not like you can ignore the missing link and flood on it anyway.  ;-)
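To illustrate both cases (the extra link and the missing link), here is a minimal sketch that compares the observed links against an ideal leaf-spine fabric. It is not taken from any draft: the function name, the node names, and the assumption that every leaf should connect to every spine are all hypothetical.

```python
from itertools import product

def compare_to_leaf_spine(leaves, spines, links):
    """Return (missing, extra) links relative to a full leaf-spine fabric.

    Assumption: the intended topology has exactly one link between
    every leaf and every spine, and nothing else.
    """
    expected = {frozenset(pair) for pair in product(leaves, spines)}
    present = {frozenset(link) for link in links}
    # Links the ideal fabric has but the real one lacks.
    missing = sorted(tuple(sorted(l)) for l in expected - present)
    # Links that exist but do not belong to the ideal fabric (miscabling).
    extra = sorted(tuple(sorted(l)) for l in present - expected)
    return missing, extra

# Hypothetical fabric: l2-s2 is absent and l1-l2 is a miscabled link.
observed = [("l1", "s1"), ("l1", "s2"), ("l2", "s1"), ("l1", "l2")]
missing, extra = compare_to_leaf_spine(["l1", "l2"], ["s1", "s2"], observed)
print(missing)  # [('l2', 's2')]
print(extra)    # [('l1', 'l2')]
```

A protocol that only restricts wrong cabling can refuse the extra link, but it has no equivalent remedy for the missing one, which is why the flooding computation still has to cope with a graph that is not the ideal one.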


> Another observation though would be that if you have a single mesh then centralized controller delay on failure becomes your delay bound how long flooding is disrupted possibly (unless your single covering graph has enough redundancy to deal with single link failure, but then you're really having two as I suggest ;-). That could get ugly since you'll need make-before-break if installing a new mesh from controller methinks with a round-trip from possibly a lot of nodes … 


Perhaps you didn’t understand the draft in detail.

Even the loss of the area leader does not disrupt flooding.  

The flooding topology is in effect until a new area leader is elected and a new topology is distributed.  Yes, there is a hole in the flooding topology and you’re no longer bi-connected, but as long as it was still a single failure, you should still have a functioning flooding topology.  And because of that, it’s reasonable to assume that the area members can elect a new leader and switch to the new topology in an orderly fashion.
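The single-failure argument can be sketched in a few lines. This is only an illustration, not the draft's algorithm: the five-node ring, the node names, and both helper functions are hypothetical. It checks that a bi-connected flooding topology stays connected after any one node (the area leader included) is lost, and that a second failure can partition it.

```python
from collections import deque

def is_connected(adj, removed=frozenset()):
    """Breadth-first search over the flooding topology, skipping failed nodes."""
    nodes = [n for n in adj if n not in removed]
    if not nodes:
        return True
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in removed and v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(nodes)

def survives_any_single_node_failure(adj):
    """True if removing any one node leaves the remaining nodes connected."""
    return all(is_connected(adj, frozenset([n])) for n in adj)

# Hypothetical flooding topology: a simple ring, which is bi-connected.
ring = {
    "leader": {"a", "d"},
    "a": {"leader", "b"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c", "leader"},
}

print(survives_any_single_node_failure(ring))          # True
print(is_connected(ring, frozenset(["leader"])))       # True
print(is_connected(ring, frozenset(["leader", "b"])))  # False
```

The last line is the window in question: after the leader is lost and before the new topology is installed, a second failure can cut node "a" off from the rest.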

It’s very true that there is a period where things are not bi-connected and a second failure will cause a flooding problem.  That period should be on the order of one failure detection, one flooding propagation, and one SPF computation.  If we expect failures to happen more frequently than that, then we need to call that out in our requirements. That type of scenario is perhaps reasonable in a MANET environment, but does not match any of my experience with typical data centers.


> >  iii) change in one of the vertex lifts 
> 
> 
> Sorry, I don’t understand point iii).
> 
> A mildly stuffed (or mathematically concise ;-) way to say that if you have one or two covering graphs (and vertex lift is the more precise word here since "covering graph" can be also an edge lift which is irrelevant here) and one of those subgraphs gets recomputed & distributed (due to failures, changes in some metrics, _whatever_) then this should not lead to disruption. Basically make-before-break as one possible design point, harder to achieve of course in distributed fashion … 


I think it would help the discussion if we phrased it less concisely. :-)


> > moreover, I observe that IME ISIS is much more robust under such optimizations since the CSNPs catch (@ a somehow ugly delay cost) any corner cases whereas OSPF after IDBE will happily stay out of sync forever if flooding skips something (that may actually become a reason to introduce periodic stuff on OSPF to do CSNP equivalent albeit it won't be trivial in backwards compatible way on my first thought, I was thinking a bit about cut/snapshot consistency check [in practical terms OSPF hellos could carry a checksum on the DB headers] but we never have that luxury on a link-state routing protocol [i.e. we always operate under unbounded epsilon consistency ;-) and in case of BGP stable oscillations BTW not even that ;^} ]). 
> 
> Emacs
> 
> 
> And that I cannot parse. Emacs? You want LISP code? But then, Dino may get offended ;-) 


Your comment seemed like just a poke in the perennial IS-IS vs. OSPF debate, which seems about as constructive as the Vi vs. Emacs debate.

Tony