Re: [Gen-art] [Last-Call] Genart last call review of draft-ietf-bess-datacenter-gateway-10

Gyan Mishra <hayabusagsm@gmail.com> Wed, 19 May 2021 05:59 UTC

References: <161967518819.13605.6722172787091620121@ietfa.amsl.com> <3666A7D7-06B9-43E1-A126-9268CB0CEF77@eggert.org> <CABNhwV27Fc1VAtfSt7WSF_SewCq6tqPbNKsQD6HZwbdJYrt+fg@mail.gmail.com> <BY3PR05MB808154F278BA55265620F665C72C9@BY3PR05MB8081.namprd05.prod.outlook.com>
In-Reply-To: <BY3PR05MB808154F278BA55265620F665C72C9@BY3PR05MB8081.namprd05.prod.outlook.com>
From: Gyan Mishra <hayabusagsm@gmail.com>
Date: Wed, 19 May 2021 00:00:10 -0400
Message-ID: <CABNhwV1Bda-o-rtuZQ70SVSAR2uHPo=2gbj5pxSFwya3SwatUQ@mail.gmail.com>
To: John E Drake <jdrake@juniper.net>
Cc: Lars Eggert <lars@eggert.org>, General Area Review Team <gen-art@ietf.org>, "bess@ietf.org" <bess@ietf.org>, "draft-ietf-bess-datacenter-gateway.all@ietf.org" <draft-ietf-bess-datacenter-gateway.all@ietf.org>, "last-call@ietf.org" <last-call@ietf.org>
Content-Type: multipart/mixed; boundary="00000000000001bb3105c2a886b5"
Archived-At: <https://mailarchive.ietf.org/arch/msg/gen-art/2maesVzCbSo_roD9hHCuh6VyfGk>
Subject: Re: [Gen-art] [Last-Call] Genart last call review of draft-ietf-bess-datacenter-gateway-10

Dear Authors,

Attached is a txt version (-gsm update) of version 10 that contains a first
cut at what I think would be appropriate RFC 2119 SHOULD / MUST
language for a specification.  I also made some editorial updates to make
the specification clearer to the reader.  In this thread and on the call we
had, we talked about changing the ingress & egress "domain" to ingress &
egress "site", and I made that change as well.  Using the word "domain"
makes it confusing as to which domain is being referred to, so the change
to "site" really helps readability.

Below are a few more questions & thoughts related to the draft, for the
authors, to help in finalizing the draft for publication:

GW failover: (John Scudder)
GWs will need a local iBGP session for failover.  In the scenario where
one GW is disconnected from the backbone, the draft clearly states that the
advertisement of the GW is withdrawn; when the active set of GWs changes,
each externally advertised route is re-advertised with the new Tunnel
Encapsulation attribute union, which reflects the current set of active
GWs.
Consider the case of inconsistent routing within the site, where GW1 can
reach GW2 but GW1 cannot reach S2.  This is low probability but entirely
possible.  Maybe the draft should carry a note that this scenario may make
things worse, with a blackhole to GW2.
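
To make the re-advertisement step above concrete, here is a minimal Python
sketch.  It is only an illustration under stated assumptions: the names
TunnelTLV, build_union and readvertise, and the 192.0.2.x endpoint
addresses, are hypothetical and are not taken from the draft or from any
implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TunnelTLV:
    """One Tunnel TLV; only the egress endpoint is modelled (assumption)."""
    egress_endpoint: str  # address of the GW that terminates the tunnel


def build_union(active_gws):
    """Return one Tunnel TLV per GW currently in the active set."""
    return sorted(active_gws.values(), key=lambda tlv: tlv.egress_endpoint)


def readvertise(prefixes, active_gws):
    """Attach the current union to every externally advertised prefix."""
    union = build_union(active_gws)
    return {prefix: list(union) for prefix in prefixes}


# Hypothetical site with two active GWs advertising prefix X.
gws = {"GW1": TunnelTLV("192.0.2.1"), "GW2": TunnelTLV("192.0.2.2")}
before = readvertise(["X"], gws)

# GW2 loses its backbone connection: its advertisement is withdrawn and
# the remaining GW re-advertises X with the reduced union.
del gws["GW2"]
after = readvertise(["X"], gws)
```

Note that in this sketch the union tracks only which GWs are active, not
whether each GW can reach a given destination inside the site, which is
exactly the gap in the GW1 / S2 scenario described above.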

Section 5 - RFC 9012 Tunnel Encapsulation attribute BGP Prefix-SID
limitations (Alvaro Retana)
SR end to end, or at a minimum within an SR domain, may not be a general use
case and may be limited, because the BGP Prefix-SID sub-TLV can only be used
for IPv4/IPv6 labeled unicast, AFI/SAFI 1/4 and 2/4.
We may want to comment in section 5 that use of SR may be limited and is not
a general use case.  Also, does this limitation impact the use of SRv6?
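
As a rough illustration of the limitation above, the check below encodes
the two address families named in this comment.  It is a sketch only: the
function name prefix_sid_subtlv_defined is hypothetical, and the set simply
restates the AFI/SAFI 1/4 and 2/4 pairs mentioned above rather than any
router behaviour.

```python
# (AFI, SAFI) pairs for IPv4 and IPv6 labeled unicast, as named above.
LABELED_UNICAST = {(1, 4), (2, 4)}


def prefix_sid_subtlv_defined(afi, safi):
    """Return True if the route type is one of the labeled-unicast pairs."""
    return (afi, safi) in LABELED_UNICAST


# AFI 1 / SAFI 128 is VPN-IPv4, which falls outside the two pairs above.
vpnv4_ok = prefix_sid_subtlv_defined(1, 128)
ipv6_lu_ok = prefix_sid_subtlv_defined(2, 4)
```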

Additional thoughts for the authors:

Does the draft require SR in the backbone, or can RSVP-TE be used?

If RSVP-TE can be used, maybe a different name should be used for the
identifier, rather than "SR domain identifier".

Section 3 - Is the SR domain identifier value the RT that is attached to
the GW auto-discovery route?

RFC 4360 is mentioned in section 3 as a normative reference; however, RFC 5668
(4-octet AS specific extended community) should also be mentioned as normative.

We may want to mention this bleed-over of GW routes due to
misconfiguration in section 8 - Security Considerations:

   Note that if a GW is (mis)configured with a different SR domain
   identifier from the other GWs to the same domain then it will not be
   auto-discovered by the other GWs (and will not auto-discover the
   other GWs).  This would result in a GW for another site
   receiving only the Tunnel Encapsulation attribute included in the BGP
   best route; i.e., the Tunnel Encapsulation attribute of the
   (mis)configured GW or that of the other GWs.
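
The auto-discovery behaviour in the text above can be sketched as a simple
route-target match.  This is a minimal model under stated assumptions: the
AutoDiscoveryRoute class, the discovered_peers function and the route
target values such as "64512:100" are hypothetical and chosen only for
illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AutoDiscoveryRoute:
    gw: str            # name of the advertising GW
    route_target: str  # set to the GW's configured SR domain identifier


def discovered_peers(local_gw, local_domain_id, routes):
    """Import only routes whose RT equals the local SR domain identifier."""
    return {
        route.gw
        for route in routes
        if route.route_target == local_domain_id and route.gw != local_gw
    }


routes = [
    AutoDiscoveryRoute("GW1", "64512:100"),
    AutoDiscoveryRoute("GW2", "64512:100"),
    AutoDiscoveryRoute("GW3", "64512:999"),  # (mis)configured identifier
]

peers_of_gw1 = discovered_peers("GW1", "64512:100", routes)
peers_of_gw3 = discovered_peers("GW3", "64512:999", routes)
```

In this model GW1 and GW2 discover each other, while GW3 is neither
discovered nor discovers anyone, so nothing flags the misconfiguration.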

There may be significant propagation delays in convergence for
re-advertisement as the set of active GWs changes, in cases where the number
of ASes is very large, such as over the public internet.  Maybe that should
be mentioned.

Kind Regards

Gyan



On Tue, May 18, 2021 at 4:49 PM John E Drake <jdrake@juniper.net> wrote:

> Excellent, thanks so much for your help on this.
>
> Yours Irrespectively,
>
> John
>
>
> Juniper Business Use Only
>
> > -----Original Message-----
> > From: Gyan Mishra <hayabusagsm@gmail.com>
> > Sent: Tuesday, May 18, 2021 4:28 PM
> > To: Lars Eggert <lars@eggert.org>
> > Cc: General Area Review Team <gen-art@ietf.org>; bess@ietf.org;
> > draft-ietf-bess-datacenter-gateway.all@ietf.org; last-call@ietf.org
> > Subject: Re: [Last-Call] Genart last call review of
> > draft-ietf-bess-datacenter-gateway-10
> >
> > [External Email. Be cautious of content]
> >
> >
> > Hi Lars & DC Gateway authors
> >
> > I will be responding back today to the Gen-Art original email I sent with
> > final comments and hope the final comments will help improve the document.
> >
> > I will also address the comments from John Scudder related to GW failover
> > as well as Alvaro’s comments related to Tunnel Encapsulation attribute BGP
> > Prefix-SID sub-TLV limitations.  I will also add new text recommendations
> > related to RFC 2119 MUST / SHOULD language to help improve the document.
> >
> >
> > Thank you
> >
> > Gyan
> > On Tue, May 18, 2021 at 3:31 AM Lars Eggert <lars@eggert.org> wrote:
> >
> > > Gyan, thank you for your review and thank you all for the following
> > > discussion. I have entered a No Objection ballot for this document
> > > based on the current status of the discussion.
> > >
> > > Lars
> > >
> > >
> > > > On 2021-4-29, at 8:46, Gyan Mishra via Datatracker
> > > > <noreply@ietf.org>
> > > wrote:
> > > >
> > > > Reviewer: Gyan Mishra
> > > > Review result: Not Ready
> > > >
> > > > I am the assigned Gen-ART reviewer for this draft. The General Area
> > > > Review Team (Gen-ART) reviews all IETF documents being processed by
> > > > the IESG for the IETF Chair.  Please treat these comments just like
> > > > any other last call comments.
> > > >
> > > > > For more information, please see the FAQ at
> > > > >
> > > > > <https://trac.ietf.org/trac/gen/wiki/GenArtfaq>.
> > > >
> > > > Document: draft-ietf-bess-datacenter-gateway-??
> > > > Reviewer: Gyan Mishra
> > > > Review Date: 2021-04-28
> > > > IETF LC End Date: 2021-04-29
> > > > IESG Telechat date: Not scheduled for a telechat
> > > >
> > > > > Summary:
> > > > >   This document defines a mechanism using the BGP Tunnel Encapsulation
> > > > >   attribute to allow each gateway router to advertise the routes to the
> > > > >   prefixes in the Segment Routing domains to which it provides access,
> > > > >   and also to advertise on behalf of each other gateway to the same
> > > > >   Segment Routing domain.
> > > > >
> > > > > This draft needs to provide more clarity about the use case: where
> > > > > this would be used, and how it would be used and implemented.  From
> > > > > reading the specification it appears there are some technical gaps.
> > > > > There are some major issues with this draft. I don’t think this draft
> > > > > is ready yet.
> > > > Major issues:
> > > >
> > > > > Abstract comments:
> > > > > The abstract mentions the use of Segment Routing within the Data
> > > > > Center.  Is that a requirement for this specification to work, as it
> > > > > is mentioned throughout the draft?  Technically I would think the
> > > > > concept of the discovery of the gateways is feasible without the
> > > > > requirement of SR within the Data Center.
> > > > >
> > > > > The concept of load balancing is a bigger issue, brought up in this
> > > > > draft as the problem statement and what this draft is trying to solve,
> > > > > which I will address in the introduction comments.
> > > >
> > > > > Introduction comments:
> > > > > In the introduction the use case is expanded much further, to any
> > > > > functional edge AS; see the verbiage below.
> > > > >
> > > > > OLD
> > > > >
> > > > >   “SR may also be operated in other domains, such as access networks.
> > > > >   Those domains also need to be connected across backbone networks
> > > > >   through gateways.  For illustrative purposes, consider the Ingress
> > > > >   and Egress SR Domains shown in Figure 1 as separate ASes.  The
> > > > >   various ASes that provide connectivity between the Ingress and Egress
> > > > >   Domains could each be constructed differently and use different
> > > > >   technologies such as IP, MPLS with global table routing native BGP to
> > > > >   the edge, MPLS IP VPN, SR-MPLS IP VPN, or SRv6 IP VPN”
> > > >
> > > > > This paragraph expands the use case to any ingress or egress stub
> > > > > domain: Data Center, Access, or any other.  If that is the case,
> > > > > should the draft name change to maybe “stub edge domain services
> > > > > discovery”?  As this draft can be used for any, I would not preclude
> > > > > any use case; I would make the GW discovery open to be used for any
> > > > > service GW edge function and change the draft name to something more
> > > > > appropriate.
> > > > >
> > > > > This paragraph also states “for illustrative purposes”, which is fine,
> > > > > but then it expands the overlay/underlay use cases. I believe this
> > > > > mechanism can only be used with a technology that has an
> > > > > overlay/underlay, which would preclude any use case with just an
> > > > > underlay with global table routing, such as what is mentioned: “IP,
> > > > > MPLS with global table routing native BGP to the edge”.  IP or global
> > > > > table routing would be an issue, as this specification requires
> > > > > setting an RT and an export/import RT policy for the discovery of
> > > > > routes advertised by the GWs.  As I don’t think this solution, from
> > > > > what I can tell, would work technically for global table routing, I
> > > > > will update the above paragraph to preclude global table routing.  We
> > > > > can add it back in if we can figure that out, but I don’t think any
> > > > > public or private operator would make the drastic change from a global
> > > > > table carrying all BGP prefixes in the underlay to a VPN overlay,
> > > > > pushing all the any-any prefixes into the overlay, as that would be a
> > > > > prerequisite to be able to use this draft.
> > > > >
> > > > > From this point forward I am going to assume we are using VPN overlay
> > > > > technology such as SR or MPLS.
> > > >
> > > > > NEW
> > > > >
> > > > >   “SR may also be operated in other domains, such as access networks.
> > > > >   Those domains also need to be connected across backbone networks
> > > > >   through gateways.  For illustrative purposes, consider the Ingress
> > > > >   and Egress SR Domains shown in Figure 1 as separate ASes.  The
> > > > >   various ASes that provide connectivity between the Ingress and Egress
> > > > >   Domains could be two as shown in Figure 1 or could be many more, as
> > > > >   exists with the public internet use case, and each may be constructed
> > > > >   differently and use different technologies such as MPLS IP VPN,
> > > > >   SR-MPLS IP VPN, or SRv6 IP VPN” with a “BGP Free” Core.
> > > > >
> > > > > This may work without a “BGP Free” core, but to reduce the design
> > > > > complexity I think we should constrain it to a “BGP Free” core
> > > > > transport layer.  SR-TE path steering also gets much more complicated
> > > > > if all P routers are running BGP. I think in this example we can even
> > > > > explicitly say this example shows the public internet, as that would
> > > > > be one of the primary use cases.
> > > >
> > > > > This paragraph is confusing to the reader.
> > > > >
> > > > > As a precursor to this paragraph I think it may be a good idea to
> > > > > state that we are talking about global table IP-only routing or VPN
> > > > > overlay technology with SR/MPLS underlay transport.  That will make
> > > > > this section much easier to understand.
> > > >
> > > > > In the Figure 1 drawing you should give an AS number to both the
> > > > > ingress domain and egress domain, so the reader does not have to make
> > > > > assumptions about whether it is iBGP or eBGP connected to the egress
> > > > > or ingress domain, and state eBGP in the text below.  Let's also state
> > > > > that the intermediate ASNs in the middle, as depicted in the diagram,
> > > > > could be 2 as shown illustratively but could be many operator domains,
> > > > > such as in the case of traversing the public internet.  In the drawing
> > > > > I would replace ASBR with PE, since I am stating that this solution
> > > > > has to be a VPN overlay paradigm and not global routing.  Also, in the
> > > > > VPN overlay scenario, when you are doing any type of inter-AS peering,
> > > > > the peering is almost always between PEs and not a separate dedicated
> > > > > device serving a special “ASBR-ASBR” function, as the PE acts as the
> > > > > border node providing the “ASBR” type function.  So in the rewrite I
> > > > > am assuming the drawing has been updated, changing ASBR to PE.  Let's
> > > > > give each node a number so that we can be clear in the text exactly
> > > > > which node we are referring to.  In the drawing please update that GW1
> > > > > peers to PE1, GW2 peers to PE2, and GW3 peers to PE3.  GW3 also peers
> > > > > to GW4 and GW2 peers to GW5, where GW4 and GW5 are part of AS3.  In the
> > > > > AS1-AS2 peering, the top peering would be PE6 to PE8 and the bottom
> > > > > peering PE7 to PE9.  So PE6 and PE7 are in AS1, and PE8 and PE9 are in
> > > > > AS2.  I made the bottom two ASBRs in AS3 for the selective
> > > > > deterministic load balancing, now calling them GW4 and GW5, which are
> > > > > used later in the problem statement.
> > > >
> > > > > One major problem with this problem statement description is that it
> > > > > is incorrect in saying that GW load balancing does not work today in
> > > > > the topology given in Figure 1.  Edge GW load balancing is based on
> > > > > the iBGP path tie breaker, the lowest common denominator in BGP path
> > > > > selection, which is lowest IGP underlay metric; as long as the metric
> > > > > is equal and you have iBGP multipath enabled, you can load balance to
> > > > > the egress PE1 and PE2 endpoints.  So in this case flows coming from
> > > > > AS1 into AS2 hit a P intermediate router which has iBGP multipath
> > > > > enabled and has, let's say, equal cost for the route to the next hop
> > > > > attribute.  Assuming next-hop-self is set, the cost to loopback0 on
> > > > > PE1 and the cost to loopback0 on PE2 are both, let's say, 10, so now
> > > > > you have a BGP multipath.
> > > > >
> > > > > What is required, though, is that the RD be unique.  In a “BGP Free”
> > > > > core RR environment, all PEs are route-reflector clients that peer to
> > > > > the RR, and for all the paths advertised to the RR to be reflected to
> > > > > all the egress PE edges, the RD must be unique.
> > > > >
> > > > > BGP add-paths is only used if you have Primary and Backup routing set
> > > > > up, where PE1-GW1 has a 0x prepend and PE2-GW2 has a 1x prepend, so
> > > > > that with BGP add-paths along with BGP PIC Edge you have an edge
> > > > > pre-programmed backup path.  So add-paths is not necessarily something
> > > > > that helps for load balancing and is in fact orthogonal to it, as it
> > > > > is for Primary / Backup routing and not Active/Active load balancing.
> > > > > Load balancing with a VPN overlay is simply achieved with a unique RD
> > > > > per PE, iBGP multipath, and equal cost paths to the underlay recursive
> > > > > IGP-learned next hop attribute, in this case the PE loopback0, per the
> > > > > next hop rewrite via “next-hop-self” done on the PE-RR peering in a
> > > > > standard VPN overlay topology.
> > > > >
> > > > > What I have stated about load balancing in the underlay is independent
> > > > > of SR-TE; however, with an SR-TE candidate path, the load balancing
> > > > > ECMP spray to the egress PE and egress GW AS can also happen with the
> > > > > prefix-sid.
> > > >
> > > > > OLD
> > > > >   Suppose that there are two gateways, GW1 and GW2 as shown in
> > > > >   Figure 1, for a given egress SR domain and that they each advertise a
> > > > >   route to prefix X which is located within the egress SR domain with
> > > > >   each setting itself as next hop.  One might think that the GWs for X
> > > > >   could be inferred from the routes' next hop fields, but typically it
> > > > >   is not the case that both routes get distributed across the backbone:
> > > > >   rather only the best route, as selected by BGP, is distributed.  This
> > > > >   precludes load balancing flows across both GWs.
> > > > >
> > > > > I am rewriting the text in the NEW below, as there is some discrepancy
> > > > > about the routes being distributed across the backbone and what gets
> > > > > distributed.  I am completely rewriting it to make clearer what we are
> > > > > trying to state here, as the text appears to be technically incorrect.
> > > > > I will use the BGP route flow to help depict the routing and to get to
> > > > > the problem statement we are trying to portray.
> > > >
> > > > > NEW
> > > > >
> > > > >   Suppose that there are two gateways, GW1 and GW2 as shown in
> > > > >   Figure 1, for a given egress SR domain and each gateway advertises
> > > > >   via EBGP a VPN prefix X to the AS2 core domain with the underlay next
> > > > >   hop set to GW1 or GW2. In this case we are Active / Active load
> > > > >   balancing: PE1 and PE2 each receive the VPN prefix and advertise the
> > > > >   VPN prefix X into the domain with next-hop-self set on the PE-RR
> > > > >   peering to the PE’s loopback0.  The P routers within the domain have
> > > > >   an ECMP path with an IGP metric tie to the egress PE1 and egress PE2
> > > > >   for VPN prefix X learned from GW1 and GW2. An SR-TE path can now be
> > > > >   stitched from GW3 to PE3 SR-TE Segment-1, to PE3 to PE6 and PE7
> > > > >   Segment-2, to PE8 and PE9, to the Egress Domain via PE1 and PE2 to
> > > > >   GW1 and GW2.  In this case, however, we don’t want the traffic to be
> > > > >   steered via SR-TE load balanced via ingress GW3, and want to take GW3
> > > > >   out of rotation and load balance traffic to GW4 and GW5 instead.
> > > >
> > > > > **Text above provides the updated selective deterministic gateway
> > > > > steering described below to achieve the goal.  I think that may have
> > > > > been the intent of the authors and I am just making it more clear**
> > > > >
> > > > > As for the problem statement, GW load balancing can easily occur in
> > > > > the underlay as stated, so that is not the problem.
> > > > >
> > > > > In my mind, the problem statement that we want to describe in both the
> > > > > Abstract and Introduction is not vanilla simple gateway load balancing
> > > > > but rather a predictable, deterministic method of selecting the
> > > > > gateways to be used.  That is, each VPN prefix now has a descriptor
> > > > > attached, the Tunnel Encapsulation attribute, which contains multiple
> > > > > TLVs, one or more for each “selected gateway”, with each Tunnel TLV
> > > > > containing an egress tunnel endpoint sub-TLV that identifies the
> > > > > gateway for the tunnel.  Maybe we can have in the sub-TLV a priority
> > > > > field for pecking-order preference of which GWs are pushed up into the
> > > > > GW hash selected for the SR-ERO path to be stitched end to end.  So
> > > > > let's say you had 10 GWs and you break them up into 2 tiers or
> > > > > multiple tiers, and maybe gateways 1-5 are primary and 6-10 are backup
> > > > > (that could be due to various reasons), so you can basically pick and
> > > > > choose based on priority which GWs get added to the GW hash.
> > > >
> > > > > I have some feedback and comments on the solution and how best to
> > > > > write the verbiage to make it more clear to the reader.
> > > > >
> > > > > First, in the solution, as far as the RT to attach for the GW auto
> > > > > discovery: with this new RT we are essentially creating a new VPN RIB
> > > > > that has prefixes from all the selected gateways that are discovered
> > > > > from the Tunnel Encapsulation attribute TLV.
> > > > >
> > > > > What is really confusing in the text here is whether the Tunnel
> > > > > Encapsulation attribute is being attached to the underlay recursive
> > > > > route to the next hop attribute or to the VPN overlay prefix.  The
> > > > > reason I am thinking it is being attached to the VPN overlay prefix
> > > > > and not the underlay next hop attribute is: how would you now create
> > > > > another transport RIB?  And if you are creating a new transport RIB,
> > > > > there is already a draft defined by Kaliraj Vairavakkalai, or BGP-LU
> > > > > SAFI 4 labeled unicast, which exists today to advertise next hops
> > > > > between domains for an end to end LSP load balanced path.
> > > >
> > > >
> > > > > https://tools.ietf.org/html/draft-kaliraj-idr-bgp-classful-transport-planes-07
> > > >
> > > > IANA code point below
> > > > 76      Classful-Transport SAFI
> > > > [draft-kaliraj-idr-bgp-classful-transport-planes-00]
> > > >
> > > > > Also in line with CT, another option is BGP-LU SAFI 4 to import the
> > > > > loopbacks between domains, which are the next hop attribute to be
> > > > > advertised into the core for an end to end LSP.  So the BGP-LU SAFI
> > > > > RIB could be used for the GW next hop advertisement between domains so
> > > > > that there is visibility of all the egress PE loopback0s between
> > > > > domains.  So you can either stitch the LSP as a segmented LSP, like
> > > > > inter-AS option B, SR-TE stitched, and use next-hop-self PE-RR
> > > > > next-hop rewrite on each of the PEs within the internet domain, or you
> > > > > could import all the PE loopbacks from all ingress and egress domains
> > > > > into the internet domain, similar to inter-AS option C, to create an
> > > > > end to end LSP and instantiate an end to end SR-TE path.
> > > > >
> > > > > Maybe you could attach the RT Tunnel Encapsulation attribute Tunnel
> > > > > TLV endpoint TLV to the VPN overlay prefix.  I am not sure how that
> > > > > would be beneficial, as the underlay steers the VPN overlay.
> > > > >
> > > > > So maybe you could couple the VPN overlay new GW RIB RT to the
> > > > > transport underlay CT class RIB or BGP-LU RIB.  That coupling may have
> > > > > some benefit, but that would have to be investigated, and I think it
> > > > > is out of scope of the goals of this draft.
> > > > >
> > > > > I think we first have to figure out the authors' goal and purpose for
> > > > > this draft and how the GW discovery should work in light of the CT
> > > > > class CT RIB AFI/SAFI codepoint draft that exists today, as well as
> > > > > the BGP-LU option for next hop advertisement within the internet
> > > > > domain.
> > > >
> > > > > Section 3 comments
> > > > >
> > > > >      “Each GW is configured with an identifier for the SR domain.  That
> > > > >      identifier is common across all GWs to the domain (i.e., the same
> > > > >      identifier is used by all GWs to the same SR domain), and unique
> > > > >      across all SR domains that are connected (i.e., across all GWs to
> > > > >      all SR domains that are interconnected).
> > > > >
> > > > > **No issues with the above**
> > > > >
> > > > >      A route target ([RFC4360]) is attached to each GW's auto-discovery
> > > > >      route and has its value set to the SR domain identifier.
> > > > >
> > > > > **So here, if the RT is attached to the GW auto-discovery route, we
> > > > > need to state that this is the underlay route, and that the PE does a
> > > > > next-hop-self rewrite of the eBGP link to the BGP egress domain next
> > > > > hop to the loopback0, so the GW next hop that we are tracking for all
> > > > > the ingress and egress PE domains is the egress and ingress PE
> > > > > loopback0.**
> > > >
> > > > >      Each GW constructs an import filtering rule to import any route
> > > > >      that carries a route target with the same SR domain identifier
> > > > >      that the GW itself uses.  This means that only these GWs will
> > > > >      import those routes, and that all GWs to the same SR domain will
> > > > >      import each other's routes and will learn (auto-discover) the
> > > > >      current set of active GWs for the SR domain.”
> > > > >
> > > > > **So if this is the case and we are tracking the underlay RIB and
> > > > > attach a route target to all the ingress PE & P next hops, which is
> > > > > loopback0, this is literally identical to BGP-LU importing all the
> > > > > loopbacks between domains or using CT class.  There is no need for
> > > > > this feature to use the Tunnel Encapsulation attribute.  I am not
> > > > > following why you would not use BGP-LU or the CT class RIB.**
> > > > >
> > > > >   “To avoid the side effect of applying the Tunnel Encapsulation
> > > > >   attribute to any packet that is addressed to the GW itself, the GW
> > > > >   SHOULD use a different loopback address for packets intended for it.”
> > > > >
> > > > > **I don’t understand this statement, as the next hop is the ingress
> > > > > and egress PE loopback0; that is the next hop being tracked for the
> > > > > gateway load balancing.  The GW device subnet between the GW and PE is
> > > > > not advertised into the internet domain, as we do next-hop-self on the
> > > > > PE PE-RR iBGP peering, and so the GW to PE subnet is not advertised.**
> > > > > Looking at it a second time, I think we are thinking here of BGP-LU
> > > > > inter-AS option C style import of loopbacks between domains, and so
> > > > > instead of importing the loopback0 which carries all packets on the GW
> > > > > device, use a different loopback on GW1 so it does not carry the FEC
> > > > > of all BAU packets, a similar concept to that utilized in RSVP-TE to
> > > > > VPN mapping, the "per-vrf TE" concept
>


-- 

Gyan Mishra
Network Solutions Architect
Email gyan.s.mishra@verizon.com
M 301 502-1347