RE: Persistent loops when mixing rtgwg-enterprise-pa-multihoming and rtgwg-dst-src-routing

Chris Bowers <cbowers@juniper.net> Wed, 26 July 2017 22:11 UTC

From: Chris Bowers <cbowers@juniper.net>
To: David Lamparter <equinox@diac24.net>, "rtgwg@ietf.org" <rtgwg@ietf.org>
CC: Anton Smirnov <as@cisco.com>, Jen Linkova <furry@google.com>, Fred Baker <FredBaker.IETF@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/rtgwg/Mf_JSNYQSKSv3K3GVqtAXZsMRWg>

David,

It seems to me that the example that creates a routing loop depends on having two routes where
the non-default source prefix of one route contains the source prefix of the other route.

Specifically, S=2001:db8::/32 contains S=2001:db8:ffff::/48.

I think that we can resolve the discrepancy between the two forwarding behavior descriptions if we
generalize rule #3 from section 3 of draft-ietf-rtgwg-enterprise-pa-multihoming-01 to propagate
routes from each less specific source-prefix-scoped forwarding table to the more specific
source-prefix-scoped forwarding tables contained in the less specific one.  The existing rule,
which propagates routes from the unscoped (S=::/0) forwarding table to the scoped forwarding
tables, then becomes a special case, since every scoped table is contained in the S=::/0 prefix.

With this generalization, the forwarding table on B scoped for S=2001:db8:ffff::/48 would inherit an extra route from
the forwarding table scoped for S=2001:db8::/32, which contains it.  A packet with S=2001:db8:ffff::1, D=2001:db8::1
would then match 2001:db8::/32 local on B instead of ::/0 via A, so there is no routing loop.  The resulting
forwarding table on B scoped for S=2001:db8:ffff::/48 would look as below.

-  scope S=2001:db8:ffff::/48
   2001:db8:aaaa::/48 local
   2001:db8::/32 local
   ::/0 via A
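
A minimal Python sketch of the generalized rule, for illustration only.  The function names
(build_scoped_tables, lookup) and the tie-break (on an identical destination, the route from the
more specific source scope wins) are my assumptions and are not taken from either draft.

```python
import ipaddress as ip

# Routes known to B as (destination, source scope, next hop), from the example.
routes = [
    ("::/0", "::/0", "via A"),
    ("2001:db8::/32", "2001:db8::/32", "local"),
    ("2001:db8:aaaa::/48", "2001:db8:ffff::/48", "local"),
]

def build_scoped_tables(routes):
    """One table per source scope; each inherits routes from every scope containing it."""
    scopes = {ip.ip_network(s) for _, s, _ in routes}
    tables = {}
    for scope in scopes:
        table = {}
        # Visit less specific source scopes first so that a more specific
        # scope overrides an inherited route for the same destination
        # (assumed tie-break, see the note above).
        for d, s, nh in sorted(routes, key=lambda r: ip.ip_network(r[1]).prefixlen):
            if scope.subnet_of(ip.ip_network(s)):
                table[ip.ip_network(d)] = nh
        tables[scope] = table
    return tables

def lookup(tables, src, dst):
    """Pick the longest matching source scope, then the longest matching destination."""
    src, dst = ip.ip_address(src), ip.ip_address(dst)
    scope = max((s for s in tables if src in s), key=lambda s: s.prefixlen)
    table = tables[scope]
    best = max((d for d in table if dst in d), key=lambda d: d.prefixlen)
    return table[best]

tables = build_scoped_tables(routes)
print(lookup(tables, "2001:db8:ffff::1", "2001:db8::1"))
```

With the three routes from the example, the table for scope 2001:db8:ffff::/48 ends up with the
three entries listed above, and the lookup for the looping packet resolves on B itself.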

Does this generalization of rule #3 resolve the discrepancy?

Thanks,
Chris


-----Original Message-----
From: David Lamparter [mailto:equinox@diac24.net] 
Sent: Thursday, July 20, 2017 2:42 AM
To: rtgwg@ietf.org
Cc: Anton Smirnov <as@cisco.com>; Jen Linkova <furry@google.com>; Chris Bowers <cbowers@juniper.net>; Fred Baker <FredBaker.IETF@gmail.com>
Subject: Persistent loops when mixing rtgwg-enterprise-pa-multihoming and rtgwg-dst-src-routing

Gah, forgot Cc:s, resending to list so replies will inherit Cc:s (please reply on this one to get the Cc:s)


On Wed, Jul 19, 2017 at 07:29:13PM +0200, David Lamparter wrote:
Hello again, rtgwg,


Unfortunately (and possibly contradicting earlier statements I may have made to the contrary), the routing system behaviour described in
https://tools.ietf.org/html/draft-ietf-rtgwg-enterprise-pa-multihoming-01#section-3
is not compatible with the behaviour described in https://tools.ietf.org/html/draft-ietf-rtgwg-dst-src-routing
and will result in loops in specific cases when implementations of the two are mixed.


The failure scenario is illustrated by the following setup:

Consider two connected routers A and B, with A implementing dst-src-routing and B implementing enterprise-pa-multihoming.

Suppose:
- A advertises D=::/0, S=::/0
- B advertises D=2001:db8::/32, S=2001:db8::/32
- B advertises D=2001:db8:aaaa::/48, S=2001:db8:ffff::/48

B will build the following "scoped tables":
- unscoped:
  ::/0 via A
- scope 2001:db8::/32
  2001:db8::/32 local
  ::/0 via A
- scope 2001:db8:ffff::/48
  2001:db8:aaaa::/48 local
  ::/0 via A

Note that the last scope has no entry for 2001:db8::/32, since item 3 in the first list in section 3 of the draft only prescribes propagating unscoped entries to the scoped table.

This leads to a packet with S=2001:db8:ffff::1, D=2001:db8::1 looping between the routers:
- router B performs the lookup as:
  - longest matching scoped table is S=2001:db8:ffff::/48
    - scoped table contains route to ::/0 pointing at A
- router A performs the lookup as:
  - most specific destination match is 2001:db8::/32
    - under this destination, route with S=2001:db8::/32 points to B => persistent loop.
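
The two lookups above can be sketched in Python as follows.  This is an illustration under stated
assumptions, not code from either draft: the function names are hypothetical, dst_first_lookup
models A as "most specific destination that has a matching source", and scoped_lookup models B
with item 3 as currently written (only unscoped routes are copied into each scoped table).

```python
import ipaddress as ip

# (destination, source, advertising router) from the example.
ROUTES = [
    ("::/0", "::/0", "A"),
    ("2001:db8::/32", "2001:db8::/32", "B"),
    ("2001:db8:aaaa::/48", "2001:db8:ffff::/48", "B"),
]

def dst_first_lookup(src, dst):
    """Router A: most specific destination first, then most specific source."""
    src, dst = ip.ip_address(src), ip.ip_address(dst)
    by_specificity = sorted(
        ROUTES,
        key=lambda r: (ip.ip_network(r[0]).prefixlen, ip.ip_network(r[1]).prefixlen),
        reverse=True,
    )
    for d, s, owner in by_specificity:
        if dst in ip.ip_network(d) and src in ip.ip_network(s):
            return owner
    return None

def scoped_lookup(src, dst):
    """Router B: longest matching source scope, then longest destination in that table."""
    src, dst = ip.ip_address(src), ip.ip_address(dst)
    unscoped = ip.ip_network("::/0")
    scopes = {ip.ip_network(s) for _, s, _ in ROUTES}
    scope = max((s for s in scopes if src in s), key=lambda s: s.prefixlen)
    # Scoped table = the scope's own routes plus the propagated unscoped routes only.
    table = [(ip.ip_network(d), owner) for d, s, owner in ROUTES
             if ip.ip_network(s) in (scope, unscoped)]
    best = max((e for e in table if dst in e[0]), key=lambda e: e[0].prefixlen)
    return best[1]

src, dst = "2001:db8:ffff::1", "2001:db8::1"
print("A sends towards:", dst_first_lookup(src, dst))
print("B sends towards:", scoped_lookup(src, dst))
```

For the packet in question, A's lookup selects the route advertised by B and B's lookup selects
the route advertised by A, which is the loop described above.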


It is my understanding that this discrepancy in behaviour is accidental and the enterprise-pa-multihoming draft is attempting to describe the same behaviour in local wording.

Assuming this is the case, I'm unsure how we've ended up in this situation.  I've heard the rtgwg-dst-src-routing draft may be hard to understand.  If there are specific concerns, I'd ask for them to be voiced so that I can address them.  I've checked for such feedback and found none; if I lost any, I'm terribly sorry and hope it can be resent.
If there are shared but unspecific concerns, I suppose I can look at different ways to argue section 3 / 3.1.

Still assuming that this was intended to be identical in behaviour, I would hope the mismatch can be addressed in enterprise-pa-multihoming.
Looking at, well, the title of that draft, it seems that it's trying to be complete in describing the specific application in multihomed enterprises.  This may also explain the specific mismatch in behaviour; it's in fact identical as long as one only considers exit routing with non-overlapping source prefix restrictions.

rtgwg-dst-src-routing argues a broader applicability of the idea and tries to be thorough in describing a routing system feature to build on.
As such, I'd be very happy to see enterprise-pa-multihoming describing in detail how to apply this feature to the use case named in its title.

Assuming it is _not_ the case that the intention is for these to be identical, we're IMHO heading for a rather bad place.  I'd rather not argue this without confirming we're indeed there.


Cheers,

-David


P.S.: rtgwg-dst-src-routing already had a description of how to translate its routes into a form suitable for "policy routing"
implementations.  The version just posted adds a reference to https://hal.inria.fr/file/index/docid/947234/filename/source-sensitive-routing.pdf
which argues the implementation specifics and correctness of this translation in its full mathematical gore ;)