Re: [Gen-art] Gen-ART review for draft-ietf-grow-ix-bgp-route-server-operations

Nick Hilliard <nick@inex.ie> Thu, 16 October 2014 22:42 UTC

Message-ID: <544049C9.7090608@inex.ie>
Date: Thu, 16 Oct 2014 23:42:17 +0100
From: Nick Hilliard <nick@inex.ie>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:31.0) Gecko/20100101 Thunderbird/31.2.0
MIME-Version: 1.0
To: "Romascanu, Dan (Dan)" <dromasca@avaya.com>, "gen-art@ietf.org" <gen-art@ietf.org>
References: <9904FB1B0159DA42B0B887B7FA8119CA5C8A1DDA@AZ-FFEXMB04.global.avaya.com>
In-Reply-To: <9904FB1B0159DA42B0B887B7FA8119CA5C8A1DDA@AZ-FFEXMB04.global.avaya.com>
X-Company-Info-1: Internet Neutral Exchange Association Limited. Registered in Ireland No. 253804
X-Company-Info-2: Registered Offices: 1-2, Marino Mart, Fairview, Dublin 3
X-Company-Info-3: Internet Neutral Exchange Association Limited is limited by guarantee
X-Company-Info-4: Offices: 4027 Kingswood Road, Citywest, Dublin 24.
Content-Type: text/plain; charset="windows-1252"
Content-Transfer-Encoding: 8bit
Archived-At: http://mailarchive.ietf.org/arch/msg/gen-art/zASDe8hBJYrKOcTLmt0C2QjsPYA
Cc: "draft-ietf-grow-ix-bgp-route-server-operations.all@tools.ietf.org" <draft-ietf-grow-ix-bgp-route-server-operations.all@tools.ietf.org>
Subject: Re: [Gen-art] Gen-ART review for draft-ietf-grow-ix-bgp-route-server-operations

Dan,

Comments inline.  Coauthors: rNNN refers to svn revision.

On 18/09/2014 11:27, Romascanu, Dan (Dan) wrote:
[...]
> 1.       The reference [RS-ARCH] mentioned in 4.2.1.1 and 4.2.1.2 is not
> reachable (Error 404). As the understanding of the issues described in the
> two sections depend on this reference, a valid reference is required.

This has now been updated to a fresh link (svn r101).

The location of this URL has not been static over the years, which means
that in future, a web search may be required to locate it.  The information
included in the reference was deliberately chosen to ensure that it could
be found by a future web search.

> 2.       Section 4.2.1.3 uses the term ‘flat layer 2 network’ which has at
> least two meanings depending on the context or layer – either one VLAN
> space at the link layer (as to differentiate from Customer VLAN and
> Provider VLAN) or a bridged network with no routers between the bridged
> segments. Clarification is needed.

changed to "deployed at IXPs where all connected routers are on the same
layer 2 broadcast domain". r102.

> 3.       The usage of keywords is inconsistent in a few places. In 4.6.1 the
> ‘should’ in the second paragraph needs to be capitalized. In 4.6.3 we have
> a capitalized SHOULD, but then a non-capitalized ‘may’ for statements that
> both seem to describe requirements of the same level.

mmm, the great RFC 2119 debate.  SHOULD is better in this paragraph. MAY
won't work for the other points. r103.

(oops, just noticed another typo in the previous paragraph too - r104.)

> 4.       I doubt that Section 4.7 is that useful. On one hand
> reliability of layer 2 forwarding is not in my opinion such a big issue,
> and measures can be taken at the link layer to improve it (use lags or
> redundant paths). Second the recommended mitigation (RFC 5881 BFD) is
> described as non-optimal, with no other alternative. I would just drop this
> section completely.

Practical experience shows that this happens from time to time, caused by
things such as broken transceivers on an individual LAG bearer link,
misconfigured VPLS LSPs and so forth.  In fact, at the end of September, we
had a serious problem at INEX for several hours due to a suspected ASIC
misprogramming event on a switch.  ICMP monitoring probes didn't detect
this and BGP sessions on all affected routers stayed up, but ~90-95% of
regular traffic was dropped.  Full failure is fine because traffic will be
rerouted; partial failure like this, which causes traffic to be blackholed,
is devastating for the affected parties when it happens.

Generally speaking, this is an extremely difficult problem to handle
because it requires monitoring from angles that the IXP operator may not
have visibility into, e.g. point-to-point connectivity between two MAC
addresses across a fabric.  If there were a good generalised way of
handling this problem, it would be present in section 4.7, but there isn't
one, short of implementing ad-hoc heuristic-based reactive monitoring to
look out for specific failure modes.  This mostly throws up false negatives
due to, e.g., IXP participant traffic engineering.  You're not the only
person who finds this unsatisfactory.
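
To make "ad-hoc heuristic-based reactive monitoring" a bit more concrete,
here is a rough sketch of the kind of check meant.  It is illustrative only:
the PortSample fields, the 80% threshold and the idea of a per-port traffic
baseline are all assumptions made for the example, not a description of
what INEX or any other IXP actually runs.  It flags a port whose traffic
has collapsed relative to its baseline while the BGP session still looks
healthy, which is the partial-failure pattern described above.

```python
from dataclasses import dataclass


@dataclass
class PortSample:
    """One measurement for an IXP participant port (hypothetical structure)."""
    port: str
    baseline_bps: float   # assumed "normal" rate for this time of day
    current_bps: float    # most recently measured rate
    bgp_session_up: bool  # is the participant's BGP session still established?


def suspect_partial_failure(sample: PortSample, drop_threshold: float = 0.8) -> bool:
    """Return True if traffic has dropped sharply while BGP still looks fine.

    drop_threshold is the fraction of baseline traffic that must have
    disappeared before the port is flagged (0.8 is an arbitrary choice).
    """
    if sample.baseline_bps <= 0:
        # No usable baseline, so nothing can be concluded.
        return False
    drop = 1.0 - sample.current_bps / sample.baseline_bps
    return sample.bgp_session_up and drop >= drop_threshold


samples = [
    PortSample("member-a", baseline_bps=1e9, current_bps=0.08e9, bgp_session_up=True),
    PortSample("member-b", baseline_bps=1e9, current_bps=0.95e9, bgp_session_up=True),
    PortSample("member-c", baseline_bps=1e9, current_bps=0.0, bgp_session_up=False),
]

# Only member-a matches "traffic mostly gone, BGP still up".
suspects = [s.port for s in samples if suspect_partial_failure(s)]
print(suspects)
```

Note that a check like this cannot tell a forwarding fault from a
participant deliberately moving traffic elsewhere, which is exactly why
this sort of heuristic misclassifies events as often as it does.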

On balance, we'd prefer to leave this section in.

> Nits/editorial comments:
>
> 1.       The English syntax of the second paragraph in the Abstract is broken.

yes, clumsy.  Reworded to:

--
... reduce the administrative and operational overhead associated with
connecting to IXPs; in some cases, route servers are used by IXP
participants as their preferred means of exchanging routing information.
--

r107

> 2.       In the introduction there is a mention of ‘using shared Layer-2
> networking media such as Ethernet’. Actually Ethernet is seldom used
> nowadays as a shared medium, I would just recommend saying ‘using data link
> layer protocols such as Ethernet’

noted.  r105.

> 3.       In section 4.2 s/optimization technique is
> implemented/optimization technique that is implemented/

oops! r106.

We're working through a number of points brought up by various people at
the moment, and expect to post a new ID revision in a couple of days.

Otherwise, thanks for the time you took to read the draft and write this
review - this is very much appreciated.

Nick