some comments on draft-ietf-idr-bgp4-experience-protocol-02.txt

Curtis Villamizar <curtis@laptoy770.fictitious.org> Wed, 17 September 2003 14:44 UTC

Message-Id: <200309171438.h8HEc2Yf001216@mail.avici.com>
To: idr@merit.edu
Cc: curtis@fictitious.org
Reply-To: curtis@fictitious.org
Subject: some comments on draft-ietf-idr-bgp4-experience-protocol-02.txt
Date: Wed, 17 Sep 2003 10:38:08 -0400
From: Curtis Villamizar <curtis@laptoy770.fictitious.org>
Sender: owner-idr@merit.edu
Precedence: bulk

ref:
	Title		: Experience with the BGP-4 Protocol
	Author(s)	: D. McPherson, K. Patel
	Filename	: draft-ietf-idr-bgp4-experience-protocol-02.txt

Danny,

This looks to be in good shape.  If you do another version of the
draft, here are some suggestions.

None of this is essential to the draft but it might be helpful.

Curtis


In the MED section you might want to describe the common "hot potato
/ cold potato" usage.  Here is some suggested text.

   In a situation where traffic flows between a pair of destinations,
   each connected to two transit networks, each of the transit
   networks has the choice of either sending the traffic to the
   closest peering with the other transit provider or passing traffic
   to the peering which advertises the least cost through the other
   provider.  The former method is called "hot potato routing"
   because, like a hot potato held in bare hands, whoever has it
   tries to get rid of it quickly.  Hot potato routing is
   accomplished by not passing the EBGP learned MED into IBGP.  This
   minimizes transit traffic for the provider routing the traffic.
   Far less common is "cold potato routing" where the transit
   provider uses its own transit capacity to get the traffic to the
   point in the adjacent transit provider advertised as being closest
   to the destination.  Cold potato routing is accomplished by
   passing the EBGP learned MED into IBGP.
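
To illustrate the difference for readers of the draft, here is a
minimal sketch in Python of exit selection for one destination.  The
names (Exit, choose_exit, pass_med_into_ibgp) and the cost values are
hypothetical and are not taken from any implementation; the sketch
also ignores every other step of the BGP decision process.

```python
from dataclasses import dataclass

@dataclass
class Exit:
    name: str       # peering point toward the other provider (hypothetical)
    igp_cost: int   # our internal cost to reach this peering
    med: int        # MED learned over EBGP at this peering

def choose_exit(exits, pass_med_into_ibgp):
    """Pick the exit for one destination learned from one neighbor AS."""
    if pass_med_into_ibgp:
        # Cold potato: the lowest MED wins, IGP cost only breaks ties.
        return min(exits, key=lambda e: (e.med, e.igp_cost))
    # Hot potato: MED is not visible in IBGP, so the nearest exit wins.
    return min(exits, key=lambda e: e.igp_cost)

exits = [
    Exit("east", igp_cost=5, med=100),   # close to us, far from destination
    Exit("west", igp_cost=40, med=10),   # far from us, close to destination
]
hot = choose_exit(exits, pass_med_into_ibgp=False)
cold = choose_exit(exits, pass_med_into_ibgp=True)
print(hot.name, cold.name)
```

With these assumed costs the hot potato choice is the nearby "east"
peering and the cold potato choice is the "west" peering that the
neighbor advertises as closest to the destination.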

   If one transit provider uses hot potato routing and another uses
   cold potato, traffic between the two tends to be symmetric.
   Depending on the business relationships, if one provider has more
   capacity or a significantly less congested transit network, then
   that provider may use cold potato routing.  An example of
   widespread use of cold potato routing was the NSF funded NSFNET
   backbone and NSF funded regional networks in the mid 1990s.

   In some cases a provider may use hot potato routing for some
   destinations for a given peer AS and cold potato routing for
   others.  An example of this is the different treatment of
   commercial and research traffic in the NSFNET in the mid 1990s.

Optionally add to the last paragraph "This might best be described as
'mashed potato routing', a term which reflects the complexity of
router configurations in use at the time".  :-)

Under "Internet Dynamics" I would prefer if you would add:

   None of the current implementations of BGP Route Flap Damping store
   route history by unique NLRI and AS Path although it is listed as
   mandatory in RFC 2439.  A potential result of failure to consider
   each AS Path separately is an overly aggressive suppression of
   destinations in a densely meshed network, with the most severe
   consequence being suppression of a destination after a single
   failure.  Because the top tier ASes in the Internet are densely
   meshed, these adverse consequences are observed.
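
A rough sketch in Python of why the history key matters.  The penalty
and threshold values are illustrative only and are not taken from RFC
2439, penalty decay is omitted on the assumption that all events
arrive within a short interval, and the prefix and AS numbers come
from the documentation ranges.

```python
from collections import defaultdict

PENALTY_PER_CHANGE = 1000   # illustrative value, not from RFC 2439
SUPPRESS_THRESHOLD = 2000   # illustrative value, not from RFC 2439

def suppressed_keys(events, per_path):
    """Accumulate penalty per history key and return the suppressed keys."""
    penalty = defaultdict(int)
    for prefix, as_path in events:
        # The history key is either the prefix alone or (prefix, AS path).
        key = (prefix, as_path) if per_path else prefix
        penalty[key] += PENALTY_PER_CHANGE
    return {k for k, p in penalty.items() if p > SUPPRESS_THRESHOLD}

# One failure in a dense mesh: the same prefix is seen over three
# successively longer AS paths, each path changing only once.
events = [
    ("192.0.2.0/24", (64500, 64510)),
    ("192.0.2.0/24", (64501, 64500, 64510)),
    ("192.0.2.0/24", (64502, 64501, 64500, 64510)),
]
by_prefix = suppressed_keys(events, per_path=False)
by_path = suppressed_keys(events, per_path=True)
print(by_prefix, by_path)
```

Keyed by prefix alone, the three changes add up and the destination
is suppressed after one failure.  Keyed by prefix and AS path, no
single history crosses the threshold.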

After "Limit Rate Updates" you might add a section:

   13 1/2.  Consideration of TCP Characteristics

   If a TCP receiver is processing input more slowly than the sender
   or if the TCP connection rate is the limiting factor, a form of
   backpressure is observed by the TCP sending application.  When the
   TCP buffer fills, the sending application will either block on the
   write or receive an error on the write.  Common errors, seen in
   early implementations and in the occasional naive new
   implementation, are either to set options to block on the write or
   to set options for non-blocking writes and then treat the errors
   due to a full buffer as fatal.
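
A minimal sketch in Python of the non-fatal handling, using a local
socket pair as a stand-in for a BGP session.  The helper name
try_send is hypothetical, and the amount of data needed to fill the
buffer depends on the operating system.

```python
import socket

def try_send(sock, data):
    """Return the number of bytes accepted; 0 means the buffer is full."""
    try:
        return sock.send(data)
    except BlockingIOError:
        # A full send buffer is expected backpressure, not a session error.
        return 0

a, b = socket.socketpair()
a.setblocking(False)        # never block the whole process on one peer
b.setblocking(False)

chunk = b"\x00" * 65536
sent_total = 0
full_seen = False
for _ in range(10000):      # the receiver is not reading, so this fills up
    n = try_send(a, chunk)
    if n == 0:
        full_seen = True
        break
    sent_total += n

# The slow receiver finally catches up and drains what was queued.
drained = 0
while True:
    try:
        data = b.recv(65536)
    except BlockingIOError:
        break
    drained += len(data)

resumed = try_send(a, b"x")  # writes succeed again once space is free
a.close()
b.close()
```

The point is only that the full-buffer condition is reported to the
caller as "try again later" and the session stays up.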

   Once it is recognized that full write buffers are to be expected,
   additional implementation pitfalls remain.  The application should
   not attempt to store the TCP stream within the application itself.
   If the receiver or the TCP connection is persistently slow, then
   the buffer can grow until memory is exhausted.  A BGP
   implementation must send changes to all peers for which the TCP
   connection is not blocked and must remember to send those changes
   to the remaining peers when the connection becomes unblocked.

   If the preferred route for a given NLRI changes multiple times
   while writes to one or more peers are blocked, only the most recent
   best route needs to be sent.  In this way BGP is work conserving.
   In times of extremely high route change, a higher volume of route
   change is sent to those peers which are able to process it more
   quickly and a lower volume of route change is sent to those peers
   not able to process the changes as quickly.
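
One way to picture this is the following sketch in Python, assuming
a per-peer table of unsent changes keyed by prefix.  The Peer class,
its fields, and the route strings are hypothetical; a real
implementation would write to a TCP connection rather than append to
a list.

```python
class Peer:
    def __init__(self, name):
        self.name = name
        self.blocked = False   # True while the TCP write buffer is full
        self.pending = {}      # prefix -> latest best route not yet sent
        self.sent = []         # (prefix, route) pairs in the order written

    def flush(self):
        """Send everything pending, unless the connection is blocked."""
        if self.blocked:
            return
        for prefix, route in self.pending.items():
            self.sent.append((prefix, route))
        self.pending.clear()

def best_route_changed(peers, prefix, route):
    for peer in peers:
        # Overwrite any older unsent change for this prefix, so state per
        # peer is bounded by the number of prefixes, not by the number
        # of changes.
        peer.pending[prefix] = route
        peer.flush()

fast, slow = Peer("fast"), Peer("slow")
slow.blocked = True
for route in ["via A", "via B", "via C"]:
    best_route_changed([fast, slow], "192.0.2.0/24", route)

slow.blocked = False   # the connection becomes writable again
slow.flush()
print(len(fast.sent), slow.sent)
```

The fast peer receives all three changes, while the slow peer
receives only the final best route once its connection unblocks.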

   For implementations which handle differing peer capacity to absorb
   route change well, if the majority of route change is contributed
   by a subset of unstable NLRI, the only impact on relatively stable
   NLRI which make an isolated route change is a slower convergence,
   for which convergence time remains bounded regardless of the amount
   of instability.