Re: [Idr] why has 4096 bytes limit on BGP messages size?

"Vishwas Manral" <vishwas.ietf@gmail.com> Mon, 18 June 2007 06:39 UTC

Message-ID: <77ead0ec0706172339i69ef6febwaea1ca5eec9a3386@mail.gmail.com>
Date: Sun, 17 Jun 2007 23:39:00 -0700
From: Vishwas Manral <vishwas.ietf@gmail.com>
To: curtis@occnc.com
Subject: Re: [Idr] why has 4096 bytes limit on BGP messages size?
In-Reply-To: <200706180505.l5I550pR003832@sailbum.orleans.occnc.com>
References: <982B9BFB-D286-40D5-9A07-3D2DB51BC102@cisco.com> <200706180505.l5I550pR003832@sailbum.orleans.occnc.com>
Cc: Tony Li <tli@cisco.com>, idr <idr@ietf.org>

Hi Curtis,

> > Hmmm, ok, I guess we haven't been explicit enough.  I'll try again:
>
> We've been clear to the careful reader.  I mentioned that the TCP
> window has nothing to do with the BGP message size.
>
> BGP uses TCP therefore its message size is not relevant to performance
> except for internal parsing issues and the MTU is irrelevant to the
> BGP message size.
I agree with both of the statements above. Having recently written
DTCP/IP and AKE implementations from scratch (both work over TCP), I
have seen the same thing. TCP does not preserve the application's
message boundaries. So if the application sends 2 messages of 2500
bytes each, the receiver may see exactly what it would see if the
application had sent 1 message of 5000 bytes.

Let me know if I am wrong.
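
To make the point concrete, here is a minimal sketch (not taken from
any real implementation) of how a receiver recovers message boundaries
from a TCP byte stream. It assumes the RFC 4271 header layout of a
16-byte marker, a 2-byte length and a 1-byte type. The names
make_message, Deframer and deframe_in_chunks are made up for
illustration, and the message bodies are filler bytes, not valid
UPDATE contents.

```python
import struct

BGP_HEADER_LEN = 19      # 16-byte marker + 2-byte length + 1-byte type
BGP_MAX_MSG_LEN = 4096   # the limit discussed in this thread


def make_message(msg_type, body=b""):
    """Build one BGP-style message: marker, total length, type, body."""
    length = BGP_HEADER_LEN + len(body)
    if length > BGP_MAX_MSG_LEN:
        raise ValueError("message exceeds 4096 bytes")
    return b"\xff" * 16 + struct.pack("!HB", length, msg_type) + body


class Deframer:
    """Reassemble whole messages from arbitrary chunks of a byte stream."""

    def __init__(self):
        self.buf = bytearray()

    def feed(self, chunk):
        """Add received bytes; return every message that is now complete."""
        self.buf += chunk
        messages = []
        while len(self.buf) >= BGP_HEADER_LEN:
            # The length field sits right after the 16-byte marker.
            length, msg_type = struct.unpack_from("!HB", self.buf, 16)
            if not BGP_HEADER_LEN <= length <= BGP_MAX_MSG_LEN:
                raise ValueError("bad length field")
            if len(self.buf) < length:
                break  # wait for the rest of this message
            messages.append((msg_type, bytes(self.buf[BGP_HEADER_LEN:length])))
            del self.buf[:length]
        return messages


def deframe_in_chunks(stream, chunk_size):
    """Simulate TCP delivering the stream in chunk_size-byte pieces."""
    deframer = Deframer()
    out = []
    for i in range(0, len(stream), chunk_size):
        out.extend(deframer.feed(stream[i:i + chunk_size]))
    return out
```

Two 2500-byte messages written back to back yield the same two
messages whether the bytes arrive as one 5000-byte read, two 2500-byte
reads, or 7 bytes at a time. The boundaries come from the length
field, not from how TCP happened to segment the data.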

Thanks,
Vishwas

On 6/17/07, Curtis Villamizar <curtis@occnc.com> wrote:
>
> Hi Tony,
>
> Some further clarification inline, not for you since you know this,
> but for anyone that might still be confusing BGP message size and TCP
> buffer size (aka TCP window size).
>
> In message <982B9BFB-D286-40D5-9A07-3D2DB51BC102@cisco.com>
> Tony Li writes:
> >
> > On Jun 17, 2007, at 6:12 PM, Fenggen Jia wrote:
> >
> > > >     I think my initial question is why the protocol has a 4K limit
> > > > on message sizes. That is different from the implementation: an
> > > > implementation may choose to use a large read or write buffer (>4K)
> > > > to handle multiple updates at one time. Still, my question is whether
> > > > a message size limit is good practice in protocol design?
> >
> > Hmmm, ok, I guess we haven't been explicit enough.  I'll try again:
>
> We've been clear to the careful reader.  I mentioned that the TCP
> window has nothing to do with the BGP message size.
>
> That was a different person confused about this.
>
> > 1) First, an implementation should NOT be using a message size that
> > is different than the specified 4k message size limit.  If an
> > implementation sends messages more than 4k, then other
> > implementations will not be able to parse them.  If an implementation
> > cannot receive 4k messages, then it will also not be able to
> > interoperate.
> >
> > 1a) Having a fixed size is good because it makes the protocol
> > implementations easy.  There is no point to having complexity in an
> > implementation if it provides no benefit.  Large messages don't
> > provide a wonderful benefit, as they need to be large enough to carry
> > the path attributes and associated prefixes.  For this purpose 4k is
> > probably adequate to date.
>
> Right.
>
> > 1b) Historically, 4k was considered a bit wasteful.  Of course, it
> > was wonderfully simple compared to EGP which used fragmented
> > packets.  Care to parse a 16k jumbo-gram?  Care to debug that?  Trust
> > me, it's not fun.
>
> Actually EGP was worse than that.  EGP required that you put everything
> in one datagram packet which eventually exceeded the IP 64KB limit.
> Fortunately only one network was using EGP when that happened (NASA
> still had Proteon routers that ran EGP and not BGP so they had to take
> partial routing and use default - they were even worse off when CIDR
> came along but finally retired the Proteons).  Then there was the AGS
> FDDI interface that couldn't handle 5 consecutive IP packets or
> fragments without dropping the 5th one.  Not amusing when EGP packets
> required 5 fragments (somewhere beyond 20KB in size).  Those were
> interesting times.  It's boring at times now that things mostly work.  :-)
>
> > 2) The 4k message size is *completely independent* of the TCP window
> > size.  An implementation is perfectly free to compose any number of
> > messages, each of which is within the 4k limit.  The implementation
> > can then cram any number of messages into its TCP socket, up to the
> > buffering limits of that TCP.
>
> I think that is the point people are missing.
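
A rough sketch of the packing Tony describes in (2), under stated
assumptions: the UPDATE body layout follows RFC 4271 (2-byte withdrawn
routes length, 2-byte total path attribute length, attributes, then
NLRI), only IPv4 prefixes are handled, and the path attributes are
passed in as opaque, already-encoded bytes. The names encode_prefix,
build_update and pack_updates are hypothetical.

```python
import ipaddress
import struct

BGP_MAX_MSG_LEN = 4096
BGP_HEADER_LEN = 19
UPDATE_TYPE = 2
# Header + withdrawn-routes length field + path-attribute length field.
FIXED_OVERHEAD = BGP_HEADER_LEN + 2 + 2


def encode_prefix(cidr):
    """Encode an IPv4 prefix as <length in bits><minimum address bytes>."""
    net = ipaddress.ip_network(cidr)
    nbytes = (net.prefixlen + 7) // 8
    return bytes([net.prefixlen]) + net.network_address.packed[:nbytes]


def build_update(attrs, nlri):
    """Build one UPDATE with no withdrawn routes."""
    body = struct.pack("!H", 0) + struct.pack("!H", len(attrs)) + attrs + nlri
    length = BGP_HEADER_LEN + len(body)
    return b"\xff" * 16 + struct.pack("!HB", length, UPDATE_TYPE) + body


def pack_updates(attrs, prefixes):
    """Fill each UPDATE up to the 4096-byte limit, then start another."""
    room = BGP_MAX_MSG_LEN - FIXED_OVERHEAD - len(attrs)
    messages = []
    nlri = b""
    for encoded in map(encode_prefix, prefixes):
        if len(encoded) > room:
            raise ValueError("attributes leave no room for any prefix")
        if len(nlri) + len(encoded) > room:
            messages.append(build_update(attrs, nlri))  # spill over
            nlri = b""
        nlri += encoded
    if nlri:
        messages.append(build_update(attrs, nlri))
    return messages
```

Every message returned stays within 4096 bytes, and the sender can
write all of them to the socket back to back, e.g. with
sock.sendall(b"".join(messages)). How much is in flight at once is
then governed by the TCP buffers, not by the message size.
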
>
> > 2a) Thus, the message size is *NOT* performance limiting, except when
> > an implementation could actually overfill a message.  Folks
> > maintaining current implementations might chime in here as to whether
> > or not they see this.
>
> Yes, BGP updates are filled and spill into a second update.  No, it is
> not performance limiting.  But you knew that.
>
> Performance is determined by the TCP buffer size set.  TCP buffer size
> in BSD is set with a setsockopt as I had mentioned before.  TCP buffer
> size has nothing to do with BGP message size.
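
For anyone who has not used it, the setsockopt call Curtis mentions
looks roughly like this through Python's socket module. This is a
sketch only: the 65536-byte sizes are arbitrary values picked for
illustration, and make_socket_with_buffers is a made-up helper name.

```python
import socket


def make_socket_with_buffers(sndbuf, rcvbuf):
    """Create a TCP socket and request the given buffer sizes in bytes."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, sndbuf)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf)
    return s


def effective_buffers(s):
    """Read back what the kernel actually granted."""
    snd = s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    rcv = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return snd, rcv


if __name__ == "__main__":
    sock = make_socket_with_buffers(65536, 65536)
    print(effective_buffers(sock))
    sock.close()
```

The kernel may round or clamp the requested values, so reading them
back is the only way to know what is in effect. Either way, a buffer
of this order holds many 4096-byte BGP messages, which is why the
message size limit and the buffer size are separate questions.
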
>
> BGP message size has little to do with performance unless, as I had
> mentioned, parsing attributes such as AS Path more than once ever
> became an issue.  If anything, it is less of an issue over time.
>
> > So, in summary, yes, a 4k message size limit is a fine situation *for
> > BGP*, for the way that it behaves and the job that it does.  This
> > does *NOT* necessarily generalize to other protocols, (e.g. OSPF)
> > where 4k exceeds the most common MTUs.  In those cases, you'd end up
> > with fragmentation, and that's bad.
>
> OSPF does not use TCP therefore its message size is relevant to
> performance and cannot exceed the MTU.
>
> BGP uses TCP therefore its message size is not relevant to performance
> except for internal parsing issues and the MTU is irrelevant to the
> BGP message size.
>
> > Regards,
> > Tony
>
> Curtis
>
> ps - For those new to IDR (and some of the posters on this thread seem
> to be) this sort of mini-lecture on how TCP works and how BGP uses it
> is somewhat of a rerun.  The topic or something similar comes up on
> this mailing list every few years.
>
> _______________________________________________
> Idr mailing list
> Idr@ietf.org
> https://www1.ietf.org/mailman/listinfo/idr
>
