Re: [Pppext] LCP echo request/reply support over multilink interface (RFC 1990)

James Carlson <carlsonj@workingcode.com> Thu, 18 March 2010 13:58 UTC

Message-ID: <4BA22FCE.2000208@workingcode.com>
Date: Thu, 18 Mar 2010 09:51:10 -0400
From: James Carlson <carlsonj@workingcode.com>
User-Agent: Thunderbird 2.0.0.22 (X11/20090605)
MIME-Version: 1.0
To: Y Prasad <yprasad@juniper.net>
References: <4B98E356.2070804@workingcode.com> <201003111520.o2BFKpeP066268@calcite.rhyolite.com> <0DB0FFEA6887E349861A3F6B40D71C3A0639CE88@gaugeboson.jnpr.net> <4B9E6FD0.5080602@workingcode.com> <0DB0FFEA6887E349861A3F6B40D71C3A0639D15A@gaugeboson.jnpr.net>
In-Reply-To: <0DB0FFEA6887E349861A3F6B40D71C3A0639D15A@gaugeboson.jnpr.net>
Cc: pppext@ietf.org, i.goyret@alcatel-lucent.com, Vernon Schryver <vjs@calcite.rhyolite.com>
Subject: Re: [Pppext] LCP echo request/reply support over multilink interface (RFC 1990)

Y Prasad wrote:
> I'm unable to parse that statement.  Could you provide specifics on how
> running LCP Echo-Request at the bundle level provides some sort of
> performance improvement?  Or at least what link flaps have to do with
> the problem?  I cannot see how that's so.
> 
> Yp> LCP keep-alives help to identify a faulty link and the peer state
> (alive/dead).
> If a user depends on link Carrier Detect (or some other means) to
> identify a faulty link, he may opt for no keep-alives on that link. And
> to identify whether the peer is alive or not, bundle keep-alives alone
> are good enough. This would reduce control traffic and the flooding of
> LCP keep-alives on multiple link flaps/bundle flaps when links rejoin
> the bundle after LCP.

If the user can actually rely on the underlying medium-dependent health
checks, then I don't see what LCP Echo-Request would do at the bundle level.

If the peer is working, then those links should all be healthy, right?
And if the peer fails for some reason, then the links would (since
they're assumed to be reliable) indicate the lack of health, wouldn't they?

In order for LCP Echo-Request at the bundle level to be interesting, we
need to be in a world where the peer can fail completely but leave all
of the links "up," and where the only way any individual link alone
could fail would be something that is detected by the lower layer.

I suppose that's plausible, though I've never seen such a thing in
practice.  All of the systems I've seen have possible failure modes
where links remain "up" at the lower layer, but are not actually usable
for PPP for one reason or another.

As for the claimed control traffic reduction, I still don't get it.  If
a bunch of links go down, you're going to have to renegotiate LCP when
the links come back.  That's a given; it's part of the protocol.  LCP
Echo-Request (whether implemented or not) is no part of that process.
It has nothing to do with it.  In a typical implementation, you don't
bother with LCP Echo-Request at all until some time well after the link
has been established and has been passing data.

Thus, using LCP Echo-Request doesn't slow down the work in
reestablishing the bundle, nor does it impose any greater workload on
that case.

Even if there were such a traffic reduction (I don't believe it to be a
correct analysis), the two solutions can be made equivalent in terms of
overhead in a trivial way.  If you're going to send LCP Echo-Request
messages on the bundle at interval "M" seconds, and that interval is an
acceptable amount of traffic to you, then you can equivalently send LCP
Echo-Request messages at the per-link level at interval "M*N" seconds
(where N is the number of links in the bundle).

The overall message rate is the same.  No savings.  The only difference
is the kinds of errors you can detect.
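
As a rough illustration of that arithmetic, here is a minimal sketch. The
values chosen for M and N and the helper name echo_rate are made up for the
example; nothing here comes from any PPP implementation.

```python
from fractions import Fraction


def echo_rate(interval_s, senders=1):
    """Aggregate Echo-Request messages per second.

    Assumes `senders` independent senders, each transmitting one
    message every `interval_s` seconds (hypothetical helper).
    """
    return Fraction(senders) / Fraction(interval_s)


M = 10  # assumed bundle-level interval, in seconds
N = 4   # assumed number of member links in the bundle

# One sender on the bundle, every M seconds.
bundle_rate = echo_rate(M)

# N senders (one per link), each every M*N seconds.
per_link_rate = echo_rate(M * N, senders=N)

print(bundle_rate, per_link_rate)
```

Both rates come out as the same fraction, so the per-link scheme costs no
more traffic than the bundle-level one while still exercising each link.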

> Yp> I agree in the case where member links are "bad" (meaning they
> cannot properly reflect a faulty link).
> A "good link" here means one that does reflect its faultiness without
> the need for LCP keep-alives.

How do you know for certain that each of the member links can always
indicate whether they're capable of carrying PPP traffic without
significant error or delay?

And if they're able to do that, then why can't they also tell you
whether the peer is working at all?  That's the only remaining case --
that the peer is completely dead -- that bundle-level LCP Echo-Request
can detect for you.

> Even if it were made mandatory, that'd still just be in the new
> document, and there are many devices out there that are designed to RFC
> 1990, and might never be changed.
> 
> Yp> I agree. The only way is for the bundle keep-alive to be made an
> optional and negotiable attribute. Then it would work with existing
> installations: backward-compatible :).

So you'd be OK with not using LCP Echo-Request on existing platforms, right?

I guess I can agree that if you had a negotiation, you would know
whether you could default LCP Echo-Request operation to "on" or "off"
based on the outcome of the negotiation, without needing operator
intervention.

I'm not sure I see the lasting value in this, given the circumstances,
but ok.

For what it's worth, there's a poor-man's approach to this problem.
When you contact a new peer, you can assume that LCP is still in a
pretty good state one millisecond after it goes to Opened state.  When
you establish your first bundle with a novel peer, send a few LCP
Echo-Request messages over the bundle at some nominal interval (say,
three messages at three seconds apart).  If you receive any replies,
then you can assume that the peer does implement the feature, and you
can then continue to use it, and perhaps record that fact for that peer
in stable storage.  If you receive no replies, then you can assume that
it doesn't implement the feature, so you'll need to disable it.

The messages themselves can be all the negotiation you need.  And, best
of all, this approach works with existing equipment, and needs no new
document.
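
The probing idea above could be sketched roughly as follows. This is only an
illustration under stated assumptions: send_echo_request and
any_reply_received are hypothetical callbacks standing in for whatever the
local PPP implementation provides, and the function name is invented here.

```python
import time
from typing import Callable


def peer_supports_bundle_echo(
    send_echo_request: Callable[[int], None],
    any_reply_received: Callable[[], bool],
    attempts: int = 3,
    interval_s: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Probe a new peer with a few bundle-level LCP Echo-Requests.

    send_echo_request(n) is assumed to transmit probe number n over the
    bundle; any_reply_received() is assumed to report whether at least
    one Echo-Reply has arrived on the bundle so far.
    """
    for n in range(attempts):
        send_echo_request(n)
        sleep(interval_s)  # wait the nominal interval for a reply
        if any_reply_received():
            # Any reply at all: assume the peer implements the feature.
            return True
    # No replies to any probe: assume the feature is absent; disable it.
    return False
```

The caller would run this once after the first bundle with a novel peer
comes up, and could record the result for that peer in stable storage. The
sleep parameter exists only so the wait can be replaced in testing.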

> Yp> I think we can combine this fix with any others, if there are any.
> Writing another draft just for this may not be worth trying :)

OK.  It's up to you ... or any others who might be interested enough to
work out such a document.

-- 
James Carlson         42.703N 71.076W         <carlsonj@workingcode.com>