Re: [btns] Q: How to deal with connection latch breaks?

Nicolas Williams <Nicolas.Williams@sun.com> Sun, 26 July 2009 22:31 UTC

Date: Sun, 26 Jul 2009 17:13:31 -0500
From: Nicolas Williams <Nicolas.Williams@sun.com>
To: Mike Eisler <mre-ietf@eisler.com>
Message-ID: <20090726221331.GS1020@Sun.COM>
References: <e2cf27e98757db0c8561baadfb3ca335.squirrel@webmail.eisler.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <e2cf27e98757db0c8561baadfb3ca335.squirrel@webmail.eisler.com>
User-Agent: Mutt/1.5.7i
Cc: btns@ietf.org, lars.eggert@nokia.com
Subject: Re: [btns] Q: How to deal with connection latch breaks?

On Wed, Jul 01, 2009 at 05:37:10PM -0700, Mike Eisler wrote:
> Isn't this DISCUSS specific to SCTP? Russ writes in the DISCUSS:

Sort of.

Russ wrote, at one point:

| This is the point.  I do not think that hangs until killed or
| restarted should be on this menu.

with respect to the default handling of latch breaks when no additional
APIs are available (or used by the application).

IMO it's a generic issue.

My earlier opinion was that pretending that bits aren't moving is the
right thing to do because applications ought to have timeout handling,
if not even ULPs (think of SO_KEEPALIVE).  After all, latch break is not
fatal -- state transitions from BROKEN to ESTABLISHED are allowed.
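
For illustration only, here is a minimal sketch of the ULP-level
timeout idea, assuming a TCP socket and Python.  The helper names
(enable_keepalive, keepalive_enabled) and the default values are
hypothetical, and the finer-grained TCP_KEEP* options are
platform-specific, so they are set only where the platform exposes
them.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, probes=5):
    """Turn on keepalives so a peer that has gone silent is eventually noticed."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These tuning knobs are not available everywhere; skip any that are missing.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", probes)):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)


def keepalive_enabled(sock):
    """Report whether SO_KEEPALIVE is currently set on the socket."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
```

With keepalives on, a connection whose packets are being dropped after
a latch break would eventually fail at the ULP, even if the application
itself has no timeout logic.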

But reasonable people can disagree about that.  A latch break is likely
to be permanent in practice, and in any case, acting as though the
connection has been reset (as in valid RST received) is no more or less
harmful than acting as though bits aren't moving.

Also, a latch break may be indicative of an attack, and so resetting the
connection may be a better action[*].

To a reply along those lines, Russ wrote back:

| I'm happy with a SHOULD statement that the WG agrees is appropriate
| to handle the situation gracefully.

This thread is all about obtaining WG consensus on this issue, so we can
clear this DISCUSS.  All we need is consensus.

I'll be happy with either possible default behavior in the face of latch
breaks: act as if the connection was reset, or act as if bits aren't
moving.

I'd also be happy to require that implementors pick and implement one of
those two, without us giving a recommendation.  After all, neither
default affects interoperability, so why should we recommend or require
one or the other?
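
The two candidate defaults can be sketched as a toy model.  This is
not taken from the draft: the class and method names are hypothetical,
only the BROKEN and ESTABLISHED state names come from the discussion
above, and the treatment of a reset as permanent is an assumption made
for the sake of the example.

```python
from enum import Enum


class LatchState(Enum):
    ESTABLISHED = "ESTABLISHED"
    BROKEN = "BROKEN"


class BreakPolicy(Enum):
    RESET = "reset"   # act as if a valid RST had been received
    STALL = "stall"   # act as if bits aren't moving


class LatchedConnection:
    """Toy model of a ULP connection bound to a latch (hypothetical names)."""

    def __init__(self, policy):
        self.policy = policy
        self.state = LatchState.ESTABLISHED
        self.reset = False
        self.delivered = []

    def latch_break(self):
        self.state = LatchState.BROKEN
        if self.policy is BreakPolicy.RESET:
            self.reset = True   # assumption: a reset connection never recovers

    def latch_restore(self):
        # BROKEN -> ESTABLISHED is allowed, but it cannot undo a reset.
        if not self.reset:
            self.state = LatchState.ESTABLISHED

    def receive(self, packet):
        if self.reset:
            raise ConnectionResetError("latch break treated as reset")
        if self.state is LatchState.BROKEN:
            return False        # dropped; the application only sees silence
        self.delivered.append(packet)
        return True
```

Under STALL the connection can resume if the latch is re-established;
under RESET the application gets an immediate, familiar error.  Either
way nothing changes on the wire, which is why the choice does not
affect interoperability.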

> I am unsure that the SCTP section defines behavior which is consistent
> with application expectations.  The last paragraph of 5.4 implies that
> the whole connection terminates if one of the latches breaks.  This
> has an impact on the semantics of the application socket API.  While
> connection latching is transparent when everything is working, there
> are new failures that ripple to the application.  That is, the
> application will observe different behavior on a connection with and
> without latching.

Surely SCTP, and any connection-oriented ULP, must have a way to reset
connections, just like TCP does.  A latch break causing a connection
reset would not be new behavior as far as applications go.
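
To make that concrete, here is a rough sketch of the kind of read path
an existing application might already have, assuming a stream socket
and Python.  The function name read_once is hypothetical.  Both
candidate defaults land in branches such code needs today: a reset
shows up as ConnectionResetError, and "bits aren't moving" shows up as
a timeout.

```python
import socket


def read_once(sock, timeout):
    """Classify one read attempt as 'data', 'eof', 'timeout', or 'reset'."""
    sock.settimeout(timeout)
    try:
        data = sock.recv(4096)
    except socket.timeout:
        # Nothing arrived in time: what "bits aren't moving" looks like.
        return "timeout", b""
    except ConnectionResetError:
        # The same error the application would see after a TCP RST.
        return "reset", b""
    # An empty read on a stream socket means the peer closed in an orderly way.
    return ("data", data) if data else ("eof", b"")
```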

> My conclusion is that the API ought to provide information for the
> application about the connection latching, and it just does not seem
> to be there.  If you can point me to a discussion of this topic on the
> WG mail list, then I'll clear.  I'm not trying to alter consensus, but
> I do want to make sure that this topic was considered.

APIs are nice, but existing apps won't use them until updated, and
anyway, connection latching adds value even without adding APIs, which
means we need a default response to latch breaks in the absence of new
APIs (either because they are not implemented or because they are not
used).

[*] Or not -- if the app is not doing authentication at all and the user
    is not doing leap-of-faith, then resetting the connection would be
    bad.  But then, the user can always interrupt the application and try
    again.

Nico
--