Re: [Dots] Improve controllability and predictability of keepalives

<mohamed.boucadair@orange.com> Thu, 14 November 2019 10:17 UTC

From: mohamed.boucadair@orange.com
To: Carsten Bormann <cabo@tzi.org>
CC: "dots@ietf.org" <dots@ietf.org>
Date: Thu, 14 Nov 2019 10:17:19 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/dots/nLburF2C8gE3TDm4vKknMo8NvI4>

Hi Carsten, all, 

As promised, we have updated the draft to take your input into account. The candidate version (see Section 4.7 in particular) is available at:

https://github.com/boucadair/draft-ietf-dots-signal-channel/blob/master/draft-ietf-dots-signal-channel-39.txt

The main changes are:
* Use PUT to send heartbeat requests
* Use 2.04 instead of 2.03
* DOTS agents can negotiate a probing-rate
* Provide some guidelines for setting the probing-rate

Do you have any further comments on the new heartbeat mechanism? Thank you.

Cheers,
Med

> -----Original Message-----
> From: Dots [mailto:dots-bounces@ietf.org] On Behalf Of mohamed.boucadair@orange.com
> Sent: Tuesday, 12 November 2019 17:24
> To: Carsten Bormann
> Cc: dots@ietf.org
> Subject: Re: [Dots] Improve controllability and predictability of keepalives
> 
> Re-,
> 
> Please see inline.
> 
> Cheers,
> Med
> 
> > -----Original Message-----
> > From: Carsten Bormann [mailto:cabo@tzi.org]
> > Sent: Tuesday, 12 November 2019 14:12
> > To: BOUCADAIR Mohamed TGI/OLN
> > Cc: dots@ietf.org
> > Subject: Re: Improve controllability and predictability of keepalives
> >
> > Hi Med,
> >
> > what the text below doesn’t say is what kind of information you want to
> > derive from the heartbeats.
> 
> [Med] How endpoints interpret heartbeats (HBs) is discussed in Section 4.7.
> 
> > The way they currently (draft -39) are
> > defined, the client uses a GET (*).  GET is not supposed to influence
> > application state, so the server will not learn anything from that
> > heartbeat.  Is the intention that only the client needs to react to
> > heartbeat failures?
> 
> [Med] No, the server needs to react to heartbeat failures. The cases that
> are discussed in the spec are as follows:
> 
> *  If the DOTS server receives traffic from the peer DOTS client but
>    the maximum 'missing-hb-allowed' threshold is reached, the DOTS
>    server MUST NOT consider the DOTS signal channel session
>    disconnected.  The DOTS server MUST keep on using the current DOTS
>    signal channel session so that the DOTS client can send mitigation
>    requests over the current DOTS signal channel session.  In this
>    case, the DOTS server can identify that the DOTS client is under
>    attack and that the inbound link to the DOTS client (domain) is
>    saturated.
>
> *  If the DOTS server does not receive a mitigation request from the
>    DOTS client, it implies that the DOTS client has not detected the
>    attack or, if an attack mitigation is in progress, it implies that
>    the applied DDoS mitigation actions are not yet effective to handle
>    the DDoS attack volume.
> 
> *  If the DOTS server does not receive any traffic from the peer DOTS
>    client during the time span required to exhaust the maximum
>    'missing-hb-allowed' threshold, the DOTS server concludes the
>    session is disconnected.  The DOTS server can then trigger
>    pre-configured mitigation requests for this DOTS client (if any).
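
The server-side cases listed above can be sketched as a small decision function. This is only an illustration under stated assumptions, not text or code from the draft: the names `SessionState`, `classify_session`, and its parameters are hypothetical, and "threshold reached" is assumed to mean that the count of missed heartbeats is greater than or equal to 'missing-hb-allowed'.

```python
from enum import Enum


class SessionState(Enum):
    """Hypothetical labels for the outcomes described in the list above."""
    HEALTHY = "healthy"
    CLIENT_INBOUND_SATURATED = "client inbound link likely saturated"
    DISCONNECTED = "disconnected"


def classify_session(missed_heartbeats: int,
                     missing_hb_allowed: int,
                     traffic_seen_from_client: bool) -> SessionState:
    """Classify a DOTS signal channel session from the server's viewpoint.

    missed_heartbeats: heartbeats counted as missing (assumed counter).
    missing_hb_allowed: the 'missing-hb-allowed' threshold.
    traffic_seen_from_client: whether any traffic arrived from the peer
        DOTS client during the same time span.
    """
    if missed_heartbeats < missing_hb_allowed:
        # Threshold not reached: nothing to conclude yet.
        return SessionState.HEALTHY
    if traffic_seen_from_client:
        # Threshold reached but the client still sends traffic: the server
        # keeps the session so mitigation requests can still get through.
        return SessionState.CLIENT_INBOUND_SATURATED
    # Threshold reached and nothing heard from the client: the server may
    # trigger pre-configured mitigation requests, if any.
    return SessionState.DISCONNECTED


print(classify_session(5, 5, True))
print(classify_session(5, 5, False))
```

Whether a mitigation request has been received (the second bullet) is left out of this sketch; it would refine the interpretation of the "saturated" case rather than change the session decision.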
> 
> >
> > RFC 7252 defines PROBING_RATE as 1 B/s.  If you get a response within the
> > heartbeat interval to the non-confirmable requests, that is not relevant.
> > If you don’t, your heartbeat interval "MUST be chosen in
> >    such a way that an endpoint does not exceed an average data rate of
> >    PROBING_RATE in sending to another endpoint that does not respond".
> > If your interval is intended to be 15 s, that would mean your requests
> > must be ≤ 15 B, or you need to define PROBING_RATE differently for your
> > application.
> > It seems right now you are not trying to be particularly frugal with the
> > heartbeat message, which is probably OK since most of your networks will
> > be Ethernet and that will expand the frame size to 64 B anyway.  But that
> > means that you need to define PROBING_RATE to be ~ 5 B/s if you don’t
> > want to be slowed down in probing.
> 
> [Med] Good point. Will update accordingly.
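
The arithmetic behind the PROBING_RATE remark above can be checked with a short calculation. This is a minimal sketch: the helper names are hypothetical, and the 64 B message size is simply the minimum Ethernet frame size mentioned in the quoted text, used here as a stand-in for the heartbeat size.

```python
import math


def max_probe_bytes(interval_s: float,
                    probing_rate_bytes_per_s: float = 1.0) -> float:
    """Largest request size that keeps the average rate within PROBING_RATE
    when one request is sent per interval and no response arrives."""
    return interval_s * probing_rate_bytes_per_s


def required_probing_rate(message_bytes: int, interval_s: float) -> float:
    """Average data rate (B/s) produced by one message per interval."""
    return message_bytes / interval_s


# With the RFC 7252 default of 1 B/s and a 15 s interval, requests would
# have to fit in 15 bytes.
print(max_probe_bytes(15))  # 15.0

# A 64-byte message every 15 s averages about 4.27 B/s, hence "~ 5 B/s".
rate = required_probing_rate(64, 15)
print(round(rate, 2), math.ceil(rate))  # 4.27 5
```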
> 
> >
> > Grüße, Carsten
> >
> > (*) And expects a 2.03, which would mean that the server confirms the
> > ETag given in the request.  But there is no ETag in that request, and I
> > really don’t see why a 2.05 with an empty payload wouldn’t also work.  But
> > maybe you want to move to POST anyway (so there can be application
> > semantics, like taking note of the heartbeat, on the server), and 2.04
> > would fit that very well (see RFC 7252 Section 5.8.2).
> 
> [Med] Will consider the use of POST instead of GET.
> 
> >
> > > On Nov 12, 2019, at 10:18, <mohamed.boucadair@orange.com> wrote:
> > >
> > > Hi Carsten,
> > >
> > > You indicated the following in an offline message (I’m adding the dots
> > > mailing list as you were OK with that; see below):
> > >
> > > > I would have expected using requests sent in non-confirmable messages,
> > > > requiring more work on the application side (interpreting losses) but
> > > > also delivering a more regular, predictable, less noisy signal.
> > > > A little bit of specification text is then needed to ensure that those
> > > > requests/responses meet the requirements of RFC 8085 and the related
> > > > specifications in RFC 7252, and that both sides are in a good position
> > > > to interpret the signal they get.
> > > …
> > > > If not, I would like to discuss the issue on the DOTS mailing list, and
> > > > see whether a small set of changes to the keepalive mechanisms employed
> > > > can improve its controllability and predictability.
> > >
> > > Assuming that non-confirmable application HBs are used, which changes do
> > > you think are needed to enhance the DOTS mechanism to meet 8085/7252
> > > requirements?
> > >
> > > As a reminder, we have the following text for setting the HB parameters:
> > >
> > > ======
> > >       Note: heartbeat-interval should be tweaked to also assist DOTS
> > >       messages for NAT traversal (SIG-011 of [RFC8612]).  According to
> > >       [RFC8085], keepalive messages must not be sent more frequently
> > >       than once every 15 seconds and should use longer intervals when
> > >       possible.  Furthermore, [RFC4787] recommends NATs to use a state
> > >       timeout of 2 minutes or longer, but experience shows that sending
> > >       packets every 15 to 30 seconds is necessary to prevent the
> > >       majority of middleboxes from losing state for UDP flows.  From
> > >       that standpoint, the RECOMMENDED minimum heartbeat-interval is 15
> > >       seconds and the RECOMMENDED maximum heartbeat-interval is 240
> > >       seconds.  The recommended value of 30 seconds is selected to
> > >       anticipate the expiry of NAT state.
> > >
> > >       A heartbeat-interval of 30 seconds may be considered as too chatty
> > >       in some deployments.  For such deployments, DOTS agents may
> > >       negotiate longer heartbeat-interval values to prevent any network
> > >       overload with too frequent keepalives.
> > >
> > >       Different heartbeat intervals can be defined for
> > >       'mitigating-config' and 'idle-config' to reduce being too chatty
> > >       during idle times.  If there is an on-path translator between the
> > >       DOTS client (standalone or part of a DOTS gateway) and the DOTS
> > >       server, the 'mitigating-config' heartbeat-interval has to be
> > >       smaller than the translator session timeout.  It is recommended
> > >       that the 'idle-config' heartbeat-interval is also smaller than the
> > >       translator session timeout to prevent translator traversal issues,
> > >       or disabled entirely.  Means to discover the lifetime assigned by
> > >       a translator are out of scope.
> > > =======
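
The interval guidance quoted above (RECOMMENDED minimum of 15 seconds, RECOMMENDED maximum of 240 seconds, recommended value of 30 seconds, and an interval smaller than any on-path translator session timeout) can be sketched as a simple check. The function name, constant names, and the `translator_timeout_s` parameter are hypothetical; the quoted text notes that discovering the translator lifetime is out of scope, so the caller is assumed to know it by other means. The option of disabling the 'idle-config' heartbeat entirely is not modelled.

```python
from typing import List, Optional

# Values taken from the quoted note; the constant names are illustrative.
MIN_HB_INTERVAL_S = 15
MAX_HB_INTERVAL_S = 240
DEFAULT_HB_INTERVAL_S = 30


def check_heartbeat_interval(interval_s: int,
                             translator_timeout_s: Optional[int] = None
                             ) -> List[str]:
    """Return a list of warnings for a proposed heartbeat-interval.

    An empty list means the value is consistent with the quoted guidance.
    """
    warnings = []
    if interval_s < MIN_HB_INTERVAL_S:
        warnings.append(
            f"{interval_s} s is below the recommended minimum of "
            f"{MIN_HB_INTERVAL_S} s")
    if interval_s > MAX_HB_INTERVAL_S:
        warnings.append(
            f"{interval_s} s is above the recommended maximum of "
            f"{MAX_HB_INTERVAL_S} s")
    if translator_timeout_s is not None and interval_s >= translator_timeout_s:
        warnings.append(
            f"{interval_s} s is not smaller than the translator session "
            f"timeout of {translator_timeout_s} s")
    return warnings


print(check_heartbeat_interval(DEFAULT_HB_INTERVAL_S, translator_timeout_s=120))
print(check_heartbeat_interval(130, translator_timeout_s=120))
```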
> > >
> > > Thank you.
> > >
> > > Cheers,
> > > Med
> 