Re: [Dots] Mirja's DISCUSS: Pending Point (AD Help Needed)

kaname nishizuka <kaname@nttv6.jp> Tue, 23 July 2019 18:39 UTC

To: mohamed.boucadair@orange.com, "Konda, Tirumaleswar Reddy" <TirumaleswarReddy_Konda@McAfee.com>, Benjamin Kaduk <kaduk@mit.edu>, Valery Smyslov <valery@smyslov.net>
Cc: "dots-chairs@ietf.org" <dots-chairs@ietf.org>, "dots@ietf.org" <dots@ietf.org>
References: <787AE7BB302AE849A7480A190F8B93302FA841A9@OPEXCAUBMA2.corporate.adroot.infra.ftgroup> <00c201d53e27$194cfc20$4be6f460$@smyslov.net> <20190721040520.GS23137@kduck.mit.edu> <DM5PR16MB1705B068DCF6AB20658EF826EAC50@DM5PR16MB1705.namprd16.prod.outlook.com> <787AE7BB302AE849A7480A190F8B9330312E57CA@OPEXCAUBMA2.corporate.adroot.infra.ftgroup> <MWHPR16MB17119026CED493164A85FDDBEAC70@MWHPR16MB1711.namprd16.prod.outlook.com> <787AE7BB302AE849A7480A190F8B9330312E604F@OPEXCAUBMA2.corporate.adroot.infra.ftgroup>
From: kaname nishizuka <kaname@nttv6.jp>
Message-ID: <e016cba2-3e57-ffa5-9212-8d3088493de9@nttv6.jp>
Date: Wed, 24 Jul 2019 03:39:30 +0900
Archived-At: <https://mailarchive.ietf.org/arch/msg/dots/LQixH1cF8wi_jT8yMWPLDca3KqM>
Subject: Re: [Dots] Mirja's DISCUSS: Pending Point (AD Help Needed)

Thank you, Med.

Is the presentation about the signal-channel I-D already included in the agenda?
I can spare at least 5 min from `2. Controlling Filtering Rules Using DOTS Signal Channel (15 min)` for this discussion.

thanks,
Kaname

On 2019/07/23 22:56, mohamed.boucadair@orange.com wrote:
> Re-,
>
> Thank you, Tiru,
>
> All: FWIW, we prepared a set of slides to present the pending DISCUSS point from Mirja. The slides are available at:
> https://datatracker.ietf.org/meeting/105/materials/slides-105-dots-heartbeat-mechanism-mirjas-discuss-on-the-signal-channel-i-d-00
>
> Cheers,
> Med
>
>> -----Original Message-----
>> From: Konda, Tirumaleswar Reddy [mailto:TirumaleswarReddy_Konda@McAfee.com]
>> Sent: Tuesday, 23 July 2019 08:41
>> To: BOUCADAIR Mohamed TGI/OLN; Benjamin Kaduk; Valery Smyslov
>> Cc: dots-chairs@ietf.org; dots@ietf.org
>> Subject: RE: [Dots] Mirja's DISCUSS: Pending Point (AD Help Needed)
>>
>>> -----Original Message-----
>>> From: mohamed.boucadair@orange.com
>>> <mohamed.boucadair@orange.com>
>>> Sent: Tuesday, July 23, 2019 11:02 AM
>>> To: Konda, Tirumaleswar Reddy
>>> <TirumaleswarReddy_Konda@McAfee.com>; Benjamin Kaduk
>>> <kaduk@mit.edu>; Valery Smyslov <valery@smyslov.net>
>>> Cc: dots-chairs@ietf.org; dots@ietf.org
>>> Subject: RE: [Dots] Mirja's DISCUSS: Pending Point (AD Help Needed)
>>>
>>>
>>>
>>> Hi Tiru, all,
>>>
>>> Please see inline.
>>>
>>> Cheers,
>>> Med
>>>
>>>> -----Original Message-----
>>>> From: Konda, Tirumaleswar Reddy
>>>> [mailto:TirumaleswarReddy_Konda@McAfee.com]
>>>> Sent: Sunday, 21 July 2019 08:52
>>>> To: Benjamin Kaduk; Valery Smyslov
>>>> Cc: dots-chairs@ietf.org; BOUCADAIR Mohamed TGI/OLN; dots@ietf.org
>>>> Subject: RE: [Dots] Mirja's DISCUSS: Pending Point (AD Help Needed)
>>>>
>>>> Hi Ben,
>>>>
>>>> There seems to be some confusion regarding the heartbeat mechanism. I
>>>> will try to address all the comments/DISCUSS points from you, Mirja,
>>>> and Valery below:
>>>>
>>>> [1] https://tools.ietf.org/html/rfc7252 is specific to UDP transport
>>>> (and does not deal with TCP). Please see the first paragraph in
>>>> https://tools.ietf.org/html/rfc7252#section-3. The message
>>>> transmission parameters (max-retransmit, ack-timeout and
>>>> ack-random-factor) and missing-hb-allowed discussed in DOTS signal
>>>> channel are specific to UDP transport.
>>>>
>>>> [2] CoAP over TCP is discussed in https://tools.ietf.org/html/rfc8323.
>>>> Please see the following differences between CoAP-over-UDP and
>>>> CoAP-over-TCP relevant to our discussion:
>>>>
>>>> a) CoAP ping/pong defined in RFC7252 (uses Empty Confirmable message
>>>> and Reset) will not work for CoAP-over-TCP. As per
>>>> https://tools.ietf.org/html/rfc8323#section-3.4, Empty messages (Code
>>>> 0.00) can always be sent and MUST be ignored by the recipient.
>>>> CoAP-over-TCP defines its own CoAP ping/pong for connection health
>>>> (see https://tools.ietf.org/html/rfc8323#section-5.4).
>>>>
>>>> b) Confirmable and Non-confirmable message types are specific to UDP,
>>>> and are not supported in CoAP-over-TCP.
>>>>
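For reference, the code points mentioned in [2] can be sketched as
below. This is only an illustration: the helper names (coap_code_byte,
tcp_liveness_role) and the returned labels are made up for this note
and come from no CoAP library. The values assume the c.dd code encoding
of RFC 7252 (3-bit class, 5-bit detail) and the 7.02 Ping / 7.03 Pong
signaling codes of RFC 8323.

```python
def coap_code_byte(code_class: int, detail: int) -> int:
    """Pack a CoAP code written as c.dd into its one-byte form."""
    if not 0 <= code_class <= 7 or not 0 <= detail <= 31:
        raise ValueError("class must be 0..7 and detail 0..31")
    return (code_class << 5) | detail


EMPTY = coap_code_byte(0, 0)  # 0.00, Empty message
PING = coap_code_byte(7, 2)   # 7.02, CoAP-over-TCP Ping
PONG = coap_code_byte(7, 3)   # 7.03, CoAP-over-TCP Pong


def tcp_liveness_role(code_byte: int) -> str:
    """Hypothetical classifier for what a CoAP-over-TCP receiver does."""
    if code_byte == EMPTY:
        return "ignore"           # Empty messages MUST be ignored over TCP
    if code_byte == PING:
        return "reply-with-pong"  # a Ping is answered with a Pong
    if code_byte == PONG:
        return "peer-alive"       # a Pong confirms connection health
    return "other"


print(hex(PING), hex(PONG), tcp_liveness_role(EMPTY))
```

The point of the sketch is only that an Empty message cannot serve as a
liveness probe over TCP, so the Ping/Pong signaling pair takes its place.
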
>>>> [3] For TCP, if no ack is received for a CoAP ping for a specific
>>>> duration, TCP will close the connection, and the DOTS client will have
>>>> to re-establish the TCP connection. missing-hb-allowed is of no use
>>>> for TCP. We are all on the same page for TCP, and the draft can
>>>> probably be updated for better clarity.
>>>>
>>>> [4] Now coming to UDP, please see my responses below:
>>>>
>>>> a) As you already know, DOTS signal channel uses heartbeat exchange in
>>>> both directions, and hence CoAP ping is sent by both DOTS client and
>>>> server.
>>>> b) CoAP ping is a confirmable message and is hence subject to
>>>> exponential back-off, with a default MAX_RETRANSMIT value of 4
>>>> (https://tools.ietf.org/html/rfc7252#section-4.8).
>>>> c) CoAP ping is the only confirmable message exchanged during attack
>>>> (all other messages exchanged during an attack are non-confirmable).
>>>> The specification allows distinct values for message transmission
>>>> parameters and missing-hb-allowed to be used during attack and peace
>>>> times.
>>>> To handle congestion conditions during an attack, the specification
>>>> allows two options:
>>>>
>>>> [Option a] By setting MAX_RETRANSMIT to 1, exponential back-off is
>>>> avoided, and missing-hb-allowed is set to a much higher value (e.g.,
>>>> 20) to handle congestion (high packet loss). The draft can be updated
>>>> to explain [Option a] in more detail.
>>>> [Option b] The CoAP MAX_RETRANSMIT default value of 4 is not modified,
>>>> and, for example, missing-hb-allowed can be set to 5 (since 4
>>>> retransmits are not sufficient to detect that the peer is not alive
>>>> during congestion).
>>>>
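A rough way to compare [Option a] and [Option b] is to compute how long
each one waits before declaring the peer gone. The sketch below assumes
the RFC 7252 defaults ACK_TIMEOUT = 2 s and ACK_RANDOM_FACTOR = 1.5,
uses the MAX_TRANSMIT_WAIT formula from RFC 7252 Section 4.8.2 as the
upper bound for one failed ping, and assumes failed pings are retried
back to back (the heartbeat interval is ignored). HeartbeatConfig and
its method names are hypothetical, not taken from the draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HeartbeatConfig:
    max_retransmit: int
    missing_hb_allowed: int
    ack_timeout: float = 2.0        # RFC 7252 default, in seconds
    ack_random_factor: float = 1.5  # RFC 7252 default

    def max_transmit_wait(self) -> float:
        """Upper bound on the wait before one ping is reported as failed."""
        backoff_sum = 2 ** (self.max_retransmit + 1) - 1
        return self.ack_timeout * backoff_sum * self.ack_random_factor

    def datagrams_before_giving_up(self) -> int:
        """Pings put on the wire if every transmission is lost."""
        return self.missing_hb_allowed * (self.max_retransmit + 1)

    def worst_case_detection_time(self) -> float:
        """Assumes failed pings follow each other with no idle interval."""
        return self.missing_hb_allowed * self.max_transmit_wait()


option_a = HeartbeatConfig(max_retransmit=1, missing_hb_allowed=20)
option_b = HeartbeatConfig(max_retransmit=4, missing_hb_allowed=5)

for name, cfg in (("a", option_a), ("b", option_b)):
    print(name, cfg.datagrams_before_giving_up(),
          cfg.worst_case_detection_time())
```

Under these assumptions, [Option a] sends up to 40 datagrams over at
most 180 s, and [Option b] sends up to 25 datagrams over at most 465 s.
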
>>> [Med] We can add this text to illustrate the configuration flexibility:
>>>
>>>     The specification allows for a flexible retry configuration when an
>>>     unreliable transport is in use.  For example, a server may be
>>>     configured to return a lower 'missing-hb-allowed' value (e.g., 5)
>>>     but delegate the retransmission to the underlying CoAP library by
>>>     setting 'max-retransmit' to a high value (e.g., 3).  The server may
>>>     also be configured to return a 'max-retransmit' set to '1' together
>>>     with a higher 'missing-hb-allowed' value (e.g., 15).
>> Looks good. Both of these techniques are used by protocols today: DTLS
>> heartbeat uses retransmission and exponential back-off (see
>> https://tools.ietf.org/html/rfc6347#section-4.2.4.1) for liveness
>> checks, and in the STUN usage for consent freshness
>> (https://tools.ietf.org/html/rfc7675), STUN binding requests are sent
>> periodically.
>>
>> Cheers,
>> -Tiru
>>
>>>
>>>> The DISCUSS from Mirja is not to rely on the CoAP ping/pong but to
>>>> define it in the DOTS layer itself (please see
>>>> https://mailarchive.ietf.org/arch/msg/dots/V6vv28zDpdY5eR_kaB7L-60bhkk
>>>> ); she suggested going with an alternate design using non-confirmable
>>>> messages. Our assessment is that the alternate design won't work;
>>>> please see my response:
>>>> https://mailarchive.ietf.org/arch/msg/dots/QRMfsmhPTFksN6a_nBBKimVx-lM
>>>>
>>>> Cheers,
>>>> -Tiru
>>>>
>>>>> -----Original Message-----
>>>>> From: Dots <dots-bounces@ietf.org> On Behalf Of Benjamin Kaduk
>>>>> Sent: Sunday, July 21, 2019 9:35 AM
>>>>> To: Valery Smyslov <valery@smyslov.net>
>>>>> Cc: dots-chairs@ietf.org; mohamed.boucadair@orange.com;
>>>>> dots@ietf.org
>>>>> Subject: Re: [Dots] Mirja's DISCUSS: Pending Point (AD Help Needed)
>>>>>
>>>>> Hi Valery,
>>>>>
>>>>> On Fri, Jul 19, 2019 at 02:42:50PM +0300, Valery Smyslov wrote:
>>>>>> Hi Med,
>>>>>>
>>>>>> I believe Mirja's main point was that if you use a liveness check
>>>>>> mechanism in the transport layer, then if it reports that the
>>>>>> liveness check fails, it _also_ closes the transport session.
>>>>>> Quotes from her emails:
>>>>>>
>>>>>> "Yes, as Coap Ping is used, the agent should not only conclude
>>>>>> that the DOTS signal session is disconnected but also the Coap
>>>>>> session and not send any further Coap messages anymore."
>>>>>>
>>>>>> and
>>>>>>
>>>>>> "Actually to my understanding this will not work. Both TCP
>>>>>> heartbeat and Coap Ping are transmitted reliably. If you don’t
>>>>>> receive an ack for these transmissions you are not able to send
>>>>>> any additional messages and can only close the connection."
>>>>>>
>>>>>> I'm not familiar with CoAP, but I suspect she's right about TCP:
>>>>>> if the TCP layer itself doesn't receive an ACK for the sent data
>>>>>> after several retransmissions, the connection is closed.
>>>>>
>>>>> Thanks for this crisp summary (and thanks Med for the detailed
>>>>> writeup as well)!
>>>>>
>>>>>> As far as I understand, the current draft allows the underlying
>>>>>> liveness check to fail and has a parameter to restart this check
>>>>>> several times if this happens. It seems that a new transport
>>>>>> session will be created in this case (at least if TCP is used). In
>>>>>> my reading of the draft, this does not seem to have been assumed;
>>>>>> it is assumed that the session remains the same. So, I think that
>>>>>> Mirja's main concern is that it won't work (at least with TCP).
>>>>>
>>>>> My sense is similar; if I could attempt to summarize Mirja's stance,
>>>>> it's that we're invoking a transport-level feature that does its own
>>>>> retransmit and backoff, but then if the transport comes back and
>>>>> says "the peer is gone", we say "but we're under attack, so I don't
>>>>> believe you; try again". This kicks off another independent set of
>>>>> "retransmits" (I know it's not technically the right word) with a
>>>>> fresh exponential backoff. There are two complaints about this:
>>>>> (1) we're changing the transport, since if the transport concludes
>>>>> the peer is gone then the transport "normally" tears down the
>>>>> connection (*) entirely, and (2) the assembly of (exponential
>>>>> backoff 1), (exponential backoff 2), (exponential backoff 3) is
>>>>> strange pacing, and might be better served by a similar number of
>>>>> "retransmits" but with different pacing, since the long delay at the
>>>>> end of each backoff period is not expected to add a huge amount of
>>>>> value in terms of letting congestion ease during attack time, and we
>>>>> would be just as well served by capping the delay between
>>>>> retransmits and having more retransmits.
>>>>>
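The pacing question in (2) can be made concrete with a small sketch. It
assumes ACK_TIMEOUT = 2 s, no random factor, three restart cycles with
MAX_RETRANSMIT = 4, and an 8 s cap for the alternative. The cap value
and both function names are hypothetical choices for illustration and
are not proposed anywhere in the thread.

```python
def repeated_backoff_gaps(ack_timeout, max_retransmit, cycles):
    """Waits after each transmission when backoff restarts every cycle."""
    one_cycle = [ack_timeout * 2 ** i for i in range(max_retransmit + 1)]
    return one_cycle * cycles


def capped_backoff_gaps(ack_timeout, cap, transmissions):
    """Waits after each transmission when the doubling stops at `cap`."""
    return [min(ack_timeout * 2 ** i, cap) for i in range(transmissions)]


repeated = repeated_backoff_gaps(ack_timeout=2, max_retransmit=4, cycles=3)
capped = capped_backoff_gaps(ack_timeout=2, cap=8,
                             transmissions=len(repeated))

print("restart:", repeated, "total", sum(repeated))
print("capped: ", capped, "total", sum(capped))
```

For the same 15 transmissions, the restart schedule spends 186 s
waiting with gaps of up to 32 s, while the capped schedule spends 110 s
with gaps of at most 8 s.
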
>>>>> The asterisk on (1) is of course because, as is noted later in the
>>>>> thread, only TCP tears down the association when it concludes the
>>>>> peer is gone (assuming I'm reading the right parts of 7252).
>>>>> Quoting 7252:
>>>>>
>>>>>     If the retransmission counter reaches MAX_RETRANSMIT on a
>>>>>     timeout, or if the endpoint receives a Reset message, then the
>>>>>     attempt to transmit the message is canceled and the application
>>>>>     process informed of failure.  On the other hand, if the endpoint
>>>>>     receives an acknowledgement in time, transmission is considered
>>>>>     successful.
>>>>>
>>>>> So all CoAP does is to tell the application "that request didn't
>>>>> work", but CoAP is happy to try additional requests on the
>>>>> connection; the teardown logic is indeed left up to the application.
>>>>>
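The split described here, where CoAP reports each failed exchange and
the application decides when to give up, can be sketched as follows.
LivenessTracker is a hypothetical name, and treating the session as
disconnected once the consecutive misses reach missing-hb-allowed is an
assumption about the threshold semantics, not text from the draft.

```python
class LivenessTracker:
    """Counts consecutive failed heartbeats reported by the CoAP layer."""

    def __init__(self, missing_hb_allowed: int) -> None:
        if missing_hb_allowed < 1:
            raise ValueError("missing_hb_allowed must be at least 1")
        self.missing_hb_allowed = missing_hb_allowed
        self.consecutive_misses = 0

    def record(self, acknowledged: bool) -> bool:
        """Record one heartbeat outcome; return True while still alive."""
        if acknowledged:
            # Any acknowledged heartbeat resets the failure count.
            self.consecutive_misses = 0
        else:
            # CoAP only says "that request didn't work"; we keep counting.
            self.consecutive_misses += 1
        # Assumed threshold: disconnected once misses reach the limit.
        return self.consecutive_misses < self.missing_hb_allowed


tracker = LivenessTracker(missing_hb_allowed=3)
outcomes = [False, False, True, False, False, False]
alive_after_each = [tracker.record(ok) for ok in outcomes]
print(alive_after_each)
```

In this run the single acknowledged heartbeat resets the count, so the
session is only declared disconnected after the final three misses in a
row.
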
>>>>> I'm not sure that we've seen much discussion about (2), though
>>>>> (sorry if I missed it) -- why is the repeated backoff-and-restart
>>>>> the right pacing for this purpose?
>>>>>
>>>>> -Ben
>>>>>
>>>>>> I didn't participate in the WG discussion on this, so I don't know
>>>>>> what was discussed regarding this issue. If it was discussed and
>>>>>> the WG has come to the conclusion that this is not an issue, then
>>>>>> I believe more text should be added to the draft so that people
>>>>>> like Mirja, who didn't participate in the discussion, don't have
>>>>>> any concerns while reading the draft.
>>>>>>
>>>>>> Regards,
>>>>>> Valery.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: mohamed.boucadair@orange.com
>>>>>>> <mohamed.boucadair@orange.com>
>>>>>>> Sent: Friday, July 19, 2019 9:57 AM
>>>>>>> To: Benjamin Kaduk (kaduk@mit.edu) <kaduk@mit.edu>;
>>>>>>> dots-chairs@ietf.org; dots@ietf.org
>>>>>>> Subject: Mirja's DISCUSS: Pending Point (AD Help Needed)
>>>>>>>
>>>>>>> Hi Ben, chairs, all,
>>>>>>>
>>>>>>> (restricting the discussion to the AD/chairs/WG)
>>>>>>>
>>>>>>> * Status:
>>>>>>>
>>>>>>> All DISCUSS points from Mirja's review were fixed, except the
>>>>>>> one discussed in this message.
>>>>>>>
>>>>>>> * Pending Point:
>>>>>>>
>>>>>>> Rather than going into much detail, I consider the following as
>>>>>>> the summary of the remaining DISCUSS point from Mirja:
>>>>>>>
>>>>>>>> I believe there are flaws in the design. First it’s a layer
>>>>>>>> violation, which is more of an idealistic concern, but usually
>>>>>>>> designing in layers is a good approach. But more importantly,
>>>>>>>> you end up with infrequent messages which may still terminate
>>>>>>>> the connection at some point, while what you want is to simply
>>>>>>>> send messages frequently in an unreliable fashion but at a low
>>>>>>>> rate until the attack is over.
>>>>>>>
>>>>>>> * Discussion:
>>>>>>>
>>>>>>> (1) First of all, let's recall that RFC7252 does not define how
>>>>>>> CoAP ping must be used. It only says:
>>>>>>>
>>>>>>> ==
>>>>>>>        Provoking a Reset
>>>>>>>        message (e.g., by sending an Empty Confirmable message) is
>>>>>>>        also useful as an inexpensive check of the liveness of an
>>>>>>>        endpoint ("CoAP ping").
>>>>>>> ==
>>>>>>>
>>>>>>> How the liveness is assessed is left to applications. So, there
>>>>>>> is ** no layer violation **.
>>>>>>>
>>>>>>> (2) What we need isn't (text from Mirja):
>>>>>>>
>>>>>>>> to simply send messages frequently in an unreliable fashion
>>>>>>>> but at a low rate until the attack is over
>>>>>>>
>>>>>>> It is actually the other way around. The spec says:
>>>>>>>
>>>>>>>    "... This is particularly useful for DOTS
>>>>>>>     servers that might want to reduce heartbeat frequency or
>>>>>>>     cease heartbeat exchanges when an active DOTS client has not
>>>>>>>     requested mitigation."
>>>>>>>
>>>>>>> What we want can be formalized as:
>>>>>>>   - Taking into account DDoS traffic conditions, a check to
>>>>>>>     assess the liveness of the peer DOTS agent + maintain NAT/FW
>>>>>>>     state on on-path devices.
>>>>>>>
>>>>>>> A much more elaborate version is documented in SIG-004 of RFC
>>>>>>> 8612.
>>>>>>>
>>>>>>> * My analysis:
>>>>>>>
>>>>>>> - The intended functionality is naturally provided by existing
>>>>>>> CoAP messages.
>>>>>>> - Informed WG decision: The WG spent a lot of cycles when
>>>>>>> specifying the current behavior to meet the requirements set in
>>>>>>> RFC8612.
>>>>>>> - Why not an alternative design: We can always define messages
>>>>>>> with duplicated functionality, but that is not a good design
>>>>>>> approach especially when there is no evident benefit.
>>>>>>> - The specification is not broken: it was implemented and
>>>>>>> tested.
>>>>>>>
>>>>>>> And a logistical comment: this issue fits IMHO under the
>>>>>>> non-discuss criteria in
>>>>>>> https://www.ietf.org/blog/discuss-criteria-iesg-review/#stand-undisc.
>>>>>>>
>>>>>>> * What's Next?
>>>>>>>
>>>>>>> As an editor, I don't think a change is needed but I'd like to
>>>>>>> hear from Ben, chairs, and the WG.
>>>>>>>
>>>>>>> Please share your thoughts and whether you agree/disagree with
>>>>>>> the above analysis.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Med
>>>>> _______________________________________________
>>>>> Dots mailing list
>>>>> Dots@ietf.org
>>>>> https://www.ietf.org/mailman/listinfo/dots