Re: [Dots] [core] Large asynchronous notifications under DDoS: New BLOCK Option?

Achim Kraus <achimkraus@gmx.net> Wed, 08 April 2020 16:00 UTC

To: Jon Shallow <supjps-ietf@jpshallow.com>, mohamed.boucadair@orange.com, cabo@tzi.org, dots@ietf.org, core@ietf.org
References: <787AE7BB302AE849A7480A190F8B933031490173@OPEXCAUBMA2.corporate.adroot.infra.ftgroup> <2a255f3b-6614-f950-4ecc-15f170087c9f@gmx.net> <787AE7BB302AE849A7480A190F8B933031490894@OPEXCAUBMA2.corporate.adroot.infra.ftgroup> <019301d60d05$d87fcca0$897f65e0$@jpshallow.com> <a36c6114-d979-e04a-7806-3ad350208e4a@gmx.net> <566C58A5-0373-4D34-91F8-7B664423E373@tzi.org> <787AE7BB302AE849A7480A190F8B933031491200@OPEXCAUBMA2.corporate.adroot.infra.ftgroup> <023101d60d92$3642ebb0$a2c8c310$@jpshallow.com>
From: Achim Kraus <achimkraus@gmx.net>
Message-ID: <1cdc0e70-7e72-ef52-66e7-1d2056367fbf@gmx.net>
Date: Wed, 08 Apr 2020 18:00:08 +0200
In-Reply-To: <023101d60d92$3642ebb0$a2c8c310$@jpshallow.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dots/ckH7S98J3DEB6UPCbNvUSOYQO6A>

Hi Jon,

A "clear" answer is hard to give without actual experience of such
special communication situations. From my point of view, it would take
some experiments under your assumed conditions to find answers.

From your diagrams, I would conclude:
- best is if the transfer works without sending messages back
- sending messages to fill gaps will be acceptable
- both together should then provide a "good enough" solution.
Experiments may show that it works "good enough", or lead you to
withdraw it (because the drops are too high).
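
As a rough illustration of the "fill gaps" step, here is a minimal Python
sketch. It is not taken from any CoAP library; the function name
missing_blocks and its arguments are hypothetical, and it assumes the
client has already seen the final block (the one with M=0), so the last
block number is known.

```python
def missing_blocks(received_nums, last_num):
    """Return the block numbers in 0..last_num that have not arrived yet."""
    return [n for n in range(last_num + 1) if n not in received_nums]


# Blocks 1 and 2 were dropped; block 3 arrived with M=0, so it is the last one.
received = {0, 3}
print(missing_blocks(received, last_num=3))  # [1, 2]
```

If the final block itself is lost, the client cannot learn the last block
number from the M bit and would need a timeout or a size hint to decide
what to ask for. That case is not covered by this sketch.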

Some details on your sequences.
Combining Observe/Notify with Block2 is described in
https://tools.ietf.org/html/rfc7959#section-3.4

- the follow-up / next block transfers usually don't contain the Observe
option; only the "head" does.
- the follow-up / next block transfers don't need to share the token with
the head; they use the tokens chosen for each request.
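
For readers less familiar with the NUM/M/SIZE notation used in the
diagrams (for example "1/0/1024"), the following Python sketch shows how
a Block option value is laid out according to RFC 7959: the low three
bits are SZX, the next bit is M, and the remaining bits are NUM, with
block size = 2**(SZX + 4). The function names decode_block and
encode_block are only illustrative.

```python
def decode_block(value):
    """Split a Block option value (RFC 7959) into (num, more, size)."""
    szx = value & 0x7
    if szx == 7:
        raise ValueError("SZX 7 is reserved")
    more = bool(value & 0x8)
    num = value >> 4
    return num, more, 1 << (szx + 4)


def encode_block(num, more, size):
    """Build a Block option value from the NUM/M/SIZE notation."""
    szx = size.bit_length() - 5  # inverse of size = 2**(szx + 4)
    if not 0 <= szx <= 6 or size != 1 << (szx + 4):
        raise ValueError("block size must be a power of two from 16 to 1024")
    if not 0 <= num < (1 << 20):
        raise ValueError("block number must fit in 20 bits")
    return (num << 4) | (int(more) << 3) | szx


# "0/1/1024": first block, more blocks follow, 1024-byte blocks.
print(encode_block(0, True, 1024))   # 14
print(decode_block(54))              # (3, False, 1024)
```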

In your scenario there will be no "next block" request, so you reuse
the token of the observe request. There may be implementations that
fail when they receive follow-up blocks carrying the Observe option and
the token of the observe request.

Therefore your approach would require at least an implementation that
follows that special interpretation, and it will not work with all "RFC
7252-7641-7959" compliant implementations because, as I see it, it adds
special conventions.
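
To show the kind of failure meant here, the following Python sketch
models a purely hypothetical "strict" client. It does not describe any
particular CoAP implementation; the class StrictClient and its methods
are invented for illustration. The assumption is that such a client
accepts only block 0 under the observe token and accepts later blocks
only as answers to its own requests, each with its own token.

```python
class StrictClient:
    """Hypothetical receiver that only accepts blocks it asked for."""

    def __init__(self, observe_token):
        self.observe_token = observe_token
        self.pending = {}  # token -> block number requested with that token

    def request_block(self, token, num):
        """Record an outstanding follow-up block request."""
        self.pending[token] = num

    def accepts(self, token, num):
        """Decide whether a response block is expected."""
        if token == self.observe_token:
            # A notification starts a new transfer, so only block 0 is expected.
            return num == 0
        # Follow-up blocks must answer an outstanding request with its own token.
        return self.pending.pop(token, None) == num


client = StrictClient(observe_token=0xF0)
print(client.accepts(0xF0, 0))  # True: notification head
client.request_block(0xF1, 1)
print(client.accepts(0xF1, 1))  # True: requested follow-up block
print(client.accepts(0xF0, 1))  # False: pushed block under the observe token
```

A client written this way would drop the server-pushed blocks 1..n of the
proposed scheme, which is why both sides would have to agree on the new
convention.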

best regards
Achim

On 08.04.20 at 12:41, Jon Shallow wrote:
> Hi All,
>
> Thanks for all your input so far.  It has helped my thinking, but I still do not have a clear answer as to how best to do this for the DOTS-specific situation as well as for more general use cases.
>
> Below is a more graphical way of trying to describe what is happening and how it may be possible to overcome some of the limitations of using BLOCK2 in a lossy traffic environment (caused by DDoS attacks).
>
> Regards
>
> Jon
>
> Primary packet loss is from Server to Client
> [Server is DDoS mitigating out in Internet, Client is on premise, DDoS flood against client]
>
> GET followed by Observe response - all working - as we would do it today.
>
>         CLIENT      SERVER
>           |          |
>           +--------->|   GET /path Token 0xf0 Observe 0
>           |          |
>           |<---------+   2.05 Token 0xf0 Observe 1234 Block2 0/1/1024
>           |          |
>           +--------->|   GET /path Token 0xf0 Block2 1/0/1024
>           |          |
>           |<---------+   2.05 Token 0xf0 Observe 1234 Block2 1/1/1024
>           |          |
>           +--------->|   GET /path Token 0xf0 Block2 2/0/1024
>           |          |
>           |<---------+   2.05 Token 0xf0 Observe 1234 Block2 2/1/1024
>           |          |
>           +--------->|   GET /path Token 0xf0 Block2 3/0/1024
>           |          |
>           |<---------+   2.05 Token 0xf0 Observe 1234 Block2 3/0/1024
> Observe triggered
>           |<---------+   2.05 Token 0xf0 Observe 1235 Block2 0/1/1024
>           |          |
>           +--------->|   GET /path Token 0xf1 Block2 1/0/1024
>           |          |
>           |<---------+   2.05 Token 0xf1 Observe 1235 Block2 1/1/1024
>           |          |
>           +--------->|   GET /path Token 0xf1 Block2 2/0/1024
>           |          |
>           |<---------+   2.05 Token 0xf1 Observe 1235 Block2 2/1/1024
>           |          |
>           +--------->|   GET /path Token 0xf1 Block2 3/0/1024
>           |          |
>           |<---------+   2.05 Token 0xf1 Observe 1235 Block2 3/0/1024
>
> Confirmable ACK responses not displayed, nor Etag, Size2 or Maxage.
>
> Now with some packet loss.
> Observe triggered
>           |<---------+   2.05 Token 0xf0 Observe 1236 Block2 0/1/1024
>           |          |
>           +--------->|   GET /path Token 0xf2 Block2 1/1/1024
>           |          |
>           |   X<-----+   2.05 Token 0xf2 Observe 1236 Block2 1/1/1024
>           |          |
> Timeout
> Retry if Confirmable
>           +--------->|   GET /path Token 0xf2 Block2 1/1/1024
>           |          |
>           |   X<-----+   2.05 Token 0xf2 Observe 1236 Block2 1/1/1024
>           |          |
> Retries continue - eventually timing out and possibly killing CoAP session.
>
> Communications can be locked out for 90 seconds as CoAP NSTART is 1.
>
> [DOTS uses NON-Confirmable for protocol reliability, not traffic reliability]
>
> If not using Confirmable, then the Client needs to time out at the application layer and retry, but can only go forward 1 block at a time (does not have to be in sequential order).
> Server may need to garbage collect on resource with Etag.
>
> What may help the situation (the client can still decide when the current Etag/Maxage is no longer valid and stop requesting, and the server may still need to garbage collect):
>
> New CoAP Option NONBLOCK2 equivalent to BLOCK2, but does not rely on client doing GET to get the next block synchronously.
>
>
>         CLIENT      SERVER
>           |          |
>           +--------->|   GET /path Token 0xf0 Observe 0 NonBlock2 0/0/1024
>           |          |
>           |<---------+   2.05 Token 0xf0 Observe 1234 NonBlock2 0/1/1024
>           |          |
>           |<---------+   2.05 Token 0xf0 Observe 1234 NonBlock2 1/1/1024
>           |          |
>           |<---------+   2.05 Token 0xf0 Observe 1234 NonBlock2 2/1/1024
>           |          |
>           |<---------+   2.05 Token 0xf0 Observe 1234 NonBlock2 3/0/1024
> Observe triggered
>           |<---------+   2.05 Token 0xf0 Observe 1235 NonBlock2 0/1/1024
>           |          |
>           |<---------+   2.05 Token 0xf0 Observe 1235 NonBlock2 1/1/1024
>           |          |
>           |<---------+   2.05 Token 0xf0 Observe 1235 NonBlock2 2/1/1024
>           |          |
>           |<---------+   2.05 Token 0xf0 Observe 1235 NonBlock2 3/0/1024
> Observe triggered
>           |<---------+   2.05 Token 0xf0 Observe 1236 NonBlock2 0/1/1024
>           |          |
>           |   X<-----+   2.05 Token 0xf0 Observe 1236 NonBlock2 1/1/1024
>           |          |
>           |   X<-----+   2.05 Token 0xf0 Observe 1236 NonBlock2 2/1/1024
>           |          |
>           |<---------+   2.05 Token 0xf0 Observe 1236 NonBlock2 3/0/1024
> Client realises blocks are missing and asks for the missing ones in one go
>           +--------->|   GET /path Token 0xf1 NonBlock2 1/0/1024 NonBlock2 2/0/1024
>           |          |
>           |   X<-----+   2.05 Token 0xf1 Observe 1236 NonBlock2 1/1/1024
>           |          |
>           |<---------+   2.05 Token 0xf1 Observe 1236 NonBlock2 2/1/1024
> Get final missing block
>           +--------->|   GET /path Token 0xf2 NonBlock2 1/0/1024
>           |          |
>           |<---------+   2.05 Token 0xf2 Observe 1236 NonBlock2 1/1/1024
>
> Obviously the 3 second minimum for RTT calculations needs to be maintained so that the Attack situation is not made worse.  However, I think it is acceptable that all the NonBlock2s for a particular data set are sent as a stream.
>
>
>
>> -----Original Message-----
>> From: Dots [mailto: dots-bounces@ietf.org] On Behalf Of mohamed.boucadair@orange.com
>> Sent: 08 April 2020 10:56
>> To: Carsten Bormann
>> Cc: Jon Shallow; dots@ietf.org; core
>> Subject: Re: [Dots] [core] Large asynchronous notifications under DDoS:
>> New BLOCK Option?
>>
>> Hi Carsten,
>>
>> We are also considering how to solve the issue at the DOTS level. The good
>> news is that we are not blindly overriding pieces of data that are received in
>> distinct responses. We have some checks to update an active record. But as
>> mentioned by Jon, there are some challenges as well.
>>
>> With regard to the network environment, the direction of the attack is the
>> same as the one from servers to clients. That said, telemetry data can be
>> exchanged in both directions:
>>
>> * from client to servers: using a dedicated telemetry message or in an
>> efficacy update shared with the server when a mitigation is active. This is
>> done using PUT messages.
>> * from servers to clients: telemetry data can be sent in dedicated
>> notification messages or as part of the mitigation status update. Both rely
>> upon GET+Observe. Having a mechanism to ask for gaps would thus be
>> helpful.
>>
>> Cheers,
>> Med
>>
>>> -----Original Message-----
>>> From: Carsten Bormann [mailto:cabo@tzi.org]
>>> Sent: Tuesday, 7 April 2020 23:19
>>> To: Achim Kraus
>>> Cc: Jon Shallow; BOUCADAIR Mohamed TGI/OLN; core; dots@ietf.org
>>> Subject: Re: [core] [Dots] Large asynchronous notifications under DDoS:
>>> New BLOCK Option?
>>>
>>> Hi Achim,
>>>
>>> On 2020-04-07, at 21:04, Achim Kraus <achimkraus@gmx.net> wrote:
>>>>
>>>> From my point of view, the first thing to ensure is that the "large
>>>> payload" gets split into "application blocks", which could be
>>>> processed even when other application blocks are missing.
>>>
>>> Indeed, “application layer framing” comes to mind here; that is always
>>> a good idea if it cannot be ensured that all messages make it or if it
>>> is good to be able to process messages while still waiting for others.
>>>
>>>                       .oOo.
>>>
>>> I’m trying to understand the very specific network environment that we
>>> are targeting here.
>>> Are we assuming packets are way more likely to make it from the server
>>> to the client than the other way around?
>>> That indeed would call for additional capabilities that base CoAP does
>>> not have.
>>> We also may not really care about congestion control much in a
>>> situation of massive (attacker-induced) congestion (although that
>>> might be misdiagnosed, so some care is still required); which would be
>>> another reason to maybe deviate from base CoAP.
>>>
>>> I don’t have a design in mind at the moment.  It would need to cater
>>> for the fact that sending the other response messages for the request
>>> would be on the initiative of the server.
>>> So observe (with non-confirmable responses) is maybe a good model
>>> indeed; what’s missing is some way to ask for gaps.
>>>
>>> I just resubmitted a draft that we have been discussing for a while in
>>> T2TRG:
>>> The Series Transfer Pattern (STP)
>>> https://www.ietf.org/id/draft-bormann-t2trg-stp-02.html
>>>
>>> This has some discussion that may be relevant here, although it does
>>> not address the specific DOTS problem.  I think it would be great if
>>> whatever we come up with to solve this problem would also be a step
>>> forward on the larger class of applications needing the series
>>> transfer pattern.
>>>
>>> Regards, Carsten
>>
>> _______________________________________________
>> Dots mailing list
>> Dots@ietf.org
>> https://www.ietf.org/mailman/listinfo/dots