Re: [core] [Dots] Large asynchronous notifications under DDoS: New BLOCK Option?

Hi All,

Thanks for all your input so far.  It has helped my thinking, but I do not yet have a clear answer as to how best to do this, either for the DOTS-specific situation or for more general use cases.

Below is a more graphical description of what is happening and of how it may be possible to overcome some of the limitations of using BLOCK2 in a lossy traffic environment (caused by DDoS attacks).

Regards

Jon

Primary packet loss is from Server to Client
[Server is DDoS mitigating out in Internet, Client is on premise, DDoS flood against client]

GET followed by Observe response - all working - as we would do it today.

       CLIENT      SERVER
         |          |
         +--------->|   GET /path Token 0xf0 Observe 0
         |          |
         |<---------+   2.05 Token 0xf0 Observe 1234 Block2 0/1/1024
         |          |
         +--------->|   GET /path Token 0xf0 Block2 1/0/1024
         |          |
         |<---------+   2.05 Token 0xf0 Observe 1234 Block2 1/1/1024
         |          |
         +--------->|   GET /path Token 0xf0 Block2 2/0/1024
         |          |
         |<---------+   2.05 Token 0xf0 Observe 1234 Block2 2/1/1024
         |          |
         +--------->|   GET /path Token 0xf0 Block2 3/0/1024
         |          |
         |<---------+   2.05 Token 0xf0 Observe 1234 Block2 3/0/1024
Observe triggered
         |<---------+   2.05 Token 0xf0 Observe 1235 Block2 0/1/1024
         |          |
         +--------->|   GET /path Token 0xf1 Block2 1/0/1024
         |          |
         |<---------+   2.05 Token 0xf1 Observe 1235 Block2 1/1/1024
         |          |
         +--------->|   GET /path Token 0xf1 Block2 2/0/1024
         |          |
         |<---------+   2.05 Token 0xf1 Observe 1235 Block2 2/1/1024
         |          |
         +--------->|   GET /path Token 0xf1 Block2 3/0/1024
         |          |
         |<---------+   2.05 Token 0xf1 Observe 1235 Block2 3/0/1024

ACKs for Confirmable messages are not shown, nor are the ETag, Size2 or Max-Age options.
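
For readers less familiar with the NUM/M/SIZE notation used in the diagrams, here is a minimal sketch of how a Block2 option value is packed according to RFC 7959 (NUM in the upper bits, the M "more" bit, then the 3-bit SZX size exponent). The function names encode_block and decode_block are illustrative only and do not come from any particular CoAP library.

```python
from typing import Tuple


def encode_block(num: int, more: bool, size: int) -> bytes:
    """Pack NUM/M/SIZE into a Block option value: NUM << 4 | M << 3 | SZX."""
    szx = size.bit_length() - 5  # size = 2 ** (szx + 4), so 1024 gives szx 6
    if not 0 <= szx <= 6 or size != 1 << (szx + 4):
        raise ValueError("block size must be a power of two from 16 to 1024")
    if not 0 <= num < (1 << 20):
        raise ValueError("block number must fit in 20 bits")
    value = (num << 4) | (int(more) << 3) | szx
    length = (value.bit_length() + 7) // 8  # 0 to 3 bytes; zero is sent as empty
    return value.to_bytes(length, "big")


def decode_block(data: bytes) -> Tuple[int, bool, int]:
    """Unpack a Block option value into (NUM, M, SIZE)."""
    value = int.from_bytes(data, "big")
    szx = value & 0x7
    if szx == 7:
        raise ValueError("SZX 7 is reserved in RFC 7959")
    return value >> 4, bool(value & 0x8), 1 << (szx + 4)


# "Block2 0/1/1024" from the first notification above
first = encode_block(0, True, 1024)
# "Block2 3/0/1024" from the final block above
last = encode_block(3, False, 1024)
```

In a request the M bit of Block2 is zero, which is why the GETs in the diagram carry N/0/1024 while the responses carry N/1/1024 until the last block.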

Now with some packet loss.
Observe triggered
         |<---------+   2.05 Token 0xf0 Observe 1236 Block2 0/1/1024
         |          |
         +--------->|   GET /path Token 0xf2 Block2 1/0/1024
         |          |
         |   X<-----+   2.05 Token 0xf2 Observe 1236 Block2 1/1/1024
         |          |
Timeout
Retry if Confirmable
         +--------->|   GET /path Token 0xf2 Block2 1/0/1024
         |          |
         |   X<-----+   2.05 Token 0xf2 Observe 1236 Block2 1/1/1024
         |          |
Retries continue - eventually timing out and possibly killing CoAP session.

Communications can be locked out for 90 seconds as CoAP NSTART is 1.
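
As a rough check on the lock-out figure, the sketch below works out the retransmission limits from the default transmission parameters in RFC 7252 (ACK_TIMEOUT 2 s, ACK_RANDOM_FACTOR 1.5, MAX_RETRANSMIT 4). It is an assumption that the 90 seconds above refers to these limits; the function names are illustrative only.

```python
# Default CoAP transmission parameters (RFC 7252, section 4.8)
ACK_TIMEOUT = 2.0        # seconds
ACK_RANDOM_FACTOR = 1.5
MAX_RETRANSMIT = 4


def max_transmit_span() -> float:
    """Longest time from the first transmission to the last retransmission."""
    return ACK_TIMEOUT * (2 ** MAX_RETRANSMIT - 1) * ACK_RANDOM_FACTOR


def max_transmit_wait() -> float:
    """Longest time from the first transmission until the sender gives up."""
    return ACK_TIMEOUT * (2 ** (MAX_RETRANSMIT + 1) - 1) * ACK_RANDOM_FACTOR


span = max_transmit_span()   # 45.0 seconds
wait = max_transmit_wait()   # 93.0 seconds
```

With NSTART at 1, only one such exchange may be outstanding towards the server at a time, so a single lost block can hold up everything else for the whole of that period.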

[DOTS uses NON-Confirmable for protocol reliability, not traffic reliability]

If Confirmable is not used, the client has to time out at the application layer and retry, but it can still only request one block at a time (the blocks do not have to be requested in sequential order).
The server may need to garbage collect the resource state associated with the ETag.

What may help the situation (although the client can still decide when the current ETag/Max-Age is no longer valid and stop requesting further blocks, and the server may still need to garbage collect):

A new CoAP option, NONBLOCK2, equivalent to BLOCK2 but not relying on the client issuing a GET to fetch each subsequent block synchronously.

       CLIENT      SERVER
         |          |
         +--------->|   GET /path Token 0xf0 Observe 0 NonBlock2 0/0/1024
         |          |
         |<---------+   2.05 Token 0xf0 Observe 1234 NonBlock2 0/1/1024
         |          |
         |<---------+   2.05 Token 0xf0 Observe 1234 NonBlock2 1/1/1024
         |          |
         |<---------+   2.05 Token 0xf0 Observe 1234 NonBlock2 2/1/1024
         |          |
         |<---------+   2.05 Token 0xf0 Observe 1234 NonBlock2 3/0/1024
Observe triggered
         |<---------+   2.05 Token 0xf0 Observe 1235 NonBlock2 0/1/1024
         |          |
         |<---------+   2.05 Token 0xf0 Observe 1235 NonBlock2 1/1/1024
         |          |
         |<---------+   2.05 Token 0xf0 Observe 1235 NonBlock2 2/1/1024
         |          |
         |<---------+   2.05 Token 0xf0 Observe 1235 NonBlock2 3/0/1024
Observe triggered
         |<---------+   2.05 Token 0xf0 Observe 1236 NonBlock2 0/1/1024
         |          |
         |   X<-----+   2.05 Token 0xf0 Observe 1236 NonBlock2 1/1/1024
         |          |
         |   X<-----+   2.05 Token 0xf0 Observe 1236 NonBlock2 2/1/1024
         |          |
         |<---------+   2.05 Token 0xf0 Observe 1236 NonBlock2 3/0/1024
Client realises blocks are missing and asks for the missing ones in one go
         +--------->|   GET /path Token 0xf1 NonBlock2 1/0/1024 NonBlock2 2/0/1024
         |          |
         |   X<-----+   2.05 Token 0xf1 Observe 1236 NonBlock2 1/1/1024
         |          |
         |<---------+   2.05 Token 0xf1 Observe 1236 NonBlock2 2/1/1024
Get final missing block
         +--------->|   GET /path Token 0xf2 NonBlock2 1/0/1024
         |          |
         |<---------+   2.05 Token 0xf2 Observe 1236 NonBlock2 1/1/1024
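
The client-side bookkeeping implied by the exchange above can be sketched as follows. This is only an illustration under stated assumptions: the class NotificationBody and its methods are hypothetical, one instance is assumed per Observe value, and nothing here is defined by an existing CoAP specification.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class NotificationBody:
    """Collects the blocks of one notification (one Observe value)."""
    blocks: Dict[int, bytes] = field(default_factory=dict)
    last_num: Optional[int] = None  # known once a block with M=0 arrives

    def add(self, num: int, more: bool, payload: bytes) -> None:
        """Record a received block; M=0 marks it as the final block."""
        self.blocks[num] = payload
        if not more:
            self.last_num = num

    def missing(self) -> List[int]:
        """Block numbers not yet received, up to the highest block known."""
        if self.last_num is not None:
            top = self.last_num
        else:
            top = max(self.blocks, default=-1)
        return [n for n in range(top + 1) if n not in self.blocks]

    def complete(self) -> bool:
        return self.last_num is not None and not self.missing()

    def body(self) -> bytes:
        """Reassemble the payload once every block has arrived."""
        if not self.complete():
            raise ValueError("blocks still missing")
        return b"".join(self.blocks[n] for n in range(self.last_num + 1))


# Replaying the Observe 1236 notification: blocks 1 and 2 are lost at first
note = NotificationBody()
note.add(0, True, b"a")
note.add(3, False, b"d")
first_request = note.missing()    # blocks to ask for in one go
note.add(2, True, b"c")           # block 1 is lost again
second_request = note.missing()   # the final missing block
note.add(1, True, b"b")
```

If the final block itself is lost, the client cannot tell from the block numbers alone that the body is incomplete, so it would presumably still need a timeout or a Size2 hint to notice.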

Obviously the 3 second minimum for RTT calculations needs to be maintained so that the attack situation is not made worse.  However, I think it is acceptable for all the NonBlock2s of a particular data set to be sent as a stream.

> -----Original Message-----
> From: Dots [mailto: dots-bounces@ietf.org] On Behalf Of mohamed.boucadair@orange.com
> Sent: 08 April 2020 10:56
> To: Carsten Bormann
> Cc: Jon Shallow; dots@ietf.org; core
> Subject: Re: [Dots] [core] Large asynchronous notifications under DDoS:
> New BLOCK Option?
> 
> Hi Carsten,
> 
> We are also considering how to solve the issue at the DOTS level. The good
> news is that we are not blindly overriding pieces of data that are received in
> distinct responses. We have some checks to update an active record. But as
> mentioned by Jon, there are some challenges as well.
> 
> With regards to the network environment, the direction of the attack is the
> same as the one from servers to clients. That said, telemetry data can be
> exchanged in both directions:
> 
> * from client to servers: using a dedicated telemetry message or in an
> efficacy update shared with the server when a mitigation is active. This is
> done using PUT messages.
> * from servers to clients: telemetry data can be sent in dedicated
> notification messages or as part of the mitigation status update. Both rely
> upon GET+Observe. Having a mechanism to ask for gaps would thus be
> helpful.
> 
> Cheers,
> Med
> 
> > -----Original Message-----
> > From: Carsten Bormann [mailto:cabo@tzi.org]
> > Sent: Tuesday, 7 April 2020 23:19
> > To: Achim Kraus
> > Cc: Jon Shallow; BOUCADAIR Mohamed TGI/OLN; core; dots@ietf.org
> > Subject: Re: [core] [Dots] Large asynchronous notifications under DDoS:
> > New BLOCK Option?
> >
> > Hi Achim,
> >
> > On 2020-04-07, at 21:04, Achim Kraus <achimkraus@gmx.net> wrote:
> > >
> > > FMPOV, the first thing to ensure is, that the "large payload" gets
> > > split into "application blocks", which could be processed even when
> > > other application blocks are missing.
> >
> > Indeed, “application layer framing” comes to mind here; that is always
> > a good idea if it cannot be ensured that all messages make it or if it
> > is good to be able to process messages while still waiting for others.
> >
> >                      .oOo.
> >
> > I’m trying to understand the very specific network environment that we
> > are targeting here.
> > Are we assuming packets are way more likely to make it from the server
> > to the client than the other way around?
> > That indeed would call for additional capabilities that base CoAP does
> > not have.
> > We also may not really care about congestion control much in a
> > situation of massive (attacker-induced) congestion (although that
> > might be misdiagnosed, so some care is still required); which would be
> > another reason to maybe deviate from base CoAP.
> >
> > I don’t have a design in mind at the moment.  It would need to cater
> > for the fact that sending the other response messages for the request
> > would be on the initiative of the server.
> > So observe (with non-confirmable responses) is maybe a good model
> > indeed; what’s missing is some way to ask for gaps.
> >
> > I just resubmitted a draft that we have been discussing for a while in
> > T2TRG:
> > The Series Transfer Pattern (STP)
> > https://www.ietf.org/id/draft-bormann-t2trg-stp-02.html
> >
> > This has some discussion that may be relevant here, although it does
> > not address the specific DOTS problem.  I think it would be great if
> > whatever we come up with to solve this problem would also be a step
> > forward on the larger class of applications needing the series
> > transfer pattern.
> >
> > Regards, Carsten
> 