Re: [Dots] [core] WG Last Call on draft-ietf-core-new-block

supjps-ietf@jpshallow.com Wed, 10 March 2021 13:42 UTC

From: <supjps-ietf@jpshallow.com>
To: <christian@amsuess.com>
Cc: <draft-ietf-core-new-block@ietf.org>, <dots@ietf.org>, <core@ietf.org>
References: <022401d6e440$06763ba0$1362b2e0$@jpshallow.com> <YCxikyadpukaiK5I@hephaistos.amsuess.com> <004601d705f8$acbec250$063c46f0$@jpshallow.com> <YEei91E6YoP+wXiI@hephaistos.amsuess.com>
In-Reply-To: <YEei91E6YoP+wXiI@hephaistos.amsuess.com>
Date: Wed, 10 Mar 2021 13:42:42 -0000
Message-ID: <03d401d715b3$3e80d9c0$bb828d40$@jpshallow.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dots/HvFzdN89PV5C41oqcFK__yuI4D8>
Subject: Re: [Dots] [core] WG Last Call on draft-ietf-core-new-block

Hi Christian,

Please see inline.  Updated document changes can be found at
https://tinyurl.com/new-block-latest 

Regards

Jon

> -----Original Message-----
> From: Christian Amsüss [mailto: christian@amsuess.com]
> Sent: 09 March 2021 16:32
> To: supjps-ietf@jpshallow.com
> Cc: draft-ietf-core-new-block@ietf.org; dots@ietf.org; core@ietf.org
> Subject: Re: [core] WG Last Call on draft-ietf-core-new-block
> 
> Hello Jon,
> 
> I've finally managed to look at the rest of the mail, only answering where I
> see value in it (considering the rest done):

[Jon1] Thanks for getting back as well as confirming.

> 
> > > * CSM option: "There is little, if any, benefit of using these options
> > >   with [reliable transports]" ... and reliable transports are the only
> > >   thing CSMs are defined for. So why define a CSM if it's not expected
> > >   to be useful?
> >
> > [Jon] This is here for completeness.  DOTS tries to use UDP, but falls
> > back to using TCP if unable to use UDP.
> 
> This document's task is to solve a particular problem. I don't see how the
> document becomes better by covering parts just for orthogonality's sake
> when there is, to quote, "little, if any, benefit" to it.
> 
> There's this saying about things being good when there's nothing to
> remove. Keeping provisions for reliable transports in puts needless work on
> the reviewers and just slows down this document's progress (if only
> because questions like this come up again and again).
> 
> Unless, of course, both a) the intention for the DOTS use case is to
> go from the current state (AIUI: Block as baseline but using QBlock as an
> optimization) to QBlock as mandatory-to-implement *and* b) the bodies
> exceed the required-by-the-application Max-Message-Size -- then keeping
> reliable transports would make sense.
> 

[Jon1] Removed the references to the CSM, as Q-Block support can still be
detected as per CON testing.

> > > * The 2.31 Continue rules: Why not send it with CON? A 2.31 Continue is
> > >   just as compact as an ACK, and can convey to the client that all
> > >   packages up to that point have been received, so it doesn't need to
> > >   retransmit the earlier ones even though the corresponding ACK may have
> > >   gotten lost.
> >
> > [Jon] For DOTS, we have to have the entire conversation for the
> > transmission of the telemetry data as NON, as there could be a flooded
> > pipe in one direction because of a DDoS attack.
> 
> Same topic here: Please make a choice between "we specify what we need"
> and "we specify the general case".
> 
> We can go through how to efficiently use this with CON (both in this
> concrete point and in the earlier discussion on mixing NON and CON in ways
> that don't apply to DOTS but would be useful). My impression though is that
> this would require more effort and time than we have drive for in this
> document -- which is fine because it's special purpose. If that's not done, my
> opinion is that this document should focus on NON-only exchanges, and
> leave its use with confirmable messages and transports out of scope.

[Jon1] Yes, the DOTS environment requires NON when attacks are taking place.
The consequences of using CON only or reliable transport only with Q-Block
need to be brought out, and so at that level need to be in scope.

[Jon1] I have added to the disadvantages section "Mixing of NON and CON
during requests/responses using Q-Block is not supported."
> 
> A full blockwise-bis will need to do this, and will profit greatly from the
> discussions we've been having here.

[Jon1] Absolutely agree.  And then we can work through mixing NON/CON etc.
> 
> > > * The new QB2 M-bit rules depend on MAX_PAYLOADS to be agreed. But that
> > >   agreement is SHOULD only, and even that's already stretching what I'd
> > >   expect of a configurable CoAP parameter.
> >
> > [Jon] Having the flexibility and ability to configure the actual value
> > and mutually agree on the new value could be beneficial to DOTS (clients
> > on low bandwidth networks may want to tune it down as suggested), as
> > per:
> >
> > "If the CoAP peer reports at least one payload has not arrived for
> > each body for at least a 24 hour period and it is known that there are
> > no other network issues over that period, then the value of
> > MAX_PAYLOADS can be reduced by 1 at a time (to a minimum of 1) and the
> > situation re-evaluated for another 24 hour period until there is no
> > report of missing payloads under normal operating conditions. The
> > newly derived value for MAX_PAYLOADS should be used for both ends of
> > this particular CoAP peer link. Note that the CoAP peer will not know
> > about the MAX_PAYLOADS change until it is reconfigured. As a
> > consequence of not being reconfigured, the peer may indicate that
> > there are some missing payloads prior to the actual payload being
> > transmitted as all of its MAX_PAYLOADS payloads have not arrived."
> 
> A bit of clarification may help here: Is the expectation that the peers report
> their stats to some management that then evaluates it and reconfigures the
> peers? May they do the stats on their own, unilaterally change
> MAX_PAYLOADS and then tell their peer?
> 
> Suggested rough wording if that fits your answer: "Endpoints whose
> MAX_PAYLOADS are configurable can report their per-peer stats back to the
> source of their configuration. If between two endpoints at least one
> payload has not [...]. The newly derived value for MAX_PAYLOADS is then
> configured for both endpoints."

[Jon1] The updated draft text for this paragraph reflects that this is out of
scope.  However, the Note three paragraphs earlier does indicate how DOTS does
this.
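
As a rough illustration only (the draft treats the procedure as out of scope), the reduce-by-one rule from the quoted paragraph above could be sketched in Python as follows. The function names, the one-boolean-per-24-hour-period report format and the starting value of 10 are assumptions made for the example; how reports are gathered and how both peers are reconfigured is not modelled.

```python
def next_max_payloads(current: int, missing_reported: bool,
                      other_network_issues: bool) -> int:
    """Value to use for the next 24-hour observation period."""
    # Only step down when loss was reported and nothing else explains it.
    if missing_reported and not other_network_issues and current > 1:
        return current - 1
    return current


def tune(initial: int, daily_missing_reports: list[bool]) -> int:
    """Apply the rule period by period until the value stops changing."""
    value = initial
    for missing in daily_missing_reports:
        new_value = next_max_payloads(value, missing, other_network_issues=False)
        if new_value == value:
            break  # no missing payloads reported, or minimum of 1 reached
        value = new_value
    return value


# Assumed starting value of 10; loss reported on the first two days only.
tuned = tune(10, [True, True, False, True])
```

Here `tuned` settles two steps below the starting value, because tuning stops at the first period with no report of missing payloads.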
> 
> > >   It's also rather complicated. How is this better than a simple "M=1
> > >   means this block plus as many as you can comfortably send, where M=0
> > >   is for this one only"?
> >
> > [Jon] We still need to have a concept of 'Continue' to reduce any
> > NON_TIMEOUT delays for every MAX_PAYLOADS for handling congestion
> > control.
> > 'Continue' is used in multiple places in the draft.
> 
> I would have figured that something local can be used here (like "have a
> short timeout that's extended and readjusted as new responses come in"),
> but with the above management, what's here should be fine. (If CONs were
> allowed, there'd be better options that don't introduce the delays -- but see
> above).

[Jon1] I agree that in the general case use of CONs simplifies a lot of
things in terms of delays etc. (but there is potential for other confusion
when mixing NON/CON), but when forced to use NONs as DOTS is, we have to go
with this.

> 
> > > * "is meant to prevent amplification attacks": Could you elaborate? A
> > >   client permitted the use of QB2 has already some leverage on the server
> > >   to start attacks, and would in any case (overlaps or not) only be
> > >   permitted MAX_PAYLOADS on a single request by the server, no matter
> > >   how it requests more than that.
> >
> > [Jon] A single request, containing multiple Q-Block2 with M set and
> > the same NUM (0 meaning all blocks is a really bad case) would
> > otherwise (no overlap protection) cause MAX_PAYLOADS to be sent,
> > NON_TIMEOUT pause, next set of MAX_PAYLOADS, NON_TIMEOUT pause etc.
> > for quite some time.  Request packet of 1k+, each Q-Block2 being
> > 2 bytes gives 500+ requests for lots of packets.
> > Yes, there are NON_TIMEOUT gaps giving respite against a single
> > request, but multiple requests would average things out.
> 
> I understand this to imply that if duplicates were allowed to be requested
> the server would actually send them multiply, allowing the client to create a
> virtual larger resource it can "pull out" with a single request as compared to
> regular requests that "pull out" the resource at most once. Wouldn't have
> occurred to me as an implementer (I'd have only sent the union of the
> requested sets) -- but fair enough, thanks.

[Jon1] No problem - we cannot assume a rogue client will obey the rules...
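
To make the two server behaviours discussed here concrete, the Python sketch below contrasts detecting an overlapping request with sending only the union of the requested blocks. It is a simplification under stated assumptions: each Q-Block2 option is reduced to its NUM, the M bit is ignored, "overlap" is taken to mean block numbers that are not strictly increasing, and the function names are hypothetical.

```python
def has_overlap(nums: list[int]) -> bool:
    """True if the requested block numbers are not strictly increasing."""
    return any(a >= b for a, b in zip(nums, nums[1:]))


def union_of_requested(nums: list[int]) -> list[int]:
    """Each requested block at most once, in ascending order."""
    return sorted(set(nums))


# A rogue request: about 1000 bytes of 2-byte Q-Block2 options, all NUM 0.
rogue = [0] * 500
# A well-behaved request for three missing blocks.
well_formed = [3, 7, 8]

rogue_rejected = has_overlap(rogue)           # server could refuse this
rogue_as_union = union_of_requested(rogue)    # or answer each block once
```

Either approach bounds what one request can pull out of the server; without one of them, 500 identical options could each trigger a full run of payloads.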
> 
> > > * "then the body response SHOULD be restarted with a different ETag
> > >   Option value": That behavior sounds like a recipe for endless running
> > >   requests when CON is involved (which, granted, are under flow control,
> > >   but still don't do anything useful). Given there is also a
> > >   recommendation to keep the being-transmitted version alive, why not
> > >   just stop the transmission?  Or send just one final block of the new
> > >   value -- and then it's up to the client to decide whether that's a
> > >   representation it already knows (and just got a freshness statement
> > >   for) or needs to get it as a whole again?
> >
> > [Jon] My implementation maintains the cached body as per previous
> > RECOMMENDED statement which keeps things simple.  I was then trying to
> > cover the non-cached copy case.  I agree with your concern.
> >
> > [Jon] From the client's perspective, ETag is opaque with no way of
> > knowing whether a newly seen ETag is earlier or later.  If a NON (old)
> > ETag is out of sequence on arrival or there is a CON retransmit with an
> > (old) ETag the client has to make a decision on seeing the (old) ETag
> > (which may be for a single packet that holds the whole body and hence is
> > not in the client's history).  Does the current receipt of multiple
> > payloads get aborted or ... when the (old) ETag is seen?
> 
> Usually there's the sequence of requests (and thus tokens) by which the
> client can tell older from newer; in sending Q-Block that's not given, thus
> see below.

[Jon1] OK
> 
> > [Jon] The new data length may be less than the previous payload, and
> > so the currently being transmitted block is beyond the size of the new
> > data.
> >
> > [Jon] The (needs terminating) response can be for something other than
> > a GET, making it a bit more problematic for the client to continue -
> > especially if there is non-idempotent usage.
> >
> > [Jon] Two ways forward here
> > 1. Make previous statement a MUST and delete this statement
> > Or
> > 2.
> > OLD
> > If the server detects part way through a body transfer that the
> > resource data has changed and the server is not maintaining a cached
> > copy of the old data, then the body response SHOULD be restarted with
> > a different ETag Option value. Any subsequent missing block requests
> > MUST be responded to using the latest ETag Option value.
> > NEW
> > If the server detects part way through a body transfer that the
> > resource data has changed and the server is not maintaining a cached
> > copy of the old data, then the transmission is terminated.  Any
> > subsequent missing block requests MUST be responded to using the
> > latest ETag and Size2 Option values with the updated data.
> >
> > [Jon] Preferences ?
> 
> Both (MUST keep around as well as terminating the transmission) are fine.
> I'd be leaning towards the server just stopping if (or when) it doesn't have
> the cached version around any more as it mitigates slow loris style DoS
> attacks. But as said, either is fine -- they both ensure that no different
> ETags come on the same token unless disambiguated by an incrementing
> Observe option.

[Jon1] The -07 draft version has gone with the latter.
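
A minimal Python sketch of the "terminate, then answer missing-block requests from the updated data" behaviour in the NEW text quoted above might look like this. The class, its field names and the 1024-byte block size are assumptions for illustration, not taken from the draft or from any implementation.

```python
from dataclasses import dataclass

BLOCK_SIZE = 1024  # assumed block size for the illustration


@dataclass
class BodyTransfer:
    etag: bytes
    data: bytes
    sending: bool = True  # True while the initial run of payloads is going out

    def resource_changed(self, new_etag: bytes, new_data: bytes,
                         cached_copy_kept: bool) -> None:
        if cached_copy_kept:
            return  # old representation is still served under the old ETag
        self.sending = False  # terminate the transmission in progress
        self.etag = new_etag
        self.data = new_data

    def missing_block(self, num: int) -> tuple[bytes, int, bytes]:
        """Return (ETag, Size2, block payload) using the latest data."""
        start = num * BLOCK_SIZE
        return self.etag, len(self.data), self.data[start:start + BLOCK_SIZE]


transfer = BodyTransfer(etag=b"v1", data=b"a" * 3000)
transfer.resource_changed(b"v2", b"b" * 1500, cached_copy_kept=False)
etag, size2, payload = transfer.missing_block(1)
beyond = transfer.missing_block(2)[2]  # block 2 existed in v1 but not in v2
```

The last line shows the case raised earlier in the thread: a block number that was valid for the old body can lie beyond the end of the shorter new body, so the payload comes back empty.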

> 
> > > * "When the next client completes building the body, any existing
> > >   partial body transmission to the CoAP server is terminated": Just a
> > >   note, you're already using Request-Tag, so you could leave it up to
> > >   the proxy to try to run them simultaneously (it obviously can, as it
> > >   already got to the point of having both request bodies loaded in
> > >   full). The server can then still cancel one or postpone the other.
> >
> > [Jon] If the proxy uses multiplexing client logic to talk to a
> > singular server with a common 'CoAP session', it has to determine
> > which body entity to terminate - and may choose the wrong one based on
> > timing, whereas it is simpler to get the proxy to make the correct decision.
> 
> Why does it have to terminate either? It can simultaneously receive two
> requests, and then simultaneously relay them.
> 
[Jon1] OK - taking your suggestion.  So if there are 2 clients talking
directly to a server and updating the same resource using Q-Block1,
whichever body arrives last wins by overwriting the resource just updated by
the first client to succeed - fine.  With an interim proxy that maintains
secondary/upstream "CoAP Sessions" to the server, the same is true at the
server - the last wins.  But let's say Client 1 is first to the server and
Client 2 is last.  So, the server representation of the resource is
"Client 2".  However, the representation of the resource on the proxy could
be "Client 1" or "Client 2" depending on the timing (and any recovery) of
data flowing over the secondary/upstream "Session(s)".  Then a subsequent
request for this resource could end up returning the "Client 1" or
"Client 2" version if the proxy just elects to respond.

[Jon1] Hence the proxy needs to decide what to do (i.e. terminate one of
the concurrent updates).
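
The divergence described above can be shown with a toy Python model, assuming a simple last-writer-wins rule at both the server and the proxy. The function name and the two completion orders are made up for the example; real timing would depend on recovery over the upstream session(s).

```python
def last_writer_wins(completion_order: list[str]) -> str:
    """Return the representation left after applying updates in order."""
    representation = ""
    for body in completion_order:
        representation = body  # each completed body overwrites the resource
    return representation


# At the server, Client 1's body completes first and Client 2's last.
server_view = last_writer_wins(["Client 1", "Client 2"])
# At the proxy, recovery of missing payloads happens to reverse the order.
proxy_view = last_writer_wins(["Client 2", "Client 1"])
```

With these orders the proxy would answer a later request with the "Client 1" version while the server holds "Client 2", which is the inconsistency that terminating one of the concurrent updates avoids.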

> 
> Best regards
> Christian
> 
> PS. I'm also around the IETF gather venue for most of the meeting time, and
> now have a fresh memory of the state of affairs, so especially for the last
> point it may help to do a faster back-and-forth there (although I should be
> reasonably responsive with mail now too).
> 
> --
> To use raw power is to make yourself infinitely vulnerable to greater
> powers.
>   -- Bene Gesserit axiom