Re: [Dots] AD evaluation of draft-ietf-dots-signal-filter-control-03

Benjamin Kaduk <kaduk@mit.edu> Wed, 27 May 2020 17:14 UTC

Return-Path: <kaduk@mit.edu>
X-Original-To: dots@ietfa.amsl.com
Delivered-To: dots@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 17BB63A0FE2; Wed, 27 May 2020 10:14:09 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.899
X-Spam-Level:
X-Spam-Status: No, score=-1.899 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id ugP1k7_Qrxqh; Wed, 27 May 2020 10:14:07 -0700 (PDT)
Received: from outgoing.mit.edu (outgoing-auth-1.mit.edu [18.9.28.11]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id D53F53A0FDF; Wed, 27 May 2020 10:14:06 -0700 (PDT)
Received: from kduck.mit.edu ([24.16.140.251]) (authenticated bits=56) (User authenticated as kaduk@ATHENA.MIT.EDU) by outgoing.mit.edu (8.14.7/8.12.4) with ESMTP id 04RHE0p5026267 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 27 May 2020 13:14:03 -0400
Date: Wed, 27 May 2020 10:14:00 -0700
From: Benjamin Kaduk <kaduk@mit.edu>
To: mohamed.boucadair@orange.com
Cc: Benjamin Kaduk <bkaduk@akamai.com>, "draft-ietf-dots-signal-filter-control.all@ietf.org" <draft-ietf-dots-signal-filter-control.all@ietf.org>, "dots@ietf.org" <dots@ietf.org>
Message-ID: <20200527171400.GA58497@kduck.mit.edu>
References: <20200513021222.GY3811@akamai.com> <787AE7BB302AE849A7480A190F8B9330314B9EE0@OPEXCAUBMA2.corporate.adroot.infra.ftgroup>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <787AE7BB302AE849A7480A190F8B9330314B9EE0@OPEXCAUBMA2.corporate.adroot.infra.ftgroup>
User-Agent: Mutt/1.12.1 (2019-06-15)
Archived-At: <https://mailarchive.ietf.org/arch/msg/dots/BfakS47H9Ntz_ic0vVdTDVRx3z4>
Subject: Re: [Dots] AD evaluation of draft-ietf-dots-signal-filter-control-03
X-BeenThere: dots@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "List for discussion of DDoS Open Threat Signaling \(DOTS\) technology and directions." <dots.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/dots>, <mailto:dots-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/dots/>
List-Post: <mailto:dots@ietf.org>
List-Help: <mailto:dots-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/dots>, <mailto:dots-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 27 May 2020 17:14:09 -0000

Hi Med,

Please go ahead and upload the new rev; the changes at
https://github.com/boucadair/filter-control/blob/master/AD-Review.pdf look
good.

A little bit of commentary inline.

On Wed, May 13, 2020 at 10:47:56AM +0000, mohamed.boucadair@orange.com wrote:
> Hi Ben, 
> 
> Thank you for the review. 
> 
> Please see inline.
> 
> Cheers,
> Med
> 
> > -----Message d'origine-----
> > De : Benjamin Kaduk [mailto:bkaduk@akamai.com]
> > Envoyé : mercredi 13 mai 2020 04:12
> > À : draft-ietf-dots-signal-filter-control.all@ietf.org; kaduk@mit.edu
> > Cc : dots@ietf.org
> > Objet : AD evaluation of draft-ietf-dots-signal-filter-control-03
> > 
> > Hi all,
> > 
> > This one is also generally in good shape, with mostly nit-level
> > comments.
> > (I didn't find it on github, so those are inline.)
> > 
> > I have a nagging suspicion that I'm repeating some comments I made on
> > the core protocol documents (having forgotten the response and thus
> > why
> > the change in question should not be made; my apologies in advance for
> > my poor memory.
> > 
> > Section 1.1
> > 
> >    The DOTS data channel protocol [I-D.ietf-dots-data-channel] is used
> >    for bulk data exchange between DOTS agents to improve the
> >    coordination of parties involved in the response to the Distributed
> > 
> > nit: s/the/a/
> > 
> 
> [Med] Fixed. Thanks.
> 
> >    the inbound link to the DOTS client domain.  In other words, the
> > DOTS
> >    client cannot use the DOTS data channel protocol to withdraw the
> >    accept-list filters when a DDoS attack is in progress.  This
> > assumes
> >    that this DOTS client is the owner of the filtering rule.
> > 
> > nit: the last sentence feels a little disconnected from the previous
> > discussion at the moment; I'd suggest adding at the end ", since
> > otherwise this DOTS client would not be authorized to modify it
> > anyway"
> > or similar.
> 
> [Med] After rereading, I think we can get rid of it. I deleted that part of the sentence. 

Good point.  The new text in the security considerations covers this point
anyway.

> > 
> > Section 1.2
> > 
> >    This specification addresses the problems discussed in Section 1.1
> > by
> >    adding the capability of managing filtering rules using the DOTS
> >    signal channel protocol, which enables a DOTS client to request the
> >    activation (or deactivation) of filtering rules during a DDoS
> > attack.
> > 
> > nit: the grammar here does not quite match up ("capability of" vs.
> > "using"); perhaps s/the capability of/a capability for/ (though other
> > fixes are possible).
> 
> [Med] Fixed. 
> 
> > 
> >    Sample examples are provided in Section 4, in particular:
> > 
> > nit: "sample examples" feels redundant; maybe "some examples" or just
> > "examples"?
> > 
> >    o  Section 4.1 illustrates how the filter control extension is used
> >       when conflicts with Access Control List (ACLs) are detected and
> >       reported by a DOTS server.
> > 
> > nit: s/List/Lists/
> 
> [Med] Fixed. Thanks.
> 
> > 
> > Section 3.1
> > 
> >    A filtering rule controlled by the DOTS signal channel is
> > identified
> >    by its ACL name (Section 7.2 of [I-D.ietf-dots-data-channel]).
> > Note
> > 
> > The referenced section is just an example; the actual YANG definition
> > is
> > in Section 4.3 thereof.
> 
> [Med] pointing to section 4.3 is better. Fixed. 
> 
> > 
> >    The activation or deactivation of an ACL by the DOTS signal channel
> >    overrides the 'activation-type' (defined in Section 7.2 of
> >    [I-D.ietf-dots-data-channel]) a priori conveyed with the filtering
> >    rules using the DOTS data channel protocol.
> > 
> > Is there a way to "cancel" the signal-channel-made changes and revert
> > to
> > the previous activation-type?
> 
> [Med] This is up to the DOTS client to tweak these filtering rules as a function of the context. For example, 
> - if a rule was deactivated because of a conflict, the client may delete that rule
> - if the rule was a rate-limit, the client will deactivate the rule.  
> 
> I added this NEW text:
> 
>    Once the attack is mitigated, the DOTS client may use the data
>    channel to control the 'activation-type' (e.g., revert to a default
>    value) of some of the filtering rules controlled by the DOTS signal
>    channel or delete some of these filters.  This behavior is deployment
>    specific. 
> 
> Do we need to say more?

What you added is just right; thanks.
(My original mental model was that the signal-filter-control operation was
an "override" and thus that the override would be reverted at some point;
subsequent discussion has clarified that it's just another mechanism to
update the status, and not an override at all.  Your text describes this
nicely.)

>   Or do we just make the client remember
> > the previous status and manually set that status in such a case?
> > 
> > Section 3.2.1
> > 
> > Why is the acl-name an optional attribute? 
> 
> [Med] Because:
> 
>    This specification extends the mitigation request defined in
>    Section 4.4.1 of [I-D.ietf-dots-signal-channel] to convey the
>    intended control of configured filtering rules.
> 
> A mitigation request does not need to include an acl name to be valid. 
> 
>  What would the
> > interpretation be if it was not provided?
> 
> [Med] This will be a mitigation request as defined in draft-ietf-dots-signal-channel.

I see.  Thanks for the new text about "mandatory attribute when 'acl-list'
is included".

> > 
> >    intended control of configured filtering rules.  Concretely, the
> > DOTS
> >    client conveys 'acl-list' attribute with the following sub-
> > attributes
> >    in the CBOR body of a mitigation request (see the YANG-encoded
> >    structure in Section 3.2.2.1):
> > 
> > nit: YANG is a data description language, not an encoding; perhaps
> > "YANG-formatted structure" or just "YANG structure" is better.
> 
> [Med] Changed to "YANG structure".
> 
> > 
> >    As the attack evolves, DOTS clients can adjust the 'activation-
> > type'
> >    of an ACL conveyed in a mitigation request or control other filters
> >    as necessary.  This can be achieved by sending a PUT request with a
> >    new 'mid' value.
> > 
> > This potentially involves a lot of new 'mid' values if many change are
> > made using signal-channel filtering control over the course of a given
> > mitigation.  Is this going to cause a noticeable increase in the
> > amount
> > of state required at the DOTS server?
> 
> [Med] No. Only the request with the highest 'mid' value will be maintained. 
> 
> We can add a reminder if you think this is useful. 

I don't think we need a reminder in the document; it's pretty clear in the
main signal-channel doc but I hadn't read that recently enough.

> > 
> >    If the DOTS client receives a 5.03 (Service Unavailable) with a
> >    diagnostic payload indicating a failed ACL update as a response to
> > an
> >    initial mitigation or a mitigation with adjusted scope, the DOTS
> >    client MUST immediately send a new request which repeats all the
> >    parameters as sent in the failed mitigation request but without
> >    including the ACL attributes.  After the expiry of Max-Age returned
> > 
> > Why does this need to be a MUST-level requirement?  What if the
> > situation has changed in the interim and the mitigation update is no
> > longer needed?
> 
> [Med] These requests are supposed to happen in a short interval and clients have to immediately place a mitigation request to avoid the inconvenience of including the ACLs. 
> 
> What about the following:  
> 
> s/client MUST immediately/client MUST by default immediately  

I don't have a particular preference between those versions.  Your
explanation makes sense to me, so we can probably leave this as-is for now.

> > 
> >    in the 5.03 (Service Unavailable) response, the DOTS client retries
> >    with a new mitigation request (i.e., a new 'mid') that repeats all
> >    the parameters as sent in the failed mitigation request.
> > 
> > Perhaps mention "including the ACL update" for clarity?
> > 
> 
> [Med] Works for me. 
> 
> > Section 3.2.2.1
> > 
> >        +--rw acl-list* [acl-name] {control-filtering}?
> > 
> > A question for the YANG doctor, no doubt, but is there a need for a
> > feature indicator that gates the entire content of a given module (as
> > opposed to just using module-level granularity to indicate support)?
> 
> [Med] I'm not a YANG doctor, but a feature needs to be defined within the module itself to mark parts that are conditional. I don't quite see why a module need to be explicitly taged as optional. This is actually why it is recommended that if you have a module with a large set of nodes with an if-feature, to define these nodes in a separate module.

I think we are on the same page as to how the YANG features are intended to
work.  What I'm still not sure about is whether there's anything in
ietf-dots-signal-control other than the acl-list (that is to say, whether
it makes sense to implement ieft-dots-signal-control but not support the
control-filtering feature).

> > 
> > Section 3.2.2.2
> > 
> >      namespace
> >         "urn:ietf:params:xml:ns:yang:ietf-dots-signal-control";
> >      prefix signal-control;
> > 
> > Can we think of anything else that might be in scope in the future for
> > which "signal-control" would be ambiguous or a name conflict?
> 
> [Med] What about "dots-control"?

Sure.

> > 
> >        list acl-list {
> >          key "acl-name";
> > 
> > Seeing this reminds me of a shadow of a memory that list keys are
> > implicitly mandatory, which seems relevant for an earlier comment.
> > 
> 
> [Med] If an acl-list is present, then acl-name must be present. Nevertheless, an acl-list is not a mandatory. 
> 
> >          [...]
> >        }
> >        leaf activation-type {
> >          type ietf-data:activation-type;
> >          default "activate-when-mitigating";
> >          description
> >            "Sets the activation type of an ACL.";
> > 
> > This structure seems to bind us to having a single activation-type
> > that
> > applies lto all of the ACLs in the acl-list;
> 
> [Med] Actually, no. We do have an activation-type value for each acl-name in the list. 

It looks like I misread the scope; sorry!

> 
>  furthermore, the global
> > structure of where we augment means that we can only have one (acl-
> > list,
> > activation-type) pair within a given mitigation request.  I *believe*
> > that is consistent with the core signal protocol in terms of what's
> > expected to be in a single message, but please confirm.
> > 
> > Section 4
> > 
> > (The section heading should probably get the same treatment of "Sample
> > Examples" as was previously used, and the body text here as well.)
> 
> [Med] Not sure about this one. Can you please clarify?

This referred back to a previous comment:

% nit: "sample examples" feels redundant; maybe "some examples" or just
% "examples"?

> > 
> > Section 4.1
> > 
> >    Host: {host}:{port}
> > 
> > nit(?): the data-channel doc just uses "example.com" for the host
> > header
> > field; it's not clear that we need to diverge from that style (applies
> > throughout).
> 
> [Med] Good catch. Fixed. Thanks. 
> 
> > 
> >    The DOTS client can also decide to send a PUT request to deactivate
> >    the "an-accept-list" ACL, if suspect traffic is received from an
> >    accept-listed source (2001:db8:1234::/48).  The structure of that
> > PUT
> >    is the same as the one shown in Figure 5.
> > 
> >    [...]
> >    Uri-Path: "mid=124"
> > 
> > Just as a sanity-check: this "mid" of 124 is consistent with the note
> > in
> > Section 3.2.1 that adjusting the 'activation-type' "[...] can be
> > achieved by sending a PUT request with a new 'mid' value", so the 124
> > (vs. 123) is intentional and correct.
> 
> [Med] I confirm.

Thanks!

> > 
> >             "ietf-dots-signal-control:acl-list": [
> >               {
> >                 "ietf-dots-signal-control:acl-name": "an-accept-list",
> >                 "ietf-dots-signal-control:activation-type":
> > "deactivate"
> >               }
> >             ]
> >            "lifetime": 3600
> > 
> > Shouldn't there be a comma after the ']'?
> 
> [Med] Yes. Thanks for catching this. 
> 
> > (Have the examples been run through a JSON validator?)
> 
> [Med] will double check. 
> 
> > 
> > Section 4.3
> > 
> >    Consider a DOTS client that contacts its DOTS server during 'idle'
> >    time to install an accept-list that rate-limits all (or a part
> >    thereof) traffic to be forwarded to 2001:db8:123::/48 as a last
> >    resort countermeasure whenever required.  It does so by sending,
> > for
> > 
> > nit: I could imagine a little confusion as to whether the action
> > referenced by "it does so" is the installation of the accept-list (my
> > belief) or the rate-limiting.  Perhaps "Installing the accpe-list can
> > be
> > done by sending, [...]"?
> 
> [Med] Fixed. 
> 
> 
> > 
> >    example, a PUT request shown in Figure 9.  The DOTS server installs
> > 
> > nit: either "the PUT request shown" or "a PUT request as shown".
> 
> [Med] Fixed.  
> 
> > 
> >                  "actions": {
> >                    "forwarding": "accept",
> >                    "rate-limit": "20.00"
> > 
> > 20 bytes per second?  That seems rather small.
> 
> [Med] Indeed. Will increase the value used in the example.
> 
> > 
> >    For some reason (e.g., the DOTS server, or the mitigator, is
> > lacking
> >    a capability or capacity), the DOTS client is still receiving the
> >    attack traffic which saturates available links.  To soften the
> > 
> > nit: I think s/the attack traffic/attack traffic/ is fine grammar.
> > 
> 
> [Med] Fixed. 
> 
> > Section 5.1
> > 
> >    o  Note to the RFC Editor: Please delete (TBA1-TBA2-TBA3) once the
> >       CBOR key is assigned from the 1-16383 range.  Please update
> >       Table 1 accordingly.
> > 
> > IIRC the unallocated values in this range include both values with a
> > 2-byte encoding and with a 3-byte encoding; do we have a reason to
> > prefer one or the other?
> 
> [Med] 2-byte encoding is preferred to keep the message compact. 
> 
>   Since we're allocating from an "IETF Review"
> > range, we get to suggest a value to IANA and it's not immediately
> > clear
> > to me how much say the DEs have in which value gets used.
> 
> [Med] Updated the table to suggest 52/53/54 values. 
> 
> > 
> > Section 6
> > 
> > We may get someone asking "this document uses YANG; why aren't you
> > using
> > the YANG security considerations template?"  I guess the signal-
> > channel
> > doc made it through without any particular note in that regard, so
> > maybe
> > we'll be safe for this one, too.
> 
> [Med] Agree. We do say explicitly the following in the signal channel:
> 
> =
> This YANG module is not intended to be used via NETCONF/
>    RESTCONF for DOTS server management purposes; such module is out of
>    the scope of this document.  It serves only to provide a data model
>    and encoding, but not a management data model.
> ==
> 
> > 
> > I'd consider mentioning that with both signal and data channels
> > available for updating ACL activation status, there is potential for
> > "skew" wherein an update made in one place is ~immediately overridden
> > by
> > another update.  Fortunately, there was already some degree of
> > asynchronicity/external-changes possible, so clients should already be
> > prepared for this sort of situation and will cope properly by
> > polling/OBSERVE notification.
> 
> [Med] Yeah.  
> 
> > 
> > We could also consider saying something about how the restriction to
> > only changing activation status over the signal channel (but not
> > creating new ACLS entirely) means that if a client hasn't prepared
> > adequately prior to an attack, it can get stuck in a bad place.  This,
> > of course, is also not really new with this document, but is a pretty
> > relevant consideration for its use.
> 
> [Med] Good point. Added this NEW text:
> 
>    This specification does not allow to create new filtering rules,
>    which is the responsibility of the DOTS data channel.  DOTS client
>    domains should be adequately prepared prior to an attack, e.g., by
>    creating filters that will be activated on demand when an attack is
>    detected.
> 
> > 
> > Similarly, a reminder that bad things happen when a DOTS server fails
> > to
> > maintain namespace separation across clients for ACL names could be in
> > order (but is not required).
> 
> [Med] Added this NEW text:
> 
> As a reminder,
>    DOTS servers must associate filtering rules with the DOTS client that
>    created these resources.  Failure to ensure such association by a
>    DOTS server will have severe impact on DOTS client domains.
> 
> > 
> >    A compromised DOTS client can use the filtering control capability
> > to
> >    exacerbate an ongoing attack.  Likewise, such compromised DOTS
> > client
> >    may abstain from reacting to an ACL conflict notification received
> > 
> > nit: singular/plural mismatch "DOTS client"/"such compromised" (so
> > either "such a compromised" or "DOTS clients").
> 
> [Med] Fixed. Thanks.

Thanks for all these fixes!

-Ben