Re: [Dots] AD evaluation of draft-ietf-dots-signal-call-home-09
Benjamin Kaduk <kaduk@mit.edu> Wed, 14 October 2020 04:31 UTC
Date: Tue, 13 Oct 2020 21:31:42 -0700
From: Benjamin Kaduk <kaduk@mit.edu>
To: draft-ietf-dots-signal-call-home.all@ietf.org
Cc: dots@ietf.org
Message-ID: <20201014043142.GG50845@kduck.mit.edu>
References: <20201013230819.GA50845@kduck.mit.edu>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <20201013230819.GA50845@kduck.mit.edu>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dots/FhRTudDNPL0GF6CkM534-EvTrhA>
Subject: Re: [Dots] AD evaluation of draft-ietf-dots-signal-call-home-09

Oops, forgot to note that I also put a bunch of (hopefully!) editorial
suggestions up at https://github.com/boucadair/dots-call-home/pull/1 rather
than littering them here.

-Ben

On Tue, Oct 13, 2020 at 04:08:19PM -0700, Benjamin Kaduk wrote:
> Hi all,
>
> Nothing super-earth-shattering in here, but there's enough that we'll need
> a revised I-D and some more WG input before I'm ready to start the IETF LC.
>
> -Ben
>
>
> YANG lint says:
>
> ietf-dots-call-home@2020-07-07.yang:11: warning: imported module
> "ietf-dots-signal-channel" not used
>
> AFAICT that's a yanglint bug, since we do augment nodes from the signal
> channel.
>
> I see that this document has a few IPR notices against it; in
> particular, the one at https://datatracker.ietf.org/ipr/3318/ does not
> list any licensing terms ("Unwilling to Commit to the Provisions of a),
> b), or c) Above", where a/b/c are "no license needed for
> implementation"/"RAND with no fee"/"RAND with possible fee").
> I believe we are working to see if the situation can be clarified and
> the IPR disclosure updated, but I don't know that we need to delay any
> progress until that has occurred. That said, personally, I'm not very
> comfortable about a protocol that potentially is encumbered by IPR with
> unknown license terms. However, I am willing to at least advance the
> document to IETF LC if there is clear WG support for doing so, even
> knowing that use of the protocol would potentially be subject to
> arbitrary terms from the IPR holder. I would really like to hear from
> the WG members on this point.
>
> I mention it in the section-by-section comments, but we introduce the
> possibility of "wait for administrator approval" in the
> mitigation-request processing pipeline, which could exceed a normal CoAP
> request timeout. I think we need to have some substantive text
> discussing the expected behavior in this case.
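
To put a rough number on the timeout concern above, here is a minimal
sketch. It assumes the default CoAP transmission parameters from RFC 7252,
Section 4.8 (a DOTS session may well negotiate different values), treats the
request as a Confirmable exchange, and uses a purely hypothetical ten-minute
administrator approval delay. The function names are illustrative only.

```python
# Default CoAP transmission parameters (RFC 7252, Section 4.8).
ACK_TIMEOUT = 2.0        # seconds
ACK_RANDOM_FACTOR = 1.5
MAX_RETRANSMIT = 4


def max_transmit_wait(ack_timeout=ACK_TIMEOUT,
                      ack_random_factor=ACK_RANDOM_FACTOR,
                      max_retransmit=MAX_RETRANSMIT):
    """MAX_TRANSMIT_WAIT from RFC 7252, Section 4.8.2: the longest time from
    the first transmission of a Confirmable message until the sender gives
    up waiting for an acknowledgement."""
    return ack_timeout * (2 ** (max_retransmit + 1) - 1) * ack_random_factor


def approval_outlasts_exchange(approval_delay_s, **coap_params):
    """True if a (hypothetical) administrator approval delay is longer than
    the window computed by max_transmit_wait()."""
    return approval_delay_s > max_transmit_wait(**coap_params)


if __name__ == "__main__":
    print(max_transmit_wait())                  # 93.0 with the defaults
    print(approval_outlasts_exchange(10 * 60))  # hypothetical 10-minute wait
```

With the defaults the window is 93 seconds, so any approval step measured in
minutes would outlast it; that is the gap the requested text would need to
cover.
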
>
> Section 1.1
>
>    Some of the DDoS attacks like spoofed RST or FIN packets, Slowloris,
>    and Transport Layer Security (TLS) re-negotiation are difficult to
>    detect on a home network device without adversely affecting its
>
> (side note) TLS renegotiation as an attack is just keeping a TLS
> connection open and repeatedly re-negotiating to cause the server to
> burn CPU on signing operations? I don't think I've heard of that one
> before.
>
> Section 1.2
>
>    'DOTS signal channel Call Home' (or DOTS Call Home, for short) refers
>    to a DOTS signal channel established at the initiative of a DOTS
>    server. That is, the DOTS server initiates a secure connection to a
>    DOTS client, and uses that connection to receive the attack traffic
>    information (e.g., attack sources) from the DOTS client. More
>    details are provided in Section 3.
>
> I think this introductory section needs to be crystal clear for the
> reader about the relationship between the "conventional" signal channel
> DOTS client and the corresponding call home DOTS client for a given
> mitigation process. Some people will expect both things with "client"
> in the name to be on the same device, and some people will expect the
> (call home) client to be the device that makes mitigation requests, but
> both cannot be right. The phrase "role reversal" might be useful in
> describing these interpretations, as might a forward reference to
> section 1.4. (Yes, I understand that there is no need for either DOTS
> call home peer to be colocated with any base signal channel element, but
> I think that the default scenario in many readers' minds will be the
> simple role-reversal setup.)
>
> Section 1.3
>
> It's a little unfortunate that we are using the "DMS" acronym in the
> figure here, when we only mention the terminology of RFC 8612 in section
> 2. (That said, I'm willing to leave it as-is for now and see if anyone
> complains.)
>
> Section 3.1
>
>    The DOTS signal channel Call Home preserves all but one of the DOTS
>    client/server roles in the DOTS protocol stack, as compared to DOTS
>    client-initiated DOTS signal channel protocol
>
> I suppose one could quibble about whether "party allowed to behave
> passively with respect to heartbeats" qualifies as a "client/server
> role" ... I think I'm okay leaving this as-is for now, though.
>
>    For example, a home network element (e.g., home router) co-located
>    with a Call Home DOTS server is the (D)TLS server. However, when
>
> Figure 7 has a box labelled "Call Home DOTS server" and also "(D)TLS
> client" (not server). At first I thought this was trying to make an
> analogy to a (normal) signal channel DOTS server as being the (D)TLS
> server, but that would not be in a home network element. So maybe this
> is just a typo?
>
>    calling home, the DOTS server initially assumes the role of the
>    (D)TLS client, but the network element's role as a DOTS server
>
> We may want to continue using "DOTS Call Home server" in these two lines.
>
>    remains the same. Furthermore, existing certificate chains and
>    mutual authentication mechanisms between the DOTS agents are
>    unaffected by the Call Home function. This Call Home function
>
> This may merit a bit more text, or at least a bit more thought, since we
> are now asking the certificate validation to be used for a different
> logical purpose. While it's true that the entity acting as a (D)TLS
> server is likely to be the same device as a regular DOTS server (or at
> least operated by the same DMS provider), and so there is a fairly
> strong analogue there in terms of the (D)TLS server certificate
> validation procedures, the (D)TLS client is something of a different
> entity than a traditional DOTS client.
> It's true that RFC 8782 leaves
> the specifics of the mutual authentication a bit under-specified (and
> essentially just says that it has to happen somehow), but one might
> imagine that the number of DOTS clients is fairly small and tied to
> specific legal contracts, so manual provisioning (or provisioning during
> onboarding) of client certificate information is reasonable. For the
> call home case, we should expect a lot more call home DOTS servers
> (i.e., (D)TLS clients) and thus should probably have a better story for
> automating the mutual authentication check. Defining an
> extendedKeyUsage value that indicates authorization to act as a call home
> server would be one typical way to do so (it's perhaps unfortunate that
> we didn't define an EKU for DOTS client usage), but if we're not going
> to do that we should at least put some words in about how the mutual
> authentication requirement remains but a different ACL may be needed for
> call home than for traditional DOTS sessions.
>
>    enables the DOTS server co-located with a network element (possibly
>
> (Still Call Home server, right?)
>
> Section 3.2.1
>
>    If the Call Home DOTS server does not receive any traffic from the
>    peer Call Home DOTS client during the time span required to exhaust
>    the maximum 'missing-hb-allowed' threshold, the Call Home DOTS server
>    concludes the session is disconnected. Then, the Call Home DOTS
>    server MUST try to resume the (D)TLS session.
>
> Why resume specifically, as opposed to the broader "initiate a new (D)TLS
> connection" that could encompass both resumption and a full handshake?
>
> Section 3.2.2
>
>    If a Call Home DOTS client wants to redirect a Call Home DOTS server
>    to another Call Home DOTS client, it MUST send a Non-confirmable PUT
>    request to the predefined resource ".well-known/dots/redirect" with
>    the new Call Home DOTS client FQDN or IP address in the body of the
>    PUT similar to what is described in Section 4.6 of
>    [I-D.ietf-dots-rfc8782-bis].
>    [...]
>
> I suggest that we mention the actual element in the YANG module that
> contains the structure that will be used as the PUT body, as this text
> in isolation feels like it's attempting to define the protocol by way of
> example and analogy, which is not a great pattern for protocol design.
>
> Section 3.3.1
>
>    In
>    addition, the DOTS client MUST validate that attacker prefixes are
>    within the scope of the DOTS server domain.
>
> What does "within the scope" mean in the context of the base signal
> channel?
>
>    (i.e., the Call Home scenario depicted in Figure 7). The 'target-
>    uri' or 'target-fqdn' parameters can be included in a mitigation
>    request for diagnostic purposes to notify the Call Home DOTS server
>    domain administrator, but SHOULD NOT be used to determine the target
>    IP addresses. Note that 'target-prefix' becomes a mandatory
>    attribute in the mitigation request signaling the attack information
>    because 'target-uri' and 'target-fqdn' are optional attributes and
>    'alias-name' will not be conveyed in a mitigation request.
>
> I think we have to use normative language that 'target-prefix' is
> mandatory for call home, since the "don't rely on target-fqdn or
> target-uri" is not a MUST. (Actually, I think they would have to be
> "MUST NOT send", not just "MUST NOT rely on for identification", in
> order for us to be able to get away with the current wording that states
> it like a fact.)
>
> Also, we might want to explain why 'alias-name' cannot be used (and we
> don't need normative language to ensure it): they are created using the
> data channel but there is no call home data channel (yet, at least).
>
>    In order to help attack source identification by a Call Home DOTS
>    server, the Call Home DOTS client SHOULD include in its mitigation
>    request additional information such as 'source-port-range' or
>    'source-icmp-type-range'. The Call Home DOTS client may not include
>    such information if 'source-prefix' conveys an IPv6 address/prefix.
>
> I'm not sure what the "may not" is intending to convey, here. Are these
> mandatory for IPv4 prefixes?
>
>    The Call Home DOTS server MUST check that the 'source-prefix' is
>    within the scope of the Call Home DOTS server domain. Note that in a
>
> (nit) this "MUST" seems redundant with the text I quoted previously.
>
>    The Call Home DOTS server MUST check that the 'source-prefix' is
>    within the scope of the Call Home DOTS server domain. Note that in a
>    DOTS Call Home scenario, the Call Home DOTS server considers, by
>    default, that any routeable IP prefix enclosed in 'target-prefix' is
>    within the scope of the Call Home DOTS client. [...]
>
> We say "by default" -- how would some other behavior be activated?
>
>    with or without DOTS server domain administrator consent. If the
>    attack traffic is blocked, the Call Home DOTS server informs the Call
>    Home DOTS client that the attack is being mitigated.
>
> This is just a normal 2.xx response code (and body) to the mitigation
> request? It might be worth clarifying.
>
>    If the attack traffic information is identified by the Call Home DOTS
>    server or the Call Home DOTS server domain administrator as
>    legitimate traffic, the mitigation request is rejected, and 4.09
>    (Conflict) is returned to the Call Home DOTS client. The conflict-
>
> There may be quite some delay involved if the administrator needs to
> decide. Should we say more about (e.g.) using 5.03 and Max-Age in this
> case?
>
>    Once the request is validated by the Call Home DOTS server,
>    appropriate actions are enforced to block the attack traffic within
>    the source network. The Call Home DOTS client is informed about the
>    progress of the attack mitigation following the rules in
>    [I-D.ietf-dots-rfc8782-bis].
>    For example, if the Call Home DOTS
>    server is embedded in a CPE, it can program the packet processor to
>    punt all the traffic from the compromised device to the target to
>
> I think the sentence about "informed about the progress" might be
> misplaced at this location within the paragraph -- the example given seems to
> just be talking about the "appropriate actions" that are taken for blocking
> traffic, not any mitigation-status updates.
>
> Section 3.3.2
>
>    If a Carrier Grade NAT (CGN, including NAT64) is located between the
>    DOTS client domain and DOTS server domain, communicating an external
>    IP address in a mitigation request is likely to be discarded by the
>    Call Home DOTS server because the external IP address is not visible
>    locally to the Call Home DOTS server (see Figure 10). The Call Home
>    DOTS server is only aware of the internal IP addresses/prefixes bound
>    to its domain. Thus, the Call Home DOTS client MUST NOT include the
>    external IP address and/or port number identifying the suspect attack
>    source, but MUST include the internal IP address and/or port number.
>
> We're likely to get similar complaints about "how will they know there's
> a NAT" that we did for the base signal channel. I don't have any great
> suggestions for trying to forestall such comments, though I do note that
> 8782 has some explicit text about "[t]his document does not make any
> recommendations about possible translator discovery mechanisms".
>
> Also, it's amusing that for the base signal channel we said to *not*
> use internal addresses, but for call home we say you have to use
> internal addresses. In the base signal channel we also said that we did
> not give recommendations on how to discover possible translator
> mechanisms...
>
>    To that aim, the Call Home DOTS client SHOULD rely on mechanisms,
>    such as [RFC8512] or [RFC8513], to retrieve the internal IP address
>
> ... yet here we seem to be making such recommendations!
>
>    If a MAP Border Relay [RFC7597] or lwAFTR [RFC7596] is enabled in the
>    provider's domain to service its customers, the identification of an
>    attack source bound to an IPv4 address/prefix MUST also rely on
>    source port numbers because the same IPv4 address is assigned to
>    multiple customers. The port information is required to
>    unambiguously identify the source of an attack.
>
> [same question about how to know that they are in use]
>
>    If a translator is enabled on the boundaries of the domain hosting
>    the Call Home DOTS server (e.g., a CPE with NAT enabled as shown in
>    Figures 11 and 12), the Call Home DOTS server uses the attack traffic
>    information conveyed in a mitigation request to find the internal
>    source IP address of the compromised device and blocks the traffic
>
> In a similar vein, I expect to get some questions about how the call
> home DOTS server finds the internal source IP address from the attack
> traffic information conveyed. I don't have a specific change to propose
> at this time, since I don't know, myself, but we should at least have
> some answer to give in response to such questions.
>
> The text is also a little unclear on why we provide both Figures 11 and
> 12 -- while both cases are valid, we don't seem to have any discussion
> that highlights differences between the cases. So perhaps we should say
> that the behavior of the call home DOTS server is the same whether or
> not the call home DOTS server is integrated with the CPE/NAT (if true)?
>
> Section 3.3.3
>
> I think the YANG module might benefit from being moved up a level or two
> in the section hierarchy.
>
> Section 3.3.3.1
>
> Should we give some indication that 'signal' is the import prefix for
> "ietf-dots-signal-channel" before going into the tree diagram? (I do
> not know what the convention is in this regard.)
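
Going back to the MAP/lwAFTR paragraph quoted above, the reason the port is
needed can be shown with a small sketch. Everything in it is hypothetical:
the binding table, the customer labels, and the helper name are made up for
illustration, and 192.0.2.1 is a documentation address. It does not model
how a real MAP Border Relay or lwAFTR stores its state.

```python
import ipaddress
from typing import List, NamedTuple, Optional


class PortSetBinding(NamedTuple):
    """One customer's share of an IPv4 address (hypothetical structure)."""
    address: ipaddress.IPv4Address
    lower_port: int
    upper_port: int
    customer: str


# Two customers share the same IPv4 address with disjoint port sets.
BINDINGS = [
    PortSetBinding(ipaddress.IPv4Address("192.0.2.1"), 1024, 2047, "cpe-a"),
    PortSetBinding(ipaddress.IPv4Address("192.0.2.1"), 2048, 3071, "cpe-b"),
]


def customers_for(address: str, port: Optional[int] = None) -> List[str]:
    """Return the customers that could be the source of traffic seen from
    'address', optionally narrowed down by source port."""
    addr = ipaddress.IPv4Address(address)
    return [
        b.customer
        for b in BINDINGS
        if b.address == addr
        and (port is None or b.lower_port <= port <= b.upper_port)
    ]


if __name__ == "__main__":
    print(customers_for("192.0.2.1"))        # address alone is ambiguous
    print(customers_for("192.0.2.1", 2500))  # the port disambiguates
```

The address alone matches both customers; only the address plus source port
singles one out, which is the situation the quoted MUST is addressing.
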
>
> Section 3.3.3.2
>
> I suggest reiterating the note from 8782 about needing to check the
> mapping output provided by YANG-to-CBOR in light of the situations where
> differing CBOR/JSON types can arise (e.g., enumerations and 64-bit
> quantities).
>
> I guess it's implicit that we reuse the CBOR map keys 8 and 9 for
> lower-port and upper-port in the source-port-range array?
>
> Section 3.3.3.3
>
>    This module uses the common YANG types defined in [RFC6991] and the
>    data structure defined in [RFC8791].
>
> (nit) I think we need another word here, maybe "data structure
> extension" or "data structure statement"?
>
>      list source-icmp-type-range {
>        key "lower-type";
>        description
>          "ICMP type range. When only lower-type is
>           present, it represents a single ICMP type.";
>
> It seems that the interpretation of the source-icmp-type-range list is
> dependent on the IP address family in use. Presumably one is supposed
> to infer this from the source-prefix (though we don't say so), but the
> source-prefix is optional when these fields are used in the base signal
> channel. It is not entirely clear whether it is safe to rely on the
> target-prefix for address-family determination, though (I do not recall
> any reason why DOTS signal channel doesn't work in the presence of a
> NAPT function). Should the icmp attributes only be allowed if the
> source-prefix is present?
>
>    sx:augment-structure "/signal:dots-signal/signal:message-type/"
>                       + "signal:redirected-signal" {
>      description
>        "The alternate Call Home DOTS client.";
>
> Is there something we can/should do for the redirected-signal
> augmentation to indicate that the alt-server and alt-server-record nodes
> are removed/useless?
>
>      leaf alt-ch-client {
>        type string;
>        description
>          "FQDN of an alternate Call Home DOTS client.";
>
> Can we discuss a bit what the implications of redirection are for (D)TLS
> mutual authentication of the post-redirection channel?
> It seems that in
> most cases the alt-ch-client FQDN will be needed to perform certificate
> validation. Perhaps, though, there would be a case where a call home
> server has a set of preconfigured call home clients (each with IP
> address(es) and credentials), so a redirection by IP address to a
> different client in that set would still be functional. So, I am not
> sure that we need to make alt-ch-client a mandatory field (whether in
> the YANG or in the prose), but we probably do need to have some text
> earlier in the document covering the implications for authenticating the
> redirected-to peer. I note that in the base signal channel, alt-server
> is mandatory, so we would have some precedent for just making
> alt-ch-client mandatory as well.
>
> Section 4.2
>
> Table 2 doesn't seem consistent with Table 1 -- Table 1 lists a couple
> parameters that admit multiple CBOR types, but Table 2 only lists a
> single CBOR Major Type for them.
>
> Section 4.3
>
> We don't have any visible note about removing TBA9 (and should probably
> add some text about 4 only being the *requested* value as well, though
> I'm pretty sure we'd know if there were other Standards Actions in the
> works that would be potentially requesting a conflicting value!).
>
> Section 5
>
> We should probably say something about how the call home channel is
> a potential vector for attack, with mitigation requests potentially
> causing the indicated device to be partially blocked or booted off the
> network entirely; mutual authentication of a trusted call home server,
> "a healthy dose of skepticism about the indicated attack" (or
> rather, local inspection of the indicated traffic), and involvement of
> the local administrator can mitigate the new risks that are opened up.
> This is related to what we currently have in the last paragraph, but we
> don't specifically discuss it as a new risk/attack vector due to this
> protocol.
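
One concrete piece of that skepticism is the 'source-prefix' scope check
quoted from Section 3.3.1. A minimal sketch of what such a check might look
like, assuming the Call Home DOTS server knows its own domain as a list of
prefixes; the prefix list and the function name are hypothetical, and the
draft does not prescribe this (or any) implementation.

```python
import ipaddress

# Hypothetical prefixes making up the Call Home DOTS server domain.
DOMAIN_PREFIXES = [
    ipaddress.ip_network("2001:db8:1234::/48"),
    ipaddress.ip_network("192.168.1.0/24"),
]


def in_scope(source_prefix: str) -> bool:
    """True if 'source_prefix' lies entirely inside one of the domain's
    prefixes. Address families are compared first because subnet_of()
    raises TypeError when the two networks differ in IP version."""
    net = ipaddress.ip_network(source_prefix, strict=False)
    return any(
        net.version == scope.version and net.subnet_of(scope)
        for scope in DOMAIN_PREFIXES
    )


if __name__ == "__main__":
    for prefix in ("192.168.1.23/32", "2001:db8:1234:5::/64",
                   "198.51.100.7/32"):
        print(prefix, in_scope(prefix))
```

A request whose 'source-prefix' fails this check would be a candidate for
rejection before any blocking action is taken; passing it says nothing about
whether the traffic is actually an attack, so local inspection would still
be needed.
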
>
> We might also consider referencing the security considerations of RFC
> 8071 (NETCONF/RESTCONF call home) since the "considerations not
> associated with server authentication" are likely similar.
>
> There may also be some considerations when the indicated attack
> source-prefix is in a private or local-use address range -- the "in
> scope" check at the call home server doesn't mean much.
>
>    Common precautions mitigating DoS attacks are recommended, such as
>    temporarily blacklisting the source address after a set number of
>    unsuccessful authentication attempts.
>
> [I note that we used the term "drop-list" in RFC 8783.]
>
> Section 6
>
> We should probably say something about how the use of DPI and similar
> IPS technologies will have privacy considerations of their own, but the
> specific considerations are specific to the specific device or
> technology in question (thus, out of scope for this specific document).
>
>    Concretely, the protocol does not leak any new information that can
>    be used to ease surveillance. In particular, the Call Home DOTS
>    server is not required to share information that is local to its
>    network (e.g., internal identifiers of an attack source) with the
>    Call Home DOTS client.
>
> I guess it's true that we don't require the call-home server to share
> internal addresses with the call-home client, but we do require the
> call-home client to (only) use internal addresses in the mitigation
> request. Presumably the client has to learn those addresses somehow,
> and one could argue that this protocol is requiring that to happen even
> if it doesn't convey them in-band in that direction. So I think we need
> to say more about the privacy considerations of using internal addresses
> -- it's not safe to claim that it's out of scope.
>
> Also, this paragraph seems to be the main part that is directly
> privacy-relevant in this section -- would the other paragraphs be a
> better fit in the Security Considerations section?
> I guess the last
> paragraph does touch on privacy with the "not meant to track the
> activity of users", so it could stay here.
>
>    Triggers to send a DOTS mitigation request to a Call Home DOTS server
>    are deployment-specific. For example, a Call Home DOTS client may
>    rely on the output of some DDoS detection systems deployed within the
>    DOTS client domain to detect potential outbound DDoS attacks or on
>    abuse claims received from remote victim networks. Such DDoS
>    detection and mitigation techniques are not meant to track the
>    activity of users, but to protect the Internet and avoid altering the
>    IP reputation of the DOTS client domain.
>
> Perhaps we could say a little more about what steps this mechanism takes
> to avoid identifying users? E.g., the indicated data refer only to the
> source and target addresses where suspected attack flows are present;
> while it does permit expressing flow-level granularity there is no
> in-band protocol element that would correlate one suspected-attack flow
> with another suspected-attack flow as being associated with the same
> user. It is, however, possible that a faulty attack classification
> algorithm could consistently identify the legitimate behaviors of a
> particular user as being suspected attacks, in which case that user's
> traffic would consistently be flagged, but that user's traffic is still
> grouped in with the entire anonymity set of "all suspected attack
> traffic".
>
> Section 9.2
>
> If the toolchain supports it, we should probably refer to RFC 4632 as
> BCP 122 in the text when we reference it.
>
> We may get someone asking for RFC 8612 to be normative since we expect
> the reader to be familiar with its terminology.
> It is only an
> informational document and is not on the downref registry, so in some
> sense it is "safer" to preemptively move it to being a normative
> reference that will automatically get called out as a downref during the
> IETF LC announcement; on the other hand, the IESG does have leeway to
> just approve it without another IETF LC if it does need to change from
> informative to normative as a result of IETF LC or IESG comments.
>
> I might suggest using a different slug for the [Sec] reference; the
> current short form is potentially misleading.
>
> Appendix A
>
>    The other approach is signaling the role of each DOTS agent (e.g., by
>    using the DOTS data channel). For example, the DOTS agent in the
>    home network first initiates a DOTS data channel to the peer DOTS
>    agent in the ISP environment, at this time the DOTS agent in the home
>    network is the DOTS client and the peer DOTS agent in the ISP
>    environment is the DOTS server. After that, the DOTS agent in the
>    home network retrieves the DOTS Call Home capability of the peer DOTS
>    agent. If the peer supports the DOTS Call Home, the DOTS agent needs
>    to subscribe to the peer to use this extension. [...]
>
> I don't remember how such a capability negotiation on the data channel
> would work. Is it supposed to just be keying off of the peer's
> supported YANG modules or something like that? Might be worth
> clarifying.
>