[Dots] AD review of draft-ietf-dots-signal-channel-25

Benjamin Kaduk <kaduk@mit.edu> Wed, 16 January 2019 00:14 UTC

Date: Tue, 15 Jan 2019 18:14:04 -0600
From: Benjamin Kaduk <kaduk@mit.edu>
To: draft-ietf-dots-signal-channel@ietf.org
CC: dots@ietf.org
Message-ID: <20190116001404.GC91727@kduck.mit.edu>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dots/l0jdpFnlHn2X5yTyc51P_t78QaQ>

AD review of -25.  (Note that -26 is out already, but to preserve my
sanity I am still using the section numbers from the -25.  I took a
quick look at the diff, and tried to remove comments that are no longer
applicable, but probably missed some.)

This review is pretty long, but I think the document is actually in
pretty good shape -- the length of review comments is just a normal
frequency of rough edges multiplied by a long document.

I will make a github pull request with some changes that I think are
just editorial (but feel free to bring them up as discussion items if
you are not sure that they are editorial, disagree with the changes,
etc.)

We should probably introduce (or avoid) the concept of "peace-time"
(sometimes spelled without the hyphen) and perhaps the corresponding
attack situation before using it.

Section 1

I wasn't sure whether the RFC 4987 reference was supposed to be for what
an ACK-flood is or the fact that it is CPU-exhausting or something else.
(Searching for those terms in its text did not help me come to a
conclusion.)

We also talk about "DOTS servers" and "DOTS agents" in Section 1, before
we've defined the terminology.  But these two are sufficiently
self-explanatory that it may be fine to leave them as-is.

                                                 This protocol enables
   cooperation between DOTS agents to permit a highly-automated network
   defense that is robust, reliable, and secure.

We should explain somewhere what we mean by "secure".  (I guess, that
the signal and data channels are integrity- and
confidentiality-protected, and that changes in mitigation status are not
made unless properly authorized.)

Section 3

            DOTS clients MAY alternatively support means to dynamically
   discover the ports used by their DOTS servers.  [...]

If we have some examples in mind of dynamic discovery protocols, it
would be appropriate to provide informative references here.

We should also have an informative reference for "Happy Eyeballs" (e.g.,
RFC 8305).

   From that standpoint, this document specifies a YANG module for
   representing DOTS mitigation scopes, DOTS signal channel session
   [...]

Which standpoint?

   Representing these data as CBOR data is assumed to follow the rules
   in [I-D.ietf-core-yang-cbor] or those in [RFC7951] combined with
   JSON/CBOR conversion rules in [RFC7049].  All parameters in the
   payload of the DOTS signal channel are mapped to CBOR types as
   specified in Section 6.

How do I know which set of rules is in use?  Are the two sets of rules
equivalent and interoperable?

   DOTS agents MUST support GET, PUT, and DELETE CoAP methods.  [...]

As written, this applies to all agents (clients, servers, and proxies).
Presumably this is intended, but I just wanted to double-check --
sometimes it's okay to place weaker requirements on clients.

   In deployments where one or more translators (e.g., Traditional NAT
   [RFC3022], CGN [RFC6888], NAT64 [RFC6146], NPTv6 [RFC6296]) are
   enabled between the client's network and the DOTS server, DOTS signal
   channel messages forwarded to a DOTS server MUST NOT include internal
   IP addresses/prefixes and/or port numbers; external addresses/
   prefixes and/or port numbers as assigned by the translator MUST be

Do we want to give any examples of signal channel messages that might
include internal addresses/prefixes?

   Such procedure is needed to avoid experiencing long connection
   delays.  For example, if an IPv4 path to reach a DOTS server is
   found, but the DOTS server's IPv6 path is not working, a dual-stack
   DOTS client may experience a significant connection delay compared to
   an IPv4-only DOTS client.  [...]

There seems to be a bit of a mismatch between "found" and "not working".
If the point is supposed to be that the client is going to try both IPv4
and IPv6 but the v6 path has issues and it would take the client's
algorithm some extra time to choose to fall back to v4, then maybe "if
an IPv4 path to a DOTS server is functional and there is also an
apparently present but non-functional IPv6 path to the same server"?

                                  These connection attempts are
   performed by the DOTS client when it initializes.  The results of the
   Happy Eyeballs procedure are used by the DOTS client for sending its
   subsequent messages to the DOTS server.

I see this is scoped to "when the client initializes", so I don't know
whether it adds more or not to end with "for the duration of the DOTS
session" or some similar scoping for the caching of the happy eyeballs
results.

   client because there is no long delay before using IPv4 and TCP.  The
   DOTS client repeats the mechanism to discover whether DOTS signal
   channel messages with DTLS over UDP becomes available from the DOTS
   server, so the DOTS client can migrate the DOTS signal channel from

What would cause UDP to "become available"?  Would this be a case like
if there's a stateful NAT or where we need to actively apply some sort
of NAT traversal?  If so, do we want to list it as an example?  (If it's
a complicated scenario, it may not be worth listing an example, of
course.)

I'm not sure if we want to add some annotations to Figure 4 that (e.g.)
the initial DTLS attempts do not need to be strictly ordered in the
order presented, or to indicate which lines are retries.

Section 4.4

   GET:    DOTS clients may use the GET method to subscribe to DOTS
           server status messages, or to retrieve the list of its
           mitigations maintained by a DOTS server (Section 4.4.2).

I'm not really a canonical HTTP expert, but I suspect that using GET to
"subscribe" will raise some eyebrows at later stages of review.  Maybe
we should just leave it as-is and wait for such review to arrive,
though, rather than trying to do something preemptively.

I don't know if this is the proper section to talk about it, but we
should probably have some text justifying the use of application-layer
acknowledgment as opposed to just using Confirmable CoAP messages.

Section 4.4.1

If Figure 5 is going to use actual/example values for cuid and mid, then
I'd suggest also using actual/example values for the JSON fields
(instead of "integer" and "string" and such).

   cuid:  Stands for Client Unique Identifier.  A globally unique
      identifier that is meant to prevent collisions among DOTS clients,
      especially those from the same domain.  It MUST be generated by
      DOTS clients.

In order to get global uniqueness we either need sufficient randomness
or a hierarchical allocation strategy.  Since we recommend a strategy
that involves hashing the (public) client certificate, we may also want
to note that this value does not need to be unguessable.  If a random
allocation is possible, I suggest giving guidance on how many random
bits must be used to sufficiently ensure global uniqueness (128 might be
enough).

                          The output of the cryptographic hash algorithm
      is truncated to 16 bytes; truncation is done by stripping off the
      final 16 bytes.  The truncated output is base64url encoded.

side note: Truncation to 128 bits reduces the collision resistance to
64-bit strength, though this is probably acceptable in this use case.
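
For illustration, the quoted procedure can be sketched in Python as
below.  This is only a sketch under stated assumptions: SHA-256, the
placeholder input bytes, the removal of base64 padding, and the name
make_cuid are not specified by the quoted text.

```python
import base64
import hashlib


def make_cuid(key_material: bytes) -> str:
    """Hash the input, keep the first 16 bytes, base64url-encode them."""
    digest = hashlib.sha256(key_material).digest()  # 32 bytes (SHA-256 assumed)
    truncated = digest[:16]                         # strip off the final 16 bytes
    encoded = base64.urlsafe_b64encode(truncated)
    # Assumption: trailing '=' padding is dropped from the encoded value.
    return encoded.rstrip(b"=").decode("ascii")


# Placeholder input; a real client would hash its own key material.
print(make_cuid(b"placeholder key material"))
```

With 128 bits retained, a birthday-style collision is expected after
roughly 2^64 generated values, which is where the 64-bit figure above
comes from.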

      DOTS servers MUST return 4.09 (Conflict) error code to a DOTS peer
      to notify that the 'cuid' is already in-use by another DOTS
      client.  Upon receipt of that error code, a new 'cuid' MUST be
      generated by the DOTS peer.

Is there any way in which this situation could arise during some sort of
migration scenario?

      Client-domain DOTS gateways MUST handle 'cuid' collision directly
      and it is RECOMMENDED that 'cuid' collision is handled directly by
      server-domain DOTS gateways.

Is a gateway supposed to know if it is "client-side" or "server-side" by
explicit configuration or some more automatic method?

   'cuid' and 'mid' MUST NOT appear in the PUT request message body.

Do we need to say this given that there's no obvious structure in which
to put them?  (That is, is this just a reminder to implementors of a
previous version of the draft versus guidance to new implementations?)

      The prefix list MUST NOT include broadcast, loopback, or multicast
      addresses.  These addresses are considered as invalid values.  In

Does "invalid" mean that the particular address gets ignored or that the
entire request is rejected?
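
The two readings can be contrasted with a short Python sketch.  The
function names are hypothetical, and only the IPv4 limited broadcast
address is checked here (directed broadcast cannot be recognized
without knowing the subnet layout); neither reading is claimed to be
the one the draft intends.

```python
import ipaddress

# Assumption: only the limited broadcast address is treated as broadcast.
BROADCAST_V4 = ipaddress.ip_network("255.255.255.255/32")


def is_invalid_target(prefix: str) -> bool:
    """True for loopback, multicast, or limited-broadcast prefixes."""
    net = ipaddress.ip_network(prefix, strict=False)
    return net.is_loopback or net.is_multicast or net == BROADCAST_V4


def filter_prefixes(prefixes):
    """Reading A: silently ignore the offending entries."""
    return [p for p in prefixes if not is_invalid_target(p)]


def validate_prefixes(prefixes):
    """Reading B: reject the entire request if any entry is invalid."""
    bad = [p for p in prefixes if is_invalid_target(p)]
    if bad:
        raise ValueError("invalid target prefixes: " + ", ".join(bad))
    return list(prefixes)
```

Under reading A a request listing 192.0.2.0/24 and 127.0.0.1/32 would
proceed with only the first prefix; under reading B it would fail as a
whole.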

   target-fqdn:   A list of Fully Qualified Domain Names (FQDNs)
      identifying resources under attack.  An FQDN is the full name of a
      resource, rather than just its hostname.  For example, "venera" is
      a hostname, and "venera.isi.edu" is an FQDN [RFC1983].

I would suggest referring to draft-ietf-dnsop-terminology-bis rather
than trying to reproduce a definition of FQDN.

We also may want to note that the DNS view can change depending on the
vantage point used to make the observation.

On page 18 we say that server-domain DOTS gateways "MUST supply [...]
'cdid'", but the previous paragraph says that there SHOULD be a
configuration option for whether to insert 'cdid' on such systems.  The
two requirements are in conflict; which was intended?  (I assume the
MUST -- why would it be desired to not have the information available?)

The same paragraph also says that the 'cdid' is "likely to be the same
as" the 'cuid', though in addition to the listed reasons the client only
has a SHOULD requirement to follow this procedure for generating 'cuid';
it may be worth noting the client's leeway here as well.

      If a DOTS client is provisioned, for example, with distinct
      certificates as a function of the peer server-domain DOTS gateway,
      distinct 'cdid' values may be supplied by a server-domain DOTS
      gateway.  The ultimate DOTS server MUST treat those 'cdid' values
      as equivalent.

Is this paragraph trying to say that a client might supply different
certs to different gateways in the same domain (e.g., if one such
gateway does not support ecdsa certs)?  It was pretty confusing on the
first read, in particular how the ultimate DOTS server would know that
the different 'cdid' values actually correspond to the same client
[domain].

      DOTS servers MUST ignore 'cdid' attributes that are directly
      supplied by source DOTS clients or client-domain DOTS gateways.
      This implies that first server-domain DOTS gateways MUST strip
      'cdid' attributes supplied by DOTS clients.  DOTS servers SHOULD
      support a configuration parameter to identify DOTS gateways that
      are trusted to supply 'cdid' attributes.

Is there any mechanism (e.g., autodetection) other than this by which
servers/proxies can learn about the topology information needed to
fulfil these requirements?

   Because of the complexity to handle partial failure cases, this
   specification does not allow for including multiple mitigation
   requests in the same PUT request.  Concretely, a DOTS client MUST NOT
   include multiple 'scope' parameters in the same PUT request.

"multiple 'scope' parameters" reads like a CBOR object with duplicate
keys; is this intended to refer to a multi-valued array?

   The DOTS server couples the DOTS signal and data channel sessions
   using the DOTS client identity and optionally the 'cdid' parameter

This requires the client to use the same certificate to authenticate the
signal and data channels to a given server (even when a gateway is in
use).  Above, we discussed cases where a client might have multiple
certificates available.  Where do we specify the requirement to use the
same certificate for signal and data channels?

                                               DOTS agents can safely
   ignore Vendor-Specific parameters they don't understand.

This is not a very useful statement if the agent must know that a
parameter is vendor-specific in order to safely ignore it.

   If the DOTS server does not find the 'mid' parameter value conveyed
   in the PUT request in its configuration data, it MAY accept the
   mitigation request by sending back a 2.01 (Created) response to the
   DOTS client; the DOTS server will consequently try to mitigate the
   attack.

   If the DOTS server finds the 'mid' parameter value conveyed in the
   PUT request in its configuration data bound to that DOTS client, it
   MAY update the mitigation request, and a 2.04 (Changed) response is
   returned to indicate a successful update of the mitigation request.

There are two "MAY"s spread across these two paragraphs; do we want to
say anything about in what cases the server might want to have different
behavior (i.e., ignore the MAY)?

                                                  To avoid maintaining a
   long list of overlapping mitigation requests (i.e., requests with the
   same 'trigger-mitigation' type and overlapping scopes) from a DOTS
   client and avoid error-prone provisioning of mitigation requests from
   a DOTS client, the overlapped lower numeric 'mid' MUST be
   automatically deleted and no longer available at the DOTS server.
   For example, if the DOTS server receives a mitigation request which
   overlaps with an existing mitigation with a higher numeric 'mid', the
   DOTS server rejects the request by returning 4.09 (Conflict) to the
   DOTS client.  [...]

This seems to be internally inconsistent -- first we hear that the
overlapped mitigation request with lower 'mid' MUST be deleted (BTW,
note my rewording), but at the end we are hearing that a request to
create an overlapping mitigation must be rejected.  (A couple pages
later we also say "can be achieved by sending a PUT request [...] that
will override the existing one with overlapping mitigation scopes",
which suggests a resolution of the conflict.)
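
One possible way to reconcile the two statements is sketched below in
Python.  It is a guess at the intended behavior, not something the
draft text confirms: a new request replaces overlapping requests with a
lower 'mid', and is rejected only if an overlapping request with a
higher 'mid' already exists.  The function name, the data layout, and
the use of set intersection as the overlap test are all assumptions.

```python
def handle_put(active, mid, scope):
    """Process a mitigation PUT and return a CoAP response code string.

    active: dict mapping 'mid' (int) to a set of target identifiers.
    """
    overlapping = [m for m, s in active.items() if m != mid and s & scope]
    if any(m > mid for m in overlapping):
        return "4.09"              # a newer overlapping request exists
    for m in overlapping:
        del active[m]              # overlapped lower 'mid' is deleted
    created = mid not in active
    active[mid] = set(scope)
    return "2.01" if created else "2.04"
```

With this reading, a PUT with mid=2 that overlaps an existing mid=1
removes mid=1, while a later PUT with mid=1 overlapping mid=2 gets 4.09.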

                 The response includes enough information for a DOTS
   client to recognize the source of the conflict as described below:

   conflict-information:  Indicates that a mitigation request is
   [...]

I'm not entirely sure what encoding to infer from just this description
-- conflict-information/-cause/-scope are in the YANG model, but do we
need to say that the response contains a CBOR representation of the
subtree starting at conflict-information (or something like that), or is
this implicitly specified elsewhere in the document?  For example,
Figure 9 has an explicit JSON diagnostic notation for a message
response.

      conflict-scope:  Indicates the conflict scope.  It may include a
         list of IP addresses, a list of prefixes, a list of port
         numbers, a list of target protocols, a list of FQDNs, a list of
         URIs, a list of alias-names, or a 'mid'.

This feels a little under-specified -- do we need to include exactly the
conflicting entries?  Or would we just say something like "this
conflicts with 'mid' 8; figure it out"?

         3:  CUID Collision.  This code is returned when a DOTS client
             uses a 'cuid' that is already used by another DOTS client.
             This code is an indication that the request has been
             rejected and a new request with a new 'cuid' is to be re-
             sent by the DOTS client.  Note that 'conflict-status',
             'conflict-scope', and 'retry-timer' are not returned in the
             error response.

(Do we want "MUST NOT" 2119 language to prevent their inclusion or is
the current descriptive language sufficient?)

      conflict-scope:  Indicates the conflict scope.  It may include a
         list of IP addresses, a list of prefixes, a list of port
         numbers, a list of target protocols, a list of FQDNs, a list of
         URIs, a list of alias-names, or references to conflicting ACLs.

How do I reference an ACL?

                          This can be achieved by sending a PUT request
   with a new 'mid' value that will override the existing one with
   overlapping mitigation scopes.

Do we want to reiterate that the existing one will be deleted?

   For a mitigation request to continue beyond the initial negotiated
   lifetime, the DOTS client has to refresh the current mitigation
   request by sending a new PUT request.  This PUT request MUST use the
   same 'mid' value, and MUST repeat all the other parameters as sent in
   the original mitigation request apart from a possible change to the
   lifetime parameter value.

If the server did not have a check that the other parameters match, what
would happen?  (Do we need normative language that the server MUST
verify the other parameters when the 'mid' matches?)

Section 4.4.2

                          If the representation of all the active
   mitigation requests associated with the DOTS client does not fit
   within a single datagram, the DOTS server MUST use the Block2 Option
   with NUM = 0 in the GET response.  [...]

I'm not very familiar with CoAP blockwise transfers; is this in conflict
with some normal feature-negotiation scheme or is it okay to assume that
the client will be able to handle this even if the client has not
explicitly indicated support?

      This is a mandatory attribute when an attack mitigation is
      triggered.  Particularly, 'mitigation-start' is not returned for a
      mitigation with 'status' code set to 8.

Is this better as "triggered" or as "active"?

bps-dropped and pps-dropped "SHOULD be a five-minute average", but that
does not seem quite well-defined.  Is it a rolling five-minute average
over bins of one second, or a binned five-minute average for the last
five-minute interval that ends on a multiple of '5', or something else?
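
The difference between the first two interpretations shows up in a
small Python sketch.  The per-second sample layout and both function
names are assumptions made for illustration only.

```python
WINDOW = 300  # five minutes, in seconds


def rolling_average(per_second, now):
    """Average over the 300 one-second bins ending at `now` (inclusive)."""
    start = now - WINDOW
    total = sum(v for t, v in per_second.items() if start < t <= now)
    return total / WINDOW


def binned_average(per_second, now):
    """Average over the last complete interval ending on a multiple of 300."""
    end = (now // WINDOW) * WINDOW
    start = end - WINDOW
    total = sum(v for t, v in per_second.items() if start <= t < end)
    return total / WINDOW


# Hypothetical counters: 100 units/s dropped, rising to 400 units/s at t=900.
samples = {t: 100 for t in range(600, 900)}
samples.update({t: 400 for t in range(900, 1050)})
print(rolling_average(samples, 1049))  # 250.0
print(binned_average(samples, 1049))   # 100.0
```

The same data yields 250.0 or 100.0 depending on the definition, so two
implementations could report quite different values for one attack.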

   Upon receipt of a conflict notification message indicating that a
   mitigation request is deactivated because of a conflict, a DOTS
   client MUST NOT resend the same mitigation request before the expiry
   of 'retry-timer'.  It is also recommended that DOTS clients support
   means to alert administrators about mitigation conflicts.

Do we need to say anything regarding the server behavior when a
retransmit or similar was in flight already when the conflict
notification was generated?  (That is, the server's enforcement of the
MUST NOT may need to be restrained.)

   Alternatively, the DOTS client can explicitly deregister itself by
   issuing a GET request that has the Token field set to the token of
   the observation to be cancelled and includes an Observe Option with
   the value set to '1' (deregister).

This seems like the more "polite" behavior (as opposed to just
"forgetting") -- why is the polite behavior not the default behavior?

Does the GET in Figure 13 need to have a ".well-known" and other path
components?

Section 4.4.3

I'm not questioning the decision, but I am a bit curious how we ended up
with a boolean "attack-status" parameter and no quantitative
measurements.

   The DOTS server indicates the result of processing a PUT request
   using CoAP response codes.  The response code 2.04 (Changed) is
   returned if the DOTS server has accepted the mitigation efficacy
   update.  The error response code 5.03 (Service Unavailable) is
   returned if the DOTS server has erred or is incapable of performing
   the mitigation.

What (if anything) should the client do in response?  Should the client
assume the server is dead and move to a different one?

Section 4.4.4

   terminating period SHOULD be set by default to 120 seconds.  If the
   client requests mitigation again before the initial active-but-
   terminating period elapses, the DOTS server MAY exponentially
   increase the active-but-terminating period up to a maximum of 300
   seconds (5 minutes).

Is the base of the exponent supposed to be 2 (i.e., a doubling of the
delay)?  This should probably be made explicit.  Also, is the input to
the doubling the previous total delay, or the amount of delay remaining
when the repeated request arrives?
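
To show why the answer matters, here is a Python sketch of both
readings with a base of 2.  The base, the function names, and the
example inputs are assumptions; only the 120-second default and the
300-second cap come from the quoted text.

```python
DEFAULT_PERIOD = 120  # seconds, default from the quoted text
MAX_PERIOD = 300      # seconds, cap from the quoted text


def next_period_from_total(previous_period, base=2):
    """Reading A: scale the previous total period, capped at the maximum."""
    return min(previous_period * base, MAX_PERIOD)


def next_period_from_remaining(remaining, base=2):
    """Reading B: scale only the time left when the new request arrives."""
    return min(remaining * base, MAX_PERIOD)


print(next_period_from_total(DEFAULT_PERIOD))  # 240
print(next_period_from_total(240))             # 300 (capped)
print(next_period_from_remaining(30))          # 60
```

Under reading B a request arriving late in the period can produce a
new period shorter than the default, which seems unlikely to be the
intent.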

   If a mitigation is triggered due to a signal channel loss, the DOTS
   server relies upon normal triggers to stop that mitigation
   (typically, receipt of a valid DELETE request, expiry of the
   mitigation lifetime, or observation of traffic to the attack target).

I think "observation of traffic" probably needs more explanation -- as
written, it sounds like any packet arriving would be sufficient, which
is probably not the intended behavior...

Section 4.5.1

Does the "max-retransmit" value exclude the initial transmission?

I'm not sure whether there are any particularly interesting security
considerations to mention about what happens when ack-random-factor is
insufficiently random; it seems like things would only get really bad if
the value was constant across peers.

Section 4.5.2

                                                 A DOTS client MUST NOT
   transmit a "CoAP Ping" while waiting for the previous "CoAP Ping"
   response from the same DOTS server.

Is this intended to block retransmissions of the same CoAP Ping?

   sid:  Session Identifier is an identifier for the DOTS signal channel
      session configuration data represented as an integer.  This
      identifier MUST be generated by DOTS clients.  'sid' values MUST
      increase monotonically.

What event triggers the increase?

Section 4.5.3

                                       When a DDoS attack is active,
   refresh requests MUST NOT be sent by DOTS clients and the DOTS server
   MUST NOT terminate the (D)TLS session after the expiry of the value
   returned in Max-Age Option.

It seems like there's a bit of a race condition with attacks and refresh
requests coming in in parallel.

Section 4.6

   The DOTS server can return the error response code 5.03 in response
   to a request from the DOTS client or convey the error response code
   5.03 in a unidirectional notification response from the DOTS server.

Just to check my understanding, this unidirectional notification would
only happen if there was a previous Observe to establish the channel?

The "alt-server-record" description here is pretty vague on the string
formatting used, v4 vs. v6, etc.; I guess that the actual "type
inet:ip-address" in the YANG module is probably specific enough, though.

Section 4.7

The first two paragraphs seem to have high overlap and would probably
benefit from at least being joined together if not also trimmed down a
bit.

   In case of a massive DDoS attack that saturates the incoming link(s)
   to the DOTS client, all traffic from the DOTS server to the DOTS
   client will likely be dropped, although the DOTS server receives
   heartbeat requests in addition to DOTS messages sent by the DOTS
   client.  In this scenario, the DOTS agents MUST behave differently to
   handle message transmission and DOTS session liveliness during link
   saturation:

Do we need to say how the agents independently detect the
link-saturation situation?

Section 5.1

It looks like the tree diagram and the module differ about the
"mandatory false" nature of some nodes, e.g., upper-port.  Perhaps the
tree diagram should be regenerated?

Section 5.2

I mostly expect some heartburn from various folks about overriding
protocol number 0's interpretation as the IPv6 hop-by-hop option to mean
"all protocols", though it will probably be fine in practice.  I am
including a suggested rewording in my editorial changes that may help
alleviate concerns.

Section 6

I have several thoughts about the CBOR encoding, some of which I
mentioned in some previous emails.  (I also don't have much experience
with CBOR encoding, so I probably have some things wrong, too...)

It seems that CBOR major type 0 also encodes some width information for
integers.  Do we want to require that the CBOR encoding always uses the
width that matches the types in the YANG module, or do we permit a
smaller encoding when the value in question fits?  (Presumably the
answer falls out naturally if we expect people to use off-the-shelf CBOR
libraries...)
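
A hand-written Python sketch of CBOR unsigned integer encoding (major
type 0) makes the two options concrete.  The function names are
hypothetical, and this is not taken from any particular CBOR library.

```python
def encode_uint(value: int) -> bytes:
    """Encode an unsigned integer using the shortest CBOR form."""
    if value < 0:
        raise ValueError("major type 0 holds unsigned integers only")
    if value < 24:
        return bytes([value])                 # value sits in the initial byte
    for info, width in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if value < 1 << (8 * width):
            return bytes([info]) + value.to_bytes(width, "big")
    raise ValueError("value does not fit in 64 bits")


def encode_uint_fixed(value: int, width: int) -> bytes:
    """Encode with a fixed width of 1, 2, 4, or 8 bytes (e.g. a YANG uint16)."""
    info = {1: 24, 2: 25, 4: 26, 8: 27}[width]
    return bytes([info]) + value.to_bytes(width, "big")


print(encode_uint(10).hex())           # 0a
print(encode_uint_fixed(10, 2).hex())  # 19000a
```

Both outputs are well-formed CBOR for the value 10; the question is
whether the document wants to require one of them.  The same rule is
why integer map keys below 24 cost a single byte.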

We say that the initial set of key values is comprehension-required;
does that mean that any other future extensions are necessarily
comprehension-optional, and there can be no additional
comprehension-required keys?

It's probably appropriate to make a statement in the initial text here
about the port number and/or media type indicating that this is the
syntax used for the top-level object.

I already mentioned that the encoding of the integer form of the key
values takes a different number of bytes depending on the integer value,
so I think those keys are rearranged in the -26 (I reviewed the -25).

Section 7.1

   A key challenge to deploying DOTS is the provisioning of DOTS
   clients, including the distribution of keying material to DOTS
   clients to enable the required mutual authentication of DOTS agents.
   EST defines a method of certificate enrollment by which domains

I think we need a reference for EST.

   o  TLS False Start [RFC7918] which reduces round-trips by allowing
      the TLS second flight of messages (ChangeCipherSpec) to also
      contain the DOTS signal.
[...]
   o  TCP Fast Open [RFC7413] can reduce the number of round-trips to
      convey DOTS signal channel message.

These are in something of a specification limbo, as they both appear to
only be formally defined for use with TLS, but TLS False Start seems to
have a straightforward extension to DTLS.  It may be worth tweaking the
text a little to acknowledge the situation here.

Section 7.2

The TLS 1.3 handshake with 0-RTT diagram needs to be
revisited/refreshed, as RFC 8446 does not look like that.  Additionally,
the usage of PSK as well as a certificate is not defined until
draft-housley-tls-tls13-cert-with-extern-psk is published.
I also note that in the diagram as presented, the client is not yet
known to be authenticated when the server sends its initial (0.5-RTT)
DOTS signal message.

Section 7.3

This whole section seems to only be relevant for UDP usage, so probably
the "If UDP is used" clause can be dropped and an introductory statement
added earlier on.

                              Path MTU MUST be greater than or equal to
   [CoAP message size + DTLS overhead of 13 octets + authentication
   overhead of the negotiated DTLS cipher suite + block padding]
   (Section 4.1.1.1 of [RFC6347]).  If the request size exceeds the path
   MTU then the DOTS client MUST split the DOTS signal into separate
   messages, for example the list of addresses in the 'target-prefix'
   parameter could be split into multiple lists and each list conveyed
   in a new PUT request.

(DTLS 1.3 will have a short header for some packets, that is less than
13 octets.)
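
The quoted size check and the suggested splitting can be sketched in
Python as follows.  The function names and the example numbers are
hypothetical; the record header is a parameter precisely because the
13-octet figure would not hold for the shorter DTLS 1.3 header.

```python
def required_path_mtu(coap_size, auth_overhead, block_padding=0,
                      record_header=13):
    """Sum of the terms in the quoted formula, in octets."""
    return coap_size + record_header + auth_overhead + block_padding


def split_prefixes(prefixes, max_per_request):
    """Split one 'target-prefix' list into lists for separate PUT requests."""
    return [prefixes[i:i + max_per_request]
            for i in range(0, len(prefixes), max_per_request)]


# Hypothetical numbers: a 1200-octet CoAP message and 24 octets of
# cipher-suite overhead, checked against a 1280-octet path MTU.
needed = required_path_mtu(coap_size=1200, auth_overhead=24)
print(needed, needed <= 1280)  # 1237 True
print(split_prefixes(["192.0.2.0/25", "192.0.2.128/25", "198.51.100.0/24"], 2))
```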

Section 8

We've got some requirements in here about limiting behavior of
clients/servers when talking to gateways; is knowing about the presence
of a gateway something that's required to happen out of band or is there
an in-band way to identify that the peer is a gateway?

   messages from an authorized DOTS gateway, thereby creating a two-link
   chain of transitive authentication between the DOTS client and the
   DOTS server.

This seems to ignore the possibility of setups that include both
client-domain and server-domain gateways.

                 DOTS client certificate validation MUST be performed as
   per [RFC5280] and the DOTS client certificate MUST conform to the
   [RFC5280] certificate profile.  [...]

This seems to duplicate a requirement already stated in Section 7.1;
it's probably best to only have normative language in one location, even
if we need to mention the topic in multiple locations.
Similarly for the mutual authentication requirement, which we duplicate
in the second and fourth paragraphs of this section.

If we don't want to use example.com vs. example.net as sample domains,
we can also use whateverwewant.example, per RFC 6761.

Section 9

We should mention the media-type allocation in the top-level section.

"mappings to CBOR" feels confusing to me, since it leaves empty what we
are mapping from.  Perhaps just talking about a registry of "CBOR map
keys" would be better, both here and in the Section 9.3 intro.

Section 9.3

I suggest being very explicit about whether new requests for
registration should be directed to the mailing list or to IANA, as we've
had some confusion about this elsewhere.

The criteria used by the experts also just list things they should
consider, but do not provide full clarity on which answer to the
question is more likely to be approved.  (And yes, I know that this text
is largely copied from already published RFCs, but we can still do
better.)

Section 9.3.1

If we want the value 0 to be reserved we need to say so.
Do we want to say anything about the usage of negative integers as map
keys?

I suggest not mentioning the postal address, given the recent (e.g.)
GDPR requirements.

Section 9.3.2

It may be worth mentioning Table 4 here as well.

Section 9.5.1

We need to specify which range of values we are asking for an allocation
from.

Section 9.6.1

As above, specify what range we're asking about.

I expect the current text to get some IESG (or directorate) feedback
that the Data Item and Semantics descriptions are repetitive and banal.

Section 9.7

IIUC, IANA is going to ask if we want this module to be "maintained by
IANA", so it would be good to have an answer ready even if we don't put
it in the document text.

   Rate-limiting DOTS requests, including those with new 'cuid' values,
   from the same DOTS client defends against DoS attacks that would

With respect to "new" 'cuid' values, is the server required to remember
which ones it has seen in perpetuity, or can it time them out
eventually?

Section 10

The security considerations seem to be taking a narrow focus on the
requirements for and consequences of specific bits on the wire in the
signal channel protocol.  I think it's appropriate to also include some
high-level thoughts about the functional behavior of the protocol
(signaling that an attack is underway and triggering mitigation,
increasing the availability of services, etc.) and the risks that are
posed by the protocol failing to work properly, whether that means
letting attack traffic through or improperly blocking legitimate
traffic.

Section 13.1

I think the IANA registries should be listed as Informative and not
Normative references.

Section 13.2

Pending resolution of the question about using draft-ietf-core-yang-cbor
rules or RFC7951+RFC7049, the draft-ietf-core-yang-cbor reference may
need to be Normative.

Given that "URI" is a well-known abbreviation, we may be able to get
away with not citing RFC 3986.  On the other hand, it's not causing any
harm to leave it in...

RFC 4632 needs to be Normative, since we need to know CIDR to
encode/decode target-prefixes.

Similarly, I think that RFCs 6234, 7413, 7589, 7918, 7924, and 7951
should also be Normative.


-Ben
