Re: [secdir] secdir review of draft-ietf-tsvwg-circuit-breaker-11

Watson Ladd <watsonbladd@gmail.com> Fri, 12 February 2016 05:01 UTC

In-Reply-To: <alpine.GSO.1.10.1602111737220.26829@multics.mit.edu>
References: <alpine.GSO.1.10.1602111737220.26829@multics.mit.edu>
Date: Thu, 11 Feb 2016 21:01:41 -0800
Message-ID: <CACsn0cnEu=f5NRsXYygbmUDA14gjo7JB0dYRbTMbzDVGcPXCSw@mail.gmail.com>
From: Watson Ladd <watsonbladd@gmail.com>
To: Benjamin Kaduk <kaduk@mit.edu>
Content-Type: text/plain; charset="UTF-8"
Archived-At: <http://mailarchive.ietf.org/arch/msg/secdir/InjPFJCIBwqJnfJ9wCL1dn-k0zU>
Cc: draft-ietf-tsvwg-circuit-breaker.all@ietf.org, "<iesg@ietf.org>" <iesg@ietf.org>, secdir@ietf.org
Subject: Re: [secdir] secdir review of draft-ietf-tsvwg-circuit-breaker-11

On Thu, Feb 11, 2016 at 8:21 PM, Benjamin Kaduk <kaduk@mit.edu> wrote:
> I have reviewed this document as part of the security directorate's
> ongoing effort to review all IETF documents being processed by the
> IESG.  These comments were written primarily for the benefit of the
> security area directors.  Document editors and WG chairs should treat
> these comments just like any other last call comments.
>
> This document is ready with nits (the rest of this paragraph), modulo one
> question I have (the following paragraph).  Since it's more of a
> requirements doc than a full protocol specification, there are not too
> many requirements for security considerations.  This document correctly
> notes the risk of an attacker using the circuit breaker mechanism for
> denial of service and the need for integrity and authenticity of control
> messages.  It states that there is a trade-off between the cost of crypto
> and the need to authenticate control messages when there is a risk of
> on-path attack; I am a little uncomfortable with this statement (it is
> perhaps "too weak"), especially since it does not give guidance on
> determining the level of risk, but neither do I have a concrete objection
> to it.  (Given the availability of physical network taps to at least
> nation-state-level actors, there seems to always be a risk of on-path
> attack.)  Likewise, I am somewhat uneasy with the claim that just
> randomization of source port (or similar randomization in the packet
> header) suffices to deter an off-path attacker -- for example, in crypto,
> we usually talk of reducing the attacker's success probability to below
> 2^-32 or 2^-64 or something like that, but there are only ~2^16 port
> numbers to randomize in, so the success probability from just port number
> randomization would not meet the usual criteria.  So, perhaps that's not a
> good example to use on its own in the requirements document; other fields
> in the packet header could have a larger search space and be more
> reasonable for this purpose.  The rest of the security considerations are
> good, covering the issues related to capacity and robustness, and
> mentioning the need for per-mechanism analysis.

Sending 2^32 packets is pretty easy, especially if I only need to get
lucky once. All control signals need to be authenticated, or they will
be spoofed. Also, this "on path" notion matters very little: routing
is easily forged, you don't know what's actually on your network,
routers have holes, etc.
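
To put rough numbers on the guessing argument, here is a minimal Python
sketch. It assumes an off-path attacker who sends independent, uniformly
random guesses at a randomized header field; the function names and the
rate of one million packets per second are illustrative assumptions, not
figures from the draft or the review.

```python
import math


def spoof_success_probability(field_bits: int, packets_sent: int) -> float:
    """Chance that at least one of `packets_sent` independent uniform
    guesses matches a random field that is `field_bits` bits wide."""
    p_single = 2.0 ** -field_bits
    # 1 - (1 - p)^n, computed with log1p/expm1 so tiny p stays accurate.
    return -math.expm1(packets_sent * math.log1p(-p_single))


def seconds_to_send(packets: int, packets_per_second: float) -> float:
    """Time needed to emit `packets` at a steady assumed rate."""
    return packets / packets_per_second


if __name__ == "__main__":
    assumed_rate = 1e6  # packets per second; an assumption for illustration
    for bits in (16, 32, 64):
        p = spoof_success_probability(bits, 2 ** 16)
        print(f"{bits}-bit field, 2^16 guesses: P(success) = {p:.3g}")
    minutes = seconds_to_send(2 ** 32, assumed_rate) / 60
    print(f"2^32 packets at {assumed_rate:.0f} pps: {minutes:.1f} minutes")
```

Under these assumptions, 2^16 guesses against a 16-bit port number
succeed about 63% of the time, and 2^32 packets take a little over an
hour to send.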

>
> The one question I have relates to the possibility of circuit breakers
> becoming ubiquitous.  It seems pretty clear that going from a network with
> no circuit breakers to a network with one circuit breaker is worthwhile,
> offering a local improvement for the flows in question.  But if all, or
> nearly all routes through the network traversed one or more circuit
> breakers -- is there a risk of cascading failure, either accidental or
> purposefully triggered by an attacker?  In a network where circuit
> breakers occupy what a topologist would call a dense or fully-connected
> subset of the network, would one circuit breaker tripping cause subsequent
> breaker trips and near-complete network shutdown?  There seems to even be
> something of an analogue (though not a perfect one) in electrical circuit
> breakers, where an extreme failure in a device can cause the breaker in a
> power strip to fuse closed, causing the breaker for the particular circuit
> in the building that it's on to fuse closed, tripping the main breaker for
> the whole panel.  It is uncommon for the cascade to continue to the mains
> for the building or the local substation, but is a known risk.  So, has
> anyone thought about the behavior of circuit breakers if/when they are
> ubiquitous in the network?

Yes, this is a serious concern.

Let's consider how IPsec or MPLS network tunnels carrying TCP traffic
currently function (at least as I understand it). Each IP packet the
endpoint sends gets encrypted/labeled, goes through the tunnel, and at
the far side of the tunnel gets decrypted and forwarded. Crucially, a
loss event in the tunnel (and hopefully ECN, but I'm not holding my
breath) becomes a loss event for the TCP implementation, which then
backs off, the same as if the tunnel weren't there.

One concern is if there are multiple possible tunnels to use, say
three sites connected via IPsec, with each pair exposed as a
transport route for the OSPF routing happening over the VPN. Unusual,
sure, but I'm sure someone out there has done it. I go and SCP a big
file from site #1 to site #2, and the machines get up to full speed
because there is no congestion. Unfortunately the transfer exceeds a
badly set limit, or a temporary congestion event happens. The circuit
breaker on the link between them triggers, causing a tunnel-loss
event. My packets now go the long way around, triggering the second
circuit breaker. Congratulations, you've just knocked out the entire
office network temporarily. Even better is if the first tunnel goes
down for unrelated reasons, triggering a rerouting, so that properly
sized breakers become too small.
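
The three-site scenario can be sketched as a toy simulation in Python.
This is only an illustration under simplifying assumptions: one flow at
a fixed rate, a fixed limit per tunnel breaker, shortest-path rerouting
after every trip, and every overloaded tunnel on the current path
tripping together. The names `shortest_path`, `simulate` and the site
labels are hypothetical, and the case where a tunnel fails for unrelated
reasons is not modelled.

```python
from collections import deque


def shortest_path(links, src, dst):
    """Breadth-first search over live links; returns a node list or None."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for a, b in links:
            if node in (a, b):
                nxt = b if node == a else a
                if nxt not in prev:
                    prev[nxt] = node
                    queue.append(nxt)
    return None


def simulate(limits, src, dst, rate):
    """limits maps a sorted (a, b) tunnel to its breaker limit.
    Returns the tunnels whose breakers trip, in the order they trip."""
    live = dict(limits)
    tripped = []
    while True:
        path = shortest_path(list(live), src, dst)
        if path is None:
            return tripped  # no route left between src and dst
        hops = [tuple(sorted(hop)) for hop in zip(path, path[1:])]
        over = [hop for hop in hops if rate > live[hop]]
        if not over:
            return tripped  # the flow fits on the current path
        for hop in over:
            del live[hop]  # breaker trips, tunnel is withdrawn
            tripped.append(hop)


if __name__ == "__main__":
    limits = {("s1", "s2"): 100, ("s1", "s3"): 100, ("s2", "s3"): 100}
    print(simulate(limits, "s1", "s2", rate=150))
    print(simulate(limits, "s1", "s2", rate=50))
```

In this model a single flow above the limit takes down the direct tunnel
first and then both tunnels on the detour, while a flow below the limit
trips nothing.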

The argument here is that TDM pseudowires (PWs) don't fit into a
congestion-controlled world, as they are not congestion controlled,
and dropping packets makes them not work, so we should drop 100% of
the packets to make things work better. I think this is really an
argument against deploying TDM PWs: the Internet is packet-switched,
not circuit-switched, and so circuits don't fit nicely into it. It's
not clear that officially blessing a dangerous, footgunny idea is a
good solution to this problem.

>
> (Also, it's amusing to see "CB" used for "circuit breaker" in this
> context, as I'm so used to seeing it expand to "channel binding".  It
> seems that the RFC Editor's abbreviation list
> https://www.rfc-editor.org/materials/abbrev.expansion.txt includes neither
> form...)
>
>
> Section 7.3 as written does not seem terribly connected to circuit
> breakers or the rest of the document.  Should it be removed?
>
>
> This secdir review also comes with a bonus copyediting pass; iesg@ and
> secdir@ feel free to stop reading now.
>
> The last sentence of the first paragraph of section 1 ("Just ...
> appliance.") does not have an independent clause, and leaves the reader
> hanging.
>
> In Section 1, second paragraph, "countered by the requirement to use
> congestion control by the transmission control protocol" would probably
> flow better as "in the [TCP]" or "with the [TCP]", since although the TCP
> specification is what requires the use of congestion control, the TCP
> protocol itself is just using congestion control.
>
> In Section 1, second paragraph, penultimate sentence, "applications of the
> User Datagram Protocol" suffers from the dual meaning of "applications" as
> "software programs" and "instances where it is used".  The first time I
> read it, I flagged it to be changed to "applications using [UDP]", but of
> course it is the latter meaning that was intended.  Not a big deal, but
> perhaps this could be rearranged to avoid the potential confusion.  (I
> don't think there's a good word to just replace the single word with,
> though.)
>
> Section 1, third paragraph, penultimate sentence has a comma splice.
>
> Section 1, fourth paragraph, second sentence: a timescale is inherently an
> order-of-magnitude thing, and different paths have a different RTT, so
> there is not a single timescale on which congestion control operates.  I
> suggest just saying "operates on the timescale of a packet RTT", but
> "operates on a timescale on the order of a packet RTT" is probably fine,
> too.
>
> In the following sentence, the concept of "packet loss/marking" is calmly
> used with no introduction.  I'm not personally familiar with packet
> marking in this sense, and though the usage later in the document gave me
> some rough sense of what it means, maybe a bit more introduction (e.g., an
> informational reference) would be useful.  Or maybe it's a term of art in
> transport and I'm just not a practitioner; that's possible, too.
>
> In a similar vein, "5-tuple" at the end of that same paragraph may want an
> informational reference to RFC 6146, or may be considered common knowledge
> for the target audience.
>
> In the next paragraph, there's a comma splice in the penultimate sentence.
>
> The long paragraph in the middle of page 4 seems to introduce a new term
> "control function" without much explanation; this phrase does not seem to
> be used anyplace else in the document (though "control plane function"
> has one occurrence), so it seems likely that a slight rewording here would
> improve the document.  (I'm not actually entirely sure what it's trying to
> say, so I don't have any concrete suggestions.)  In this and the following
> sentence, it would be good to make more clear that the text is talking
> about circuit breakers and not other forms of congestion control.
>
> The second bullet point for examples of situations that could trigger
> circuit breakers ("traffic generated by an application...utilised for
> other purposes") confused me the first time I read it.  Perhaps shuffle
> things around a bit to clarify that it is that "the network capacity
> provisioned for that application is being utilised for other purposes",
> though upon re-reading the existing text may suffice as-is.
>
> The second sentence of the first (full, i.e., non-bullet-point) paragraph
> on page 5 seems to suffer from a bit of pronoun/antecedent confusion.  In
> particular, "will generate elastic traffic that may be expected to
> regulate the load" reads as if it is the generated traffic itself that
> will regulate the load, whereas a common way of thinking about it would be
> that it is the application that is regulating the load produced by the
> traffic that the application generates.  Also, in "the load it introduces"
> there is ambiguity as to whether "it" refers to the application or the
> traffic.  (Perhaps this ambiguity is irrelevant, but in general ambiguity
> in a spec is to be avoided.)
>
> In the following paragraph, the second sentence is a bit long, and heavily
> broken up by qualifiers that are not really needed ("all but impossible",
> "may further be the case", "may have some difficulty", "has in fact been
> tripped").  As copyeditor, I would suggest splitting this into two
> sentences and removing some of the unneeded words.
>
> Should "Circuit Breaker" be uniformly capitalized throughout the document?
> It is not capitalized in the first sentence of Section 1.1.  (Perhaps the
> plural "Breakers" is also appropriate?)
>
> On pages 8/9, it would be good to maintain parallel structure across the
> enumerated items, most notably by including "that" in the first sentences
> ("An ingress meter that records the number of packets", "A measurement
> function that combines", ...).  Item 3 does not currently fit into that
> structure, and it may not be worth the drastic changes that would be
> needed to stuff it into place, since it is describing an action as opposed
> to the functions that are described in the other items.  But it's probably
> worth making the easy changes.
>
> In item 3 of that list, the capital "An" is not needed after a semicolon,
> and there is another list within the second sentence that could gain a
> more parallel structure if "be sending another in-band" were replaced with
> "sending an in-band".
>
> In Section 4, fourth bullet point, "adjust the traffic to experienced
> congestion" might be better as "adjust the traffic when congestion is
> experienced".
>
> The fifth (i.e., next) bullet point seems to lack a subject for the first
> sentence.  Presumably it refers to the circuit breaker in question, but
> it's best to be explicit about it.
>
> The eighth bullet point (top of page 11), I'd put "it is" before the
> "triggered" in the parenthetical.
>
> In the sixteenth bullet point (second one on page 12), you refer to the
> "source" of control messages, which I think would more conventionally be
> written as the "authenticity" of those messages.  ("Source" is used in
> this fashion in at least one other place in the document, so please change
> all occurrences if changing any.)
>
> I am a bit hazy on what exactly is going on in the example in Section 5.1
> (the last three paragraphs), but I will chalk that up to my lack of
> knowledge about multicast routing.  It's probably worth expanding and/or
> putting an informational reference for PIM-SM, though, and offsetting the
> "however" in the last sentence with commas.
>
> In the first paragraph of Section 5.2, please use the plural "paths" in
> "paths provisioned using the Resource reservation protocol".
>
> Given the success of UDP-based protocols like QUIC, mosh, BitTorrent,
> etc., it seems a little strong to have this claim in Section 6.1 that "all
> applications ought to use a full-featured transport" when the meaning
> seems to really just be that all applications ought to have congestion
> control functionality for their traffic, whether obtained via a
> full-featured transport or built directly into the application [protocol].
>
> I would also consider removing the comma in the penultimate sentence of
> the first paragraph of Section 6.1, though I do not think I can claim that
> it is actually incorrect.
>
> In the next paragraph, "tailored *to* the type of traffic", and probably
> "when multiple congestion-controlled flows *combined* lead to short-term
> overload", since otherwise one could read it as saying that (multiple)
> (congestion-controlled flows lead to short-term overload), in which case
> the "multiple" is seemingly irrelevant.
>
> In the next paragraph (last one on page 15), there's a singular/plural
> mismatch in "a RTP-aware network devices"; I'd go with the plural, but
> it's your call.
>
> In Section 6.1.1, item 3 doesn't seem quite right -- I don't think that
> the breaker ought to trigger just by the act of using a TFRC-style check
> with a hard upper limit; I'd expect that the observed traffic would need
> to exceed that limit, too.  (Also, expand "TFRC".)
>
> From a document structure perspective, it's slightly jarring to not have a
> subsection 6.2.1 with a dedicated example, but I can understand why the
> document is currently the way it is.
>
> The last sentence of Section 6.3.1 seems to come without much lead-in; it
> would be nice to get a better transition into it, and maybe a mention of
> "circuit breaker" and its relation thereto.
>
> In Section 7.1, third paragraph, I don't think I understand what "other
> sharing network traffic" is supposed to mean, or really, what the example
> is saying in general.
>
> In Section 7.2, second paragraph, "For sure" seems a rather informal way
> of starting a sentence.
>
> The third sentence in that paragraph contains a comma splice.
>
> The last sentence of Section 7.2 could benefit from avoiding the pronoun
> in "this protects other network traffic" to clarify what exactly is
> providing the protection ("the network configuration", perhaps?).
>
> In Section 8, first paragraph, it's probably worth covering the failure
> mode when the interval is too short, just for completeness (even though
> it's ~obvious and covered elsewhere in the document).
>
> -Ben



-- 
"Man is born free, but everywhere he is in chains".
--Rousseau.