Re: [Stackevo] Optional Security is Not

Spencer Dawkins at IETF <spencerdawkins.ietf@gmail.com> Tue, 29 January 2019 21:14 UTC

From: Spencer Dawkins at IETF <spencerdawkins.ietf@gmail.com>
Date: Tue, 29 Jan 2019 15:14:02 -0600
Message-ID: <CAKKJt-cQXvLWhf3znToJyzJZcnZoFXNU2k5x-rXRRri3VTt3cw@mail.gmail.com>
To: "Brian Trammell (IETF)" <ietf@trammell.ch>
Cc: Stackevo <stackevo@iab.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/stackevo/-LW3d2QmBn83IFpdpr3AHfWIEfY>

Hi, Brian,

On Mon, Jan 14, 2019 at 8:43 AM Brian Trammell (IETF) <ietf@trammell.ch>
wrote:

> Hi, all,
>
> I've recently dusted off and updated a draft about an idea I've been
> tossing around for a while (first rev, IIRC, was written back in London):
> https://datatracker.ietf.org/doc/draft-trammell-optional-security-not/ --
> thanks to Martin for comments on this (last summer!).
>
> To some extent, this is shaping up to be a companion to RFC 8170. While
> the former takes a comprehensive look at protocol transitions, this looks
> at a particular impediment (the base-rate fallacy) to particular protocol
> transitions (optional security for routing, naming, and end-to-end
> transport, at least for the web), and attempts to derive guidelines for
> moving forward (tl;dr pay people to do stuff you want them to when natural
> incentives aren't enough, and coordinate action when you have to "break"
> things.)
>
> Comments, including what (if anything) I should do with this document,
> much appreciated!
>

I like this.

My high-order bit is a question about whether "optional" captures what
you're talking about - if I'm reading correctly, all of the examples cover
protocols that were deployed with much less awareness of security than we
need now, and then had security extensions added on after they were
deployed (at the "wild success" level), and those extensions are now facing
obstacles in deployment.

I have some notes following, but most of them are questions about
readability.

Thanks for the opportunity to provide feedback.

Spencer

               draft-trammell-optional-security-not-01

Abstract

  This document explores the common properties of optional security
  protocols and extensions, and notes that due to the base-rate fallacy

                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I should probably have been familiar with "base-rate fallacy" without a
reference (on first use in the body of the document), but I'm not -
although it makes enough sense that I wonder whether I know it under
another name. So, two suggestions - adding a reference, or definition, in
the body of the document, and considering dropping this text in the
abstract, which may not help readers decide whether to keep reading. But do
the right thing, of course.


  and general issues with coordinated deployment of protocols under

  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  uncertain incentives, optional security protocols have proven

  ^^^^^^^^^^^^^^^^^^^^^
  difficult to deploy in practice.  This document defines the problem,
  examines efforts to add optional security for routing, naming, and
  end-to-end transport, and extracts guidelines for future efforts to
  deploy optional security protocols based on successes and failures to
  date.

It's just a thought, but I'm not sure whether the word "optional" captures
the point you're making. Aren't all of your case studies about what happens
if you add security later?

1.  Introduction

  Many of the protocols that make up the Internet architecture were
  designed and first implemented in an envrionment of mutual trust

                                       ^ nit - "environment"
  among network engineers, operators, and users, on computers that were
  incapable of using cryptographic protection of confidentiality,

I think your point here is something like "on computers that were incapable
of performing cryptographic operations at a performance level sufficient to
routinely allow protection of confidentiality ..."


  integrity, and authenticity for those protocols, in a legal
  environment where the distribution of cryptographic technology was
  largely restricted by licensing and/or prohibited by law.  The result
  has been a protocol stack where security properties have been added
  to core protocols using those protocols' extension mechanisms.

  As extension mechanisms are by design optional features of a
  protocol, this has led to a situation where security is optional up
  and down the protocol stack.  Protocols with optional security have

                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

AH. Now I get your use of "optional". Perhaps this could be clearer early
in the document?


  proven to be difficult to deploy.  This document describes and
  examines this problem, and provides guidance for future evolution of
  the protocol, based on current work in network measurement and usable
  security research.

2.  Problem statement

  Consider an optional security extension with the following
  properties:

  1. The extension is optional: a given connection or operation will
      succeed without the extension, albeit without the security
      properties the extension guarantees.

  2. The extension has a true positive probability P: the probability
      that it will cause any given operation to fail, thereby
      successfully preventing an attack that would have otherwise
      succeeded had the extension not been enabled.  This probability
      is a function of the extension's effectiveness as well as the
      probability that said operation will be an instance of the attack
      the extension prevents.

  3. The extension has a false positive probability Q: the probability
      it will cause any given operation to fail due to some condition
      other than an attack, e.g. due to a misconfiguration.

  Moving from no deployment of an optional security extension to full
  deployment is a protocol transition as described in [RFC8170].  We
  posit that the implicit transition plans for these protocols have
  generally suffered from an underestimation of the disincentive (as in
  section 5.2 of [RFC8170]) linked to the relationship between P and Q
  for any given protocol.

  Specifically, if Q is much greater than P, then any user of an
  optional security extension will face an overwhelming incentive to
  disable that extension, as the cost of dealing with spuriously
  failing operations overwhelms the cost of dealing with relatively
  rare successful attacks.  This incentive becomes stronger when the
  cause of the false positive is someone else's problem; i.e. not a
  misconfiguration the user can possibly fix.  This situation can arise
  when poor design, documentation, or tool support elevates the
  incidence of misconfiguration (high Q), in an environment where the
  attack models addressed by the extension are naturally rare (low P).
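To make this concrete for myself, I sketched the incentive in a few lines
of Python. The numbers are entirely made up (P, Q, and both costs are
illustrative, not from the draft), but the shape of the result is the
point:

```python
# Toy expected-cost model for the P/Q disincentive described above.
# All numeric values are illustrative assumptions.

def expected_cost(enabled, p_attack, q_misconfig, cost_attack, cost_failure):
    """Expected per-operation cost, depending on whether the optional
    security extension is enabled."""
    if enabled:
        # Attacks are blocked, but false positives (probability Q)
        # break otherwise-good operations.
        return q_misconfig * cost_failure
    # No spurious failures, but attacks (probability P) succeed.
    return p_attack * cost_attack

# Assumed rates: attacks are rare (low P), misconfiguration is common
# (high Q), and a successful attack costs 100x a broken operation.
p, q = 1e-6, 1e-3
on = expected_cost(True, p, q, cost_attack=100.0, cost_failure=1.0)
off = expected_cost(False, p, q, cost_attack=100.0, cost_failure=1.0)
# on ~= 1e-3, off ~= 1e-4: even a 100x costlier attack doesn't
# overcome the incentive to disable the extension when Q >> P.
```

So under these (invented) numbers, the rational user turns the extension
off, which is exactly the situation the draft describes.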

  This is not a novel observation; a similar phenomenon following from
  the base-rate fallacy has been studied in the literature on

                                         ^^^^^^^^^^^^^^^^^

I'm assuming that "in the literature" refers to [Axelsson99]. Is that what
you meant here? You might move that reference here.


  operational security, where the false positive and true positive
  rates for intrusion detection systems have a similar effect on the
  applicability of these systems.  Axelsson showed [Axelsson99] that
  the false positive rate must be held extremely low, on the order of 1
  in 100,000, for the probability of an intrusion given an alarm to be
  worth the effort of further investigation.
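Working through the Bayes'-theorem arithmetic behind Axelsson's point
helped me here; this sketch uses my own illustrative base rate and
detection rate, not Axelsson's exact figures, but it reproduces the
"FPR must be held near 1 in 100,000" conclusion:

```python
# Bayes'-theorem sketch of the base-rate effect. The base rate and
# detection rate below are illustrative assumptions.

def p_intrusion_given_alarm(base_rate, tpr, fpr):
    """P(intrusion | alarm): fraction of alarms that are real."""
    p_alarm = base_rate * tpr + (1 - base_rate) * fpr
    return base_rate * tpr / p_alarm

base = 2e-5    # assumed fraction of audited events that are intrusions
detect = 0.7   # assumed detector true positive rate

# With a seemingly good 1% false positive rate, almost every alarm is false:
low = p_intrusion_given_alarm(base, detect, 1e-2)   # ~0.0014
# Holding FPR near 1 in 100,000 makes an alarm worth investigating:
high = p_intrusion_given_alarm(base, detect, 1e-5)  # ~0.58
```

Because intrusions are so rare, the false positive rate completely
dominates the usefulness of an alarm - which is the same structure as the
P/Q argument in Section 2.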

  Indeed, the situation is even worse than this.  Experience with
  operational security monitoring indicates that when Q is high enough,
  even true positives P may be treated as "in the way".

                                          ^^^^^^^^^^^^

"in the way" wasn't clear to me here. Is this saying something like "when Q
is high enough, ignoring true positives P becomes the path of least
resistance"?

3.  Case studies

  Here we examine four optional security extensions, BGPSEC [RFC8205],

                  ^^^

It might be simpler for the reader to map these four extensions onto the
following three sections if you made it clearer that BGPSEC and RPKI are
being treated together because they are complementary, but do the right
thing, of course.
It might be simpler for the reader to map these four extensions onto the
following three sections if you made it clearer that BGPSEC and RPKI are
being treated together because they are complementary, but do the right


  RPKI [RFC6810], DNSSEC [RFC4033], and the addition of TLS to HTTP/1.1
  [RFC2818], to see how the relationship of P and Q has affected their
  deployment.

  We choose these examples as all four represent optional security, and
  that perfect deployment of the associated extensions - securing the
  routing control plane, the Internet naming system, and end-to-end
  transport (at least for the Web platform) - would represent
  completely "securing" the Internet architecture at layers 3 and 4.


3.1.  Routing security: BGPSEC and RPKI

  The Border Gateway Protocol [RFC4271] (BGP) is used to propagate
  interdomain routing information in the Internet.  Its original design
  has no integrity protection at all, either on a hop-by-hop or on an
  end-to-end basis.  In the meantime, the TCP Authentication Option
  [RFC5925] (and MD5 authentication [RFC2385], which it replaces) have

                                                                  ^^^^
  been deployed to add hop-by-hop integrity protection.

  ^^^^^^^^^^^^^

This text would make me think that TCP-AO has been deployed, and has
replaced MD5 authentication <in deployment>. Is that what you intended?

  End-to-end protection of the integrity of BGP announcements is
  protected by two complementary approaches.  Route announcements in
  BGP updates protected by BGPSEC [RFC8205] have the property that the
  every Autonomous System (AS) on the path of ASes listed in the UPDATE
  message has explicitly authorized the advertisement of the route to
  the subsequent AS in the path.  RPKI [RFC6810] protects prefixes,
  granting the right to advertise a prefix (i.e., be the first AS in
  the AS path) to a specific AS.  RPKI serves as a trust root for
  BGPSEC, as well.
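As a reader aid, here's how I understood the route origin validation half
of this: a ROA grants a specific origin AS the right to announce a prefix,
and everything else is either invalid or uncovered. The names and data
below are my own invention, and real validation also handles covering
prefixes and maximum prefix lengths, which I've ignored:

```python
# Rough sketch of RPKI route origin validation as I read this section.
# Data and identifiers are illustrative, not taken from any RFC.

roas = {"192.0.2.0/24": {64496}}   # prefix -> authorized origin ASes

def origin_state(prefix, origin_as):
    """Classify an announcement as 'valid', 'invalid', or 'not-found'."""
    if prefix not in roas:
        return "not-found"         # no ROA covers this prefix
    return "valid" if origin_as in roas[prefix] else "invalid"

def local_pref(state):
    # Matching the caution described below: most deployments only lower
    # preference for failing announcements rather than dropping them.
    return {"valid": 200, "not-found": 100, "invalid": 50}[state]
```

The local_pref stub is there because, as the next paragraph notes, most
networks that validate at all only deprefer invalid routes - which is
itself a symptom of the false-positive caution the draft is describing.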

  These approaches are not (yet) universally deployed.  BGP route
  origin authentication approaches provide little benefit to individual
  deployers until it is almost universally deployed [Lychev13].  RPKI
  route origin validation is similarly deployed in about 15% of the
  Internet core; two thirds of these networks only assign lower
  preference to non-validating announcements.  This indicates
  significant caution with respect to RPKI mistakes [Gilad17].  In both
  cases the lack of incentives for each independent deployment,
  including the false positive risk, greatly reduces the speed of
  incremental deployment and the chance of a successful transition
  [RFC8170].

  In addition, the perception of security as a secondary concern for
  interdomain routing hinders deployment.  A preference for secure
  routes over insecure ones is necessary to drive further deployment of
  routing security, but an internet service provider is unlikely to
  prefer a secure route over an insecure route when the secure route
  violates local preferences or results in a longer AS path [Lychev13].

3.2.  DNSSEC

  The Domain Name System (DNS) [RFC1035] provides a distributed
  protocol for the mapping of Internet domain names to information
  about those names.  As originally specified, an answer to a DNS query
  was considered authoritative if it came from an authoritative server,
  which does not allow for authentication of information in the DNS.
  DNS Security [RFC4033] remedies this through an extension, allowing
  DNS resource records to be signed using keys linked to zones, also
  distributed via DNS.  A name can be authenticated if every level of
  the DNS hierarchy from the root up to the zone containing the name is
  signed.
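The "every level must be signed" rule is the crux of the deployment story
that follows, so it might be worth spelling out. My mental model is the
toy below - the zone data is invented for illustration:

```python
# Toy model of the rule above: a name can be authenticated only if
# every zone from the root down to the zone containing the name is
# signed. The example zone data is invented.

signed_zones = {".", "org."}   # in this example, example.org. is unsigned

def authenticated(zone_chain):
    """zone_chain lists the zones from the root to the name's zone."""
    return all(zone in signed_zones for zone in zone_chain)

# A break anywhere in the chain defeats authentication below it:
authenticated([".", "org."])                   # True
authenticated([".", "org.", "example.org."])   # False
```

Which is why the SLD signing numbers in the next paragraph matter so much:
a signed root and TLD buy nothing for names under an unsigned SLD.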

  The root zone of the DNS has been signed since 2010.  As of 2016, 89%
  of TLD zones were also signed.  However, the deployment status of
  DNSSEC for second-level domains (SLDs) varies wildly from region to
  region and is generally poor: only about 1% of .com, .net. and .org
  SLDs were properly signed [DNSSEC-DEPLOYMENT].  Chung et al found

       ^^^^^^^^^^^^^^^^^^^^

If I'm reading the reference correctly, you might say "were properly signed
at the end of 2016".


  recently that second-level domain adoption was linked incentives for

                                                       ^ to (?)
  deployment: TLDs which provided direct financial incentives to SLDs
  for having correctly signed DNS zones tend to have much higher
  deployment, though these incentives must be carefully designed to
  ensure that they measure correct deployment, as opposed to more
  easily-gamed indirect metrics [Chung17].

  However, the base-rate effect tends to reduce the use of DNSSEC
  validating resolvers, which remains below 15% of Internet clients
  [DNSSEC-DEPLOYMENT].

  DNSSEC deployment is hindered by other obstacles, as well.  Since the
  organic growth of DNS software predates even TCP/IP, even EDNS, the
  foundational extension upon which DNSSEC is built are not universally
  deployed, which inflates Q.  The current DNS Flag Day effort (see
  https://dnsflagday.net) aims to remedy this by purposely breaking
  backward interoperability with servers that are not EDNS-capable, by
  coordinating action among DNS software developers and vendors.

  In addition, for the Web platform at least, DNSSEC is not percieved

                                                            ^ perceived

                        (and this is a global change - there's more than
one)


  as having essential utility, given the deployment of TLS and the
  assurances provided by the Web PKI (on which, see Section 3.3).  A
  connection intercepted due to a poisoned DNS cache would fail to
  authenticate unless the attacker also obtained a valid certificate
  from the name, rendering DNS interception less useful, in effect,
  reducing P.

3.3.  HTTP over TLS

  Security was added to the Web via HTTPS, running HTTP over TLS over
  TCP, in the 1990s [RFC2818].  Deployment of HTTPS crossed 50% of web
  traffic in 2017.

  Base-rate effects didn't hinder the deployment of HTTPS per se;
  however, until recently, warnings about less-safe HTTPS
  configurations (e.g. self-signed certificates, obsolete versions of
  SSL/TLS, old ciphersuites, etc.) were less forceful due to the
  prevalence of these configurations.  As with DNS Flag Day, making
  changes to browser user interfaces that inform the user of low-
  security configurations is facilitated by coordination among browser
  developers [ChromeHTTPS].  If one browser moves alone to start
  displaying warnings or refusing to connect to sites with less-safe or
  unsafe configurations, then users will tend to percieve the safer
  browser as more broken, as websites that used to work don't anymore:
  i.e., non-coordinated action can lead to the false perception that an
  increase in P is an increase in Q.  This coordination continues up

  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If I'm tracking this correctly, is it also true that users encounter a
definite increase in costs with a higher Q when websites stop working after
the change, but the benefits users receive are potential benefits (P goes
up when you're being attacked, but not until then)?


  the Web stack within the W3C [SecureContexts].

  The Automated Certificate Management Environment [ACME] has further
  accelerated the deployment of HTTPS on the server side, by
  drastically reducing the effort required to properly manage server
  certificates, reducing Q by making configuration easier than
  misconfiguration.  Let's Encrypt leverages ACME to make it possible
  to offer certificates at scale for no cost with automated validation,
  issuing 90 million active certificates protecting 150 million domain
  names in December 2018 [LetsEncrypt2019].

  Deployment of HTTPS accelerated in the wake of the Snowden
  revelations.  Here, the perception of the utility of HTTPS has
  changed.  Increasing confidentiality of Web traffic for openly-
  available content was widely seen as not worth the cost and effort
  prior to these revelations.  However, as it became clear that the

                                                                ^^^
  attacker model laid out in [RFC7624] was a realistic one, content

  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

It's likely more reader-friendly to be descriptive here - perhaps something
like "the Pervasive Surveillance attacker model laid out in [RFC7624]"? I
had to go check which attacker model we were talking about ...


  providers and browser vendors put the effort in to increase
  implementation and deployment.

  The ubiquitous deployment of HTTPS is not yet complete; however, all
  indications are that it will represent a rare eventual success story
  in the ubiquitous deployment of an optional security extention.  What

                                                       ^ extension
  can we learn from this success?  We note that each endpoint deciding
  to use HTTPS saw an immediate benefit, which is an indicator of good
  chances of success for incremental deployment [RFC8170].  However,
  the acceleration of deployment since 2013 is the result of the
  coordinated effort of actors throughout the Web application and
  operations stack, unified around a particular event which acted as a

                                   ^^^^^^^^^^^^^^^^^^

I'm guessing you mean the Snowden revelations, but I'm not sure.


  call to arms.  While there are downsides to market consolidation, the
  relative consolidation of the browser market has made coordinated
  action to change user interfaces possible, as well as making it
  possible to launch a new certificate authority (by adding its issuer
  to the trusted roots of a relatively small number of browsers) from
  nothing in a short period of time.

4.  Discussion and Recommendations

  It has been necessary for all new protocol work in the IETF to
  consider security since 2003 [RFC3552], and the Internet Architecture
  Board recommended that all new protocol work provide confidentiality
  by default in 2014 [IAB-CONFIDENTIALITY]; new protocols should
  therefore already not rely on optional extensions to provide security
  guarantees for their own operations or for their users.

  In many cases in the running Internet, the ship has sailed: it is not
  at this point realistic to replace protocols relying on optional
  features for security with new, secure protocols.  While these full
  replacements would be less susceptible to base-rate effects, they
  have the same misaligned incentives to deploy as the extensions the
  architecture presently relies on.

  The base rate fallacy is essential to this situation, so the P/Q
  problem is difficult to sidestep.  However, an examination of our
  case studies does suggest incremental steps toward improving the
  current situation:

  o When natural incentives are not enough to overcome base-rate
     effects, external incentives (such as financial incentives) have
     been shown to be effective to motivate single deployment
     decisions.  This essentially provides utility in the form of cash,
     offseting the negative cost of high Q.

     ^ offsetting

  o While "flag days" are difficult to arrange in the current
     Internet, coordinated action among multiple actors in a market
     (e.g. DNS resolvers or web browsers) can reduce the risk that
     temporary breakage due to the deployment of new security protocols
     is perceived as an error, at least reducing the false perception
     of Q.

  o Efforts to automate configuration of security protocols, and
     thereby reduce the incidence of misconfiguration Q, have had a
     positive impact on deployability.

  Coordinated action has demonstrated success in the case of HTTPS, so
  examining the outcome (or failure) of DNS Flag Day will provide more
  information about the likelihood of future such actions to move
  deployment of optional security features forward.  It is difficult to
  see how insights on coordinated action in DNS and HTTPS can be
  applied to routing security, however, given the number of actors who
  would need to coordinate to make present routing security approaches
  widely useful.  We note, however, that the MANRS effort
  (https://www.manrs.org) provides an umbrella activity under which any
  future coordination might take place.

  We note that the cost of a deployment decision (at least for DNSSEC)
  could readily be extracted from the literature [Chung17].
  Extrapolation from this work of a model for determining the total
  cost of full deployment of DNSSEC (or, indeed, of comprehensive
  routing security) is left as future work.

5.  Acknowledgments

  Many thanks to Peter Hessler, Geoff Huston, and Roland van Rijswijk-
  Deij for conversations leading to the problem statement presented in
  this document.  Thanks to Martin Thomson for his feedback on the
  document itself, which has greatly improved subsequent versions.  The
  title shamelessly riffs off that of Berkeley tech report about IP
  options written by Rodrigo Fonseca et al., via a paper at IMC 2017 by
  Brian Goodchild et al.

  This work is partially supported by the European Commission under
  Horizon 2020 grant agreement no. 688421 Measurement and Architecture
  for a Middleboxed Internet (MAMI), and by the Swiss State Secretariat
  for Education, Research, and Innovation under contract no. 15.0268.
  This support does not imply endorsement.