Re: [TLS] Roman Danyliw's No Objection on draft-ietf-tls-sni-encryption-05: (with COMMENT)

Roman Danyliw <rdd@cert.org> Wed, 25 September 2019 17:28 UTC

From: Roman Danyliw <rdd@cert.org>
To: Christian Huitema <huitema@huitema.net>, The IESG <iesg@ietf.org>
CC: "draft-ietf-tls-sni-encryption@ietf.org" <draft-ietf-tls-sni-encryption@ietf.org>, "tls-chairs@ietf.org" <tls-chairs@ietf.org>, "tls@ietf.org" <tls@ietf.org>
Date: Wed, 25 Sep 2019 17:27:53 +0000
Message-ID: <359EC4B99E040048A7131E0F4E113AFC01B3467E89@marathon>
References: <156881761812.4630.11745895149419124830.idtracker@ietfa.amsl.com> <c504a433-cc5b-df00-15c2-fcaf3116798c@huitema.net>
In-Reply-To: <c504a433-cc5b-df00-15c2-fcaf3116798c@huitema.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tls/0eso0qcAIpR44hQYinLD8unFAPU>

Hi Christian!

Thanks for the detailed responses and the helpful background.  Below are a number of proposed text block replacements to clarify my intent (instead of more questions).

Roman

> -----Original Message-----
> From: iesg [mailto:iesg-bounces@ietf.org] On Behalf Of Christian Huitema
> Sent: Wednesday, September 18, 2019 10:14 PM
> To: Roman Danyliw <rdd@cert.org>; The IESG <iesg@ietf.org>
> Cc: draft-ietf-tls-sni-encryption@ietf.org; tls-chairs@ietf.org; tls@ietf.org
> Subject: Re: [TLS] Roman Danyliw's No Objection on draft-ietf-tls-sni-
> encryption-05: (with COMMENT)
> 
> Thanks for the feedback, Roman. Comments in line.
> 
> On 9/18/2019 4:40 AM, Roman Danyliw via Datatracker wrote:
> > ** Section 1.  Per “More and more services are colocated on
> > multiplexed servers, loosening the relation between IP address and web
> > service”, completely agree.  IMO, unpacking “multiplexed servers” is
> > worthwhile to explain the subsequent text because it motivates the
> > loss of visibility due to encryption with network only monitoring.
> > “Multiplex” happens at two levels:
> >
> > -- co-tenants (e.g., virtual hosting) – multiple services on the same
> > server (i.e., an IP/port doesn’t uniquely identify the service)
> >
> > -- cloud/cdn  – a given platform hosts the services/servers of a lot
> > of organizations (i.e., looking up to what netblock an IP belongs
> > reveals little)
> 
> 
> OK, will try to incorporate your text.

Thanks.

> >
> > ** Section 2.1.  Per “The SNI was defined to facilitate management of
> > servers, though the developers of middleboxes soon found out that they
> > could take advantage of the information.  Many examples of such usage
> > are reviewed in [RFC8404].”,
> >
> > -- Can’t middleboxes also help facilitate the management of servers?
> > This text seems to take a particular view on middleboxes which doesn't
> > seem appropriate.
> 
> It is pretty clear that the load balancer in front of a server farm will need
> access to the service ID, and must be able to retrieve the decrypted SNI.
> There may be other examples, such as DoS mitigation boxes. The
> "unanticipated usage" comes typically from middle-boxes that are not in the
> same management domain as either the client or the server. Is there an
> established way to designate those?

I'm not sure I understand the origin of the requirement that the client and server be in the same management domain.

RFC3546's definition of SNI opens with:
   [TLS] does not provide a mechanism for a client to tell a server the
   name of the server it is contacting.  It may be desirable for clients
   to provide this information to facilitate secure connections to
   servers that host multiple 'virtual' servers at a single underlying
   network address.
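
As a rough illustration of the virtual-server use case in that definition, here is a minimal sketch of how a server at one network address might pick a 'virtual' server from the name the client supplies.  The host names, paths, and the function name are hypothetical and appear in neither RFC3546 nor the draft; a real server would do this inside its TLS stack.

```python
from typing import Optional

# Hypothetical table: several 'virtual' servers behind one network address.
VIRTUAL_SERVERS = {
    "a.example.com": "/srv/site-a",
    "b.example.com": "/srv/site-b",
}

DEFAULT_SITE = "/srv/default"


def select_virtual_server(server_name: Optional[str]) -> str:
    """Return the site to serve for the name sent in the SNI extension.

    A client that sends no SNI, or an unknown name, gets the default site.
    """
    if server_name is None:
        return DEFAULT_SITE
    # DNS names are case-insensitive; tolerate a trailing dot as well.
    normalized = server_name.lower().rstrip(".")
    return VIRTUAL_SERVERS.get(normalized, DEFAULT_SITE)
```

The point of the sketch is only that the name is consumed by the server to choose among co-located services; any on-path box that can read the same field gets the same information.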

It seems to me that if we are trying to channel original intent, then only the virtual server use case applies.  I'd propose:

OLD
The SNI was defined to facilitate management of servers, though the developers of middleboxes soon found out that they could take advantage of the information.  Many examples of such usage are reviewed in [RFC8404].

NEW
The SNI was defined to facilitate secure connections to servers that host multiple 'virtual' servers at a single underlying network address [RFC3546].  However, additional management and security practices emerged that make use of this information.  Examples of such usage are reviewed in [RFC8404].

This language would let you distinguish all of the middle box behaviors done by operators and enterprises from a possible [RFC7258] attacker.

> > -- RFC8404 describes a number of middlebox practices, but only Section
> > 6.2 explicitly discusses SNI, and of the examples list here, only one
> > comes from RFC8404.
> A few of the examples also come in the "deep packet inspection" sections of
> 8404. But rather than going in a long discussion, I would rather rewrite the
> sentence as: Many examples of such usage are reviewed in [@?RFC8404],
> other examples came out during discussions of this draft.
> >
> > ** Section 2.1. The “monitoring and identification of specific sites”
> > isn’t symmetric to the other examples – it is rather generic.  The
> > other examples identify a what/who (e.g., ISP, firewall) + action (e.g.,
> > block, filter).
> > Also, to implement most of the other examples, “monitoring and
> > identification of specific sites” needs to be done.

I still think this needs to be cleaned up in some way.  IMO, I'd drop it.

> > ** Section 2.1.  Why is parental controls in quotes?  In RFC8404, it is not.
> > The quotes could be read as a judgement on the practice.
> See answer to Alissa. Removing the quotes.

Thanks.

> > ** Section 2.1.  Per “The SNI is probably also included in the general
> > collection of metadata by pervasive surveillance actors”, I recommend
> > against speculation and instead simply stating that SNI would be
> > interesting meta-data for a RFC7258 attacker.
> 
> Yes, Mirja made a similar comment. Proposed replacement:
> 
> The SNI is probably also included in the general collection of metadata by
> pervasive surveillance actors, for example to identify services used by
> surveillance targets.

IMO, explicitly linking it to the draft would help.

OLD:
The SNI is probably also included in the general collection of metadata by
pervasive surveillance actors, for example to identify services used by
surveillance targets.

NEW:
The SNI could be included in the general collection of metadata by
a pervasive monitoring attacker [RFC7258], for example to identify services
used by surveillance targets.

> >
> > ** Section 2.2.  Per “One reason may be that, when these RFCs were
> > written, the SNI information was available through a variety of other
> > means”, what would those “other means” be?
> 
> The list includes at a minimum:
> 
> Clear text exchanges amenable to deep packet inspection (DPI), server
> certificates sent in clear text during TLS/SSL exchanges, DNS names of
> servers in clear text DNS queries, and server specific IP addresses in packet
> headers.
> 
> I guess I could write that all, but it makes the text a bit redundant, since the
> following paragraphs do discuss server certificates, DNS names and IP
> addresses.

I understand.  I didn't read it that way.  My recommendation isn't to describe the "other means" (as they are described below), but to be clear on the obvious: what the SNI information is.

OLD:
One reason may be that, when these RFCs were
written, the SNI information was available through a variety of other
means.

NEW:
One reason may be that when the RFCs were written, the name of the server being contacted by the client (i.e., the SNI) was evident through other means.

> >
> > ** Section 2.3.  Per “Deploying SNI encryption will help thwarting most of the
> > ‘unanticipated’ SNI usages described in Section 2.1, including censorship and
> > pervasive surveillance.”:
> >
> > -- Why the quotes around "unanticipated" SNI usage?
> Removing the quotes. Otherwise, you will be convinced that the authors
> believe that all middle-boxes are the spawn of the devil...

Thanks.

> > -- One person’s censorship is another person’s threat mitigation, policy
> > enforcement for a network they own, or parental controls (per the list in
> > Section 2.1) – recommend being more precise on the order of “Deploying SNI
> > encryption will {break | reduce the efficacy of } the operational practices and
> > techniques used in middleboxes described in Section 2.1”.
> 
> OK. I will try to make the text just stick to the facts:
> 
> Deploying SNI encryption thwarts most of the unanticipated SNI usages
> described in (#snileak). It reduces the efficacy of the operational
> practices and techniques implemented in middle-boxes. Most of
> these functions can, however, be realized by other means.

Works for me.  However, I'd drop "Most of these functions can, however, be realized by other means" because this opens the debate on how exactly, etc.

> > ** Section 2.3.  Per “It will also thwart functions that are sometimes
> > described as legitimate”, what functions are those?  I’d recommend eliminating
> > this sentence as it reads like a value judgement on existing practices (which
> > doesn’t seem germane for discussing requirements).
> >
> > ** Section 3.  Per “Over the past years, there have been multiple proposals to
> > add an SNI encryption option in TLS.”, can these past proposals be cited so
> > future readers can learn from them?
> We are describing here a series of designs proposed in the TLS working
> group over the years. The whole point of the draft is to provide the
> results of the analyses, as an easy to read kind of "threat model",
> without requiring readers to wade through years of archives. If you
> really are interested, you can indeed do just that but I would not
> encourage the approach...

I understand.  It doesn't seem practical to quote mailing list threads.

> >
> > ** Section 3.4. The existence of designs was alluded to but not cited.  Be
> > specific with citation.
> >
> > ** Section 3.7.1. The rationale for including this discussion about ALPN isn’t
> > clear as it doesn’t suggest new requirements for SNI encryption.
> Got comments about that already, and updated the text.
> >
> > ** Section 4.  I got hung-up on the description of Section 4 describing a
> > “solution”.  Is Section 4 (and the related subsections) describing an
> > operational practice or a notional reference architecture?  The text reads one
> > part “people are doing” and another part “people could do”.
> Yes, I get that. Our point is to describe this solution as part of
> explaining why we really want a TLS level solution, not just HTTP. Not
> sure that I can change much here.

See below.  I think I may have an approach for clarity.

> > ** Section 4.  Per “In the absence of TLS-level SNI encryption, many sites rely
> > on an "HTTP Co-Tenancy" solution”, this seems like too strong of a statement
> > about utilization of this architecture explicitly to hide the
> > hidden.example.com SNI.  Can you provide a citation for a sense of penetration?
> 
> That's really hard, because it is all cloak and dagger stuff. The one
> well known example is the encrypted messaging application "Signal", that
> was censored in Egypt during the "Arab Spring" events. They were hosted
> by Google, and apparently programmed their app to just connect to
> "https://google.com/", and then use "host: signal.org" in the HTTP
> headers, evading censorship. It is not clear at all to what amount they
> synchronized with Google when doing that. And I don't think that anybody
> ever spoke openly about this.

Thanks for the details.  Let me step back and try to restate my concern.  My feedback on the assertion that "many sites rely on an HTTP Co-Tenancy" and the question above about the "browser plugin" both come from my misunderstanding of the purpose of Section 4.  Is it describing a commonly accepted practice already in use or a notional reference architecture?  IMO, given the framing of the rest of the document, it should be the latter.

The first paragraph states that "many sites" use this approach, which suggested to me an existing best common practice.  However, as you clarified, there is little evidence that can be provided beyond Signal.  To me the statement of "many sites" can't be supported.  My thinking is that this could be easily cleared up by simply avoiding the discussion about adoption and saying:

OLD:
In the absence of TLS-level SNI encryption, many sites rely on an
"HTTP Co-Tenancy" solution.  The TLS connection is established with
the fronting server, and HTTP requests are then sent over that
connection to the hidden service.

NEW:
In the absence of TLS-level SNI encryption, a site could adopt an "HTTP Co-Tenancy" architecture to protect the SNI information.  In such an architecture, the client establishes a TLS connection with a fronting server, and the HTTP requests are then sent over that connection to the hidden service.
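
To make the two layers in that architecture concrete, here is a minimal sketch of what a client would put where.  It only builds the values and opens no connection; the name fronting.example.com and the helper names are hypothetical, and hidden.example.com is the example name used in the draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrontedRequest:
    sni: str             # sent in the clear in the TLS ClientHello
    http_request: bytes  # sent only inside the encrypted TLS connection


def build_fronted_request(fronting_host: str, hidden_host: str,
                          path: str = "/") -> FrontedRequest:
    """Pair the fronting server's name (SNI) with a request for the hidden service."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {hidden_host}\r\n"   # only the fronting server sees this header
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    return FrontedRequest(sni=fronting_host, http_request=request)


# Hypothetical names, for illustration only.
req = build_fronted_request("fronting.example.com", "hidden.example.com")
```

An on-path observer would see only `req.sni`; the hidden name appears solely in the Host header, which travels under TLS to the fronting server.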

Related to my confusion is also the new text added to Section 1 of -06, "This document does not present the design of a solution, but provides guidelines for evaluating proposed solutions."  However, the current text in Section 4 explicitly states that it is providing a solution.  The subsections of Section 4 (4.x) assume the solution in Section 4.0 and describe the follow-on work.  Sections 2 and 3 do lay out the means for evaluation nicely.  Perhaps, something on the order of:

OLD:
This document does not present the design of a solution, but provides guidelines for evaluating proposed solutions.

NEW:
This document provides guidelines for evaluating solutions and proposes an architecture to mitigate the threats created by an unencrypted SNI using existing approaches.

> > ** Section 4.  Per the bullet “since this is an HTTP-level solution”, I
> > recommend citing that it fails on the requirement identified in Section 3.7
> > (instead of enumerating a list of protocols)
> Yes. Already fixed.

Thanks.

> > ** Section 4.  The opening of this section noted that “many sites” rely on the
> > architecture described in this section. Later, it is noted that “a browser
> > extension that support[s] HTTP Fronting” is a necessary architecture component.
> >   Can a few citations be made to the popular extensions?
> 
> The "Signal" deployment used a service specific app. The trick of using
> https://fronting + Host: hidden is really easy to pull in an app. To do
> that in a browser does indeed require an extension, that's pretty much a
> statement of fact.

Makes sense.  Back to my earlier comment about "many sites": it depends on whether this text is describing a specific solution/best practice or a reference architecture.  If it is the former, then what's actually done needs to be described (i.e., an app-based approach).  If it is the latter, the text is fine.

Roman
