Re: [OPSAWG] draft-mm-wg-effect-encrypt-13 review

Kyle Rose <krose@krose.org> Wed, 10 January 2018 17:46 UTC

From: Kyle Rose <krose@krose.org>
Date: Wed, 10 Jan 2018 12:46:19 -0500
Message-ID: <CAJU8_nVemvgq0dVfqGbkCn+xyb5=MhHTKA900E4RkJjM01TR-g@mail.gmail.com>
To: ietf@ietf.org, Kathleen Moriarty <kathleen.moriarty.ietf@gmail.com>, "MORTON, ALFRED C (AL)" <acmorton@att.com>, Brandon Williams <bowill@akamai.com>, Warren Kumari <warren@kumari.net>, Paul Hoffman <paul.hoffman@vpnc.org>, opsawg@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/opsawg/-P1vS7qxAqWI8HNtB-ebPp8FiDk>
Subject: Re: [OPSAWG] draft-mm-wg-effect-encrypt-13 review

I would put each of my comments in the etherpad into one of three buckets:
observations and commentary, minor points and clarifications, and points
that I think require more discussion. I'm only really going to address the
last bucket: the others I raised but don't feel particularly strongly
about, at least at this late stage in the document's life.

>    (section 3.2 of [RFC7525]), essentially preventing the negotiation
>    process resulting in fallback to the use of clear text.  In other
>    cases, some service providers have relied on middle boxes having
>    access to clear text for the purposes of load balancing, monitoring
>    for attack traffic, meeting regulatory requirements, or for other
>    purposes.  These middle box implementations, whether performing
>    functions considered legitimate by the IETF or not, have been
>    impacted by increases in encrypted traffic.  Only methods keeping
>    with the goal of balancing network management and PM mitigation in
>    [RFC7258] should be considered in solution work resulting from this
>    document.
>
> KR> I feel like this section could be better organized by:
>  * Moving the examples to 1.1 as a bulleted list of sample situations in
> which network operators attempted to and/or succeeded in defeating
> encryption to preserve existing operational mechanisms, or in which
> performance suffered for users (whether of the encrypted flows or of other
> flows impacted by encrypted flows).
>
> KM> Interesting point, but we'd need more examples.  I'll think about this
> more and chat with Al in case he has ideas.  For now, I went with Brandon's
> easier suggestion, but moving to this would be nice for the document
> readers.
>
> AM> Although I see how these examples could be part of the background, I
> think those who will
> eventually remove their objections will prefer the reduced emphasis on
> these examples where
> they are (in section 2). In one view, the entire memo is background, since
> nothing new is proposed.
>
> KR>
>  * Using this section as an introduction to the methodology for cataloging
> operational mechanisms depending on cleartext traffic monitoring, with the
> various caveats on what will be considered (e.g., only mechanisms required
> heretofore for operability), and for describing the approach to seeking
> mitigations and/or substitutions.
>
> KM> Hmm, interesting point.  I'll have to think about this more as it
> could be a lot of work at this stage.
>
> AM> Unfortunately, we've already implemented many AD-level suggestions on
> the organization of Section 2.
> We're at the stage of "what can everybody live with", and re-re-re-org
> falls out now, IMO.
>

This is a reasonable objection, but I am mostly concerned about satisfying
the target audience. ISTM that audience is something like "IETF
participants who are skeptical about, or ignorant of, the operational
difficulties posed by widespread encryption of flows". If the document
meets your goals as-is, then further refinement is unnecessary. Does it? Or
is it impossible to know at this point?


> AM> Also, the neutral exposition that we've been asked to provide a
> million times actually
> comes from multiple perspectives expressed in contributions that we would
> combine
> in a balanced way, without value judgements (no good or bad).
> Where we lack balance, we lack specific contributions.
>

"Neutral" meaning "without advocacy"? It's a fine line. The document still
has a thesis, which (if I am reading it correctly) is to inform the
community of current operational practices that are encumbered in some way
by encryption. To wit, q( It provides network operators' perspectives about
the motivations and objectives of those practices as well as effects
anticipated by operators as use of encryption increases. ) Without that
thesis, which some might interpret as advocacy, it's not clear the document
would have enough focus to be useful. I'd like folks to see this doc as *at
worst* devil's advocacy, and preferably as a challenge to come up with
better arguments and to find alternative methods for dealing with network
management problems.

>
>    heuristics grows, and accuracy suffers.  For example, the traffic
>    patterns between server and browser are dependent on browser supplier
>    and version, even when the sessions use the same server application
>    (e.g., web e-mail access).  It remains to be seen whether more
>    complex inferences can be mastered to produce the same monitoring
>    accuracy.
>
> KR> This might be too formal of an approach for this doc, but it might be
> possible to construct a taxonomy of layers of metadata made unavailable by
> encryption at each layer to show the completeness/comprehensiveness of the
> survey. So, for instance:
>  * Protocol and port number are still available as a way of characterizing
> traffic over the public internet even if the payload is encrypted, but this
> information is lost if (e.g.) the traffic is traversing an IPsec tunnel or
> if radically different kinds of traffic all use port 443/tcp without any
> other way to distinguish between them.
>  * TCP is open to optimization/measurement even if using TLS, except when
> tunneled encrypted: congestion signals (like rexmits) previously
> transparent to the middlebox, for instance, are then lost.
>  * Encrypting the payload defeats attempts to survey traffic by user agent
> (if there's no other way to distinguish, e.g., by fingerprinting).
>
> KM> I think this would be a really helpful follow on document.  I'd be
> willing to work on it if you're game.  I've been thinking about something
> similar, specific to TLS, but should be broadened.
>

I'm afraid this topic might be more of a research project than it initially
appears, but it's probably a reasonable exercise to see what we can produce
for a plaintext protocol vs. that same protocol over TLS. Ping me.


>    It is important to note that the push for encryption by application
>    providers has been motivated by the application of the described
>    techniques.  Some application providers have noted degraded
>    performance and/or user experience when network-based optimization or
>    enhancement of their traffic has occurred, and such cases may result
>    in additional operator troubleshooting, as well.
>
> KR> Observation: additionally, I think you'll encounter the argument that
> the responsibility for diagnosing bad interactions between applications and
> networks falls on the application owner rather than the network operator.
> Basically, I feel like the desire among protocol designers is for operators
> to provide a pipe with certain key characteristics that interact well with
> established transport protocol mechanisms, and otherwise to leave the
> traffic alone and let the application developers do what they want to
> within the expected constraints. If that's infeasible (e.g., in edge cases,
> or with respect to new technologies that interact badly with existing
> transports, such as the loss=congestion assumption of TCP that interacts
> badly with wifi), that's precisely the case that needs to be made by this
> document.
>
> KM> We have encountered this argument already.  It's a tough one as SPs
> have the SLAs with customers, so they are the first call.  Many don't know
> how to get in touch with APP providers.  I understand the application
> developers' perspective, but also see that there has to be some ability to
> troubleshoot.  Sure, providers could wrap the protocols for transport to
> provide some way of measuring, but information is lost.  IPv6 with flow
> identifiers is another way to do it, but you might not be able to
> prioritize a call or protocol that has little tolerance for delay over one
> that does for instance.  And I realize that app providers just want all
> traffic to have the same priority, but emergency calls are important.
>
> BW> I think the point made by the document is correct though: operators
> are nearly always the first call, not the application provider.
>
> KM> We were asked to remove text that said that.  I agree that it is the
> case as the providers have the SLAs and you don't typically have a number
> for App providers.
>
> BW> The operators are looking for ways to demonstrate that they did not
> cause the problem (or determine that they did) for efficient hand-off to
> the correct party for resolution. There are certainly problems with an
> approach that changes the behavior of the protocol, but it's difficult to
> argue with the diagnostic need.
>
> AM> Using Netflix as an example, the first source of problem they mention
> is the network when
> addressing the question "Why doesn't Netflix work?":
>     "If Netflix isn’t working, you may be experiencing a network
> connectivity issue, an issue with your device, or an issue with your
> Netflix app or account."
>     from https://help.netflix.com/en/node/461?ui_action=kb-article-popular-categories
> They previously had even stronger wording, something like "First, make
> sure your network connection meets the Netflix requirements ... URL"
> One of the causes of re-buffering is CDN-related pauses when accessing
> the next segment:  completely hidden from users so far.
> Additional frequent cause: the unlicensed WiFi network owned and operated
> by the customer.
>
> Another way to look at this strategy: App providers are transferring as
> much overhead cost to the network operators as possible
> (troubleshooting customer problems is expensive - rolling a truck negates
> months of revenue), while preserving as
> much value/control/revenue as they can for themselves. The greed-thingy
> plays poorly over time.
> A user-focused strategy would be to form partnerships for troubleshooting
> of shared customers, but that might result in exposing
> the real causes and some would rather hide for now, it seems.
>

I agree with all the points you've made. How do we square reality (users
blame network operators) with the current approach to protocol design at
the IETF (keep the network dumb)? I feel like creating a conversation about
this apparent cognitive dissonance will be one of the most important
outcomes of publishing this document.

I have no doubt the conflict will resolve itself somehow: CDNs, for
instance, act as an intelligent overlay over dumb networks and can
therefore provide the most consistent user experience when deeply deployed
into carrier networks whose structure they know intimately.
Is the right solution to continue to effectively delegate this
responsibility by encouraging breaking of connections at the edge, or
should the IETF be trying to optimize the end-to-end performance of its
protocols on the public internet?

Anyway, I digress. This isn't a conversation I'm proposing you have in this
document; just that the doc should raise these kinds of questions in the
reader.

>    packet is able to provide stateless load balancing.  This ability
>    confers great reliability and scalability advantages even if the
>    flow remains in a single POP, because the load balancing system is
>    not required to keep state of each flow.  Even more importantly,
>    there's no requirement to continuously synchronize such state among
>    the pool of load balancers.
>
> KR> An important point is that an integrated load balancer repurposing
> limited existing bits in transport flow state must maintain and synchronize
> per-flow state occasionally: using the sequence number as a cookie only
> works for so long given that there aren't that many bits available to
> divide across a pool of machines.
>
> KM> I added in this point, but have to check back on flow of text.
>

I checked the wording in -14. I'm going to propose slightly different
language:

q( This ability
   confers great reliability and scalability advantages even if the
   flow remains in a single POP, because the load balancing system is
   not required to keep state of each flow. There is value even when the
   repurposed bits are strictly insufficient for encoding all state: an
   integrated load balancer repurposing limited existing bits in
   transport flow state must still maintain and synchronize per-flow
   state occasionally (using the sequence number as a cookie only works
   for so long given that there aren't that many bits available to
   divide across a pool of machines), but there is no longer a
   requirement for such synchronization to be continuous or
   instantaneous. )
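
To make the bit-budget point concrete, here is a minimal sketch (mine, not
from the draft) of a cookie-style scheme. The names BACKEND_BITS,
encode_cookie, and route are hypothetical, and the choice of 4 repurposed
bits inside a 32-bit field is an assumption made purely for illustration:

```python
# Illustrative only: BACKEND_BITS, encode_cookie and route are hypothetical.
BACKEND_BITS = 4  # assumed number of repurposed bits in a 32-bit field


def encode_cookie(random_part: int, backend_id: int) -> int:
    """Pack a backend index into the low BACKEND_BITS bits of a 32-bit value."""
    if not 0 <= backend_id < (1 << BACKEND_BITS):
        raise ValueError("backend_id does not fit in the repurposed bits")
    return ((random_part << BACKEND_BITS) | backend_id) & 0xFFFFFFFF


def route(cookie: int) -> int:
    """Any balancer in the pool recovers the backend without a flow table."""
    return cookie & ((1 << BACKEND_BITS) - 1)
```

With only 4 bits, at most 16 backends are addressable. Anything beyond that
has to fall back to shared state, which is why occasional synchronization is
still needed.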

> KR> A dedicated mechanism for storing load balancer state, such as QUIC's
> proposed connection ID, is strictly better from the load balancer's point
> of view, and is probably even better from a privacy perspective than
> bolting it on to an unrelated transport signal because it can be tightly
> controlled by one of the endpoints and rotated to avoid roving client
> linkability: in other words, being a specific, separate signal, it can be
> governed in a way that is finely targeted at that specific use-case. (I'm
> thinking the advantages of separate mechanisms belongs in a different part
> of the doc; this section is more like the problem statement than the
> solution statement.)
>
> KM> This (above) needs to be reworded to be neutral and this does go
> towards solution space, which we were trying to avoid. How about:
>
> Another possibility is a dedicated mechanism for storing load balancer
> state, such as QUIC's proposed connection ID to provide visibility to the
> load balancer.  An identifier could be used for tracking purposes, but this
> may provide an option that is an improvement from  bolting it on to an
> unrelated transport signal. This method allows for tight control by one of
> the endpoints and can be rotated to avoid roving client linkability: in
> other words, being a specific, separate signal, it can be governed in a way
> that is finely targeted at that specific use-case.
>

SGTM. Maybe s/improvement from bolting it on to/improvement compared to
co-opting/.

>    In future Network Function Virtualization (NFV) architectures, load
>    balancing functions are likely to be more prevalent (deployed at
>    locations throughout operators' networks)[.  NFV environments will
>    require some type of identifier (IPv6 flow identifiers, the Proposed
>     QUIC connection ID, etc.) for managing]
>    traffic using encrypted tunnels.[  The shift to increased encryption
>    will have an impact to visibility of flow information and will require
>    adjustments to perform similar load balancing functions within an NFV.]
>
> KR> I'm not sure what architecture this paragraph is discussing: are you
> talking about encrypted tunnels between NFV nodes? Is this something
> obvious to people involved in NFV? A diagram (or informational reference)
> would be helpful to me here.
>
> KM> I see your point, the language here could be more clear. Do the above
> adjustments (ed: in []) help?
>

I am still a little confused. Is the idea that load balancing in NFV
environments has a unique need for stateless (or reduced-state) load
balancing that other applications don't have? I'm having a hard time
wrapping my head around why that would be the case. Or is the point here
just to highlight NFV as just another use case to consider?


> 2.2.2.  Differential Treatment based on Deep Packet Inspection (DPI)
>    ...
>    These effects and potential alternative solutions have been discussed
>    at the accord BoF [ACCORD] at IETF95.
>
> KR> This section is labeled DPI, but really, the underlying issue is what
> you stated in the first paragraph: different kinds of traffic have
> different QoS needs, yet a network provider can't rely on a voluntary
> signal from an untrusted device to decide on QoS or every packet is simply
> going to be marked "high importance" and so we're back to treating all
> traffic equivalently. I'd argue against one of the memes I heard at the
> accord BoF, that it's down to latency vs. throughput, by pointing out that
> some applications (e.g., live video with low hand-wave latency) need both.
>
> Even after reading this, I'm still skeptical of the need for any more
> granularity than flow, and using AQM on a per-flow (e.g., 5-tuple) or
> flow-aggregate (some subset of the 5-tuple) to prevent an application or
> user from consuming resources unfairly. What, for instance, prevents a
> carrier from privileging VoIP traffic by looking at endpoints? Would there
> be a way for someone else to masquerade non-VoIP traffic as VoIP traffic
> given this kind of setup? This is the kind of question that I need answered
> by this doc.
>
> BW> It might be useful to note in this section that QUIC and H2 both
> combine multiple micro-flows, possibly of different types, within a single
> encrypted transport-layer flow. They share this with IPsec tunnels and the
> like. IOW, the increased use of encrypted aggregating encapsulation can
> hide even the most basic representation of a flow from the
> differentiated service element. This same concern applies to load balancing
> elements discussed in section 2.2.1.
>

Good point: an example of this is sending both Youtube and search responses
over the same QUIC or H2 connection, with no way for the network to
throttle one without throttling the other.
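
A toy illustration of the aggregation problem, assuming a classifier keyed
only on the 5-tuple. FiveTuple, visible_flows, and the stream labels are
hypothetical; the labels stand in for information that sits inside the
encrypted payload and is therefore not available to the network:

```python
from collections import defaultdict
from typing import NamedTuple


class FiveTuple(NamedTuple):
    src: str
    sport: int
    dst: str
    dport: int
    proto: str


def visible_flows(packets):
    """Sum bytes per 5-tuple; the stream label is hidden, so it is ignored."""
    totals = defaultdict(int)
    for key, _hidden_stream, size in packets:
        totals[key] += size
    return dict(totals)


# One multiplexed connection carrying two kinds of traffic (example values).
conn = FiveTuple("198.51.100.1", 50000, "203.0.113.5", 443, "udp")
packets = [(conn, "video", 1200), (conn, "search", 300)]
flows = visible_flows(packets)
```

The network sees a single flow, so any per-flow treatment applies to the
video and the search responses alike.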

> AM> We were asked not to refer to QUIC, for various reasons (e.g., still
> under development).
>
> There will always be areas where network can make the best decision,
> because of the
> information available to the network operators (and the lack of that same
> info at end-points).
>
> When network resources are constrained, only the network can manage
> priorities.
> This has been organized according to applications that can be identified,
> but there
> can be other solutions requiring cooperation between user devices and the
> network
> according to subscription to a special service (QCI above).
>

Got it. Is DPI the right framing for this, or is something more generic
(e.g., "content-aware traffic management") what is really required? E.g.,
the network doesn't necessarily need to know which video you're watching,
only that it is video, and maybe what the available bitrates are and
associated quality.


>    An application-type-aware network edge (middlebox) can further
>    control pacing, limit simultaneous HD videos, or prioritize active
>    videos against new videos, etc.
>
> KR> Observation: This subsection provides the first really compelling
> argument I've seen for exposing flow metadata to the path. On long paths,
> physics gets in the way of tight control feedback loops. If nothing else,
> this should provide motivation for protocol designers and operators to
> break down the characteristics of different kinds of flows, determine where
> control points are needed in each of them, and figure out how to implement
> those.
>
> I think there is this conceit among protocol designers that quality
> problems can all be solved at the endpoints without any cooperation from
> path elements; the really killer arguments are examples of where that
> cannot possibly be the case. ECN is a great example of this, and is a
> signal explicitly targeted at middleboxes with opt-in by the endpoints: it
> allows a middlebox to report congestion without dropping packets, which
> produces measurably better QoS for the user.
>
> KM> Ack, thanks.  You're not looking for additional text here, is that
> right?  If so, what are you thinking should be added?
>

No, just an observation that this was one of the more thought-provoking
sections of the doc for me.


>    Alternate approaches such as blind caches [I-D.thomson-http-bc] are
>    being explored to allow caching of encrypted content; however, they
>    still need to intercept the end-to-end transport connection.
>
> KM> [s/need to intercept the end-to-end transport connection/require
> cooperation between the content owners/CDNs and blind caches and fall
> outside the scope of what is covered in this document/
>

SGTM.


> 2.2.6.  Content Compression
>
>    In addition to caching, various applications exist to provide data
>    compression in order to conserve the life of the user's mobile data
>    plan and optimize delivery over the mobile link.  The compression
>    proxy access can be built into a specific user level application,
>    such as a browser, or it can be available to all applications using a
>    system level application.  The primary method is for the mobile
>    application to connect to a centralized server as a proxy, with the
>    data channel between the client application and the server using
>    compression to minimize bandwidth utilization.  The effectiveness of
>    such systems depends on the server having access to unencrypted data
>    flows.
>
> KR> Observation: given the side channels exposed by data compression that
> is blind to content, the inability to compress arbitrary payloads is likely
> to be regarded as a feature of encryption. (Though I recognize this is a
> catalog, not an endorsement.) Furthermore, in most cases eliminating
> compression is still 2-competitive with compression, so I'm not sure it's a
> really compelling use-case.
>
> BW> Per-object content compression might not be a compelling use case
> here. Aggregated data stream content compression that spans objects and
> data sources is compelling, though. A network element close to
> the receiver that sees all content destined for the receiver and can treat
> it all as part of a unified compression scheme (e.g., through the use of a
> shared segment store) will often be much more effective at providing data
> off-load.
>
> KM> Thanks, we'll add this text (modified) to make those helpful points
> clear.
>
> How about:
>     Aggregated data stream content compression that spans objects and data
> sources that can be treated as part of a unified compression scheme (e.g.,
> through the use of a shared segment store) is often effective at providing
> data offload when there is a network element close to the receiver that has
> access to see all the content.
>

Sounds good. This is general enough to cover the case of networks with
limited uplinks wanting to cache content that is conceptually shared (e.g.,
VOD) but delivered independently to end users via individual TLS
connections.
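
A rough sketch of what a shared segment store buys, assuming fixed-size
segments and a content hash as the key. SEGMENT_SIZE, offload, and the
in-memory dict are hypothetical simplifications, and the overhead of sending
a reference for an already-stored segment is ignored:

```python
import hashlib

SEGMENT_SIZE = 4  # unrealistically small, for illustration only


def offload(payload: bytes, store: dict) -> int:
    """Return how many payload bytes must actually cross the limited uplink."""
    sent = 0
    for i in range(0, len(payload), SEGMENT_SIZE):
        segment = payload[i:i + SEGMENT_SIZE]
        key = hashlib.sha256(segment).digest()
        if key not in store:
            # First sighting: the segment is fetched and remembered.
            store[key] = segment
            sent += len(segment)
    return sent
```

The second delivery of the same plaintext costs nothing on the uplink. If
each delivery is encrypted under different per-connection keys, the segments
never match and the store sees no hits.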


> KR> It might be worth discussing the typical opt-in strategy for these
> things in the presence of TLS, adding a new intercept CA to willing
> clients, which has the downside that it potentially exposes every https
> connection to an active MitM.
>
> BW> +1
>
> KM> OK, we hadn't done that before since the option doesn't change, but
> you make a good point, so I'll add in text.  Thanks.
>
> I added the following:
>
>     This method is also used by other types of network providers enabling
>      traffic inspection, but not modification.</t>
>
>              <t>Content filtering via a proxy can also utilize an
>           intercepting certificate where the client's session is
>           terminated at the proxy enabling for cleartext inspection of
>           the traffic.  A new session is created from the intercepting
>           device to the client's destination, this is an opt-in strategy
>           for the client. Changes to TLSv1.3 do not impact this more
>           invasive method of interception, where this has the potential
>           to expose every HTTPS session to an active man in the middle
>           (MitM). </t>
>

Mostly sounds good. Is there a reason to mention TLS 1.3 specifically here?


> KR> Random comment: especially with respect to government content
> filtering, I'm worried that the IETF's current approach of playing chicken
> with regulators on end-to-end encryption is going to result in
> normalization of intercept CAs, which will be strictly worse than a
> compromise solution in which a subset of traffic can be inspected (but not
> modified) with the user's knowledge and consent (e.g., distinct optics in
> the browser). I wouldn't like either outcome, frankly, but it would be nice
> if we had a game plan for what to do for user privacy if intercept CAs
> become a requirement for using the web in large parts of the world
> (something we might be one "crisis" away from), and an honest evaluation of
> the alternatives. Fundamentally, I don't like it when discussion gets shut
> down because people want to bury their heads in the sand in the name of
> ideology.</rant>
>
> BW> +1. I also note that this concern applies to some of the other
> performance related use cases too.
>
> KM> I think the real argument here is a control one between the
> application and management folks and not security/privacy even though
> that's what is often discussed.  This is all about control.
>

Right, but the core issue being addressed by this document is that measures
intended for reasons of privacy and security (encryption) are impacting
something over which there is much less consensus (content-aware flow
management and path intelligence). I'm not proposing any language here,
only pontificating that the purity approach might backfire, and I'm not
sure we have a backup plan.


>    In addition, mobile network operators often sell tariffs that allow
>    free-data access to certain sites, known as 'zero rating'.  A session
>    to visit such a site incurs no additional cost or data usage to the
>    user.  This feature is impacted if encryption hides the details of
>    the content domain from the network.
>
> KR> There's the related issue that zero-rating by-implementation typically
> applies only to direct connections to a particular endpoint (e.g., by IP):
> if a user accidentally tunnels traffic from Spotify through a corporate
> VPN, that traffic won't be zero-rated, encrypted tunnel or not. (This goes
> back to the taxonomy of metadata layers comment I made near the top.)
> Carriers aren't going to trust e.g., a Host header for zero-rating, because
> that provides a simple way to tunnel traffic for free: consequently,
> determination of zero-rating will always involve some hard-to-impersonate
> credential, like an IP address or server certificate in the public trust
> web.
>
> KM> Not sure what to add here, any ideas, AL?
>

I think the only change I'd make here is to change "content domain" to
"content origin", because domain implies hostname where the origin is often
an IP.
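
For illustration, a minimal sketch of zero-rating keyed on the destination
address rather than on a hostname. The ZERO_RATED prefixes (documentation
ranges) and is_zero_rated are hypothetical, not any carrier's implementation:

```python
import ipaddress

# Hypothetical zero-rated origins, using documentation address ranges.
ZERO_RATED = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_zero_rated(dst_ip: str) -> bool:
    """Zero-rate only when the destination address falls in a listed prefix."""
    addr = ipaddress.ip_address(dst_ip)
    return any(addr in net for net in ZERO_RATED)
```

Traffic to the same service sent via a VPN concentrator at some other
address fails the check, whether or not the tunnel is encrypted.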


>    When RTSP stream content is encrypted, the 5-tuple information within
>    the payload is not visible to these ALG implementations, and
>    therefore they cannot provision their associated middleboxes with
>    that information.
>
> KR> I would argue that this is a protocol design issue. This was
> originally a problem with firewalls and NATs, with content inspection as a
> hack to work around the protocol/network impedance mismatch. I'm not the
> only one who would argue the right solution today is to design protocols to
> not require linkage across connections by middleboxes that do basic
> filtering.
>
> KM> I think we are in agreement here for solution direction, but the
> document specifically tries to avoid solutions.  This example has been raised
> in the IESG by Warren and the apps side hadn't considered his view of it
> previously.  It would be good for protocols to have these considerations in
> their designs, they were mostly thinking it didn't matter and were
> end-to-end.  But poor video streaming sessions are an issue.  Not sure we
> should add any text here???
>

This was just another random observation.


>    Data center operators may also maintain packet recordings in order to
>    be able to investigate attacks, breach of internal processes, etc.
>    In some industries, organizations may be legally required to maintain
>    such information for compliance purposes.  Investigations of this
>
> KR> I think you'll get a "[citation needed]" from folks on the TLS mailing
> list.
>
> KM> I suspect this is a case where, once you have that recorded text, you
> have to maintain it for chain of custody with investigation handling.
> I'll have to figure out if there is anything that would require the
> capture, I suspect not, but could be wrong.
>

Just making the point that this has been contended several times on various
mailing lists and at meetings, so it would be nice to get the oft-cited
cases documented somewhere as an informational reference.

>    There are use cases where DAR or disk-level
>    encryption is required.  Examples include preventing exposure of data
>    if physical disks are stolen or lost.
>
> KR> I don't see how these last two sentences are relevant, as they have
> nothing to do with the network flows.
>
> KM> I'm happy to remove.  Do they help a reader who is not familiar with
> the technology to understand the layers of encryption used at all or is it
> better to remove the sentence?
>

I'm not sure. I'm actually a bit confused about this section in general. It
seems to be discussing monitoring of data during transport to/from the
storage cluster in the same paragraph as encryption of data at rest, but
I'm not sure what point it's trying to make. Is it that operators have a
threat model that doesn't include the network connection between the
storage cluster and the client, but which does include exfiltration of the
disks in the cluster?


>    Security monitoring in the enterprise may also be performed at the
>    endpoint with numerous current solutions that mitigate the same
>    problems as some of the above mentioned solutions.  Since the
>    software agents operate on the device, they are able to monitor
>    traffic before it is encrypted, monitor for behavior changes, and
>    lock down devices to use only the expected set of applications.
>    Session encryption does not affect these solutions.  Some might argue
>    that scaling is an issue in the enterprise, but some large
>    enterprises have used these tools effectively.
>
> KR> This is another example of mixing proposed solutions in among the
> problem statement. I would argue for a clear separation, which may mean
> that this document needs to have a single-minded focus on "here are the
> problems and here's how enterprises currently address them."
>
> BW> Also, enterprises increasingly allow BYOD programs for their
> employees, and such programs make it more difficult to ensure that adequate
> endpoint-based defenses are active. This is especially true when the area
> of risk in question is the above #5 "track misuse and abuse by employees".
> Note too that endpoint-based defenses can be less effective when the device
> is already compromised, in which case detection of the compromised device
> and effective remediation can be made more effective through the additional
> use of an on-path element.
>
> KM> [made some subsequent edits to this section]
>

LGTM.

> 5.7.  Further work
>
>    Although incident response work will continue, new methods to prevent
>    system compromise through security automation and continuous
>    monitoring [SACM] may provide alternate approaches where system
>    security is maintained as a preventative measure.
>
> KR> Not clear how the unknowns relate to the purpose of this document.
> Being sarcastic for a minute, I'm interpreting this as "Any cleartext
> metadata just *might* be used in the future for some kind of enterprise
> security monitoring!"
>
> KM> Hmm, it's meant to say endpoints (which you control) should be used
> and technology like what is expected out of SACM will help with automating
> this.  We are open to text suggestions.
>

Yeah, I think I must have misread it the first time, because I get only
your meaning now.

Kyle