[rtcweb] Benjamin Kaduk's Discuss on draft-ietf-rtcweb-security-11: (with DISCUSS and COMMENT)

Datatracker on behalf of Benjamin Kaduk <ietf-secretariat-reply@ietf.org> Wed, 06 March 2019 19:08 UTC

From: Datatracker on behalf of Benjamin Kaduk <ietf-secretariat-reply@ietf.org>
To: "The IESG" <iesg@ietf.org>
Cc: draft-ietf-rtcweb-security@ietf.org, Sean Turner <sean@sn3rd.com>, rtcweb-chairs@ietf.org, sean@sn3rd.com, rtcweb@ietf.org
Message-ID: <155189932716.14137.9903426522882898659.idtracker@ietfa.amsl.com>
Date: Wed, 06 Mar 2019 11:08:47 -0800
Archived-At: <https://mailarchive.ietf.org/arch/msg/rtcweb/3BsD4n-xkW3SFjDaCKyKmg2OhVE>
Subject: [rtcweb] Benjamin Kaduk's Discuss on draft-ietf-rtcweb-security-11: (with DISCUSS and COMMENT)

Benjamin Kaduk has entered the following ballot position for
draft-ietf-rtcweb-security-11: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-rtcweb-security/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

I'd like to have a brief discussion about a few points, though it's not clear that any change
to the document will be required (details in the COMMENT section for all of these):

Mutually-verifiable "secure mode" seems to require that the peer's browser be included in
the TCB, which is a bit hard to swallow.  Are we comfortable wrapping that in alongside
"we trust the peer to not be malicious"?

It's not clear how much benefit we can get from *optional* third-party identity providers;
won't the calling service have the ability to silently downgrade to their non-usage even if
both calling peers support it?


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

I mostly only have editorial comments, though there are a few that are
more content-ful.

Section 1

                                                                     As
   with any Web application, the Web server can move logic between the
   server and JavaScript in the browser, but regardless of where the
   code is executing, it is ultimately under control of the server.

The user can observe the JavaScript running in the browser, though maybe
this distinction is not necessary here.

Section 3

                                                               Huang et
   al. [huang-w2sp] summarize the core browser security guarantee as:

      Users can safely visit arbitrary web sites and execute scripts
      provided by those sites.

I note that the author of this document is listed as a coauthor on
huang-w2sp; does the self-cite really add much authority to the
summary of the guarantee?

The use of ALL-CAPS to call out new terms feels a bit dated.

                                                                   Note
   that for non-HTTPS traffic, a network attacker is also a Web
   attacker, since it can inject traffic as if it were any non-HTTPS Web
   site.  Thus, when analyzing HTTP connections, we must assume that
   traffic is going to the attacker.

nit: I know this is a web-centric document, but the privileging of https
as the only "secure" traffic reads a bit oddly to me; something like
"note that in some cases, a network attacker is also a web attacker,
since transport protocols that do not provide integrity protection allow
the network to inject traffic as if it were any communications peer.
TLS, and HTTPS in particular, protect against these attacks, but when
analyzing HTTP connections, we must assume that traffic is going to the
attacker."  (A thought experiment might be to consider whether wss://
traffic counts as "HTTPS traffic".)

Section 3.1

It might be appropriate to provide some example references in place of
"extensive research".

Section 4.1

                                                In either case, all the
   browser is able to do is verify and check authorization for whoever
   is controlling where the media goes.  [...]

nit: the wording here is a bit odd, since in case (1) you're verifying
you're talking to A, but you still control where the media goes (in
terms of A or not-A; A can of course then forward on the media further).

                                                                    By
   contrast, consent to send network traffic is about preventing the
   user's browser from being used to attack its local network.  [...]

nit: "local" is perhaps overly restrictive, depending on interpretation

Section 4.1.1

Maybe note that the "result" of the cross-site requests that is leaked
is in the form of pixels and not structured data, but that does not
change the information content.

Section 4.1.3

   Now that we have seen another use case, we can start to reason about

nit: I'm confused by "another" here.

                                              While not suitable for all
   cases, this approach may be useful for some.  If we consider the case
   of advertising, it's not particularly convenient to require the
   advertiser to instantiate an iframe on the hosting site just to get
   permission; a more convenient approach is to cryptographically tie
   the advertiser's certificate to the communication directly.  We're

This seems to be relying on the reader to have some background knowledge
and make some leaps of reasoning that may not be reasonable to expect.

   Another case where media-level cryptographic identity makes sense is
   when a user really does not trust the calling site.  For instance, I
   might be worried that the calling service will attempt to bug my
   computer, but I also want to be able to conveniently call my friends.

This is especially challenging because if the site (and/or its
javascript) is in the path for binding a cryptographic identity to a
real-world identity, then a malicious site can still get whatever keys
it wants authorized.

Section 4.1.4

   3.  The attacker forges the response apparently http://calling-
       service.example.com/ to inject JS to initiate a call to himself.

This seems to be missing a word or two.

   which contain untrusted content.  If a page from a given origin ever
   loads JavaScript from an attacker, then it is possible for that
   attacker to infect the browser's notion of that origin semi-
   permanently.

nit: "If any page" is more emphatic, I think.

Section 4.2

Do we want any discussion of the risks when metered bandwidth (pay per
byte) is in use?

Section 4.2.1

There's probably some room to tighten up the verbiage here; e.g., "the
site initiating ICE" is referring to a website that is using a browser
API to request ICE against some remote peer (right?).  And "ICE
keepalives are indications" is using Indication as the technical term
for a message that doesn't get an ACK response, not in its common
English usage.

Section 4.2.2

A one- or two-sentence summary of the impact of misinterpretation
attacks is probably in order, instead of making us follow the reference
(which isn't a section reference).

   Where TCP is used the risk is substantial due to the potential
   presence of transparent proxies and therefore if TCP is to be used,
   then WebSockets style masking MUST be employed.

nit: "employed" to obfuscate what, exactly?
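
For context, WebSocket masking (RFC 6455, Section 5.3) XORs each payload
byte with a per-frame 4-byte masking key, so that script-chosen bytes do
not appear verbatim on the wire where a transparent proxy could
misinterpret them.  A minimal Python sketch of that transform follows;
the function name mask_payload is illustrative and does not come from the
draft, and how the draft intends masking to apply to its TCP traffic is
exactly the open question above.

```python
import os


def mask_payload(payload: bytes, key: bytes) -> bytes:
    """XOR octet i of the payload with octet (i mod 4) of the key.

    This is the RFC 6455 Section 5.3 transform; applying it twice with
    the same key restores the original bytes.
    """
    if len(key) != 4:
        raise ValueError("masking key must be exactly 4 bytes")
    return bytes(b ^ key[i % 4] for i, b in enumerate(payload))


# Hypothetical usage: a fresh random key per frame, as WebSocket clients do.
key = os.urandom(4)
wire_bytes = mask_payload(b"GET / HTTP/1.1\r\n", key)
recovered = mask_payload(wire_bytes, key)
```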

Section 4.2.3

   refuses to send other traffic until that message has been replied to.
   The message/reply pair must be generated in such a way that an
   attacker who controls the Web application cannot forge them,
   generally by having the message contain some secret value that must
   be incorporated (e.g., echoed, hashed into, etc.).  Non-ICE

nit: "incorporated" into what?
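
One possible reading, stated here as an assumption and not as text from
the draft: the browser picks a random value that is never exposed to the
Web application and accepts only a reply that echoes it, much as a STUN
transaction ID is echoed in the response.  A minimal Python sketch with
hypothetical function names:

```python
import hmac
import secrets


def new_consent_check() -> bytes:
    """Generate a 96-bit random value, the same size as a STUN transaction ID.

    Assumption: the browser keeps this value internal, so script running
    in the Web application cannot read or predict it.
    """
    return secrets.token_bytes(12)


def reply_is_valid(sent_id: bytes, reply_id: bytes) -> bool:
    """Accept the reply only if it echoes the value that was sent."""
    return hmac.compare_digest(sent_id, reply_id)


# Hypothetical usage: a reply that echoes the value passes, any other fails.
sent = new_consent_check()
echoed_ok = reply_is_valid(sent, sent)
```

Under this reading, "incorporated" would mean "incorporated into the
reply"; a hashed variant would compare a keyed hash of the value instead
of the value itself.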

I think I'm a little confused about which legacy actors we're talking
about.  Are we still considering the broader situation to be a
webserver-mediated interaction between two browsers or browser-adjacent
applications?  (E.g., a WebRTC client calling some other sort of video
chat system?)

   leaves.  The appropriate technologies here are fairly similar to
   those for initial consent, though are perhaps weaker since the
   threats is less severe.

nit: "threat is"

Section 4.2.4

   Note that as soon as the callee sends their ICE candidates, the
   caller learns the callee's IP addresses.  The callee's server
   reflexive address reveals a lot of information about the callee's
   location.  In order to avoid tracking, implementations may wish to
   suppress the start of ICE negotiation until the callee has answered.

Is "answered" supposed to be some interaction with the controlling site?

   In ordinary operation, the site learns the browser's IP address,
   though it may be hidden via mechanisms like Tor
   [http://www.torproject.org] or a VPN.  However, because sites can
   cause the browser to provide IP addresses, this provides a mechanism
   for sites to learn about the user's network environment even if the
   user is behind a VPN that masks their IP address.  [...]

Some rewording for clarity is probably in order: "ordinary operation"
means a website that does not use WebRTC; "sites can cause the browser to
provide IP addresses" refers to the site using the browser API to request
ICE initiation; etc.

Section 4.3.1

[Obligatory note about "Forward Secrecy" vs. "Perfect Forward Secrecy"]

   to subsequent compromise.  It is this consideration that makes an
   automatic, public key-based key exchange mechanism imperative for
   WebRTC (this is a good idea for any communications security system)
   and this mechanism SHOULD provide perfect forward secrecy (PFS).  The
   signaling channel/calling service can be used to authenticate this
   mechanism.

To be clear, the authentication that the calling service provides is a
binding between identity and the public keys that are input to the key
exchange mechanism?

Section 4.3.2.1

                                                       Even if the user
   actually checks the other side's name (which all available evidence
   indicates is unlikely), this would require (a) the browser to trusted
   UI to provide the name and (b) the user to not be fooled by similar
   appearing names.

nit: "browser to use trusted UI"

Section 4.3.2.3

It's not clear that third-party identity providers actually provide
downgrade-resistance -- can't the site mediating the calls just decline
to acknowledge that a third-party identity is/was available for the
peer?

Section 4.3.2.4

                                    I.e., I must be able to verify that
   the person I am calling has engaged a secure media mode (see
   Section 4.3.3).  In order to achieve this it will be necessary to
   cryptographically bind an indication of the local media access policy
   into the cryptographic authentication procedures detailed in the
   previous sections.

This seems to require extending the TCB from just the local browser to
the remote browser as well, which is ... a stretch.
(Also, do we really need the first person?)

Section 9.2

The coordinates for [OpenID] don't seem quite right.