[rtcweb] Review of security documents

Martin Thomson <martin.thomson@gmail.com> Tue, 22 May 2012 22:13 UTC

Pedantic review of draft-ietf-rtcweb-security-02
... with reference to draft-ietf-rtcweb-security-arch-01

Both documents are in pretty good shape.  They are largely coherent and
comprehensive.  That's not to say that they are finished.

To some extent, a lot of these issues are simply due to entropy.  We've
made quite a few decisions in the time since these were last updated.
We've also made no progress on other issues.


HTTP/HTTPS

S3, last paragraph before S3.1:

   [...], but realistically many sites do not run
   HTTPS [RFC2818] and so our ability to defend against network
   attackers is necessarily somewhat limited.

This isn't especially relevant.

Obviously, the standard class of problems with unsecured HTTP exist, but
within the context of this application, there aren't that many more that
this enables.  The example in S4.1.3 is not unique to this application.
It applies to any user consent that is tied to a particular web origin.

In comparison to possibly visiting and _using_ a site operated by a web
attacker, this is not substantially worse, nor does it require
significantly more effort to analyze.

Of course, the only safe assumption is that you are talking to a web
attacker when using unsecured HTTP.

((Aside: I'd be interested in learning how this might turn into an
attack on user consent.  Otherwise, this is a bit of a scary statement
to just drop in without support:
   Note:  this issue is not restricted to PAGES
   which contain mixed content.  If a page from a given origin ever
   loads mixed content then it is possible for a network attacker to
   infect the browser's notion of that origin semi-permanently.))

NETWORK ATTACKERS

Same paragraph:

   [...], with the assumption that
   protection against network attackers is provided by running HTTPS.

Thankfully, the draft does not make this assumption.  Attacks on HTTP
are out of scope, but we still need to deal with network attackers.

TYPES OF CONSENT

The document talks about consent in general as being important but it
doesn't do anything to address what specific consent is needed.

I think that it has been well-established that consent is required for
access to input devices (e.g., camera/microphone).  The implication from
S4.1 is that this is sufficient as well as necessary.  There is one
crucial piece of the argument that is absent:

   A site with access to camera or microphone could send media either to
   itself or to any site that indicates consent (see CORS).  Sending media
   over HTTP or the WebSocket protocol is likely to perform less well than
   is ideal, but it would work.

Therefore, it's easy to draw the implicit conclusions of the draft,
namely:
  1. a receipt consent mechanism like CORS is necessary, and
  2. user consent for access to input devices is necessary _and_
     sufficient...for this mode.
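
To make that missing piece concrete, here is a rough sketch (TypeScript,
using current API names that postdate these drafts; the collector
endpoint is made up) of how a page with device consent could ship media
to any receiver that will take it:

   // Once the user grants camera/microphone access, nothing stops the
   // page from shipping that media to any server that consents to
   // receive it -- here over a WebSocket.  The endpoint is hypothetical.
   async function shipMediaElsewhere(): Promise<void> {
     const stream = await navigator.mediaDevices.getUserMedia({
       audio: true,
       video: true,
     });
     // Any receiver that consents (in the CORS/WebSocket sense) will do.
     const ws = new WebSocket("wss://collector.example.net/media");
     const recorder = new MediaRecorder(stream);
     recorder.ondataavailable = (event: BlobEvent) => {
       if (ws.readyState === WebSocket.OPEN) {
         ws.send(event.data); // worse than SRTP over UDP, but it works
       }
     };
     recorder.start(250); // emit a chunk every 250 ms
   }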

PRIVATE COMMUNICATIONS AND CONSENT

The above assumes that the site has access to media.  That is,
permission for input devices is being granted to the site.  However, it
is possible to imagine a mode of communications where the site mediates
the creation of a secure channel, but does not have access to that
channel.

This changes the assumptions - and the nature of the consent -
dramatically.  S4.1.2 and security-arch@S5.2 cover this case, but don't
really emphasize just how different this is.

Just as for the entirety of S4.1, the problem then becomes one of
unambiguous identification.  And UI.  From S4.1.2:

   Naturally, it is somewhat challenging to design UI
   primitives which express this sort of policy.

Complications include group calling.  How does the site ask for
permissions to talk to "a@b.com" and "x@y.net"?  How does such a
privilege persist?  Does it even make sense to persist at all?
Obviously, a conference of any significant size tends toward having a
bridge.  At that point, input devices are most likely granted to
"host.example.com".

TRULY PRIVATE COMMUNICATIONS

Keeping the site out of the loop requires that the browser lock down
access to media recording.  The trick is to ensure that the media is
unmodified all the way from source to sink.  It's relatively easy to
ensure that media coming off the network is unmodified.

It is harder to ensure that it was not modified by a web attacker
prior to being put on the network.  That requires a verifiable assertion
from the remote user that they did not allow a web attacker to see
or modify the stream before it was placed in the pipe.

The distinction between media stream visibility and modifiability might
be worth discussing a little.  My initial thought is that it is not
especially useful in this context.  I can imagine work-arounds that
would enable features that depend on visibility, such as recording,
where authenticity is also desirable.

IDENTIFICATION

S4.1.1.1.1.1.1.1.1 asks the obvious question:

   [...] there is a question about whether the
   user can know who they are talking to.

When a site has access to your media, then you are talking to the site.
...and anyone the site chooses to forward your media to.  This is
exactly what you get when you use SRTP security descriptions (and also
EKT, if you allow the site to insert keys).
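
For reference, this is roughly what security descriptions put in the
signaling (the values below are placeholders, not taken from any draft):
the key travels in the SDP, so whoever carries the signaling can read it.

   m=audio 49170 RTP/SAVP 0
   a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:<base64 key||salt>|2^20|1:32

The site only has to remember (or forward) that one attribute in order
to decrypt or redirect the media.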

USER INTERACTION CONSIDERATIONS

S4.1.1.2 hides two important UI considerations:

   [...] great care must be taken in the design of this interface
   to avoid the users just clicking through

and

   [...] the user
   interface chrome must clearly display elements showing that the call
   is continuing in order to avoid attacks where the calling site just
   leaves it up indefinitely

These are both massively important.  If there were a W3C companion
document, then it would make sense to include this sort of stuff there.

More robust treatment would be nice for:
 a) the limitations of consent mechanisms
 b) providing appropriate feedback

For the latter, this can get complicated.  I recall a discussion of the
geolocation API that led to the conclusion that no UI feedback would be
provided.  I still believe that this was a poor choice.  This case bears
a certain amount of similarity to that discussion.  I should really join
that media capture group...

In any case, this probably needs at least basic treatment here, even if
that is just by reference.

AUTHENTICATION MODELS

(Crossing over into S4 of the security-arch document here to address the
use cases from S4.1.1.2 and S4.1.1.3.)

It looks like there is an assumption in play here.  That is, there is
something like an IdP in use.  Calling a site (as opposed to a person)
is very much a case where the usual domain trust anchors are perfectly
adequate.  The site can offer the same (or a similar) certificate in
the DTLS handshake as it did in the HTTPS TLS handshake.

Obviously, it's not as simple as just having this chain to a trust
anchor.  Any site that deployed TLSA (cf. the DANE WG) would require
extra checking.
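
Setting the TLSA wrinkle aside, the check could look something like the
following sketch (TypeScript; fetchExpectedFingerprint() is hypothetical
and stands in for however the browser or application learns the
fingerprint of the certificate the site used for HTTPS):

   // Hypothetical: however we learn the fingerprint of the certificate
   // that the site presented in its HTTPS TLS handshake.
   declare function fetchExpectedFingerprint(origin: string): Promise<string>;

   // Pull the DTLS certificate fingerprint out of the SDP answer.
   function extractDtlsFingerprint(sdp: string): string | null {
     const match = sdp.match(/^a=fingerprint:sha-256 ([0-9A-F:]+)\s*$/mi);
     return match ? match[1].toUpperCase() : null;
   }

   // "Calling a site": accept the call only if the certificate offered
   // for DTLS matches the one the site uses for HTTPS.
   async function verifySiteIdentity(answerSdp: string): Promise<boolean> {
     const offered = extractDtlsFingerprint(answerSdp);
     const expected =
       await fetchExpectedFingerprint("https://host.example.com");
     return offered !== null && offered === expected.toUpperCase();
   }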

I'm surprised to see nothing on this particular use case.  It's a
particularly useful one.

CALL MEDIATION AND REPUTATION

S4.1.1.3 contains a discussion about a case where the reputation of one
site is potentially affected by it mediating a call with a third party.
In the example, sites wouldn't want advertisements they display to
reflect poorly on them.

I don't see how this changes the current situation significantly.  Sites
and ad networks work very hard to ensure that advertisements are
appropriate to the context in which they are shown.  Realtime calling
doesn't really change this situation in any meaningful way.

A very big part of this is ensuring that the source of media is
correctly identified.  In part, this is the same problem we already
have.  In addition, sites need to take some responsibility for making
any necessary distinction sufficiently clear on their own interface.  In
this context, the burden lies with the ad network.

((Aside.  I'm not sure that this is always true:
   [...] sites which host
   advertisements often learn very little about whether individual users
   clicked through to the ads, or even which ads were presented

That's true for ads in iframes, but then the reputation concern isn't
entirely applicable.  If the ads are served inline, then I would have to
assume that obfuscation of ad network content is the only protection
against the site learning about ad content and user interactions.))

COMMUNICATIONS CONSENT

Consent from the target of a media flow is not the binary property that
ICE establishes.  Even with the addition of an expiration time to ICE
consent, there are two problems related to the volume of packets that
need addressing:

The rate at which STUN Binding requests are generated in ICE depends on
the bandwidth available for media.  Because that information comes from
a web attacker, the browser cannot trust it.  The browser has to
rate-limit Binding requests based on its own information; in practice,
this probably has to be a fixed rate.
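
In other words, something like this (sketch only; sendBindingRequest()
stands in for the ICE stack, and the interval value is made up):

   // The browser picks the consent-check rate itself, ignoring any
   // pacing derived from page-supplied bandwidth information.
   const CONSENT_CHECK_INTERVAL_MS = 5000; // fixed, browser-chosen rate

   function startConsentChecks(sendBindingRequest: () => void): () => void {
     const timer = setInterval(sendBindingRequest, CONSENT_CHECK_INTERVAL_MS);
     return () => clearInterval(timer); // call when the session ends
   }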

Even once the browser has validated consent, it has no idea _how much_
media is OK to send.  The difference between 8kbps audio and 5Mbps high
definition video is enough to make many an internet connection melt.
RTCP feedback like TMMBR is good, but maybe those packets could be made
to get "lost" by a network attacker.  Plus, if we are going to offer a
data channel that doesn't use RTCP, then that option isn't available.

BANDWIDTH LIMITING

Bandwidth limiting is probably an important security feature that isn't
really covered, though you touch on it in the security considerations of
security-arch.
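
At the application layer, the equivalent knob looks roughly like this
(sketch; it uses the standard RTCRtpSender parameters API, which
postdates these drafts, and the browser would need to enforce something
similar internally for it to count as a security feature):

   // Cap the bitrate each sender may use, regardless of what the remote
   // side or the signaling asks for.
   async function capSenderBitrate(
     pc: RTCPeerConnection,
     maxBitsPerSecond: number
   ): Promise<void> {
     for (const sender of pc.getSenders()) {
       const params = sender.getParameters();
       for (const encoding of params.encodings ?? []) {
         encoding.maxBitrate = maxBitsPerSecond;
       }
       await sender.setParameters(params);
     }
   }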

SPECULATIVE STUFF

Don't think that we need masking: TURN TCP probably has enough
protection already and that's the only TCP I can imagine having to use,
aside from HTTPS...

Don't think that we need some sort of implicit consent mechanism.  For
consent, I don't think that it's unreasonable to expect ICE-lite at a
minimum.

See thread on hiding IP addresses.  In order to avoid tracking via IP,
clients should not use the Internet.  I hear that Tor is somewhat handy
at making it possible to have cake that you can eat too.  That said, I'm
not sure that that particular cake tastes all that nice...

IdP

MIXED CONTENT

security-arch@S5.1

It might make sense to consider _all_ unsecured HTTP as being in the
same origin for the purposes of this.

Furthermore, I subscribe to the view that mixed content == unsecured.

The idea that a page might become mixed content is a real concern.
Terminating the session is probably the only safe choice.

Here's the thought process.  Initially, I thought that it might be
sufficient to prevent modifications to the session once a page goes
rogue.  At least then, if a secure browser-to-browser pipe that the
page could not access existed prior to the poisoning, that pipe could
continue unmodified.  However, once the page at one end is compromised,
the attacker can simply exploit any renegotiation capability that exists
in the application to trigger changes through the signaling channel.  For
instance, EKT might allow the attacker to update keys.  Given that it
should also be possible to insert a relay (in one direction at least) by
providing faked "candidates", this cracks it wide open.  The more
renegotiation capabilities exist on the media path, the worse this is.

ICE TRANSACTION ID

...need only be hidden for Binding requests that are outstanding.
Successful requests need not be hidden.  (Note that this means meeting
all of the success criteria specified in RFC 5245.)

ICE KEEPALIVES

Are not sufficient, so don't require them.  Instead, require a repeat of
the connectivity check.  That ensures consent is refreshed and it also
means that the statement about ICE-Lite remains true, which would not be
so if another mechanism were used.
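
A minimal sketch of what that buys you (the lifetime value is made up):

   // Media may only flow while a full connectivity check (request plus
   // success response) has completed recently.  A bare keepalive, which
   // needs no response, would never refresh this timer.
   const CONSENT_LIFETIME_MS = 30_000; // illustrative value
   let lastConsentAt = 0;

   function onBindingSuccess(): void {
     lastConsentAt = Date.now(); // only a successful check refreshes consent
   }

   function mayTransmit(): boolean {
     return Date.now() - lastConsentAt < CONSENT_LIFETIME_MS;
   }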

UNLINKABILITY

security-arch@S5.5

Nothing on this in the security doc.  What is the goal here?

Minting a new key can be expensive.  How do you prevent sites from
pushing the button too often and causing a different sort of problem?
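
One crude answer, sketched below, is to rate-limit the mint operation
itself.  RTCPeerConnection.generateCertificate() is the real (later) API
for this; the interval value is made up, and picking the right one is
exactly the open question.

   // Refuse to mint a fresh certificate (i.e., a new key pair) more
   // often than some browser-chosen interval.
   const MIN_MINT_INTERVAL_MS = 60_000; // illustrative value only
   let lastMintAt = 0;

   async function mintUnlinkableCertificate(): Promise<RTCCertificate> {
     const now = Date.now();
     if (now - lastMintAt < MIN_MINT_INTERVAL_MS) {
       throw new Error("certificate regeneration rate-limited");
     }
     lastMintAt = now;
     return RTCPeerConnection.generateCertificate({
       name: "ECDSA",
       namedCurve: "P-256",
     } as EcKeyGenParams);
   }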