[rtcweb] Updated security documents

Eric Rescorla <ekr@rtfm.com> Mon, 15 July 2013 13:34 UTC

I've just posted updated security documents reflecting most, but not all,
of the last call comments. There are still a few TODOs I need to resolve
F2F (plus the SDES thing, which I haven't done anything with, since we're
discussing it in Berlin).

Below is a consolidated response to people's comments. Please let me
know if you think I haven't responded adequately to yours.

> I have reviewed the security draft (-04) as an individual and have the
> following comments.

Thanks for your comments. Responses below.

> 1. Usage of acronym RTCWEB vs WebRTC. I thought we earlier had a
> conclusion that we would use WebRTC as the acronym for the complete
> solution? Am I remembering incorrectly?

Changed, except when referring to the WG.

> 2. I think the introduction should contain some pointer to the overview
> draft for reverse lookup purpose if one stumbles on the document.


> 3. Section 3.1:
> Similarly, while Flash SWFs can access the camera and microphone,
>    they explicitly require that the user consent to that access.
> Can you please expand the SWF acronym and perhaps add a reference?

The acronym isn't very useful, since I think it refers to
"Shockwave Flash". I changed this to programs ("SWFs").

Added a reference.

> 4. Section 3.2:
> Many other resources are accessible but isolated.  For instance,
>    while scripts are allowed to make HTTP requests via the
>    XMLHttpRequest() API those requests are not allowed to be made to any
>    server, but rather solely to the same ORIGIN from whence the script
>    came.[RFC6454] (although CORS [CORS] and WebSockets [RFC6455]
>    provides a escape hatch from this restriction, as described below.)
> This above looks strange around the first period, which is followed by
> RFC6454 reference. If is is a new sentence, the main sentence contains
> only a reference.


> 5. Section 3.2
> This SAME ORIGIN POLICY (SOP) prevents server A from mounting attacks
>    on server B via the user's browser, which protects both the user
>    (e.g., from misuse of his credentials) and the server (e.g., from DoS
>    attack).
> I think one can make clear that the server one protects are B in the end
> of the sentence. Although it reasonably clear from context, I think
> writing it like this:
> This SAME ORIGIN POLICY (SOP) prevents server A from mounting attacks
>    on server B via the user's browser, which protects both the user
>    (e.g., from misuse of his credentials) and the server B (e.g., from
>    DoS attack).
> Is clearer.


> 6. Section
> In both of the previous cases, the user has a direct relationship
>    (though perhaps a transient one) with the target of the call.
> Is it really the "target of the call" or is it the calling service that
> the user has a direct relation ship with?

I removed this section entirely since we decided to do nothing
about this and I have seen other people be confused by it.

> 7. Section 4.2.1:
> [[ OPEN ISSUE:  Do we need some way of verifying the expected traffic
>    rate, not just consent to receive traffic at all.]]
> Regarding this issue, I think you could add something which makes this
> not required, namely that by requiring congestion control of all flows
> established by the browser, the browser tries to avoid persistent
> congestion and any overload attack is less effective. The attack is more
> in the vain of reducing any competing flows supported rate to a lower
> fair share rate, which in worst case is below what is required to
> support the service being the target of the attack.
> Thus I think the open issue question is a nice to have, but not required
> feature.

Agreed, and I think we have mostly come to consensus on this point,
so I have removed the open issue.

> 8. Section 4.2.4:
> I think this document is missing to discuss the privacy threats beyond
> IP location privacy? I think that should be added, especially as we are
> addressing some issues we know of.
> These other threats that we have discussed are associated with anonymous
> calling services and linking and fingerprinting browsers or users with
> anonymous calls. The threats we do know exist are for example RTCP CNAME
> generation, DTLS certifcates, and API fingerprinting. I think these
> needs to be raised as a threat.

Added a privacy considerations section.


> Some minor comments on draft-ietf-rtcweb-security:

Thanks. Responses below.

> - Abstract
>     This document defines the RTC-Web threat model and defines an
>     architecture which provides security within that threat model
>   Isn't the architecture defined in draft-ietf-rtcweb-security-arch?

Good catch. Fixed.

> - Section, 3rd paragraph
>     However, this obviously presents a privacy challenge, as sites which
>     host advertisements in IFRAMEs often learn very little about whether
>     individual users clicked through to the ads, or even which ads were
>     presented.
>   What exactly is the privacy issue here? Is it that the hosting site
>   can eavesdrop on the call between the user and the advertiser since
>   IFRAMEs are not used?

I removed this section.

> - Section 4.3.1, 3rd paragraph
>     In addition, the system MUST NOT provide any APIs to extract either
>     long-term keying material or to directly access any stored traffic
>   Does this mean that SDES is out of the picture or does the sentence only
>   apply to DTLS(-SRTP)?

I don't think this really impacts SDES, in context, but I tried to add
some clarifying text.

> Abstract and the duplicate text in Introduction: RTC-web turn into
> either RTCWEB or WebRTC, with WebRTC being preferred.
> "some Web server" - slangy "a web server"


> Figure 1: might also want to show WebSockets as this is becoming very
> common and is mentioned later in the document without much context.

I added WebSockets.

> Also, do we want to show the signaling channel (the one referenced by
> the W3C WebRTC spec that transports the SDP offers and answers between
> browsers)?

I think it clutters the diagram, but I added some text.

> JS, popunder, and SWF are not defined in the draft

Defined JS and SWF and rewrote to not use popunder.

> 4.1.2. has a reference to draft-ietf-rescorla-rtcweb-generic-idp.
> What is the status of this draft?

Per WG discussion it has been folded into the security architecture.

> 4.3.  RFC 3711 SRTP is not mentioned in the opening paragraph, which
> seems strange, since it is the actual protocol providing the COMSEC
> properties mentioned, rather then DTLS or DTLS-SRTP for media.

Agreed. Fixed.

> 4.3.1. Might be useful to mention that this is sometimes called a
> "passive attack" whereas 4.3.2 is an "active attack".


> are a number of misleading statements about SAS in this
> section which should be corrected.  For example:
>  "This SAS is designed to be read over the voice channel and if
>  confirmed by both sides precludes MITM attack."  The SAS is to be
>  *compared* by the users - reading it out loud is only one possible
>  mechanism.

Rewritten as:

  ZRTP <xref target="RFC6189"/> uses a "short authentication string" (SAS)
  which is derived from the key agreement protocol. This SAS is designed
  to be compared by the users (e.g., read aloud over the voice channel or
  transmitted via an out of band channel) and if confirmed by both sides
  precludes MITM attack. The intention is that the SAS is used once and
  then key continuity (though a different mechanism from that discussed
  above) is used thereafter.

> "Moreover, it is possible for an attacker who controls the browser to
> allow the SAS to succeed and then simulate call failure and reconnect,
> trusting that the user will not notice that the "no SAS" indicator has
> been set (which seems likely)." - I don't quite know what this means.
> If SAS is used, the UI is the browser chrome, so if this is possible,
> it is just a badly designed UI not a protocol failure or issue.


> "Even were SAS secure if used, it seems exceedingly unlikely that
> users will actually use it." - This reads like speculation.  There are
> users today using SAS, open source users and commercial users.  One
> thing is certain, if we don't provide SAS, then users will not use it.

Yes, I think this is fair. I rewrote this to be a bit more neutral.

   Additionally, it is unclear that users will actually use an SAS.
   As discussed above, the browser UI constraints preclude requiring
   the SAS exchange prior to completing the call and so it must be
   voluntary; at most the browser will provide some UI indicator that the
   SAS has not yet been checked. However, it is well-known that when
   faced with optional security mechanisms, many users simply
   ignore them <xref target="whitten-johnny"/>.

LMK if you still object.

> The whole paragraph is about SAS, but then adds "or fingerprints" in
> the last sentence.  There are very significant differences between an
> SAS and a fingerprint (which I will call a Way Too Long Authentication
> String).  If fingerprints are going to be discussed in this section,
> then the differences between an SAS and a WTLAS need to be explained.

Fair enough. I removed the graf.

Bernard filed a bunch of issues:

Issue 4:
> It would be nice to see some editorial consistency among the document set.
> For example, the overview document uses the term "RTCWEB" to refer to the
> IETF WG or protocol specifications whereas this document uses "RTC-Web".

Changed to RTCWEB

Issue 5:
 > There also needs to be some mechanism for the browser to verify that
 >    the target of the traffic continues to wish to receive it.
 >    Obviously, some ICE-based mechanism will work here, but it has been
 >    observed that because ICE keepalives are indications, they will not
 >    work here, so some other mechanism is needed.
 >    [[ OPEN ISSUE:  Do we need some way of verifying the expected traffic
 >    rate, not just consent to receive traffic at all.]]
 > [BA] Since I believe we've figured this out, can we clean up the above
 > paragraph and OPEN ISSUE?

Agreed. Rewrote and added a link to draft-muthu.

>  [Note:  current thinking in the RTCWEB WG is not to support TCP and to
>  support SCTP over DTLS, thus removing the need for masking.]
>  [BA] This section seems somewhat "overtaken by events" given that the
>  channel will run over DTLS. How about the following?
>  4.2.2. Masking
>     Once consent is verified, there still is some concern about
>     misinterpretation attacks as described by Huang et al.[huang-w2sp].
>     Where TCP is used the risk is substantial due to the potential
>     presence of transparent proxies and therefore if TCP is to be used,
>     then WebSockets style masking MUST be employed.
>     Since DTLS (with the anti-chosen plaintext mechanisms required by
>     TLS 1.1) does not allow the attacker to generate predictable
>     ciphertext, there is no need for masking of protocols running over
>     DTLS (e.g. SCTP over DTLS, UDP over DTLS, etc.).
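
For readers unfamiliar with the "WebSockets style masking" named in the proposed text, the following is a minimal sketch of the XOR masking used in RFC 6455 framing: each payload byte is XORed with one byte of a fresh, unpredictable 4-byte key. The function names are illustrative assumptions and are not taken from either draft.

```python
import os


def new_masking_key() -> bytes:
    # The key must be unpredictable to the script that supplies the payload,
    # so that the script cannot choose the bytes that appear on the wire.
    return os.urandom(4)


def mask_payload(payload: bytes, key: bytes) -> bytes:
    """XOR each payload byte with key[i % 4], as in RFC 6455 framing."""
    if len(key) != 4:
        raise ValueError("masking key must be exactly 4 bytes")
    return bytes(b ^ key[i % 4] for i, b in enumerate(payload))


key = new_masking_key()
masked = mask_payload(b"GET / HTTP/1.1", key)
# Masking is its own inverse: applying it again restores the payload.
assert mask_payload(masked, key) == b"GET / HTTP/1.1"
```

The point of the technique is that an attacker who controls the payload but not the key cannot make the transmitted bytes look like a request that a transparent proxy would misinterpret.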


>     It is this consideration that makes an
>     automatic, public key-based key exchange mechanism imperative for
>     RTC-Web (this is a good idea for any communications security system)
>     and this mechanism SHOULD provide perfect forward secrecy (PFS).
>  [BA] Do we mean "SHOULD support" PFS or "SHOULD use"?  I don't believe
>  that DTLS/SRTP-EKT provides PFS.  Also, is there any implication that the
>  user should be able to somehow influence whether PFS is required or not?

Changed per discussion in the meeting.

> 4.3.2. Protecting Against During-Call Attack
>     Protecting against attacks during a call is a more difficult
>     proposition.  Even if the calling service cannot directly access
>     keying material (as recommended in the previous section), it can
>     simply mount a man-in-the-middle attack on the connection, telling
>     Alice that she is calling Bob and Bob that he is calling Alice, while
>     in fact the calling service is acting as a calling bridge and
>     capturing all the traffic.  While in theory it is possible to
>     construct techniques which protect against this form of attack, in
>     practice these techniques all require far too much user intervention
>     to be practical, given the user interface constraints described in
>     [abarth-rtcweb].
>  [BA] I think it's more than a user intervention/user interface issue.
>  Aside from snooping the signaling to see if the callee includes an
>  "isfocus" tag, how can the browser know if it is calling a conference
>  bridge or not? Personally, I'd remove the "in theory" sentence.

I rewrote to describe how this could in principle happen with
a fingerprint or an IdP.

>  4.2.4. IP Location Privacy
>     Note that as soon as the callee sends their ICE candidates, the
>     caller learns the callee's IP addresses.  The callee's server
>     reflexive address reveals a lot of information about the callee's
>     location.  In order to avoid tracking, implementations may wish to
>     suppress the start of ICE negotiation until the callee has answered.
>     In addition, either side may wish to hide their location entirely by
>     forcing all traffic through a TURN server.
>  [BA] Might be useful to say explicitly that the concern about location
>  privacy is restricted to media; hiding the client's location from the Web
>  server is handled by things like ToR, and signaling privacy is
>  by the signaling protocol (e.g. SIP privacy, etc.).

I added some material here.

> #11: Security vs. Security-Arch
>  At various points within the Security Architecture document, I had a
>  feeling of deju vu, encountering material that was also in the Security
>  doc.  The overlap made me wonder if we should either move some material
>  from Arch to Security, or whether we couldn't live with a single doc
>  instead of two.

I agree it's a bit awkward. I'm taking my guidance on one versus two
documents from the chairs. If the WG wants me to merge, I'll do so.

>  Abstract
>  All but the last sentence of the abstract is identical to that of the
>  Security document.  Would it make some sense to shorten the abstract?
>  Also, references aren't allowed in abstracts.

Shortened it.

>  Section 1
>  The first paragraph duplicates material from the abstract as well as
>  Section 1 of the Security doc, and figure 1 is also included in both
>  However, the security doc doesn't discuss the multidomain case.  Would it
>  make sense to move material from Section 1 of this document to the
>  security doc?  If this were done, would we still need Section 1 in this
>  doc?

As long as these are still separate documents, I'd like to keep this
one as-is. I tried to keep the security document high-level, so I think
this is probably the right split...

>  Section 5.2
>  This section, with its requirements on APIs, seemed similar in spirit to
>  Section 4.1 of the Security doc.

Agreed. The idea of Section 4.1 is to provide analysis and this to
provide normative language. Agreed it's somewhat imperfect.

>  Section 5.4
>  Material in this section (such as the discussion of ToR) seems like it
>  might better belong in Section 4.2.4 of the Security doc.

I added some more high-level material into the security doc.

> #10: Additional Threats
>  Aside from the threats described in the document, a few others come to
>  mind.  It might be useful for the document to state explicitly why or why
>  not these are out of scope:
>  a. Live versus replayed streams.  While you might have media security,
>  this doesn't tell you whether the stream is actually originating from a
>  device or is being replayed.
>  b. Prank calling.  While the document talks about permission to make
>  calls, it doesn't mention permission to receive or blocking of unwanted
>  calls.

I added a section about malicious peers.

> I have made an individual review of the security architecture document
> version 6 and have the following comments.
> 1. In title and in other places, should we use WebRTC as the handle to
> the complete solution?


> 2. Section 7.2 is wrongly labeled, I assume it is -04 version

Unfortunately not.

"Version -04 was a version control mistake.  Please ignore."

> 3. Section 4:
> As is conventional in the Web, all identities are
>    ultimately rooted that system.
> I guess this should be ... rooted in that system?


> 4. Figure 4. The arrow between Bob's browser and Bob's IdP is missalinged.


> 5. Section 4.1:
> Because this is an audio/video call, it creates
>    two MediaStreams, one connected to an audio input and one connected
>    to a video input.
> I think this normally would be one mediaStream but with two
> MediaStreamTracks.


> 6. Section 4.1:
> If Bob agrees [I am ignoring early media for now], a PeerConnection
>    is instantiated with the message from Alice's side.
> I think you can remove the parenthesis. First of all I don't think it
> relavant in the paragraph. Secondly, early media doesn't exist in the
> context of a single peerConnection.


> 7. Section 4.3
> If Alice and Bob authenticated via their IdPs,
>    then they also know that the signaling service is not mounting a man-
>    in-the-middle attack on theor traffic.
> theor / their?


> 8. Section 5.1:
> It is RECOMMENDED that browsers which allow
>    active mixed content nevertheless disable RTCWEB functionality in
>    mixed content settings. [[ OPEN ISSUE:  Should this be a 2119 MUST?
>    It's not clear what set of conditions would make this OK, other than
>    that browser manufacturers have traditionally been permissive here
>    here.]]
> I am actually quite worried about not making this a MUST. Mixed content
> clearly causes significant security holes that I think really should be
> avoided.

Agreed. Changed per WG discussion.

> 9. Section 5.1, similarly I think this applies to the second open issue
> in this section.

Changed. If people disagree, please object.

> 10. Section 5.3:
> [Note:
>    this document takes no position on the split between ICE in JS and
>    ICE in the browser.  The above text is written the way it is for
>    editorial convenience and will be modified appropriately if the WG
>    decides on ICE in the JS.]  The JS MUST NOT be permitted to control
>    the local ufrag and password, though it of course knows it.
> I think you can clean up this statement as there is consensus to leave
> ICE in the browser.


> 11. Section 5.3:
> A separate document will profile the ICE
>    timers to be used; see [I-D.muthu-behave-consent-freshness].
> We should clarify the WG consensus on this document and the scope the
> document has.

I'm not sure I understand this. Is this a proposed change to the
document or a WG action item?

> 12. Section 5.4:
>    API Requirement:  The API MUST provide a mechanism for the calling
>       application JS to indicate that only TURN candidates are to be
>       used.  This prevents the peer from learning one's IP address at
>       all.
> What about the ICE candidates related address information, that should
> be mentioned I think also as possible risk of leaking and what to set it

Good point!

> 13. I am missing a discussion on how to avoid the linkage issues. I
> think we need to point out the known sources to linkage that WebRTC
> adds. I am aware of RTCP CNAMEs and as well as the certificate used in
> DTLS(-SRTP). Their should be recommendations on how to avoid these issues.

See S 5.5 below. Are you looking for something else?

> 14. Section 5.5
>    [OPEN ISSUE:  What should the settings be here?  MUST?]
>    Implementations MAY support SDES for media traffic for backward
>    compatibility purposes.
> This needs to be resolved. I see no issues with retaining the
> possibility for SDES and other schemes. I think however, the downgrade
> issues may need more discussion here. This is a somewhat discussed in
> the security considerations section later.

I plan to hold off on this till after the discussion in Berlin.

> 15. Section 5.5
>    API Requirement:  The API MUST provide a mechanism to indicate that a
>       fresh DTLS key pair is to be generated for a specific call.  This
>       is intended to allow for unlinkability.  Note that there are also
>       settings where it is attractive to use the same keying material
>       repeatedly, especially those with key continuity-based
>       authentication.
> I think this has to do with 13. For example should really the same
> certificate be used with different origins, unless it is a intentionally
> added endpoint certificate that provide verifiable source?

No, it shouldn't be used with different origins except for that.

> 16. Section 5.5
> [largely derived from
>       [I-D.kaufman-rtcweb-security-ui]
> Maybe this is more suitable in a contributors section instead of inline
> in text.


> 17. Section 5.5:
>       *  The "security characteristics" MUST indicate the cryptographic
>          algorithms in use (For example:  "AES-CBC" or "Null Cipher".)
> Does there need to be additional discussion or clarifications on high
> level indications when one uses NULL cipher that doesn't provide
> confidentiality?

Good point. Added a new requirement here.

> 18. Section 5.6.2:
> The details of the mechanism are described in the W3C API
>    specification,
> I think a reference needs to be added here.
> 19. Section 5.6.3
> Todo REF!


> 20. Section
>    The "algorithm" and digest values correspond directly to the
>    algorithm and digest in the a=fingerprint line of the SDP.
> Should there be a reference to where a=fingerprint is defined here?


> Also, does it need to be clarified what "algorithm" and digest
> reference, the ABNF contructs or value for those parameters.

I added "values" here. Is that what you were looking for?

> 21. Section
> This SDP attribute is underspecified. No clear definition of what is
> allowed, no IANA consideration section registering it.

It's just a base64-encoded identity attribute.

> Also, should this SDP attribute also carry the origin of the IdP? I just
> wonder how a non browser can determine which IdP has been providing the
> assertion.

It's in the assertion. Do you think it would be better in the

> 22. Section and following.
> I would like that all these JSON objects would have a bit more crisp
> definitions. Sure, the JSON has certain limiations in what values can be
> used, but it is unclear if the full UTF-8 values can be used in all
> cases, or if there is restrictions on the URL provided, or if it should
> be URI's in fact?

OK. I will circle around with Ted (as a URL expert) and anyone else
who feels expert to try to nail this down.

> 23. I am missing a security discussion around the fact that I can claim
> to have an IdP and load whatever JS code into the proxy. What security
> implications deos this have. Section does provide some basic
> restrictions here. Are more needed?

I tried to hit this in 5.7.4. Generally if you are a site you can
always load new JS into the browser, so this is just part of the
threat model. Can you give me an example of what you're looking for?

> 24. Section
>    idp:  A dictionary containing the domain name of the provider and the
>       protocol string
> What is the definition of dictionary here?

I'm not sure enough of the terminology here. Isn't dictionary
what we call JS associative-array type structures?

> 25. Are there any time limits on responding to SIGN or Verify, or can
> you get applications to hand simply by not responding to a IdP message?

I would hope the browser would produce an error at some point,
but I'm not sure we need to specify here.

> 26. Section
> Are the fields required to be included or not. This is a bit unclear.

Currently it says:

"MUST contain a message field consisting of a dictionary/hash with the
following fields:"

Would you like it to say "which MUST contain the following fields"?

> Section 27. Section
> Open issue to resolve. Please try to resolve this issue by talking to
> some people.

Agreed. This is now on my TODO list.

> 27. Section
> Open issue needing resolution. No opinion.

Will raise this in Berlin.

> 28. Section 5.7.2:
> Missing the linkability issues discussion here.

Added some material.

> 29. Section 5.7.2:
> Combined RTCWEB/Tor
>    implementations SHOULD arrange to route the media as well as the
>    signaling through Tor. [Currently this will produce very suboptimal
>    performance.]


> I think you can remove the parenthis but keep the text here.
> 30. Page 40 SDP example looks strange with no a=idenity attribute,
> instead just the JSON object.

> When reading draft-ietf-rtcweb-security-arch I noticed a couple of things
> that might be missing:
> - According to section in draft-ietf-rtcweb-security there shold
>   be a mechanism to verify that the remote browser has engaged a "secure
>   media mode". This mechanism is not (yet) described in the document.

Good catch. I have to write something here but I haven't yet.

> - Except for paragraph 6 in section 4.1, there is not much said about
>   the implications of the "secure media mode".

This is referring to isolated streams, which is now in gUM. How much
do you think I should say here?

> - An enterprise network with an HTTP proxy may not want external web
>   sites to learn the IP addresses of its hosts using the PeerConnection
>   API. I'm
>   not sure if this kind of attack is relevant or not. If it is then
>   it should be mentioned in Section 5.4. IP Location Privacy.

Added some text.

> I also have some minor comments on the rest of the document:
> - Section 3, 1st paragraph
>    The basic assumption of this architecture is that network resources
>    exist in a hierarchy of trust, rooted in the browser, which serves as
>   The term "hierarchy of trust" is unclear in this context. Are there
>   nodes in this hierarchy except for the browser and web server?

I would claim browser, web server, idp (if any), other network elements.

> - Section 3, 2nd paragraph
>     This is a natural extension of the end-to-end principle.
>   Perhaps it's just me but I don't understand what this end-to-end
>   principle is.

I just removed it since it doesn't add value.

> - Section 4.3, 1st paragraph
>     The total number of channels depends on the amount of
>     muxing; in the most likely case we are using both RTP/RTCP mux and
>     muxing multiple media streams on the same channel, in which case
>     there is only one DTLS handshake.
>    Won't there be separate handshake for the data channel?

My understanding was that we could mux the datachannel onto
the same 5-tuple.

> - Section 5.4, 1st paragraph
>     [...] which leaks large amounts of location information,
>     especially for mobile devices.
>    Why are mobile devices different in this aspect?

They're not. Removed.

> - Section 5.6.4
>   The SDP example contains RTP/AVP but I thought SRTP was made mandatory?

Bit rot. Fixed.

> - Section, 3rd paragraph
>    All requests from the PeerConnection object MUST contain an "id"
>    field which MUST be unique for that PeerConnection object.  Any
>    responses from the IdP proxy MUST contain the same id in response,
>    which allows the PeerConnection to correlate requests and responses.
>   Couldn't the browser implementation handle the message routing
>   without the id field?

This isn't about routing, it's about what happens if there are
multiple outstanding queries.
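
A minimal sketch of why the id matters when several queries are outstanding: the requester records each request under its id and uses the id in the response to find the matching request, whatever order the responses arrive in. The class name and the "type" field are hypothetical and only for illustration; the draft defines the real message format.

```python
import itertools


class PendingQueries:
    """Tracks outstanding requests so responses can arrive in any order."""

    def __init__(self):
        self._next_id = itertools.count(1)  # unique per instance
        self._pending = {}

    def send(self, message: dict) -> dict:
        """Attach a fresh id to the message and remember it as outstanding."""
        request = dict(message, id=next(self._next_id))
        self._pending[request["id"]] = request
        return request

    def receive(self, response: dict) -> dict:
        """Return (and forget) the request that this response answers."""
        return self._pending.pop(response["id"])


queries = PendingQueries()
sign = queries.send({"type": "SIGN"})      # hypothetical message shape
verify = queries.send({"type": "VERIFY"})  # hypothetical message shape
# Responses come back in the opposite order; the id disambiguates them.
assert queries.receive({"id": verify["id"]})["type"] == "VERIFY"
assert queries.receive({"id": sign["id"]})["type"] == "SIGN"
```

Without the id, two responses of the same kind would be indistinguishable to the requester, even if each was delivered to the right object.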

> - Section
>   The mandatory id field is missing in the example


> - Section
>   How is the assertion field transformed into the a=identity attribute?
>   I guess you´re using some form of B64 encoding but this is not expained
>   anywhere.

It's earlier, but I added it.
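
As a rough illustration of the base64 step, the sketch below serializes an assertion object and base64-encodes it into an attribute line, then reverses the process. The JSON serialization, the exact "a=identity:" line syntax, and the example field values are assumptions made for the sketch; the draft's own text is authoritative.

```python
import base64
import json

PREFIX = "a=identity:"  # assumed line syntax, for illustration only


def to_identity_attribute(assertion: dict) -> str:
    """Serialize the assertion as JSON and base64-encode it."""
    raw = json.dumps(assertion, sort_keys=True).encode("utf-8")
    return PREFIX + base64.b64encode(raw).decode("ascii")


def from_identity_attribute(line: str) -> dict:
    """Reverse the encoding: strip the prefix, base64-decode, parse JSON."""
    if not line.startswith(PREFIX):
        raise ValueError("not an a=identity line")
    return json.loads(base64.b64decode(line[len(PREFIX):]))


# Hypothetical assertion: an idp dictionary (domain and protocol) plus an
# opaque assertion string produced by the IdP.
example = {
    "idp": {"domain": "example.org", "protocol": "default"},
    "assertion": "opaque-idp-data",
}
line = to_identity_attribute(example)
assert from_identity_attribute(line) == example
```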

> - Section
>   I think the reason for including request_origin should be explained here
>   instead of in the end of the document.

I added a forward reference instead to avoid breaking up the document

> - Section 5.7.1, 3rd paragraph
>   WebCrypto API lacks a reference.
> - Section
>     Any IdP which uses cookies to persist logins will be broken
>     by third-party cookie blocking.
>   Shouldn't it say "Any IdP which uses third-party cookies"?

IdPs are inherently third-party in this context.

> - Appendix A and B
>   Both appendices appear unfinished. For example, ROAP is still mentioned
>   in othe BrowserID example and it's unclear what client credentials
>   Bob uses in the OAuth example.

Agreed. I have a TODO to clean these up.

> Again, lots of RTCWEB, RTCWeb, etc that should be WebRTC.


> The document introduction seems to be specific to audio and video
> WebRTC sessions.  Later, it becomes clear that the analysis applies
> to the data channel as well.  This should be stated clearly up front
> and any differences between the two should be explained (i.e. DTLS
> vs SRTP encryption).

I added a sentence in S 4.3. This is actually the first significant
discussion of SRTP, so I think this works best.

> "SIP or XMPP" - XMPP isn't a signaling protocol but Jingle is, so that is
> probably what should be mentioned, but Jingle doesn't use SDP.

I just left it as SIP.

> "Web sites whose origin we can verify (optimally via HTTPS, but in some
cases because we are on a topologically restricted network, such as behind
a firewall)"  - what is the 2nd case - no verification?  Verification using
something other than HTTPS?

I added "and can infer authentication from firewall behavior"

> 3.1  middle paragraph is basically saying that security policy can be
applied after authentication - might be worth stating this after the Dr.
Evil analogy.

Added some text.

> 4. "Specifically, Alice and Bob have relationships with
>    some Identity Provider (IdP) that supports a protocol such as OpenID
>    or BrowserID) that can be used to demonstrate their identity to other
>    parties." - needs one more ( to parse correctly.
> Figures 3 & 4.  Would be better to have IdP1 and IdP2 to make clear that
this can be 2 different providers. Also, Secure WebSockets should be shown
with HTTPS.


> In Figures 3 & 4, media is shown as DTLS-SRTP when the media is in fact
SRTP which is keyed with DTLS-SRTP.  In fact, RFC 3711 is not even
referenced, when it should be a normative reference, which is potentially
very confusing and misleading for implementors.  There are many places
where DTLS-SRTP described as if it is more than a key agreement protocol
for SRTP.

I changed this to be "DTLS+SRTP", added a reference to RFC 3711, and
went through each use of DTLS-SRTP for appropriateness, modifying
where I thought necessary. Please let me know if there are places I

> Figure 4 is the Trapezoid of the overview and should be referenced as

Added a back reference to figure 2.

> JS is used but not defined.

Defined it back in the intro.

> 4.1  The first example in this document of microphone and webcam
permissions is that of a permanent grant.  Is this the model we want to
encourage?  Shouldn't the first mention be of the much safer per call
grant?  Later the per call grant is mentioned, but deep in the document.


> 4.1 "This message is sent to the signaling server, e.g., by
XMLHttpRequest[XmlHttpRequest] or by WebSockets [RFC6455] "  Sentence is
missing a period.  Do we really mean HTTPS and Secure WebSockets here?

I added "preferably over TLS".

> 4.1 "This allows the browser to display a trusted element in the browser
chrome indicating that a call is coming in from Alice."  Can this assertion
be made now, prior to the DTLS-SRTP handshake completing?  If this identity
assertion is made at this stage, then a different fingerprint shows up in
the DTLS-SRTP handshake, does the browser chrome then indicate 'sorry, I
made a mistake, it isn't Alice calling'?

I think at that point you just show a network error. The identity
assertion indicates that Alice was trying to call and if someone
is attacking, then that's just a failure.
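
Roughly, and only as a sketch of the behavior I have in mind: the fingerprint bound into the identity assertion is compared with the one seen in the DTLS handshake, and a mismatch is surfaced as a plain failure. The function names and the returned strings below are made up for illustration.

```python
import hmac


def normalize(fingerprint):
    """Strip colon separators and case so equal fingerprints compare equal."""
    return fingerprint.replace(":", "").lower()


def call_outcome(asserted_fingerprint, handshake_fingerprint):
    """Return 'connected' on a match, 'network-error' otherwise.

    Hypothetical outcome labels; the point is that a mismatch is treated
    as a failed call, not as a corrected identity display.
    """
    if hmac.compare_digest(normalize(asserted_fingerprint),
                           normalize(handshake_fingerprint)):
        return "connected"
    return "network-error"
```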

> There are many cases of using [] instead of () for parenthetical text.

I try to use [] for Notes or open issues.

> 4.2 ICE is not defined or referenced in its first occurrence.

Fixed in 4.1.

> typo "theor"
> 5.1 "[[ OPEN ISSUE:  Should this be a 2119 MUST?
>    It's not clear what set of conditions would make this OK, other than
>    that browser manufacturers have traditionally been permissive
>    here.]] "
>    [[ OPEN ISSUE::  Should we be more aggressive about this?]]
> Do these issues need to be discussed in a W3C document where browser
expertise might help us do the right thing?

I think we resolved this in Orlando and the document reflects what I think
the consensus was.

> 5.2  Do we really expect browsers to have X.509 certificates?  Does
anyone do this with DTLS-SRTP today?  Do browsers have UI to manage these

No, we don't. This was a request from Martin because he was expecting
servers might. I added some text to try to make that clear.

> 5.2 The API and UI requirements and elsewhere - should these be in a W3C
document instead of/in addition to this document?

My sense is that IETF was responsible for the security analysis so they
belong here.

> 5.2 "Implementations which support some form of direct user authentication
>    SHOULD also provide a policy by which a user can authorize calls only
>    to specific counterparties."  What is a counterparty?

Changed to "communicating peers."

> 5.3   ICE-Lite is not defined or referenced.  Since this is a MUST, we
need a normative reference - is it what is described as ICE Lite in RFC
5245 or draft-rescorla-mmusic-ice-lite?

It's 5245 again. I added a cite.

> Question:  If a browser has multiple public/private keys, including an
X.509 one, can the JS suggest which one to use for a particular

I think the answer needs to be yes here, but we probably need to clean
up the W3C API a bit to match.

> 5.5 Again, the media channel is secured using SRTP, not DTLS-SRTP,
> which is one way to key SRTP.  In the future, new and better key
> agreements will be developed, and if we write all our specs assuming
> one particular keying method, then they will become obsolete.

Good point. Here is my new text which I hope is clearer:

   Implementations MUST implement SRTP [RFC3711].  Implementations MUST
   implement DTLS [RFC4347] and DTLS-SRTP [RFC5763][RFC5764] for SRTP
   keying.  Implementations MUST implement

   All media channels MUST be secured via SRTP.  Media traffic MUST NOT
   be sent over plain (unencrypted) RTP.  DTLS-SRTP MUST be offered for
   every media channel and MUST be the default; i.e., if an
   implementation receives an offer for DTLS-SRTP and SDES, DTLS-SRTP
   MUST be selected.

   All data channels MUST be secured via DTLS.
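
As a rough illustration of the selection rule in the text above (not normative, and the function and string names are invented for the example): DTLS-SRTP wins whenever it is offered, and plain RTP is never chosen. The text does not say what to do with an SDES-only offer, so the sketch leaves that to an explicit policy flag, which is my assumption.

```python
def select_keying(offered, allow_sdes=False):
    """Pick the SRTP keying method for a media channel.

    offered: set of method names from the offer, e.g. {"DTLS-SRTP", "SDES"}.
    allow_sdes: hypothetical local policy flag for SDES-only offers.
    Raises ValueError when nothing acceptable was offered.
    """
    if "DTLS-SRTP" in offered:
        # DTLS-SRTP is the default and beats SDES when both are offered.
        return "DTLS-SRTP"
    if "SDES" in offered and allow_sdes:
        return "SDES"
    # Plain (unencrypted) RTP is never an acceptable fallback.
    raise ValueError("no acceptable SRTP keying method offered")
```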

> Section 5.6 with Appendix A reads like a completely separate
> document.  It is also very hard reading as there are lots of new
> concepts and ideas.  Should perhaps this be in a separate document?
> The basic security architecture for WebRTC is unlikely to change,
> but given that we have no experience with this IdP system, it seems
> likely that it will evolve and perhaps change significantly, which
> is an argument for a separate document.

I agree the writing is a bit awkward. I'm rewriting App A to make it
clearer and I'll see what I can do about making S 5.6 merge better
as well.

>  Is this document defining a new SDP attribute?  Shouldn't this
be done in MMUSIC?

I have a TODO here. Will consult with the chairs.

> typos "implemementations"  "termnating" "restrcitions"


> WebIntents mentioned without reference or explanation.

WebIntents is OBE. Removed.

> Missing normative reference: RFC 3711.