Re: [MMUSIC] APPSDIR review of draft-ietf-mmusic-latching-05

Emil Ivov <emcho@jitsi.org> Wed, 28 May 2014 22:32 UTC

Message-ID: <538663F1.4080709@jitsi.org>
Date: Thu, 29 May 2014 00:32:17 +0200
From: Emil Ivov <emcho@jitsi.org>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:24.0) Gecko/20100101 Thunderbird/24.5.0
MIME-Version: 1.0
To: "Vijay K. Gurbani" <vkg@bell-labs.com>, "apps-discuss@ietf.org" <apps-discuss@ietf.org>, draft-ietf-mmusic-latching@tools.ietf.org
References: <537F520E.3060501@bell-labs.com>
In-Reply-To: <537F520E.3060501@bell-labs.com>
Content-Type: text/plain; charset="UTF-8"; format="flowed"
Content-Transfer-Encoding: 8bit
Archived-At: http://mailarchive.ietf.org/arch/msg/mmusic/BwXIvHUZp54mgph1v006lGhmJqU
Cc: IESG IESG <iesg@ietf.org>, mmusic <mmusic@ietf.org>
Subject: Re: [MMUSIC] APPSDIR review of draft-ietf-mmusic-latching-05

Hey Vijay,

Thanks for your review! Comments inline.

On 23.05.14, 15:50, Vijay K. Gurbani wrote:
> I have been selected as the Applications Area Directorate reviewer for
> this draft (for background on appsdir, please see
> http://trac.tools.ietf.org/area/app/trac/wiki/ApplicationsAreaDirectorate).
>
> Please resolve these comments along with any other Last Call comments
> you may receive. Please wait for direction from your document shepherd
> or AD before posting a new version of the draft.
>
> Document: draft-ietf-mmusic-latching-05
> Title: Latching: Hosted NAT Traversal (HNT) for Media in Real-Time
> Communication
> Reviewer: Vijay K. Gurbani
> Review Date: May 23-2014
> IETF Last Call Date: May-28-2014
> IESG Telechat Date: May-29-2014
>
> Summary: This draft is almost ready for publication as an Informational
> RFC but has a few issues that should be fixed before publication.
>
> The draft captures excellent discussions on latching and provides a
> great overview on what it is, how it works, and despite its simplicity,
> why not to use it.  However, the tone of the draft should change from
> colloquial to something more objective (various examples of this in
> the Nits section).
>
> Major: 0
> Minor: numerous (see below)
> Nits:  numerous (see below)
>
> Minor:
>
> - S1, second to last paragraph: Not sure I understand what you mean by
>
>     The SIP signaling-layer component of HNT is typically more
>     implementation-specific and deployment-specific than the SDP and
>     media components.
>
>   SDP is as much of a standard as SIP is, no?  Maybe you mean to say that
>   because of the complexity of representing SIP messages, the SIP portion
>   of a RTC component's stack may vary between implementations and deploy-
>   ments.  SDP, being a simpler protocol (at least syntactically), is not
>   exposed to such vagaries.  Yes?

Yes, exactly. SIP's routing features provide for several ways of doing
NAT traversal for signalling. Some can be entirely transparent to a UA
(basically terminating the session at the SBC), while others could do it
in a more opaque and SIP-y way, for example the one described in:

http://tools.ietf.org/html/rfc3665#section-3.5

On the other hand, there are only so many ways you can change an IP address
in SDP. So, how about the following wording?

    SIP's many features in terms of controlling message routing provide
    for various ways of addressing NAT traversal. As a result, the HNT
    component for SIP is typically more implementation-specific and
    deployment-specific than the SDP and media components.

Would this be better?

> - S3, second paragraph: "newly introduced media relay" --- newly
>   introduced to who?

That was meant from a general perspective. So "newly introduced" to the 
reader.

>   Perhaps you mean "a media relay incorporated into
>   session establishment"?

Right.

> Is this paragraph saying that the SBC and
>   media relay may be co-located on the same host?

It is saying that it is most often the same, that it can also be a 
different one in some cases and that it would almost always be of the 
same address family. In other words: if we confirmed that a UA has IPv4 
access to an SBC it is unlikely that we will risk introducing an IPv6 relay.

So how about this?

    While this is not necessary for HNT to work, quite often, the IP
    address of that media relay may be the same as that of the signaling
    intermediary (i.e. the SIP SBC and media relay are co-located on the
    same host). Also, in almost all cases, the address of the media
    relay would belong to the same IP address family as the one used for
    signaling, as it is known to work for that UA.

Is this better or do you prefer the original?

> - S4, the descriptional steps below the text "The latching mechanism
>   works as follows:" will be improved if you used "UAC" or "UAS" instead
>   of "UA" (I recognize that in SIP it is not necessary that the UAC
>   make the first offer, but these steps are for illustrative purposes
>   and it helps to be as clear as we can for neophyte readers).  Even
>   much better if you could cast the actors in terms of the principals
>   Alice and Bob that appear in Figures 2.  So, something like "After
>   receving an offer from Alice (UAC) who is behind a NAT" is preferable
>   (I think) to "After receiving an offer from a NATed UA".

I agree this can be said better. What would you say about:

    This way, while a session is still being set up, the signalling
    intermediary is not yet aware what addresses and ports the caller
    and the callee would end up using for media traffic: it has only
    seen them advertise the private addresses they use behind their
    respective NATs. Therefore media relays used in HNT would often use
    a mechanism called "latching".

> - S4, the numbered steps in Figure 2: change UAC to Alice in all of
>   the steps, and change UAS to Bob in all of the steps.  Easier for
>   neophyte readers to follow.  Alternatively, you may put "(UAC)"
>   above Alice and "(UAS)" above Bob to impart the same semantics.

Agreed! Would be much better!

> - S4, Figure 2: the text between step 12 and step 13 ("(SBC latches
>   to source IP address and port seen in (10))" --- what is this
>   referring to?  Is it referring to the latching created by Alice?
>   or Bob?  It is not clear, specially since I am not sure if the
>   "(10)" in the text I quote is a typo or not.  Maybe you meant
>   "(7)" or maybe "(2)"?

Latching happens as soon as the first media packet arrives so this meant 
(11). Thanks for the catch and sorry about the confusion!
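
To make the "first media packet" point concrete, here is a minimal Python
sketch of the latching decision on one side of a relay. It is only an
illustration, not text or code from the draft: `RelayLeg`, `on_media_packet`
and `send_target` are hypothetical names, the addresses are documentation
examples, and timeouts, re-latching and RTCP are deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Addr = Tuple[str, int]  # (IP address, UDP port)


@dataclass
class RelayLeg:
    """One side of a hypothetical media relay (e.g. the leg facing Alice)."""

    advertised: Addr                # address:port from SDP, often private
    latched: Optional[Addr] = None  # learned from the first media packet

    def on_media_packet(self, source: Addr) -> bool:
        """Return True if a packet from `source` is accepted for relaying."""
        if self.latched is None:
            # First packet on this leg: latch onto its source address and port.
            self.latched = source
            return True
        # After latching, only packets from the latched source are accepted.
        return source == self.latched

    def send_target(self) -> Optional[Addr]:
        """Where to send media for this UA; None until latching has happened."""
        return self.latched


# Hypothetical example: Alice advertises a private address in her SDP,
# but her packets reach the relay from the NAT's public mapping.
leg = RelayLeg(advertised=("192.168.0.10", 5000))
before = leg.send_target()
first = leg.on_media_packet(("203.0.113.4", 40000))
other = leg.on_media_packet(("203.0.113.99", 50000))
```

In this sketch `send_target()` returns nothing useful until the UA itself has
sent a packet, and the advertised private address is never used as a
destination.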

> - S4, Figure 2, step 13: Much as you do in step 11, you should
>   spell out the "dest" field here as well.  So,
>   s/RTP/RTP, dest=198.51.100.2:22007/

OK, that's going to require some ASCII art magic because we are out of
space there already, but I do agree it would be useful :)

> - S5, first paragraph, line 12: s/onto packets/onto media packets/

Well they don't need to be media. They would be ignored even if they 
weren't. WDYT?

> - S5, first paragraph, line 14: "... a range of IP addresses belonging
>   to the same network..."  Here, "same" as which one?  I suspect you
>   mean the attacker's network.  But being explicit is better here.

It means the same as the one that the source address of the signalling 
packets belongs to. How about this:

    In some cases the
    limitation may be loosened to allow media from a predefined range of
    IP addresses in order to allow for use cases such as decomposed UAs
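
As a rough illustration of the check being discussed, the sketch below
accepts media for latching only if its source IP matches the source of the
signalling, optionally loosened to a predefined range. The function name
`may_latch`, its parameters and the example addresses are assumptions made
for this example; the draft does not define such an API.

```python
import ipaddress
from typing import Optional


def may_latch(media_src_ip: str,
              signaling_src_ip: str,
              allowed_range: Optional[str] = None) -> bool:
    """Decide whether a relay may latch onto a media packet's source IP.

    Strict form: the media must come from the signalling source address.
    Loosened form: a predefined range (CIDR string) is accepted as well,
    e.g. to allow for decomposed UAs.
    """
    media = ipaddress.ip_address(media_src_ip)
    if media == ipaddress.ip_address(signaling_src_ip):
        return True
    if allowed_range is not None:
        # Membership test against the operator-configured range.
        return media in ipaddress.ip_network(allowed_range)
    return False


# Hypothetical addresses from the documentation ranges.
strict_same = may_latch("203.0.113.4", "203.0.113.4")
strict_other = may_latch("203.0.113.9", "203.0.113.4")
loose_in = may_latch("203.0.113.9", "203.0.113.4", "203.0.113.0/24")
loose_out = may_latch("198.51.100.7", "203.0.113.4", "203.0.113.0/24")
```

The wider the configured range, the more source addresses pass the check,
which is the trade-off the proposed wording is meant to flag.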

> - S5, first paragraph, towards the end: "widen the gap for potential
>   attackers..." Here, I suspect you mean that it provides the attackers
>   more IP address from which to mount attacks, i.e., advantage:
>   attackers.  Yes?

Yes. Is this a clarifying question or do you think it would be better if
it were changed that way?

> - S5, second paragraph: s/In all/All/

Oh right! Much better!

> - S5, second paragraph: s/disturb media/impact media/

Thanks!

> - S5, third paragraph:
>   s/since the SBC will not latch onto the attacker's packets./since the
>   SBC will not latch onto the attacker's media packets, not having
>   seen the corresponding signaling packets first./

OK

> - S5, third paragraph: I am not sure I understand case (2).  Why would
>   an SBC latch on to an attacker if it is using restricted latching?

Case 2 is about the attacker being behind the same NAT as the UA, so 
traffic from the attacker would appear legit to the SBC.

>   Teasing this case further, if the legitimate user hangs up the call
>   because it is not getting any media packets, then a BYE should go
>   through the SBC causing it to discard latched states, thereby
>   preventing any more media packets from going to the attacker.

Right, but that's already mentioned in the text:

    the
    legitimate SIP UA will end the call anyway, because a human user
    would not hear anything and will hang up.

>   Assuming that the attacker had inserted itself as a man-in-the-middle
>   and was relaying packets to the UA, then sure, the legitimate user
>   will not hang up and continue conversing.
>
>   The case of a non-human user (answering machine) is redundant since
>   the behaviour of the user (non-human or otherwise) is essentially
>   the same.

Well, automated UAs are often less picky than humans about the kind of 
media they get (if at all) and the intention was to warn about that
option. Happy to remove it if you find it unnecessary.

> - S5, fourth paragraph: "For example, in cases where end-to-end
>   encryption is used it would still be possible for an attacker to
>   hijack a session despite the use of SRTP and perform a denial of
>   service attack.  However, media integrity would not be compromised."
>
>   Can you explain more broadly how the above would work?  If we assume
>   that the endpoints exchange keys end-to-end and create secure channels
>   end-to-end, how would the attacker hijack a session?  (Heartbleed and
>   all such mechanisms aside, of course.  If we assume keys are derived
>   end-to-end and don't follow the hop-by-hop model, how would the
>   attacker prevail?)

End-to-end was probably not the best choice of words here. Imagine 
something like ZRTP though. Given that such mechanisms rely exclusively 
on the media path for key negotiation, there would be no way for the SBC 
to authenticate the UA so it would still latch onto the wrong endpoint. 
Obviously ZRTP's authentication would prevent an actual MitM attack
but given that it's already on the media path, the attacker can simply 
stop relaying media and effectively perform a DoS. Does this make sense?


> Nits:
>
> - Abstract, 3rd paragraph:
>   s/components of the HNT components/components of HNT/

thanks

>   In fact, the entire 3rd paragraph could be written more succinctly as
>   follows:
>
>     Latching, one of the components of HNT, has a number of security
>     issues detailed in this document.  Based on the known threats, the
>     IETF advises against use of this mechanism on the Internet and
>     recommends other solutions, such as the Interactive Connectivity
>     Establishment (ICE) protocol.

That paragraph was the result of literally tens of back and forths so,
unless you think this is paramount, I think it would be better to keep
it as is. Specifically, the above version misses the fact that a)
using SRTP allows the security concerns to be addressed, so the
mechanism becomes acceptable, b) sometimes the security concerns simply
don't come into play (e.g. public conferences), and c) there are
controlled environments where security is taken care of in another way.

> - S1, second paragraph: s/They use IP/These protocols use IP/

OK

> - S1, second paragraph: s/as, in the/as, and in the/

OK

> - S1, fourth paragraph:
>   s/some manufacturers sometimes/a number of manufacturers sometimes/

Thanks

> - S1, fifth paragraph: s/creation time/publication/

OK

> - S1, fifth paragraph: s/in the foreseeable/for the foreseeable/

OK

> - S1, sixth paragraph:
>   s/are not novel to experts/are well known in the community/

OK

> - S1, seventh paragraph:
>   s/In no way does this document/This document does not/

OK

> - S4, first paragraph: s/couple/tuple/

Well, in this case the text does refer to the address:port couple, but I
am OK with changing that to the "address:port/transport" tuple (or triplet?)

> - S4, Figure 3: why not use the principals Alice and Bob here as
>   well instead of "XMPP Client 1" and "XMPP Client 2"?

Frankly the main reason was because it otherwise looks exactly like SIP 
and it becomes confusing. But I don't mind switching to Romeo and Juliet 
(which are XSF's protagonists of choice ;) ).

> - S5, last sentence of second-to-last paragraph:
>   s/However it is sometimes argued that, neither S/MIME nor
>   [RFC4474] are widely deployed and that this may not be
>   a real concern./However, neither S/MIME or [RFC4474] are widely
>   deployed, thus not being able to sign/verify requests appear not
>   to be a concern at this time./

OK.

Thanks again for the review!
Emil


-- 
https://jitsi.org