Re: [BEHAVE] WGLC: draft-ietf-behave-turn-tcp-04

Dave Thaler <dthaler@microsoft.com> Mon, 24 August 2009 16:50 UTC

Return-Path: <dthaler@microsoft.com>
X-Original-To: behave@core3.amsl.com
Delivered-To: behave@core3.amsl.com
Received: from localhost (localhost [127.0.0.1]) by core3.amsl.com (Postfix) with ESMTP id 59CD228C162 for <behave@core3.amsl.com>; Mon, 24 Aug 2009 09:50:44 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -10.401
X-Spam-Level:
X-Spam-Status: No, score=-10.401 tagged_above=-999 required=5 tests=[AWL=0.198, BAYES_00=-2.599, RCVD_IN_DNSWL_HI=-8]
Received: from mail.ietf.org ([64.170.98.32]) by localhost (core3.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id n3NHLIneMcTl for <behave@core3.amsl.com>; Mon, 24 Aug 2009 09:50:43 -0700 (PDT)
Received: from smtp.microsoft.com (mailc.microsoft.com [131.107.115.214]) by core3.amsl.com (Postfix) with ESMTP id 1F2B13A6C26 for <behave@ietf.org>; Mon, 24 Aug 2009 09:50:43 -0700 (PDT)
Received: from TK5EX14HUBC101.redmond.corp.microsoft.com (157.54.7.153) by TK5-EXGWY-E803.partners.extranet.microsoft.com (10.251.56.169) with Microsoft SMTP Server (TLS) id 8.2.176.0; Mon, 24 Aug 2009 09:50:34 -0700
Received: from TK5EX14MLTW651.wingroup.windeploy.ntdev.microsoft.com (157.54.71.39) by TK5EX14HUBC101.redmond.corp.microsoft.com (157.54.7.153) with Microsoft SMTP Server id 14.0.621.7; Mon, 24 Aug 2009 09:50:33 -0700
Received: from TK5EX14MBXW653.wingroup.windeploy.ntdev.microsoft.com ([169.254.3.25]) by TK5EX14MLTW651.wingroup.windeploy.ntdev.microsoft.com ([157.54.71.39]) with mapi; Mon, 24 Aug 2009 09:50:36 -0700
From: Dave Thaler <dthaler@microsoft.com>
To: Simon Perreault <simon.perreault@viagenie.ca>
Thread-Topic: [BEHAVE] WGLC: draft-ietf-behave-turn-tcp-04
Thread-Index: AcoQ8pHnO58KBYnbRjO0tt9j9fIZ2QHYMcMQAyuQhgAAClQMkA==
Date: Mon, 24 Aug 2009 16:50:28 +0000
Message-ID: <E4561B14EE2A3E4E9D478EBFB5416E1B1B35B1@TK5EX14MBXW653.wingroup.windeploy.ntdev.microsoft.com>
References: <0a9901ca10f2$93cdab60$5f7d150a@cisco.com> <E4561B14EE2A3E4E9D478EBFB5416E1B197488@TK5EX14MBXW653.wingroup.windeploy.ntdev.microsoft.com> <4A92A330.1080103@viagenie.ca>
In-Reply-To: <4A92A330.1080103@viagenie.ca>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Cc: "draft-ietf-behave-turn-tcp@tools.ietf.org" <draft-ietf-behave-turn-tcp@tools.ietf.org>, "behave@ietf.org" <behave@ietf.org>
Subject: Re: [BEHAVE] WGLC: draft-ietf-behave-turn-tcp-04
X-BeenThere: behave@ietf.org
X-Mailman-Version: 2.1.9
Precedence: list
List-Id: mailing list of BEHAVE IETF WG <behave.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/listinfo/behave>, <mailto:behave-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/behave>
List-Post: <mailto:behave@ietf.org>
List-Help: <mailto:behave-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/behave>, <mailto:behave-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 24 Aug 2009 16:50:44 -0000

Simon Perreault write:
> Dave Thaler wrote, on 08/08/09 02:29 PM:
> > 1) Are data connections to be closed gracefully or abortively,
>
> The draft is intentionally silent about this. It is expected that
> "closing a
> connection" will be done by following Figure 6 in RFC 793 (the arrow
> labeled
> CLOSE below the ESTAB state), but the server is free to do otherwise as
> long as
> it is valid TCP.
>
> Do you have a reason in mind why we should be more explicit?

I think you should be explicit that either is ok, rather than it
looking like an oversight that will generate questions like mine.

> >    and does this depend on how the corresponding (peer vs client)
> >    connection was closed?
>
> Note that requiring this would probably not be implementable in non-
> root user-space.

I wouldn't go there, as that's very OS-specific.

> > 2) When error 447 is returned, section 4.3 says the client MUST
> >    wait at least 10 seconds before retrying.  I think this is
> >    for the _same_ destination IP/port, right?  If the client
> >    has other destination IP/port pairs to try, they can be
> >    tried immediately, correct?
>
> Correct. I have change the wording:
>
> The client MAY retry *with the same XOR-PEER-ADDRESS attribute*, but
> MUST wait
> at least 10 seconds.

Looks ok.

> > 3) Section 4.6 notes that no keepalive mechanism is provided
> >    and an application can do this at a higher layer.  For
> >    NATs between a TURN client and a TURN server, why wouldn't
> >    TCP keepalives be a possibility?
>
> Because in non-root user-space, TCP keepalives as provided by the OS
> are nearly
> worthless in many important cases because they cannot be setup per-
> socket. They
> are setup system-wide and you need to be root to change the timeout.

Um, no.  Again, this is OS specific, and in fact it's per-socket
configurable on Windows:
http://msdn.microsoft.com/en-us/library/dd877220(VS.85).aspx

This should be rephrased to allow for the possibility
that the OS supports per-socket TCP keepalive intervals.

> Basically it's another item in the list of things with STUN/TURN that
> are the
> way they are because of the requirement that a server be implementable
> in
> non-root user-space.
>
> > 4) If I understand the 3rd paragraph of section 5.2, correctly,
> >    there's an important limitation/constraint that should be
> >    called out early (e.g. in the overview).  Normally, TCP
> >    clients can open multiple parallel connections (e.g., to
> >    a web server) to the same destination address/port.  To
> >    accomplish this with TURN, it seems that the client needs
> >    multiple control connections so it can get multiple
> >    allocations, since it can only have one connection to a
> >    given peer address/port per allocation, correct?
>
> Yes, this is correct. It follows naturally from the fact that a 5-tuple
> uniquely
> identifies a connection, and that the local address and port are fixed
> from the
> moment the allocation is created.
>
> How about adding the following sentence to the third paragraph of
> section 3?
>
> "Note that a maximum of one connection to a given peer (address and
> port
> combination) can be established per allocation."

Ok.

> > 5) The 5th paragraph of Section 5.2 also implies that it's
> >    illegal for a TURN server to refuse to initiate connections
> >    based on the XOR-PEER-ADDRESS value.  For example, if it
> >    refuses to initiate connections to 127.0.0.1 it would be
> >    violating a MUST.  I'd like to see this relaxed.
>
> How about this modified text?
>
> "Otherwise, *and if the new connection is permitted by local policy,*
> the server
> MUST initiate an outgoing TCP connection."
>
> This lets the server filter by peer address as well as other criteria.

Yep, that looks great.

> > 6) The draft talks about timing out a peer connection after
> >    30 seconds in section 5.3 if the client never does a
> >    ConnectionBind for a connection initiated by a peer.
> >    However, section 5.2 has no corresponding rule.  This
> >    means a client can send a Connect request, and then
> >    never open a data connection, and the server will never
> >    time out the peer connection.  The same rule needs to be
> >    added to 5.2 to address this.
>
> Good catch. I've copied the paragraph in question to the end of section
> 5.2.

Ok.

> > 7) Continuing on the same point, the security considerations
> >    section needs to add a discussion of a memory DoS attack
> >    due to the TURN server having to buffer data from a peer
> >    while waiting for a ConnectionBind (which may never happen).
>
> Actually, this highlights a bug in the current spec (section 5.2,
> paragraph 7):
>
>    The server MUST buffer any data received from the peer.
>    Data MUST NOT be lost.  It is up to the server to adjust its
>    advertised TCP receive window should the buffer size become a
>    limiting factor.
>
> If it is not not lose data, the server must rely on the peer obeying
> the TCP
> window. This is bad. Instead, I propose the following:
>
>    The server MUST buffer any data received from the peer.
>    Data MUST NOT be lost unless the buffer is about to exceed a limit
> defined
>    by local policy, in which case the data connection MUST be closed.
> The
>    server adjusts its advertised TCP receive window to reflect the
> amount of
>    empty buffer space.
>
> Similar wording would be applied to section 5.3, paragraph 3.

Ok. I would assume it's legal for the server to just not do a read()
until after a ConnectionBind request is received from the client,
thereby making the OS handle the buffering, right?

> How about adding the following text to the security considerations
> section?
>
>     After a TCP connection is established between the server and a
> peer, and
>     before a ConnectionBind request is received from the client, the
> server
>     buffers all data received from the peer. This protocol
> specification lets
>     the server drop the connection if the buffer size is about to
> exceed a
>     limit defined by local policy. This policy should ensure that
> memory
>     resources are not exceeded.
>
> >    Currently it has references to docs that contain
> >    discussion of bandwidth DoS attacks, but not memory
> >    DoS attacks, as far as I can find.
>
> The above paragraph would indeed be much improved with such a
> reference. Could
> you provide one?

How about [RFC4732] section 2.1?

-Dave