Re: [secdir] SecDir review of draft-ietf-xmpp-3920bis-17

Peter Saint-Andre <> Fri, 29 October 2010 03:59 UTC

Thanks for your careful and thorough review. To maintain forward
momentum, this is Part 1 of my reply, up through the end of Section 4. I
shall endeavor to reply regarding the remainder of your review in the
next 24-48 hours.

[Copying to keep the WG in the loop...]

On 10/28/10 4:28 AM, Yaron Sheffer wrote:
> I have reviewed this document as part of the security directorate's
> ongoing effort to review all IETF documents being processed by the IESG.
> These comments were written primarily for the benefit of the security
> area directors.  Document editors and WG chairs should treat these
> comments just like any other last call comments.
> The document updates RFC 3920, which is the definition of the core XMPP
> real-time messaging protocol. It should be noted the instant messaging
> and presence are layered on top of this protocol, and specified separately.
> General
> The document is initially intimidating, because of its length. But it is
> extremely well written (for which I would like to thank the editor) and
> well organized. So overall, a good read.


> I have not found anything that I consider a glaring security hole. But
> this is a layered security architecture (application layer == XMPP core
> == SASL == TLS) which is not easy to do right. Hence the large number of
> comments and questions below.
> I also appreciate the open discussion of the existing implementations
> and their security issues (e.g. "server dialback", shiver). I hope this
> document results in a security improvement in real deployments.

We hope so!

> Detailed Comments
> Note: these comments are based on rev -17 of the draft. This only
> matters as far as section numbers, with the only security-relevant
> change in -18 being a useful note on end-to-end protection.
> - 1.3: Implementation note: I suggest adding something like "Solutions
> specified in this document offer a significantly better level of security."

True. In my working copy I've added this sentence (including a forward
reference to the discussion of strong security):

   The solutions specified in this document offer a significantly
   stronger level of security (see also Section 13.6).

> - 3.3: why do we not recommend to use TLS (stateless) session resumption
> for reconnection?

Your suggestion is a good one.

The XMPP WG and broader XMPP developer community have had some
discussions about methods for "quick reconnect". These would involve the
use of TLS session resumption, pipelining of requests from the
initiating entity to the receiving entity, perhaps some SASL tricks (see
for example draft-cridland-sasl-tls-sessions-00), etc. Those discussions
are rather preliminary at this time; however, I think we can safely
recommend the use of TLS session resumption.

I propose that we add the following paragraph at the end of Section 3.3:

   It is RECOMMENDED to make use of TLS session resumption [TLS-RESUME]
   when reconnecting.  A future version of this document, or a separate
   specification, might provide more detailed guidelines regarding
   methods for speeding the reconnection process.

> - 4.2.5 the access control rule at the bottom of this section makes
> sense. Why does it explicitly not apply to elements other than XML
> stanzas? Are there cases where you negotiate TLS with someone other than
> the server? You can even imagine weird tunneling attacks using this
> "feature".
> - 4.2.5: "other than itself" - am I really allowed to send "message"
> stanzas from one resource of a JID to another resource of the same JID,
> before feature negotiation and before TLS negotiation?

I now see that the following paragraph is poorly phrased:

   The initiating entity MUST NOT attempt to send XML stanzas
   (Section 8) to entities other than itself (i.e., the client's
   connected resource or any other authenticated resource of the
   client's account) or the server to which it is connected until stream
   negotiation has been completed.  Even if the initiating entity does
   attempt to do so, the receiving entity MUST NOT accept such stanzas
   and MUST return a <not-authorized/> stream error.  This rule applies
   to XML stanzas only (i.e., <message/>, <presence/>, and <iq/>
   elements qualified by the content namespace) and not to XML elements
   used for stream negotiation (e.g., elements used to complete TLS
   negotiation (Section 5) or SASL negotiation (Section 6)).

There are several points that could be clarified here:

1. You can't send non-stanzas to other entities because (as XMPP is
currently defined) only stanzas have 'to' addresses. The XML elements
that we use for things like TLS negotiation and SASL negotiation are not
addressable to other entities (e.g., a remote server or client) but only
to the server to which a client has connected, so we don't need to
mention them in this paragraph.

2. However, that's contingent on how XMPP is currently defined, i.e., we
assume that there are only stanzas and stream-negotiation elements. This
leaves a bit of a loophole for elements that don't fit in either of
those buckets, e.g., sending <foo:bar to='baz'/> from the client. Is the
server supposed to route or deliver that element to baz? I would say no.
Someone could write a server that allows the communication of arbitrary
XML elements over XML streams, but that's not what 3920bis defines. So I
think we need to tighten that up, although this paragraph is not the
place to do so (I think it belongs in Section 4.7.3 "Other Namespaces").

3. Yes, a server might allow a client to communicate with its bare JID
or other full JIDs after it has authenticated but before it has bound a
resource (e.g., to send a service discovery request so it can discover
which other resources are currently connected). This is not an important
"feature" by any means, but as far as I can see it does not open any
security holes as long as the server requires authentication first. Thus
the phrase "until stream negotiation has been completed" in the quoted
paragraph is misleading.

Taking these considerations together, I propose the following modified
paragraph at the end of Section 4.2.5:

   An initiating entity MUST NOT attempt to send data to entities
   other than itself (i.e., the bare JID of the user's account) or the
   server to which it has connected until it has authenticated with the
   receiving entity.  If the initiating entity attempts to do so, the
   receiving entity MUST NOT accept such data and MUST close the stream
   with a <not-authorized/> stream error.
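
For illustration only (this sketch is not part of the proposed text), the
rule above can be read as a simple address check on the receiving entity.
The function name, its parameters, and the example addresses are
hypothetical, and how a server learns the account's bare JID before
authentication is outside the scope of the sketch:

```python
from typing import Optional


def may_accept(to: Optional[str], authenticated: bool,
               account_bare_jid: Optional[str], server_domain: str) -> bool:
    """Return True if the receiving entity may accept data addressed to `to`.

    Hypothetical helper: mirrors the proposed rule that, until it has
    authenticated, an initiating entity may address only itself (the bare
    JID of its account) or the server to which it has connected.
    """
    if authenticated:
        return True  # the rule restricts only unauthenticated entities
    if to is None:
        return True  # no 'to' address: the data is for the server itself
    allowed = {server_domain}
    if account_bare_jid is not None:
        allowed.add(account_bare_jid)
    return to in allowed


def stream_error_for(to, authenticated, account_bare_jid, server_domain):
    """Name the stream error to close with, or None if the data is acceptable."""
    if may_accept(to, authenticated, account_bare_jid, server_domain):
        return None
    return "not-authorized"
```

The sketch compares addresses literally, following the "bare JID" wording
of the proposed paragraph; a server that also allows other full JIDs of
the same account would need a looser comparison.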

Then I propose that we modify Section 4.7.3 as follows:


4.7.3.  Other Namespaces

   Either party to a stream MAY send data qualified by namespaces other
   than the content namespace and the streams namespace.  For example,
   this is how data related to TLS negotiation and SASL negotiation are
   exchanged, as well as XMPP extensions such as Stream Management
   [XEP-0198] and Server Dialback [XEP-0220].  (For historical reasons,
   some server implementations expect a declaration of the 'jabber:
   server:dialback' namespace on server-to-server streams, as explained
   in [XEP-0220].)

   However, an XMPP server MUST NOT route or deliver data received over
   an input stream if that data is (a) qualified by another namespace
   and (b) addressed to an entity other than the server, unless the
   other party to the output stream over which the server would send the
   data has explicitly negotiated or advertised support for receiving
   arbitrary data from the server.  This rule is included because XMPP
   is designed for the exchange of XML stanzas (not arbitrary XML data),
   and because allowing an entity to send arbitrary data to other
   entities could significantly increase the potential for exchanging
   malicious information.  As an example of this rule, the server
   hosting the domain would not route the following first-
   level XML element from <> to <>:

     <ns1:foo xmlns:ns1=''

   This rule also applies to first-level elements that look like stanzas
   but that are improperly namespaced and therefore really are not
   stanzas at all (see also Section 4.7.4), for example:

     <ns2:message xmlns:pre=''

   Upon receiving arbitrary first-level XML elements over an input
   stream, a server MUST either ignore the data or return a stream
   error, which SHOULD be <unsupported-stanza-type/>.
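
For illustration only, the routing rule in this proposed section could be
checked roughly as follows. The content namespaces 'jabber:client' and
'jabber:server' come from RFC 3920; the function names, the
peer_accepts_arbitrary_data flag, and the example namespace URIs used in
testing are assumptions made for the sketch:

```python
import xml.etree.ElementTree as ET

# Content namespaces defined for XMPP client and server streams.
CONTENT_NAMESPACES = {"jabber:client", "jabber:server"}


def namespace_of(element: ET.Element) -> str:
    """Return the namespace URI of an element ('' if it is unqualified)."""
    tag = element.tag
    # ElementTree renders qualified names as "{namespace}localname".
    return tag[1:].partition("}")[0] if tag.startswith("{") else ""


def forbidden_to_route(element: ET.Element, server_domain: str,
                       peer_accepts_arbitrary_data: bool = False) -> bool:
    """Return True if the server must not route or deliver this element."""
    to = element.get("to")
    # Condition (a): qualified by a namespace other than the content namespace.
    other_namespace = namespace_of(element) not in CONTENT_NAMESPACES
    # Condition (b): addressed to an entity other than the server.
    addressed_elsewhere = to is not None and to != server_domain
    return (other_namespace and addressed_elsewhere
            and not peer_accepts_arbitrary_data)
```

An element that merely has the local name "message" but the wrong
namespace fails condition (a) in the same way as any other arbitrary
element, which matches the second example in the proposed text.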


> - 4.2.6: why do we even allow non-SASL protected server-to-server
> communication?

Because, in practice, most existing services end up using Server
Dialback, not TLS + SASL EXTERNAL with PKIX certificates.

> - 4.3: How is TLS negotiated for the additional streams? 

Each stream is separately secured.

> How is it bound
> to the SASL negotiation that (apparently) only takes place once?

It isn't, because each stream is separately secured (preferably by means
of TLS + SASL).

I propose that we clarify these matters by modifying the following
paragraphs from Section 4.3 ("Directionality"):


4.3.  Directionality

   An XML stream is always unidirectional, by which is meant that XML
   stanzas can be sent in only one direction over the stream (either
   from the initiating entity to the receiving entity or from the
   receiving entity to the initiating entity).

   Depending on the type of session that has been negotiated and the
   nature of the entities involved, the entities might use:

   o  Two streams over a single TCP connection, where the security
      context negotiated for the first stream is applied to the second
      stream.  This is typical for client-to-server sessions, and a
      server MUST allow a client to use the same TCP connection for both
      streams.

   o  Two streams over two TCP connections, where each stream is
      separately secured.  In this approach, one TCP connection is used
      for the stream in which stanzas are sent from the initiating
      entity to the receiving entity, and the other TCP connection is
      used for the stream in which stanzas are sent from the receiving
      entity to the initiating entity.  This is typical for server-to-
      server sessions.

   o  Multiple streams over two or more TCP connections, where each
      stream is separately secured.  This approach is sometimes used for
      server-to-server communication between two large XMPP service
      providers; however, this can make it difficult to maintain
      coherence of data received over multiple streams in situations
      described under Section 10.1, which is why a server MAY return a
      <conflict> stream error to a remote server that attempts to
      negotiate more than one stream (as described under

   This concept of directionality applies only to stanzas and explicitly
   does not apply to first-level children of the stream root that are
   used to bootstrap or manage the stream (e.g., first-level elements
   used for TLS negotiation, SASL negotiation, Server Dialback
   [XEP-0220], and Stream Management [XEP-0198]).

   The foregoing considerations imply that while completing STARTTLS
   negotiation (Section 5) and SASL negotiation (Section 6) two servers
   would use one TCP connection, but after the stream negotiation
   process is done that original TCP connection would be used only for
   the initiating server to send XML stanzas to the receiving server.
   In order for the receiving server to send XML stanzas to the
   initiating server, the receiving server would need to reverse the
   roles and negotiate an XML stream from the receiving server to the
   initiating server over a separate TCP connection.


> - 4.4: this section appears to tie the two streams in the opposite
> directions together - when you close one you expect the other guy to
> close the other ASAP. But what is the behavior for "multiple streams
> over multiple connections" (Sec. 4.3)?

That is an excellent question. Section 4.4 currently assumes that there
is a defined pairing of streams because that is what happens in well
over 99% of the cases (I'd venture to guess that using multiple streams
over multiple TCP connections is extremely rare and happens only for
some high-volume server-to-server links). IMHO we don't have enough
experience with this case to recommend server behavior, although it
would be good to gather feedback from any service providers who are
using multiple streams over multiple TCP connections. Therefore I
propose that we modify Section 4.4 as follows:


   If the parties are using either two streams over a single TCP
   connection or two streams over two TCP connections, the entity that
   sends the closing stream tag SHOULD behave as follows:

   1.  Wait for the other party to also close its stream before
       terminating the underlying TCP connection(s); this gives the
       other party an opportunity to finish transmitting any data in the
       opposite direction before the TCP connection(s) is terminated.

   2.  Refrain from initiating the sending of further data over that
       stream but continue to process data sent by the other entity
       (and, if necessary, react to such data).

   3.  Consider both streams to be void if the other party does not send
       its closing stream tag within a reasonable amount of time (where
       the definition of "reasonable" is a matter of implementation or

   4.  After receiving a reciprocal closing stream tag from the other
       party or waiting a reasonable amount of time with no response,
       terminate the underlying TCP connection(s).

   If the parties are using multiple streams over multiple TCP
   connections, there is no defined pairing of streams and therefore the
   behavior is a matter for implementation.
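
For illustration only, the paired-stream closing behavior described above
reduces to a small decision made after the closing stream tag has been
sent. The names below and the 30-second default are assumptions; the
proposed text deliberately leaves "reasonable" undefined:

```python
from enum import Enum


class Action(Enum):
    KEEP_PROCESSING = "send nothing new, keep processing inbound data"
    TERMINATE_TCP = "terminate the underlying TCP connection(s)"


def after_sending_close(peer_closed: bool, seconds_waited: float,
                        timeout: float = 30.0) -> Action:
    """Decide what the closing entity does once its closing tag is sent.

    `timeout` stands in for the "reasonable amount of time" of steps 3
    and 4; its default value is an arbitrary placeholder.
    """
    if peer_closed:
        # Step 4: the reciprocal closing stream tag has arrived.
        return Action.TERMINATE_TCP
    if seconds_waited >= timeout:
        # Steps 3 and 4: consider both streams void, then terminate.
        return Action.TERMINATE_TCP
    # Steps 1 and 2: wait, but continue to process what the peer sends.
    return Action.KEEP_PROCESSING
```

The sketch covers only the paired cases; with multiple streams over
multiple TCP connections there is no defined pairing to reason about.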


> - 4.4: what about orderly tear-down of the TLS association ("closure
> alert")?

Sending of the closure alert is mandated by RFC 4346. Do you think it is
necessary (or would be helpful) to call it out in the specification of
an application protocol that re-uses TLS?

I note that Section 7.2.1 ("Closure Alerts") of RFC 4346 states in part:

   If the application protocol using TLS provides that any data may be
   carried over the underlying transport after the TLS connection is
   closed, the TLS implementation must receive the responding
   close_notify alert before indicating to the application layer that
   the TLS connection has ended.  If the application protocol will not
   transfer any additional data, but will only close the underlying
   transport connection, then the implementation MAY choose to close the
   transport without waiting for the responding close_notify.  No part
   of this standard should be taken to dictate the manner in which a
   usage profile for TLS manages its data transport, including when
   connections are opened or closed.

I note also that rule #2 in Section 4.4 of 3920bis states:

   2.  Refrain from initiating the sending of further data over that
       stream but continue to process data sent by the other entity
       (and, if necessary, react to such data).

This implies that the entity that is closing the stream might send more
data and certainly might receive data (on the other stream). Thus I think
we do need to clarify use of closure alerts. I propose adding the
following security note to Section 4.4:

      Security Note: In accordance with Section 7.2.1 of [TLS], to help
      prevent a truncation attack the party that is closing the stream
      MUST send a TLS close_notify alert and MUST receive a responding
      close_notify alert from the other party before closing the
      underlying TCP connection(s).

> - does not-authorized only refer to stream-level, rather than
> stanza-level errors? Are there cases when I am authorized to send some
> stanza types but not others?

There are three <not-authorized/> error conditions, each qualified by a
different namespace:

1. The stream error condition is triggered by trying to send data before
completing authentication.

2. The SASL error condition is triggered by providing a bad username or
incorrect credentials.

3. The stanza error condition is triggered by attempting to complete an
application-layer action that requires authentication (e.g., attempting
to join a password-protected chatroom without providing the password).

A server could restrict a user's ability to send different types of
stanzas (e.g., a presence-only XMPP service could return a stanza error
if a user attempts to send a <message/> stanza, but in that case it's a
service-wide policy and the <not-allowed/> stanza error condition is
more appropriate). In general, the <not-authorized/> stanza error
condition implies that the sender could do *something* to modify the
data it sent in order to prove that it is in fact authorized to perform
the action requested in the stanza (e.g., include a password in the
chatroom join request).
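
For illustration only, a receiver can tell the three conditions apart by
the namespace that qualifies the <not-authorized/> element. The namespace
URIs below are those defined in RFC 3920 for stream errors, SASL, and
stanza errors (they are not quoted in this message), and the function
name is hypothetical:

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Namespace of the <not-authorized/> element -> protocol layer it belongs to.
NOT_AUTHORIZED_LAYER = {
    "urn:ietf:params:xml:ns:xmpp-streams": "stream",
    "urn:ietf:params:xml:ns:xmpp-sasl": "sasl",
    "urn:ietf:params:xml:ns:xmpp-stanzas": "stanza",
}


def not_authorized_layer(xml_text: str) -> Optional[str]:
    """Return 'stream', 'sasl', or 'stanza' for a <not-authorized/> element.

    Returns None if the element is not <not-authorized/> or is qualified
    by a namespace that the sketch does not know about.
    """
    element = ET.fromstring(xml_text)
    # ElementTree renders qualified names as "{namespace}localname".
    namespace, _, local = element.tag.lstrip("{").partition("}")
    if local != "not-authorized":
        return None
    return NOT_AUTHORIZED_LAYER.get(namespace)
```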

> - remote-connection-failed has dubious security benefit (why
> tell the world that your RADIUS server is down), compared to reusing
> internal-error.

It's not an internal error because the source of the failure is a remote
entity. As far as I know, this error condition has been used only in the
context of server dialback (in the case where a stream error occurs
between the Receiving Server and the Authoritative Server), but we've
tried to scrub the-protocol-that-shall-not-be-named from 3920bis
wherever possible. Doing so here has evidently caused confusion, so I
propose that we modify the description as follows:

   The server is unable to properly connect to a remote entity that is
   needed for authentication or authorization (e.g., in certain
   scenarios related to Server Dialback [XEP-0220]); this condition is
   not to be used when the cause of the error is within the
   administrative domain of the XMPP service provider, in which case the
   <internal-server-error/> condition is more appropriate.

> - shouldn't we say that "reset" (when the stream is encrypted)
> also applies to the higher layers, i.e. encryption and authentication
> should be performed again?

Good point. I propose that we modify the description as follows:

   The server is closing the stream because it has new (typically
   security-critical) features to offer, because the keys or
   certificates used to establish a secure context for the stream have
   expired or have been revoked during the life of the stream
   (Section, because the TLS sequence number has wrapped
   (Section 5.3.5), etc.  The reset applies to the stream and to any
   security context established for that stream (e.g., via TLS and
   SASL), which means that encryption and authentication need to be
   negotiated again for the new stream (e.g., TLS session resumption
   cannot be used).

> - what are the security implications of a "redirect"? 

Of <see-other-host/>? Seemingly underspecified. :(

> Should
> the client apply the same policy, e.g. for using TLS, as for the
> original server? 


> Which "to" identity to use? 

The 'to' address of the initial stream header would still be the DNS
domain name of the XMPP service to which the initiating entity is trying
to connect (see also draft-saintandre-tls-server-id-check).

> Can redirection occur
> before the recipient is even authenticated?


I propose that we modify the description as follows:

   The server will not provide service to the initiating entity but is
   redirecting traffic to another host under the administrative control
   of the same service provider.  The XML character data of the <see-
   other-host/> element returned by the server MUST specify the
   alternate hostname or IP address at which to connect, which MUST be a
   valid domainpart or a domainpart plus port number (separated by the
   ':' character in the form "domainpart:port").  If the domainpart is
   the same as the source domain, derived domain, or resolved IP address
   to which the initiating entity originally connected (differing only
   by the port number), then the initiating entity SHOULD simply attempt
   to reconnect at that address.  Otherwise, the initiating entity MUST
   resolve the hostname specified in the <see-other-host/> element as
   described under Section 3.2.
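
For illustration only, the "domainpart" or "domainpart:port" form
described above could be handled along these lines. The function names
are hypothetical, and the sketch does not attempt to handle IPv6 address
literals, which themselves contain ':' characters:

```python
from typing import Iterable, Optional, Tuple


def parse_see_other_host(text: str) -> Tuple[str, Optional[int]]:
    """Split <see-other-host/> character data into (domainpart, port).

    Assumption: the data is "domainpart" or "domainpart:port"; IPv6
    literals are out of scope for this sketch.
    """
    text = text.strip()
    host, sep, port = text.rpartition(":")
    if sep and port.isdigit():
        if not host:
            raise ValueError("missing domainpart")
        return host, int(port)
    if not text:
        raise ValueError("empty redirect target")
    return text, None


def must_resolve(redirect_host: str, original_hosts: Iterable[str]) -> bool:
    """Return True if the redirect target needs fresh resolution.

    `original_hosts` holds the source domain, derived domain, and resolved
    IP address of the original connection attempt; if the redirect differs
    from one of them only by port, the entity can simply reconnect.
    """
    known = {host.lower() for host in original_hosts}
    return redirect_host.lower() not in known
```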

I also propose that we add the following paragraph to the end of the

   When negotiating a stream with the host to which it has been
   redirected, the initiating entity MUST apply the same policies it
   would have applied to the original connection attempt (e.g., a policy
   requiring TLS), MUST specify the same 'to' address on the initial
   stream header, and MUST verify the identity of the new host using the
   same reference identifier(s) it would have used for the original
   connection attempt (in accordance with [TLS-CERTS]).  Even if the
   receiving entity returns a <see-other-host/> error before the
   confidentiality and integrity of the stream have been established
   (thus introducing the possibility of a denial of service attack), the
   fact that the initiating entity needs to verify the identity of the
   XMPP service based on the same reference identifiers implies that the
   initiating entity will not connect to a malicious entity; however, to
   reduce the possibility of a denial of service attack, the receiving
   entity SHOULD NOT return a <see-other-host/> error until after the
   stream has been protected (e.g., via TLS).
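
For illustration only, one way for an initiating entity to honor the
paragraph above is to treat a redirect as changing only where it connects,
never what it expects. The ConnectionAttempt type, its field names, and
the example hostnames are hypothetical:

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class ConnectionAttempt:
    connect_host: str                       # where the TCP connection goes
    connect_port: Optional[int]
    to_address: str                         # 'to' of the initial stream header
    require_tls: bool                       # local policy
    reference_identifiers: Tuple[str, ...]  # identities to check in the cert


def follow_redirect(original: ConnectionAttempt, new_host: str,
                    new_port: Optional[int]) -> ConnectionAttempt:
    """Build the attempt for the redirected host.

    Only the connection target changes; the 'to' address, the TLS policy,
    and the reference identifiers are carried over unchanged.
    """
    return replace(original, connect_host=new_host, connect_port=new_port)
```

Because the reference identifiers are still derived from the original
service name, a redirect received over an unprotected stream cannot by
itself cause the entity to accept a certificate for some other name.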

> - 4.9: Don't we resend the "stream" header again after completing the
> TLS negotiation (Sec. 4.2.3).

For sure. I propose that we change this:

   [ ... channel encryption ... ]

   [ ... authentication ... ]

   [ ... resource binding ... ]


   [ ... stream negotiation ... ]

### END OF PART 1 ###