[radext] Fwd: RE: Fwd: RE: Fwd: RE: Mail reguarding draft-ietf-radext-dynamic-discovery

Stefan Winter <stefan.winter@restena.lu> Wed, 10 July 2013 12:41 UTC

Return-Path: <stefan.winter@restena.lu>
X-Original-To: radext@ietfa.amsl.com
Delivered-To: radext@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 6B5B821F9F02 for <radext@ietfa.amsl.com>; Wed, 10 Jul 2013 05:41:48 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.086
X-Spam-Level:
X-Spam-Status: No, score=-2.086 tagged_above=-999 required=5 tests=[AWL=0.513, BAYES_00=-2.599]
Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id XL1JBt3fa5Ly for <radext@ietfa.amsl.com>; Wed, 10 Jul 2013 05:41:47 -0700 (PDT)
Received: from smtprelay.restena.lu (smtprelay.restena.lu [IPv6:2001:a18:1::62]) by ietfa.amsl.com (Postfix) with ESMTP id E361D21F9EF1 for <radext@ietf.org>; Wed, 10 Jul 2013 05:41:46 -0700 (PDT)
Received: from smtprelay.restena.lu (localhost [127.0.0.1]) by smtprelay.restena.lu (Postfix) with ESMTP id F07A710583 for <radext@ietf.org>; Wed, 10 Jul 2013 14:41:43 +0200 (CEST)
Received: from aragorn.restena.lu (aragorn.restena.lu [IPv6:2001:a18:1:8::155]) by smtprelay.restena.lu (Postfix) with ESMTPS id E1A6810581 for <radext@ietf.org>; Wed, 10 Jul 2013 14:41:43 +0200 (CEST)
Message-ID: <51DD5683.3070202@restena.lu>
Date: Wed, 10 Jul 2013 14:41:39 +0200
From: Stefan Winter <stefan.winter@restena.lu>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130620 Thunderbird/17.0.7
MIME-Version: 1.0
To: "radext@ietf.org" <radext@ietf.org>
References: <88ACDECA21EE5B438CA26316163BC14C25D334A9@BASS.ad.clarku.edu>
In-Reply-To: <88ACDECA21EE5B438CA26316163BC14C25D334A9@BASS.ad.clarku.edu>
X-Enigmail-Version: 1.5.1
X-Forwarded-Message-Id: <88ACDECA21EE5B438CA26316163BC14C25D334A9@BASS.ad.clarku.edu>
Content-Type: multipart/signed; micalg="pgp-sha1"; protocol="application/pgp-signature"; boundary="----enig2RMPPRJRXBEJIDUEGMRWM"
X-Virus-Scanned: ClamAV
Subject: [radext] Fwd: RE: Fwd: RE: Fwd: RE: Mail reguarding draft-ietf-radext-dynamic-discovery
X-BeenThere: radext@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: RADIUS EXTensions working group discussion list <radext.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/radext>, <mailto:radext-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/radext>
List-Post: <mailto:radext@ietf.org>
List-Help: <mailto:radext-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/radext>, <mailto:radext-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 10 Jul 2013 12:41:48 -0000

Hi,

Brian did a VERY thorough analysis of one particular feature of the
draft: loop prevention. See the forward below.

I believe it's worth discussing the question in Berlin (again).

My point so far is: loops can occur, so we should do all we can to
detect and prevent them.  This requires performing all NAPTR -> SRV ->
A/AAAA lookups and looking for inconsistencies in the resulting set.

Brian's counter-point is: loops can also happen outside dynamic
discovery, and will fix themselves by eventually busting RADIUS packet
size limits.  In the interest of speed, we should be less thorough in
finding them and take shortcuts in DNS response evaluation wherever
possible.
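To make the "full enumeration" position concrete: the idea is to walk every NAPTR -> SRV -> A/AAAA branch for a realm and flag any result that points back at one of our own listening sockets.  The sketch below is illustrative only: the zone data, hostnames, and record shapes are invented, and a real server would use an actual resolver rather than these lookup tables.

```python
# Minimal sketch of full DDDS-tree enumeration with a self-address
# check.  All DNS data below is a mocked, pre-fetched zone; the
# realm/host names are hypothetical.

MY_LISTENERS = {("192.0.2.10", 2083)}

# realm -> NAPTR replacement owners
NAPTR = {"example.com": ["_radiustls._tcp.example.com"]}
# SRV owner -> (priority, target, port) RRs
SRV = {"_radiustls._tcp.example.com": [(10, "radius1.example.com", 2083),
                                       (20, "radius2.example.com", 2083)]}
# hostname -> A/AAAA addresses; radius1 resolves to one of our own
ADDR = {"radius1.example.com": ["192.0.2.10"],
        "radius2.example.com": ["198.51.100.7"]}

def enumerate_ddds(realm):
    """Yield every (address, port) reachable from the realm's NAPTR tree."""
    for srv_owner in NAPTR.get(realm, []):
        for _prio, target, port in SRV.get(srv_owner, []):
            for addr in ADDR.get(target, []):
                yield (addr, port)

def find_loops(realm):
    """Return the DDDS results that point back at our own listeners."""
    return sorted(set(enumerate_ddds(realm)) & MY_LISTENERS)
```

With the mocked zone above, `find_loops("example.com")` reports the `radius1` address as a loop.  The cost Brian objects to is visible here: every branch is resolved, including SRV priorities that would never be used in the normal case.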

Greetings,

Stefan Winter

-------- Original Message --------
Subject: RE: [radext] Fwd: RE: Fwd: RE: Mail reguarding draft-ietf-radext-dynamic-discovery
Date: Tue, 9 Jul 2013 19:00:36 +0000
From: Brian Julin <BJulin@clarku.edu>
To: Stefan Winter <stefan.winter@restena.lu>


Stefan,

Having read through the new version of the draft, I now understand what
the draft is trying to achieve WRT loop detection.

Here are my comments on that:

The motivation for performing this loop detection seems to be predominantly
aimed at dealing with mis-configured edge RADIUS proxies which are handling
local realms as though they were remote.

First let us consider the scenario where there are no failures in DNS and no
failures encountered when connecting to servers selected via the DDDS
algorithm.  In this case, only the highest matching order/preference/priority
will be used.  The possible conditions under which a loop might occur in this
scenario are as follows:

1) The DDDS algorithm selects a listening address of the same RADIUS server.
This can be detected by a safeguard against tight loops, applied when
individual connections are initiated and/or received, and does not require
examination of any of the other DDDS results.

2) The next hop is an arrangement of statically routed servers which have a static
route back to a listening address of the initiating server.  In this case there is
no clue present in any of the DDDS results that a loop has occurred.  This could
happen either on backbone federation servers or within an institution where
different RADIUS instances perform different roles.

3) A terminal A record with multiple RRs is selected, one of which is a listening
address on the initiating server.  Since the A lookup needed to be performed
anyway, all addresses are already available for a sanity check.

4) A SRV contains multiple RRs of the same priority, one of which is a listening
address on the initiating server.  Since not all A records referenced by a SRV
are customarily looked up -- only those selected for transit -- this scenario
could result in loops around a load-balancing pool, if said pool was not
configured to refuse connections from itself and other members of the pool.

5) The result of DDDS otherwise forwards the packet to another server that also
performs DDDS for the same service/protocol.  This will always result in a loop,
since DDDS for a given service/protocol must only be performed once.  (What
was said about doing the same thing multiple times and expecting different results?)

For the scenarios where a backup SRV priority or a backtracked NAPTR
order/preference is selected, or a connection failure occurs, the situations are
the same as above, with the addition that in any of the above situations the
results of DDDS could differ on any of the servers involved in the loop.  This
could even happen intentionally or accidentally with a split-horizon DNS server.
There is also the specter of transient results during DNS changes.  None of
these are actual problems: if any of that ever matters, the root problem is
that DDDS is being performed a second time for the same service/protocol during
the routing of the same packet, when that should never be the case.

Note that if a non-fatal DNS failure prevents a RADIUS server from seeing
its own listening address/port in the DDDS results, this can render a sanity
check ineffective.

Also note (per 2 above) that there are several scenarios where the list of addresses/ports
against which the DDDS result would need to be checked to correctly break loops is not
the same as the list of listening sockets on the server performing DDDS -- i.e. any
arrangement of multiple autonomous RADIUS servers that forwards packets internally.

When a RADIUS routing loop occurs, there is no formally defined mechanism
for breaking it.  However, implementations have built in some duplicate
detection for performance reasons (e.g. not sending the retransmits needed
in an unreliable UDP-based environment into a reliable TCP-based environment)
and to deal with network topologies that might generate duplicate UDP traffic.
In FreeRADIUS the duplicate detection relies on the authentication vector and
size of the packet as well as the src/dst ip/port.  This would cut short most
naturally occurring loops after a finite number of iterations -- the exception being
scenarios where hops modify transiting requests in a way that is not
eventually idempotent, without also creating a packet that exceeds any
RADIUS packet format limitations, or implementations that change their source
ports frequently.  RADIUS administrators should be advised not to disable
duplicate checking on incoming TCP connections when DDDS is in use.
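The duplicate-detection key described above can be sketched as follows.  This is not FreeRADIUS code, just an illustration of the principle: the key combines src/dst ip/port, packet length, and the 16-byte Request Authenticator (bytes 4..19 of a RADIUS packet per the RFC 2865 layout), so a looped, unmodified request eventually collides with its earlier self.

```python
# Sketch of duplicate detection keyed on authentication vector,
# packet size, and src/dst ip/port.  'seen' would be a bounded,
# expiring cache in a real server; a plain set keeps the sketch short.

seen = set()

def is_duplicate(src, dst, packet: bytes) -> bool:
    """True if an identical request on the same src/dst pair was seen.

    src and dst are (ip, port) tuples; packet is the raw RADIUS
    datagram, whose bytes 4:20 hold the Request Authenticator.
    """
    authenticator = packet[4:20]
    key = (src, dst, len(packet), authenticator)
    if key in seen:
        return True
    seen.add(key)
    return False
```

As noted above, this breaks loops only when hops leave the packet unchanged (same length, same authenticator) and keep their source ports stable; a hop that rewrites the request on every pass defeats the check.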

Other possible safeguards for implementors to consider could include
per-realm rate limiting on transit hops, with violations leading to a purge
of all requests from that realm for a short period of time and log messages
to that effect.
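A minimal sketch of that per-realm rate limiting, with invented thresholds: a sliding-window counter per realm, and on violation the realm's pending requests are purged and the realm is blocked for a cooldown period.

```python
import time
from collections import defaultdict, deque

# Illustrative thresholds, not recommendations.
WINDOW = 1.0      # seconds per counting window
LIMIT = 100       # max forwards per realm per window
COOLDOWN = 30.0   # block period after a violation

_times = defaultdict(deque)   # realm -> timestamps of recent forwards
_blocked_until = {}           # realm -> monotonic time when unblocked

def allow_forward(realm: str, now: float = None) -> bool:
    """Return False (and start a cooldown) once a realm exceeds LIMIT."""
    now = time.monotonic() if now is None else now
    if _blocked_until.get(realm, 0.0) > now:
        return False
    q = _times[realm]
    while q and now - q[0] > WINDOW:   # expire old entries
        q.popleft()
    if len(q) >= LIMIT:
        _blocked_until[realm] = now + COOLDOWN
        q.clear()   # purge all requests from that realm
        # a real server would emit a loud log message here
        return False
    q.append(now)
    return True
```

The design choice worth noting is that the purge is per realm, not global, so a looping realm cannot starve traffic for well-behaved ones.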

General loop detection in RADIUS is an issue beyond the scope of the draft --
albeit one which does deserve some attention.

While it is a helpful safeguard to prevent a server from trying to connect
to itself, the scenarios sketched out above are likely to be rare enough that
a balance needs to be struck between the performance of the normal, functional,
case and the efficacy of safeguards, which cannot be perfect as DDDS cannot
in and of itself solve the problem.  Given the expense of performing a full
enumeration of the DDDS tree, I suggest that the line should be drawn somewhat
short of that measure, with implementations given the option to be more
thorough where they deem fit.  I'd suggest that servers SHOULD look up and
loop-guard all A/AAAA records for RRs of an active SRV priority, with a hard
failure if even one matches; SHOULD likewise loop-guard against all
A/AAAA addresses retrieved when multiple A/AAAA RRs are returned for the same
owner; and of course MUST never try to forward to the same listening address
upon which the original request was received.
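The SRV-priority loop-guard proposed above might look like this.  The hostnames and the table-based `resolve()` are stand-ins for real A/AAAA lookups (e.g. via `getaddrinfo()`); only the check itself is the point.

```python
# Sketch: before connecting, resolve every target at the active SRV
# priority and hard-fail if any resolved address/port is one of our
# own listening sockets.  All names and addresses are illustrative.

MY_LISTENERS = {("192.0.2.1", 2083), ("2001:db8::1", 2083)}

ADDRS = {"radius-a.example.net": ["203.0.113.5"],
         "radius-b.example.net": ["192.0.2.1"]}   # points back at us

def resolve(host):
    """Mocked A/AAAA lookup; a real server would use getaddrinfo()."""
    return ADDRS.get(host, [])

def check_srv_priority(rrs):
    """rrs: list of (target, port) RRs at the active SRV priority.

    Raises RuntimeError (hard failure) if any A/AAAA result for any
    target in the priority matches one of our listening sockets.
    """
    for target, port in rrs:
        for addr in resolve(target):
            if (addr, port) in MY_LISTENERS:
                raise RuntimeError(
                    "loop detected: %s -> %s:%d is us" % (target, addr, port))
    return True
```

Note the check covers the whole priority, not just the RR selected for transit, which is what closes the load-balancing-pool hole in scenario 4 without enumerating the entire DDDS tree.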

I might further offer that implementations MAY shut down all attempts to connect
to a realm if they are ever asked to connect to themselves for that realm, in a
manner that generates noisy complaints in logs and requires manual intervention
to clear.  In general these problems are of the type that will not self-heal,
and/or they indicate a security incident which merits notification of an
administrator.  Most implementations will probably elect not to permanently
poison a realm for fear that a single spoofed DNS result could turn this into a
DoS.  They should at least be encouraged to log such episodes as security events.

Regards,

Brian S. Julin