[radext] Review of draft-ietf-radext-radiusdtls

Alan DeKok <aland@deployingradius.com> Mon, 11 March 2024 13:57 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/radext/2jgS0Np5MF1AJLpylUtRbs4niko>

  Now that I have a little bit of time, I'll do a detailed review of this document.  I still need to compare it to the previous docs and correlate missing / copied / added text, but that can wait a little bit.

  I'll leave nits for later.

1.1

	 ... where RADIUS packets need to be transferred through different administrative domains and untrusted, potentially hostile networks.

  The "and" here should be "or" instead.  Perhaps even just

	... where RADIUS packets need to be sent across insecure or untrusted networks.

1.2

	... 	• RFC6614 marked TLSv1.1 or later as mandatory, this specification requires TLSv1.2 as minimum and recommends usage of TLSv1.3

  We should mandate that RADIUS servers use TLS 1.3.  If we don't do that, then there is little incentive for clients to upgrade, because servers won't support it.

	... 	• RFC6614 allowed usage of TLS compression, this document forbids it.

  Good, but why?  Perhaps "it was not found to be useful", or "no one implemented it".

2.

	... Client implementations SHOULD implement both, but MUST only implement one of RADIUS/TLS or RADIUS/DTLS.

  I'm not sure why clients can't support both?  Perhaps

	...  but MUST implement at least one of RADIUS/TLS or RADIUS/DTLS.

3.1

	... The requirement that RADIUS remain largely unchanged ensures the simplest possible implementation and widest interoperability of the specification.

  It would be worth repeating here that MD5 is bad, and that people should be aware of the security issues.  See the "Security Considerations" section, and the ALPN doc.

3.2

	... RADIUS/(D)TLS does not use separate ports for authentication, accounting and dynamic authorization changes. The source port is arbitrary. 

  It would be worth noting here that clients still have an 8-bit ID limitation, and that this issue is addressed by ALPN.
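
  To illustrate the limitation: the Identifier field is one octet, so a client can have at most 256 requests outstanding on any one connection.  A minimal Python sketch, where the IdAllocator class and its method names are made up for illustration and are not taken from any implementation:

```python
class IdAllocator:
    """Tracks the 256 RADIUS Identifier values available on one connection.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        # The Identifier field is a single octet: values 0..255.
        self._free = list(range(256))

    def allocate(self):
        # Return an unused ID, or None when all 256 are outstanding.
        return self._free.pop() if self._free else None

    def release(self, ident):
        # Called when a reply arrives or the request is abandoned.
        self._free.append(ident)
```

  When allocate() returns None, the client has to wait for a reply or open another connection to the same server.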

	... RADIUS/TLS servers MUST immediately start the TLS negotiation when a new connection is opened. They MUST close the connection and discard any data sent if the connecting client does not start a TLS negotiation.

  Add "or if the TLS negotiation fails at any point".

	... RADIUS/(D)TLS peers MUST NOT use the old RADIUS/UDP or RADIUS/TCP ports for RADIUS/DTLS or RADIUS/TLS.

  This seems to contradict the first paragraph in 3.2?


3.3

	... RADIUS/(D)TLS clients MUST mark a connection DOWN if one or more of the following conditions are met: * The administrator has marked the connection administrative DOWN. * The network stack indicates that the connection is no longer viable. * The application-layer watchdog algorithm has marked it DOWN.

  The formatting of the bullet points is off here.

	... If a RADIUS/(D)TLS client has multiple connection to a server, it MUST NOT decide to mark the whole server as DOWN until all connections to it have been marked DOWN.

  Maybe add a discussion of what, exactly, is a "server".  Destination IP?  How is this affected by RFC 7585 dynamic DNS lookups?

	... For RADIUS/TLS, the peers MAY send TCP keepalives 

  Perhaps change this to SHOULD, and explain why.  Experience shows that many people put firewalls between critical network services.  And those firewalls then discard TCP session state for sessions they think are "dead".  And then the critical network services can no longer communicate.

  It may also be worth adding a note that such practices are likely to cause network outages.  Especially when the firewall team is separate from the RADIUS team, and the two don't talk to each other.
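
  For reference, turning on keepalives is a few socket options.  A rough Python sketch; the function name and the timer values (30s idle, 10s interval, 3 probes) are assumptions for illustration, not recommendations from the draft.  The TCP_KEEP* options are not available on every platform, hence the guards:

```python
import socket


def enable_keepalive(sock, idle=30, interval=10, count=3):
    """Enable TCP keepalives on a connected or unconnected TCP socket.

    The default values are illustrative assumptions only.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket timers below exist on Linux and some other systems.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
```

  The idle time has to be shorter than whatever the firewall uses to expire session state, which is usually something the RADIUS administrator has to find out from the firewall team.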


4.1

	... Implementations MUST follow the recommendations given in [RFC9325]. 

  Which are...?  Why are we following these recommendations?

	... 	• support for TLS 1.3 [RFC8446] / DTLS 1.3 [RFC9147] or higher is RECOMMENDED.

   I'd make this mandatory for servers.


4.2

	... Allowing anonymous clients would ensure privacy for RADIUS/(D)TLS traffic, but would negate all other security aspects of the protocol

  add: plus, the use of a fixed shared secret would negate all security if peers were not mutually authenticated.


4.2.1.

	... 	• Implementations MUST allow the configuration of a list of trusted Certificate Authorities for new TLS sessions.

  Perhaps also note that this list should be specific to the application, and SHOULD NOT use the default system certificate store.
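
  As a sketch of what "specific to the application" could look like, here is a Python example using the standard ssl module.  The function name and the ca_file parameter are hypothetical.  The point is only that the context starts with an empty trust store, and that nothing loads the system defaults:

```python
import ssl


def make_radius_tls_context(ca_file=None):
    """Build a client-side TLS context that trusts only the configured CAs.

    ca_file is a hypothetical path to a PEM bundle chosen by the
    administrator for this RADIUS deployment.
    """
    # A bare SSLContext starts with an empty trust store.  We deliberately
    # do not call load_default_certs() or ssl.create_default_context(),
    # both of which pull in the system-wide CA list.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if ca_file is not None:
        ctx.load_verify_locations(cafile=ca_file)
    return ctx
```

  With no CA file configured, certificate verification simply fails, which is the safe default.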

	... Implementations SHOULD indicate their trusted Certification authorities (CAs). 

  Do RADIUS/TLS implementations do this today?

	... For clients configured by name

  DNS name?  

	... It is possible for a RADIUS/(D)TLS server to not require additional name checks for incoming RADIUS/(D)TLS clients, i.e. if the client used dynamic lookup.

  Do we care about client IP / hostname for TLS?  Perhaps make a note that the server should implement IP filtering.  TLS-PSK Section 6.2.1 has some text.

	... When the configured trust base changes (e.g., removal of a CA from the list of trusted CAs; issuance of a new CRL for a given CA), implementations SHOULD renegotiate the TLS session to reassess the connecting peer's continued authorization

  I'd say this is a MUST.   If a CA is removed from the list of trusted CAs, then every connection using that CA should be torn down.

 Similarly, if a client certificate is revoked, then any connection using that certificate should be torn down.

  The main problem here is tracking that information.  It may be difficult and/or expensive to do.  The naive approach would be to simply tear down all connections and let the clients re-authenticate.  But that can cause outages.
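
  One way to do the tracking is to index connections by the CAs and the peer certificate that were used to validate them.  A minimal Python sketch under that assumption; the class, the method names, and the use of opaque string identifiers for CAs and certificates are all hypothetical:

```python
from collections import defaultdict


class ConnectionIndex:
    """Maps CAs and peer certificates to the connections that rely on them."""

    def __init__(self):
        self._by_ca = defaultdict(set)
        self._by_cert = defaultdict(set)

    def add(self, conn_id, chain_cas, cert_id):
        # chain_cas: identifiers of every CA in the validated chain.
        for ca in chain_cas:
            self._by_ca[ca].add(conn_id)
        self._by_cert[cert_id].add(conn_id)

    def remove(self, conn_id):
        # Called when a connection closes for any reason.
        for conns in list(self._by_ca.values()) + list(self._by_cert.values()):
            conns.discard(conn_id)

    def affected_by_ca_removal(self, ca):
        # Connections to tear down when a CA is no longer trusted.
        return set(self._by_ca.get(ca, ()))

    def affected_by_revocation(self, cert_id):
        # Connections to tear down when a peer certificate is revoked.
        return set(self._by_cert.get(cert_id, ()))
```

  This closes only the affected connections, instead of tearing everything down and causing an outage.  The cost is the extra bookkeeping at connection setup.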

4.3

  All of this is good, but it's also good to add text on IP filtering, as per TLS-PSK Section 6.2.1.


4.4.

	... When an unwanted packet of type 'Accounting-Request' is received, the RADIUS/(D)TLS server SHOULD reply with an Accounting-Response ...

  Perhaps Protocol-Error would be better here?  The temptation for proxies would be to simply return any Accounting-Response packet without examining it.  Which does not meet the goal of hop-by-hop signalling.


5.1

	...  Similarly, if there is no response to a RADIUS packet over one RADIUS/TLS connection, implementations MUST NOT retransmit that packet over a different connection to the same destination IP address and port

  Perhaps "to the same server"?  6613 was written before 7585, and therefore doesn't include provisions for dynamic lookups.

  It's also worth noting that accounting packets may be re-transmitted according to the provisions of RFC 5080, but only if the Acct-Delay-Time is updated, which means it's a different packet.


5.3

  One thing missed from earlier specifications is the issue of retransmissions when proxies accept packets over TCP, but forward packets over UDP.  The packets aren't retransmitted over the TCP connection.  So the proxy likely has to generate the retransmissions itself over any UDP connections.

 Except for Accounting packets with Acct-Delay-Time.  :(

  This is an issue we've mostly avoided for now by just using RADIUS/TLS everywhere.  But as DTLS gets more widely used, this issue will crop up more often.  So it needs to be resolved.


6.1

	... The DTLS encryption adds an additional overhead to each packet sent. RADIUS/DTLS implementations MUST support sending and receiving RADIUS packets of 4096 bytes in length, 

  What about RFC 7499 (fragmentation)?  It provides for a way to negotiate (or at least signal) support for larger packets.


6.2.

  More discussion of IP filtering would be useful here.


6.4.1.1

	... Sessions (both 4-tuple and entry) MUST be deleted when a TLS Closure Alert ([RFC5246], Section 7.2.1) or a fatal TLS Error Alert ([RFC5246], Section 7.2.2) is received

  Perhaps note that sessions must be deleted when the TLS connection is closed for any reason.  And then enumerate the reasons as per the existing text.	

	... Sessions MUST also be deleted when a non-RADIUS packet is received,

  add "over the DTLS connection".  Otherwise people might close DTLS sessions when a forged UDP packet is sent to the server.

  And then the following text can be deleted, and replaced with a reference to the "invalid packet" text earlier in the document.

	... The timestamp MUST NOT be updated in other situations

  For clarity:  The timestamp MUST NOT be updated in other situations, such as when packets are "silently discarded"

	... The server MAY cache the TLS session parameters, in order to provide for fast session resumption

  Delete this text.  The later paragraphs talk more about resumption.


6.4.2

	... RADIUS/DTLS clients SHOULD use PMTU discovery

  Perhaps add a note that UDP fragmentation still doesn't work across the wider Internet.

  Also we need to add somewhere a discussion of UDP fragmentation issues.  For example, if the typical PMTU for UDP is (say) 576 bytes, then some of that is taken up by the IP / UDP / *and* DTLS headers.  Which leaves less for RADIUS.
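
  To put rough numbers on that, here is a small Python calculation.  It assumes an IPv4 header with no options (20 bytes), a UDP header (8 bytes), a DTLS 1.2 record header (13 bytes), and an AES-GCM cipher suite adding an 8-byte explicit nonce plus a 16-byte tag.  Other DTLS versions and cipher suites have different overhead, so treat the constants as assumptions:

```python
IPV4_HEADER = 20           # assumption: no IP options
UDP_HEADER = 8
DTLS12_RECORD_HEADER = 13  # type, version, epoch, sequence number, length
AEAD_OVERHEAD = 8 + 16     # assumption: AES-GCM explicit nonce + auth tag


def radius_budget(path_mtu):
    """Bytes left for one RADIUS packet in an unfragmented datagram."""
    return (path_mtu - IPV4_HEADER - UDP_HEADER
            - DTLS12_RECORD_HEADER - AEAD_OVERHEAD)


print(radius_budget(576))   # 511
print(radius_budget(1500))  # 1435
```

  So with a 576 byte PMTU, anything over about 500 bytes of RADIUS gets fragmented, which is well short of the 4096 byte packets that implementations are required to support.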

  It's perhaps worth noting at the start of Section 6 that sending UDP over the wider Internet is a bad idea, and generally doesn't work.  RADIUS/DTLS is largely for local networks where PMTU is less of an issue.

  That being said, I have seen sites which have "local" networks where the practical MTU was much less than the default ethernet MTU.  Most commonly due to having multiple layers of VPNs / MPLS.  This could be explained as something that people should watch out for.

  It may also be worth adding a section on "TLS vs DTLS applicability".  That section can then explain the pros and cons of using TLS or DTLS, and where you would want to use them.

  Some of the text on session management here is redundant with 6.4.1.1.  Perhaps unify them into a section on session management, as many of the issues are the same for both client and server.

7.1

  Perhaps also mention 7585, and suggest that this issue is best resolved by limiting the use of proxies.  The TLS-PSK document has some text on this.


7.3

  Perhaps also discuss IP filtering, and how it can affect DoS.  Though the use of 7585 makes this more difficult.

 And suggest that RADIUS servers should generally not be exposed to the wider Internet.

	... RADIUS/DTLS servers MUST limit the number of partially open DTLS sessions and SHOULD expose this limit as configurable option to the administrator.

  Perhaps also suggest much lower idle timeouts for partially open TLS sessions.  i.e. if a connection isn't established within 3-5s, then it's likely bad.  In contrast, an authenticated session might not send packets for 30s, and that's fine.
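
  The two timeouts could be sketched as follows in Python.  The Session class, the expired() function, and the values (5s for the handshake, 60s idle) are assumptions picked for illustration only:

```python
HANDSHAKE_TIMEOUT = 5.0   # assumption: partially open sessions expire quickly
IDLE_TIMEOUT = 60.0       # assumption: established sessions may sit idle longer


class Session:
    """Minimal per-session state for the timeout check (hypothetical)."""

    def __init__(self, now):
        self.established = False   # set True once the handshake completes
        self.last_activity = now


def expired(session, now):
    # Pick the limit based on whether the handshake has completed.
    limit = IDLE_TIMEOUT if session.established else HANDSHAKE_TIMEOUT
    return now - session.last_activity > limit
```

  A server would run this check periodically, and delete expired partially open sessions first when it gets near its configured limit.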