Re: [dtn] Martin Duke's Discuss on draft-ietf-dtn-tcpclv4-22: (with DISCUSS and COMMENT)

Benjamin Kaduk <kaduk@mit.edu> Wed, 11 November 2020 06:11 UTC

Date: Tue, 10 Nov 2020 22:10:55 -0800
From: Benjamin Kaduk <kaduk@mit.edu>
To: Brian Sipos <BSipos@rkf-eng.com>
Cc: Martin Duke <martin.h.duke@gmail.com>, "draft-ietf-dtn-tcpclv4@ietf.org" <draft-ietf-dtn-tcpclv4@ietf.org>, "dtn-chairs@ietf.org" <dtn-chairs@ietf.org>, "iesg@ietf.org" <iesg@ietf.org>, "dtn@ietf.org" <dtn@ietf.org>, "edward.birrane@jhuapl.edu" <edward.birrane@jhuapl.edu>
Message-ID: <20201111061055.GS39170@kduck.mit.edu>
References: <160479610567.30934.2651041608307943087@ietfa.amsl.com> <d7b8cefdb52977bfbb99bf2608083a2f281dc807.camel@rkf-eng.com> <CAM4esxTNouowAR_f6_WaGC7FfVabDPGjbVGjEgd4rWbBqUixYw@mail.gmail.com> <MN2PR13MB356760FBF3C9232E40C7754E9FE90@MN2PR13MB3567.namprd13.prod.outlook.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <MN2PR13MB356760FBF3C9232E40C7754E9FE90@MN2PR13MB3567.namprd13.prod.outlook.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dtn/_KkJ_s2dUroI3blqy6H1Nfep83Q>
Subject: Re: [dtn] Martin Duke's Discuss on draft-ietf-dtn-tcpclv4-22: (with DISCUSS and COMMENT)

Hi Brian, Martin,

As a bit of background, we do to some extent consider certificate
validation ("the X.509 stack") as being partially distinct from the TLS
protocol implementation -- TLS 1.3 even introduces separate TLS extensions
to indicate which signature algorithms can be used for TLS messages vs
X.509 certificates, to accommodate implementations where the lists are
different (which might happen if the X.509 library is OS-provided and does
not evolve in lockstep with the TLS implementation).  That said, we
typically do still see X.509 certificate validation as occurring during the
handshake, so as you note the TCPCL behavior that goes back to look at the
certificate more, later on, is a bit unusual.
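
To illustrate what that later look at the certificate could amount to,
here is a minimal Python sketch of a post-handshake check.  It assumes the
Node ID appears as a URI subjectAltName in the peer certificate (that is my
assumption here, not something stated in this thread), and it works on the
dictionary shape returned by Python's ssl.SSLSocket.getpeercert().  The
function name and the example Node ID are hypothetical.

```python
def cert_has_node_id(peer_cert: dict, expected_node_id: str) -> bool:
    """Check a getpeercert()-style dict for an exact URI subjectAltName match.

    Assumption: the Node ID is carried as a URI subjectAltName entry.
    """
    # getpeercert() reports SAN entries as (kind, value) pairs,
    # for example ("DNS", "example.org") or ("URI", "dtn://node1/").
    sans = peer_cert.get("subjectAltName", ())
    return any(
        kind == "URI" and value == expected_node_id
        for kind, value in sans
    )


# Hypothetical certificate contents, shaped like getpeercert() output.
example_cert = {
    "subjectAltName": (
        ("DNS", "node1.example.org"),
        ("URI", "dtn://node1/"),
    )
}
```

A check like this runs only after the TLS stack has already accepted the
certificate chain, which is exactly why its failure has to be reported by
something other than the handshake itself.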

On Tue, Nov 10, 2020 at 05:19:49AM +0000, Brian Sipos wrote:
> Martin,
> This appears to be getting close. The one remaining comment is related to TLS handshake and failures.
> 
> One reason for some of this trouble, I think, is the lack of obvious pre-existing concrete requirements for the use of TLS [1] by application protocols. I'm looking through references to TLS [2] for both "proposed standard" status and "normatively references" type, and see many extensions to TLS and many references to data used by TLS but for the use of TLS I see only NTP [3] and its requirements are quite loose and certainly don't include a protocol state machine definition of how TLS fits in.

Most likely such a document would not yet be referring to RFC 8446, so RFC
5246 could be more interesting in this regard (though it has many more
references to it to sort through).  But see below.

> There seems to be a kind of assumption in protocol specifications that using TLS just lets the implementation of TLS "do its thing" even though TLS itself explicitly excludes doing things like certificate validation in Section 4.4.2.4:
> In general, detailed certificate validation procedures are out of scope for TLS (see [RFC5280]).
> 
> The use of TLS for TCPCL has its own specific issues related to some of the certificate validation (DNS name) happening within the TLS implementation during handshake and some (Node ID) happening after the handshake has completed, when many TLS APIs don't allow injecting arbitrary Alerts into the TLS message stream. Sending a post-TLS-handshake SESS_TERM seems to be allowed by TLS itself and a reasonable way to avoid intermingling of CL-level messaging with TLS-level messaging.

(As noted above, this is not well-trodden territory.)
Yes, typically which specific alert value to send is not under the direct
control of the application, though it's easy to cause the TLS layer to send
*some* alert (which is basically always going to cause the other end of the
connection to tear things down).  If our procedure involves letting the TLS
handshake complete, though, having a dedicated application-layer message
for the particular shutting-down semantics in question seems okay to me.
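
As a rough sketch of what such an application-layer message looks like on
the wire, the Python below encodes a SESS_TERM.  The layout (a one-octet
message type, one octet of flags, one octet of reason code) and the
numeric values are my recollection of the draft and should be treated as
assumptions to be checked against the document itself; the helper name is
hypothetical.

```python
import struct

# Assumed values; verify against the TCPCLv4 draft before relying on them.
SESS_TERM_TYPE = 0x05          # assumed message type code for SESS_TERM
FLAG_REPLY = 0x01              # assumed flag: this SESS_TERM answers the peer's
REASON_CONTACT_FAILURE = 0x04  # assumed reason code for a rejected contact


def encode_sess_term(reason: int, reply: bool = False) -> bytes:
    """Encode a three-octet SESS_TERM: type, flags, reason code."""
    flags = FLAG_REPLY if reply else 0x00
    return struct.pack("!BBB", SESS_TERM_TYPE, flags, reason)
```

The application would write these octets to the (already established) TLS
stream and then close the connection, so the peer sees an explicit reason
rather than a bare close.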

> Considering all that, is there a good RFC to point to as a well-specified example of using TLS 1.3 (or even an earlier version) in a way which is actually compatible with TLS implementations?

I think the main idea of TLS is that it's supposed to be simple to slot in,
since it provides a stream interface to the application just as TCP does,
but with crypto.  The main additional requirement on the application is to
be able to tell if the other party is the intended one, which can involve
something as simple as the application providing a name that the peer is
expected to be able to authenticate as (this is the most common case).  For
TLS server authentication, RFC 6125 specifies procedures for how the
authentication is performed, and (IIRC) also gives a list of what the
application/protocol needs to provide to the TLS stack in order for that
to happen.  I think that, instead of searching through documents that
reference 8446 or 5246, you may have better luck looking through those that
cite 6125.  When TLS mutual authentication is used, we don't have a
dedicated RFC for those authentication procedures (yet), though many
aspects of RFC 6125 can be reused.  We did just go through quite a bit of
discussion for how to do the TLS mutual auth for draft-ietf-nfsv4-rpc-tls
(approved by the IESG just a week ago and now in the RFC Editor's queue); I
would suggest starting with that (and how it cites RFC 6125) to see how
well it can map to what TCPCL needs.
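
For the common server-authentication case, "the application providing a
name" can be as small as the following Python sketch using the standard
ssl module.  It is only an illustration of the general pattern, not of
what TCPCL requires; the function names are hypothetical and the host and
port would come from the caller.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client context that verifies the chain and the peer name."""
    # create_default_context() defaults to server authentication, with
    # check_hostname enabled and verify_mode set to CERT_REQUIRED.
    return ssl.create_default_context()


def open_tls(host: str, port: int) -> ssl.SSLSocket:
    """Connect and let the TLS stack authenticate `host` in the handshake."""
    ctx = make_client_context()
    raw = socket.create_connection((host, port))
    # server_hostname is the name the peer is expected to authenticate as;
    # a mismatch makes the handshake fail before any application data flows.
    return ctx.wrap_socket(raw, server_hostname=host)
```

Everything beyond that one name (such as a second identity checked after
the handshake, or the client-certificate side of mutual authentication) is
what the application protocol has to specify for itself.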

Thanks,

Ben

> Thanks again for your feedback,
> Brian S.
> 
> [1] https://tools.ietf.org/html/rfc8446
> [2] https://datatracker.ietf.org/doc/rfc8446/referencedby/
> [3] https://tools.ietf.org/html/rfc8915
> [4] https://tools.ietf.org/html/rfc5280
> 
> 
> ________________________________
> From: Martin Duke <martin.h.duke@gmail.com>
> Sent: Monday, November 9, 2020 17:30
> To: Brian Sipos <BSipos@rkf-eng.com>
> Cc: iesg@ietf.org <iesg@ietf.org>; dtn-chairs@ietf.org <dtn-chairs@ietf.org>; draft-ietf-dtn-tcpclv4@ietf.org <draft-ietf-dtn-tcpclv4@ietf.org>; dtn@ietf.org <dtn@ietf.org>; edward.birrane@jhuapl.edu <edward.birrane@jhuapl.edu>
> Subject: Re: Martin Duke's Discuss on draft-ietf-dtn-tcpclv4-22: (with DISCUSS and COMMENT)
> 
> Hi Brian,
> 
> On Sun, Nov 8, 2020 at 5:49 PM Brian Sipos <BSipos@rkf-eng.com> wrote:
> Martin,
> My responses are in-line below with prefix "BS1".
> 
> >
> > First of all, "failure of the transfer" is ambiguous because there
> > may be two
> > transfers going on, one in each direction.
> >
> BS1: This phrasing is a bit unclear because the statement applies to
> the transfer itself. Instead of "a transmitting node" it should be "a
> node transmitting a transfer" so the subject of the statement becomes
> the outgoing transfer.
> 
> Yes, that would help.
> 
> 
> > Second, I would like further clarity on what it means that nodes
> > "SHALL"
> > consider it "a failure of the transfer". What is actionable if it's a
> > failure?
> > If nothing is actionable, it shouldn't be a SHALL. Does this mean
> > that in
> > subsequent sessions I must resend the whole bundle?
> >
> BS1: There are requirements about how a CLA interacts with its
> overlaying BPA, and this requirement applies to what will be indicated
> to the BPA about the formerly-in-progress transfer. Section 3.1
> explains that CLA--BPA interface.
> 
> This reply makes sense. Perhaps a reference to Sec 3.1 would help?
> 
> 
> > Can you list some reasons why an endpoint would choose to close
> > uncleanly? Some
> > motivation might provide helpful context.
> >
> BS1: The motivation here is that there are some reasons why the CLA
> will know that it's about to lose connectivity and send a positive
> indication of "I'm about to go away and cannot finish a transfer (in
> either direction)" instead of waiting (potentially a long time) for a
> CLA timeout or a TCP timeout. For example, a low battery starting into
> a power saving mode and stopping a data link. This is as opposed to a
> clean termination, which is not time-critical and more of a "I don't
> want to start any more transfers, so don't attempt any new ones."
> 
> Thanks. Maybe add a sentence about this? "For instance, an endpoint may know it's about to lose connectivity and request abrupt termination rather than forcing a timeout."
> 
> 
> > Moreover the "sending or receiving" bit is confusing.
> > - So one option is that I'm a session that has decided to do an
> > unclean
> > termination rather than a clean one. So I send SESS_TERM and then
> > close (FIN?
> > RST?) the TCP connection. So if it's a FIN, I might very well get the
> > last
> > XFER_ACK.  If I RST or don't get that ACK, then I do think it's clear
> > that the
> > transfer is a failure, whatever that means.
> >
> BS1: I can add a top-level requirement that closing a TCP connection
> always means using a FIN.
> 
> Thanks.
> 
> 
> > - But as a receiver, how do I know that the termination is unclean?
> > Have I made
> > an independent decision to close uncleanly? Am I to take the receipt
> > of a
> > SESS_TERM without my having sent XFER_ACK as an unclean close? If I
> > sent
> > XFER_ACK, how do I know that the sender sent it? And what does it
> > mean, as a
> > receiver, that the transfer "failed" if I have all the data?
> >
> BS1: Some of these requirements may be trying to overspecify behavior
> and just get confusing. There is always the potential for the TCP
> connection to close for some other reason outside the CLA's control.
> Maybe a better set of requirements is to mention that a peer MAY close
> the connection before a transfer is finished, and that a closed
> connection before receiving the final XFER_ACK SHALL be treated as a
> failed transfer (even outside of an unclean termination, because at
> that point there is no possibility of ever receiving that final ACK).
> 
> I think that would be an improvement
> 
> Some clearer description of when an endpoint TCPCL should send close to its TCP, given receipt of a SESS_TERM and FIN, would be helpful. For most SESS_TERMs, I don't want to send FIN until I have both completed any in-progress transmission, and have XFER_ACKed all incoming segments. In the unclean case I probably want to send a FIN more or less immediately to avoid a timeout, I think (?), and some guidance on how to distinguish these cases would make this clearer.
> 
> 
> >
> > ---------------------------------------------------------------------
> > -
> > COMMENT:
> > ---------------------------------------------------------------------
> > -
> >
> > Thanks for this document. I have numerous minor concerns:
> >
> > Sec 4.3. "the TCP connection SHALL be closed." Can you clarify if
> > this is a
> > clean close (FIN) or abort (RST)? In fact, if you always mean FIN, it
> > might be
> > good to clarify that in the terminology section to indicate that
> > there are no
> > RSTs anywhere.
> >
> BS1: as mentioned above, I will add a requirement that FIN is the
> correct method.
> 
> Thanks
> 
> 
> > Sec 4.3. VERSION_MISMATCH would benefit from being split into
> > VERSION_TOO_HIGH
> > and VERSION_TOO_LOW. For example, if the passive is at v4 only and
> > the active
> > supports v1, v2, and v3, it will take three tries to figure out that
> > there is
> > no way for these nodes to communicate. Even better would be a QUIC-
> > style
> > Version Negotiation message that would communicate the options in 1
> > RTT.
> >
> BS1: This had been discussed in the WG somewhat and this is a carry-
> over from the behavior of the TCPCLv3.
> The current requirements on the active entity are to start at the
> highest supported version and on the passive entity are to begin
> TLS/session negotiation on *any* acceptable TCPCL version, which
> assumes that a later version will be more acceptable to an arbitrary
> passive entity.
> Adding a "supported versions set" to the Contact Header is possible,
> but there needs to be WG discussion on whether this is a worthwhile late
> addition.
> 
> If it's routine for a node that supports v4 to also support earlier versions, then I agree there is no serious problem here.
> 
> 
> > There are few items that seem to be artifacts of TLS happening after
> > session
> > negotiation in v3:
> >
> > Sec 4.4.3. "If certificate validation
> >    fails or if security policy disallows a certificate for any
> > reason,
> >    the entity SHALL terminate the session"
> >
> > I don't understand this; certificate validation generally occurs
> > during the TLS
> > handshake, where there is no session?
> >
> BS1: There is a bit of gray area when the TLS handshake is involved,
> because it can either succeed or fail but after the handshake it is
> possible to send TCPCL messages regardless of handshake success. I
> suppose that if the TLS stack fails within the TLS handshake that there
> will be a TLS-level message indicating this fact and the TCPCL-level
> message is redundant. Using the SESS_TERM regardless of whether the TLS
> stack rejected the handshake or the CLA rejected the peer certificate
> guarantees that there is some over-the-wire indication of what went
> wrong (as opposed to the connection just being closed after a seemingly
> valid TLS handshake).
> 
> I am uneasy about this response. The TLS alert mechanism can already signal the peer when there is a problem with the handshake. However, many TLS stacks deliberately obscure the precise reason for handshake failure to avoid attacker analysis. This text treats cert validation as something outside the handshake, which I don't believe is correct.
> 
> 
> > Sec 4.4.3
> > "the active
> >    entity SHALL authenticate the DNS name (of the passive entity)
> > used
> >    to initiate the TCP connection"
> > s/TCP Connection/TLS session. TCP connections don't consider DNS at
> > all.
> >
> BS1: This may be a phrasing issue, maybe instead:
>    Either during or immediately after the TLS handshake if the active
>    entity resolved a DNS name (of the passive entity) in order to
>    initiate the TCP connection, the active
>    entity SHALL authenticate that DNS name using any DNS-ID of the peer
>    certificate.
> 
> LGTM.
> 
> 
> > Sec 4.5.
> > "After the initial exchange of a Contact Header, all messages
> >    transmitted over the session are identified by a one-octet header
> >    with the following structure:"
> >
> > Obviously, TLS handshake messages are after the Contact Header and
> > are not
> > prepended by these headers. Perhaps this is an artifact from v3?
> >
> BS1: This is an artifact, it should be clarified as:
>     After the initial exchange of a Contact Header and any TLS
>     handshake exchanges, ...
> 
> LGTM
> 
> 
> > Sec 4.7
> > "Once this process of parameter negotiation is completed (which
> >    includes a possible completed TLS handshake of the connection to
> > use
> >    TLS),"
> >
> > The TLS handshake occurs before parameter negotiation.
> >
> BS1: This is a vestigial behavior and needs to be rephrased to remove
> that parenthetical.
> 
> Yep
> 
> 
> > Sec 5.2.4 I find it odd that each CL would have its own set of reason
> > codes
> > that it will simply pass to the bundle layer. Far better for it to be
> > a common
> > set of CL-agnostic errors that the bundle layer implements, as they
> > literally
> > do not matter to the CL at all.
> >
> BS1: There had been earlier WG discussion about a standardized CLA--BPA
> interface and nomenclature, but the CLA capabilities are incredibly
> link-layer dependent and not at all consistent, especially ways in
> which a transfer can fail (some link layers are even unidirectional and
> don't provide any feedback at all to the CLA).
> This could be discussed in the WG to see if the consensus has evolved
> in any way in the meanwhile.
> 
> 
> OK, I don't think it's necessary to revisit consensus on this.