Re: [tcpm] Fwd: TCP Loopback Connections with the Same Src/Dest Port

Joe Touch <touch@isi.edu> Mon, 22 July 2013 15:38 UTC


On 7/22/2013 7:17 AM, David Borman wrote:
> Testing that a self-connected socket works should be part of every TCP regression test.

From RFC 793:

     To allow for many processes within a single Host to use TCP
     communication facilities simultaneously, the TCP provides a set of
     addresses or ports within each host.  Concatenated with the network
     and host addresses from the internet communication layer, this forms
     a socket.  A pair of sockets uniquely identifies each connection.

This text says "a pair". I interpreted that as prohibiting use of a 
single socket for both ends, but I suppose you could allow it to happen.
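
As a minimal sketch of that reading (Python, with hypothetical names
`Endpoint`, `ConnectionId`, `endpoint`, and `is_self_connection` that come
from no real stack), a connection can be modeled as a pair of RFC 793
sockets, and an API could refuse the degenerate case where both sockets
are the same:

```python
import ipaddress
from typing import NamedTuple


class Endpoint(NamedTuple):
    """An RFC 793 'socket': an IP address plus a port (not a Unix socket)."""
    addr: ipaddress.IPv4Address
    port: int


class ConnectionId(NamedTuple):
    """A connection is identified by the pair of sockets at its two ends."""
    local: Endpoint
    remote: Endpoint


def endpoint(addr: str, port: int) -> Endpoint:
    # Parse the address so that equal addresses compare equal as values.
    return Endpoint(ipaddress.IPv4Address(addr), port)


def is_self_connection(conn: ConnectionId) -> bool:
    # True when both ends are the very same socket, i.e. there is no "pair".
    return conn.local == conn.remote
```

Under this check 127.0.0.1:665 to 127.0.0.1:665 would be refused, while
127.0.0.1:665 to 127.0.0.2:665 still has two distinct ends.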

The benefit of that situation is that it would test simultaneous open and
close, but note that some documents have lately written those cases off on
the assumption that they "can't" happen.
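
A regression test along those lines might look like the following Python
sketch. It assumes the local stack lets a socket bound to 127.0.0.1
connect to its own address and port, which then proceeds as a
simultaneous open. Not every stack permits this, so the hypothetical
helper `self_connect_echo` returns None when the attempt is refused or
times out.

```python
import socket


def self_connect_echo(payload: bytes = b"ping"):
    """Connect a TCP socket to its own address and port, then echo through it.

    Returns (local, peer, received) on success, or None if the stack
    refuses the self-connect or the attempt times out.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(2.0)
        try:
            s.bind(("127.0.0.1", 0))      # let the kernel pick a free port
            local = s.getsockname()
            s.connect(local)              # same address and port at both ends
            s.sendall(payload)
            received = b""
            while len(received) < len(payload):
                chunk = s.recv(len(payload) - len(received))
                if not chunk:             # connection closed early
                    break
                received += chunk
            return local, s.getpeername(), received
        except OSError:                   # includes timeouts and refusals
            return None


if __name__ == "__main__":
    print(self_connect_echo())
```

When it succeeds, the local and peer names are identical and the bytes
written come straight back on the same descriptor.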

You'd have to prove that every protocol could handle the simultaneous
cases and the resulting state.

The key question is "why would this ever happen, or should it ever 
reasonably happen"? It's just as easily handled by having the Un*x 
socket "lie" about having TCP and just copy the buffer from send to 
receive - and what, really, is the point of that?

Joe

>
> 			-David Borman
>
> On Jul 22, 2013, at 12:02 AM, Joe Touch <touch@isi.edu> wrote:
>
>> Same src/dst for the same IP makes no sense; there need to be two ends to a connection, and each end is supposed to be uniquely determined by the socket (as defined in 793, not Un*x).
>>
>> IMO, it ought to be rejected by the API, just as would be one that was otherwise incompletely or incorrectly specified (picking a source address not on a local interface, picking a port range you don't have privilege to access, etc.).
>>
>> However, loopback is a subnet (127.0.0.0/8), not just a single address. It ought to be feasible and correct to open a connection to yourself on the same port on different loopback addresses.
>>
>> Joe
>>
>> On 7/21/2013 9:47 AM, Fernando Gont wrote:
>>> Folks,
>>>
>>> Found this by chance -- probably a datapoint that advice is needed in
>>> this area (that is, draft-gont-tcpm-tcp-seq-validation).
>>>
>>> P.S.: Will present results from real-world testing at the next tcpm meeting.
>>>
>>> Cheers,
>>> Fernando
>>>
>>>
>>>
>>>
>>> -------- Original Message --------
>>> From: Matt Miller <matt@matthewjmiller.net>
>>> Date: Wed, 17 Jul 2013 07:08:26 -0400
>>> Message-ID:
>>> <CAFc6gu_q1X10EzsHrvnmYuQ0ZnKz9uNXbfJJe-guva6J-QKAow@mail.gmail.com>
>>> Subject: TCP Loopback Connections with the Same Src/Dest Port
>>> To: FreeBSD Net <freebsd-net@freebsd.org>
>>>
>>> Our system is based on FreeBSD 8.1.  In some tests, we were having
>>> issues caused by connections of this form (more details below):
>>>
>>> TCP4      0      0      0/   0/   0    127.0.0.1.665   127.0.0.1.665   FIN_WAIT_1
>>> TCP4      0      0      0/   0/   0    127.0.0.1.637   127.0.0.1.637   FIN_WAIT_1
>>> TCP4      0      0      0/   0/   0    127.0.0.1.648   127.0.0.1.648   FIN_WAIT_1
>>>
>>> Some questions we had:
>>>
>>> - Has anyone else ever seen these same src/dest address/port TCP
>>> connections created?  Does anyone know of a legitimate reason why they
>>> should be allowed?
>>>
>>> - If there are no known use cases for this type of connection, does
>>> anyone have more context/insight on the design here: should this type
>>> of inpcb creation be prevented in the kernel or is it the
>>> application's responsibility to ensure it never creates this type of
>>> socket?
>>>
>>> For those interested, more details of the issue seen follow.  The
>>> connection seems to get stuck in swi_net sending and receiving pure
>>> FIN/ACKs to itself:
>>>
>>> #12 0xffffffff804372ce in ip_output (m=0xffffff0003ccf300, opt=<optimized out>, ro=0xffffff8020c2b6a0, flags=0, imo=0x0, inp=0xffffff0019933968) at ../../../../sys/netinet/ip_output.c
>>> #13 0xffffffff804423dc in tcp_output (tp=0xffffff0019de2370) at ../../../../sys/netinet/tcp_output.c
>>> #14 0xffffffff8043ef5d in tcp_do_segment (m=0xffffff0019af1200, th=0x100200, so=0xffffff011ac59570, tp=0xffffff0019de2370, drop_hdrlen=52, tlen=0, iptos=0 '\000', ti_locked=3) at ../../../../sys/netinet/tcp_input.c
>>> #15 0xffffffff80440311 in tcp_input (m=0xffffff0019af1200, off0=<optimized out>) at ../../../../sys/netinet/tcp_input.c
>>> #16 0xffffffff8043530b in ip_input (m=0xffffff0019af1200) at ../../../../sys/netinet/ip_input.c
>>> #17 0xffffffff8040889f in netisr_process_workstream_proto (proto=<optimized out>, nwsp=<optimized out>) at ../../../../sys/net/netisr.c
>>> #18 swi_net (arg=0xffffffff80f59800) at ../../../../sys/net/netisr.c
>>>
>>> swi_net() just continues in this loop, ad nauseam:
>>>
>>> 759         while ((bits = nwsp->nws_pendingbits) != 0) {
>>> 760                 while ((prot = ffs(bits)) != 0) {
>>> 761                         prot--;
>>> 762                         bits &= ~(1 << prot);
>>> 763                         (void)netisr_process_workstream_proto(nwsp, prot);
>>> 764                 }
>>> 765         }
>>>
>>> The tcp_output() being triggered in tcp_do_segment() in this case is
>>> the one shown on line 2303 below:
>>>
>>> 2212         /*
>>> 2213          * In ESTABLISHED state: drop duplicate ACKs; ACK out of range
>>> 2214          * ACKs.  If the ack is in the range
>>> 2215          *      tp->snd_una < th->th_ack <= tp->snd_max
>>> 2216          * then advance tp->snd_una to th->th_ack and drop
>>> 2217          * data from the retransmission queue.  If this ACK reflects
>>> 2218          * more up to date window information we update our window information.
>>> 2219          */
>>> 2220         case TCPS_ESTABLISHED:
>>> 2221         case TCPS_FIN_WAIT_1:
>>> 2222         case TCPS_FIN_WAIT_2:
>>> 2223         case TCPS_CLOSE_WAIT:
>>> 2224         case TCPS_CLOSING:
>>> 2225         case TCPS_LAST_ACK:
>>> 2226                 if (SEQ_GT(th->th_ack, tp->snd_max)) {
>>> 2227                         TCPSTAT_INC(tcps_rcvacktoomuch);
>>> 2228                         goto dropafterack;
>>> 2229                 }
>>> ...
>>> 2234                 if (SEQ_LEQ(th->th_ack, tp->snd_una)) {
>>> ...
>>> 2248                         if (tlen == 0 && tiwin == tp->snd_wnd) {
>>> 2249                                 TCPSTAT_INC(tcps_rcvdupack);
>>> ...
>>> 2277                                 if (!tcp_timer_active(tp, TT_REXMT) ||
>>> 2278                                     th->th_ack != tp->snd_una)
>>> 2279                                         tp->t_dupacks = 0;
>>> 2280                                 else if (++tp->t_dupacks > tcprexmtthresh ||
>>> 2281                                     ((V_tcp_do_newreno ||
>>> 2282                                       (tp->t_flags & TF_SACK_PERMIT)) &&
>>> 2283                                      IN_FASTRECOVERY(tp))) {
>>> 2284                                         if ((tp->t_flags & TF_SACK_PERMIT) &&
>>> 2285                                             IN_FASTRECOVERY(tp)) {
>>> 2286                                                 int awnd;
>>> 2287
>>> 2288                                                 /*
>>> 2289                                                  * Compute the amount of data in flight first.
>>> 2290                                                  * We can inject new data into the pipe iff
>>> 2291                                                  * we have less than 1/2 the original window's
>>> 2292                                                  * worth of data in flight.
>>> 2293                                                  */
>>> 2294                                                 awnd = (tp->snd_nxt - tp->snd_fack) +
>>> 2295                                                     tp->sackhint.sack_bytes_rexmit;
>>> 2296                                                 if (awnd < tp->snd_ssthresh) {
>>> 2297                                                         tp->snd_cwnd += tp->t_maxseg;
>>> 2298                                                         if (tp->snd_cwnd > tp->snd_ssthresh)
>>> 2299                                                                 tp->snd_cwnd = tp->snd_ssthresh;
>>> 2300                                                 }
>>> 2301                                         } else
>>> 2302                                                 tp->snd_cwnd += tp->t_maxseg;
>>> 2303                                         (void) tcp_output(tp);
>>> 2304                                         goto drop;
>>>
>>> I've noticed that we don't yet have this patch in our code:
>>>
>>> http://svnweb.freebsd.org/base?view=revision&revision=239672
>>>
>>> Which seems like it could be relevant here to the general case of both
>>> ends of the connection entering FIN_WAIT_1 at the same time and
>>> sending FIN/ACKs repeatedly (though our connections are a bizarre case
>>> of this where both ends of the connection are actually the same
>>> connection).
>>>
>>> Thanks,
>>>
>>> Matt