From owner-freebsd-bugs@FreeBSD.ORG Tue Feb 26 19:20:01 2013 Return-Path: Delivered-To: freebsd-bugs@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 33A04477 for ; Tue, 26 Feb 2013 19:20:01 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id 19C891A2B for ; Tue, 26 Feb 2013 19:20:01 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r1QJK0vh053177 for ; Tue, 26 Feb 2013 19:20:00 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r1QJK0Xt053172; Tue, 26 Feb 2013 19:20:00 GMT (envelope-from gnats) Resent-Date: Tue, 26 Feb 2013 19:20:00 GMT Resent-Message-Id: <201302261920.r1QJK0Xt053172@freefall.freebsd.org> Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer) Resent-To: freebsd-bugs@FreeBSD.org Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Julien Charbon Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id E849711C for ; Tue, 26 Feb 2013 19:13:38 +0000 (UTC) (envelope-from nobody@FreeBSD.org) Received: from red.freebsd.org (red.freebsd.org [IPv6:2001:4f8:fff6::22]) by mx1.freebsd.org (Postfix) with ESMTP id C8D9A19E9 for ; Tue, 26 Feb 2013 19:13:38 +0000 (UTC) Received: from red.freebsd.org (localhost [127.0.0.1]) by red.freebsd.org (8.14.5/8.14.5) with ESMTP id r1QJDcM0099752 for ; Tue, 26 Feb 2013 19:13:38 GMT (envelope-from nobody@red.freebsd.org) Received: (from nobody@localhost) by red.freebsd.org (8.14.5/8.14.5/Submit) id r1QJDcLF099751; Tue, 26 Feb 2013 19:13:38 GMT (envelope-from nobody) Message-Id: <201302261913.r1QJDcLF099751@red.freebsd.org> Date: Tue, 26 Feb 2013 19:13:38 GMT From: Julien Charbon To: freebsd-gnats-submit@FreeBSD.org X-Send-Pr-Version: www-3.1 Subject: kern/176446: Concurrency in ixgbe driving out-of-order packet process and spurious RST X-BeenThere: freebsd-bugs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Bug reports List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Feb 2013 19:20:01 -0000 >Number: 176446 >Category: kern >Synopsis: Concurrency in ixgbe driving out-of-order packet process and spurious RST >Confidential: no >Severity: non-critical >Priority: low >Responsible: freebsd-bugs >State: open >Quarter: >Keywords: >Date-Required: >Class: sw-bug >Submitter-Id: current-users >Arrival-Date: Tue Feb 26 19:20:00 UTC 2013 >Closed-Date: >Last-Modified: >Originator: Julien Charbon >Release: FreeBSD 8.3 >Organization: Verisign >Environment: FreeBSD atlas-dl360-3 8.3-RELEASE FreeBSD 8.3-RELEASE #14 r50M: Tue Feb 26 15:13:53 UTC 2013 jcharbon@atlas-dl360-3:/usr/obj/usr/src/jcharbon/svn/freebsd-src-8.3.4/sys/GENERIC amd64 >Description: Under TCP network load using the ixgbe driver, we found an unexpected TCP behaviour: 15:26:45.129164 IP 192.168.100.23.30222 > 192.168.100.144.8080: Flags [S], seq 2020028671, win 14600, options [mss 1460,sackOK,TS val 1622018533 ecr 0,nop,wscale 7], length 0 15:26:45.130844 IP 192.168.100.144.8080 > 192.168.100.23.30222: Flags [S.], seq 1114608110, ack 2020028672, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 2170246269 ecr 1622018533], length 0 15:26:47.007276 IP 192.168.100.23.30222 > 192.168.100.144.8080: Flags [.], ack 1, win 115, options [nop,nop,TS val 1622020597 ecr 2170246269], length 0 15:26:47.007283 IP 192.168.100.23.30222 > 192.168.100.144.8080: Flags [P.], ack 1, win 115, options [nop,nop,TS val 1622020597 ecr 2170246269], length 4 15:26:47.013799 IP 192.168.100.144.8080 > 192.168.100.23.30222: Flags [R], seq 1114608111, win 0, length 0 15:26:47.019366 IP 192.168.100.144.8080 > 192.168.100.23.30222: Flags [P.], ack 5, win 1040, options [nop,nop,TS val 2170248157 ecr 1622020597], length 128 15:26:47.020353 IP 192.168.100.144.8080 > 192.168.100.23.30222: Flags [F.], seq 129, ack 5, win 1040, options [nop,nop,TS val 2170248158 ecr 1622020597], length 0 15:26:48.565166 IP 192.168.100.23.30222 > 192.168.100.144.8080: Flags [R], seq 2020028676, win 0, length 0 This TCP request was both RST'ed _and_ replied. Using net.inet.tcp.log_debug=1, we saw in debug log: Feb 26 15:26:47 atlas-dl360-3 kernel: TCP: [192.168.100.23]:30222 to [192.168.100.144]:8080; syncache_socket: in_pcbconnect failed with error 48 Feb 26 15:26:47 atlas-dl360-3 kernel: TCP: [192.168.100.23]:30222 to [192.168.100.144]:8080 tcpflags 0x10; tcp_input: Listen socket: Socket allocation failed due to limits or memory shortage, sending RST Feb 26 15:26:47 atlas-dl360-3 kernel: TCP: [192.168.100.23]:30222 to [192.168.100.144]:8080 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored By adding more debug log in kernel (see joined patch), the origin of this unexpected behaviour seems to be in ixgbe driver: Feb 26 15:26:47 atlas-dl360-3 kernel: TCP: [192.168.100.23]:30222 to [192.168.100.144]:8080 tcpflags 0x18; syncache_expand: SEQ 2020028672, ACK 1114608111, syncache_socket() succeed Feb 26 15:26:47 atlas-dl360-3 kernel: #0 0xffffffff80796b86 at syncache_expand+0x546 Feb 26 15:26:47 atlas-dl360-3 kernel: #1 0xffffffff8078e36b at tcp_input+0xf4b Feb 26 15:26:47 atlas-dl360-3 kernel: #2 0xffffffff8071feec at ip_input+0xac Feb 26 15:26:47 atlas-dl360-3 kernel: #3 0xffffffff806cb11e at netisr_dispatch_src+0x7e Feb 26 15:26:47 atlas-dl360-3 kernel: #4 0xffffffff806c11bd at ether_demux+0x14d Feb 26 15:26:47 atlas-dl360-3 kernel: #5 0xffffffff806c15c7 at ether_input+0x197 Feb 26 15:26:47 atlas-dl360-3 kernel: #6 0xffffffff803e691b at ixgbe_rxeof+0x1eb Feb 26 15:26:47 atlas-dl360-3 kernel: #7 0xffffffff803e7108 at ixgbe_msix_que+0xa8 Feb 26 15:26:47 atlas-dl360-3 kernel: #8 0xffffffff805e7674 at intr_event_execute_handlers+0x104 Feb 26 15:26:47 atlas-dl360-3 kernel: #9 0xffffffff805e8d05 at ithread_loop+0x95 Feb 26 15:26:47 atlas-dl360-3 kernel: #10 0xffffffff805e488f at fork_exit+0x11f Feb 26 15:26:47 atlas-dl360-3 kernel: #11 0xffffffff808edc7e at fork_trampoline+0xe Feb 26 15:26:47 atlas-dl360-3 kernel: TCP: [192.168.100.23]:30222 to [192.168.100.144]:8080; syncache_socket: in_pcbconnect failed with error 48 Feb 26 15:26:47 atlas-dl360-3 kernel: TCP: [192.168.100.23]:30222 to [192.168.100.144]:8080 tcpflags 0x10; syncache_expand: SEQ 2020028672, ACK 1114608111, syncache_socket() failed Feb 26 15:26:47 atlas-dl360-3 kernel: #0 0xffffffff80796b86 at syncache_expand+0x546 Feb 26 15:26:47 atlas-dl360-3 kernel: #1 0xffffffff8078e36b at tcp_input+0xf4b Feb 26 15:26:47 atlas-dl360-3 kernel: #2 0xffffffff8071feec at ip_input+0xac Feb 26 15:26:47 atlas-dl360-3 kernel: #3 0xffffffff806cb11e at netisr_dispatch_src+0x7e Feb 26 15:26:47 atlas-dl360-3 kernel: #4 0xffffffff806c11bd at ether_demux+0x14d Feb 26 15:26:47 atlas-dl360-3 kernel: #5 0xffffffff806c15c7 at ether_input+0x197 Feb 26 15:26:47 atlas-dl360-3 kernel: #6 0xffffffff803e691b at ixgbe_rxeof+0x1eb Feb 26 15:26:47 atlas-dl360-3 kernel: #7 0xffffffff803e7c31 at ixgbe_handle_que+0xd1 Feb 26 15:26:47 atlas-dl360-3 kernel: #8 0xffffffff8064e2c5 at taskqueue_run_locked+0x85 Feb 26 15:26:47 atlas-dl360-3 kernel: #9 0xffffffff8064e45e at taskqueue_thread_loop+0x4e Feb 26 15:26:47 atlas-dl360-3 kernel: #10 0xffffffff805e488f at fork_exit+0x11f Feb 26 15:26:47 atlas-dl360-3 kernel: #11 0xffffffff808edc7e at fork_trampoline+0xe Feb 26 15:26:47 atlas-dl360-3 kernel: TCP: [192.168.100.23]:30222 to [192.168.100.144]:8080 tcpflags 0x10; tcp_input: Listen socket: Socket allocation failed due to limits or memory shortage, sending RST Feb 26 15:26:47 atlas-dl360-3 kernel: TCP: [192.168.100.23]:30222 to [192.168.100.144]:8080 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored Here we saw the concurrency between ixgbe_handle_que() and ixgbe_msix_que() both calling ixgbe_rxeof() at same time, and the client packet being proceed before the client packet. If we are currently testing FreeBSD 8.3, we saw nothing that would prevent this issue in 9.1 and in CURRENT. >How-To-Repeat: Load a TCP service enough to see the "syncache_socket: in_pcbconnect failed with error 48" error. >Fix: Patch attached with submission follows: Index: sys/netinet/tcp_syncache.c =================================================================== --- sys/netinet/tcp_syncache.c (revision 50) +++ sys/netinet/tcp_syncache.c (working copy) @@ -97,6 +97,11 @@ #include +#include +#include +#include +#include + static VNET_DEFINE(int, tcp_syncookies) = 1; #define V_tcp_syncookies VNET(tcp_syncookies) SYSCTL_VNET_INT(_net_inet_tcp, OID_AUTO, syncookies, CTLFLAG_RW, @@ -200,6 +205,11 @@ #define SCH_UNLOCK(sch) mtx_unlock(&(sch)->sch_mtx) #define SCH_LOCK_ASSERT(sch) mtx_assert(&(sch)->sch_mtx, MA_OWNED) +struct stack *st; +struct mtx st_mtx; +struct sbuf st_sb; +char st_trace[65536]; + /* * Requires the syncache entry to be already removed from the bucket list. */ @@ -223,6 +233,10 @@ { int i; + st = stack_create(); + mtx_init(&st_mtx, "tcp_sc_stack", NULL, MTX_DEF); + sbuf_new(&st_sb, st_trace, sizeof(st_trace), SBUF_FIXEDLEN); + V_tcp_syncache.cache_count = 0; V_tcp_syncache.hashsize = TCP_SYNCACHE_HASHSIZE; V_tcp_syncache.bucket_limit = TCP_SYNCACHE_BUCKETLIMIT; @@ -277,6 +291,10 @@ struct syncache *sc, *nsc; int i; + sbuf_delete(&st_sb); + mtx_destroy(&st_mtx); + stack_destroy(st); + /* Cleanup hash buckets: stop timers, free entries, destroy locks. */ for (i = 0; i < V_tcp_syncache.hashsize; i++) { @@ -949,11 +967,27 @@ *lsop = syncache_socket(sc, *lsop, m); - if (*lsop == NULL) + if (*lsop == NULL) { + if ((s = tcp_log_addrs(inc, th, NULL, NULL))) + log(LOG_DEBUG, "%s; %s: SEQ %u, ACK %u, syncache_socket() " + "failed\n", s, __func__, th->th_seq, th->th_ack); TCPSTAT_INC(tcps_sc_aborted); - else + } + else { + if ((s = tcp_log_addrs(inc, th, NULL, NULL))) + log(LOG_DEBUG, "%s; %s: SEQ %u, ACK %u, syncache_socket() " + "succeed\n", s, __func__, th->th_seq, th->th_ack); TCPSTAT_INC(tcps_sc_completed); + } + mtx_lock(&st_mtx); + stack_save(st); + stack_sbuf_print(&st_sb, st); + sbuf_finish(&st_sb); + log(LOG_DEBUG, "%s", sbuf_data(&st_sb)); + sbuf_clear(&st_sb); + mtx_unlock(&st_mtx); + /* how do we find the inp for the new socket? */ if (sc != &scs) syncache_free(sc); >Release-Note: >Audit-Trail: >Unformatted: