Date: Mon, 05 Jan 2009 08:23:27 -0500 (EST)
From: Terry Kennedy <terry@tmk.com>
To: Robert Watson <rwatson@FreeBSD.org>
Cc: Peter Jeremy <peterjeremy@optushome.com.au>, freebsd-stable@FreeBSD.org
Subject: Re: rdump stuck in sbwait state (RELENG_7)
Message-ID: <01N3XI0VWEA00008L3@tmk.com>
In-Reply-To: "Your message dated Mon, 05 Jan 2009 13:18:27 +0000 (GMT)" <alpine.BSF.2.00.0901051317160.98366@fledge.watson.org>
References: <01N3OFGBCXMS000125@tmk.com> <01N3OYSUCHAE000125@tmk.com> <01N3VGDZ7EOM0008L3@tmk.com>

> I may have missed this earlier in the thread, but I don't see a kernel stack
> trace of the stuck thread/process. Could you grab one using procstat -k, DDB,
> or KGDB? I'd like to confirm that the 'sbwait' really reflects waiting to
> send, rather than waiting to receive, which (for better or worse) uses the
> same wmesg. procstat -k may be the simplest of the above to do if your system
> is reasonable recent.

  I didn't post that earlier as no-one had asked for it 8-)

  The system is current as of December 29th. Here's the relevant info:

(0:10) test4:/sysprog/terry# uname -a
FreeBSD test4.tmk.com 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #0: Mon Dec 29 11:48:04 EST 2008     terry@test4.tmk.com:/usr/obj/usr/src/sys/PE1550  i386

(0:11) test4:/sysprog/terry# ps -axwww | grep dump
  UID   PID  PPID CPU PRI NI   VSZ   RSS MWCHAN STAT  TT       TIME COMMAND
    0  4436  4411   0   8  0 35896 34552 wait   I+    p1    0:00.70 /sbin/rdump 0uLa -b 64 -C 32 -f server /usr (rdump)
    0  4439  4436   0   4  0 35896 34784 sbwait I+    p1    0:03.05 rdump: /dev/amrd0s1f: pass 4: 18.48% done, finished in 0:17 at Sat Jan 3 21:02:05 2009 (rdump)
    0  4440  4439   0  20  0 35896 34624 pause  I+    p1    0:05.26 /sbin/rdump 0uLa -b 64 -C 32 -f server /usr (rdump)
    0  4441  4439   0  20  0 35896 34624 pause  I+    p1    0:05.26 /sbin/rdump 0uLa -b 64 -C 32 -f server /usr (rdump)
    0  4442  4439   0   4  0 35896 34624 sbwait I+    p1    0:05.26 /sbin/rdump 0uLa -b 64 -C 32 -f server /usr (rdump)

(0:12) test4:/sysprog/terry# procstat -k 4436
  PID    TID COMM             TDNAME           KSTACK
 4436 100115 rdump            -                mi_switch sleepq_switch sleepq_catch_signals sleepq_wait_sig _sleep kern_wait wait4 syscall Xint0x80_syscall

(0:13) test4:/sysprog/terry# procstat -k 4439
  PID    TID COMM             TDNAME           KSTACK
 4439 100127 rdump            -                mi_switch sleepq_switch sleepq_catch_signals sleepq_wait_sig _sleep sbwait soreceive_generic soreceive soo_read dofileread kern_readv read syscall Xint0x80_syscall

(0:14) test4:/sysprog/terry# procstat -k 4440
  PID    TID COMM             TDNAME           KSTACK
 4440 100131 rdump            -                mi_switch sleepq_switch sleepq_catch_signals sleepq_wait_sig _sleep kern_sigsuspend sigsuspend syscall Xint0x80_syscall

(0:15) test4:/sysprog/terry# procstat -k 4441
  PID    TID COMM             TDNAME           KSTACK
 4441 100105 rdump            -                mi_switch sleepq_switch sleepq_catch_signals sleepq_wait_sig _sleep kern_sigsuspend sigsuspend syscall Xint0x80_syscall

(0:16) test4:/sysprog/terry# procstat -k 4442
  PID    TID COMM             TDNAME           KSTACK
 4442 100135 rdump            -                mi_switch sleepq_switch sleepq_catch_signals sleepq_wait_sig _sleep sbwait soreceive_generic soreceive soo_read dofileread kern_readv read syscall Xint0x80_syscall

  As I understand it, the processes in sbwait state are waiting to receive.
That would seem to indicate that they don't see the ACKs from the other end,
despite the tcpdump showing that they were received.

  Let me know if you need more information.

        Terry Kennedy             http://www.tmk.com
        terry@tmk.com             New York, NY USA
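
  For anyone repeating this check on other stuck processes, the way the
stacks above were read can be sketched as a small script. This is only an
illustration, not a tool that ships with FreeBSD: classify_sbwait() is a
hypothetical helper, and it takes just the KSTACK column of a procstat -k
line as input. The receive-side frame names come from the traces above; the
send-side names (sosend, sosend_generic, soo_write) are an assumption about
what a blocked sender would show and were not observed here.

```python
# Hypothetical helper: decide whether a thread sleeping in sbwait is
# blocked on the receive side or the send side of a socket, using the
# frame names in the KSTACK column of "procstat -k".

# Frames seen in the traces above for a thread blocked in read().
RECEIVE_FRAMES = {"soreceive", "soreceive_generic", "soo_read"}
# Assumed send-side counterparts; not observed in this thread.
SEND_FRAMES = {"sosend", "sosend_generic", "soo_write"}


def classify_sbwait(kstack: str) -> str:
    """Return 'receive', 'send', 'unknown', or 'not in sbwait'."""
    frames = kstack.split()
    if "sbwait" not in frames:
        return "not in sbwait"
    if RECEIVE_FRAMES.intersection(frames):
        return "receive"
    if SEND_FRAMES.intersection(frames):
        return "send"
    return "unknown"


if __name__ == "__main__":
    # KSTACK column for PID 4439, copied from the output above.
    stack_4439 = (
        "mi_switch sleepq_switch sleepq_catch_signals sleepq_wait_sig "
        "_sleep sbwait soreceive_generic soreceive soo_read dofileread "
        "kern_readv read syscall Xint0x80_syscall"
    )
    print(classify_sbwait(stack_4439))
```

  Matching on frame names rather than on the wmesg is the point: both
directions sleep with the same "sbwait" wmesg, so only the callers of
sbwait in the stack tell them apart.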