Date: Wed, 7 Oct 2015 11:39:42 +0200
From: Palle Girgensohn <girgen@FreeBSD.org>
To: freebsd-net@freebsd.org
Subject: Process hung in STOPPED_SINGLE, wchan vodead, and cannot be killed or continued
Message-ID: <60F10B6B-0B90-4728-B405-4B916CDF7FD6@FreeBSD.org>

Hi,

I see a process that is hung in a jail, and cannot be killed or continued:

# ps HO wchan,nwchan,ppid -p 92266
  PID WCHAN  NWCHAN           PPID TT STAT    TIME COMMAND
92266 -      -                   1 -  TJ   0:00,73 /usr/local/bin/jsvc -home /usr/local/openjdk8 -server
92266 vodead fffff811a5e6b400    1 -  TJ   0:00,48 /usr/local/bin/jsvc -home /usr/local/openjdk8 -server

# top
...
  PID USERNAME THR PRI NICE  SIZE  RES STATE C  TIME  WCPU COMMAND
92266 nobody     2  20    0 4470M 418M STOP  2  0:20 0.00% jsvc

# ps axu
USER     PID %CPU %MEM     VSZ    RSS TT STAT STARTED    TIME COMMAND
nobody 92266  0,0  0,4 4577204 427756 -  TJ   11:02pm 0:20,08 /usr/local/bin/jsvc -home /usr/local/openjdk8 ...

# sockstat
USER   COMMAND PID   FD PROTO  LOCAL ADDRESS  FOREIGN ADDRESS
nobody jsvc    92266 15 stream (not connected)
nobody jsvc    92266 16 tcp4   127.0.0.1:8078 *:*
?      ?       ?     ?  tcp4   127.0.0.1:8078 127.0.0.1:22789
...

# sockstat | grep '^?' |wc -l
151

# netstat -an | less
netstat: kvm not available: /dev/mem: No such file or directory
Active Internet connections (including servers)
Proto Recv-Q Send-Q Local Address          Foreign Address        (state)
tcp4     374      0 127.0.0.1.8078         127.0.0.1.32866        CLOSED
...

# procstat -t 92266
  PID    TID COMM TDNAME CPU PRI STATE WCHAN
92266 105754 jsvc -       20 120 stop  -
92266 106982 jsvc -        2 120 stop  vodead

# procstat -k 92266
  PID    TID COMM TDNAME KSTACK
92266 105754 jsvc -      mi_switch thread_suspend_switch thread_single exit1 sys_sys_exit amd64_syscall Xfast_syscall
92266 106982 jsvc -      mi_switch sleepq_switch sleepq_wait _sleep vnode_create_vobject zfs_freebsd_open VOP_OPEN_APV vn_open_vnode vn_open_cred kern_openat amd64_syscall Xfast_syscall

8078 is the Java port that it used to listen on. The entries all look like this:

?      ?       ?     ?  tcp4   127.0.0.1:8078 127.0.0.1:53583

# gdb -p 92266 /usr/local/bin/jsvc
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...(no debugging symbols found)...
Attaching to program: /usr/local/bin/jsvc, process 92266
[ just hangs ]...
^Z
[1]+  Stopped                 gdb -p 92266 /usr/local/bin/jsvc
[root@tranbar /]#
[root@tranbar /]#
[root@tranbar /]# kill %1
[root@tranbar /]#
[1]+  Terminated              gdb -p 92266 /usr/local/bin/jsvc
[root@tranbar /]#

The culprit to begin with could be this:

Oct  7 07:54:00 host kernel: sonewconn: pcb 0xfffff80b49171310: Listen queue overflow: 151 already in queue awaiting acceptance (6 occurrences)

This occurred all through the night, saturating a service, *very likely* the
one now showing problems, but I was not there to check. The 151 lost network
sockets (see sockstat above) connect the dots.

It seems the service entered STOP when we tried to stop it. jsvc is similar to
daemontools, and I remember seeing a reference to a parent process 92265, but
I might be imagining it, since the ppid = 1.

Trying to shut down the jail, we got hanging shutdown processes.

From host:/var/log/console.jailname:

...
Stopping tomcat.
Waiting for PIDS: 9226690 second watchdog timeout expired. Shutdown terminated.
Ons  7 Okt 2015 08:27:19 CEST
...

# freebsd-version -ku
10.2-RELEASE-p3
10.2-RELEASE-p3

So basically, is there a way to get rid of this process without rebooting?

Palle