Date: Wed, 6 Oct 2010 14:28:31 +0200
From: Kai Gallasch <gallasch@free.de>
To: freebsd-fs@freebsd.org
Subject: Locked up processes after upgrade to ZFS v15
Message-ID: <39F05641-4E46-4BE0-81CA-4DEB175A5FBE@free.de>
Hi.

Two days ago I upgraded my server to 8.1-STABLE (amd64) and upgraded ZFS from v14 to v15.

After the zpool and zfs upgrade the server ran stably for about half a day, but then apache processes running inside jails would lock up and could not be terminated any more.

In the end apache itself (both worker and prefork) locked up, because it lost control of its child processes.

- only webserver jails with a prefork or worker apache lock up
- non-apache processes in other jails do not show this problem
- locked httpd processes will not terminate when rebooting

In 'top' the stuck processes show up with state zfs or zfsmrb:

  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
 2341 root        1  44    0   112M 12760K select  3   0:04  0.00% httpd
 2365 root        1  44    0 12056K  4312K select  0   0:00  0.00% sendmail
 2376 root        1  48    0  7972K  1628K nanslp  4   0:00  0.00% cron
 2214 root        1  44    0  6916K  1440K select  0   0:00  0.00% syslogd
24731 www         1  44    0   114M 13464K zfsmrb  6   0:00  0.00% httpd
12111 www         1  44    0   114M 13520K zfs     5   0:00  0.00% httpd
24729 www         1  44    0   114M 13408K zfsmrb  4   0:00  0.00% httpd
24728 www         1  47    0   114M 13404K zfsmrb  5   0:00  0.00% httpd
11051 www         1  44    0   114M 13456K zfs     1   0:00  0.00% httpd
26368 www         1  44    0   114M 13460K zfsmrb  6   0:00  0.00% httpd
24730 www         1  44    0   114M 13444K zfsmrb  5   0:00  0.00% httpd
88803 www         1  44    0   114M 13388K zfs     1   0:00  0.00% httpd
10887 www         1  44    0   114M 13436K zfs     6   0:00  0.00% httpd
16493 www         1  44    0   114M 13528K zfs     5   0:00  0.00% httpd
12461 www         1  44    0   114M 13340K zfs     1   0:00  0.00% httpd
89018 www         1  51    0   114M 13260K zfs     1   0:00  0.00% httpd
48699 www         1  52    0   114M 13308K zfs     3   0:00  0.00% httpd
31090 www         1  44    0   114M 13404K zfs     3   0:00  0.00% httpd
18094 www         1  44    0   114M 13312K zfs     2   0:00  0.00% httpd
69479 www         1  46    0   114M 13424K zfs     4   0:00  0.00% httpd
12890 www         1  44    0   114M 13336K zfs     5   0:00  0.00% httpd
67204 www         1  44    0   114M 13328K zfs     5   0:00  0.00% httpd
69402 www         1  60    0   114M 13432K zfs     4   0:00  0.00% httpd
91162 www         1  56    0   114M 13408K zfs     0   0:00  0.00% httpd
89781 www         1  45    0   114M 13428K zfs     4   0:00  0.00% httpd
48663 www         1  45    0   114M 13388K zfs     4   0:00  0.00% httpd
12112 www         1  44    0   114M 13340K zfs     6   0:00  0.00% httpd
91161 www         1  54    0   114M 13280K zfs     5   0:00  0.00% httpd
88839 www         1  44    0   114M 13592K zfsmrb  5   0:00  0.00% httpd
89144 www         1  58    0   114M 13304K zfs     0   0:00  0.00% httpd
78946 www         1  45    0   114M 13420K zfs     0   0:00  0.00% httpd
81984 www         1  44    0   114M 13396K zfs     5   0:00  0.00% httpd
93431 www         1  61    0   114M 13340K zfs     5   0:00  0.00% httpd
91179 www         1  76    0   114M 13360K zfs     4   0:00  0.00% httpd
69400 www         1  53    0   114M 13324K zfs     0   0:00  0.00% httpd
54211 www         1  45    0   114M 13404K zfs     6   0:00  0.00% httpd
36335 www         1  45    0   114M 13400K zfs     4   0:00  0.00% httpd
31093 www         1  44    0   114M 13348K zfs     2   0:00  0.00% httpd

I compiled a debug kernel with the following options:

options KDB                 # Enable kernel debugger support.
options DDB                 # Support DDB.
options GDB                 # Support remote GDB.
options INVARIANTS          # Enable calls of extra sanity checking
options INVARIANT_SUPPORT   # Extra sanity checks of internal structures, required by INVARIANTS
options WITNESS             # Enable checks to detect deadlocks and cycles
options WITNESS_SKIPSPIN    # Don't run witness on spinlocks for speed
# options SW_WATCHDOG
options DEBUG_LOCKS
options DEBUG_VFS_LOCKS

After the process lockups the only output on the console was:

witness_lock_list_get: witness exhausted

I also moved the jails with the stuck httpd processes to another server (also 8.1-STABLE, ZFS v15), but the lockup also occurred there.

How can I debug this and get further information?

At the moment I am thinking about reverting from zfs to ufs, to save some nerves. That would be a big disappointment for me, after all the time and effort spent trying to use zfs in production.

Regards,
Kai.