Date:      Fri, 18 Jan 2013 15:01:59 +0000 (GMT)
From:      Gavin Atkinson <gavin@FreeBSD.org>
To:        freebsd-fs@FreeBSD.org
Subject:   ZFS lock up 9-stable r244911 (Jan)
Message-ID:  <alpine.BSF.2.00.1301181356140.29541@thunderhorn.york.ac.uk>

Hi all,

I have a machine on which ZFS appears to have locked up, along with any 
processes that attempt to access the ZFS filesystem.  This machine is running 
9-stable amd64 r244911 (though from cvs, not SVN), and therefore I believe 
has all of avg's ZFS deadlock patches.

This machine has both UFS and ZFS filesystems.  All of the "system" 
filesystems are on UFS, and as a result the machine itself is responsive 
and I can investigate state using kgdb against the live kernel.  I've 
included all thread backtraces, a couple of other bits relating to held 
locks, and ps/sysctl output at
 http://people.freebsd.org/~gavin/tay-zfs-hang.txt 
 http://people.freebsd.org/~gavin/tay-sysctl-a.txt 
 http://people.freebsd.org/~gavin/tay-ps-auxwwwH.txt

This machine was in use as a pkgng package builder, using poudriere.  
Poudriere makes heavy use of zfs filesystems within jails, including "zfs 
get", "zfs set", "zfs snapshot", "zfs rollback", "zfs diff" and other 
commands, although there do not appear to be any instances of the zfs 
process running currently. At the time of the hang 16 parallel builds were 
in progress.

The underlying disk subsystem is a single hardware RAID-10 on a twa 
controller, and the zpool is on a single partition of this device.  The 
RAID-10 itself is intact, and the controller reports no errors.  There is no 
L2ARC or separate ZIL.  The UFS filesystems (which still seem to be fully 
functional) are on separate partitions on the same underlying device, so I 
do not believe the underlying storage is having issues.  I can "dd" from 
the underlying ZFS partition without problem.  Nothing has been logged to 
/var/log/messages.
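
For anyone wanting to repeat the raw read check on a similar setup, the "dd" 
test can be sketched roughly as below. This is only an illustration and not 
the exact command that was run: the function name read_check, the block 
size and the block count are assumptions, and the device path in the usage 
comment is a hypothetical placeholder.

```shell
#!/bin/sh
# Sketch of a raw read check: read the start of a device (or file) and
# discard the data, reporting whether dd completed without an I/O error.
# read_check is a hypothetical helper name, not an existing tool.
read_check() {
    target="$1"
    blocks="${2:-1024}"   # number of 64 KiB blocks to read (assumed default)

    # bs is given numerically so the same line works with FreeBSD and GNU dd.
    if dd if="$target" of=/dev/null bs=65536 count="$blocks" 2>/dev/null; then
        echo "read ok: $target"
    else
        echo "read FAILED: $target"
        return 1
    fi
}

# Example usage (the partition name is a placeholder, substitute your own):
#   read_check /dev/da0p4 16384
```

A successful read only shows that the partition is readable below ZFS; it 
says nothing about where inside ZFS the threads are blocked.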

I can keep this machine in this state for a couple of days, so I can get 
further details as required.  I am happy to work with somebody to diagnose 
this hang further.  Note, however, that the kernel does not have WITNESS 
etc. compiled in.

Thanks,

Gavin


