Date:      Mon, 5 Nov 2007 12:05:08 -0500
From:      Adam McDougall <mcdouga9@egr.msu.edu>
To:        Kris Kennaway <kris@freebsd.org>
Cc:        freebsd-current@freebsd.org
Subject:   Re: ZFS Hangs
Message-ID:  <20071105170508.GA4037@egr.msu.edu>
In-Reply-To: <472EE13E.9030908@FreeBSD.org>
References:  <200711021208.25913.Thomas.Sparrevohn@btinternet.com> <200711041423.54336.Thomas.Sparrevohn@btinternet.com> <472DDEA2.7080804@FreeBSD.org> <200711050041.38229.Thomas.Sparrevohn@btinternet.com> <472EE13E.9030908@FreeBSD.org>

On Mon, Nov 05, 2007 at 10:24:14AM +0100, Kris Kennaway wrote:

  Thomas Sparrevohn wrote:
>> On Sunday 04 November 2007 15:00:50 Kris Kennaway wrote:
>>> http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html
>>> 
>> Oh my god - overlooked that ;-) - funny that.  It's a bit tricky, as it is
>> not possible to dump a kernel when the swap is on ZFS.  I did a test with
>> all debugging enabled and the problem did not show up, which makes it
>> somewhat nasty.  I will check if I can reproduce it with only DDB enabled.
  
  You can still hook up a serial console, or at the very least take 
  photographs of the screen with the relevant DDB information.  Or add 
  another disk and dump on that.
  
  Kris
  
I have some screenshots of ps in ddb from one of several zfs hangs I've had
on one amd64 system:

http://www.egr.msu.edu/~mcdouga9/pics/zfs/

I didn't post every single screenful since I don't have a microSD reader handy,
and emailing the pictures off my phone is painful.  If I missed a screenshot of
one or more particular processes that might have a telling state, let me know.

I also have a gzipped kernel + dump from a forced panic when it was in this
state.  If a developer is interested in it, please let me know so I can post it
somewhere private, since the system is in NIS and likely has tables cached
in memory.

It is running a kernel from Oct 17.  I tried a kernel with WITNESS, INVARIANTS,
etc., but it hung the same way without any panic.  I completed a zpool scrub
this morning with no errors.  Lately zfs seems to wedge every single night
when the rsyncs from remote servers run.  This is the only amd64 system I have
zfs on; the other two are i386, and the problems on those systems have only been
kmem panics, which so far have been avoidable.

I can help by checking somewhat specific things and running prescribed tests,
but right now I don't have time to tackle this problem on this system and learn
how to debug it entirely on my own starting with nothing more than a DDB guide
from the handbook.  It's not that I refuse to; I recognize it's difficult to
join remote skill with local hands for something this technical.

Friday I replaced the motherboard/CPU just as a shot in the dark (since the
system had some strange instability in the past), but this didn't help zfs
(not surprised).  When zfs was hung Saturday morning, I tried to reboot it,
but the reboot would not even get far enough to stop new ssh connections.


