Date: Wed, 22 Sep 2010 12:38:08 +0200
From: Attila Nagy <bra@fsn.hu>
To: freebsd-fs@FreeBSD.org
Subject: zcolli (zcollide) state, what does znode dying means?
Message-ID: <4C99DC90.70208@fsn.hu>
Hello,

I have a machine that is heavily hammered with file system operations, running a very recent 8-STABLE. The symptom is that everything works fine for a few minutes, then a lot of processes get into the zcolli state (according to top). At that point there are two possible outcomes:

1. The disks calm down for a while (for long seconds there is no IO, or only a very small amount, verified with gstat), top shows nearly 100% system time, and a lot of processes are on the run queue (the load is sky-high, between 300 and 1000). All operations stop; top still refreshes, but I can't really execute new programs. Then suddenly the zcolli states change, the IO resumes and the run queue shrinks.

2. The system remains in this state. After 5-10 minutes there is still no change and only a reset helps (it doesn't even react to CTRL-ALT-DEL; running programs such as top still refresh, but no disk IO can be made).

The zcollide state only appears here:
http://fxr.watson.org/fxr/source/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c#L915
and the code there says this is due to a dying znode.

My question is: what does a dying znode mean? I don't think it's related to the on-disk structure, because the disks seem to be healthy and respond quickly (or at least evenly slowly, given the load; I can't see any disk that has read errors or slow responses).

Having slowdowns due to this is bad, but having lockups is a lot worse...

Thanks,