Date: Wed, 05 Sep 2007 08:33:39 +0930
From: Benjamin Close <Benjamin.Close@clearchain.com>
To: Kenneth Vestergaard Schmidt <kvs@pil.dk>
Cc: freebsd-current@freebsd.org
Subject: Re: Unkillable and runaway processes
Message-ID: <46DDE44B.1060203@clearchain.com>
In-Reply-To: <m1lkbmtw6z.fsf@binarysolutions.dk>
References: <m1lkbmtw6z.fsf@binarysolutions.dk>

Kenneth Vestergaard Schmidt wrote:
> Hello.
>
> Our ZFS testbed is experiencing some weird problems with rsync. We run a
> nightly backup of about 1.6 TB data (that's how much is stored, not how
> much is transferred), but after the initial sync I haven't been able to
> get the machine through one full cycle.
>
> After many hours of rsyncing data from 50+ machines, suddenly one
> rsync-process will hang, spinning on the CPU.
>
> It switches state between CPU0, CPU1, RUN and 'zfs:(&', but doesn't
> really do anything. It can't be killed, and you can't reboot the machine
> - it'll get past syncing disks, but won't shutdown or reboot.
>
> I can't do an 'ls' in the directory that rsync is running on - it'll
> just hang, too.
>
> The machine is running current from August 29th.
>
> I could use some pointers on what to do - is there some way I can debug
> this better, maybe give some better info?
>

I do a similar thing with close to 3 TB of data and have found that too
much activity causes the same hang you mention. Disabling the ZIL fixes
the issue:

    vfs.zfs.zil_disable=1

in /boot/loader.conf

Since ZFS is always consistent on disk and this is a nightly rsync,
disabling the ZIL is quite safe.

I'd love to debug this here, but can't, as the box uses a USB
mouse/keyboard, so every time I drop to a debugger I lose keyboard
support :(

Cheers,
    Benjamin
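
For reference, the loader tunable named in the reply might be written as below. This is only a sketch: the quoted value follows the usual /boot/loader.conf convention, the comment line is illustrative, and it assumes a ZFS version from around the time of this thread that still provides the tunable.

```
# /boot/loader.conf
# Disable the ZFS intent log (ZIL), as suggested in the reply above.
vfs.zfs.zil_disable="1"
```

Entries in /boot/loader.conf are read at boot, so the setting should only take effect after a restart. Running `sysctl vfs.zfs.zil_disable` afterwards should show whether the value was picked up, assuming the tunable is exposed as a sysctl on the system in question.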