Date: Tue, 17 May 2016 11:08:14 +0200
From: rainer@ultra-secure.de
To: Fabian Keil <freebsd-listen@fabiankeil.de>
Cc: FreeBSD Filesystems <freebsd-fs@freebsd.org>, owner-freebsd-fs@freebsd.org
Subject: Re: zfs receive stalls whole system
Message-ID: <c090ab7bbff2fffe2a49284f9be70183@ultra-secure.de>
In-Reply-To: <20160517102757.135c1468@fabiankeil.de>
References: <0C2233A9-C64A-4773-ABA5-C0BCA0D037F0@ultra-secure.de> <20160517102757.135c1468@fabiankeil.de>

On 2016-05-17 10:27, Fabian Keil wrote:
> Rainer Duffner <rainer@ultra-secure.de> wrote:
>
>> I have two servers, that were running FreeBSD 10.1-AMD64 for a long
>> time, one zfs-sending to the other (via zxfer). Both are NFS-servers
>> and MySQL-slaves, the sender is actively used as NFS-server, the
>> recipient is just a warm-standby, in case something serious happens
>> and we don’t want to wait for a day until the restore is back in
>> place. The MySQL-Slaves are actively used as read-only servers (at the
>> application level, Python’s SQL-Alchemy does that, apparently).
>>
>> They are HP DL380G8 (one CPU, hexacore) with over 128 GB RAM (I think
>> one has 144, the other has 192).
>> While they were running 10.1, they used HP P420 RAID-controllers with
>> individual 12 RAID0 volumes that I pooled into 6-disk RAIDZ2 vdevs.
>> I use zfsnap to do hourly, daily and weekly snapshots.
> [...]
>> Now, when I do a zxfer, sometimes the whole system stalls while the
>> data is sent over, especially if the delta is large or if something
>> else is reading from the disk at the same time (backup agent).
>>
>> I had this before, on 10.0 (I believe, we didn’t have this in 9.1
>> either, IIRC) and it went away in 10.1.
>
> Do you use geli for swap device(s)?

Yes, I do.

/dev/mirror/swap.eli none swap sw 0 0

Bad idea?

>> It’s very difficult (well, impossible) to debug, because the system
>> totally hangs and doesn’t accept any keypresses.
>
> You could try reducing ZFS's deadman timeout to get a panic.
> On systems with local disks I usually use:
>
> vfs.zfs.deadman_enabled: 1
> vfs.zfs.deadman_checktime_ms: 5000
> vfs.zfs.deadman_synctime_ms: 10000

Too bad I don't have a spare-system I could use to test this ;-)
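
Editorial note: the three deadman values quoted above appear in the `name: value` form that `sysctl` prints, while setting a value with sysctl(8) or in /etc/sysctl.conf uses `name=value`. A minimal sketch of that conversion follows. The function name `sysctl_output_to_assignments` is hypothetical and not part of any tool mentioned in the thread. Whether each of these knobs can be changed at runtime or only as a boot-time tunable on a given FreeBSD release is an assumption to verify before relying on it.

```python
def sysctl_output_to_assignments(text):
    """Turn 'name: value' lines (sysctl output style) into 'name=value' lines.

    Hypothetical helper for illustration only; it does not touch the system.
    """
    assignments = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a 'name: value' line: {line!r}")
        assignments.append(f"{name.strip()}={value.strip()}")
    return assignments


# The values quoted in the message above, copied verbatim.
quoted = """
vfs.zfs.deadman_enabled: 1
vfs.zfs.deadman_checktime_ms: 5000
vfs.zfs.deadman_synctime_ms: 10000
"""

for assignment in sysctl_output_to_assignments(quoted):
    print(assignment)
```

Each printed line could then be passed to `sysctl` as an argument or placed in a configuration file, assuming the running release accepts the knob there.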