Date: Mon, 8 Feb 2016 00:58:10 +0800 (AWST)
From: David Adam <zanchey@ucc.gu.uwa.edu.au>
To: freebsd-fs@freebsd.org
Subject: Re: Poor ZFS+NFSv3 read/write performance and panic
Message-ID: <alpine.DEB.2.11.1602080056390.17583@motsugo.ucc.gu.uwa.edu.au>
In-Reply-To: <alpine.DEB.2.11.1601292153420.26396@motsugo.ucc.gu.uwa.edu.au>
References: <alpine.DEB.2.11.1601292153420.26396@motsugo.ucc.gu.uwa.edu.au>

Just wondering if anyone has any idea how to identify which devices are
implicated in ZFS' vdev_deadman(). I have updated the firmware on the
mps(4) card that has our disks attached but that hasn't helped.

Thanks

David

On Fri, 29 Jan 2016, David Adam wrote:

> We have a FreeBSD 10.2 server sharing some ZFS datasets over NFSv3. It's
> worked well until recently, but has started to routinely perform
> exceptionally poorly, eventually panicking in vdev_deadman() (which I
> understand is a feature).
>
> Initially after booting, things are fine, but performance rapidly begins to
> degrade. Both read and write performance is terrible, with many operations
> either hanging indefinitely or timing out.
>
> When this happens, I can break into DDB and see lots of nfsd processes stuck
> waiting for a lock:
> Process 784 (nfsd) thread 0xfffff80234795000 (100455)
> shared lockmgr zfs (zfs) r = 0 (0xfffff8000b91f548) locked @
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:2196
>
> and the backtrace looks like this:
> sched_switch() at sched_switch+0x495/frame 0xfffffe04677740b0
> mi_switch() at mi_switch+0x179/frame 0xfffffe04677740f0
> turnstile_wait() at turnstile_wait+0x3b2/frame 0xfffffe0467774140
> __mtx_lock_sleep() at __mtx_lock_sleep+0x2c0/frame 0xfffffe04677741c0
> __mtx_lock_flags() at __mtx_lock_flags+0x102/frame 0xfffffe0467774210
> vmem_size() at vmem_size+0x5a/frame 0xfffffe0467774240
> arc_reclaim_needed() at arc_reclaim_needed+0xd2/frame 0xfffffe0467774260
> arc_get_data_buf() at arc_get_data_buf+0x157/frame 0xfffffe04677742a0
> arc_read() at arc_read+0x68b/frame 0xfffffe0467774350
> dbuf_read() at dbuf_read+0x7ed/frame 0xfffffe04677743f0
> dmu_tx_check_ioerr() at dmu_tx_check_ioerr+0x8b/frame 0xfffffe0467774420
> dmu_tx_count_write() at dmu_tx_count_write+0x17e/frame 0xfffffe0467774540
> dmu_tx_hold_write() at dmu_tx_hold_write+0xba/frame 0xfffffe0467774580
> zfs_freebsd_write() at zfs_freebsd_write+0x55d/frame 0xfffffe04677747b0
> VOP_WRITE_APV() at VOP_WRITE_APV+0x193/frame 0xfffffe04677748c0
> nfsvno_write() at nfsvno_write+0x13e/frame 0xfffffe0467774970
> nfsrvd_write() at nfsrvd_write+0x496/frame 0xfffffe0467774c80
> nfsrvd_dorpc() at nfsrvd_dorpc+0x66b/frame 0xfffffe0467774e40
> nfssvc_program() at nfssvc_program+0x4e6/frame 0xfffffe0467774ff0
> svc_run_internal() at svc_run_internal+0xbb7/frame 0xfffffe0467775180
> svc_run() at svc_run+0x1db/frame 0xfffffe04677751f0
> nfsrvd_nfsd() at nfsrvd_nfsd+0x1f0/frame 0xfffffe0467775350
> nfssvc_nfsd() at nfssvc_nfsd+0x124/frame 0xfffffe0467775970
> sys_nfssvc() at sys_nfssvc+0xb7/frame 0xfffffe04677759a0
> amd64_syscall() at amd64_syscall+0x278/frame 0xfffffe0467775ab0
> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe0467775ab0
>
> Is this likely to be due to bad hardware? I can't see any problems in
> the SMART data, and `camcontrol tags da0 -v` etc. does not reveal any
> particularly long queues. Are there other useful things to check?
>
> If not, do you have any other ideas? I can make the full DDB information
> available if that would be helpful.
>
> The pool is configured thus:
>         NAME                  STATE     READ WRITE CKSUM
>         space                 ONLINE       0     0     0
>           mirror-0            ONLINE       0     0     0
>             da0               ONLINE       0     0     0
>             da1               ONLINE       0     0     0
>           mirror-1            ONLINE       0     0     0
>             da2               ONLINE       0     0     0
>             da3               ONLINE       0     0     0
>           mirror-2            ONLINE       0     0     0
>             da4               ONLINE       0     0     0
>             da6               ONLINE       0     0     0
>           mirror-3            ONLINE       0     0     0
>             da7               ONLINE       0     0     0
>             da8               ONLINE       0     0     0
>         logs
>           mirror-4            ONLINE       0     0     0
>             gpt/molmol-slog   ONLINE       0     0     0
>             gpt/molmol-slog0  ONLINE       0     0     0
> where the da? devices are WD Reds and the SLOG partitions are on Samsung
> 840s.
>
> Many thanks,
>
> David Adam
> zanchey@ucc.gu.uwa.edu.au
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>

Cheers,

David Adam
zanchey@ucc.gu.uwa.edu.au
Ask Me About Our SLA!
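
For anyone repeating the per-disk check mentioned above (`camcontrol tags da0 -v` etc.), here is a minimal Python sketch that turns the quoted pool layout into a mirror-to-device map and prints one camcontrol invocation per da disk. It is not part of the original message. The function names `parse_vdevs` and `camcontrol_commands` are hypothetical, the parser assumes every top-level vdev is named `mirror-N` as in this pool, and the script only prints commands rather than running them. It does not by itself show which device vdev_deadman() tripped on.

```python
# Pool layout as quoted in the message; the parser does not rely on indentation.
POOL_STATUS = """\
NAME STATE READ WRITE CKSUM
space ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
da0 ONLINE 0 0 0
da1 ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
da2 ONLINE 0 0 0
da3 ONLINE 0 0 0
mirror-2 ONLINE 0 0 0
da4 ONLINE 0 0 0
da6 ONLINE 0 0 0
mirror-3 ONLINE 0 0 0
da7 ONLINE 0 0 0
da8 ONLINE 0 0 0
logs
mirror-4 ONLINE 0 0 0
gpt/molmol-slog ONLINE 0 0 0
gpt/molmol-slog0 ONLINE 0 0 0
"""


def parse_vdevs(status_text, pool_name):
    """Return {mirror_name: [leaf devices]} from the config rows (hypothetical helper)."""
    vdevs = {}
    current = None
    for line in status_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        name = fields[0]
        # Skip the column header, the pool row, and the "logs" section marker.
        if name in ("NAME", pool_name, "logs"):
            continue
        if name.startswith("mirror-"):
            # Assumption: all top-level vdevs are mirrors, as in this pool.
            current = name
            vdevs[current] = []
        elif current is not None:
            vdevs[current].append(name)
    return vdevs


def camcontrol_commands(vdevs):
    """Build the queue-depth check from the message for every da disk."""
    commands = []
    for leaves in vdevs.values():
        for leaf in leaves:
            # GPT-labelled SLOG partitions are skipped; only da devices are listed.
            if leaf.startswith("da"):
                commands.append("camcontrol tags %s -v" % leaf)
    return commands


if __name__ == "__main__":
    layout = parse_vdevs(POOL_STATUS, "space")
    for mirror, leaves in layout.items():
        print("%s: %s" % (mirror, ", ".join(leaves)))
    for command in camcontrol_commands(layout):
        print(command)
```

Grouping the output by mirror makes it easier to see whether slow or stalled disks cluster in a single vdev.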