Date: Wed, 09 Jan 2013 19:35:04 +0100
From: "Ronald Klop" <ronald-freebsd8@klop.yi.org>
To: freebsd-fs@freebsd.org
Subject: Re: slowdown of zfs (tx->tx)
Message-ID: <op.wqnpwqu08527sy@212-182-167-131.ip.telfort.nl>
In-Reply-To: <20130109162613.GA34276@mid.pc5.i.0x5.de>
References: <20130108174225.GA17260@mid.pc5.i.0x5.de> <CAFqOu6jgA8RWV5d%2BrOBk8D=3Vu3yWSnDkAi1cFJ0esj4OpBy2Q@mail.gmail.com> <20130109162613.GA34276@mid.pc5.i.0x5.de>

On Wed, 09 Jan 2013 17:26:13 +0100, Nicolas Rachinsky
<fbsd-mas-0@ml.turing-complete.org> wrote:

> * Artem Belevich <art@freebsd.org> [2013-01-08 12:47 -0800]:
>> On Tue, Jan 8, 2013 at 9:42 AM, Nicolas Rachinsky
>> <fbsd-mas-0@ml.turing-complete.org> wrote:
>> >   NAME                      STATE     READ WRITE CKSUM
>> >   pool1                     DEGRADED     0     0     0
>> >     raidz2-0                DEGRADED     0     0     0
>> >       ada5                  ONLINE       0     0     0
>> >       ada8                  ONLINE       0     0     0
>> >       ada2                  ONLINE       0     0     0
>> >       ada3                  ONLINE       0     0     0
>> >       11846390416703086268  UNAVAIL      0     0     0  was /dev/dsk/ada1
>> >       ada6                  ONLINE       0     0     0
>> >       ada0                  ONLINE       0     0     1
>> >       ada7                  ONLINE       0     0     0
>> >       ada4                  ONLINE       0     0     3
>>
>> You seem to have some checksum errors which does suggest hardware
>> troubles.
>
> I somehow missed these. Is there any way to learn when these checksum
> errors happen?
>
>> For starters, check smart info for all drives and see if they have any
>> relocated sectors.
>
> There are some disks with relocated sectors, but for both ada0 and
> ada4 Reallocated_Sector_Ct is 0.
>
>> Use gstat during your workload to see if any of the drives takes much
>> longer than others to handle its job.
>
> There is one disk sticking out a bit.
>
>> > There is almost no disk activity during this time.
>>
>> What kind of disk activity *is* there?
>
> What would be interesting?
>
>
>> > sync is disabled for the whole pool.
>>
>> If that's the case (assuming you're talking about sync=disabled zfs
>> property), then synchronous writes are probably not the cause of
>> slowdown. My guess would be either failing HDD or something funky with
>> cabling or sata controller.
>
> Yes, sync=disabled for pool1.
>
>
> Ok, I will start swapping hardware (sadly the machine is quite a drive
> away).
>
> Thank you very much for your help.
>
> Nicolas

If you are driving there anyway, replace this one:
>> >       11846390416703086268  UNAVAIL      0     0     0  was /dev/dsk/ada1

If the pool is healthy, checksum errors will be noticed earlier by the
sysadmin.

Ronald.
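
On noticing checksum errors earlier: below is a minimal sketch, assuming
Python is available on the machine and that the device rows of
"zpool status" look like the ones quoted above. The names DeviceStatus,
parse_zpool_status and problem_devices are made up for this example and
are not part of any ZFS tool. It only parses text handed to it. Running
"zpool status" periodically and feeding the output in is left to the
reader.

```python
import re
from typing import List, NamedTuple


class DeviceStatus(NamedTuple):
    """One row of the config table from `zpool status` (hypothetical helper type)."""
    name: str
    state: str
    read: int
    write: int
    cksum: int


# Assumption: error counters are plain integers, as in the quoted output.
# Rows whose counters are abbreviated (for example with a K suffix) are skipped.
_ROW = re.compile(
    r"^\s*(\S+)\s+(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)"
    r"\s+(\d+)\s+(\d+)\s+(\d+)"
)


def parse_zpool_status(text: str) -> List[DeviceStatus]:
    """Extract pool, vdev and disk rows from `zpool status` text."""
    rows = []
    for line in text.splitlines():
        match = _ROW.match(line)
        if match:
            name, state, read, write, cksum = match.groups()
            rows.append(DeviceStatus(name, state, int(read), int(write), int(cksum)))
    return rows


def problem_devices(rows: List[DeviceStatus]) -> List[DeviceStatus]:
    """Return rows that are not ONLINE or that have any non-zero error counter."""
    return [d for d in rows if d.state != "ONLINE" or d.read or d.write or d.cksum]


# The status output quoted earlier in this thread.
SAMPLE = """\
  NAME                      STATE     READ WRITE CKSUM
  pool1                     DEGRADED     0     0     0
    raidz2-0                DEGRADED     0     0     0
      ada5                  ONLINE       0     0     0
      ada8                  ONLINE       0     0     0
      ada2                  ONLINE       0     0     0
      ada3                  ONLINE       0     0     0
      11846390416703086268  UNAVAIL      0     0     0  was /dev/dsk/ada1
      ada6                  ONLINE       0     0     0
      ada0                  ONLINE       0     0     1
      ada7                  ONLINE       0     0     0
      ada4                  ONLINE       0     0     3
"""
```

Calling problem_devices(parse_zpool_status(SAMPLE)) picks out the degraded
pool and vdev, the missing disk, and the two disks with checksum errors.
With the pool already degraded, those first three rows are always reported,
which is the point made above: on a healthy pool any reported row is new
information.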