Date: Fri, 25 Jan 2008 16:38:46 -0800
From: Jeremy Chadwick <koitsu@FreeBSD.org>
To: Joe Peterson <joe@skyrush.com>
Cc: freebsd-stable@freebsd.org
Subject: Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1
Message-ID: <20080126003845.GA52183@eos.sc1.parodius.com>
In-Reply-To: <479A4CB0.5080206@skyrush.com>
References: <479A0731.6020405@skyrush.com> <20080125162940.GA38494@eos.sc1.parodius.com> <479A3764.6050800@skyrush.com> <3803988D-8D18-4E89-92EA-19BF62FD2395@mac.com> <479A4CB0.5080206@skyrush.com>

Joe,

I wanted to send you a note about something that I'm still in the process of dealing with. The timing couldn't be more ironic.

I decided it would be worthwhile to migrate from my two-disk ZFS stripe with a non-ZFS disk for nightly backups, to a RAIDZ pool of all 3 disks combined (since they're all the same size).

I had another terminal with gstat -I500ms running in it, so I could see overall I/O. All was going well until about the 81GB mark of the copy. gstat started showing 0KB in/out on all the drives, and the rsync was stalled. ^Z did nothing, which is usually a bad sign. :-)

I ssh'd in and did a dmesg (summarised):

ad6: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad6: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad6: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13951071
ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13951327
ad6: FAILURE - WRITE_DMA timed out LBA=13951071
ad6: FAILURE - WRITE_DMA timed out LBA=13951327
ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13951583
ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13951839
ad6: FAILURE - WRITE_DMA timed out LBA=13951583
ad6: FAILURE - WRITE_DMA timed out LBA=13951839
ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13952095
ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13952351
g_vfs_done():ad6s1d[WRITE(offset=7142916096, length=131072)]error = 5
g_vfs_done():ad6s1d[WRITE(offset=7143047168, length=131072)]error = 5
g_vfs_done():ad6s1d[WRITE(offset=7143178240, length=131072)]error = 5
g_vfs_done():ad6s1d[WRITE(offset=7143309312, length=131072)]error = 5
g_vfs_done():ad6s1d[WRITE(offset=7143440384, length=131072)]error = 5

It appears my /dev/ad6 (a Seagate -- more irony) must have some bad blocks.

Actually, after letting things go for a while, I realised the box had just locked up. It probably kernel panic'd due to the I/O problem. I'll have to poke at SMART stats later to see what showed up.

-- 
| Jeremy Chadwick                                    jdc at parodius.com |
| Parodius Networking                           http://www.parodius.com/ |
| UNIX Systems Administrator                      Mountain View, CA, USA |
| Making life hard for others since 1977.                  PGP: 4BD6C0CB |