Date: Mon, 5 Feb 2007 01:13:31 -0600 (CST)
From: "Richard Lynch" <ceo@l-i-e.com>
To: "Chuck Swiger" <cswiger@mac.com>
Cc: freebsd-questions@freebsd.org
Subject: Re: READ_DMA48 error interpretation
Message-ID: <2195.67.184.122.32.1170659611.squirrel@www.l-i-e.com>
In-Reply-To: <3E64E786-E7A9-4914-BF29-DE89F25597E3@mac.com>
References: <1398.216.230.84.67.1168982036.squirrel@www.l-i-e.com> <3E64E786-E7A9-4914-BF29-DE89F25597E3@mac.com>

On Tue, January 16, 2007 3:21 pm, Chuck Swiger wrote:
> On Jan 16, 2007, at 1:13 PM, Richard Lynch wrote:
>> I know the messages below mean the hard drive or IDE cards are
>> having problems. But is this like RED ALERT or more like YELLOW
>> or what?
...
>> +ad1: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=404955007
>> +ad1: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR> error=10<NID_NOT_FOUND> LBA=404955007
>> +g_vfs_done():ad1s1[READ(offset=207336931328, length=16384)]error = 5
>
> If you have current backups, it's a yellow alert. Otherwise...
>
>> And what do I do about it?
>>
>> umount and fsck everything a lot?
>> swap cards/drives around until it stops?
>> Ignore it and pray?
>
> Try installing the sysutils/smartmontools port and run a drive
> self-test. That will give you a much better assessment of the state
> of the drive and whether it is likely to completely fail in the next
> 24 hours...

I ran the short test on the problem drives, and it said everything was
fine. I'll try the long test at a later date.

Meanwhile, I turned on the smartd daemon, and am seeing two issues in
the logs...

#1. The drive temperatures seem ridiculously high to this naive
reader, but what do I know?... 110 to 190 Celsius? Yikes... Or maybe
that's normal? How hot is too hot?

#2. Sequences like this show up a fair amount:

Device: /dev/ad2, SMART Prefailure Attribute: 3 Spin_Up_Time changed from 152 to 153
Device: /dev/ad2, SMART Prefailure Attribute: 3 Spin_Up_Time changed from 153 to 152
Device: /dev/ad0, SMART Prefailure Attribute: 8 Seek_Time_Performance changed from 251 to 250

So is the real "problem" just that the drives are spun down and can't
spin up fast enough? I can probably live with the consequences of
that, and just go on with life -- The occasional HTTP request for an
audio file will fail the first time, and they have to hit reload.
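
To see how much each attribute in the #2 sequences above actually moves, the log can be tallied per device and attribute. This is a minimal sketch, not part of smartmontools: the regular expression is an assumption based only on the three sample lines quoted above, and `summarize_changes` is a hypothetical helper name. Lines that do not match the pattern are skipped.

```python
import re
from collections import defaultdict

# Pattern assumed from the sample smartd lines quoted in this message;
# any other smartd message type is simply ignored.
CHANGE_RE = re.compile(
    r"Device: (?P<dev>\S+), SMART (?P<kind>Prefailure|Usage) Attribute: "
    r"(?P<id>\d+) (?P<name>\S+) changed from (?P<old>\d+) to (?P<new>\d+)"
)


def summarize_changes(lines):
    """Return {(device, attribute): (lowest, highest, number_of_changes)}."""
    seen = defaultdict(list)
    for line in lines:
        match = CHANGE_RE.search(line)
        if match is None:
            continue  # not an attribute-change message
        key = (match["dev"], match["name"])
        # Record both ends of every reported change.
        seen[key].extend([int(match["old"]), int(match["new"])])
    return {
        key: (min(values), max(values), len(values) // 2)
        for key, values in seen.items()
    }


# The three lines quoted above, used as sample input.
sample = [
    "Device: /dev/ad2, SMART Prefailure Attribute: 3 Spin_Up_Time changed from 152 to 153",
    "Device: /dev/ad2, SMART Prefailure Attribute: 3 Spin_Up_Time changed from 153 to 152",
    "Device: /dev/ad0, SMART Prefailure Attribute: 8 Seek_Time_Performance changed from 251 to 250",
]

summary = summarize_changes(sample)
for (dev, name), (low, high, count) in sorted(summary.items()):
    print(f"{dev} {name}: {count} change(s), range {low}..{high}")
```

Feeding the full log through the same function (for example, the lines of a saved copy of smartd.log) would show whether an attribute is only flipping between two adjacent values or drifting steadily in one direction.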

This box is the fail-safe roll-over server for audio files that are
all up online somewhere else managed by a professional (not me), so
it's no surprise that the rare time-out on the real server also ends
up with a drive spin up and failed request on the "backup".

Kind of annoying, I guess, to an end user, but forcing the drives to
always be spinning is probably not a Good Idea.

Oh, here's a rather long excerpt of the log in case there's minutiae
within it that I've failed to include:
http://l-i-e.com/smartd.log

Any help in interpreting these results is most appreciated!

THANKS!!!

-- 
Some people have a "gift" link here.
Know what I want?
I want you to buy a CD from some starving artist.
http://cdbaby.com/browse/from/lynch
Yeah, I get a buck. So?