Date: Sun, 9 Jan 2011 22:51:09 -0800
From: Jeremy Chadwick <freebsd@jdc.parodius.com>
To: Tom Vijlbrief <tom.vijlbrief@xs4all.nl>
Cc: freebsd-stable@freebsd.org
Subject: Re: Panic 8.2 PRERELEASE WRITE_DMA48
Message-ID: <20110110065109.GA61075@icarus.home.lan>
In-Reply-To: <AANLkTim13%2B=jUuRUm1=x9r=x9ufWga66jMCLyRaGKCdk@mail.gmail.com>
References: <AANLkTi=iaq1Lx521oUF2BSB4-2wi9Ys2fTLzz4kLaLVo@mail.gmail.com>
 <20110109122243.GA37530@icarus.home.lan>
 <AANLkTin3FHcsdMtA9OYaA2wrUx%2BfpyEsTThdRmS8sXA5@mail.gmail.com>
 <20110109163027.GA42562@icarus.home.lan>
 <AANLkTim13%2B=jUuRUm1=x9r=x9ufWga66jMCLyRaGKCdk@mail.gmail.com>

On Mon, Jan 10, 2011 at 07:13:57AM +0100, Tom Vijlbrief wrote:
> 2011/1/9 Jeremy Chadwick <freebsd@jdc.parodius.com>:
> >
> > Not to get off topic, but what is causing this?  It looks like you have
> > a cron job or something very aggressive doing a "smartctl -t short
> > /dev/ad4" or equivalent.  If you have such, please disable this
> > immediately.  You shouldn't be doing SMART tests with such regularity;
> > it accomplishes absolutely nothing, especially the "short" tests.  Let
> > the drive operate normally, otherwise run smartd and watch logs instead.
> >
>
> I have this default entry (from the author of that file) in
> smartd.conf and enabled it on many machines over the years.
> Is it a bad practice?
>
> # First (primary) ATA/IDE hard disk. Monitor all attributes, enable
> # automatic online data collection, automatic Attribute autosave, and
> # start a short self-test every day between 2-3am, and a long self test
> # Saturdays between 3-4am.
> /dev/hda -a -o on -S on -s (S/../.././02|L/../../6/03)

I'll have to talk to Bruce Allen about that.  Those entries in
smartd.conf are pretty old (meaning they've existed for a very long
time, and chances are Bruce hasn't gone back to revamp them or
reconsider the logic/justification behind them).

I'm an opponent of running SMART tests automatically, given what some do
to drives.  It's important to remember that most SMART tests can be done
while the drive is in operation, and some of these tests stress the
drive, which could potentially cause timeouts or other I/O anomalies
(data loss is unlikely, but odd errors may occur; it all depends on the
firmware).  This is especially important WRT "long" tests.

For example, on newer 2TB Western Digital Caviar Black drives, a long
test does something that I haven't heard (yes, heard) any other drive do
-- it emits a noise that's almost identical to that of a head crash.  It
could be scanning a very specific region of LBAs (possibly out-of-range
sectors, e.g. spares) repetitively, but it sounds nothing like a
selective LBA scan.  Honestly it does sound like a head crash.  Is this
something you'd really want to be running every 7 days?

I've always advocated that people run smartd only if they want to
monitor attributes -- which ultimately are the most important things to
keep an eye on anyway.  It's even more important to know how to read
them.  :-)  90% of drives out there update their attributes at set
intervals or when the SMART READ DATA command is encountered.

And honestly I've never seen a SMART short test do anything useful, on
any drive I've used (SATA or SCSI; WD, Seagate, Maxtor, Hitachi,
Fujitsu).  Long tests are different in this regard.

I'm fully aware that the terms "short" and "long" are vague in nature
and don't really tell a person what the drive is doing behind the
scenes.  Sadly that's the nature of SMART; they're just tests that are
defined on a per-vendor (or per-disk-model!) basis.  But as my 2nd
paragraph above implies, the behaviour is not consistent.

So when people ask me "how do I monitor my disks reliably with SMART
then?", I tell them to either do it by hand (which is what I do), or run
smartd(8) and keep an eye on their logs.  This requires some tuning, and
familiarity with what attribute means what, and again on a per-drive or
per-vendor basis.  It's great that there's no actual standard for these,
isn't it?  :-)

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP 4BD6C0CB |
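
For reference, the attribute-only monitoring suggested in the message
above might look roughly like the smartd.conf fragment below.  This is a
sketch under assumptions, not a configuration given in the thread: it is
the quoted default entry with the "-s" self-test schedule removed, the
device name /dev/ad4 is borrowed from the quoted smartctl command, and
the "-o on" and "-S on" directives are left out only because the message
does not say whether they should stay.

```
# Sketch only: monitor SMART attributes and logs; schedule no self-tests.
# /dev/ad4 is an assumed device name taken from the quoted smartctl command.
# The "-s (S/../.././02|L/../../6/03)" schedule from the default entry is
# deliberately omitted, so smartd never starts short or long tests.
/dev/ad4 -a
```

With an entry like this, smartd(8) still logs attribute changes and
error-log growth, which is the part that needs reading and per-drive
tuning; any self-test would be started by hand with smartctl(8).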