Date: Sat, 9 Oct 2004 17:01:01 +0000
From: "Mikhail P." <miha@ghuug.org>
To: freebsd-hackers@freebsd.org
Cc: Dag-Erling Smørgrav <des@des.no>
Subject: Re: ad0: FAILURE - WRITE_DMA
Message-ID: <200410091701.01987.miha@ghuug.org>
In-Reply-To: <xzp8yafsvwz.fsf@dwp.des.no>
References: <200410081937.15068.miha@ghuug.org> <200410091617.26794.miha@ghuug.org> <xzp8yafsvwz.fsf@dwp.des.no>
[-- Attachment #1 --]

On Saturday 09 October 2004 16:23, Dag-Erling Smørgrav wrote:
> "Mikhail P." <miha@ghuug.org> writes:
> > On Saturday 09 October 2004 15:01, Dag-Erling Smørgrav wrote:
> > > A lot of them, or just one or two? Some ATA drives will spin down at
> > > regular intervals to recalibrate, and you'll get a harmless timeout if
> > > you try to write to the disk while it's doing that.
> >
> > Unfortunately, all the drives (so far - four 200GB drives).
>
> I meant "a lot of timeouts", not "a lot of drives". If you only get
> one or two timeouts per drive at regular intervals (say, once a
> month), they're just recalibrating and there's nothing to worry about.
>

Well, there is no pattern. Often it just happens by itself: the system runs fine for 3-10 days (no warnings, no timeouts), and after that I start seeing lots of these.

To be more exact: say I have a user whose home dir is /home/user, and the user uploads and downloads files under that directory via FTP. Let's say he has 5k files in total (ranging in size from 1kb to 20mb). When the user tries to access certain files (either to continue an upload or to continue a download), the system spews lots of these timeouts and an "input/output error" occurs.

For example, yesterday it showed 360 of these messages during a 12 hour period, and unfortunately the system locked up while I was sleeping. The last message in /var/log/messages was about the ad0 failure.

I'm not exactly sure which files it timed out on yesterday, but I do know under which directory it happened. That directory has 20k files in it (not in a single dir, but counting subdirs). Maybe someone knows a quick way I could open every file under that directory? That could help identify exactly which files the timeouts happened on.

Before replacing the drives, I had that server up for 120 days, and it did spew these messages (more and more every day, starting at about the 90th day of uptime).

After rebooting the system, it asked for fsck, which I did run, but it showed some softupdates inconsistencies and refused to mount /home rw.

By the way, I just ran fsck on the rw-mounted /home (that's where those timeouts occurred yesterday), and I have attached its output.

I also got another message off-list, where the author suggested playing with the UDMA values. I switched from UDMA100 to UDMA66. The system's uptime is 12 hours, and no timeouts so far.. but I'm quite sure they will come back in a few days.

> BTW, are you using ataidle or anything similar?

nope, nothing.

>
> DES

regards,
M.

[-- Attachment #2 --]

[root]@[beer]:/usr/local/etc/rc.d> fsck /home
** /dev/ad0s1g (NO WRITE)
** Last Mounted on /home
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
LINK COUNT FILE I=8715003 OWNER=noc MODE=0 SIZE=0 MTIME=Oct 9 09:50 2004 COUNT 0 SHOULD BE -1
ADJUST? no
LINK COUNT FILE I=8715004 OWNER=noc MODE=0 SIZE=0 MTIME=Oct 9 09:50 2004 COUNT 0 SHOULD BE -1
ADJUST? no
LINK COUNT FILE I=8715005 OWNER=noc MODE=0 SIZE=0 MTIME=Oct 9 09:50 2004 COUNT 0 SHOULD BE -1
ADJUST? no
LINK COUNT FILE I=8715006 OWNER=noc MODE=0 SIZE=0 MTIME=Oct 9 09:50 2004 COUNT 0 SHOULD BE -1
ADJUST? no
LINK COUNT FILE I=8715007 OWNER=noc MODE=0 SIZE=0 MTIME=Oct 9 09:50 2004 COUNT 0 SHOULD BE -1
ADJUST? no
LINK COUNT FILE I=8715008 OWNER=noc MODE=0 SIZE=0 MTIME=Oct 9 09:50 2004 COUNT 0 SHOULD BE -1
ADJUST? no
LINK COUNT FILE I=8715009 OWNER=noc MODE=0 SIZE=0 MTIME=Oct 9 09:50 2004 COUNT 0 SHOULD BE -1
ADJUST? no
LINK COUNT FILE I=8715010 OWNER=noc MODE=0 SIZE=0 MTIME=Oct 9 09:50 2004 COUNT 0 SHOULD BE -1
ADJUST? no
LINK COUNT FILE I=8715016 OWNER=noc MODE=0 SIZE=0 MTIME=Oct 9 09:50 2004 COUNT 0 SHOULD BE -1
ADJUST? no
LINK COUNT FILE I=8715017 OWNER=noc MODE=0 SIZE=0 MTIME=Oct 9 09:50 2004 COUNT 0 SHOULD BE -1
ADJUST? no
LINK COUNT FILE I=8715080 OWNER=noc MODE=0 SIZE=0 MTIME=Oct 9 09:50 2004 COUNT 0 SHOULD BE -1
ADJUST? no
LINK COUNT FILE I=8715086 OWNER=noc MODE=0 SIZE=0 MTIME=Oct 9 09:50 2004 COUNT 0 SHOULD BE -1
ADJUST? no
LINK COUNT FILE I=8715087 OWNER=noc MODE=0 SIZE=0 MTIME=Oct 9 09:50 2004 COUNT 0 SHOULD BE -1
ADJUST? no
LINK COUNT FILE I=8715093 OWNER=noc MODE=0 SIZE=0 MTIME=Oct 9 09:50 2004 COUNT 0 SHOULD BE -1
ADJUST? no
LINK COUNT FILE I=8715094 OWNER=noc MODE=0 SIZE=0 MTIME=Oct 9 09:50 2004 COUNT 0 SHOULD BE -1
ADJUST? no
LINK COUNT FILE I=8715100 OWNER=noc MODE=0 SIZE=0 MTIME=Oct 9 09:50 2004 COUNT 0 SHOULD BE -1
ADJUST? no
LINK COUNT FILE I=8715101 OWNER=noc MODE=0 SIZE=0 MTIME=Oct 9 09:50 2004 COUNT 0 SHOULD BE -1
ADJUST? no
LINK COUNT FILE I=8715107 OWNER=noc MODE=0 SIZE=0 MTIME=Oct 9 09:50 2004 COUNT 0 SHOULD BE -1
ADJUST? no
LINK COUNT FILE I=8715129 OWNER=noc MODE=0 SIZE=0 MTIME=Oct 9 09:50 2004 COUNT 0 SHOULD BE -1
ADJUST? no
LINK COUNT FILE I=8715142 OWNER=noc MODE=0 SIZE=0 MTIME=Oct 9 09:50 2004 COUNT 0 SHOULD BE -1
ADJUST? no
LINK COUNT FILE I=8715143 OWNER=noc MODE=0 SIZE=0 MTIME=Oct 9 09:50 2004 COUNT 0 SHOULD BE -1
ADJUST? no
LINK COUNT FILE I=8715156 OWNER=noc MODE=0 SIZE=0 MTIME=Oct 9 09:50 2004 COUNT 0 SHOULD BE -1
ADJUST? no
LINK COUNT FILE I=8715157 OWNER=noc MODE=0 SIZE=0 MTIME=Oct 9 09:50 2004 COUNT 0 SHOULD BE -1
ADJUST? no
LINK COUNT FILE I=8715163 OWNER=noc MODE=0 SIZE=0 MTIME=Oct 9 09:50 2004 COUNT 0 SHOULD BE -1
ADJUST? no
** Phase 5 - Check Cyl groups
SUMMARY INFORMATION BAD
SALVAGE? no
BLK(S) MISSING IN BIT MAPS
SALVAGE? no
ALLOCATED FRAGS 34852132-34852134 MARKED FREE
ALLOCATED FRAGS 34852264-34852268 MARKED FREE
ALLOCATED FRAGS 34852344-34852347 MARKED FREE
ALLOCATED FRAGS 34852376-34852380 MARKED FREE
ALLOCATED FRAGS 34852452-34852453 MARKED FREE
ALLOCATED FRAGS 34852512-34852513 MARKED FREE
ALLOCATED FRAGS 34852536-34852540 MARKED FREE
ALLOCATED FRAGS 34852544-34852545 MARKED FREE
ALLOCATED FRAGS 34852548-34852549 MARKED FREE
ALLOCATED FRAG 34852567 MARKED FREE
ALLOCATED FRAG 34852583 MARKED FREE
ALLOCATED FRAGS 34852594-34852599 MARKED FREE
ALLOCATED FRAGS 34852616-34852620 MARKED FREE
ALLOCATED FRAGS 34852757-34852758 MARKED FREE
ALLOCATED FRAGS 34852818-34852820 MARKED FREE
ALLOCATED FRAGS 34852824-34852827 MARKED FREE
ALLOCATED FRAG 34852906 MARKED FREE
ALLOCATED FRAGS 34852925-34852927 MARKED FREE
ALLOCATED FRAGS 34853136-34853140 MARKED FREE
ALLOCATED FRAGS 34853144-34853148 MARKED FREE
ALLOCATED FRAGS 34853152-34853156 MARKED FREE
ALLOCATED FRAGS 34853160-34853164 MARKED FREE
ALLOCATED FRAGS 34853168-34853172 MARKED FREE
ALLOCATED FRAGS 34853245-34853246 MARKED FREE
ALLOCATED FRAGS 34853280-34853284 MARKED FREE
ALLOCATED FRAGS 34853288-34853292 MARKED FREE
ALLOCATED FRAGS 34853304-34853308 MARKED FREE
ALLOCATED FRAGS 34853352-34853356 MARKED FREE
ALLOCATED FRAGS 34853365-34853366 MARKED FREE
ALLOCATED FRAGS 34853368-34853372 MARKED FREE
ALLOCATED FRAGS 34853400-34853404 MARKED FREE
ALLOCATED FRAGS 34853490-34853494 MARKED FREE
ALLOCATED FRAGS 34853496-34853500 MARKED FREE
ALLOCATED FRAGS 34853536-34853545 MARKED FREE
ALLOCATED FRAGS 34853568-34853572 MARKED FREE
ALLOCATED FRAGS 34853868-34853870 MARKED FREE
ALLOCATED FRAGS 34853949-34853951 MARKED FREE
ALLOCATED FRAGS 34854074-34854075 MARKED FREE
ALLOCATED FRAGS 34854934-34854935 MARKED FREE
ALLOCATED FRAGS 34855504-34855508 MARKED FREE
ALLOCATED FRAGS 34855776-34855777 MARKED FREE
ALLOCATED FRAGS 34855920-34855924 MARKED FREE
ALLOCATED FRAGS 34856856-34856857 MARKED FREE
ALLOCATED FRAGS 34857067-34857068 MARKED FREE
ALLOCATED FRAGS 34871843-34871847 MARKED FREE
ALLOCATED FRAGS 34879373-34879374 MARKED FREE
ALLOCATED FRAGS 37584536-37584551 MARKED FREE
ALLOCATED FRAGS 37601008-37601014 MARKED FREE
471717 files, 47373681 used, 38091807 free (33239 frags, 4757321 blocks, 0.0% fragmentation)
[root]@[beer]:/usr/local/etc/rc.d>
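
The message above asks for a quick way to open every file under a directory in order to find which ones trigger the timeouts. One possible approach, offered only as a rough sketch and not as anything proposed in the thread, is to walk the tree and read each regular file to the end, recording the paths where the read fails. The function name find_unreadable and the 1 MB chunk size are assumptions made for illustration. Files already held in the buffer cache may be served without touching the disk, so a clean run does not prove the disk is healthy.

```python
import os
import sys


def find_unreadable(root, chunk_size=1 << 20):
    """Read every regular file under root; return (path, error) per failure."""
    failures = []

    def on_walk_error(exc):
        # Directories that cannot be listed are recorded instead of skipped.
        failures.append((exc.filename, str(exc)))

    for dirpath, _dirnames, filenames in os.walk(root, onerror=on_walk_error):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks, FIFOs, devices and other non-regular files.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as fh:
                    # Read to EOF in chunks so every block is requested.
                    while fh.read(chunk_size):
                        pass
            except OSError as exc:
                failures.append((path, str(exc)))
    return failures


if __name__ == "__main__" and len(sys.argv) == 2:
    for bad_path, message in find_unreadable(sys.argv[1]):
        print(f"{bad_path}: {message}")
```

Saved under a hypothetical name such as readcheck.py, it could be run as "python3 readcheck.py /home/user" while watching /var/log/messages. A read that hits a bad region may stall for the length of the ATA timeout before the error is reported.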
