Date: Thu, 09 Jul 1998 06:49:32 +0900
From: Tetsuro FURUYA <ht5t-fry@asahi-net.or.jp>
To: smarzloff@carif-idf.org
Cc: freebsd-stable@FreeBSD.ORG, Tetsuro FURUYA <tfu@ff.iij4u.or.jp>
Subject: Re: Disk problem.
Message-ID: <199807082149.GAA01464@galois.tf.or.jp>
In-Reply-To: Your message of "Wed, 8 Jul 1998 17:30:36 +0200"
References: <19980708173036.A14305@rafiki.intranet.carif.asso.fr>
Stephane Marzloff <smarzloff@carif-idf.org> wrote:
> Hi..
>
> I have a problem with a 2.2.6-STABLE (6 Jul) on a Ppro 200.
>
> Sometimes, when I launch some applications (mutt, ls, vmstat..), there is no
> responses during 10 sec.
> I suspect a disk problem.
>
> The machine isn't charge, Load average is constantly : 0.00 (0.50 maximum).
> There 18Mo of Free RAM.
>
> And 5 minutes ago, I have this message on the console :
> Jul 8 17:07:46 rafiki /kernel: wd0: interrupt timeout:
> Jul 8 17:07:46 rafiki /kernel: wd0: interrupt timeout:
> Jul 8 17:07:46 rafiki /kernel: wd0: status 50<rdy,seekdone> error 0
> Jul 8 17:07:46 rafiki /kernel: wd0: status 50<rdy,seekdone> error 0

Your IDE disk has bad sectors.
Try
    bad144 -s -v /dev/wd0
or badsect & fsck (this is rather difficult, so please read the man pages).

If the system hangs during disk access:
1) Compile the kernel debugger ddb into the kernel.
   When the system hangs, type Ctrl-Alt-Esc to get into ddb, and wait
   until disk access stops, about 20-60 seconds (this depends on the system).
   Then type 'c' to continue bad144 or fsck.
2) Patch /usr/src/sys/i386/isa/wd.c.
   See this mail.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Message-Id: <199806102228.PAA00747@dingo.cdrom.com>
X-Mailer: exmh version 2.0zeta 7/24/97
To: Tetsuro FURUYA <ht5t-fry@asahi-net.or.jp>
cc: mike@smith.net.au, robinson@public.bta.net.cn, freebsd-stable@freebsd.org,
    freebsd-questions@freebsd.org, Tetsuro FURUYA <tfu@ff.iij4u.or.jp>
Subject: Re: Bug in wd driver
In-reply-to: Your message of "Thu, 11 Jun 1998 04:41:08 +0900." <199806101941.EAA11696@dilemma.tf.or.jp>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Wed, 10 Jun 1998 15:28:29 -0700
From: Mike Smith <mike@smith.net.au>
Sender: owner-freebsd-stable@freebsd.org
X-Loop: FreeBSD.ORG

> > > >fsck /usr
> > > >.....
> > > >wd0: interrupt timeout:
> > > >wd0: status 50<rdy,seekdone> error 0
> > > >wd0: interrupt timeout:
> > > >wd0: status 50<rdy,seekdone> error 1<no_dam>
> > > >
> > > >===> hang up
> > > >===> type 'cntrl-alt-esc'
> >
> > This defers the interrupt timeout...
> >
> > > >db>wd0s1f: hard error reading fsbn 1152850 of 1152850-1152851(wd0s1 bn
> > > >1279826; cn 317 tn 26 sn 44)
> > > >wd0: status 59<rdy,seekdone,drq,err> error 40<uncorr>
> >
> > ... but not the interrupt, which finally arrives and contains real
> > error information. Note that the interrupt timeouts in your case
> > *don't* have DRQ set. Are you running in multi-block mode?
> >
> > > As for wd.c source, I will try to experiment :)
> >
> > Please do. It looks like your information may lead to a result here.
>
> It seems too late for writing reply to mailing list.

Not at all; better late than never!

> But, this seems important to note-users, so I dare to report the result of
> my experiment of patch to /usr/src/sys/i386/isa/wd.c
> which Mr. Mike Smith's stated, ...
> > if (wdtab[ctrlr].b_errcnt == 0)
> >         du->dk_timeout = 1 + 10;
> > else
> >         du->dk_timeout = 1 + 3;    <---- Only this line.
> >
> >
> >Increase the 10 and 3 values (first and subsequent timeouts). Try
> >raising them lots, then come down slowly.
>
> Unfortunately, my /usr/src/sys/i386/isa/wd.c is different
> from the above source code.
> There is just only the last line in the wd.c.
>
> So, I rewrite only this last line, and increased 3 to 50. ( Is this OK?)

It's just a number, and you're in the best position to determine
whether it's big enough.

> Up to now, I have not yet experienced any disk crash, nor cannot-mount-root
> problem, nor anything bad else.

Excellent! And thanks for confirming this. I hope that the original
plaintiff is in a position to try this themselves - I would be more
than happy to be completely wrong about the situation. 8)

> You have written that
> >raising them lots, then come down slowly.
>
> Is there any inconvenience when du->dk_timeout value is
> very large ?
> What if du->dk_timeout value is too large ?

The only inconvenience is in the case where the disk has truly failed
to generate an interrupt, and the delay involved before reporting the
failure.

> What is this du->dk_timeout ?

It determines how long a disk is allowed to take to complete a command.

> I've just tried 'cd /usr; badsect BAD 1152850 1215577' & 'fsck /dev/rwd0s1f',
> but 'bad144 -s -v /dev/wd0' should work fine.
> ( I had often used bad144. But now, my bad sectors of wd0 become too many
> for bad144 :( )
> badsect & fsck don't take care of swap area,
> nevertheless they are working fine now :)
>
> So, Thank you Mr. Mike Smith !

No, definitely this time the thanks are for you. I'll look at
increasing this timeout significantly for both -stable and -current, if
someone doesn't beat me to it.

--
\\ Sometimes you're ahead,       \\ Mike Smith
\\ sometimes you're behind.      \\ mike@smith.net.au
\\ The race is long, and in the  \\ msmith@freebsd.org
\\ end it's only with yourself.  \\ msmith@cdrom.com

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message

========================================================================
TEL: 048-852-3520 FAX: 048-858-1597
E-Mail: ht5t-fry@asahi-net.or.jp tfu@ff.iij4u.or.jp
pgp-fingerprint:
pub Tetsuro FURUYA <ht5t-fry@asahi-net.or.jp>
Key fingerprint = F1 BA 5F C1 C2 48 1D C7 AE 5F 16 ED 12 17 75 38
=========================================================================
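
Editor's sketch of the timeout choice discussed in the quoted mail. This is not the actual wd.c code; it only restates the quoted two-branch snippet as a standalone function so the effect of raising the numbers is easy to see. The names WD_FIRST_TIMEOUT, WD_RETRY_TIMEOUT, struct disk_sketch, wd_pick_timeout and wd_arm_timeout are hypothetical, and the comment about units (roughly one tick per second) is an assumption, not something stated in the thread. The retry value 50 is the one reported above as working on one machine.

```c
/*
 * Hypothetical tunables (not names from wd.c). The quoted snippet used
 * 10 for the first attempt and 3 for retries; 50 is the retry value
 * reported in this thread. Units are assumed to be about one second.
 */
#define WD_FIRST_TIMEOUT 10
#define WD_RETRY_TIMEOUT 50

/* Stand-in for the driver's per-drive state; only the field we need. */
struct disk_sketch {
    int dk_timeout;
};

/*
 * Mirror the quoted logic: the first attempt (no errors so far) gets one
 * allowance, later attempts get another. The "1 +" is kept from the
 * quoted code.
 */
static int wd_pick_timeout(int errcnt)
{
    return 1 + (errcnt == 0 ? WD_FIRST_TIMEOUT : WD_RETRY_TIMEOUT);
}

/* Store the chosen allowance in the per-drive state before a command. */
static void wd_arm_timeout(struct disk_sketch *du, int errcnt)
{
    du->dk_timeout = wd_pick_timeout(errcnt);
}
```

As the reply above notes, the only cost of a large value is a longer wait before a drive that truly never interrupts is reported as failed, so it seems reasonable to start high and come down slowly.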