Date:      Sat, 9 Oct 2004 17:01:01 +0000
From:      "Mikhail P." <miha@ghuug.org>
To:        freebsd-hackers@freebsd.org
Cc:        Dag-Erling Smørgrav <des@des.no>
Subject:   Re: ad0: FAILURE - WRITE_DMA
Message-ID:  <200410091701.01987.miha@ghuug.org>
In-Reply-To: <xzp8yafsvwz.fsf@dwp.des.no>
References:  <200410081937.15068.miha@ghuug.org> <200410091617.26794.miha@ghuug.org> <xzp8yafsvwz.fsf@dwp.des.no>

--Boundary-00=_NlBaBLZGX4OeKse
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

On Saturday 09 October 2004 16:23, Dag-Erling Smørgrav wrote:
> "Mikhail P." <miha@ghuug.org> writes:
> > On Saturday 09 October 2004 15:01, Dag-Erling Smørgrav wrote:
> > > A lot of them, or just one or two?  Some ATA drives will spin down at
> > > regular intervals to recalibrate, and you'll get a harmless timeout if
> > > you try to write to the disk while it's doing that.
> >
> > Unfortunately, all the drives (so far - four 200GB drives).
>
> I meant "a lot of timeouts", not "a lot of drives".  If you only get
> one or two timeouts per drive at regular intervals (say, once a
> month), they're just recalibrating and there's nothing to worry about.
>

Well, there is no pattern. Often it just happens by itself: the system runs
fine for 3-10 days (no warnings, no timeouts), and after that I start seeing
lots of these. To be more exact: I have a user whose home dir is /home/user,
and the user uses FTP to upload/download files under that directory. Let's
say he has 5k files in total (ranging in size from 1 KB to 20 MB). When the
user tries to access certain files (either to continue an upload or to
continue a download), the system spews lots of these timeouts and basically
an "input/output error" occurs. For example, yesterday it showed 360 of
these messages during a 12-hour period, and unfortunately, while I was
sleeping, the system locked up - the last message in /var/log/messages was
regarding the ad0 failure.
I'm not exactly sure on which files it timed out yesterday, but I do know
under which directory it happened - the directory has 20k files in it (not
in a single dir, but including subdirs). Maybe someone knows a quick way I
could open every file under that directory - this could probably help to
identify exactly on which files the timeouts happened.
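
A minimal sketch of such a walk, assuming Python is available on the
machine; the function name find_unreadable and the 1 MB chunk size are
illustrative choices, not anything from this thread. It reads every regular
file under a directory to EOF and reports each path whose read raises an OS
error (such as an input/output error). Reads may be served from the buffer
cache, and the failures reported here are WRITE_DMA, so a clean run would
not prove the disk is healthy.

```python
import os
import stat
import sys


def find_unreadable(root, chunk_size=1 << 20):
    """Read every regular file under root; return [(path, error), ...]."""
    failures = []

    def on_walk_error(err):
        # os.walk calls this when a directory itself cannot be listed.
        failures.append((err.filename, err))

    for dirpath, _dirnames, filenames in os.walk(root, onerror=on_walk_error):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Only regular files are read; FIFOs, sockets, devices
                # and symlinks are skipped so the walk cannot block.
                if not stat.S_ISREG(os.lstat(path).st_mode):
                    continue
                with open(path, "rb") as fh:
                    # Read to EOF so every block of the file is requested.
                    while fh.read(chunk_size):
                        pass
            except OSError as err:
                failures.append((path, err))
    return failures


if __name__ == "__main__":
    # Hypothetical usage: python readall.py /home/user
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for path, err in find_unreadable(root):
        print(f"{path}: {err}")
```

Running it while watching /var/log/messages should make it possible to
match a timeout to the file being read at that moment.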

Before replacing the drives, I had that server up for 120 days, and it did
spew these messages (more and more every day, starting at about the 90th day
of uptime). After rebooting, the system asked for fsck, which I did run, but
it showed some softupdates inconsistencies and refused to mount /home rw.

By the way, I just ran fsck on the rw-mounted /home (that's where those
timeouts occurred yesterday), and I have attached its output.

I also got another message off-list, where the author suggested playing with
the UDMA values. I switched from UDMA100 to UDMA66. The system's uptime is 12
hours, with no timeouts so far, but I'm quite sure they will come back in a
few days.

> BTW, are you using ataidle or anything similar?

nope, nothing.

>
> DES

regards,
M.

--Boundary-00=_NlBaBLZGX4OeKse
Content-Type: text/plain;
  charset="iso-8859-1";
  name="fsck.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
	filename="fsck.txt"

[root]@[beer]:/usr/local/etc/rc.d> fsck /home
** /dev/ad0s1g (NO WRITE)
** Last Mounted on /home
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
LINK COUNT FILE I=8715003  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715004  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715005  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715006  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715007  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715008  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715009  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715010  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715016  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715017  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715080  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715086  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715087  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715093  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715094  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715100  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715101  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715107  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715129  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715142  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715143  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715156  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715157  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715163  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

** Phase 5 - Check Cyl groups
SUMMARY INFORMATION BAD
SALVAGE? no

BLK(S) MISSING IN BIT MAPS
SALVAGE? no

ALLOCATED FRAGS 34852132-34852134 MARKED FREE
ALLOCATED FRAGS 34852264-34852268 MARKED FREE
ALLOCATED FRAGS 34852344-34852347 MARKED FREE
ALLOCATED FRAGS 34852376-34852380 MARKED FREE
ALLOCATED FRAGS 34852452-34852453 MARKED FREE
ALLOCATED FRAGS 34852512-34852513 MARKED FREE
ALLOCATED FRAGS 34852536-34852540 MARKED FREE
ALLOCATED FRAGS 34852544-34852545 MARKED FREE
ALLOCATED FRAGS 34852548-34852549 MARKED FREE
ALLOCATED FRAG 34852567 MARKED FREE
ALLOCATED FRAG 34852583 MARKED FREE
ALLOCATED FRAGS 34852594-34852599 MARKED FREE
ALLOCATED FRAGS 34852616-34852620 MARKED FREE
ALLOCATED FRAGS 34852757-34852758 MARKED FREE
ALLOCATED FRAGS 34852818-34852820 MARKED FREE
ALLOCATED FRAGS 34852824-34852827 MARKED FREE
ALLOCATED FRAG 34852906 MARKED FREE
ALLOCATED FRAGS 34852925-34852927 MARKED FREE
ALLOCATED FRAGS 34853136-34853140 MARKED FREE
ALLOCATED FRAGS 34853144-34853148 MARKED FREE
ALLOCATED FRAGS 34853152-34853156 MARKED FREE
ALLOCATED FRAGS 34853160-34853164 MARKED FREE
ALLOCATED FRAGS 34853168-34853172 MARKED FREE
ALLOCATED FRAGS 34853245-34853246 MARKED FREE
ALLOCATED FRAGS 34853280-34853284 MARKED FREE
ALLOCATED FRAGS 34853288-34853292 MARKED FREE
ALLOCATED FRAGS 34853304-34853308 MARKED FREE
ALLOCATED FRAGS 34853352-34853356 MARKED FREE
ALLOCATED FRAGS 34853365-34853366 MARKED FREE
ALLOCATED FRAGS 34853368-34853372 MARKED FREE
ALLOCATED FRAGS 34853400-34853404 MARKED FREE
ALLOCATED FRAGS 34853490-34853494 MARKED FREE
ALLOCATED FRAGS 34853496-34853500 MARKED FREE
ALLOCATED FRAGS 34853536-34853545 MARKED FREE
ALLOCATED FRAGS 34853568-34853572 MARKED FREE
ALLOCATED FRAGS 34853868-34853870 MARKED FREE
ALLOCATED FRAGS 34853949-34853951 MARKED FREE
ALLOCATED FRAGS 34854074-34854075 MARKED FREE
ALLOCATED FRAGS 34854934-34854935 MARKED FREE
ALLOCATED FRAGS 34855504-34855508 MARKED FREE
ALLOCATED FRAGS 34855776-34855777 MARKED FREE
ALLOCATED FRAGS 34855920-34855924 MARKED FREE
ALLOCATED FRAGS 34856856-34856857 MARKED FREE
ALLOCATED FRAGS 34857067-34857068 MARKED FREE
ALLOCATED FRAGS 34871843-34871847 MARKED FREE
ALLOCATED FRAGS 34879373-34879374 MARKED FREE
ALLOCATED FRAGS 37584536-37584551 MARKED FREE
ALLOCATED FRAGS 37601008-37601014 MARKED FREE
471717 files, 47373681 used, 38091807 free (33239 frags, 4757321 blocks, 0.0% fragmentation)
[root]@[beer]:/usr/local/etc/rc.d>
--Boundary-00=_NlBaBLZGX4OeKse--


