FreeBSD Mail Archives

Date:      Wed, 12 Mar 2014 05:07:28 -0700 (PDT)
From:      Anton Shterenlikht <mexas@bris.ac.uk>
To:        be@0x20.net, mexas@bris.ac.uk
Cc:        freebsd-current@freebsd.org
Subject:   Re: reproducible panic every day at 03:02, probably triggered by daily periodic scipts - help
Message-ID:  <201403121207.s2CC7QJw076837@mech-cluster241.men.bris.ac.uk>
In-Reply-To: <20140306215209.GA74933@e-new.0x20.net>


>From be@0x20.net Thu Mar  6 22:02:56 2014
>
>On Thu, Mar 06, 2014 at 12:59:14AM -0800, Anton Shterenlikht wrote:
>> In my initial PR (sparc64 r261798),
>> 
>>  http://www.freebsd.org/cgi/query-pr.cgi?pr=187080
>> 
>> I said that rsync was triggering this panic.
>> While true, I now see that there's more to it.
>> I disabled the rsync, and the cron jobs.
>> Still I get exactly the same panic every
>> night at 03:02:
>> 
>> # grep Dumptime /var/crash/*
>> /var/crash/info.0:  Dumptime: Wed Feb 26 10:10:51 2014
>> /var/crash/info.1:  Dumptime: Thu Feb 27 03:02:14 2014
>> /var/crash/info.2:  Dumptime: Fri Feb 28 03:02:29 2014
>> /var/crash/info.3:  Dumptime: Sat Mar  1 03:02:25 2014
>> /var/crash/info.4:  Dumptime: Tue Mar  4 03:02:01 2014
>> /var/crash/info.5:  Dumptime: Wed Mar  5 03:02:05 2014
>> /var/crash/info.6:  Dumptime: Thu Mar  6 03:02:11 2014
>> /var/crash/info.last:  Dumptime: Thu Mar  6 03:02:11 2014
>> # 
>> 
>> This is likely triggered by one of
>> the daily periodic scripts,
>> about 1 min after the start:
>> 
>> # grep daily /etc/crontab
>> # Perform daily/weekly/monthly maintenance.
>> 1       3       *       *       *       root    periodic daily
>> #
>> 
>> but which one?
> 
>Some time ago I had a similar problem with 8.x. Setting
>
>vm.kmem_size="512M"
>vm.kmem_size_max="512M"
>
>in loader.conf helped. It's just a wild guess but might help.

This didn't make any difference.

However, I noticed that the panics were
happening more and more often. I started
suspecting a disk failure, so I decided to
do a full read check with dd, and got:

17054+0 records in
17054+0 records out
17882415104 bytes transferred in 1035.889679 secs (17262857 bytes/sec)
(ada1:ata2:0:1:0): READ_DMA. ACB: c8 00 80 93 9c 42 00 00 00 00 00 00
(ada1:ata2:0:1:0): CAM status: ATA Status Error
(ada1:ata2:0:1:0): ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )
(ada1:ata2:0:1:0): RES: 51 40 21 94 9c 02 02 00 00 00 00
(ada1:ata2:0:1:0): Retrying command
(ada1:ata2:0:1:0): READ_DMA. ACB: c8 00 80 93 9c 42 00 00 00 00 00 00
(ada1:ata2:0:1:0): CAM status: ATA Status Error
(ada1:ata2:0:1:0): ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )
(ada1:ata2:0:1:0): RES: 51 40 21 94 9c 02 02 00 00 00 00
(ada1:ata2:0:1:0): Retrying command
(ada1:ata2:0:1:0): READ_DMA. ACB: c8 00 80 93 9c 42 00 00 00 00 00 00
(ada1:ata2:0:1:0): CAM status: ATA Status Error
(ada1:ata2:0:1:0): ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )
(ada1:ata2:0:1:0): RES: 51 40 21 94 9c 02 02 00 00 00 00
(ada1:ata2:0:1:0): Retrying command
(ada1:ata2:0:1:0): READ_DMA. ACB: c8 00 80 93 9c 42 00 00 00 00 00 00
(ada1:ata2:0:1:0): CAM status: ATA Status Error
(ada1:ata2:0:1:0): ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )
(ada1:ata2:0:1:0): RES: 51 40 21 94 9c 02 02 00 00 00 00
(ada1:ata2:0:1:0): Retrying command
(ada1:ata2:0:1:0): READ_DMA. ACB: c8 00 80 93 9c 42 00 00 00 00 00 00
(ada1:ata2:0:1:0): CAM status: ATA Status Error
(ada1:ata2:0:1:0): ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )
(ada1:ata2:0:1:0): RES: 51 40 21 94 9c 02 02 00 00 00 00
(ada1:ata2:0:1:0): Error 5, Retries exhausted
dd: /dev/ada1b: Input/output error
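
The dd counters above are enough to work out roughly where on ada1b the unreadable data sits. The following is only a sketch of that arithmetic: the two numbers are copied from the dd output, while the 512-byte sector size is an assumption (the log does not state it) and the variable names are made up for illustration.

```shell
#!/bin/sh
# Figures copied from the dd output above.
records=17054
bytes=17882415104

# Block size implied by the counters. The "+0" in "17054+0" means there
# were no partial records, so every record was one full block.
bs=$((bytes / records))
echo "block size: $bs bytes"

# dd stopped on the next block, so the unreadable data lies somewhere
# in this byte range, relative to the start of /dev/ada1b.
last=$((bytes + bs - 1))
echo "failing range: $bytes to $last"

# Same starting offset in sectors, ASSUMING 512-byte sectors.
# This is relative to the partition, not to the whole disk.
sector=$((bytes / 512))
echo "first sector of failing block: $sector"
```

So the read failed a little over 17 GB into the partition, inside a single 1 MiB block. The offset is relative to ada1b, so it cannot be compared directly with the LBA bytes in the kernel messages without knowing where the partition starts.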

I guess the disk is fucked, right?

Given that it's about 10 years old,
this is not surprising.

Anton

Want to link to this message? Use this
URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201403121207.s2CC7QJw076837>