FreeBSD Mail Archives

Date:      Fri, 21 Apr 2000 00:32:05 -0700
From:      Alfred Perlstein <bright@wintelcom.net>
To:        Mike Smith <msmith@FreeBSD.ORG>
Cc:        stable@FreeBSD.ORG
Subject:   Re: amr still seems to have issues.
Message-ID:  <20000421003205.A25458@fw.wintelcom.net>
In-Reply-To: <200004201817.LAA00917@mass.cdrom.com>; from msmith@FreeBSD.ORG on Thu, Apr 20, 2000 at 11:17:21AM -0700
References:  <20000420085841.G1838@fw.wintelcom.net> <200004201817.LAA00917@mass.cdrom.com>

* Mike Smith <msmith@FreeBSD.ORG> [000420 11:39] wrote:
> > Hi, we're running 4.0-stable as of Sat Apr 15 18:39:08 PDT 2000,
> > which includes the recent amr fixes that we were hoping would cure
> > the lockups with amr.  Unfortunately we are now experiencing reboots;
> > the messages file reveals this:
> >
> > Apr 15 13:31:06 abacus /kernel: amr0: command 31 wedged after 30 seconds
> 
> This is extra-bad.  Without more feedback from the controller (no 
> documentation from AMI yet, sorry. 8() I can only wonder whether you're 
> getting a SCSI bus error of some sort that's causing the kernel to time 
> these commands out (because the controller is taking too long to respond).
> 
> You could try increasing the timeout allowance in amr_periodic(), or just 
> disable the poll entirely.  This won't help if the controller is really 
> dropping commands, though.
> 
> > Right now I'm attempting to log off a serial console to see what's
> > going on, however this box has been in production (and doing miserably)
> > for some time now so doing debugging is pretty difficult as well as
> > time consuming where I really need to be working on other issues.
> 
> At this point, I have no other ideas, sorry.

Here's something that I hope helps:

amr0: command 40 wedged after 30 seconds
biodone: page busy < 0, pindex: 144, foff: 0x(0,90000), resid: 4096, index: 0
 iosize: 8192, lblkno: 72, flags: 0x30020aa0, npages: 2
 valid: 0xff, dirty: 0x0, wired: 1
panic: biodone: page busy < 0

mp_lock = 01000001; cpuid = 1; lapic.id = 00000000
boot() called on cpu#1

syncing disks... 

Fatal trap 12: page fault while in kernel mode
mp_lock = 01000002; cpuid = 1; lapic.id = 00000000
fault virtual address   = 0x30
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc0226765
stack pointer           = 0x10:0xff80dd9c
frame pointer           = 0x10:0xff80dda0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = Idle
interrupt mask          = bio  <- SMP: XXX
trap number             = 12
panic: page fault
mp_lock = 01000002; cpuid = 1; lapic.id = 00000000
boot() called on cpu#1
Uptime: 2d4h11m39s
amrd0: still open, can't shutdown

dumping to dev #da/0x20001, offset 128
dump 1023 1022 Aborting dump due to I/O error.
(da0:ahc1:0:6:0): WRITE(06). CDB: a 7 da f7 8 0 
(da0:ahc1:0:6:0): error code 0 at block no. -964632618 (decimal)
failed, reason: i/o error
Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting...

Any ideas?

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20000421003205.A25458>
