Date:      Sun, 13 Aug 2000 13:59:43 -0700
From:      Joe Modjeski <jmodjeski@ms1.northlink.com>
To:        "'Bernd Walter '" <ticso@mail.cicely.de>
Cc:        "''freebsd-scsi@freebsd.org' '" <freebsd-scsi@freebsd.org>
Subject:   RE: to Vinum or not to Vinum
Message-ID:  <00101B7A7FDDD311A89500A0CC56C79048BE@MS1>


-----Original Message-----
From: Bernd Walter
To: Joe Modjeski
Cc: 'Bernd Walter'; 'freebsd-scsi@freebsd.org'
Sent: 8/12/00 3:20 PM
Subject: Re: to Vinum or not to Vinum

On Sat, Aug 12, 2000 at 01:11:39PM -0700, Joe Modjeski wrote:
>  
> > On Thu, Aug 10, 2000 at 12:09:47PM -0700, Joe Modjeski wrote:
> > > Currently we have 3 Compaq Proliant 1600R servers with 6 9.1 Ultra3
> > > drives in each.  We are attempting (very unsuccessfully) to do Raid5
> > > with vinum.  We get fatal trap 12 errors very regularly and after a
> > > few reboots the vinum volume is so chewed up that we end up having to
> > > rebuild the system.  I tracked down the majority of the problems to
> > > the /etc/security script.  I believe it is about the 6th or 7th line
> > > down where it starts the find run.  The box starts off fine but after
> > > about 1 minute it starts to hit all the drives at once then BLAM!! It
> > > gives me the error.
> > 
> > Are your fatal trap 12 errors kernel panics?
> > If yes do you see some SCSI error messages directly before 
> > this happens?
> 
> Yes they are kernel panics.  And Yes there are always SCSI errors.
> 
> BAD DSA ( SOME_HEX_NUMBER ) in queue
> SCSI BUS RESET DETECTED sym0:0:-1:-1
> 
> The above isn't exact.  The message conveniently misses the logs.  I can
> get the exact messages if you would like.  I am trying to avoid crashing
> the box as much as possible. :)

The exact error, including the hex codes, is important to distinguish
between a bus error and something in the code.

> The drives are Hotswap and it does appear that they get "Disconnected"
> when the error happens.  It is however not specific to one drive.  In my
> original vinum setup I was spanning the raid across all 6 drives.  Then
> it was consistent with drive 0.  I thought that was the reason for the
> trouble so I changed the configuration to the one included in the
> previous message.
> 
> I have compiled a debug kernel in an effort to get a dump and now the
> fatal trap 12 kernel panics are less frequent, but the SCSI errors that
> go along with them are more consistent.

You mean you get SCSI errors sometimes without panics directly after?
Are you still using the sym controller or is that behaviour with the ahc
card you mentioned?

-- 
B.Walter              COSMO-Project         http://www.cosmo-project.de
ticso@cicely.de         Usergroup           info@cosmo-project.de

Yes, that is correct.  I get the SCSI errors without a panic directly
after.  This is actually where the activity on the box gets strange.
Sometimes it will repeat the SCSI BUS RESET DETECTED error over and over on
the console.  If I log in from a remote session (telnet or ssh) it will
print the header and login prompt, but when I try to log in the box starts
trying to reboot.  The console prints "syncing disks..." over and over
until I hard reboot.

The other scenario is that I get the SCSI BUS RESET DETECTED error once and
I am able to establish a remote connection to the box.  Everything seems
fine except for a zombie "find" process, which irks the heck out of me, so
I reboot the box.

In either scenario the console session hangs and I am unable to switch
vtys or use the keyboard.

I will get the exact messages for you tomorrow if everything goes well.  I
have a newly installed FreeBSD proxy that has been acting up, and the
resort we installed it in is having a conference this week.  The only thing
that has comforted the execs at the resort has been my pager.

Joe





