Date:      Fri, 30 Jul 2004 16:33:32 +0100
From:      Jason Thomson <jason.thomson@mintel.com>
To:        Vinod Kashyap <vkashyap@amcc.com>
Cc:        Paul Saab <ps@mu.org>
Subject:    Re: Reproducible FreeBSD 4.10-STABLE (Jul 7) ,  3ware 7506-4 lockup.
Message-ID:  <410A6A4C.4060008@mintel.com>
In-Reply-To: <I0YQI602.N07@hadar.amcc.com>
References:  <I0YQI602.N07@hadar.amcc.com>

Vinod Kashyap wrote:

 > After the system locks up, from the DDB prompt, do a
 > 'tr, 20'.  What does it say?
 >
 > Please check the drive compatibility list at:
 > http://www.3ware.com/products/pdf/Drive_compatibility_list.pdf
 >
 > If you suspect a problem with any of the 3ware components,
 > I strongly encourage you to contact 3ware support.
 >


Apologies for taking so long to reply.

I've finally got a serial console connected to this machine.

When the machine locks up (after the controller reports an error),
breaking into the debugger from the console just shows:

twe0: AEN: <twe0: port 3: sector repair occurred>

db> tr, 20
siointr1(c326b000,c04d1cc8,0,ffc08ff4,c039fd70) at siointr1+0xc1
siointr(c326b000) at siointr+0x17
Xfastintr4(0,ffc09000,0,0,ddaac000) at Xfastintr4+0x20
idle_loop() at idle_loop+0x44

Does this mean that the kernel itself is not locked up, and it's just
the disk controller / driver that is frozen?

I've included the process list at the bottom of this mail.  I'm stuck
for clues as to what else I should look at.  I can provide access to
the serial console on this machine from the internet if anyone is able
to help debug this.  Please reply by private mail.

(To recap,  I can reproduce the problem by dd'ing from the disk to
/dev/null - when it hits a bad sector on the disk,  no further twe I/O
takes place.  Contrary to a previous report,  it doesn't always seem to
hit a bad sector in the same place).
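
For anyone who wants to try the same kind of sequential read with a
progress trace, a rough Python sketch of the dd-style loop follows.  It
is only an illustration: the function name, the 64 KB block size and
the 64 MB progress interval are assumptions, not what was actually run
on this machine.

```python
import os
import sys

def read_until_error(path, block_size=64 * 1024,
                     progress_every=64 * 1024 * 1024):
    """Read path sequentially in block_size chunks.

    Returns (bytes_read, error); error is None if EOF was reached.
    Progress goes to stderr, so if the read never returns the last
    line printed shows roughly where I/O stopped.
    """
    offset = 0
    next_report = progress_every
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            try:
                chunk = os.read(fd, block_size)
            except OSError as exc:
                # The read failed outright; report how far we got.
                return offset, exc
            if not chunk:
                # Empty read means end of file / device.
                return offset, None
            offset += len(chunk)
            if offset >= next_report:
                print(f"{offset} bytes read", file=sys.stderr, flush=True)
                next_report += progress_every
    finally:
        os.close(fd)

# Example call (the device name is an assumption, substitute your own):
#   done, err = read_until_error("/dev/twed0")
```

If the controller stops responding the way it does here, the call would
be expected to block just like dd rather than return an error, so the
last progress line is the useful part.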


With respect to the drive compatibility list,  the drives we are using
are not on the list,  but drives from the same range are:  The drives we
have are 5A300J0 and 4A320J8 Maxtor drives - the Maxtor 4A300J0 is on
the list.

I don't suspect a problem with these specific 3ware components - we've
had the same problem occur on 3 different machines (all Dell 1600SCs
with 7506-4LP controllers).  I don't know if there is a design fault
with the 3ware hardware or the Maxtor disks that means they don't play
well together.  I would guess this is a fairly popular hardware
configuration - and I haven't read any problem reports about operating
systems other than FreeBSD.

BTW I did contact 3ware support, but heard nothing back - this may be
because I submitted too vague a problem report.  I will try again if
you think they might be able to help.



db> ps
   pid   proc     addr    uid  ppid  pgrp  flag stat wmesg   wchan   cmd
   229 dffbcc20 dfffc000    0   227   227 004004  3  getblk cfa1a03c atrun
   228 dffbcdc0 dffe0000    0   226   226 8000004  3  spread cfa161d4 sh
   227 dffbcf60 dffd8000    0   225   227 004084  3    wait dffbcf60 sh
   226 dffbd2a0 dffe7000    0   224   226 004084  3    wait dffbd2a0 sh
   225 dffbd100 dfff3000    0    92    92 000084  3  piperd dfebe3e0 cron
   224 dffbd780 dffaa000    0    92    92 000084  3  piperd dfebe700 cron
   218 dffbd440 dffdc000 1003   210   218 004106  3   inode c3503d00 systat
   210 dffbd5e0 dffc7000 1003   209   210 2004086  3   pause dffc7260 csh
   209 dffbde00 dffbe000 1003   207    94 000184  3  select c04bd588 sshd
   207 dffbd920 dffc2000    0    94    94 000184  3  sbwait ddac4268 sshd
   196 dffbdc60 dffca000    0   162   196 004086  3   ttyin c1ddb430 csh
   181 dffbdac0 dffcf000    0   155   181 004006  3  physstr cfa16088 dd
   171 dc059ea0 dffae000 1003   170   171 004086  3   ttyin c3506830 csh
   170 dc05a1e0 dff9c000 1003   159    94 000184  3  select c04bd588 sshd
   162 dc05a040 dffa1000 1003   161   162 2004086  3   pause dffa1260 csh
   161 dc05a520 dff7d000 1003   157    94 000184  3  select c04bd588 sshd
   159 dc05a380 dff96000    0    94    94 000184  3  sbwait ddac47a8 sshd
   157 dc05a6c0 dff88000    0    94    94 000184  3  sbwait ddac4348 sshd
   155 dc05cdc0 dfeb8000    0   151   155 2004086  3   pause dfeb8260 csh
   151 dc05a860 dff6b000    0     1   151 004186  3    wait dc05a860 login
   150 dc05aa00 dff67000    0     1   150 004086  3   ttyin c3571210 getty
   149 dc05aba0 dff63000    0     1   149 004086  3   ttyin c3571410 getty
   148 dc05ad40 dff5f000    0     1   148 004086  3   ttyin c3571610 getty
   147 dc05aee0 dff5b000    0     1   147 004086  3   ttyin c3571810 getty
   146 dc05b3c0 dff45000    0     1   146 004086  3   ttyin c3571a10 getty
   145 dc05b560 dff3a000    0     1   145 004086  3   ttyin c3571c10 getty
   144 dc05ba40 dff32000    0     1   144 004086  3   ttyin c356be10 getty
   143 dc05cf60 dfeb0000    0     1   143 004086  3   ttyin c318d110 getty
   140 dc05b080 dff50000    0     1   140 000085  3  select c04bd588 nmbd
   138 dc05b220 dff3f000    0     1   138 000085  3  select c04bd588 smbd
   132 dc05b8a0 dff36000    0   130    10 000086  3  nanslp c04a3910 3dmd
   131 dc05c740 dfef5000    0   130    10 000086  3  accept ddac2ff2 3dmd
   130 dc05bf20 dff19000    0     1    10 000086  3  nanslp c04a3910 3dmd
   129 dc05b700 dff2b000    0     1   129 000084  3  select c04bd588 rsync
   102 dc05bbe0 dff25000   25     1   102 2000184  3   pause dff25260 sendmail
    99 dc05bd80 dff21000    0     1    99 000184  3  select c04bd588 sendmail
    96 dc05c0c0 dff15000    0     1    96 000084  3  select c04bd588 usbd
    94 dc05c260 dff11000    0     1    94 000184  3  select c04bd588 sshd
    92 dc05c400 dff0b000    0     1    92 000084  3  nanslp c04a3910 cron
    90 dc05c5a0 dfef9000    0     1    90 000084  3  select c04bd588 inetd
    83 dc05c8e0 dfec9000    0     1    83 000084  3  select c04bd588 ntpd
    79 dc05ca80 dfec4000    0     1    79 000004  3  getblk cfa1ea28 syslogd
    31 dc05cc20 dfec0000    0     1    31 2000084  3   pause dfec0260 adjkerntz
     9 dc05d100 deb18000    0     0     0 000204  3  getblk cfa1a03c syncer
     8 dc05d2a0 deb15000    0     0     0 000204  3  vlruwt dc05d2a0 vnlru
     7 dc05d440 deb12000    0     0     0 000204  3  psleep c04a3ae4 bufdaemon
     6 dc05d5e0 deb0f000    0     0     0 000204  3  psleep c04b2c20 vmdaemon
     5 dc05d780 deb0c000    0     0     0 000204  3  psleep c047cdf8 pagedaemon
     4 dc05d920 dda8e000    0     0     0 000204  3  usbtsk c04c2778 usbtask
     3 dc05dac0 dda8b000    0     0     0 000204  3  usbevt c318f210 usb0
     2 dc05dc60 dda65000    0     0     0 000204  3   tqthr c04bd584 taskqueue
     1 dc05de00 dc062000    0     0     1 004284  3    wait dc05de00 init
     0 c04bc8a0 c0579000    0     0     0 000204  3   sched c04bc8a0 swapper
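
One way to narrow down a listing like the one above is to filter it by
wait message.  The Python sketch below does that.  The set of wait
messages treated as disk-related (getblk, physstr, spread, inode) is an
assumption based on this listing, not an authoritative list, and the
function name is hypothetical.

```python
# Wait messages assumed (not confirmed) to mean a sleep on disk/buffer I/O.
DISK_WMESGS = {"getblk", "physstr", "spread", "inode"}

def disk_waiters(ps_text, wmesgs=DISK_WMESGS):
    """Return (pid, wmesg, cmd) for each DDB 'ps' row whose wmesg matches."""
    rows = []
    for line in ps_text.splitlines():
        fields = line.split()
        # A data row starts with a numeric pid and has at least the
        # ten columns pid..wchan; cmd may have wrapped to the next line.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        wmesg = fields[8]
        if wmesg in wmesgs:
            cmd = fields[10] if len(fields) > 10 else ""
            rows.append((int(fields[0]), wmesg, cmd))
    return rows

# A few rows copied from the listing above, as sample input.
SAMPLE = """\
   pid   proc     addr    uid  ppid  pgrp  flag stat wmesg   wchan   cmd
   229 dffbcc20 dfffc000    0   227   227 004004  3  getblk cfa1a03c atrun
   209 dffbde00 dffbe000 1003   207    94 000184  3  select c04bd588 sshd
   181 dffbdac0 dffcf000    0   155   181 004006  3  physstr cfa16088 dd
"""

for pid, wmesg, cmd in disk_waiters(SAMPLE):
    print(pid, wmesg, cmd)
```

Run over the full listing, this filter picks out atrun (229), sh (228),
systat (218), dd (181), syslogd (79) and syncer (9); everything else is
in an ordinary sleep such as select, ttyin or wait.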


