Date: Sun, 27 Feb 2005 01:12:53 -0600
From: Dan Nelson <dnelson@allantgroup.com>
To: freebsd-questions@freebsd.org
Subject: Re: Constant mysterious SCSI errors
Message-ID: <20050227071253.GD8778@dan.emsphone.com>
In-Reply-To: <738952320.20050226202358@wanadoo.fr>
References: <738952320.20050226202358@wanadoo.fr>

In the last episode (Feb 26), Anthony Atkielski said:
> I get constant streams of messages concerning my disks on the console
> whenever I have a lot of disk activity on my system (2x SCSI disks,
> no IDE or other disks). I'd very much like to know what's going on
> (there's nothing wrong with the hardware, so either it's a
> configuration problem, or it's a bug).
>
> There doesn't seem to be any data loss or corruption occurring. I've
> had one or two panics, though (which may or may not have caused data
> loss--it's hard to tell).
>
> While recompiling the kernel, the system stalled periodically (at least
> anything involving disk I/O stalled) and generated several hundred
> kilobytes of messages looking like this:
>
> Feb 26 20:09:23 contactdish kernel: (da0:ahc0:0:0:0): Queue Full
> Feb 26 20:09:23 contactdish kernel: (da0:ahc0:0:0:0): tagged openings now 64
> Feb 26 20:09:23 contactdish kernel: (da0:ahc0:0:0:0): Retrying Command

Try lowering the max tags for that drive: "camcontrol tags da0 -N 32".
If that works, you can stick it in rc.local, or add an entry to the
xpt_quirk_table[] in /sys/cam/cam_xpt.c . It probably needs something
similar to the quantum quirk lines.

> In addition, I sometimes get bursts of much longer messages, looking
> something like this:
>
> Feb 25 20:09:29 contactdish kernel: ahc0: Recovery Initiated
> Feb 25 20:09:29 contactdish kernel: >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
> Feb 25 20:09:29 contactdish kernel: <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
> Feb 25 20:09:29 contactdish kernel: (da1:ahc0:0:2:0): SCB 0x49 - timed out
> Feb 25 20:09:29 contactdish kernel: sg[0] - Addr 0x1309b000 : Length 2048
> Feb 25 20:09:29 contactdish kernel: (da1:ahc0:0:2:0): Queuing a BDR SCB
> Feb 25 20:09:29 contactdish kernel: ahc0: Timedout SCBs already complete. Interrupts may not be functioning.

I never know what to look for in this output, but most of the time, I
think it's a cabling or termination problem.
Reseat all the plugs :)

-- 
	Dan Nelson
	dnelson@allantgroup.com
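
The max-tags suggestion above could be made persistent with an rc.local fragment along these lines. This is only a sketch: the variable names, the tag_cmd helper, and the dry-run fallback are illustrative assumptions rather than part of the original advice, and 32 is simply the value suggested in the reply, not a tuned number.

```shell
#!/bin/sh
# Sketch of an /etc/rc.local fragment; all names here are illustrative.
MAX_TAGS=32     # value suggested in the reply; adjust after testing
DISKS="da0"     # add other devices (e.g. da1) if they also log "Queue Full"

# Print the camcontrol invocation for one disk (useful for a dry run).
tag_cmd() {
    echo "camcontrol tags $1 -N $MAX_TAGS"
}

for disk in $DISKS; do
    if command -v camcontrol >/dev/null 2>&1; then
        # Cap the number of tagged openings for this device.
        camcontrol tags "$disk" -N "$MAX_TAGS"
    else
        # Not on a system with camcontrol: just show what would be run.
        echo "camcontrol not found; would run: $(tag_cmd "$disk")"
    fi
done
```

Running "camcontrol tags da0 -v" afterwards should show whether the new limit took effect.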