From owner-freebsd-questions@FreeBSD.ORG Tue Mar 22 06:14:15 2005
From: "Ted Mittelstaedt"
To: "RacerX" ,
Date: Mon, 21 Mar 2005 22:14:07 -0800
In-Reply-To: <20050321095647.R83831@makeworld.com>
Subject: RE: Anthony's drive issues.Re: ssh password delay

owner-freebsd-questions@freebsd.org wrote:
> Anthony -
>
> I'm curious - with the issues you are having with the drives (SCSI
> I think you mentioned) have you considered these ideas?
>
> 1. Upgrade the system BIOS
> 2. Upgrade the firmware in the SCSI controller
> 3. Upgrade the firmware in the array (if applicable)
>
> There may be a bug-a-boo in one of those. If you have not - consider
> doing so and see if this "may" correct your issues.
>

Racer,

Anthony has an on-motherboard Adaptec chip in an 8-year-old Vectra. It does not use firmware.
It might be possible that he has a flashable motherboard BIOS, but that BIOS isn't going to have microcode for the Adaptec controller in it. (And in any case, if he's never flashed his BIOS I would -strongly- recommend he not do it now, since his eeprom has probably had the existing BIOS code burned into it for so long without an update.)

He is stuck with the ROM that was burned into the Adaptec controller by the manufacturer. And I wouldn't put it past HP to have tampered with the Adaptec microcode anyway. Compaq definitely did with the Adaptec controllers they put into their machines that were made during the same era.

I also checked his disk drives and neither of them has upgradable firmware. He does not have an array controller.

As I've told him in the past, he has 2 disks on his SCSI chain: one of them is a Seagate that syncs up at 10Mbt to the controller, the other is a newer Quantum that syncs up at 20Mbt. I have told him to go into his Vectra BIOS and limit the sync negotiation on both disk drives to the same speed - 10Mbt. He refuses to try doing this.

I've also told him to remove the Quantum and try running a FreeBSD system off the Seagate, to see if it errors with just the single Seagate drive on the bus. He refuses to do that either.

Others have told him to check termination. It is possible one or both drives are pinned for termination, and since his chassis provides termination that would be an error. It is also possible that one or both drives aren't pinned to supply terminator power to the bus, which would be a problem as well. He has dismissed all of these without checking, claiming his termination is fine.

The basic problem is that Anthony has an error that is non-damaging to his data - every once in a while the machine spews a bunch of SCSI errors, resets the bus and everything on it, things slow down for a moment, then life continues. He has, by his own admission, not lost data - yet.

So the summary of it is that IMHO he LIKES things the way they are - it's been happening long enough that he's not afraid of losing data anymore, yet it gives him an error he can wave around every time he wants to knock FreeBSD's drivers.

He isn't really interested in finding the root of the problem or isolating it to a controller, a disk, or a software driver issue. Instead he thinks that the SCSI driver author can just wave a wand, look at a non-debug output of the error messages, and magically know exactly what workaround to stick in the driver to make the error messages go away. It is rather amusing or pathetic now, depending on your POV.

For all we know the SCSI device driver under Windows NT ran into the exact same error - and simply did the bus reset silently, without informing the user. That would be completely in character with how Microsoft approaches things (i.e.: if it doesn't kill the system, the user doesn't need to know about it).

As I have told him before, the only way to find the error is to install a SCSI analyzer onto the SCSI bus, and only Adaptec and the disk drive manufacturers have such a tool. If one did that, one would almost certainly find that it is some kind of low-level timing or SCSI command set implementation issue that would need a correction in either the Adaptec controller microcode or the microcode of one of the disk drives - and you could identify which disk it was a lot more simply and quickly by just doing the troubleshooting suggestions that have already been given to him. Besides which, a half hour of time on such a tool would probably cost more than the price of a brand new server.

Ted