From owner-freebsd-current Thu Mar 23 14: 9:59 2000
Delivered-To: freebsd-current@freebsd.org
Received: from mail.kpnqwest.ch (mail.eunet.ch [146.228.10.7])
	by hub.freebsd.org (Postfix) with ESMTP id 859EF37C701
	for ; Thu, 23 Mar 2000 14:09:52 -0800 (PST)
	(envelope-from mw@kpnqwest.ch)
Received: (from mw@localhost)
	by mail.kpnqwest.ch (8.9.3/1.34) id WAA22966
	for freebsd-current@freebsd.org; Thu, 23 Mar 2000 22:09:49 GMT
	env-from (mw@kpnqwest.ch)
From: mw@kpnqwest.ch
Message-Id: <200003232209.WAA22966@mail.kpnqwest.ch>
Subject: Re: AMI MegaRAID lockup? not accepting commands.
In-Reply-To: <200003232150.NAA02123@mass.cdrom.com> from Mike Smith at "Mar 23, 2000 01:50:28 pm"
To: Mike Smith
Date: Thu, 23 Mar 2000 23:09:03 +0100 (CET)
X-Mailer: ELM [version 2.4ME+ PL72 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> Can you try instead the changes that I just committed to -current?  I
> think that the problem shows up when the controller is heavily loaded;
> your patch will keep the load on the controller down, which may mask the
> 'real' bug.

I tried your approach (that was what I described as "fiddling with
DELAY"). I went even further and stripped that loop down, but it didn't
help. This is what I currently still have in there from these experiments:

    /* from linux: The "volatile" is due to gcc bugs */
    #define barrier() __asm__ __volatile__("": : :"memory")

    for (i = 10000, done = 0, worked = 0; (i > 0) && !done; i--) {
        s = splbio();

        /* is the mailbox free? */
        if (sc->amr_mailbox->mb_busy == 0) {
            debug("got mailbox");
            sc->amr_mailbox64->mb64_segment = 0;
            bcopy(&ac->ac_mailbox, sc->amr_mailbox, AMR_MBOX_CMDSIZE);
            sc->amr_submit_command(sc);
            done = 1;
            sc->amr_workcount++;
            TAILQ_INSERT_TAIL(&sc->amr_work, ac, ac_link);

        /* not free, try to clean up while we wait */
        } else {
            debug("busy flag %x\n", sc->amr_mailbox->mb_busy);
            /* don't do this in here for now, it involves talking to the
             * controller to see whether there's work done, and since we
             * just saw that the controller is somewhat busy, that's perhaps
             * not such a good idea? */
            /* worked += amr_done(sc); */
        }
        splx(s);
        DELAY(100);
        barrier();
    }

    /* check here for work to be done */
    s = splbio();
    worked += amr_done(sc);
    splx(s);

This did *NOT* stop the controller from crashing. Ignore the comment
above; I'll put this amr_done call back in the loop, but I just wanted
to be REALLY sure this loop wasn't the cause of the crash.

Markus
-- 
KPNQwest Switzerland Ltd
P.O. Box 9470, Zweierstrasse 35, CH-8036 Zuerich
Tel: +41-1-298-6030, Fax: +41-1-291-4642
Markus Wild, Manager Engineering, e-mail: markus.wild@kpnqwest.ch