Date: Mon, 05 Feb 1996 22:12:41 +1000
From: Stephen McKay <syssgm@devetir.qld.gov.au>
To: freebsd-scsi@freebsd.org
Cc: syssgm@devetir.qld.gov.au
Subject: aha1542 MBO problem in 2.0.5
Message-ID: <199602051212.WAA04227@orion.devetir.qld.gov.au>
Since I revamped my machine (16->24MB RAM, DX33->DX4/100, +CD-ROM), I have had
various SCSI problems. I still run 2.0.5 (because I'm using the PC too much
to upgrade yet), and use a BT545S SCSI card to run the disk, tape and CD-ROM.
With the bt driver I either can't access my Archive 2525 at all, or I get
crashes and reboots (bounce buffer problem, maybe). So, I use the aha driver.
Unfortunately, I get lots of messages when accessing the disk and the CDROM
simultaneously, like:
    aha0: MBO 01 and not 00 (free)
    sd0(aha0:0:0): timed out
The timeouts are not necessarily paired with complaints about MBO.
Now, I have a wild theory about the "MBO not free" problem. :-)
Outgoing mailboxes are paired up with ccb's pretty early on in the aha driver.
Thereafter, ccb's are allocated, and mailboxes just come with them. The most
recently freed ccb is the next to be allocated, so when the system is busy,
it is highly likely that a ccb will be reused immediately. This implies that
its outgoing mailbox will be reused just as quickly. The manual with my BT545S
proudly proclaims its multitasking nature, so perhaps when it gets really
busy it postpones marking the mailbox as free. That would be a tempting
shortcut, since mailboxes are supposed to be used in a round robin manner
and there are bound to be a few still free.
So, the scenario I am postulating is:
    host  - allocate and set up ccb
    host  - mark mailbox as active
    bt545 - read mailbox
    bt545 - read ccb, do the work, mark ccb done
    bt545 - interrupt host
    bt545 - (become really busy and defer updating mailbox)
    host  - reallocate same ccb
    host  - complain about mailbox still marked busy
    host  - set up ccb
    host  - mark mailbox as active
    bt545 - (finish being busy)
    bt545 - mark mailbox as free
    host  - timeout (because bt545 ignored the mailbox)
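
To make the postulated sequence concrete, here is a toy model of it in C. It
is only a sketch of the theory, not driver code: every name in it (sim_mbo,
sim_host_post and so on) is made up for illustration, and the two state
values are chosen merely to match the "MBO 01 and not 00 (free)" message.

```c
#include <stdbool.h>

/* Hypothetical mailbox states; 0x00/0x01 mirror the logged values. */
enum sim_state { SIM_MBO_FREE = 0x00, SIM_MBO_START = 0x01 };

struct sim_mbo {
    enum sim_state cmd;       /* what the host and adapter both see */
    bool release_pending;     /* adapter has read it but not yet freed it */
};

/* Host: write a command without honouring the busy state (old behaviour). */
static void sim_host_post(struct sim_mbo *m)
{
    m->cmd = SIM_MBO_START;
}

/* Adapter: read the mailbox and do the work, but put off marking it free. */
static void sim_adapter_read_deferred(struct sim_mbo *m)
{
    m->release_pending = true;
}

/* Adapter: finally get around to marking the mailbox free. */
static void sim_adapter_late_release(struct sim_mbo *m)
{
    if (m->release_pending) {
        m->cmd = SIM_MBO_FREE;
        m->release_pending = false;
    }
}

/* Replay the trace; returns true if the second command gets wiped out. */
static bool sim_second_command_lost(void)
{
    struct sim_mbo m = { SIM_MBO_FREE, false };
    bool complained;

    sim_host_post(&m);                       /* first command */
    sim_adapter_read_deferred(&m);           /* work done, release deferred */
    complained = (m.cmd != SIM_MBO_FREE);    /* "MBO 01 and not 00 (free)" */
    sim_host_post(&m);                       /* host carries on regardless */
    sim_adapter_late_release(&m);            /* late release erases command 2 */
    return complained && m.cmd == SIM_MBO_FREE;
}
```

In this model the host sees the mailbox busy, posts anyway, and the late
release leaves the mailbox marked free with the second command never seen,
which is where the timeout would come from.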
To combat this, I've changed the ccb allocation policy to reuse the oldest
rather than the newest free ccb, in the expectation that this would access
mailboxes almost round robin.
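
The difference between the two policies can be shown with a small
stand-alone sketch. This is not the driver code: the toy_* names, the pool
size of 4 and the mbx_index field are all assumptions made for the example.

```c
#include <stddef.h>

#define TOY_NCCB 4                      /* arbitrary pool size for the demo */

struct toy_ccb {
    struct toy_ccb *next;
    int mbx_index;                      /* mailbox permanently paired with this ccb */
};

struct toy_pool {
    struct toy_ccb *head;               /* next ccb to hand out */
    struct toy_ccb *tail;               /* last ccb on the free list */
    struct toy_ccb ccb[TOY_NCCB];
};

/* Old policy: push on the front, so the newest free ccb is reused first. */
static void toy_free_lifo(struct toy_pool *p, struct toy_ccb *c)
{
    c->next = p->head;
    if (p->head == NULL)
        p->tail = c;
    p->head = c;
}

/* New policy: append at the tail, so the oldest free ccb is reused first. */
static void toy_free_fifo(struct toy_pool *p, struct toy_ccb *c)
{
    c->next = NULL;
    if (p->head == NULL)
        p->head = c;
    else
        p->tail->next = c;
    p->tail = c;
}

/* Take the ccb at the head of the free list, or NULL if there is none. */
static struct toy_ccb *toy_get(struct toy_pool *p)
{
    struct toy_ccb *c = p->head;

    if (c != NULL) {
        p->head = c->next;
        if (p->head == NULL)
            p->tail = NULL;
    }
    return c;
}

/* Allocate, release with the given policy, allocate again. */
static void toy_reuse(void (*release)(struct toy_pool *, struct toy_ccb *),
                      int *first, int *second)
{
    struct toy_pool p = { NULL, NULL, { { NULL, 0 } } };
    struct toy_ccb *c;
    int i;

    for (i = 0; i < TOY_NCCB; i++) {    /* free list starts as 0,1,2,3 */
        p.ccb[i].mbx_index = i;
        toy_free_fifo(&p, &p.ccb[i]);
    }
    c = toy_get(&p);
    *first = c->mbx_index;
    release(&p, c);
    *second = toy_get(&p)->mbx_index;
}
```

Calling toy_reuse() with toy_free_lifo hands back the same mailbox index
twice in a row, while toy_free_fifo moves on to the next one, which is the
"almost round robin" effect hoped for here.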
I applied the patch given below, and thrashed the disk, tape and cdrom
simultaneously (doing tar's and wc of big files in a loop) without any
failures or errors logged. I reverted to the previous kernel and "MBO not
free" errors turned up almost immediately. Then I got a couple of "pid 301:
sh: uid 0: exited on signal 11" type messages and hurriedly terminated the
experiment. I'm back on the patched kernel and abusing it as I type.
So, it appears that treating one's outgoing mailboxes in the official
round robin manner is not optional. I intend to add proper round robin
code myself soon, but I'm realistic enough about my erratic spare time
to invite others to beat me to it.
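
For anyone tempted, the sort of thing I mean by "proper round robin" might
look roughly like the sketch below: mailboxes are chosen by a rotating index
instead of being tied to a ccb. This is an illustration only; rr_ring,
rr_post, the ccb_id field and the ring size of 4 are invented names and
values, not anything from aha1542.c.

```c
#define RR_MBX_SIZE 4                   /* arbitrary ring size for the demo */

enum { RR_MBO_FREE = 0x00, RR_MBO_START = 0x01 };

struct rr_mbo {
    unsigned char cmd;                  /* RR_MBO_FREE or RR_MBO_START */
    int ccb_id;                         /* stand-in for the ccb address */
};

struct rr_ring {
    struct rr_mbo mbo[RR_MBX_SIZE];
    int next;                           /* mailbox to use for the next command */
};

/*
 * Post a command in the next mailbox of the rotation.
 * Returns the mailbox index used, or -1 if that mailbox is still busy,
 * in which case the caller should try again later.
 */
static int rr_post(struct rr_ring *r, int ccb_id)
{
    struct rr_mbo *m = &r->mbo[r->next];
    int used = r->next;

    if (m->cmd != RR_MBO_FREE)
        return -1;
    m->ccb_id = ccb_id;                 /* fill in the payload first */
    m->cmd = RR_MBO_START;              /* then mark the mailbox active */
    r->next = (r->next + 1) % RR_MBX_SIZE;
    return used;
}
```

With this arrangement a mailbox is only revisited after every other mailbox
has been used, no matter which ccb happens to be allocated.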
Anyway, here's my patch against 2.0.5 (but -current doesn't LOOK much
different):
Patch relative to "aha1542.c,v 1.45 1995/05/30 08:01:05 rgrimes Exp"
--- aha1542.c Tue May 30 18:01:05 1995
+++ aha1542.sgm.c Sun Feb 4 21:26:02 1996
@@ -302,6 +302,7 @@
         long int kv_phys_xor;
         struct aha_mbx aha_mbx;         /* all the mailboxes */
         struct aha_ccb *aha_ccb_free;   /* the next free ccb */
+        struct aha_ccb *aha_ccb_tail;   /* end of the free ccb list */
         struct aha_ccb aha_ccb[AHA_MBX_SIZE];   /* all the CCBs */
         int aha_int;                    /* irq level */
         int aha_dma;                    /* DMA req channel */
@@ -782,14 +783,20 @@
         if (!(flags & SCSI_NOMASK))
                 opri = splbio();
-        ccb->next = aha->aha_ccb_free;
-        aha->aha_ccb_free = ccb;
         ccb->flags = CCB_FREE;
+
+        ccb->next = NULL;
+        if (aha->aha_ccb_free == NULL)
+                aha->aha_ccb_free = ccb;
+        else
+                aha->aha_ccb_tail->next = ccb;
+        aha->aha_ccb_tail = ccb;
+
         /*
          * If there were none, wake anybody waiting for
          * one to come free, starting with queued entries
          */
-        if (!ccb->next) {
+        if (aha->aha_ccb_free == aha->aha_ccb_tail) {
                 wakeup((caddr_t)&aha->aha_ccb_free);
         }
         if (!(flags & SCSI_NOMASK))
@@ -819,6 +826,8 @@
         }
         if (rc) {
                 aha->aha_ccb_free = aha->aha_ccb_free->next;
+                if (aha->aha_ccb_free == NULL)
+                        aha->aha_ccb_tail = NULL;  /* Unnecessary, but neat. */
                 rc->flags = CCB_ACTIVE;
         }
         if (!(flags & SCSI_NOMASK))
@@ -1214,6 +1223,7 @@
          * into a free-list
          * this is a kludge but it works
          */
+        aha->aha_ccb_tail = &aha->aha_ccb[0];
         for (i = 0; i < AHA_MBX_SIZE; i++) {
                 aha->aha_ccb[i].next = aha->aha_ccb_free;
                 aha->aha_ccb_free = &aha->aha_ccb[i];
@@ -1354,9 +1364,13 @@
                 xs->error = XS_DRIVER_STUFFUP;
                 return (TRY_AGAIN_LATER);
         }
-        if (ccb->mbx->cmd != AHA_MBO_FREE)
+        if (ccb->mbx->cmd != AHA_MBO_FREE) {
                 printf("aha%d: MBO %02x and not %02x (free)\n",
-                    unit, ccb->mbx->cmd, AHA_MBO_FREE);
+                        unit, ccb->mbx->cmd, AHA_MBO_FREE);
+                aha_free_ccb(unit, ccb, flags);
+                xs->error = XS_DRIVER_STUFFUP;
+                return (TRY_AGAIN_LATER);
+        }
         /*
          * Put all the arguments for the xfer in the ccb
Stephen McKay.
