Date: Sun, 27 Oct 2002 18:00:42 +0000
From: Ian Dowse <iedowse@maths.tcd.ie>
To: freebsd-mobile@freebsd.org
Subject: Patch to fix/shorten "wi" freezes
Message-ID: <200210271800.aa11251@salmon.maths.tcd.ie>
The wi driver causes quite long system freezes both when the pccard
is removed, and also if the hardware becomes confused. I've found on
-current that sometimes the whole machine can become unresponsive
for a period of minutes with messages such as:

    wi0: timeout in wi_cmd 0x0002; event status 0x8080
    wi0: timeout in wi_cmd 0x0000; event status 0x8080
    wi0: wi_cmd: busy bit won't clear.
    wi0: wi_cmd: busy bit won't clear.
    wi0: init failed
    wi0: wi_cmd: busy bit won't clear.
    ... <repeated 20 times>
    wi0: wi_cmd: busy bit won't clear.
    wi0: failed to allocate 1594 bytes on NIC
    wi0: tx buffer allocation failed
    wi0: wi_cmd: busy bit won't clear.
    wi0: failed to allocate 1594 bytes on NIC

The "wi_cmd 0x0002" is WI_CMD_DISABLE from wi_stop(). Each of the
"busy bit won't clear" messages comes after a 5-second busy-loop
delay in wi_cmd(), so the above takes 2-3 minutes to complete.

This, BTW, is a Lucent silver card, probed as:

    wi0: <WaveLAN/IEEE> at port 0x100-0x13f irq 9 function 0 config 1 on pccard0
    wi0: 802.11 address: 00:02:2d:21:57:d3
    wi0: using Lucent Technologies, WaveLAN/IEEE
    wi0: Lucent Firmware: Station 6.16.01

The patch below does a few things:
- It adds a 20ms delay at the end of wi_init(), which seems to fix
  the above timeouts in wi_stop(), as it seems that calling wi_stop()
  too soon after wi_init() can cause these.
- The busy-bit loop timeout is reduced from 5 seconds to 500ms.
- When a status of 0xffff is returned or the "busy bit won't clear"
  error occurs, sc->wi_gone is set to 1, so that other operations
  will fail immediately instead of going back into the long busy
  loops. Since sc->wi_gone had been used as a sanity test in
  wi_generic_detach() to make sure devices are not detached twice,
  this has been changed to use the previously unused
  WI_FLAGS_ATTACHED flag. We also need to remove the wi_gone test
  in wi_stop(), since otherwise the untimeout() calls will be missed
  if wi_gone is set by something other than wi_generic_detach().
- The functions wi_cmd() and wi_seek() now test wi_gone, and return
  immediately if it is set.

For me this makes the card work much more reliably and it reduces
the length of any hangs to less than 1 second. I guess it will take
testing on other cards and configurations to see if this improves
things in general or causes problems with some combinations.

Ian

Index: if_wi.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sys/dev/wi/if_wi.c,v
retrieving revision 1.117
diff -u -r1.117 if_wi.c
--- if_wi.c	14 Oct 2002 01:59:57 -0000	1.117
+++ if_wi.c	27 Oct 2002 16:57:04 -0000
@@ -200,7 +200,7 @@
 	WI_LOCK(sc, s);
 	ifp = &sc->arpcom.ac_if;
 
-	if (sc->wi_gone) {
+	if ((sc->wi_flags & WI_FLAGS_ATTACHED) == 0) {
 		device_printf(dev, "already unloaded\n");
 		WI_UNLOCK(sc, s);
 		return(ENODEV);
@@ -214,6 +214,7 @@
 	ether_ifdetach(ifp, ETHER_BPF_SUPPORTED);
 	bus_teardown_intr(dev, sc->irq, sc->wi_intrhand);
 	wi_free(dev);
+	sc->wi_flags &= ~WI_FLAGS_ATTACHED;
 	sc->wi_gone = 1;
 
 	WI_UNLOCK(sc, s);
@@ -471,6 +472,7 @@
 	 */
 	ether_ifattach(ifp, ETHER_BPF_SUPPORTED);
 	callout_handle_init(&sc->wi_stat_ch);
+	sc->wi_flags |= WI_FLAGS_ATTACHED;
 	WI_UNLOCK(sc, s);
 
 	return(0);
@@ -1002,20 +1004,24 @@
 {
 	int			i, s = 0;
 	static volatile int count  = 0;
+
+	if (sc->wi_gone)
+		return (ENODEV);
 
 	if (count > 1)
 		panic("Hey partner, hold on there!");
 	count++;
 
 	/* wait for the busy bit to clear */
-	for (i = 500; i > 0; i--) {	/* 5s */
+	for (i = 500; i > 0; i--) {	/* 500ms */
 		if (!(CSR_READ_2(sc, WI_COMMAND) & WI_CMD_BUSY)) {
 			break;
 		}
-		DELAY(10*1000);	/* 10 m sec */
+		DELAY(1000);	/* 1ms */
 	}
 	if (i == 0) {
 		device_printf(sc->dev, "wi_cmd: busy bit won't clear.\n" );
+		sc->wi_gone = 1;
 		count--;
 		return(ETIMEDOUT);
 	}
@@ -1052,6 +1058,8 @@
 	if (i == WI_TIMEOUT) {
 		device_printf(sc->dev,
 		    "timeout in wi_cmd 0x%04x; event status 0x%04x\n", cmd, s);
+		if (s == 0xffff)
+			sc->wi_gone = 1;
 		return(ETIMEDOUT);
 	}
 	return(0);
@@ -1364,6 +1372,9 @@
 	int			selreg, offreg;
 	int			status;
 
+	if (sc->wi_gone)
+		return (ENODEV);
+
 	switch (chan) {
 	case WI_BAP0:
 		selreg = WI_SEL0;
@@ -1391,6 +1402,8 @@
 	if (i == WI_TIMEOUT) {
 		device_printf(sc->dev, "timeout in wi_seek to %x/%x; last status %x\n",
 			id, off, status);
+		if (status == 0xffff)
+			sc->wi_gone = 1;
 		return(ETIMEDOUT);
 	}
 
@@ -2196,6 +2209,13 @@
 	sc->wi_stat_ch = timeout(wi_inquire, sc, hz * 60);
 	WI_UNLOCK(sc, s);
 
+	/*
+	 * A 10ms or greater delay here seems to avoid a problem that
+	 * causes some Lucent orinoco cards to time out in wi_stop()
+	 * if called immediately after wi_init(). Use 20ms to be safe.
+	 */
+	DELAY(20000);
+
 	return;
 }
 
@@ -2471,11 +2491,11 @@
 	int			s;
 
 	WI_LOCK(sc, s);
-
-	if (sc->wi_gone) {
-		WI_UNLOCK(sc, s);
-		return;
-	}
+	/*
+	 * Ignore wi_gone here, as we still need to do the untimeout calls.
+	 * Currently everything here should be safe to do even if the
+	 * hardware is gone.
+	 */
 
 	wihap_shutdown(sc);
 

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-mobile" in the body of the message
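
For anyone who wants to poke at the fail-fast idea outside the kernel,
here is a minimal userland sketch of the pattern the wi_cmd() hunk
follows: a bounded busy-bit poll that marks the device as gone on
timeout, so that later calls return at once. Everything named fake_ or
FAKE_ is hypothetical and is not part of if_wi.c; the register is just a
struct field, the busy-bit value is illustrative, and the 1ms delay is
only counted rather than actually slept.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FAKE_CMD_BUSY	0x8000	/* illustrative busy bit, in the role of WI_CMD_BUSY */
#define FAKE_BUSY_POLLS	500	/* 500 polls of 1ms each, i.e. about 500ms */

/* Hypothetical stand-in for the driver softc; only what the sketch needs. */
struct fake_softc {
	bool		gone;		/* plays the role of sc->wi_gone */
	uint16_t	cmd_reg;	/* simulated command register */
	unsigned	delay_calls;	/* how many 1ms delays were requested */
};

/* Stand-in for DELAY(1000): count the call instead of spinning. */
static void
fake_delay_1ms(struct fake_softc *sc)
{
	sc->delay_calls++;
}

/*
 * Bounded busy-bit wait with fail-fast behaviour:
 *   - returns ENODEV immediately if the device was already marked gone,
 *   - returns ETIMEDOUT and marks the device gone if the busy bit
 *     never clears within FAKE_BUSY_POLLS polls,
 *   - returns 0 once the busy bit is seen clear.
 */
int
fake_cmd(struct fake_softc *sc)
{
	int i;

	assert(sc != NULL);

	if (sc->gone)
		return (ENODEV);

	for (i = FAKE_BUSY_POLLS; i > 0; i--) {
		if (!(sc->cmd_reg & FAKE_CMD_BUSY))
			break;
		fake_delay_1ms(sc);
	}
	if (i == 0) {
		sc->gone = true;	/* later calls skip the poll loop */
		return (ETIMEDOUT);
	}
	return (0);
}
```

With the busy bit stuck, the first fake_cmd() call requests 500 delays,
returns ETIMEDOUT and sets gone; a second call returns ENODEV without
polling at all. That is the behaviour that turns a long chain of
timeouts into a single one.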