Date: Wed, 7 Feb 2007 01:31:39 +1100 (EST) From: Bruce Evans <bde@zeta.org.au> To: Oleg Bulyzhin <oleg@FreeBSD.org> Cc: Robin Gruyters <r.gruyters@yirdis.nl>, freebsd-net@FreeBSD.org Subject: Re: [Fwd: Re: bge Ierr rate increase from 5.3R -> 6.1R] Message-ID: <20070207003539.I31484@besplex.bde.org> In-Reply-To: <20070205232800.GA45487@lath.rinet.ru> References: <20070125170532.c9c2374hkwk4oc4k@server.yirdis.net> <20070205232800.GA45487@lath.rinet.ru>
next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, 6 Feb 2007, Oleg Bulyzhin wrote: > Sorry for the late reply - i was AFK for some time and didnt read mails. > Patch against 6.2R attached, please let me know does it helps or not. I didn't test the old version of this since I have too many local changes in my non-6.x versions. I might test the 6.x version. I use jdp's quicker fix. It works fine for detecting cable unplug and replug, but link detection is still very bad at boot time and after down/up (seems to be worse for down/up than unplug/replug?). Link detection in -current generally seems to be much worse than in 5.2. On some systems I use two ping -c2's early in the boot to wait for the link to actually be up. The first ping tends to fail and the second tends to work, both long after the lonk claims to be up. Then other network activity still takes too long to start. Without the pings, an "ntpdate -b" early in the boot fails about half the time and gives messed up timing activity when it fails, and initial nfs mounts tak 30-60 seconds. Later after down/up and waiting for the "up" message, ttcp -u usually fails to connect the first time and then works normally with no failure or connection delay the second time. % Index: sys/dev/bge/if_bgereg.h % =================================================================== % RCS file: /home/ncvs/src/sys/dev/bge/if_bgereg.h,v % retrieving revision 1.36.2.8.2.1 % diff -u -r1.36.2.8.2.1 if_bgereg.h % --- sys/dev/bge/if_bgereg.h 21 Dec 2006 21:53:54 -0000 1.36.2.8.2.1 % +++ sys/dev/bge/if_bgereg.h 5 Feb 2007 23:23:22 -0000 % @@ -2460,8 +2460,13 @@ % uint32_t bge_tx_buf_ratio; % int bge_if_flags; % int bge_txcnt; % - int bge_link; /* link state */ % - int bge_link_evt; /* pending link event */ % + uint32_t bge_sts; % +#define BGE_STS_LINK 0x00000001 /* MAC link status */ % +#define BGE_STS_LINK_EVT 0x00000002 /* pending link event */ % +#define BGE_STS_AUTOPOLL 0x00000004 /* PHY auto-polling */ % +#define BGE_STS_BIT(sc, x) ((sc)->bge_sts & (x)) % +#define BGE_STS_SETBIT(sc, x) ((sc)->bge_sts |= (x)) % +#define BGE_STS_CLRBIT(sc, x) ((sc)->bge_sts &= ~(x)) Style bugs: - inconsistents tabs. bge mostly consitently doesn't follow KNF for the tab after #define. - I don't like obfuscating bit testing and setting using macros. The above macros are quite different from the BGE and PCI SETBIT and CLRBIT macros -- the latter operate on hardware bits and do something useful by hiding the details of the hardware accesses, and aren't assoicated with bit testing macros. Apart from the above, testing of hardware bits and testing an setting of software bits is always done using direct "&" and "|" operations in bge. % Index: sys/dev/bge/if_bge.c % =================================================================== % RCS file: /home/ncvs/src/sys/dev/bge/if_bge.c,v % retrieving revision 1.91.2.18.2.1 % diff -u -r1.91.2.18.2.1 if_bge.c % --- sys/dev/bge/if_bge.c 21 Dec 2006 21:53:54 -0000 1.91.2.18.2.1 % +++ sys/dev/bge/if_bge.c 5 Feb 2007 23:23:29 -0000 % @@ -2645,12 +2648,12 @@ % % /* Note link event. It will be processed by POLL_AND_CHECK_STATUS cmd */ % if (statusword & BGE_STATFLAG_LINKSTATE_CHANGED) % - sc->bge_link_evt++; % + BGE_STS_SETBIT(sc, BGE_STS_LINK_EVT); % % if (cmd == POLL_AND_CHECK_STATUS) % if ((sc->bge_asicrev == BGE_ASICREV_BCM5700 && % sc->bge_chipid != BGE_CHIPID_BCM5700_B2) || % - sc->bge_link_evt || sc->bge_tbi) % + BGE_STS_BIT(sc, BGE_STS_LINK_EVT) || sc->bge_tbi) % bge_link_upd(sc); I suspected this is the cause of slowness for polling in idle (see previous mail), but since it doesn't do any PCI access directly it should be fast enough. The macros make it unclear that it doesn't do any PCI accesses directly. % @@ -2699,7 +2702,7 @@ % % if ((sc->bge_asicrev == BGE_ASICREV_BCM5700 && % sc->bge_chipid != BGE_CHIPID_BCM5700_B2) || % - statusword || sc->bge_link_evt) % + statusword || BGE_STS_BIT(sc, BGE_STS_LINK_EVT)) % bge_link_upd(sc); % % if (ifp->if_drv_flags & IFF_DRV_RUNNING) { The software link status handling causes a problem in the interrupt handler. To avoid pessimizing the case of shared interrupts, the interrupt handler should be able to read the status word from the status block and return without doing anything if the interrupt is not for it. This can be done without acquiring the driver lock (since the driver lock is neither necessary nor sufficient for accessing the status block). Software link status gets in the way of this, since accessing it requires the driver lock. (Note that `statusword' in the above is now bogusly named. It used to be the status word from the status block but is now the mac status for a PCI register. It is still from the status block in similar code in bge_poll(). Not pessimizing the case of shared interrupts requires using the status block again, and then the read from the PCI register might become a dummy.) Elsewhere, you added code to force interrupt after link status changes. I think this is to get the interrupt handler to look at the software link status in the above. When I first saw it, I thought that forcing an interrupt is needed in more places. There are also complications to work around hardware bugs in reporting the link status. Maybe the complications can be avoided by pushing them to the tick routine. My version doesn't worry about them except in a commment and returns early without locking if the status block doesn't claim to have been updated. Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20070207003539.I31484>