Date: Tue, 16 Apr 2013 17:14:54 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: YongHyeon PYUN <pyunyh@gmail.com>
Cc: Sean Bruno <sean_bruno@yahoo.com>, bde <bde@freebsd.org>, "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
Subject: Re: bge(4) sysctl tuneables -- a blast from the past.
Message-ID: <20130416162150.X1106@besplex.bde.org>
In-Reply-To: <20130416052500.GA1428@michelle.cdnetworks.com>
References: <1365781568.1418.1.camel@localhost> <20130413200512.G1165@besplex.bde.org> <1366065356.1350.7.camel@localhost> <20130416052500.GA1428@michelle.cdnetworks.com>

On Tue, 16 Apr 2013, YongHyeon PYUN wrote:

> On Mon, Apr 15, 2013 at 03:35:56PM -0700, Sean Bruno wrote:
>>
>>> FreeBSD has too many knobs, but it would be nice if the bge defaults weren't
>>> so broken, so that they don't need overriding.
>>
>> So many knobs ... well here's more. :-)
>>
>> http://people.freebsd.org/~sbruno/bge_config_update.txt
>>
>> At least this gets a man page update with references to manuals.
>
> You have to change BGE_STD_RX_RING_CNT to change number of RX
> descriptors. It's hard-coded and it needs much more work to change
> that. And I don't see any reason to modify that though(Max # of RX
> descriptor is 512).

I thought that at first too, but a simple change along these lines must
be OK since old versions had it. There was a BGE_SSLOTS "option" that
was 256. This was used instead of BGE_STD_RX_RING_CNT in much the same
places that the tunable is now used, since 512 bds used to be a lot.
From FreeBSD-~5.2:

@ /*
@  * The standard receive ring has 512 entries in it. At 2K per mbuf cluster,
@  * that's 1MB or memory, which is a lot. For now, we fill only the first
@  * 256 ring entries and hope that our CPU is fast enough to keep up with
@  * the NIC.
@  */

"1MB or (sic) memory" wasn't actually a lot when this code was written
in 2001.

@ static int
@ bge_init_rx_ring_std(sc)
@ 	struct bge_softc *sc;
@ {
@ 	int i;
@
@ 	/* XXX dishonour the above comment and the misplaced def of BGE_SSLOTS. */
@ #undef BGE_SSLOTS
@ #define BGE_SSLOTS BGE_STD_RX_RING_CNT

The above 3 lines are from ~5.2, to blow away the BGE_SSLOTS. It was
easier to edit here than the header file.

@ 	for (i = 0; i < BGE_SSLOTS; i++) {
@ 		if (bge_newbuf_std(sc, i, NULL) == ENOBUFS)
@ 			return(ENOBUFS);
@ 	};

We intentionally removed this "option" and the other SSLOTS "options",
and changed the default to the maximum possible (BGE_STD_RX_RING_CNT
here). No one knew how to tune these options, especially since they
weren't in conf/options. Not many more than one person knew that these
options existed.

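The arithmetic behind the old comment (512 entries at 2K each, versus the 256 that BGE_SSLOTS used to fill) can be written out as a small sketch. This is not driver code: RX_CLUSTER_BYTES, rx_slots_clamp() and rx_ring_bytes() are names made up for illustration, and the clamp is only a guess at what a slot-count tunable would have to do to stay within the hard-coded ring size.

```c
#include <assert.h>
#include <stddef.h>

/* Ring size quoted in the thread; "2K per mbuf cluster" is from the old comment. */
#define BGE_STD_RX_RING_CNT	512
#define RX_CLUSTER_BYTES	2048	/* hypothetical name, used only in this sketch */

/* Hypothetical helper: force a requested slot count into 1..BGE_STD_RX_RING_CNT. */
static int
rx_slots_clamp(int requested)
{
	if (requested < 1)
		return (1);
	if (requested > BGE_STD_RX_RING_CNT)
		return (BGE_STD_RX_RING_CNT);
	return (requested);
}

/* Cluster memory held when `slots` ring entries are filled. */
static size_t
rx_ring_bytes(int slots)
{
	assert(slots >= 0 && slots <= BGE_STD_RX_RING_CNT);
	return ((size_t)slots * RX_CLUSTER_BYTES);
}
```

Under these assumptions, filling all 512 slots pins 1048576 bytes (the 1MB of the old comment), and the old 256-slot setting pins 524288 bytes.
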
I don't see how this tunable is useful. The default of the maximum
value optimizes for throughput and for minimizing dropped packets.
Smaller values would only save a little RAM, but everyone has plenty of
RAM. Reducing RAM footprint may reduce (~L2) cache pressure, but it
takes _very_ delicate tuning to take advantage of that.

> I think bge(4) touches minimal set of coalescing parameters but
> publicly available bge(4) data sheet shows more coalescing
> parameters.

I don't remember any others except a different set for interrupt mode.

> These parameters could be programmed with different
> values(BDs & ticks) during interrupt. And some parameters are not

No, these are worse than useless, except possibly with a different
interrupt handler organization. I looked at a Linux driver using them.
They just gave pessimizations, even in the Linux driver. IIRC, this is
because using them complicates synchronization, so that an extra PIO
read or two is needed to synchronize. All this for a change in the
settings that is useless for most interrupt handler organizations.

You could try reducing the interrupt moderation while in interrupt
mode, so as to see all new activity before returning. That is unlikely
to be good. In other interrupt handlers, it is an optimization to not
check for new activity before returning, because the (PIO) read of the
status register is slower than taking another interrupt. bge only
needs a read from the status block, so it would not be so slow (but
yongari thought that trusting the status block on entry to the
interrupt handler like my version does is too fragile). Also, only a
small amount of new activity can occur while in interrupt mode (else
the interrupt handler can't keep up), and handling in small batches
(perhaps 1 bd at a time) would be slow.

You could try increasing the interrupt moderation while in interrupt
mode. I can't see any point in that.

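For readers who have not met the two kinds of coalescing parameters named above (BDs and ticks), here is a toy model of how such a pair is usually understood to moderate RX interrupts. It is an assumption about the general scheme, not something taken from the bge data sheet or the driver; struct rx_coal_model and rx_intr_due() are made-up names, and treating 0 as "criterion disabled" is part of that assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one pair of RX coalescing parameters. */
struct rx_coal_model {
	uint32_t coal_ticks;	/* timer threshold; 0 = timer criterion off (assumed) */
	uint32_t max_coal_bds;	/* BD-count threshold; 0 = count criterion off (assumed) */
};

/*
 * Decide whether the model would raise an RX interrupt, given the number
 * of received-but-unannounced BDs and the ticks elapsed since the first
 * of them arrived.
 */
static bool
rx_intr_due(const struct rx_coal_model *c, uint32_t pending_bds,
    uint32_t ticks_since_first)
{
	if (pending_bds == 0)
		return (false);		/* nothing to announce */
	if (c->max_coal_bds != 0 && pending_bds >= c->max_coal_bds)
		return (true);		/* enough BDs batched up */
	if (c->coal_ticks != 0 && ticks_since_first >= c->coal_ticks)
		return (true);		/* oldest BD has waited long enough */
	return (false);			/* both criteria off: never fires */
}
```

In this model, larger values mean fewer interrupts and bigger batches per interrupt, at the cost of latency; smaller values mean the opposite.
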
Perhaps if the hardware is really good, completely turning off
interrupt moderation while in interrupt mode would be good. The good
hardware would keep DMAing and updating the status block to indicate
how far it got, and we would process as many descriptors as possible in
a single batch before returning, without caring about missing a few.
Then the problems are avoiding races in this and switching back to
normal mode and getting another interrupt for descriptors that we
missed. With good hardware, there would be no extra i/o for this.

You could add knobs for the interrupt mode settings, and code to support
them. The main use for the knobs would be to re-verify that any
non-default use of these knobs gives a pessimization. Not useful.

> applicable to certain controllers. In addition, the allowed value
> range for certain parameters vary on controller models. So I think
> it's good idea to mention allowed value range for each parameters
> as well as a warning that mentions possible connection lost caused
> by wrongly programmed value(i.e. no RX interrupt for
> bge_rx_coal_ticks == 0 && bge_rx_max_coal_bds == 0)

Does anyone know the ranges for all models? I still haven't found any
documentation that the range is null (nothing works) on 5705-"plus".
bge_rx_coal_ticks == 0 && bge_rx_max_coal_bds == 0 might work
accidentally if there are enough tx interrupts.

There is also the DEVICE_POLLING mistake. In polling mode, these
parameters of course have no effect (Polling mode disables interrupts,
and the coal parameters have no effect when interrupts are disabled).

Bruce