Date: Sun, 20 Mar 2011 12:24:13 -0500
From: Alan Cox <alan.l.cox@gmail.com>
To: George Neville-Neil <gnn@neville-neil.com>
Cc: arch@freebsd.org, Navdeep Parhar <nparhar@gmail.com>
Subject: Re: Updating our TCP and socket sysctl values...
Message-ID: <AANLkTi=FJnaPhYpHvV2hAogqRhgMzMnk=k02X0ZBFoGs@mail.gmail.com>
In-Reply-To: <281E39E0-55D0-4B52-9CD9-F437442B67EC@neville-neil.com>
References: <132388F1-44D9-45C9-AE05-1799A7A2DCD9@neville-neil.com>
 <AANLkTi=ptv617t0KhgNrcxTUzLmQd0eLFBf2x4+P7EAL@mail.gmail.com>
 <281E39E0-55D0-4B52-9CD9-F437442B67EC@neville-neil.com>
On Sat, Mar 19, 2011 at 10:47 PM, George Neville-Neil <gnn@neville-neil.com> wrote:
>
> On Mar 20, 2011, at 08:13 , Navdeep Parhar wrote:
>
> > On Fri, Mar 18, 2011 at 11:37 PM, George Neville-Neil
> > <gnn@neville-neil.com> wrote:
> >>
> >> Howdy,
> >>
> >> I believe it's time for us to upgrade our sysctl values for TCP
> >> sockets so that they are more in line with the modern world. At the
> >> moment we have these limits on our buffering:
> >>
> >> kern.ipc.maxsockbuf: 262144
> >> net.inet.tcp.recvbuf_max: 262144
> >> net.inet.tcp.sendbuf_max: 262144
> >>
> >> I believe it's time to up these values to something that's in line
> >> with higher-speed local networks, such as 10G. Perhaps it's time to
> >> move these to 2MB instead of 256K.
> >>
> >> Thoughts?
> >
> > 256KB seems adequate for 10G (as long as the consumer can keep
> > draining the socket rcv buffer fast enough). If you consider 2 x
> > bandwidth-delay product to be a reasonable socket buffer size, then
> > 256K allows for 10G networks with ~100us delays. Normally the delay
> > is _way_ less than this for 10G, and even 256K may be overkill (but
> > this is OK; the kernel has tcp_do_autorcvbuf on by default).
> >
> > While we're here discussing defaults, what about nmbclusters and
> > nmbjumboXX? Those haven't kept up with modern machines (imho).
>
> Yes, we should also up nmbclusters, IMHO, but I wasn't going to
> put that in the same bucket with the TCP buffers just yet.
> On 64-bit/large-memory machines you could make nmbclusters
> far higher than our current default. I know people who just set
> it to 1,000,000 by default.
>
> If people are also happy to up nmbclusters, I'm willing to conflate
> that with this.

A more modest but nonetheless significant increase could also be possible on i386 machines.
If you go back to r129906, wherein we switched to using UMA for allocating mbufs and mbuf clusters, and read it carefully, you'll find that there was a subtle mistake made in the changes to the sizing of the kmem_map, or the "kernel heap". Prior to r129906, the overall size of the kmem map was based on the limits on mbufs and mbuf clusters PLUS the amount of kernel heap that was desired for everything else. After r129906, the limits on mbufs and mbuf clusters no longer made any difference to the size of the kmem map. The reason is that the limit on mbuf clusters is factored into the autosizing too early: it is added to the minimum "kernel heap" size, not the desired size. So, the end result is that mbufs, mbuf clusters, and everything else were made to compete for a smaller kmem map.

In short, r129906 should have increased VM_KMEM_SIZE_MAX from its current limit of 320MB. I'd be curious if people running i386-based network servers have any problems with using

#ifndef VM_KMEM_SIZE_MAX
#define VM_KMEM_SIZE_MAX	((VM_MAX_KERNEL_ADDRESS - \
    VM_MIN_KERNEL_ADDRESS + 1) * 3 / 5)
#endif

in place of

#ifndef VM_KMEM_SIZE_MAX
#define VM_KMEM_SIZE_MAX	(320 * 1024 * 1024)
#endif

Really, the only downside to this change is that it reduces the kernel virtual address space available for thread stacks and for 9KB and 16KB jumbo frames.

Alan