Date:       Tue, 10 Jul 2001 16:00:01 -0700 (PDT)
From:       "Richard A. Steenbergen" <ras@e-gerbil.net>
To:         freebsd-bugs@FreeBSD.org
Subject:    Re: conf/28882: Network defaults are absurdly low.
Message-ID: <200107102300.f6AN01e48582@freefall.freebsd.org>
The following reply was made to PR conf/28882; it has been noted by GNATS.

From: "Richard A. Steenbergen" <ras@e-gerbil.net>
To: bicknell@ufp.org
Cc: FreeBSD-gnats-submit@freebsd.org, billf@elvis.mu.org
Subject: Re: conf/28882: Network defaults are absurdly low.
Date: Tue, 10 Jul 2001 18:52:36 -0400 (EDT)

> kern.ipc.maxsockbuf=16777216
> net.inet.tcp.sendspace=4194304
> net.inet.tcp.recvspace=4194304
>
> I suspect the FreeBSD authors will want to be more conservative to
> allow lower-memory machines to operate properly, so I suggest the
> following system defaults:
>
> kern.ipc.maxsockbuf=1048576
> net.inet.tcp.sendspace=524288
> net.inet.tcp.recvspace=524288

This is potentially a "Very Bad Thing" (tm). These numbers set the default
buffer sizes, and the hard cap on what any socket may request, for every
TCP socket on the system. With maxsockbuf set that high, a malicious user
could consume all system memory and most likely provoke a crash, simply by
setting a socket's send buffer to 16MB, connecting to a slow host, and
dumping 16MB of data into the kernel. Even non-malicious uses can result in
very bad behavior; for example, anyone using those defaults on a system
which opens a large number of sockets (an http server which could dump
large files into the send buffer unnecessarily, or an irc server with many
thousands of connections) could quickly find their machine crashing due to
mbuf exhaustion.

Admins who tweak the global defaults for these settings do so at their own
risk, and it is assumed they understand that they will only be able to run
applications which use a very small number of high-bandwidth TCP streams.
The better solution for an application which must work under those
conditions is to raise the buffers on a per-socket basis with setsockopt()
(see the sketch below), which is why the default maxsockbuf is 256k: a
reasonably high cap which supports a fairly large amount of bandwidth and
which can be tuned higher if so desired. Your numbers are not an
appropriate default for most users.

> The following sysctl variables show FreeBSD defaults:
>
> kern.ipc.maxsockbuf: 262144
> net.inet.tcp.sendspace: 16384
> net.inet.tcp.recvspace: 16384
>
> These are absurd. The tcp.sendspace/recvspace limit the window size of a
> TCP connection, which in turn limits throughput. On a 50ms coast-to-coast
> path, that imposes a limit of 16384 Bytes * 1000ms/sec / 50ms =
> 327 KBytes/sec. This is a third of what a 10Mbps cable modem should be
> able to deliver, to say nothing of a 100Meg FE connected host (e.g. a
> server) at an ISP. Go further, 155ms to Japan from the east coast of the
> US, and you're down to under 100 KBytes/sec, all due to a poor software
> limit.

Actually, you're working on the wrong cure for the right problem. TCP
window size does limit throughput over high-bandwidth, high-latency
connections, and under the current implementation the advertised TCP
window is limited by the size of the socket buffer, but the correct
solution is not to increase the socket buffer; it is to size it dynamically
based on the performance of the TCP session. The socket buffers are not
chunks of memory allocated when the socket is created; they are simply
numbers checked to decide whether a new mbuf with additional data may be
appended. The advertised receive window is used as a form of host-memory
congestion control, and is presently advertised based on the fixed number
set for the socket buffer.
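To make the per-socket alternative described above concrete, here is a
minimal sketch in C (not part of the original mail). The helper name and
the 100 Mbit/s / 70 ms path figures are illustrative assumptions, not
recommendations.

/*
 * Sketch only: size a single socket's buffers from an assumed
 * bandwidth-delay product instead of raising the system-wide defaults.
 * The bandwidth, RTT, and function name are illustrative assumptions.
 */
#include <sys/types.h>
#include <sys/socket.h>
#include <stdio.h>

int
open_bulk_socket(void)
{
    /* bytes that must be in flight to fill the assumed path */
    long bps = 100L * 1000 * 1000;                  /* assumed bandwidth, bits/sec */
    long rtt_ms = 70;                               /* assumed round-trip time */
    int bufsize = (int)(bps / 8 * rtt_ms / 1000);   /* ~875 KB */

    int s = socket(AF_INET, SOCK_STREAM, 0);
    if (s < 0)
        return (-1);

    /*
     * Set before connect()/listen() so the window scale option can be
     * negotiated on the SYN.  The request is still bounded by
     * kern.ipc.maxsockbuf, so only this socket (not every socket on the
     * machine) pays for the larger buffers.
     */
    if (setsockopt(s, SOL_SOCKET, SO_SNDBUF, &bufsize, sizeof(bufsize)) < 0 ||
        setsockopt(s, SOL_SOCKET, SO_RCVBUF, &bufsize, sizeof(bufsize)) < 0)
        perror("setsockopt");

    return (s);
}

On a stock system a request this large would still be refused until the
admin raises kern.ipc.maxsockbuf, but that knob is a bound on what one
socket may ask for, not a default handed to every connection on the
machine.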
Basing the advertised window on that fixed socket-buffer number is stupid
and limiting, since a connection can be held to 16kb in flight even when no
congestion has been encountered and the system is capable of receiving
larger amounts of data. The advertisement should instead be based on a
global memory-availability status, most likely a reserved amount for all
network socket buffers. The demand on this global "pool" can be determined
by each TCP session's actual congestion window, improving the throughput of
all sockets automatically without the danger of negatively impacting the
entire system. This memory is actually only allocated during packet-loss
recovery, when you are holding data with gaps and waiting for
retransmission, or holding data which has not been acknowledged and must be
retransmitted. That makes it particularly tricky to reach optimum levels of
performance while still gracefully handling a network event which puts
every session into recovery simultaneously. Work is currently being done on
the design of this system, which will result in significantly better
performance and stability than just "making the numbers bigger".

> tcp_extensions="YES"
>
> should be the default, as they are highly unlikely to break anything in
> this day and age, and are necessary to use windows over 64K. An
> interesting compromise would be to only set the settings I suggest if
> tcp_extensions is turned on.

This was just recently made the default, in the hope that the ancient
terminal servers which broke when they received a TCP option they did not
understand have all been retired. Other possible ways to improve the
versatility of this option (which Windows 2000 implements) are separation
of the two major components of RFC1323 (timestamps and window scaling), and
a "passive" mode which would respond to TCP SYNs that include these options
but would not initiate them on outbound connections.

-- 
Richard A Steenbergen <ras@e-gerbil.net>       http://www.e-gerbil.net/ras
PGP Key ID: 0x138EA177 (67 29 D7 BC E8 18 3E DA B2 46 B3 D8 14 36 FE B6)
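An application that wants to rely on windows larger than 64K can first
check that the RFC1323 extensions are actually enabled. A minimal sketch,
assuming FreeBSD's sysctlbyname(3) and the net.inet.tcp.rfc1323 sysctl; the
function name is mine, and the mapping from the tcp_extensions rc.conf knob
to this sysctl is recalled from the rc scripts of the era, not stated in
the mail above.

/*
 * Sketch only: report whether the RFC1323 extensions (window scaling and
 * timestamps) are enabled.  Assumes the net.inet.tcp.rfc1323 sysctl,
 * believed to be the knob behind the tcp_extensions rc.conf setting.
 */
#include <sys/types.h>
#include <sys/sysctl.h>
#include <stdio.h>

int
rfc1323_enabled(void)
{
    int on = 0;
    size_t len = sizeof(on);

    /* sysctlbyname() returns -1 and sets errno if the OID is missing */
    if (sysctlbyname("net.inet.tcp.rfc1323", &on, &len, NULL, 0) < 0) {
        perror("sysctlbyname");
        return (0);
    }
    return (on != 0);
}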