Date: Tue, 8 Jan 2002 19:33:25 -0500 (EST)
From: Robert Watson <rwatson@FreeBSD.org>
To: qa@FreeBSD.org
Subject: Re: Reduced reliability due to larger socket queue defaults for TCP
Message-ID: <Pine.NEB.3.96L.1020108192957.32228A-100000@fledge.watson.org>
In-Reply-To: <Pine.NEB.3.96L.1020106174749.96223A-100000@fledge.watson.org>
The temptation here is one of two things:

(1) Back out the change increasing the send-q socket buffer size as a
    default, and restore tuning(7) to recommend increasing the value, or

(2) Add the following text to the release notes:

    In 4.5-RELEASE, default socket buffer sizes are increased to maximize
    performance on high speed networks.  However, under some circumstances,
    this can dramatically increase the memory requirements of the network
    system, requiring a manual bumping of the kernel NMBCLUSTERS setting.
    This can be set using kern.ipc.nmbclusters.

My temptation is to bump (1) back a bit, possibly bump up the keepalive
rate, and stick in this note.  Reliability==good.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Project
robert@fledge.watson.org      NAI Labs, Safeport Network Services

On Sun, 6 Jan 2002, Robert Watson wrote:

> Recently ran into the following circumstance on a server with about a 15
> day uptime (and hence about a 15-day old version of -STABLE):
>
> tcp4  0  33090  204.156.12.50.80  213.197.75.52.2378    FIN_WAIT_1
> tcp4  0  33304  204.156.12.50.80  198.54.202.4.24052    FIN_WAIT_1
> tcp4  0  32120  204.156.12.50.80  24.27.14.83.50129     FIN_WAIT_1
> tcp4  0  33089  204.156.12.50.80  213.197.75.52.2381    FIN_WAIT_1
> tcp4  0  33304  204.156.12.50.80  198.54.202.4.23509    FIN_WAIT_1
> tcp4  0  33304  204.156.12.50.80  212.182.63.102.28130  FIN_WAIT_1
> tcp4  0  33304  204.156.12.50.80  62.233.128.65.13712   FIN_WAIT_1
> tcp4  0  33580  204.156.12.50.80  212.182.13.23.3473    LAST_ACK
> tcp4  0  31856  204.156.12.50.80  198.54.202.4.20584    FIN_WAIT_1
> tcp4  0  31856  204.156.12.50.80  212.182.63.102.29962  LAST_ACK
> tcp4  0  33304  204.156.12.50.80  198.54.202.4.23960    FIN_WAIT_1
> tcp4  0  31482  204.156.12.50.80  213.197.75.52.2373    FIN_WAIT_1
> tcp4  0  32551  204.156.12.50.80  213.197.75.52.2374    FIN_WAIT_1
>
> (on the order of hundreds of these), resulting in mbufs getting
> exhausted.  maxusers is set to 256, so nmbclusters is 4608, which was
> previously a reasonable default.  Presumably the problem I'm experiencing
> is that dud connections have doubled in capacity due to a larger send
> queue size.  I've temporarily dropped the send queue max until I can
> reboot the machine to increase nmbclusters, but this failure mode does
> seem unfortunate.  It's also worth considering adding a release note
> entry indicating that while this can improve performance, it can also
> reduce scalability.  I suppose this shouldn't have caught me by surprise,
> but it did, since that server had previously not had a problem... :-)
>
> I don't suppose the TCP spec allows us to drain send socket queues in
> FIN_WAIT_1 or LAST_ACK?  :-)  Any other bright suggestions on ways we can
> make this change "safer"?
>
> Robert N M Watson             FreeBSD Core Team, TrustedBSD Project
> robert@fledge.watson.org      NAI Labs, Safeport Network Services

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-qa" in the body of the message
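A rough sketch of the stopgap tuning described in the quoted message, using
stock FreeBSD tools.  The sysctl and tunable names are the standard ones
(kern.ipc.nmbclusters is the knob named in the note above); the specific
values are illustrative guesses for a box like the one shown, not tested
recommendations:

    # Check mbuf/cluster usage against the configured ceiling.
    netstat -m
    sysctl kern.ipc.nmbclusters

    # Stopgap: pull the default TCP send buffer back down toward the
    # pre-4.5 default so stuck FIN_WAIT_1/LAST_ACK connections pin less
    # memory (16384 is assumed here as the old default).
    sysctl -w net.inet.tcp.sendspace=16384

    # Longer term: raise the cluster limit at boot, either via the loader
    # tunable (where supported) or by building a kernel with a larger
    # NMBCLUSTERS option; both take effect only after a reboot.
    echo 'kern.ipc.nmbclusters=8192' >> /boot/loader.conf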