Date: Mon, 14 Jan 2013 17:05:43 +0100
From: Andre Oppermann <andre@freebsd.org>
To: Alexander Motin <mav@FreeBSD.org>
Cc: Adrian Chadd <adrian@freebsd.org>, src-committers@freebsd.org,
    Alan Cox <alc@rice.edu>, "Jayachandran C." <jchandra@freebsd.org>,
    svn-src-all@freebsd.org, Alfred Perlstein <bright@mu.org>,
    Oleksandr Tymoshenko <gonzo@bluezbox.com>, freebsd-arch@freebsd.org,
    svn-src-head@freebsd.org
Subject: Re: svn commit: r243631 - in head/sys: kern sys
Message-ID: <50F42CD7.6020400@freebsd.org>
In-Reply-To: <50F4297F.8050708@FreeBSD.org>
References: <201211272119.qARLJxXV061083@svn.freebsd.org>
    <ABB3E29B-91F3-4C25-8FAB-869BBD7459E1@bluezbox.com>
    <50C1BC90.90106@freebsd.org>
    <50C25A27.4060007@bluezbox.com>
    <50C26331.6030504@freebsd.org>
    <50C26AE9.4020600@bluezbox.com>
    <50C3A3D3.9000804@freebsd.org>
    <50C3AF72.4010902@rice.edu>
    <330405A1-312A-45A5-BB86-4969478D8BBD@bluezbox.com>
    <50D03E83.8060908@rice.edu>
    <50DD081E.8000409@bluezbox.com>
    <50EB1841.5030006@bluezbox.com>
    <50EB22D2.6090103@rice.edu>
    <50EB415F.8020405@freebsd.org>
    <CA%2B7sy7CkdoyScOEDEXWuwJxjCS5zTcC8_fu9isCeTFxT8opNJQ@mail.gmail.com>
    <50F04FE5.7010406@rice.edu>
    <CA%2B7sy7D=ZjTLirGW3BVGcAu0h8-dWpib%2BYziUjEqegOL9J4adw@mail.gmail.com>
    <CAJ-VmonLoL4E3UsNwx87p2FuHXTbJe7wFs9hBn5Zmr7TTQOSkg@mail.gmail.com>
    <50F1BD69.4060104@mu.org>
    <CAJ-VmokjZ_vpcmYeD65pWJN5tfhqn6yDXrFFcXf8dvYc55tQtg@mail.gmail.com>
    <50F2F79C.7040109@mu.org>
    <50F41F8C.5030900@freebsd.org>
    <50F4297F.8050708@FreeBSD.org>
On 14.01.2013 16:51, Alexander Motin wrote:
> On 14.01.2013 17:09, Andre Oppermann wrote:
>> On 13.01.2013 19:06, Alfred Perlstein wrote:
>>> On 1/12/13 10:32 PM, Adrian Chadd wrote:
>>>> On 12 January 2013 11:45, Alfred Perlstein <bright@mu.org> wrote:
>>>>
>>>>> I'm not sure if regressing to the waterfall method of development is
>>>>> a good idea at this point.
>>>>>
>>>>> I see a light at the end of the tunnel and we need to continue to
>>>>> just handle these minor corner cases as we progress.
>>>>>
>>>>> If we move to a model where a minor bug is grounds to completely
>>>>> remove helpful code then nothing will ever get done.
>>>>>
>>>> Allocating 512MB worth of callwheels on a 16GB MIPS machine is a
>>>> little silly, don't you think?
>>>>
>>>> That suggests to me that the extent to which maxfiles/maxusers/etc
>>>> percolates through the codebase wasn't totally understood by those
>>>> who wish to change it.
>>>>
>>>> I'd rather see some more investigative work into outlining things that
>>>> need fixing and start fixing those, rather than "just change stuff and
>>>> fix whatever issues creep up."
>>>>
>>>> I kinda hope we all understand what we're working on in the kernel a
>>>> little better than that.
>>>
>>> Cool! I'm glad people are now aware of the callwheel allocation
>>> being insane with large maxusers.
>>>
>>> I saw this about a month ago (if not longer), but since there were
>>> half a dozen people calling me an imbecile who hadn't really yet read
>>> the code I didn't want to inflame them more by fixing that with
>>> "a hack". (actually a simple fix).
>>>
>>> A simple fix is to clamp callwheel size to the previous result of a
>>> maxusers of 384 and call it a day.
>>>
>>> However the simplicity of that approach would probably inflame too
>>> many feelings so I am unsure as to how to proceed.
>>>
>>> Any ideas?
>>
>> I noticed the callwheel dependency as well and asked mav@ about it
>> in a short email exchange. He said it has only little use and goes
>> away with the calloutng import. While that is outstanding we need
>> to clamp it to a sane value.
>>
>> However I don't know what a sane value would be and why its size is
>> directly derived from maxproc and maxfiles. If there can be one
>> callout per process and open file descriptor in the system, then
>> it probably has to be so big. If it can deal with 'collisions'
>> in the wheel it can be much smaller.
>
> As I've actually written, there are two different things:
>
> ncallout -- number of preallocated callout structures for purposes of
> timeout() calls. That is a legacy API that is probably not very much
> used now, so that value doesn't need to be too big. But that allocation
> is static, and if it is ever exhausted the system will panic. That is
> why it was set quite high. The right way now would be to analyze where
> that API is still used and estimate the number really required.

Can timeout() be emulated on top of another API so we can do away with it?

> callwheelsize -- number of slots in the callwheel. That is purely an
> optimization value. If set too low, it will just increase the number of
> hash collisions with no effect other than some slowdown. The optimal
> value here does depend on the number of callouts in the system, but not
> only on that. Since the array index there is not really a hash, it is
> practically useless to set the array size higher than the median callout
> interval divided by hz (or by 1ms in calloutng). The problem is to
> estimate that median value, which completely depends on workload.

OK. So for example a large number of TCP connections would use up a
large number of slots in the callwheel. I'll try to come up with a
reasonable, sane scaling value.

> Each ncallout entry costs 32-52 bytes, while each callwheelsize slot
> costs only 8-16 and could probably be reduced to 4-8 by replacing TAILQ
> with LIST. So it is ncallout and the respective timeout() API that
> should be dealt with first.

I'll give it a try.

--
Andre
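
For a feel of the numbers above, here is a rough sketch of the clamp and the memory it would bound. It is only an illustration: wheel_slots_for(), estimate_cost() and struct wheel_cost are hypothetical names, not kernel code, and rounding the wheel up to a power of two is an assumption made here. The per-entry byte costs are the 32-52 and 8-16 figures quoted in the thread.

```c
#include <assert.h>
#include <limits.h>

/* Per-entry costs in bytes, taken from the figures quoted in the thread. */
enum {
	CALLOUT_BYTES_LO = 32, CALLOUT_BYTES_HI = 52,	/* one ncallout entry */
	SLOT_BYTES_LO = 8, SLOT_BYTES_HI = 16		/* one callwheel slot */
};

/* Hypothetical result type for this sketch. */
struct wheel_cost {
	unsigned long ncallout;	/* callout count after clamping */
	unsigned long slots;	/* wheel slots derived from it */
	unsigned long bytes_lo;	/* low memory estimate */
	unsigned long bytes_hi;	/* high memory estimate */
};

/*
 * Smallest power of two >= ncallout.  ASSUMPTION: the wheel size is
 * rounded up this way; the thread itself does not say so.
 */
static unsigned long
wheel_slots_for(unsigned long ncallout)
{
	unsigned long size = 1;

	/* Guard against zero and against the shift overflowing. */
	assert(ncallout >= 1 && ncallout <= ULONG_MAX / 2 + 1);
	while (size < ncallout)
		size <<= 1;
	return (size);
}

/*
 * Clamp ncallout to a caller-supplied cap, then estimate the memory
 * taken by the callout structures plus the wheel itself.  The
 * multiplications are not checked for overflow; keep inputs modest.
 */
static struct wheel_cost
estimate_cost(unsigned long ncallout, unsigned long ncallout_cap)
{
	struct wheel_cost c;

	c.ncallout = ncallout > ncallout_cap ? ncallout_cap : ncallout;
	c.slots = wheel_slots_for(c.ncallout);
	c.bytes_lo = c.ncallout * CALLOUT_BYTES_LO + c.slots * SLOT_BYTES_LO;
	c.bytes_hi = c.ncallout * CALLOUT_BYTES_HI + c.slots * SLOT_BYTES_HI;
	return (c);
}
```

With an arbitrary cap of 18508 entries, estimate_cost(1000000, 18508) yields 32768 slots and between 854400 and 1486704 bytes, so under these assumptions the callout structures outweigh the wheel, which matches the point that ncallout is the part to deal with first.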