Date: Mon, 20 Aug 2012 12:59:58 +0300
From: Alexander Motin <mav@FreeBSD.org>
To: Doug Barton <dougb@FreeBSD.org>
Cc: Adrian Chadd <adrian@freebsd.org>, lev@freebsd.org, current@freebsd.org
Subject: Re: CURRENT as gateway on not-so-fast hardware: where is a bottlneck?
Message-ID: <50320A9E.5070303@FreeBSD.org>
In-Reply-To: <5031F636.1020405@FreeBSD.org>
References: <157941699.20120815004542@serebryakov.spb.ru> <CAJ-Vmon86-FPs4%2BXXkQXAow1jW465pMM2Sj7ZHi_0_E9VYSFSA@mail.gmail.com> <502AE8B5.9090106@FreeBSD.org> <502B775D.7000101@FreeBSD.org> <5031F636.1020405@FreeBSD.org>

On 20.08.2012 11:32, Doug Barton wrote:
> On 08/15/2012 03:18, Alexander Motin wrote:
>> On 15.08.2012 03:09, Doug Barton wrote:
>>> On 08/14/2012 12:20 PM, Adrian Chadd wrote:
>>>> Would you be willing to compile a kernel with KTR so you can capture
>>>> some KTR scheduler dumps?
>>>>
>>>> That way the scheduler peeps can feed this into schedgraph.py (and you
>>>> can too!) to figure out what's going on.
>>>>
>>>> Maybe things aren't being scheduled correctly and the added latency is
>>>> killing performance?
>>>
>>> You might also try switching to SCHED_ULE to see if it helps.
>>>
>>> Although, in the last few months as mav has been converging the 2 I've
>>> started to see the same problems I saw on my desktop systems previously
>>> re-appear even using ULE. For example, if I'm watching an AVI with VLC
>>> and start doing anything that generates a lot of interrupts (like moving
>>> large quantities of data from one disk to another) the video and sound
>>> start to skip. Also, various other desktop features (like menus, window
>>> switching, etc.) start to take measurable time to happen, sometimes
>>> seconds.
>>>
>>> ... and lest you think this is just a desktop problem, I've seen the
>>> same scenario on 8.x systems used as web servers. With ULE they were
>>> frequently getting into peak load situations that created what I called
>>> "mini thundering herd" problems where they could never quite get caught
>>> up. Whereas switching to 4BSD the same servers got into high-load
>>> situations less often, and they recovered on their own in minutes.
>>
>> It is quite pointless to speculate without real info like mentioned
>> above KTR_SCHED traces.
>
> I'm sorry, you're quite wrong about that. In the cases I mentioned, and
> in about 2 out of 3 of the cases where users reported problems and I
> suggested that they try 4BSD, the results were clear. This obviously
> points out that there is a serious problem with ULE, and if I were the
> one who was responsible for that code I would be looking at ways of
> helping users figure out where the problems are. But that's just me.

I am not saying anything bad about 4BSD. The choice is provided because
the two schedulers are indeed different and neither is perfect. 4BSD also
has problems. What I would like to say is that if we want to improve the
situation, we need more detailed info than just a verbal description.

I am not saying that ULE is perfect. I went there because I've seen
problems, and I am still fixing some pieces. I am just trying to explain
the described behavior based on what I know about it, hoping that it may
help somebody to set up some new experiments or try some tuning/fixing.

-- 
Alexander Motin