Date: Wed, 31 Oct 2007 22:38:24 +0100
From: Kris Kennaway <kris@FreeBSD.org>
To: Alexey Popov <lol@chistydom.ru>
Cc: freebsd-hackers@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: amrd disk performance drop after running under high load
Message-ID: <4728F5D0.5020906@FreeBSD.org>
In-Reply-To: <47286CF2.4090804@chistydom.ru>
References: <47137D36.1020305@chistydom.ru> <47140906.2020107@FreeBSD.org>
 <47146FB4.6040306@chistydom.ru> <47147E49.9020301@FreeBSD.org>
 <47149E6E.9000500@chistydom.ru> <4715035D.2090802@FreeBSD.org>
 <4715C297.1020905@chistydom.ru> <4715C5D7.7060806@FreeBSD.org>
 <471EE4D9.5080307@chistydom.ru> <4723BF87.20302@FreeBSD.org>
 <47286CF2.4090804@chistydom.ru>

Alexey Popov wrote:
> Hi
>
> Kris Kennaway wrote:
>>>>>>> So I can conclude that FreeBSD has a long standing bug in VM that
>>>>>>> could be triggered when serving large amount of static data (much
>>>>>>> bigger than memory size) on high rates. Possibly this only
>>>>>>> applies to large files like mp3 or video.
>>>>>> It is possible, we have further work to do to conclude this though.
>>>>> I forgot to mention I have pmc and kgmon profiling for good and bad
>>>>> times. But I have not enough knowledge to interpret it right and
>>>>> not sure if it can help.
>>>> pmc would be useful.
>>> pmc profiling attached.
>> OK, the pmc traces do seem to show that it's not a lock contention
>> issue. That being the case I don't think the fact that different
>> servers perform better is directly related.
> But it was evidence of mbuf lock contention in mutex profiling, wasn't
> it? As far as I understand, mutex problems can exist without increasing
> CPU load in pmc stats, right?

No, the lock functions will show up as using a lot of CPU. I guess the
lock profiling trace showed high numbers because you ran it for a long
time.

>> There is also no evidence of a VM problem. What your vmstat and pmc
>> traces show is that your system really isn't doing much work at all,
>> relatively speaking.
>> There is also still no evidence of a disk problem. In fact your disk
>> seems to be almost idle in both cases you provided, only doing between
>> 1 and 10 operations per second, which is trivial.
> vmstat and network output graphs shows that the problem exists. If it is
> not a disk or network or VM problem, what else could be wrong?

The vmstat output you provided so far doesn't show anything specific.

>> In the "good" case you are getting a much higher interrupt rate but
>> with the data you provided I can't tell where from. You need to run
>> vmstat -i at regular intervals (e.g. every 10 seconds for a minute)
>> during the "good" and "bad" times, since it only provides counters and
>> an average rate over the uptime of the system.
> I'll try this, but AFAIR there was no strangeness with interrupts.
>
> I believe the reason of high interrupt rate in "good" cases is that
> server sends much traffic.
>
>> What there is evidence of is an interrupt aliasing problem between em
>> and USB:
>> irq16: uhci0 1464547796 1870
>> irq64: em0 1463513610 1869
> I tried disabling USB in kernel, this issue was gone, but the main
> problem was left. Also I have this issue with interrupt aliasing on many
> servers without problems.

OK.

Kris
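
The point about sampling vmstat -i at intervals can be illustrated with a small sketch. Because vmstat -i reports cumulative totals and an average rate since boot, the rate during a particular "good" or "bad" window has to be computed from the difference between two snapshots. The helper names (parse_vmstat_i, interval_rates) are hypothetical, the first snapshot reuses the two counters quoted in the message, and the second snapshot's numbers are invented purely for illustration; they are not measurements from the thread.

```python
import re

def parse_vmstat_i(text):
    """Parse `vmstat -i` style output into {interrupt name: cumulative total}."""
    totals = {}
    for line in text.splitlines():
        # Expected row shape: "<name> <total> <rate>",
        # e.g. "irq64: em0  1463513610  1869".
        # The "interrupt total rate" header has no numeric columns and is skipped.
        m = re.match(r"^(.*\S)\s+(\d+)\s+(\d+)\s*$", line)
        if m and not m.group(1).lower().startswith("total"):
            totals[m.group(1)] = int(m.group(2))
    return totals

def interval_rates(before, after, seconds):
    """Interrupts per second over the sampling interval, per source."""
    return {
        name: (after[name] - before[name]) / seconds
        for name in after
        if name in before
    }

# First snapshot: the counters quoted in the message.
SNAPSHOT_T0 = """\
interrupt                          total       rate
irq16: uhci0                  1464547796       1870
irq64: em0                    1463513610       1869
"""

# Second snapshot, taken 10 seconds later: HYPOTHETICAL values for illustration.
SNAPSHOT_T1 = """\
interrupt                          total       rate
irq16: uhci0                  1464577786       1870
irq64: em0                    1463543610       1869
"""

if __name__ == "__main__":
    rates = interval_rates(
        parse_vmstat_i(SNAPSHOT_T0), parse_vmstat_i(SNAPSHOT_T1), 10
    )
    for name, rate in sorted(rates.items()):
        print(f"{name}: {rate:.1f} interrupts/s over the interval")
```

In practice the two snapshots would come from saving the output of vmstat -i every 10 seconds during each period; note that the "rate" column barely moves between snapshots, which is why only the totals are used here.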