Date: Sat, 24 Nov 2012 01:17:58 +0300
From: Alex Chistyakov <alexclear@gmail.com>
To: Andriy Gapon <avg@freebsd.org>
Cc: "freebsd-emulation@freebsd.org" <freebsd-emulation@freebsd.org>, Alexander Motin <mav@freebsd.org>
Subject: Re: VirtualBox 4.2.4 on FreeBSD 9.1-PRERELEASE problem: VMs behave very different when pinned to different cores
Message-ID: <CA+kq2xv+U4ZnfK=1js4PRaNpTNdW-y-G50GV4+MVP0LugBf1pQ@mail.gmail.com>
In-Reply-To: <50AFAD05.1050604@FreeBSD.org>
References: <CA+kq2xvh3j5CM7UzRVfXCeLhHwpTY+_M7dCJx0c27NtV8EVJwg@mail.gmail.com>
 <CAE-m3X1UPsy+wbqm_02JpXMr-UO3m7N6z_ZwY2HNo4GL0YUi1w@mail.gmail.com>
 <CA+kq2xva61m_bHdzBZM2TYL5z7XiohvkxsYWtOyoBwQkpyvp0A@mail.gmail.com>
 <50AFAD05.1050604@FreeBSD.org>
On Fri, Nov 23, 2012 at 9:06 PM, Andriy Gapon <avg@freebsd.org> wrote:
>
> I've cc-ed Alexander, who is deeply familiar with both the scheduler and
> the timer code.
> I think that it would be nice to get ktr(4) information suitable for use
> with schedgraph (please google for these keywords).

I collected two samples and put them here: http://1888.spb.ru/samples.zip
sched-cpu0.ktr is for a VM running on CPU #0 and sched-cpu1.ktr is for a VM
running on CPU #1. They seem to be very different.

> Also, version of your kernel,

kern.version: FreeBSD 9.1-PRERELEASE #4: Fri Nov 23 22:38:47 MSK 2012
Sources were grabbed on Nov 16.

> output of sysctls kern.eventtimer and kern.sched.

kern.eventtimer.choice: LAPIC(600) HPET(550) HPET1(440) HPET2(440) i8254(100) RTC(0)
kern.eventtimer.et.LAPIC.flags: 7
kern.eventtimer.et.LAPIC.frequency: 50002806
kern.eventtimer.et.LAPIC.quality: 600
kern.eventtimer.et.RTC.flags: 17
kern.eventtimer.et.RTC.frequency: 32768
kern.eventtimer.et.RTC.quality: 0
kern.eventtimer.et.i8254.flags: 1
kern.eventtimer.et.i8254.frequency: 1193182
kern.eventtimer.et.i8254.quality: 100
kern.eventtimer.et.HPET.flags: 7
kern.eventtimer.et.HPET.frequency: 14318180
kern.eventtimer.et.HPET.quality: 550
kern.eventtimer.et.HPET1.flags: 3
kern.eventtimer.et.HPET1.frequency: 14318180
kern.eventtimer.et.HPET1.quality: 440
kern.eventtimer.et.HPET2.flags: 3
kern.eventtimer.et.HPET2.frequency: 14318180
kern.eventtimer.et.HPET2.quality: 440
kern.eventtimer.periodic: 0
kern.eventtimer.timer: LAPIC
kern.eventtimer.activetick: 1
kern.eventtimer.idletick: 0
kern.eventtimer.singlemul: 2
kern.sched.cpusetsize: 8
kern.sched.preemption: 1
kern.sched.topology_spec: <groups>
kern.sched.steal_thresh: 2
kern.sched.steal_idle: 1
kern.sched.balance_interval: 127
kern.sched.balance: 1
kern.sched.affinity: 1
kern.sched.idlespinthresh: 16
kern.sched.idlespins: 10000
kern.sched.static_boost: 152
kern.sched.preempt_thresh: 80
kern.sched.interact: 30
kern.sched.slice: 12
kern.sched.quantum: 94488
kern.sched.name: ULE

I tried kern.eventtimer.periodic=1 and kern.timecounter.hardware=ACPI-fast,
but that did not help.
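For anyone who wants to experiment with the same knobs: both are plain
sysctls, so they can be flipped at run time (or set from loader.conf). A
minimal sketch, assuming the timer names from the kern.eventtimer.choice
line above:

    # switch the event timer away from LAPIC, e.g. to the main HPET
    sysctl kern.eventtimer.timer=HPET
    # use periodic interrupts instead of one-shot mode
    sysctl kern.eventtimer.periodic=1
    # select a different timecounter for timekeeping
    sysctl kern.timecounter.hardware=ACPI-fast

The .ktr samples were captured for schedgraph; the usual recipe comes from
the comments at the top of src/tools/sched/schedgraph.py (treat the exact
option values as an assumption, not my exact config):

    # in the kernel config
    options KTR
    options KTR_ENTRIES=262144
    options KTR_COMPILE=(KTR_SCHED)
    options KTR_MASK=(KTR_SCHED)

then, after rebooting into the new kernel and reproducing the problem:

    ktrdump -ct > sched-cpu0.ktr

and the dump can be viewed with src/tools/sched/schedgraph.py.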
> BTW, do you use the default ULE scheduler?

Yep. I tried SCHED_4BSD and the situation became much better, but not
ideal: %si was around 3-7% on the guest, and I had to boot the guest with
noacpi and disable its tickless kernel to lower it. At least I was able to
run a VM on CPU #0, and all cores became equal.

> Also, is your kernel DTrace enabled?

Yep.

Thank you!

--
SY, Alex

> on 23/11/2012 17:52 Alex Chistyakov said the following:
>> On Fri, Nov 23, 2012 at 6:20 PM, Bernhard Fröhlich <decke@freebsd.org> wrote:
>>> On Fri, Nov 23, 2012 at 2:15 PM, Alex Chistyakov <alexclear@gmail.com> wrote:
>>>> Hello,
>>>>
>>>> I am back with another problem. As I discovered previously, setting
>>>> CPU affinity explicitly helps to get decent performance on guests, but
>>>> the problem is that guest performance is very different on core #0 and
>>>> cores #5 or #7. Basically, when I use 'cpuset -l 0 VBoxHeadless -s
>>>> "Name" -v on' to start the VM, it is barely usable at all. The best
>>>> performance results are on cores #4 and #5 (I believe they are the
>>>> same physical core due to HT). #7 and #8 are twice as slow as #5, #0
>>>> and #1 are the slowest, and the other cores lie in the middle.
>>>> If I disable the tickless kernel on a guest running on #4 or #5, it
>>>> becomes as slow as a guest running on #7, so I suspect this is a
>>>> timer-related issue.
>>>> I also discovered that there are quite a lot of system interrupts on
>>>> slow guests (%si is about 10-15), but Munin does not render them on
>>>> its CPU graphs for some reason.
>>>> All my VMs are on cores #4 and #5 right now, but I want to utilize
>>>> other cores too. I am not sure what to do next; this looks like a
>>>> VirtualBox bug. What can be done to solve this?
>>>
>>> I do not want to sound ignorant, but what do you expect? Each VBox
>>> VM consists of somewhere around 15 threads, and some of them are the
>>> vCPUs. You bind them all to the same CPU, so they will fight for CPU
>>> time on that single core, and latency will get unpredictable, and so
>>> will performance. And then you add more and more craziness by running
>>> it on cpu0 and an HT-enabled CPU ...
>>
>> Your point regarding HTT is perfectly valid, so I just disabled it in
>> the BIOS. Unfortunately it did not help.
>> When I run a single VM on CPU #0 I get the following load pattern on the host:
>>
>> last pid:  2744;  load averages: 0.93, 0.63, 0.31    up 0+00:05:25  19:37:17
>> 368 processes: 8 running, 344 sleeping, 16 waiting
>> CPU 0: 14.7% user,  0.0% nice, 85.3% system,  0.0% interrupt,  0.0% idle
>> CPU 1:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
>> CPU 2:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
>> CPU 3:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
>> CPU 4:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
>> CPU 5:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
>> Mem: 410M Active, 21M Inact, 921M Wired, 72K Cache, 60G Free
>> ARC: 136M Total, 58M MRU, 67M MFU, 272K Anon, 2029K Header, 8958K Other
>> Swap: 20G Total, 20G Free
>>
>> And when I run it on CPU #4 the situation is completely different:
>>
>> last pid:  2787;  load averages: 0.05, 0.37, 0.31    up 0+00:11:45  19:43:37
>> 368 processes: 9 running, 343 sleeping, 16 waiting
>> CPU 0:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
>> CPU 1:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
>> CPU 2:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
>> CPU 3:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
>> CPU 4:  1.8% user,  0.0% nice, 11.0% system,  0.0% interrupt, 87.2% idle
>> CPU 5:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
>> Mem: 412M Active, 20M Inact, 1337M Wired, 72K Cache, 60G Free
>> ARC: 319M Total, 136M MRU, 171M MFU, 272K Anon, 2524K Header, 9340K Other
>> Swap: 20G Total, 20G Free
>>
>> Regarding pinning the VM to a certain core - yes, I agree with you,
>> it's better not to pin VMs explicitly, but I was forced to do this. If
>> I do not pin the VM explicitly, it gets scheduled to a "bad" core
>> sooner or later and the whole VM becomes unresponsive. And I was able to
>> run as many as 6 VMs on HTT cores #4/#5 quite successfully. These VMs
>> were staging machines without too much load on them, but I wanted to
>> put some production resources on this host too - that's why I wanted
>> to know how to utilize other cores safely.
>
>
> --
> Andriy Gapon
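P.S. One middle ground between free floating and hard pinning to a single
core: cpuset(1) accepts a CPU list or range, not just one core, so the VM
could be restricted to a set that avoids the problematic cores. A sketch
only - the VM name and pid below are hypothetical:

    # start the VM restricted to cores 2-5, keeping it off cpu0/cpu1
    cpuset -l 2-5 VBoxHeadless -s "Name" -v on

    # or re-restrict an already running VM by process id
    cpuset -l 2-5 -p 12345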