From: Alex Chistyakov
To: "freebsd-emulation@freebsd.org"
Cc: Alexander Motin, Andriy Gapon
Date: Tue, 27 Nov 2012 03:45:51 +0300
Subject: Re: VirtualBox 4.2.4 on FreeBSD 9.1-PRERELEASE problem: VMs behave very different when pinned to different cores

On Mon, Nov 26, 2012 at 11:10 AM, Alex Chistyakov wrote:
> On Sun, Nov 25, 2012 at 9:57 PM, Andriy Gapon wrote:
>> on 24/11/2012 00:17 Alex Chistyakov said the following:
>>> On Fri, Nov 23, 2012 at 9:06 PM, Andriy Gapon wrote:
>>>>
>>>> I've cc-ed Alexander who is deeply familiar with both the scheduler and the timer
>>>> code.
>>>> I think that it would be nice to get ktr(4) information suitable for use with
>>>> schedgraph (please google for these keywords).
>>>
>>> I collected two samples and put them here: http://1888.spb.ru/samples.zip
>>> sched-cpu0.ktr is for a VM running on CPU #0 and sched-cpu1.ktr is for
>>> a VM running on CPU #1
>>> They seem to be very different.
>>
>> It looks like you didn't stop ktr tracing before running ktrdump or something
>> like that.  schedgraph can not grok the files because it believes that the
>> timestamps are incorrect.
>>
>> # - While the workload is continuing (i.e. before it finishes), disable
>> #   KTR tracing by setting 'sysctl debug.ktr.mask=0'.  This is necessary
>> #   to avoid a race condition while running ktrdump, i.e. the KTR ring buffer
>> #   will cycle a bit while ktrdump runs, and this confuses schedgraph because
>> #   the timestamps appear to go backwards at some point.
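Noted about stopping the tracing before the dump. For my own reference, the
full collection sequence from the schedgraph.py header is roughly the
following (this assumes KTR_SCHED is included in KTR_COMPILE and KTR_MASK in
the kernel config; the exact ktrdump and schedgraph invocations here are from
memory, so better double-check them against the script header):

# run the VM workload, and while it is still running stop the tracing,
# so the ring buffer does not cycle under ktrdump:
sysctl debug.ktr.mask=0
# dump the buffer with CPU numbers and timestamps:
ktrdump -ct > ktr.out
# feed the dump to the analyzer:
python schedgraph.py ktr.out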
>>>> Also, version of your kernel,
>>>
>>> kern.version: FreeBSD 9.1-PRERELEASE #4: Fri Nov 23 22:38:47 MSK 2012
>>> Sources were grabbed on Nov, 16.
>>>
>>>> output of sysctls kern.eventtimer and kern.sched.
>>>
>>> kern.eventtimer.choice: LAPIC(600) HPET(550) HPET1(440) HPET2(440)
>>> i8254(100) RTC(0)
>>> kern.eventtimer.et.LAPIC.flags: 7
>>> kern.eventtimer.et.LAPIC.frequency: 50002806
>>> kern.eventtimer.et.LAPIC.quality: 600
>>> kern.eventtimer.et.RTC.flags: 17
>>> kern.eventtimer.et.RTC.frequency: 32768
>>> kern.eventtimer.et.RTC.quality: 0
>>> kern.eventtimer.et.i8254.flags: 1
>>> kern.eventtimer.et.i8254.frequency: 1193182
>>> kern.eventtimer.et.i8254.quality: 100
>>> kern.eventtimer.et.HPET.flags: 7
>>> kern.eventtimer.et.HPET.frequency: 14318180
>>> kern.eventtimer.et.HPET.quality: 550
>>> kern.eventtimer.et.HPET1.flags: 3
>>> kern.eventtimer.et.HPET1.frequency: 14318180
>>> kern.eventtimer.et.HPET1.quality: 440
>>> kern.eventtimer.et.HPET2.flags: 3
>>> kern.eventtimer.et.HPET2.frequency: 14318180
>>> kern.eventtimer.et.HPET2.quality: 440
>>> kern.eventtimer.periodic: 0
>>> kern.eventtimer.timer: LAPIC
>>> kern.eventtimer.activetick: 1
>>> kern.eventtimer.idletick: 0
>>> kern.eventtimer.singlemul: 2
>>> kern.sched.cpusetsize: 8
>>> kern.sched.preemption: 1
>>> kern.sched.topology_spec:
>>> kern.sched.steal_thresh: 2
>>> kern.sched.steal_idle: 1
>>> kern.sched.balance_interval: 127
>>> kern.sched.balance: 1
>>> kern.sched.affinity: 1
>>> kern.sched.idlespinthresh: 16
>>> kern.sched.idlespins: 10000
>>> kern.sched.static_boost: 152
>>> kern.sched.preempt_thresh: 80
>>> kern.sched.interact: 30
>>> kern.sched.slice: 12
>>> kern.sched.quantum: 94488
>>> kern.sched.name: ULE
>>>
>>> I tried kern.eventtimer.periodic=1 and
>>> kern.timecounter.hardware=ACPI-fast but that did not help.
>>
>> Could you please also provide the CPU identification block from dmesg?
>
> CPU: Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz (3200.18-MHz K8-class CPU)
>   Origin = "GenuineIntel"  Id = 0x206d7  Family = 0x6  Model = 0x2d
>   Stepping = 7
>   Features=0xbfebfbff
>   Features2=0x1fbee3bf
>   AMD Features=0x2c100800
>   AMD Features2=0x1
>   TSC: P-state invariant, performance statistics
>
>>
>>>> BTW, do you use the default ULE scheduler?
>>>
>>> Yep.
>>> I tried SCHED_4BSD and the situation became much better but not ideal.
>>> %si was around 3-7% on the guest and I had to boot with noacpi and
>>> disable the tickless kernel on the guest to lower it.
>>> At least I was able to run a VM on CPU #0 and all cores became equal.
>>
>> Interesting results.
>>
>>>> Also, is your kernel DTrace enabled?
>>
>> Could you please run the following script for some seconds and report its output
>> for both of the scenarios:
>>
>> profile:::profile-4001
>> {
>>         @stacks[pid, tid, execname, stack()] = count();
>> }
>>
>> END
>> {
>>         trunc(@stacks, 40);
>>         printa(@stacks);
>> }
>
> A DTrace result for "bad" core is at https://gist.github.com/4145782
> and a result for "good" core is at https://gist.github.com/4146946

Okay, I have finally found a way to use PMP and got some stack traces for
the "good" and "bad" VM runs.
The "good" results are here: https://gist.github.com/4151328
and the "bad" ones are here: https://gist.github.com/4151343
I sampled for 160 seconds both times, one sample per second.
I used the script titled "Full technology demonstration" from
http://poormansprofiler.org/, so the numbers on the left are the total
counts of identical stack traces.
As you can see, the second result contains "select,select,listenerRun,??,??"
and "select,select,clientInput,??,??" samples, so I guessed they could be
related to VNC/VRDE.
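For reference, the PMP script I used is essentially the "Full technology
demonstration" one published on that page, adapted only in the obvious
places; roughly this (nsamples/sleeptime match what I described above, and
the pgrep pattern for the VM process is from memory, so adjust it to your
setup):

#!/bin/sh
# poor man's profiler: sample gdb backtraces and count identical stacks
nsamples=160
sleeptime=1
# assumption: the VM runs as a VBoxHeadless process; adjust the pattern if not
pid=$(pgrep -n VBoxHeadless)

for x in $(seq 1 $nsamples)
  do
    gdb -ex "set pagination 0" -ex "thread apply all bt" -batch -p $pid
    sleep $sleeptime
  done | \
awk '
  BEGIN { s = ""; }
  /^Thread/ { print s; s = ""; }
  /^\#/ { if (s != "") { s = s "," $4 } else { s = $4 } }
  END { print s }' | \
sort | uniq -c | sort -r -n -k 1,1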
I disabled VRDE and was then able to boot the VM on the "bad" core without
major problems.
But CPU usage remained quite high on the "bad" core and ping was very
unstable too, so this was not the end of the story.
I collected PMP stats for the "bad" core with VRDE disabled:
https://gist.github.com/4151530
A lot of time seems to be spent in the VMMR3HwAccRunGC call.
Well, this function is just a bridge to the guest code, so I am not sure
how to interpret this fact.

Thank you,

--
SY,
Alex
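P.S. For completeness: the DTrace results linked above were obtained by
saving Andriy's script verbatim (as, say, profile.d - the file name does not
matter) and running something along the lines of

dtrace -s profile.d

for a few seconds and then pressing Ctrl-C, so that the END clause prints the
40 most frequent stacks. If DTrace is not loaded yet, 'kldload dtraceall'
should take care of that.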