Date: Wed, 7 Dec 2016 18:05:06 -0600
From: Rajil Saraswat <rajil.s@gmail.com>
To: freebsd-virtualization@FreeBSD.org
Subject: Re: Debian 8 CPU stall
Message-ID: <6cff5bf2-9654-8627-83c4-6ab48ee763b5@gmail.com>
In-Reply-To: <9c9e83a5-16c6-0ab5-0ac4-af0a54430706@freebsd.org>
References: <b011b080-5637-2da5-8a8a-819b7b1fabd3@gmail.com> <9c9e83a5-16c6-0ab5-0ac4-af0a54430706@freebsd.org>

On 12/06/2016 10:50 PM, Peter Grehan wrote:
> Hi Rajil,
>
>> I get these messages in Debian 8 VM running in bhyve FreeBSD-11
>> release. Any idea what could be the issue:
>>
>> INFO: rcu_sched detected stalls on CPUs/tasks: {} (detected by 0,
>> t=11047 jiffies, g=1038939, c=1038938, q=77)
>> INFO: Stall ended before state dump start
>
> That's a sign that a vCPU wasn't able to run for an amount of time.
>
> Is the system oversubscribed ? i.e. more vCPUs than physical CPUs ?
> Or, is the guest performing a lot of i/o ?
>
> later,
>
> Peter.
>

No, the system is not oversubscribed. I have a 11 vCPU (1 on Debian and
1 on Ubuntu) on a 24 core machine. The Debian jail is running x2go and
an ssh server for remote access, so the I/O shouldn't be an issue. The
Ubuntu jail doesn't give out any warning messages though.

Following is the latest error I received in the Debian VM:

[152444.353007] INFO: rcu_sched self-detected stall on CPU { 0} (t=6809 jiffies g=2685598 c=2685597 q=6)
[152444.354261] sending NMI to all CPUs:
[152444.354270] NMI backtrace for cpu 0
[152444.354274] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.0-4-amd64 #1 Debian 3.16.36-1+deb8u2
[152444.354275] Hardware name: BHYVE, BIOS 1.00 03/14/2014
[152444.354277] task: ffffffff8181a460 ti: ffffffff81800000 task.ti: ffffffff81800000
[152444.354278] RIP: 0010:[<ffffffff81047a6d>] [<ffffffff81047a6d>] default_send_IPI_mask_sequence_phys+0xad/0xe0
[152444.354293] RSP: 0018:ffff88007fc03e18 EFLAGS: 00000046
[152444.354294] RAX: 0000000000000400 RBX: 000000000000a0ea RCX: 0000000000000000
[152444.354296] RDX: 0000000000000000 RSI: 0000000000000200 RDI: 0000000000000300
[152444.354297] RBP: ffffffff818e29c0 R08: ffffffff818e29c0 R09: 00000000000001bb
[152444.354298] R10: 0000000000000000 R11: ffff88007fc03b96 R12: 0000000000000400
[152444.354299] R13: 0000000000000096 R14: 0000000000000002 R15: 0000000000000000
[152444.354301] FS: 0000000000000000(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
[152444.354303] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[152444.354303] CR2: 00000000019a3240 CR3: 00000000797ca000 CR4: 00000000000406f0
[152444.354305] Stack:
[152444.354306] 0000000000000000 ffff88007fc0d6a0 ffffffff81853800 0000000000000000
[152444.354308] ffffffff818e2f20 0000000000000006 ffffffff81853800 ffffffff81047cf3
[152444.354309] ffff88007fc0d6a0 ffffffff810c73ea ffffffff00000007 ffffffff810c8f35
[152444.354311] Call Trace:
[152444.354313] <IRQ>
[152444.354318] [<ffffffff81047cf3>] ? arch_trigger_all_cpu_backtrace+0xc3/0x140
[152444.354327] [<ffffffff810c73ea>] ? rcu_check_callbacks+0x42a/0x670
[152444.354331] [<ffffffff810c8f35>] ? timekeeping_update.constprop.9+0x35/0x70
[152444.354335] [<ffffffff810d1df0>] ? tick_sched_handle.isra.16+0x60/0x60
[152444.354343] [<ffffffff81075f80>] ? update_process_times+0x40/0x70
[152444.354345] [<ffffffff810d1db0>] ? tick_sched_handle.isra.16+0x20/0x60
[152444.354347] [<ffffffff810d1e2c>] ? tick_sched_timer+0x3c/0x60
[152444.354351] [<ffffffff8108c667>] ? __run_hrtimer+0x67/0x210
[152444.354353] [<ffffffff8108ca69>] ? hrtimer_interrupt+0xe9/0x220
[152444.354359] [<ffffffff8151b46b>] ? smp_apic_timer_interrupt+0x3b/0x50
[152444.354365] [<ffffffff815194fd>] ? apic_timer_interrupt+0x6d/0x80
[152444.354366] <EOI>
[152444.354374] [<ffffffff8101da20>] ? mwait_idle+0xa0/0xa0
[152444.354381] [<ffffffff81052c02>] ? native_safe_halt+0x2/0x10
[152444.354384] [<ffffffff8101da39>] ? default_idle+0x19/0xd0
[152444.354389] [<ffffffff810a9b44>] ? cpu_startup_entry+0x374/0x470
[152444.354392] [<ffffffff81903076>] ? start_kernel+0x497/0x4a2
[152444.354394] [<ffffffff81902a04>] ? set_init_arg+0x4e/0x4e
[152444.354396] [<ffffffff81902120>] ? early_idt_handler_array+0x120/0x120
[152444.354398] [<ffffffff8190271f>] ? x86_64_start_kernel+0x14d/0x15c
[152444.354399] Code: 8b 0c 25 00 53 5f ff 80 e5 10 75 f2 44 89 f8 c1 e0 18 89 04 25 10 53 5f ff 41 83 fe 02 44 89 e0 41 0f 45 c6 89 04 25 00 53 5f ff <eb> 91 4c 89 ef 57 9d 0f 1f 44 00 00 48 83 c4 08 5b 5d 41 5c 41

I use vm-bhyve for managing the jails. The configuration for debian8
looks like this:

guest="linux"
loader="grub"
cpu=1
memory=2048M
network0_type="virtio-net"
network0_switch="lannetwork"
disk0_type="virtio-blk"
disk0_name="/dev/zvol/vmpool/os2"
disk0_dev="custom"
passthru0="2/0/0"
passthru1="2/0/1"

Thanks,
Rajil

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?6cff5bf2-9654-8627-83c4-6ab48ee763b5>