Date:      Wed, 7 Dec 2016 18:05:06 -0600
From:      Rajil Saraswat <rajil.s@gmail.com>
To:        freebsd-virtualization@FreeBSD.org
Subject:   Re: Debian 8 CPU stall
Message-ID:  <6cff5bf2-9654-8627-83c4-6ab48ee763b5@gmail.com>
In-Reply-To: <9c9e83a5-16c6-0ab5-0ac4-af0a54430706@freebsd.org>
References:  <b011b080-5637-2da5-8a8a-819b7b1fabd3@gmail.com> <9c9e83a5-16c6-0ab5-0ac4-af0a54430706@freebsd.org>


On 12/06/2016 10:50 PM, Peter Grehan wrote:
> Hi Rajil,
>
>> I get these messages in Debian 8 VM running in bhyve FreeBSD-11
>> release.  Any idea what could be the issue:
>>
>> INFO: rcu_sched detected stalls on CPUs/tasks: {} (detected by 0,
>> t=11047 jiffies, g=1038939, c=1038938, q=77)
>> INFO: Stall ended before state dump start
>
>  That's a sign that a vCPU wasn't able to run for an amount of time.
>
>  Is the system oversubscribed ? i.e. more vCPUs than physical CPUs ?
> Or, is the guest performing a lot of i/o ?
>
> later,
>
> Peter.
>

No, the system is not oversubscribed. I have a 11 vCPU (1 on Debian and 1
on Ubuntu) on a 24-core machine. The Debian VM is running x2go and an
ssh server for remote access, so the I/O shouldn't be an issue. The
Ubuntu VM doesn't give any warning messages, though.

The following is the latest error I received in the Debian VM:
[152444.353007] INFO: rcu_sched self-detected stall on CPU { 0}  (t=6809 jiffies g=2685598 c=2685597 q=6)
[152444.354261] sending NMI to all CPUs:
[152444.354270] NMI backtrace for cpu 0
[152444.354274] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.0-4-amd64 #1 Debian 3.16.36-1+deb8u2
[152444.354275] Hardware name:   BHYVE, BIOS 1.00 03/14/2014
[152444.354277] task: ffffffff8181a460 ti: ffffffff81800000 task.ti: ffffffff81800000
[152444.354278] RIP: 0010:[<ffffffff81047a6d>]  [<ffffffff81047a6d>] default_send_IPI_mask_sequence_phys+0xad/0xe0
[152444.354293] RSP: 0018:ffff88007fc03e18  EFLAGS: 00000046
[152444.354294] RAX: 0000000000000400 RBX: 000000000000a0ea RCX: 0000000000000000
[152444.354296] RDX: 0000000000000000 RSI: 0000000000000200 RDI: 0000000000000300
[152444.354297] RBP: ffffffff818e29c0 R08: ffffffff818e29c0 R09: 00000000000001bb
[152444.354298] R10: 0000000000000000 R11: ffff88007fc03b96 R12: 0000000000000400
[152444.354299] R13: 0000000000000096 R14: 0000000000000002 R15: 0000000000000000
[152444.354301] FS:  0000000000000000(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
[152444.354303] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[152444.354303] CR2: 00000000019a3240 CR3: 00000000797ca000 CR4: 00000000000406f0
[152444.354305] Stack:
[152444.354306]  0000000000000000 ffff88007fc0d6a0 ffffffff81853800 0000000000000000
[152444.354308]  ffffffff818e2f20 0000000000000006 ffffffff81853800 ffffffff81047cf3
[152444.354309]  ffff88007fc0d6a0 ffffffff810c73ea ffffffff00000007 ffffffff810c8f35
[152444.354311] Call Trace:
[152444.354313]  <IRQ>
[152444.354318]  [<ffffffff81047cf3>] ? arch_trigger_all_cpu_backtrace+0xc3/0x140
[152444.354327]  [<ffffffff810c73ea>] ? rcu_check_callbacks+0x42a/0x670
[152444.354331]  [<ffffffff810c8f35>] ? timekeeping_update.constprop.9+0x35/0x70
[152444.354335]  [<ffffffff810d1df0>] ? tick_sched_handle.isra.16+0x60/0x60
[152444.354343]  [<ffffffff81075f80>] ? update_process_times+0x40/0x70
[152444.354345]  [<ffffffff810d1db0>] ? tick_sched_handle.isra.16+0x20/0x60
[152444.354347]  [<ffffffff810d1e2c>] ? tick_sched_timer+0x3c/0x60
[152444.354351]  [<ffffffff8108c667>] ? __run_hrtimer+0x67/0x210
[152444.354353]  [<ffffffff8108ca69>] ? hrtimer_interrupt+0xe9/0x220
[152444.354359]  [<ffffffff8151b46b>] ? smp_apic_timer_interrupt+0x3b/0x50
[152444.354365]  [<ffffffff815194fd>] ? apic_timer_interrupt+0x6d/0x80
[152444.354366]  <EOI>
[152444.354374]  [<ffffffff8101da20>] ? mwait_idle+0xa0/0xa0
[152444.354381]  [<ffffffff81052c02>] ? native_safe_halt+0x2/0x10
[152444.354384]  [<ffffffff8101da39>] ? default_idle+0x19/0xd0
[152444.354389]  [<ffffffff810a9b44>] ? cpu_startup_entry+0x374/0x470
[152444.354392]  [<ffffffff81903076>] ? start_kernel+0x497/0x4a2
[152444.354394]  [<ffffffff81902a04>] ? set_init_arg+0x4e/0x4e
[152444.354396]  [<ffffffff81902120>] ? early_idt_handler_array+0x120/0x120
[152444.354398]  [<ffffffff8190271f>] ? x86_64_start_kernel+0x14d/0x15c
[152444.354399] Code: 8b 0c 25 00 53 5f ff 80 e5 10 75 f2 44 89 f8 c1 e0 18 89 04 25 10 53 5f ff 41 83 fe 02 44 89 e0 41 0f 45 c6 89 04 25 00 53 5f ff <eb> 91 4c 89 ef 57 9d 0f 1f 44 00 00 48 83 c4 08 5b 5d 41 5c 41
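
To get a rough idea of how long each stall lasted, the "t=... jiffies"
field of an rcu_sched report can be converted to seconds. The following is
only a sketch: parse_stall and STALL_RE are hypothetical names, and the
tick rate of 250 Hz is an assumption about the guest kernel's CONFIG_HZ
that should be checked against the actual kernel config. The function
expects one report as a single string, with any mail-wrapped continuation
rejoined.

```python
import re

# Numeric fields of an rcu_sched stall report; commas between fields are optional
# because the two message variants in this thread differ on that point.
STALL_RE = re.compile(
    r"t=(?P<t>\d+)\s+jiffies,?\s+g=(?P<g>\d+),?\s+c=(?P<c>\d+),?\s+q=(?P<q>\d+)"
)


def parse_stall(line, hz=250):
    """Return the stall fields as ints plus an estimated duration in seconds.

    hz is ASSUMED to be 250; verify it against the guest's CONFIG_HZ.
    Returns None if the line does not look like a stall report.
    """
    match = STALL_RE.search(line)
    if match is None:
        return None
    fields = {name: int(value) for name, value in match.groupdict().items()}
    fields["seconds"] = fields["t"] / hz
    return fields


if __name__ == "__main__":
    report = ("INFO: rcu_sched self-detected stall on CPU { 0}  "
              "(t=6809 jiffies g=2685598 c=2685597 q=6)")
    print(parse_stall(report))
```

Under the 250 Hz assumption, t=6809 jiffies would be roughly 27 seconds
during which the vCPU made no RCU progress.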

I use vm-bhyve to manage the VMs; the configuration for debian8 looks like this:

guest="linux"
loader="grub"
cpu=1
memory=2048M
network0_type="virtio-net"
network0_switch="lannetwork"
disk0_type="virtio-blk"
disk0_name="/dev/zvol/vmpool/os2"
disk0_dev="custom"
passthru0="2/0/0"
passthru1="2/0/1"
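
As a way to double-check the oversubscription question, the cpu= values
from configs like the one above can be summed and compared with the host
core count. This is a minimal sketch, not part of vm-bhyve: parse_vm_conf
and is_oversubscribed are hypothetical helpers, the parser only handles
simple key="value" lines, and treating a missing cpu= entry as 1 is an
illustrative assumption.

```python
def parse_vm_conf(text):
    """Parse simple key=value lines (optionally double-quoted) into a dict."""
    conf = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        conf[key.strip()] = value.strip().strip('"')
    return conf


def is_oversubscribed(guest_confs, host_cores):
    """True if the guests' vCPUs add up to more than host_cores.

    A guest without a cpu= entry is counted as 1 vCPU (an assumption).
    """
    total_vcpus = sum(int(conf.get("cpu", 1)) for conf in guest_confs)
    return total_vcpus > host_cores


if __name__ == "__main__":
    debian = parse_vm_conf('guest="linux"\nloader="grub"\ncpu=1\nmemory=2048M\n')
    ubuntu = {"cpu": "1"}  # stand-in for the second guest's config
    print(is_oversubscribed([debian, ubuntu], host_cores=24))
```

With one vCPU per guest and 24 host cores the check comes out false, which
matches the description earlier in this message.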


Thanks,
Rajil


