From: Kári Hreinsson
Date: Wed, 6 Sep 2017 21:30:07 -0700
Subject: (bhyve) Debian vm crashing with kernel panic
To: freebsd-virtualization@freebsd.org

Dear all,

I have been experiencing random Linux kernel panics on a Debian virtual machine running under bhyve on FreeBSD 11.1, and believe it may be related to the virtualization environment. I am not an advanced FreeBSD user by any means, which is why I am turning to this mailing list for possible answers, realizing that I could be making some simple errors.

I have two similar (same version and kernel) Debian VMs running on the FreeBSD host, one of them lightly loaded and running without any issues, the other one more heavily loaded and experiencing kernel panics a few days after booting.

CPU: Intel(R) Xeon(R) CPU E3-1275 v6
Host system: 11.1-RELEASE-p1
VM: Debian 9 (Stretch), kernel 4.9.0-3-amd64 #1 SMP Debian 4.9.30-2+deb9u3

On the FreeBSD side of things I find nothing in any logs under /var/log indicating any problem (perhaps I am not looking in the right places?). On the Debian side of things an open SSH session got plenty of these leading up to the crash:

kernel:[489300.648296] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s!
[kworker/0:0:14902]

The Debian kern.log file contains this just before the crash:

Sep 6 10:23:59 hostname kernel: [488456.219948] INFO: rcu_sched self-detected stall on CPU
Sep 6 10:23:59 hostname kernel: [488456.220007] 0-...: (5249 ticks this GP) idle=b45/140000000000001/0 softirq=27802459/27802459 fqs=2423
Sep 6 10:23:59 hostname kernel: [488456.220062] (t=5250 jiffies g=10449032 c=10449031 q=319)
Sep 6 10:23:59 hostname kernel: [488456.220093] Task dump for CPU 0:
Sep 6 10:23:59 hostname kernel: [488456.220094] kworker/0:0 R running task 0 14902 2 0x00000008
Sep 6 10:23:59 hostname kernel: [488456.220108] Workqueue: rpciod rpc_async_schedule [sunrpc]
Sep 6 10:23:59 hostname kernel: [488456.220109] ffffffff90713580 ffffffff8faa3bcb 0000000000000000 ffffffff90713580
Sep 6 10:23:59 hostname kernel: [488456.220111] ffffffff8fb7a4b6 ffff8a0bffc18fc0 ffffffff9064a6c0 0000000000000000
Sep 6 10:23:59 hostname kernel: [488456.220112] ffffffff90713580 00000000ffffffff ffffffff8fadee04 0000000000e746a9
Sep 6 10:23:59 hostname kernel: [488456.220113] Call Trace:
Sep 6 10:23:59 hostname kernel: [488456.220114]
Sep 6 10:23:59 hostname kernel: [488456.220116] [] ? sched_show_task+0xcb/0x130
Sep 6 10:23:59 hostname kernel: [488456.220118] [] ? rcu_dump_cpu_stacks+0x92/0xb2
Sep 6 10:23:59 hostname kernel: [488456.220119] [] ? rcu_check_callbacks+0x754/0x8a0
Sep 6 10:23:59 hostname kernel: [488456.220121] [] ? update_wall_time+0x473/0x790
Sep 6 10:23:59 hostname kernel: [488456.220122] [] ? tick_sched_handle.isra.12+0x50/0x50
Sep 6 10:23:59 hostname kernel: [488456.220124] [] ? update_process_times+0x28/0x50
Sep 6 10:23:59 hostname kernel: [488456.220125] [] ? tick_sched_handle.isra.12+0x20/0x50
Sep 6 10:23:59 hostname kernel: [488456.220125] [] ? tick_sched_timer+0x38/0x70
Sep 6 10:23:59 hostname kernel: [488456.220126] [] ? __hrtimer_run_queues+0xdc/0x240
Sep 6 10:23:59 hostname kernel: [488456.220127] [] ? hrtimer_interrupt+0x9c/0x1a0
Sep 6 10:23:59 hostname kernel: [488456.220128] [] ? smp_apic_timer_interrupt+0x39/0x50
Sep 6 10:23:59 hostname kernel: [488456.220129] [] ? apic_timer_interrupt+0x82/0x90
Sep 6 10:23:59 hostname kernel: [488456.220130]
Sep 6 10:23:59 hostname kernel: [488456.220131] [] ? native_queued_spin_lock_slowpath+0x21/0x190
Sep 6 10:23:59 hostname kernel: [488456.220132] [] ? _raw_spin_lock+0x1d/0x20
Sep 6 10:23:59 hostname kernel: [488456.220141] [] ? nfs4_close_done+0xfa/0x400 [nfsv4]
Sep 6 10:23:59 hostname kernel: [488456.220145] [] ? nfs4_xdr_dec_open_downgrade+0xf0/0xf0 [nfsv4]
Sep 6 10:23:59 hostname kernel: [488456.220151] [] ? __rpc_sleep_on_priority+0x340/0x340 [sunrpc]
Sep 6 10:23:59 hostname kernel: [488456.220155] [] ? __rpc_sleep_on_priority+0x340/0x340 [sunrpc]
Sep 6 10:23:59 hostname kernel: [488456.220159] [] ? rpc_exit_task+0x2a/0x90 [sunrpc]
Sep 6 10:23:59 hostname kernel: [488456.220163] [] ? __rpc_execute+0x86/0x420 [sunrpc]
Sep 6 10:23:59 hostname kernel: [488456.220164] [] ? process_one_work+0x184/0x410
Sep 6 10:23:59 hostname kernel: [488456.220165] [] ? worker_thread+0x4d/0x480
Sep 6 10:23:59 hostname kernel: [488456.220166] [] ? process_one_work+0x410/0x410
Sep 6 10:23:59 hostname kernel: [488456.220167] [] ? do_group_exit+0x3a/0xa0
Sep 6 10:23:59 hostname kernel: [488456.220168] [] ? kthread+0xd7/0xf0
Sep 6 10:23:59 hostname kernel: [488456.220169] [] ? kthread_park+0x60/0x60
Sep 6 10:23:59 hostname kernel: [488456.220170] [] ? ret_from_fork+0x25/0x30

This seems to be all I have to go on. This is the first panic I have experienced since upgrading to 11.1. In the past I was experiencing similar panics on 11.0, but the log output from those seemed different, as the kernel spat out hundreds of errors in the hours leading up to finally crashing. I'm not sure those are relevant, as I was running 11.0 and the errors this time around are similar but not the same, but I can attach that log file if anyone is interested.
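
In case it helps anyone compare against their own logs, here is a minimal sketch for pulling the soft lockup and RCU stall reports quoted above out of a kernel log. The helper names stall_lines and stall_count are made up for illustration, and the default path /var/log/kern.log is an assumption about a typical Debian setup; the match strings are taken from the messages above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of any existing tool.
# Assumption: the kernel log lives at /var/log/kern.log unless a path is given.

# Print only the lines reporting a soft lockup or an RCU self-detected stall.
stall_lines() {
    grep -E 'soft lockup|self-detected stall' "${1:-/var/log/kern.log}"
}

# Print how many such lines the log contains.
# tr strips the padding some wc implementations add around the number.
stall_count() {
    stall_lines "$1" | wc -l | tr -d ' '
}
```

Running `stall_count /var/log/kern.log` after each reboot would give a rough idea of whether the reports build up over time or only appear right before the crash.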
The vm startup command is:

bhyve -AHP \
  -s 0:0,hostbridge \
  -s 1:0,lpc \
  -s 2:0,virtio-net,tap0 \
  -s 3:0,virtio-net,tap1 \
  -s 4:0,virtio-blk,/dev/zvol/tank/vms/hostname-root \
  -s 5:0,virtio-blk,/dev/zvol/tank/vms/hostname-scratch \
  -s 6:0,virtio-blk,/dev/zvol/tank/vms/hostname-temp \
  -s 29,fbuf,tcp=127.0.0.1:5900,w=800,h=600 \
  -l com1,/dev/nmdm0A \
  -l bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd \
  -c 2 \
  -m 32G hostname

Anything that could shed some light on this issue would be much appreciated. If I can provide any additional information please let me know.

Thank you,
Kari Hreinsson