From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 249123] PVH domU not migrating from one XEN host to another
Date: Sat, 05 Sep 2020 07:48:27 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=249123

            Bug ID: 249123
           Summary: PVH domU not migrating from one XEN host to another
           Product: Base System
           Version: CURRENT
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Many People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: pbraun@nethence.com

XEN farm is 4.14 release
Dom0s are Slackware current (Aug 2020) and downgraded Linux kernel 4.18.20

I've tried with
FreeBSD 13.0-CURRENT but I suppose that would reproduce with latest stable
release also.

PVH domU is not migrating from one XEN host to another.
I didn't try PV, since there is PVH now.

pro5s1# xl migrate freebsd pro5s2
migration target: Ready to receive domain.
Saving to migration stream new xl format (info 0x3/0x0/2176)
Loading new save file (new xl fmt info 0x3/0x0/2176)
 Savefile contains xl domain config in JSON format
Parsing config from
xc: info: Saving domain 41, type x86 HVM
xc: info: Found x86 HVM domain from Xen 4.14
xc: info: Restoring domain
libxl: error: libxl_dom_suspend.c:362:suspend_common_wait_guest_timeout: Domain 41:guest did not suspend, timed out
xc: error: save callback suspend() failed: 0: Internal error
xc: error: Save failed (0 = Success): Internal error
libxl: error: libxl_stream_write.c:347:libxl__xc_domain_save_done: Domain 41:saving domain: domain responded to suspend request: Success
migration sender: libxl_domain_suspend failed (rc=-8)
xc: error: Failed to read Record Header from stream (0 = Success): Internal error
xc: error: Restore failed (0 = Success): Internal error
libxl: error: libxl_stream_read.c:850:libxl__xc_domain_restore_done: restoring domain: Success
libxl: error: libxl_create.c:1576:domcreate_rebuild_done: Domain 12:cannot (re-)build domain: -3
libxl: error: libxl_domain.c:1182:libxl__destroy_domid: Domain 12:Non-existant domain
libxl: error: libxl_domain.c:1136:domain_destroy_callback: Domain 12:Unable to destroy guest
libxl: error: libxl_domain.c:1063:domain_destroy_cb: Domain 12:Destruction of domain failed
migration target: Domain creation failed (code -3).
libxl: info: libxl_exec.c:117:libxl_report_child_exitstatus: migration transport process [19981] exited with error status 1
Migration failed, failed to suspend at sender.

and guest console shows

lock order reversal:
 1st 0xfffffe004d0a4018 xnrx_0 (netfront receive lock, sleep mutex) @ /usr/src/sys/dev/xen/netfront/netfront.c:423
 2nd 0xfffffe004d0a8018 xntx_0 (netfront transmit lock, sleep mutex) @ /usr/src/sys/dev/xen/netfront/netfront.c:424
 3rd 0xfffffe004d0a4d28 xnrx_1 (netfront receive lock, sleep mutex) @ /usr/src/sys/dev/xen/netfront/netfront.c:423
lock order netfront receive lock -> netfront transmit lock established at:
#0 0xffffffff80c4408d at witness_checkorder+0x46d
#1 0xffffffff80bb3eb4 at __mtx_lock_flags+0x94
#2 0xffffffff80a60ab4 at gnttab_resume+0xad04
#3 0xffffffff80c12373 at bus_generic_suspend_child+0x43
#4 0xffffffff80c12446 at bus_generic_suspend+0x66
#5 0xffffffff80c12373 at bus_generic_suspend_child+0x43
#6 0xffffffff80c12446 at bus_generic_suspend+0x66
#7 0xffffffff80a6787b at xs_unlock+0x35b
#8 0xffffffff80c12373 at bus_generic_suspend_child+0x43
#9 0xffffffff80c12446 at bus_generic_suspend+0x66
#10 0xffffffff80c12373 at bus_generic_suspend_child+0x43
#11 0xffffffff80c12446 at bus_generic_suspend+0x66
#12 0xffffffff80c12373 at bus_generic_suspend_child+0x43
#13 0xffffffff80c12446 at bus_generic_suspend+0x66
#14 0xffffffff80a54bca at xc_printf+0x162a
#15 0xffffffff80a548ae at xc_printf+0x130e
#16 0xffffffff80a67bd9 at xs_unlock+0x6b9
#17 0xffffffff80b92b30 at fork_exit+0x80
lock order netfront transmit lock -> netfront receive lock attempted at:
#0 0xffffffff80c449ec at witness_checkorder+0xdcc
#1 0xffffffff80bb3eb4 at __mtx_lock_flags+0x94
#2 0xffffffff80a60a9a at gnttab_resume+0xacea
#3 0xffffffff80c12373 at bus_generic_suspend_child+0x43
#4 0xffffffff80c12446 at bus_generic_suspend+0x66
#5 0xffffffff80c12373 at bus_generic_suspend_child+0x43
#6 0xffffffff80c12446 at bus_generic_suspend+0x66
#7 0xffffffff80a6787b at xs_unlock+0x35b
#8 0xffffffff80c12373 at bus_generic_suspend_child+0x43
#9 0xffffffff80c12446 at bus_generic_suspend+0x66
#10 0xffffffff80c12373 at bus_generic_suspend_child+0x43
#11 0xffffffff80c12446 at bus_generic_suspend+0x66
#12 0xffffffff80c12373 at bus_generic_suspend_child+0x43
#13 0xffffffff80c12446 at bus_generic_suspend+0x66
#14 0xffffffff80a54bca at xc_printf+0x162a
#15 0xffffffff80a548ae at xc_printf+0x130e
#16 0xffffffff80a67bd9 at xs_unlock+0x6b9
#17 0xffffffff80b92b30 at fork_exit+0x80
kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 02
fault virtual address = 0x8
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80fe4894
stack pointer = 0x28:0xfffffe000bafd8f0
frame pointer = 0x28:0xfffffe000bafd900
code segment = base 0x0, limit 0xfffff, type 0x1b
 = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = resume, IOPL = 0
current process = 11 (idle: cpu1)
trap number = 12
kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
cpuid = 7; apic id = 0e
fault virtual address = 0x38
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80fe4894
stack pointer = 0x28:0xfffffe000bb1b8f0
frame pointer = 0x28:0xfffffe000bb1b900
code segment = base 0x0, limit 0xfffff, type 0x1b
 = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = resume, IOPL = 0
current process = 11 (idle: cpu7)
trap number = 12
kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 04
fault virtual address = 0x10
kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
cpuid = 5; apic id = 0a
fault virtual address = 0x28
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80fe4894
stack pointer = 0x28:0xfffffe000bb118f0
frame pointer = 0x28:0xfffffe000bb11900
code segment = base 0x0, limit 0xfffff, type 0x1b
 = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = resume, IOPL = 0
current process = 11 (idle: cpu5)
trap number = 12
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80fe4894
stack pointer = 0x28:0xfffffe000bb028f0
frame pointer = 0x28:0xfffffe000bb02900
code segment = base 0x0, limit 0xfffff, type 0x1b
 = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = resume, IOPL = 0
current process = 11 (idle: cpu2)
trap number = 12
kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
cpuid = 4; apic id = 08
fault virtual address = 0x20
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80fe4894
stack pointer = 0x28:0xfffffe0043aab860
frame pointer = 0x28:0xfffffe0043aab870
code segment = base 0x0, limit 0xfffff, type 0x1b
 = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = resume, IOPL = 0
current process = 12 (irq2096: xc0)
trap number = 12
kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 06
fault virtual address = 0x18
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80fe4894
stack pointer = 0x28:0xfffffe000bb078f0
frame pointer = 0x28:0xfffffe000bb07900
code segment = base 0x0, limit 0xfffff, type 0x1b
 = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = resume, IOPL = 0
current process = 11 (idle: cpu3)
trap number = 12
kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
cpuid = 6; apic id = 0c
fault virtual address = 0x30
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80fe4894
stack pointer = 0x28:0xfffffe000bb168f0
frame pointer = 0x28:0xfffffe000bb16900
code segment = base 0x0, limit 0xfffff, type 0x1b
 = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = resume, IOPL = 0
current process = 11 (idle: cpu6)
trap number = 12
timeout stopping cpus
panic: page fault
cpuid = 3
time = 1599291086
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe000bb075a0
vpanic() at vpanic+0x182/frame 0xfffffe000bb075f0
panic() at panic+0x43/frame 0xfffffe000bb07650
trap_fatal() at trap_fatal+0x387/frame 0xfffffe000bb076b0
trap_pfault() at trap_pfault+0x97/frame 0xfffffe000bb07710
trap() at trap+0x2ab/frame 0xfffffe000bb07820
calltrap() at calltrap+0x8/frame 0xfffffe000bb07820
--- trap 0xc, rip = 0xffffffff80fe4894, rsp = 0xfffffe000bb078f0, rbp = 0xfffffe000bb07900 ---
cpususpend_handler() at cpususpend_handler+0x34/frame 0xfffffe000bb07900
xen_cpususpend_handler() at xen_cpususpend_handler+0x9/frame 0xfffffe000bb07910
intr_event_handle() at intr_event_handle+0xde/frame 0xfffffe000bb07960
intr_execute_handlers() at intr_execute_handlers+0x66/frame 0xfffffe000bb07990
xen_intr_handle_upcall() at xen_intr_handle_upcall+0x1c6/frame 0xfffffe000bb079e0
Xxen_intr_upcall() at Xxen_intr_upcall+0xb1/frame 0xfffffe000bb079e0
--- interrupt, rip = 0xffffffff80fdac72, rsp = 0xfffffe000bb07ab0, rbp = 0xfffffe000bb07ac0 ---
cpu_idle_acpi() at cpu_idle_acpi+0x42/frame 0xfffffe000bb07ac0
cpu_idle() at cpu_idle+0x9f/frame 0xfffffe000bb07ae0
sched_idletd() at sched_idletd+0x3d1/frame 0xfffffe000bb07bb0
fork_exit() at fork_exit+0x80/frame 0xfffffe000bb07bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe000bb07bf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 11 tid 100006 ]
Stopped at kdb_enter+0x37: movq $0,0x10b6606(%rip)
db> trace
Tracing pid 11 tid 100006 td 0xfffffe000e3b7e00
kdb_enter() at kdb_enter+0x37/frame 0xfffffe000bb075a0
vpanic() at vpanic+0x19e/frame 0xfffffe000bb075f0
panic() at panic+0x43/frame 0xfffffe000bb07650
trap_fatal() at trap_fatal+0x387/frame 0xfffffe000bb076b0
trap_pfault() at trap_pfault+0x97/frame 0xfffffe000bb07710
trap() at trap+0x2ab/frame 0xfffffe000bb07820
calltrap() at calltrap+0x8/frame 0xfffffe000bb07820
--- trap 0xc, rip = 0xffffffff80fe4894, rsp = 0xfffffe000bb078f0, rbp = 0xfffffe000bb07900 ---
cpususpend_handler() at cpususpend_handler+0x34/frame 0xfffffe000bb07900
xen_cpususpend_handler() at xen_cpususpend_handler+0x9/frame 0xfffffe000bb07910
intr_event_handle() at intr_event_handle+0xde/frame 0xfffffe000bb07960
intr_execute_handlers() at intr_execute_handlers+0x66/frame 0xfffffe000bb07990
xen_intr_handle_upcall() at xen_intr_handle_upcall+0x1c6/frame 0xfffffe000bb079e0
Xxen_intr_upcall() at Xxen_intr_upcall+0xb1/frame 0xfffffe000bb079e0
--- interrupt, rip = 0xffffffff80fdac72, rsp = 0xfffffe000bb07ab0, rbp = 0xfffffe000bb07ac0 ---
cpu_idle_acpi() at cpu_idle_acpi+0x42/frame 0xfffffe000bb07ac0
cpu_idle() at cpu_idle+0x9f/frame 0xfffffe000bb07ae0
sched_idletd() at sched_idletd+0x3d1/frame 0xfffffe000bb07bb0
fork_exit() at fork_exit+0x80/frame 0xfffffe000bb07bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe000bb07bf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
db>

FWIW, I also filed a similar report for NetBSD a few months ago:

netbsd domU does not migrate properly from one xen host to another
http://gnats.netbsd.org/55207
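
One pattern in the console output may help triage: all seven page faults hit the same instruction pointer (0xffffffff80fe4894, inside cpususpend_handler per the backtrace), and each fault virtual address looks like the CPU ID times 8. The sketch below only checks that arithmetic against the values copied from the traps above. The 8-byte stride is an assumption (pointer size on amd64), and the helper name is made up for illustration; the kernel source was not inspected for this report.

```python
# (cpuid -> fault virtual address) pairs copied from the "Fatal trap 12"
# blocks in the guest console output of this report.
FAULTS = {1: 0x8, 2: 0x10, 3: 0x18, 4: 0x20, 5: 0x28, 6: 0x30, 7: 0x38}

# Assumption: 8 bytes per pointer on amd64.
POINTER_SIZE = 8


def matches_per_cpu_stride(faults, stride=POINTER_SIZE):
    """Return True if every fault address equals cpuid * stride.

    Hypothetical helper, for illustration only.
    """
    return all(addr == cpu * stride for cpu, addr in faults.items())


if __name__ == "__main__":
    print(matches_per_cpu_stride(FAULTS))
```

If the check holds, the faults are consistent with each CPU reading its own slot of a per-CPU pointer array whose base is NULL at suspend time. That is a guess from the addresses alone, not a confirmed diagnosis.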