Date:      Sat, 05 Sep 2020 07:48:27 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 249123] PVH domU not migrating from one XEN host to another
Message-ID:  <bug-249123-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=249123

            Bug ID: 249123
           Summary: PVH domU not migrating from one XEN host to another
           Product: Base System
           Version: CURRENT
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Many People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: pbraun@nethence.com

The Xen farm runs the 4.14 release.
The dom0s run Slackware-current (Aug 2020) with a downgraded Linux kernel,
4.18.20.

I've tried with FreeBSD 13.0-CURRENT, but I suppose this would also reproduce
with the latest stable release.  A PVH domU does not migrate from one Xen host
to another.  I didn't try PV, since PVH is available now.

pro5s1# xl migrate freebsd pro5s2
migration target: Ready to receive domain.
Saving to migration stream new xl format (info 0x3/0x0/2176)
Loading new save file <incoming migration stream> (new xl fmt info 0x3/0x0/2176)
 Savefile contains xl domain config in JSON format
Parsing config from <saved>
xc: info: Saving domain 41, type x86 HVM
xc: info: Found x86 HVM domain from Xen 4.14
xc: info: Restoring domain
libxl: error: libxl_dom_suspend.c:362:suspend_common_wait_guest_timeout: Domain 41:guest did not suspend, timed out
xc: error: save callback suspend() failed: 0: Internal error
xc: error: Save failed (0 = Success): Internal error
libxl: error: libxl_stream_write.c:347:libxl__xc_domain_save_done: Domain 41:saving domain: domain responded to suspend request: Success
migration sender: libxl_domain_suspend failed (rc=-8)
xc: error: Failed to read Record Header from stream (0 = Success): Internal error
xc: error: Restore failed (0 = Success): Internal error
libxl: error: libxl_stream_read.c:850:libxl__xc_domain_restore_done: restoring domain: Success
libxl: error: libxl_create.c:1576:domcreate_rebuild_done: Domain 12:cannot (re-)build domain: -3
libxl: error: libxl_domain.c:1182:libxl__destroy_domid: Domain 12:Non-existant domain
libxl: error: libxl_domain.c:1136:domain_destroy_callback: Domain 12:Unable to destroy guest
libxl: error: libxl_domain.c:1063:domain_destroy_cb: Domain 12:Destruction of domain failed
migration target: Domain creation failed (code -3).
libxl: info: libxl_exec.c:117:libxl_report_child_exitstatus: migration transport process [19981] exited with error status 1
Migration failed, failed to suspend at sender.

and the guest console shows:

lock order reversal:
 1st 0xfffffe004d0a4018 xnrx_0 (netfront receive lock, sleep mutex) @
/usr/src/sys/dev/xen/netfront/netfront.c:423
 2nd 0xfffffe004d0a8018 xntx_0 (netfront transmit lock, sleep mutex) @
/usr/src/sys/dev/xen/netfront/netfront.c:424
 3rd 0xfffffe004d0a4d28 xnrx_1 (netfront receive lock, sleep mutex) @
/usr/src/sys/dev/xen/netfront/netfront.c:423
lock order netfront receive lock -> netfront transmit lock established at:
#0 0xffffffff80c4408d at witness_checkorder+0x46d
#1 0xffffffff80bb3eb4 at __mtx_lock_flags+0x94
#2 0xffffffff80a60ab4 at gnttab_resume+0xad04
#3 0xffffffff80c12373 at bus_generic_suspend_child+0x43
#4 0xffffffff80c12446 at bus_generic_suspend+0x66
#5 0xffffffff80c12373 at bus_generic_suspend_child+0x43
#6 0xffffffff80c12446 at bus_generic_suspend+0x66
#7 0xffffffff80a6787b at xs_unlock+0x35b
#8 0xffffffff80c12373 at bus_generic_suspend_child+0x43
#9 0xffffffff80c12446 at bus_generic_suspend+0x66
#10 0xffffffff80c12373 at bus_generic_suspend_child+0x43
#11 0xffffffff80c12446 at bus_generic_suspend+0x66
#12 0xffffffff80c12373 at bus_generic_suspend_child+0x43
#13 0xffffffff80c12446 at bus_generic_suspend+0x66
#14 0xffffffff80a54bca at xc_printf+0x162a
#15 0xffffffff80a548ae at xc_printf+0x130e
#16 0xffffffff80a67bd9 at xs_unlock+0x6b9
#17 0xffffffff80b92b30 at fork_exit+0x80
lock order netfront transmit lock -> netfront receive lock attempted at:
#0 0xffffffff80c449ec at witness_checkorder+0xdcc
#1 0xffffffff80bb3eb4 at __mtx_lock_flags+0x94
#2 0xffffffff80a60a9a at gnttab_resume+0xacea
#3 0xffffffff80c12373 at bus_generic_suspend_child+0x43
#4 0xffffffff80c12446 at bus_generic_suspend+0x66
#5 0xffffffff80c12373 at bus_generic_suspend_child+0x43
#6 0xffffffff80c12446 at bus_generic_suspend+0x66
#7 0xffffffff80a6787b at xs_unlock+0x35b
#8 0xffffffff80c12373 at bus_generic_suspend_child+0x43
#9 0xffffffff80c12446 at bus_generic_suspend+0x66
#10 0xffffffff80c12373 at bus_generic_suspend_child+0x43
#11 0xffffffff80c12446 at bus_generic_suspend+0x66
#12 0xffffffff80c12373 at bus_generic_suspend_child+0x43
#13 0xffffffff80c12446 at bus_generic_suspend+0x66
#14 0xffffffff80a54bca at xc_printf+0x162a
#15 0xffffffff80a548ae at xc_printf+0x130e
#16 0xffffffff80a67bd9 at xs_unlock+0x6b9
#17 0xffffffff80b92b30 at fork_exit+0x80
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 02
fault virtual address   = 0x8
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80fe4894
stack pointer           = 0x28:0xfffffe000bafd8f0
frame pointer           = 0x28:0xfffffe000bafd900
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 11 (idle: cpu1)
trap number             = 12
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 7; apic id = 0e
fault virtual address   = 0x38
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80fe4894
stack pointer           = 0x28:0xfffffe000bb1b8f0
frame pointer           = 0x28:0xfffffe000bb1b900
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 11 (idle: cpu7)
trap number             = 12
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 04
fault virtual address   = 0x10
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 5; apic id = 0a
fault virtual address   = 0x28
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80fe4894
stack pointer           = 0x28:0xfffffe000bb118f0
frame pointer           = 0x28:0xfffffe000bb11900
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 11 (idle: cpu5)
trap number             = 12
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80fe4894
stack pointer           = 0x28:0xfffffe000bb028f0
frame pointer           = 0x28:0xfffffe000bb02900
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 11 (idle: cpu2)
trap number             = 12
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 4; apic id = 08
fault virtual address   = 0x20
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80fe4894
stack pointer           = 0x28:0xfffffe0043aab860
frame pointer           = 0x28:0xfffffe0043aab870
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 12 (irq2096: xc0)
trap number             = 12
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 06
fault virtual address   = 0x18
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80fe4894
stack pointer           = 0x28:0xfffffe000bb078f0
frame pointer           = 0x28:0xfffffe000bb07900
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 11 (idle: cpu3)
trap number             = 12
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 6; apic id = 0c
fault virtual address   = 0x30
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80fe4894
stack pointer           = 0x28:0xfffffe000bb168f0
frame pointer           = 0x28:0xfffffe000bb16900
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 11 (idle: cpu6)
trap number             = 12
timeout stopping cpus
panic: page fault
cpuid = 3
time = 1599291086
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe000bb075a0
vpanic() at vpanic+0x182/frame 0xfffffe000bb075f0
panic() at panic+0x43/frame 0xfffffe000bb07650
trap_fatal() at trap_fatal+0x387/frame 0xfffffe000bb076b0
trap_pfault() at trap_pfault+0x97/frame 0xfffffe000bb07710
trap() at trap+0x2ab/frame 0xfffffe000bb07820
calltrap() at calltrap+0x8/frame 0xfffffe000bb07820
--- trap 0xc, rip = 0xffffffff80fe4894, rsp = 0xfffffe000bb078f0, rbp = 0xfffffe000bb07900 ---
cpususpend_handler() at cpususpend_handler+0x34/frame 0xfffffe000bb07900
xen_cpususpend_handler() at xen_cpususpend_handler+0x9/frame 0xfffffe000bb07910
intr_event_handle() at intr_event_handle+0xde/frame 0xfffffe000bb07960
intr_execute_handlers() at intr_execute_handlers+0x66/frame 0xfffffe000bb07990
xen_intr_handle_upcall() at xen_intr_handle_upcall+0x1c6/frame 0xfffffe000bb079e0
Xxen_intr_upcall() at Xxen_intr_upcall+0xb1/frame 0xfffffe000bb079e0
--- interrupt, rip = 0xffffffff80fdac72, rsp = 0xfffffe000bb07ab0, rbp = 0xfffffe000bb07ac0 ---
cpu_idle_acpi() at cpu_idle_acpi+0x42/frame 0xfffffe000bb07ac0
cpu_idle() at cpu_idle+0x9f/frame 0xfffffe000bb07ae0
sched_idletd() at sched_idletd+0x3d1/frame 0xfffffe000bb07bb0
fork_exit() at fork_exit+0x80/frame 0xfffffe000bb07bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe000bb07bf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 11 tid 100006 ]
Stopped at      kdb_enter+0x37: movq    $0,0x10b6606(%rip)
db> trace
Tracing pid 11 tid 100006 td 0xfffffe000e3b7e00
kdb_enter() at kdb_enter+0x37/frame 0xfffffe000bb075a0
vpanic() at vpanic+0x19e/frame 0xfffffe000bb075f0
panic() at panic+0x43/frame 0xfffffe000bb07650
trap_fatal() at trap_fatal+0x387/frame 0xfffffe000bb076b0
trap_pfault() at trap_pfault+0x97/frame 0xfffffe000bb07710
trap() at trap+0x2ab/frame 0xfffffe000bb07820
calltrap() at calltrap+0x8/frame 0xfffffe000bb07820
--- trap 0xc, rip = 0xffffffff80fe4894, rsp = 0xfffffe000bb078f0, rbp = 0xfffffe000bb07900 ---
cpususpend_handler() at cpususpend_handler+0x34/frame 0xfffffe000bb07900
xen_cpususpend_handler() at xen_cpususpend_handler+0x9/frame 0xfffffe000bb07910
intr_event_handle() at intr_event_handle+0xde/frame 0xfffffe000bb07960
intr_execute_handlers() at intr_execute_handlers+0x66/frame 0xfffffe000bb07990
xen_intr_handle_upcall() at xen_intr_handle_upcall+0x1c6/frame 0xfffffe000bb079e0
Xxen_intr_upcall() at Xxen_intr_upcall+0xb1/frame 0xfffffe000bb079e0
--- interrupt, rip = 0xffffffff80fdac72, rsp = 0xfffffe000bb07ab0, rbp = 0xfffffe000bb07ac0 ---
cpu_idle_acpi() at cpu_idle_acpi+0x42/frame 0xfffffe000bb07ac0
cpu_idle() at cpu_idle+0x9f/frame 0xfffffe000bb07ae0
sched_idletd() at sched_idletd+0x3d1/frame 0xfffffe000bb07bb0
fork_exit() at fork_exit+0x80/frame 0xfffffe000bb07bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe000bb07bf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
db>
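
One pattern in the console output above: every trap reports the same
instruction pointer (cpususpend_handler+0x34 according to the backtrace), and
each fault virtual address equals 8 times the faulting cpuid.  That would be
consistent with a NULL pointer to a per-CPU array of 8-byte entries being
indexed by CPU number, but this is only a guess from the numbers and has not
been checked against the kernel source.  The short Python check below
reproduces the arithmetic; the 8-byte entry size and the helper name are
assumptions made for illustration.

```python
# (cpuid, fault virtual address) pairs copied from the trap messages above.
faults = [(1, 0x8), (7, 0x38), (2, 0x10), (5, 0x28),
          (4, 0x20), (3, 0x18), (6, 0x30)]

# Assumption: one array entry is a pointer, i.e. 8 bytes on amd64.
ENTRY_SIZE = 8


def matches_null_indexed_array(pairs, entry_size=ENTRY_SIZE):
    """Return True if every address equals cpuid * entry_size.

    That is the address pattern expected when a NULL base pointer is
    indexed by CPU number (hypothetical helper, not kernel code).
    """
    return all(addr == cpu * entry_size for cpu, addr in pairs)


print(matches_null_indexed_array(faults))
```

If the check holds, the per-CPU structure read by the suspend handler is a
reasonable first place to look when debugging.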

FWIW, I also filed a similar report for NetBSD a few months ago:

netbsd domU does not migrate properly from one xen host to another
http://gnats.netbsd.org/55207
