Date: Fri, 29 Apr 2022 15:33:45 +0000 From: dandan@lysator.liu.se To: freebsd-infiniband@freebsd.org Cc: kempe@lysator.liu.se Subject: [PATCH] ibcore: allow passing NULL-pointers to ib_umem_release() Message-ID: <20220429153345.Horde.6oVap54ZXOmz41donxfLJJ5@webmail.lysator.liu.se>
next in thread | raw e-mail | index | archive | help
This message is in MIME format. --=_87sLKDUSE5xZjYL8evQBNF_ Content-Type: text/plain; charset=utf-8; format=flowed; DelSp=Yes Content-Disposition: inline Hi! The attached patch fixes the following kernel panic in ibcore when unloading the mlx4ib kernel module: > Fatal trap 12: page fault while in kernel mode > cpuid = 31; apic id = 2f > fault virtual address = 0x70 > fault code = supervisor read data, page not present > instruction pointer = 0x20:0xffffffff82f9fe8e > stack pointer = 0x28:0xfffffe0159d3db70 > frame pointer = 0x28:0xfffffe0159d3dba0 > code segment = base 0x0, limit 0xfffff, type 0x1b > = DPL 0, pres 1, long 1, def32 0, gran 1 > processor eflags = interrupt enabled, resume, IOPL = 0 > current process = 1866 (kldunload) > trap number = 12 > panic: page fault > cpuid = 31 > time = 1650661418 > KDB: stack backtrace: > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame > 0xfffffe0159d3d930 > vpanic() at vpanic+0x17f/frame 0xfffffe0159d3d980 > panic() at panic+0x43/frame 0xfffffe0159d3d9e0 > trap_fatal() at trap_fatal+0x385/frame 0xfffffe0159d3da40 > trap_pfault() at trap_pfault+0xab/frame 0xfffffe0159d3daa0 > calltrap() at calltrap+0x8/frame 0xfffffe0159d3daa0 > --- trap 0xc, rip = 0xffffffff82f9fe8e, rsp = 0xfffffe0159d3db70, > rbp = 0xfffffe0159d3dba0 --- ib_umem_release() at > ib_umem_release+0xe/frame 0xfffffe0159d3dba0 > mlx4_ib_destroy_qp() at mlx4_ib_destroy_qp+0x654/frame 0xfffffe0159d3dc00 > ib_destroy_qp_user() at ib_destroy_qp_user+0xce/frame 0xfffffe0159d3dc50 > ipoib_transport_dev_cleanup() at > ipoib_transport_dev_cleanup+0x1c/frame 0xfffffe0159d3dc70 > ipoib_dev_cleanup() at ipoib_dev_cleanup+0xd6/frame 0xfffffe0159d3dcb0 > ipoib_remove_one() at ipoib_remove_one+0xec/frame 0xfffffe0159d3dcf0 > ib_unregister_client() at ib_unregister_client+0x1b7/frame 0xfffffe0159d3dd30 > ipoib_cleanup_module() at ipoib_cleanup_module+0x50/frame 0xfffffe0159d3dd40 > linker_file_sysuninit() at linker_file_sysuninit+0x147/frame > 0xfffffe0159d3dd70 > linker_file_unload() at linker_file_unload+0x269/frame 0xfffffe0159d3ddb0 > kern_kldunload() at kern_kldunload+0x18d/frame 0xfffffe0159d3de00 > amd64_syscall() at amd64_syscall+0x12e/frame 0xfffffe0159d3df30 > fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0159d3df30 > --- syscall (444, FreeBSD ELF64, sys_kldunloadf), rip = > 0x3d2ea023e63a, rsp = 0x3d2e9f30cac8, rbp = 0x3d2e9f30d320 --- KDB: > enter: panic > [ thread pid 1866 tid 100919 ] > Stopped at kdb_enter+0x32: movq $0,0x127cea3(%rip) Linux added the same functionality in commit 836a0fbb3e76f704ad65ddfb57f00725245e509b. The patch is based on FreeBSD commit 0abcc1d2d33aef2333dab28e1ec1858cf45b314a. // Daniel Dandanell, Lysator ACS --=_87sLKDUSE5xZjYL8evQBNF_ Content-Type: text/plain; name=0001-ibcore-allow-passing-NULL-pointers-to-ib_umem_releas.patch Content-Disposition: attachment; size=3647; filename=0001-ibcore-allow-passing-NULL-pointers-to-ib_umem_releas.patch >From 9e481d7bb452ec18c230ffd8fd66e52f0f75125f Mon Sep 17 00:00:00 2001 From: Daniel Dandanell <dandan@lysator.liu.se> Date: Thu, 28 Apr 2022 22:49:55 +0200 Subject: [PATCH] ibcore: allow passing NULL-pointers to ib_umem_release() Commit b633e08c705fe43180567eae26923d6f6f98c8d9 removed the NULL-checks from the mlx4ib-driver. This fixes the following panic in ibcore: Fatal trap 12: page fault while in kernel mode cpuid = 31; apic id = 2f fault virtual address = 0x70 fault code = supervisor read data, page not present instruction pointer = 0x20:0xffffffff82f9fe8e stack pointer = 0x28:0xfffffe0159d3db70 frame pointer = 0x28:0xfffffe0159d3dba0 code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 1866 (kldunload) trap number = 12 panic: page fault cpuid = 31 time = 1650661418 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0159d3d930 vpanic() at vpanic+0x17f/frame 0xfffffe0159d3d980 panic() at panic+0x43/frame 0xfffffe0159d3d9e0 trap_fatal() at trap_fatal+0x385/frame 0xfffffe0159d3da40 trap_pfault() at trap_pfault+0xab/frame 0xfffffe0159d3daa0 calltrap() at calltrap+0x8/frame 0xfffffe0159d3daa0 --- trap 0xc, rip = 0xffffffff82f9fe8e, rsp = 0xfffffe0159d3db70, rbp = 0xfffffe0159d3dba0 --- ib_umem_release() at ib_umem_release+0xe/frame 0xfffffe0159d3dba0 mlx4_ib_destroy_qp() at mlx4_ib_destroy_qp+0x654/frame 0xfffffe0159d3dc00 ib_destroy_qp_user() at ib_destroy_qp_user+0xce/frame 0xfffffe0159d3dc50 ipoib_transport_dev_cleanup() at ipoib_transport_dev_cleanup+0x1c/frame 0xfffffe0159d3dc70 ipoib_dev_cleanup() at ipoib_dev_cleanup+0xd6/frame 0xfffffe0159d3dcb0 ipoib_remove_one() at ipoib_remove_one+0xec/frame 0xfffffe0159d3dcf0 ib_unregister_client() at ib_unregister_client+0x1b7/frame 0xfffffe0159d3dd30 ipoib_cleanup_module() at ipoib_cleanup_module+0x50/frame 0xfffffe0159d3dd40 linker_file_sysuninit() at linker_file_sysuninit+0x147/frame 0xfffffe0159d3dd70 linker_file_unload() at linker_file_unload+0x269/frame 0xfffffe0159d3ddb0 kern_kldunload() at kern_kldunload+0x18d/frame 0xfffffe0159d3de00 amd64_syscall() at amd64_syscall+0x12e/frame 0xfffffe0159d3df30 fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0159d3df30 --- syscall (444, FreeBSD ELF64, sys_kldunloadf), rip = 0x3d2ea023e63a, rsp = 0x3d2e9f30cac8, rbp = 0x3d2e9f30d320 --- KDB: enter: panic Sponsored by: Lysator ACS --- sys/ofed/drivers/infiniband/core/ib_umem.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/sys/ofed/drivers/infiniband/core/ib_umem.c b/sys/ofed/drivers/infiniband/core/ib_umem.c index 48df27522a5..889908eed68 100644 --- a/sys/ofed/drivers/infiniband/core/ib_umem.c +++ b/sys/ofed/drivers/infiniband/core/ib_umem.c @@ -248,11 +248,13 @@ static void ib_umem_account(struct work_struct *work) */ void ib_umem_release(struct ib_umem *umem) { - struct ib_ucontext *context = umem->context; struct mm_struct *mm; struct task_struct *task; unsigned long diff; + if (!umem) + return; + if (umem->odp_data) { ib_umem_odp_release(umem); return; @@ -279,7 +281,7 @@ void ib_umem_release(struct ib_umem *umem) * up here and not be able to take the mmap_sem. In that case * we defer the vm_locked accounting to the system workqueue. */ - if (context->closing) { + if (umem->context->closing) { if (!down_write_trylock(&mm->mmap_sem)) { INIT_WORK(&umem->work, ib_umem_account); umem->mm = mm; -- 2.35.1 --=_87sLKDUSE5xZjYL8evQBNF_--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20220429153345.Horde.6oVap54ZXOmz41donxfLJJ5>