Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 29 Apr 2022 15:33:45 +0000
From:      dandan@lysator.liu.se
To:        freebsd-infiniband@freebsd.org
Cc:        kempe@lysator.liu.se
Subject:   [PATCH] ibcore: allow passing NULL-pointers to ib_umem_release()
Message-ID:  <20220429153345.Horde.6oVap54ZXOmz41donxfLJJ5@webmail.lysator.liu.se>

next in thread | raw e-mail | index | archive | help
This message is in MIME format.

--=_87sLKDUSE5xZjYL8evQBNF_
Content-Type: text/plain; charset=utf-8; format=flowed; DelSp=Yes
Content-Disposition: inline

Hi!

The attached patch fixes the following kernel panic in ibcore when  
unloading the mlx4ib kernel module:

> Fatal trap 12: page fault while in kernel mode
> cpuid = 31; apic id = 2f
> fault virtual address   = 0x70
> fault code              = supervisor read data, page not present
> instruction pointer     = 0x20:0xffffffff82f9fe8e
> stack pointer           = 0x28:0xfffffe0159d3db70
> frame pointer           = 0x28:0xfffffe0159d3dba0
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                       = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0  
> current process         = 1866 (kldunload)
> trap number             = 12
> panic: page fault
> cpuid = 31
> time = 1650661418
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame  
> 0xfffffe0159d3d930
> vpanic() at vpanic+0x17f/frame 0xfffffe0159d3d980
> panic() at panic+0x43/frame 0xfffffe0159d3d9e0
> trap_fatal() at trap_fatal+0x385/frame 0xfffffe0159d3da40
> trap_pfault() at trap_pfault+0xab/frame 0xfffffe0159d3daa0
> calltrap() at calltrap+0x8/frame 0xfffffe0159d3daa0
> --- trap 0xc, rip = 0xffffffff82f9fe8e, rsp = 0xfffffe0159d3db70,  
> rbp = 0xfffffe0159d3dba0 --- ib_umem_release() at  
> ib_umem_release+0xe/frame 0xfffffe0159d3dba0
> mlx4_ib_destroy_qp() at mlx4_ib_destroy_qp+0x654/frame 0xfffffe0159d3dc00
> ib_destroy_qp_user() at ib_destroy_qp_user+0xce/frame 0xfffffe0159d3dc50
> ipoib_transport_dev_cleanup() at  
> ipoib_transport_dev_cleanup+0x1c/frame 0xfffffe0159d3dc70
> ipoib_dev_cleanup() at ipoib_dev_cleanup+0xd6/frame 0xfffffe0159d3dcb0
> ipoib_remove_one() at ipoib_remove_one+0xec/frame 0xfffffe0159d3dcf0
> ib_unregister_client() at ib_unregister_client+0x1b7/frame 0xfffffe0159d3dd30
> ipoib_cleanup_module() at ipoib_cleanup_module+0x50/frame 0xfffffe0159d3dd40
> linker_file_sysuninit() at linker_file_sysuninit+0x147/frame  
> 0xfffffe0159d3dd70
> linker_file_unload() at linker_file_unload+0x269/frame 0xfffffe0159d3ddb0
> kern_kldunload() at kern_kldunload+0x18d/frame 0xfffffe0159d3de00
> amd64_syscall() at amd64_syscall+0x12e/frame 0xfffffe0159d3df30
> fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0159d3df30
> --- syscall (444, FreeBSD ELF64, sys_kldunloadf), rip =  
> 0x3d2ea023e63a, rsp = 0x3d2e9f30cac8, rbp = 0x3d2e9f30d320 --- KDB:  
> enter: panic
> [ thread pid 1866 tid 100919 ]
> Stopped at      kdb_enter+0x32: movq    $0,0x127cea3(%rip)

Linux added the same functionality in commit  
836a0fbb3e76f704ad65ddfb57f00725245e509b.

The patch is based on FreeBSD commit 0abcc1d2d33aef2333dab28e1ec1858cf45b314a.

// Daniel Dandanell, Lysator ACS


--=_87sLKDUSE5xZjYL8evQBNF_
Content-Type: text/plain;
 name=0001-ibcore-allow-passing-NULL-pointers-to-ib_umem_releas.patch
Content-Disposition: attachment; size=3647;
 filename=0001-ibcore-allow-passing-NULL-pointers-to-ib_umem_releas.patch

>From 9e481d7bb452ec18c230ffd8fd66e52f0f75125f Mon Sep 17 00:00:00 2001
From: Daniel Dandanell <dandan@lysator.liu.se>
Date: Thu, 28 Apr 2022 22:49:55 +0200
Subject: [PATCH] ibcore: allow passing NULL-pointers to ib_umem_release()

Commit b633e08c705fe43180567eae26923d6f6f98c8d9 removed the NULL-checks from
the mlx4ib-driver.

This fixes the following panic in ibcore:

Fatal trap 12: page fault while in kernel mode
cpuid = 31; apic id = 2f
fault virtual address   = 0x70
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff82f9fe8e
stack pointer           = 0x28:0xfffffe0159d3db70
frame pointer           = 0x28:0xfffffe0159d3dba0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1866 (kldunload)
trap number             = 12
panic: page fault
cpuid = 31
time = 1650661418
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0159d3d930
vpanic() at vpanic+0x17f/frame 0xfffffe0159d3d980
panic() at panic+0x43/frame 0xfffffe0159d3d9e0
trap_fatal() at trap_fatal+0x385/frame 0xfffffe0159d3da40
trap_pfault() at trap_pfault+0xab/frame 0xfffffe0159d3daa0
calltrap() at calltrap+0x8/frame 0xfffffe0159d3daa0
--- trap 0xc, rip = 0xffffffff82f9fe8e, rsp = 0xfffffe0159d3db70, rbp = 0xfffffe0159d3dba0 ---
ib_umem_release() at ib_umem_release+0xe/frame 0xfffffe0159d3dba0
mlx4_ib_destroy_qp() at mlx4_ib_destroy_qp+0x654/frame 0xfffffe0159d3dc00
ib_destroy_qp_user() at ib_destroy_qp_user+0xce/frame 0xfffffe0159d3dc50
ipoib_transport_dev_cleanup() at ipoib_transport_dev_cleanup+0x1c/frame 0xfffffe0159d3dc70
ipoib_dev_cleanup() at ipoib_dev_cleanup+0xd6/frame 0xfffffe0159d3dcb0
ipoib_remove_one() at ipoib_remove_one+0xec/frame 0xfffffe0159d3dcf0
ib_unregister_client() at ib_unregister_client+0x1b7/frame 0xfffffe0159d3dd30
ipoib_cleanup_module() at ipoib_cleanup_module+0x50/frame 0xfffffe0159d3dd40
linker_file_sysuninit() at linker_file_sysuninit+0x147/frame 0xfffffe0159d3dd70
linker_file_unload() at linker_file_unload+0x269/frame 0xfffffe0159d3ddb0
kern_kldunload() at kern_kldunload+0x18d/frame 0xfffffe0159d3de00
amd64_syscall() at amd64_syscall+0x12e/frame 0xfffffe0159d3df30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0159d3df30
--- syscall (444, FreeBSD ELF64, sys_kldunloadf), rip = 0x3d2ea023e63a, rsp = 0x3d2e9f30cac8, rbp = 0x3d2e9f30d320 ---
KDB: enter: panic

Sponsored by: Lysator ACS
---
 sys/ofed/drivers/infiniband/core/ib_umem.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/sys/ofed/drivers/infiniband/core/ib_umem.c b/sys/ofed/drivers/infiniband/core/ib_umem.c
index 48df27522a5..889908eed68 100644
--- a/sys/ofed/drivers/infiniband/core/ib_umem.c
+++ b/sys/ofed/drivers/infiniband/core/ib_umem.c
@@ -248,11 +248,13 @@ static void ib_umem_account(struct work_struct *work)
  */
 void ib_umem_release(struct ib_umem *umem)
 {
-	struct ib_ucontext *context = umem->context;
 	struct mm_struct *mm;
 	struct task_struct *task;
 	unsigned long diff;
 
+	if (!umem)
+		return;
+
 	if (umem->odp_data) {
 		ib_umem_odp_release(umem);
 		return;
@@ -279,7 +281,7 @@ void ib_umem_release(struct ib_umem *umem)
 	 * up here and not be able to take the mmap_sem.  In that case
 	 * we defer the vm_locked accounting to the system workqueue.
 	 */
-	if (context->closing) {
+	if (umem->context->closing) {
 		if (!down_write_trylock(&mm->mmap_sem)) {
 			INIT_WORK(&umem->work, ib_umem_account);
 			umem->mm   = mm;
-- 
2.35.1


--=_87sLKDUSE5xZjYL8evQBNF_--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20220429153345.Horde.6oVap54ZXOmz41donxfLJJ5>