From nobody Fri Apr 29 15:33:45 2022 X-Original-To: freebsd-infiniband@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 545241998EA3 for ; Fri, 29 Apr 2022 15:34:00 +0000 (UTC) (envelope-from dandan@lysator.liu.se) Received: from mail.lysator.liu.se (mail.lysator.liu.se [IPv6:2001:6b0:17:f0a0::3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4Kqc3C3MBQz4S2w for ; Fri, 29 Apr 2022 15:33:56 +0000 (UTC) (envelope-from dandan@lysator.liu.se) Received: from mail.lysator.liu.se (localhost [127.0.0.1]) by mail.lysator.liu.se (Postfix) with ESMTP id DEFF0D34D; Fri, 29 Apr 2022 17:33:45 +0200 (CEST) Received: by mail.lysator.liu.se (Postfix, from userid 33) id DD3A6D25B; Fri, 29 Apr 2022 17:33:45 +0200 (CEST) Received: from lain.lysator.liu.se (lain.lysator.liu.se [2001:6b0:17:f0a0::cd]) by webmail.lysator.liu.se (Horde Framework) with HTTPS; Fri, 29 Apr 2022 15:33:45 +0000 Date: Fri, 29 Apr 2022 15:33:45 +0000 Message-ID: <20220429153345.Horde.6oVap54ZXOmz41donxfLJJ5@webmail.lysator.liu.se> From: dandan@lysator.liu.se To: freebsd-infiniband@freebsd.org Cc: kempe@lysator.liu.se Subject: [PATCH] ibcore: allow passing NULL-pointers to ib_umem_release() User-Agent: Horde Application Framework 5 Content-Type: multipart/mixed; boundary="=_87sLKDUSE5xZjYL8evQBNF_" List-Id: Infiniband on FreeBSD List-Archive: https://lists.freebsd.org/archives/freebsd-infiniband List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-infiniband@freebsd.org MIME-Version: 1.0 X-Virus-Scanned: ClamAV using ClamSMTP X-Rspamd-Queue-Id: 4Kqc3C3MBQz4S2w X-Spamd-Bar: -- Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=pass (policy=none) header.from=lysator.liu.se; spf=pass (mx1.freebsd.org: domain of dandan@lysator.liu.se designates 2001:6b0:17:f0a0::3 as permitted sender) smtp.mailfrom=dandan@lysator.liu.se X-Spamd-Result: default: False [-2.80 / 15.00]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; R_SPF_ALLOW(-0.20)[+a:mail.lysator.liu.se]; NEURAL_HAM_LONG(-1.00)[-1.000]; MIME_GOOD(-0.10)[multipart/mixed,text/plain]; HAS_ATTACHMENT(0.00)[]; TO_DN_NONE(0.00)[]; RCVD_COUNT_THREE(0.00)[4]; TO_MATCH_ENVRCPT_SOME(0.00)[]; MID_RHS_MATCH_FROMTLD(0.00)[]; NEURAL_HAM_SHORT(-1.00)[-1.000]; CTYPE_MIXED_BOGUS(1.00)[]; FROM_NO_DN(0.00)[]; RCPT_COUNT_TWO(0.00)[2]; MLMMJ_DEST(0.00)[freebsd-infiniband]; DMARC_POLICY_ALLOW(-0.50)[lysator.liu.se,none]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+,1:+,2:+]; ASN(0.00)[asn:1653, ipnet:2001:6b0::/32, country:EU]; RCVD_TLS_LAST(0.00)[] X-ThisMailContainsUnwantedMimeParts: N This message is in MIME format. --=_87sLKDUSE5xZjYL8evQBNF_ Content-Type: text/plain; charset=utf-8; format=flowed; DelSp=Yes Content-Disposition: inline Hi! The attached patch fixes the following kernel panic in ibcore when unloading the mlx4ib kernel module: > Fatal trap 12: page fault while in kernel mode > cpuid = 31; apic id = 2f > fault virtual address = 0x70 > fault code = supervisor read data, page not present > instruction pointer = 0x20:0xffffffff82f9fe8e > stack pointer = 0x28:0xfffffe0159d3db70 > frame pointer = 0x28:0xfffffe0159d3dba0 > code segment = base 0x0, limit 0xfffff, type 0x1b > = DPL 0, pres 1, long 1, def32 0, gran 1 > processor eflags = interrupt enabled, resume, IOPL = 0 > current process = 1866 (kldunload) > trap number = 12 > panic: page fault > cpuid = 31 > time = 1650661418 > KDB: stack backtrace: > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame > 0xfffffe0159d3d930 > vpanic() at vpanic+0x17f/frame 0xfffffe0159d3d980 > panic() at panic+0x43/frame 0xfffffe0159d3d9e0 > trap_fatal() at trap_fatal+0x385/frame 0xfffffe0159d3da40 > trap_pfault() at trap_pfault+0xab/frame 0xfffffe0159d3daa0 > calltrap() at calltrap+0x8/frame 0xfffffe0159d3daa0 > --- trap 0xc, rip = 0xffffffff82f9fe8e, rsp = 0xfffffe0159d3db70, > rbp = 0xfffffe0159d3dba0 --- ib_umem_release() at > ib_umem_release+0xe/frame 0xfffffe0159d3dba0 > mlx4_ib_destroy_qp() at mlx4_ib_destroy_qp+0x654/frame 0xfffffe0159d3dc00 > ib_destroy_qp_user() at ib_destroy_qp_user+0xce/frame 0xfffffe0159d3dc50 > ipoib_transport_dev_cleanup() at > ipoib_transport_dev_cleanup+0x1c/frame 0xfffffe0159d3dc70 > ipoib_dev_cleanup() at ipoib_dev_cleanup+0xd6/frame 0xfffffe0159d3dcb0 > ipoib_remove_one() at ipoib_remove_one+0xec/frame 0xfffffe0159d3dcf0 > ib_unregister_client() at ib_unregister_client+0x1b7/frame 0xfffffe0159d3dd30 > ipoib_cleanup_module() at ipoib_cleanup_module+0x50/frame 0xfffffe0159d3dd40 > linker_file_sysuninit() at linker_file_sysuninit+0x147/frame > 0xfffffe0159d3dd70 > linker_file_unload() at linker_file_unload+0x269/frame 0xfffffe0159d3ddb0 > kern_kldunload() at kern_kldunload+0x18d/frame 0xfffffe0159d3de00 > amd64_syscall() at amd64_syscall+0x12e/frame 0xfffffe0159d3df30 > fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0159d3df30 > --- syscall (444, FreeBSD ELF64, sys_kldunloadf), rip = > 0x3d2ea023e63a, rsp = 0x3d2e9f30cac8, rbp = 0x3d2e9f30d320 --- KDB: > enter: panic > [ thread pid 1866 tid 100919 ] > Stopped at kdb_enter+0x32: movq $0,0x127cea3(%rip) Linux added the same functionality in commit 836a0fbb3e76f704ad65ddfb57f00725245e509b. The patch is based on FreeBSD commit 0abcc1d2d33aef2333dab28e1ec1858cf45b314a. // Daniel Dandanell, Lysator ACS --=_87sLKDUSE5xZjYL8evQBNF_ Content-Type: text/plain; name=0001-ibcore-allow-passing-NULL-pointers-to-ib_umem_releas.patch Content-Disposition: attachment; size=3647; filename=0001-ibcore-allow-passing-NULL-pointers-to-ib_umem_releas.patch >From 9e481d7bb452ec18c230ffd8fd66e52f0f75125f Mon Sep 17 00:00:00 2001 From: Daniel Dandanell Date: Thu, 28 Apr 2022 22:49:55 +0200 Subject: [PATCH] ibcore: allow passing NULL-pointers to ib_umem_release() Commit b633e08c705fe43180567eae26923d6f6f98c8d9 removed the NULL-checks from the mlx4ib-driver. This fixes the following panic in ibcore: Fatal trap 12: page fault while in kernel mode cpuid = 31; apic id = 2f fault virtual address = 0x70 fault code = supervisor read data, page not present instruction pointer = 0x20:0xffffffff82f9fe8e stack pointer = 0x28:0xfffffe0159d3db70 frame pointer = 0x28:0xfffffe0159d3dba0 code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 1866 (kldunload) trap number = 12 panic: page fault cpuid = 31 time = 1650661418 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0159d3d930 vpanic() at vpanic+0x17f/frame 0xfffffe0159d3d980 panic() at panic+0x43/frame 0xfffffe0159d3d9e0 trap_fatal() at trap_fatal+0x385/frame 0xfffffe0159d3da40 trap_pfault() at trap_pfault+0xab/frame 0xfffffe0159d3daa0 calltrap() at calltrap+0x8/frame 0xfffffe0159d3daa0 --- trap 0xc, rip = 0xffffffff82f9fe8e, rsp = 0xfffffe0159d3db70, rbp = 0xfffffe0159d3dba0 --- ib_umem_release() at ib_umem_release+0xe/frame 0xfffffe0159d3dba0 mlx4_ib_destroy_qp() at mlx4_ib_destroy_qp+0x654/frame 0xfffffe0159d3dc00 ib_destroy_qp_user() at ib_destroy_qp_user+0xce/frame 0xfffffe0159d3dc50 ipoib_transport_dev_cleanup() at ipoib_transport_dev_cleanup+0x1c/frame 0xfffffe0159d3dc70 ipoib_dev_cleanup() at ipoib_dev_cleanup+0xd6/frame 0xfffffe0159d3dcb0 ipoib_remove_one() at ipoib_remove_one+0xec/frame 0xfffffe0159d3dcf0 ib_unregister_client() at ib_unregister_client+0x1b7/frame 0xfffffe0159d3dd30 ipoib_cleanup_module() at ipoib_cleanup_module+0x50/frame 0xfffffe0159d3dd40 linker_file_sysuninit() at linker_file_sysuninit+0x147/frame 0xfffffe0159d3dd70 linker_file_unload() at linker_file_unload+0x269/frame 0xfffffe0159d3ddb0 kern_kldunload() at kern_kldunload+0x18d/frame 0xfffffe0159d3de00 amd64_syscall() at amd64_syscall+0x12e/frame 0xfffffe0159d3df30 fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0159d3df30 --- syscall (444, FreeBSD ELF64, sys_kldunloadf), rip = 0x3d2ea023e63a, rsp = 0x3d2e9f30cac8, rbp = 0x3d2e9f30d320 --- KDB: enter: panic Sponsored by: Lysator ACS --- sys/ofed/drivers/infiniband/core/ib_umem.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/sys/ofed/drivers/infiniband/core/ib_umem.c b/sys/ofed/drivers/infiniband/core/ib_umem.c index 48df27522a5..889908eed68 100644 --- a/sys/ofed/drivers/infiniband/core/ib_umem.c +++ b/sys/ofed/drivers/infiniband/core/ib_umem.c @@ -248,11 +248,13 @@ static void ib_umem_account(struct work_struct *work) */ void ib_umem_release(struct ib_umem *umem) { - struct ib_ucontext *context = umem->context; struct mm_struct *mm; struct task_struct *task; unsigned long diff; + if (!umem) + return; + if (umem->odp_data) { ib_umem_odp_release(umem); return; @@ -279,7 +281,7 @@ void ib_umem_release(struct ib_umem *umem) * up here and not be able to take the mmap_sem. In that case * we defer the vm_locked accounting to the system workqueue. */ - if (context->closing) { + if (umem->context->closing) { if (!down_write_trylock(&mm->mmap_sem)) { INIT_WORK(&umem->work, ib_umem_account); umem->mm = mm; -- 2.35.1 --=_87sLKDUSE5xZjYL8evQBNF_--