From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 275897] mlx4en: Panic when mlx4en is loaded
Date: Sat, 23 Dec 2023 13:38:50 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Who: yuuzi41@hotmail.com
List-Id: Bug reports
List-Archive: https://lists.freebsd.org/archives/freebsd-bugs
Sender: owner-freebsd-bugs@freebsd.org

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=275897

            Bug ID: 275897
           Summary: mlx4en: Panic when mlx4en is loaded
           Product: Base System
           Version: 14.0-RELEASE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: yuuzi41@hotmail.com

Created attachment 247214
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=247214&action=edit
core

A kernel panic (page fault) happens when I try to load mlx4en.
My machine has a Mellanox ConnectX-3.

----
% pciconf -vl
hostb0@pci0:0:0:0:      class=0x060000 rev=0x00 hdr=0x00 vendor=0x8086 device=0x4e24 subvendor=0x8086 subdevice=0x7270
    vendor     = 'Intel Corporation'
    class      = bridge
    subclass   = HOST-PCI
vgapci0@pci0:0:2:0:     class=0x030000 rev=0x01 hdr=0x00 vendor=0x8086 device=0x4e61 subvendor=0x8086 subdevice=0x2212
    vendor     = 'Intel Corporation'
    device     = 'JasperLake [UHD Graphics]'
    class      = display
    subclass   = VGA
xhci0@pci0:0:20:0:      class=0x0c0330 rev=0x01 hdr=0x00 vendor=0x8086 device=0x4ded subvendor=0x8086 subdevice=0x7270
    vendor     = 'Intel Corporation'
    class      = serial bus
    subclass   = USB
none0@pci0:0:20:2:      class=0x050000 rev=0x01 hdr=0x00 vendor=0x8086 device=0x4def subvendor=0x8086 subdevice=0x7270
    vendor     = 'Intel Corporation'
    class      = memory
    subclass   = RAM
none1@pci0:0:22:0:      class=0x078000 rev=0x01 hdr=0x00 vendor=0x8086 device=0x4de0 subvendor=0x8086 subdevice=0x7270
    vendor     = 'Intel Corporation'
    device     = 'Management Engine Interface'
    class      = simple comms
sdhci_pci0@pci0:0:26:0: class=0x080501 rev=0x01 hdr=0x00 vendor=0x8086 device=0x4dc4 subvendor=0x8086 subdevice=0x7270
    vendor     = 'Intel Corporation'
    class      = base peripheral
    subclass   = SD host controller
pcib1@pci0:0:28:0:      class=0x060400 rev=0x01 hdr=0x01 vendor=0x8086 device=0x4db8 subvendor=0x8086 subdevice=0x7270
    vendor     = 'Intel Corporation'
    class      = bridge
    subclass   = PCI-PCI
pcib2@pci0:0:28:1:      class=0x060400 rev=0x01 hdr=0x01 vendor=0x8086 device=0x4db9 subvendor=0x8086 subdevice=0x7270
    vendor     = 'Intel Corporation'
    class      = bridge
    subclass   = PCI-PCI
pcib3@pci0:0:28:2:      class=0x060400 rev=0x01 hdr=0x01 vendor=0x8086 device=0x4dba subvendor=0x8086 subdevice=0x7270
    vendor     = 'Intel Corporation'
    class      = bridge
    subclass   = PCI-PCI
pcib4@pci0:0:28:3:      class=0x060400 rev=0x01 hdr=0x01 vendor=0x8086 device=0x4dbb subvendor=0x8086 subdevice=0x7270
    vendor     = 'Intel Corporation'
    class      = bridge
    subclass   = PCI-PCI
pcib5@pci0:0:28:4:      class=0x060400 rev=0x01 hdr=0x01 vendor=0x8086 device=0x4dbc subvendor=0x8086 subdevice=0x7270
    vendor     = 'Intel Corporation'
    class      = bridge
    subclass   = PCI-PCI
isab0@pci0:0:31:0:      class=0x060100 rev=0x01 hdr=0x00 vendor=0x8086 device=0x4d87 subvendor=0x8086 subdevice=0x7270
    vendor     = 'Intel Corporation'
    class      = bridge
    subclass   = PCI-ISA
none2@pci0:0:31:3:      class=0x040300 rev=0x01 hdr=0x00 vendor=0x8086 device=0x4dc8 subvendor=0x8086 subdevice=0x7270
    vendor     = 'Intel Corporation'
    device     = 'Jasper Lake HD Audio'
    class      = multimedia
    subclass   = HDA
none3@pci0:0:31:4:      class=0x0c0500 rev=0x01 hdr=0x00 vendor=0x8086 device=0x4da3 subvendor=0x8086 subdevice=0x7270
    vendor     = 'Intel Corporation'
    device     = 'Jasper Lake SMBus'
    class      = serial bus
    subclass   = SMBus
none4@pci0:0:31:5:      class=0x0c8000 rev=0x01 hdr=0x00 vendor=0x8086 device=0x4da4 subvendor=0x8086 subdevice=0x7270
    vendor     = 'Intel Corporation'
    device     = 'Jasper Lake SPI Controller'
    class      = serial bus
igc0@pci0:1:0:0:        class=0x020000 rev=0x04 hdr=0x00 vendor=0x8086 device=0x125c subvendor=0x8086 subdevice=0x0000
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Controller I226-V'
    class      = network
    subclass   = ethernet
igc1@pci0:2:0:0:        class=0x020000 rev=0x04 hdr=0x00 vendor=0x8086 device=0x125c subvendor=0x8086 subdevice=0x0000
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Controller I226-V'
    class      = network
    subclass   = ethernet
igc2@pci0:3:0:0:        class=0x020000 rev=0x04 hdr=0x00 vendor=0x8086 device=0x125c subvendor=0x8086 subdevice=0x0000
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Controller I226-V'
    class      = network
    subclass   = ethernet
nvme0@pci0:4:0:0:       class=0x010802 rev=0x03 hdr=0x00 vendor=0x8086 device=0xf1a6 subvendor=0x8086 subdevice=0x390b
    vendor     = 'Intel Corporation'
    device     = 'SSD Pro 7600p/760p/E 6100p Series'
    class      = mass storage
    subclass   = NVM
mlx4_core0@pci0:5:0:0:  class=0x020000 rev=0x00 hdr=0x00 vendor=0x15b3 device=0x1003 subvendor=0x15b3 subdevice=0x0113
    vendor     = 'Mellanox Technologies'
    device     = 'MT27500 Family [ConnectX-3]'
    class      = network
    subclass   = ethernet
----

Steps to reproduce:

# kldload mlx4en

The core dump is attached.

Analysis:

Judging from the stack trace in the core, the kernel panicked because
"ifm->ifm_status" was NULL at
https://cgit.freebsd.org/src/tree/sys/net/if_media.c?h=releng/14.0#n293
and that statement was executed while mlx4en was calling ether_ifattach().
https://cgit.freebsd.org/src/tree/sys/dev/mlx4/mlx4_en/mlx4_en_netdev.c?h=releng/14.0#n2296

The ifm_status callback appears to be set in ifmedia_init()
https://cgit.freebsd.org/src/tree/sys/net/if_media.c?h=releng/14.0#n87
but mlx4en calls ifmedia_init() only after it has called ether_ifattach().
https://cgit.freebsd.org/src/tree/sys/dev/mlx4/mlx4_en/mlx4_en_netdev.c?h=releng/14.0#n2298

I think that is the root cause.

I would like to propose the patch below to fix it. It moves the
ether_ifattach() call so that it runs after ifmedia_init().

----
diff --git a/sys/dev/mlx4/mlx4_en/mlx4_en_netdev.c b/sys/dev/mlx4/mlx4_en/mlx4_en_netdev.c
index c26afc0099b5..583de1816d1b 100644
--- a/sys/dev/mlx4/mlx4_en/mlx4_en_netdev.c
+++ b/sys/dev/mlx4/mlx4_en/mlx4_en_netdev.c
@@ -2293,7 +2293,6 @@ int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port,
                dev_addr[ETHER_ADDR_LEN - 1 - i] = (u8) (priv->mac >> (8 * i));
 
-       ether_ifattach(dev, dev_addr);
        if_link_state_change(dev, LINK_STATE_DOWN);
        ifmedia_init(&priv->media, IFM_IMASK | IFM_ETH_FMASK,
            mlx4_en_media_change, mlx4_en_media_status);
@@ -2306,6 +2305,8 @@ int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port,
 
        DEBUGNET_SET(dev, mlx4_en);
 
+       ether_ifattach(dev, dev_addr);
+
        en_warn(priv, "Using %d TX rings\n", prof->tx_ring_num);
        en_warn(priv, "Using %d RX rings\n", prof->rx_ring_num);
 
----