From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 252453] iflib doesn't initialize multi-queue netmap properly
Date: Tue, 05 Jan 2021 22:25:02 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252453

            Bug ID: 252453
           Summary: iflib doesn't initialize multi-queue netmap properly
           Product: Base System
           Version: 12.2-RELEASE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: brpoole@vt.edu

Hello,

I am testing a variety of network cards configured for multiple netmap queues
on FreeBSD-12.2.
I was debugging a panic in iflib's netmap_fl_refill() that always occurs when
running pkt-gen in receive mode with multiple threads on the ice driver. That
led me to discover what I think is a logic error in netmap usage under iflib.

I believe iflib doesn't properly track the initial ringid. When multiple
threads or processes each connect to a different queue on the same interface
using NR_REG_ONE_NIC, rings are overwritten and reconfigured. I observed this
behavior by looking at the output of netmap_reset(), which is called by
iflib_netmap_rxq_init(), which in turn is called by iflib_init_locked(). Each
time a ring is configured, the function steps through all configured rings
starting at 0. For comparison, I looked at the cxgbe driver, which does not use
iflib. It has a first_nm_rxq variable that is added to the start of all loops.

Below is my procedure, followed by the output from the failing ice and igb
drivers, which use iflib, and then the output of the working cxgbe driver.

I added CONFIG_NETMAP_DEBUG to netmap_kern.h to enable the netmap_debug
variable. Then I rebuilt and rebooted, and at runtime set
sysctl dev.netmap.debug=1 to enable NM_DEBUG_ON, as well as
sysctl dev.netmap.verbose=1.

# recv is a simple netmap receiver, this opens just queue 9
./recv netmap:ice0-9
[1921] netmap_interp_ringid ice0: tx [9,10) rx [9,10) id 9
[1994] netmap_krings_get ice0: grabbing tx [9, 10) rx [9, 10)
[4121] netmap_reset ice0 TX9 hwofs 0 -> 0, hwtail 1023 -> 1023
[4121] netmap_reset ice0 RX9 hwofs 0 -> 0, hwtail 0 -> 0
[1083] netmap_mmap_single cdev 0xfffff80107432c00 foff 0 size 343121920 objp 0xfffffe0125f36968 prot 3
[ 991] netmap_dev_pager_ctor handle 0xfffff801707580b0 size 343121920 prot 3 foff 0
[3400] netmap_poll device ice0 events 0x1

# Now start second process connecting to queue 10, also resets queue 9
./recv netmap:ice0-10
[1921] netmap_interp_ringid ice0: tx [10,11) rx [10,11) id 10
[1994] netmap_krings_get ice0: grabbing tx [10, 11) rx [10, 11)
[4121] netmap_reset ice0 TX9 hwofs 0 -> 0, hwtail 1023 -> 1023
[4121] netmap_reset ice0 TX10 hwofs 0 -> 0, hwtail 1023 -> 1023
[4121] netmap_reset ice0 RX9 hwofs 0 -> 0, hwtail 0 -> 0
[4121] netmap_reset ice0 RX10 hwofs 0 -> 0, hwtail 0 -> 0
[3400] netmap_poll device ice0 events 0x1
[3400] netmap_poll device ice0 events 0x1
[3400] netmap_poll device ice0 events 0x1
[3400] netmap_poll device ice0 events 0x1
[1083] netmap_mmap_single cdev 0xfffff80107432c00 foff 0 size 343121920 objp 0xfffffe0125f4f968 prot 3
[ 991] netmap_dev_pager_ctor handle 0xfffff801707580c0 size 343121920 prot 3 foff 0
[3400] netmap_poll device ice0 events 0x1

# On to the igb driver
./recv netmap:igb4-2
[1921] netmap_interp_ringid igb4: tx [2,3) rx [2,3) id 2
[1994] netmap_krings_get igb4: grabbing tx [2, 3) rx [2, 3)
[4121] netmap_reset igb4 TX2 hwofs 0 -> 0, hwtail 1023 -> 1023
[4121] netmap_reset igb4 RX2 hwofs 0 -> 0, hwtail 0 -> 0
[1083] netmap_mmap_single cdev 0xfffff80104b30400 foff 0 size 343019520 objp 0xfffffe010006d9a8 prot 3
[ 991] netmap_dev_pager_ctor handle 0xfffff80104f827a0 size 343019520 prot 3 foff 0
[3400] netmap_poll device igb4 events 0x1

# Now start second process, again this resets queues 2 and 3
./recv netmap:igb4-3
[1921] netmap_interp_ringid igb4: tx [3,4) rx [3,4) id 3
[1994] netmap_krings_get igb4: grabbing tx [3, 4) rx [3, 4)
[4121] netmap_reset igb4 TX2 hwofs 0 -> 0, hwtail 1023 -> 1023
[4121] netmap_reset igb4 TX3 hwofs 0 -> 0, hwtail 1023 -> 1023
[4121] netmap_reset igb4 RX2 hwofs 0 -> 0, hwtail 0 -> 0
[3400] netmap_poll device igb4 events 0x1
[4121] netmap_reset igb4 RX3 hwofs 0 -> 0, hwtail 0 -> 0
[3400] netmap_poll device igb4 events 0x1
[1083] netmap_mmap_single cdev 0xfffff80104b30400 foff 0 size 343019520 objp 0xfffffe011720e9a8 prot 3
[ 991] netmap_dev_pager_ctor handle 0xfffff80100087a00 size 343019520 prot 3 foff 0
[3400] netmap_poll device igb4 events 0x1

# As a sanity check, here's the output from cxgbe - only one queue is modified
./recv netmap:cc0-9
[1921] netmap_interp_ringid cc0: tx [9,10) rx [9,10) id 9
[1994] netmap_krings_get cc0: grabbing tx [9, 10) rx [9, 10)
[4121] netmap_reset cc0 RX9 hwofs 0 -> 0, hwtail 0 -> 0
[4121] netmap_reset cc0 TX9 hwofs 0 -> 0, hwtail 1022 -> 1022
[1083] netmap_mmap_single cdev 0xfffff80107432c00 foff 0 size 343121920 objp 0xfffffe0125f4a968 prot 3
[ 991] netmap_dev_pager_ctor handle 0xfffff8016ea92f40 size 343121920 prot 3 foff 0
[3400] netmap_poll device cc0 events 0x1

./recv netmap:cc0-10
[1921] netmap_interp_ringid cc0: tx [10,11) rx [10,11) id 10
[1994] netmap_krings_get cc0: grabbing tx [10, 11) rx [10, 11)
[4121] netmap_reset cc0 RX10 hwofs 0 -> 0, hwtail 0 -> 0
[4121] netmap_reset cc0 TX10 hwofs 0 -> 0, hwtail 1022 -> 1022
[1083] netmap_mmap_single cdev 0xfffff80107432c00 foff 0 size 343121920 objp 0xfffffe0120623968 prot 3
[ 991] netmap_dev_pager_ctor handle 0xfffff8016e9a9d20 size 343121920 prot 3 foff 0
[3400] netmap_poll device cc0 events 0x1