From: "Fedorov, Aleksandr"
To: "freebsd-current@freebsd.org"
CC: "cem@FreeBSD.org"
Subject: Re: > r353680: multiuser crash due to: m_getzone: Inavlid cluster size 0
Date: Wed, 23 Oct 2019 10:11:44 +0000
Message-ID: <279a3b54b5454b3e935389ad55d68298@vstack.com>
List-Id: Discussions about the use of FreeBSD-current

I ran into a similar kernel panic. To reproduce it, just run CURRENT in bhyve with the e1000 network backend.

I think the problem is that debugnet_any_ifnet_update() calls iflib_debugnet_init() before the driver's private data is fully initialized.

sys/net/iflib.c:

iflib_debugnet_init(if_t ifp, int *nrxr, int *ncl, int *clsize)
{
	if_ctx_t ctx;

	ctx = if_getsoftc(ifp);
	CTX_LOCK(ctx);
	*nrxr = NRXQSETS(ctx);
	*ncl = ctx->ifc_rxqs[0].ifr_fl->ifl_size;
	*clsize = ctx->ifc_rxqs[0].ifr_fl->ifl_buf_size;   <<<<<<<<------ ifl_buf_size is equal to zero!!!
	CTX_UNLOCK(ctx);
}

So it seems that the ifnet_link_event EVENTHANDLER fires too early to initialize debugnet.
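
The ordering problem described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: every `model_*` name is hypothetical, the two cluster sizes are assumed amd64 values, and the size selection rule is a simplified guess at what iflib_calc_rx_mbuf_sz() does. It shows that a cached buffer size read before the interface init path has run is still zero, while recomputing the size on demand (the idea behind the workaround) yields a usable value.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cluster sizes for this model (typical amd64 values). */
enum { MODEL_MCLBYTES = 2048, MODEL_MJUMPAGESIZE = 4096 };

/* Hypothetical stand-in for an iflib free list. */
struct model_fl {
	int size;	/* number of receive buffers, known at attach time */
	int buf_size;	/* cached buffer size, 0 until the init path runs */
};

/* Hypothetical stand-in for the iflib context. */
struct model_ctx {
	int max_frame_size;	/* configured by the driver at attach time */
	int rx_mbuf_sz;		/* derived value, 0 until computed */
	struct model_fl fl;
};

/* Simplified assumption: pick the smallest cluster that fits a frame. */
static void
model_calc_rx_mbuf_sz(struct model_ctx *ctx)
{
	assert(ctx != NULL);
	if (ctx->max_frame_size <= MODEL_MCLBYTES)
		ctx->rx_mbuf_sz = MODEL_MCLBYTES;
	else
		ctx->rx_mbuf_sz = MODEL_MJUMPAGESIZE;
}

/* Models the interface init path: only here is fl.buf_size filled in. */
static void
model_if_init(struct model_ctx *ctx)
{
	model_calc_rx_mbuf_sz(ctx);
	ctx->fl.buf_size = ctx->rx_mbuf_sz;
}

/* Original behaviour: trust the cached value, which may still be 0. */
static int
model_clsize_cached(const struct model_ctx *ctx)
{
	return (ctx->fl.buf_size);
}

/* Workaround behaviour: recompute the size instead of reading the cache. */
static int
model_clsize_recomputed(struct model_ctx *ctx)
{
	model_calc_rx_mbuf_sz(ctx);
	return (ctx->rx_mbuf_sz);
}

/* Walks through the early-event case; call it from main() if desired. */
static void
model_demo(void)
{
	struct model_ctx ctx = { .max_frame_size = 1518 };

	/* A link event arrives before model_if_init(): the cache is empty. */
	assert(model_clsize_cached(&ctx) == 0);
	/* Recomputing does not depend on the init path having run. */
	assert(model_clsize_recomputed(&ctx) == MODEL_MCLBYTES);
	/* After init, the cached value agrees with the recomputed one. */
	model_if_init(&ctx);
	assert(model_clsize_cached(&ctx) == MODEL_MCLBYTES);
}
```

Recomputing on every call trades a few instructions for independence from initialization order. Whether that is the right long-term fix, as opposed to delaying the debugnet setup until the interface is initialized, is a separate question.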
Because ifl_buf_size is initialized from ctx->ifc_rx_mbuf_sz, which is initialized by iflib_calc_rx_mbuf_sz(), I use the following patch as a workaround:

diff --git a/sys/net/iflib.c b/sys/net/iflib.c
index 73606981a492..1caf3505932a 100644
--- a/sys/net/iflib.c
+++ b/sys/net/iflib.c
@@ -6729,7 +6729,8 @@ iflib_debugnet_init(if_t ifp, int *nrxr, int *ncl, int *clsize)
 	CTX_LOCK(ctx);
 	*nrxr = NRXQSETS(ctx);
 	*ncl = ctx->ifc_rxqs[0].ifr_fl->ifl_size;
-	*clsize = ctx->ifc_rxqs[0].ifr_fl->ifl_buf_size;
+	iflib_calc_rx_mbuf_sz(ctx);
+	*clsize = iflib_get_rx_mbuf_sz(ctx);
 	CTX_UNLOCK(ctx);
 }

em0: port 0x2000-0x2007 mem 0xc0000000-0xc001ffff,0xc0020000-0xc002ffff irq 16 at device 2.0 on pci0
em0: Using 1024 TX descriptors and 1024 RX descriptors
em0: Ethernet address: 00:a0:98:b9:5c:99
em0: netmap queues/slots: TX 1/1024, RX 1/1024
virtio_pci0: port 0x2040-0x207f mem 0xc0030000-0xc0031fff irq 17 at device 3.0 on pci0
vtblk0: on virtio_pci0
vtblk0: 16384MB (33554432 512 byte sectors)
atkbdc0: port 0x60,0x64 irq 1 on acpi0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
driver bug: Unable to set devclass (class: atkbdc devname: (unknown))
Unhandled ps2 mouse command 0xe1
psm0: irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model Generic PS/2 mouse, device ID 0
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart0: console (9600,n,8,1)
uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0
vga0: at port 0x3b0-0x3bb iomem 0xb0000-0xb7fff pnpid PNP0900 on isa0
Timecounters tick every 10.000 msec
usb_needs_explore_all: no devclass
em0: link state changed to UP
panic: m_getzone: invalid cluster size 0
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0011b8d7f0
vpanic() at vpanic+0x17e/frame 0xfffffe0011b8d850
panic() at panic+0x43/frame 0xfffffe0011b8d8b0
debugnet_mbuf_reinit() at debugnet_mbuf_reinit+0x21b/frame 0xfffffe0011b8d8f0
debugnet_any_ifnet_update() at debugnet_any_ifnet_update+0x107/frame 0xfffffe0011b8d940
do_link_state_change() at do_link_state_change+0x1b3/frame 0xfffffe0011b8d990
taskqueue_run_locked() at taskqueue_run_locked+0x10c/frame 0xfffffe0011b8d9f0
taskqueue_run() at taskqueue_run+0x4a/frame 0xfffffe0011b8da10
ithread_loop() at ithread_loop+0x1c6/frame 0xfffffe0011b8da70
fork_exit() at fork_exit+0x80/frame 0xfffffe0011b8dab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0011b8dab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 12 tid 100010 ]
Stopped at kdb_enter+0x37: movq $0,0x1098a86(%rip)
db>
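
For readers wondering why a cluster size of 0 ends in the panic shown in the log above, the following is a rough userspace model of a size-to-zone lookup. It is a sketch only: the `sketch_*` names are hypothetical, the four sizes are assumed amd64 values, the zone labels are illustrative, and the real m_getzone() may differ in detail. The point is that the lookup accepts only a fixed set of cluster sizes, so a zero passed down from an uninitialized ifl_buf_size matches none of them.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cluster sizes (amd64): standard, page-sized, 9k and 16k jumbo. */
enum {
	SKETCH_MCLBYTES = 2048,
	SKETCH_MJUMPAGESIZE = 4096,
	SKETCH_MJUM9BYTES = 9 * 1024,
	SKETCH_MJUM16BYTES = 16 * 1024
};

/*
 * Map a cluster size to an illustrative zone label.  Returns NULL for an
 * unsupported size; in this model that NULL stands for the point where
 * the kernel would panic with "invalid cluster size".
 */
static const char *
sketch_zone_for_size(int size)
{
	switch (size) {
	case SKETCH_MCLBYTES:
		return ("mbuf_cluster");
	case SKETCH_MJUMPAGESIZE:
		return ("mbuf_jumbo_page");
	case SKETCH_MJUM9BYTES:
		return ("mbuf_jumbo_9k");
	case SKETCH_MJUM16BYTES:
		return ("mbuf_jumbo_16k");
	default:
		return (NULL);
	}
}

/* Exercises the model; call it from main() if desired. */
static void
sketch_self_check(void)
{
	/* A supported size resolves to a zone. */
	assert(sketch_zone_for_size(SKETCH_MCLBYTES) != NULL);
	/* Size 0, as reported in the panic message, resolves to nothing. */
	assert(sketch_zone_for_size(0) == NULL);
}
```

A caller that cannot guarantee an initialized size could check the value against the supported set before requesting buffers, rather than relying on the lookup to reject it.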