From: bugzilla-noreply@freebsd.org
To: freebsd-arm@FreeBSD.org
Subject: [Bug 271990] IRQ mapping table is full after stress devctl disable/enable
Date: Wed, 14 Jun 2023 11:08:54 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271990

            Bug ID: 271990
           Summary: IRQ mapping table is full after stress devctl disable/enable
           Product: Base System
           Version: CURRENT
          Hardware: arm64
                OS: Any
            Status: New
          Severity: Affects Many People
          Priority: ---
         Component: arm
          Assignee: freebsd-arm@FreeBSD.org
          Reporter: osamaabb@amazon.com

Reproduction steps:
-------------------
1. Create an AWS EC2 instance from one of the following AMIs in us-east-1
  1.1: ami-0b55af91f40cd29ee - FreeBSD 14.0-CURRENT-arm64-20230525 UEFI
  1.2: ami-0fdc715f878897386 - FreeBSD 13.2-STABLE-arm64-20230601 UEFI
  1.3: ami-0e1fd0c2493efe1d1 - FreeBSD 12.4-STABLE-arm64-2023-06-01
2. Run the following reset loop script:

#!/bin/sh
while true
do
    devctl disable ena0
    devctl enable ena0
done

Result:
-------
Crashes every time. 100% reproducible.
***The same test does not fail on Intel-based instances.***

Stack trace:
------------
2023-06-14T08:05:02.374Z panic: IRQ mapping table is full.
2023-06-14T08:05:02.374Z cpuid = 18
2023-06-14T08:05:02.374Z time = 1686729902
2023-06-14T08:05:02.374Z KDB: stack backtrace:
2023-06-14T08:05:02.374Z db_trace_self() at db_trace_self
2023-06-14T08:05:02.374Z db_trace_self_wrapper() at db_trace_self_wrapper+0x30
2023-06-14T08:05:02.374Z vpanic() at vpanic+0x13c
2023-06-14T08:05:02.374Z panic() at panic+0x44
2023-06-14T08:05:02.374Z intr_map_irq() at intr_map_irq+0xb0
2023-06-14T08:05:02.374Z intr_alloc_msix() at intr_alloc_msix+0x1d8
2023-06-14T08:05:02.374Z generic_pcie_acpi_alloc_msix() at generic_pcie_acpi_alloc_msix+0x78
2023-06-14T08:05:02.374Z pci_alloc_msix_method() at pci_alloc_msix_method+0x168
2023-06-14T08:05:02.374Z ena_enable_msix_and_set_admin_interrupts() at ena_enable_msix_and_set_admin_interrupts+0x10c
2023-06-14T08:05:02.374Z ena_attach() at ena_attach+0x65c
2023-06-14T08:05:02.375Z device_attach() at device_attach+0x3f8
2023-06-14T08:05:02.375Z device_probe_and_attach() at device_probe_and_attach+0x7c
2023-06-14T08:05:02.375Z devctl2_ioctl() at devctl2_ioctl+0x44c
2023-06-14T08:05:02.375Z devfs_ioctl() at devfs_ioctl+0xd4
2023-06-14T08:05:02.375Z vn_ioctl() at vn_ioctl+0xc0
2023-06-14T08:05:02.375Z devfs_ioctl_f() at devfs_ioctl_f+0x20
2023-06-14T08:05:02.375Z kern_ioctl() at kern_ioctl+0x2dc
2023-06-14T08:05:02.375Z sys_ioctl() at sys_ioctl+0x118
2023-06-14T08:05:02.375Z do_el0_sync() at do_el0_sync+0x520
2023-06-14T08:05:02.375Z handle_el0_sync() at handle_el0_sync+0x44
2023-06-14T08:05:02.375Z --- exception, esr 0x56000000
2023-06-14T08:05:02.375Z Uptime: 4m1s
2023-06-14T08:05:02.375Z Dumping 2053 out of 64453 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
2023-06-14T08:10:36.676Z Dump complete
2023-06-14T08:10:37.976Z UEFI firmware (version built at 09:00:00 on Nov 1 2018)
2023-06-14T08:10:38.076Z Consoles: EFI console
2023-06-14T08:10:38.076Z Reading loader env vars from /efi/freebsd/loader.env
2023-06-14T08:10:38.076Z Setting currdev to disk0p1:
2023-06-14T08:10:38.076Z FreeBSD/arm64 EFI loader, Revision 1.1
2023-06-14T08:10:38.076Z (Thu May 25 06:36:21 UTC 2023 root@releng1.nyi.freebsd.org)
2023-06-14T08:10:38.076Z
2023-06-14T08:10:38.076Z Command line arguments: loader.efi
2023-06-14T08:10:38.176Z Image base: 0x7856f000
2023-06-14T08:10:38.176Z EFI version: 2.70
2023-06-14T08:10:38.176Z EFI Firmware: EDK II (rev 1.00)
2023-06-14T08:10:38.176Z Console: efi (0x1000)
2023-06-14T08:10:38.176Z Load Path: \EFI\BOOT\BOOTAA64.EFI
2023-06-14T08:10:38.176Z Load Device: PciRoot(0x0)/Pci(0x4,0x0)/NVMe(0x1,00-00-00-00-00-00-00-00)/HD(1,GPT,B61C1E65-FAFA-11ED-84CB-002590EC5BF2,0x3,0x10418)
2023-06-14T08:10:38.176Z BootCurrent: 0001

Initial investigation results:
------------------------------
Tried to reproduce the issue on Intel-based instances; no reproduction even
after 50k up/down iterations.
Looked into the FreeBSD ena driver [1] up/down flows and saw that the driver
does the pci_msix_allocate/release and bus_allocation/release in the correct
order.

[1] https://github.com/amzn/amzn-drivers/tree/master/kernel/fbsd/ena

Since the pci/bus APIs should be platform agnostic (?), I assume it to be an
issue with the ARM side of the kernel.
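
For anyone trying to narrow this down, a bounded variant of the reset loop
from the reproduction steps might help: it prints a cycle counter before each
disable/enable pair, so the last line printed before the panic shows roughly
how many cycles the system survived. This is only a sketch and was not part of
the original test. The function name cycle_device and the DEVCTL override
variable are made up for illustration; DEVCTL only exists so the loop can be
dry-run with a stub instead of the real devctl(8).

```sh
#!/bin/sh
# Bounded variant of the reset loop from the report (illustrative sketch).
# DEVCTL is a hypothetical override, not a devctl(8) feature: it defaults to
# the real devctl binary and can be pointed at a stub for a dry run.

cycle_device() {
    dev=$1   # device name, e.g. ena0
    max=$2   # maximum number of disable/enable cycles to run
    i=0
    while [ "$i" -lt "$max" ]; do
        i=$((i + 1))
        # Print the counter before touching the device, so the last line
        # seen before a panic identifies the cycle that was in progress.
        echo "cycle $i: $dev"
        "${DEVCTL:-devctl}" disable "$dev" || return 1
        "${DEVCTL:-devctl}" enable "$dev" || return 1
    done
    echo "completed $max cycles on $dev"
}
```

For example, `cycle_device ena0 1000` would run at most 1000 cycles on ena0.
Comparing the cycle count at the panic across runs could hint at whether each
cycle consumes a fixed number of IRQ mapping table entries, but that is an
assumption to be checked, not a finding.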