Date:      Tue, 22 Aug 2023 13:49:24 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 273289] panic on removal of SAS drive
Message-ID:  <bug-273289-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=273289

            Bug ID: 273289
           Summary: panic on removal of SAS drive
           Product: Base System
           Version: 13.2-STABLE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: jfc@mit.edu

I removed a SAS SSD and the system crashed with the message:

panic: free: called with spinlock or critical section held

The erroneous free is in pqisrc_device_mem_free.  My kernel is based on
4c22848d1a7e3fc996adc0cb71e35d7be8b26ffb except I have INVARIANTS enabled.
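
For illustration only: the assertion in the panic message rejects a free() call
made while a spin lock or critical section is held. The userspace sketch below
is not the smartpqi source. Every name in it (spin_lock_sim, checked_free,
remove_dev_buggy, remove_dev_deferred, dev_table) is hypothetical, and it
assumes the removal path runs under a spin lock, which this report does not
confirm. It contrasts freeing under the lock with unlinking under the lock and
freeing after release.

```c
/*
 * Userspace sketch, NOT driver code. All names are hypothetical.
 * A depth counter stands in for a held spin lock or critical section.
 */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static int spin_depth = 0;   /* > 0 means "spin lock held" */
static int bad_frees = 0;    /* frees attempted while the lock was held */

static void spin_lock_sim(void)   { spin_depth++; }
static void spin_unlock_sim(void) { spin_depth--; }

/* Mimics the debug check: record a free() made while the lock is held. */
static void checked_free(void *p)
{
    if (spin_depth > 0)
        bad_frees++;         /* an INVARIANTS kernel would panic here */
    free(p);
}

struct dev { int target; };

#define MAX_DEVS 8
static struct dev *dev_table[MAX_DEVS];

/* Problematic shape: the entry is freed while the lock is still held. */
static void remove_dev_buggy(int slot)
{
    spin_lock_sim();
    struct dev *d = dev_table[slot];
    dev_table[slot] = NULL;
    checked_free(d);
    spin_unlock_sim();
}

/* Safer shape: unlink under the lock, free only after releasing it. */
static void remove_dev_deferred(int slot)
{
    struct dev *d;

    spin_lock_sim();
    d = dev_table[slot];
    dev_table[slot] = NULL;
    spin_unlock_sim();
    checked_free(d);
}
```

Calling remove_dev_buggy() on a populated slot increments bad_frees, while
remove_dev_deferred() leaves it unchanged. Whether the driver should defer the
free or switch to a sleepable lock is a question for the maintainers.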

The drive identifies as SEAGATE XS960SE70004 (960 GB SAS SSD).  It holds a ZFS
pool which I exported before removing the drive.

The system is an HPE ProLiant DL325 Gen10 with the controller below:

ses0: <HPE Smart Adapter 1.99> Fixed Enclosure Services SPC-3 SCSI device
ses0: 1200.000MB/s transfers
ses0: SES Device
ses0: da0,pass0 in 'ArrayElement0000', SAS Slot: 1 phys at slot 1
...
ses0: da7,pass7 in 'ArrayElement0007', SAS Slot: 1 phys at slot 8
ses0:  phy 0: SAS device type 1 phy 7 Target ( SSP )
ses0:  phy 0: parent 51402ec013d6a5b4 addr 5000c5003e85f2bd

I removed and reinserted da7.  The panic appears to have been triggered by
removal.

Crash dump information follows.

Unread portion of the kernel message buffer:
[INFO]:[ pqisrc_display_device_info ] [ 324 ]removed scsi BTL 0:71:0:  SEAGATE
XS960SE70004     Physical     SSDSmartPathCap- En- Exp+ qd=65535
[INFO]:[ pqisrc_remove_device ] [ 1302 ]vendor: SEAGATE XS960SE70004     model:
XS960SE70004     bus:0 target:71 lun:0 is_physical_device:0x1 expose_device:0x1
volume_offline 0x0 volume_status 0x0
[INFO]:[ pqisrc_wait_for_device_commands_to_complete ] [ 515 ]Device
Outstanding IO count = 0
panic: free: called with spinlock or critical section held
cpuid = 11
time = 1692710548
KDB: stack backtrace:
#0 0xffffffff80c19e05 at kdb_backtrace+0x65
#1 0xffffffff80bcf112 at vpanic+0x152
#2 0xffffffff80bcef13 at panic+0x43
#3 0xffffffff80ba4b5f at free+0xcf
#4 0xffffffff811247ee at pqisrc_free_device+0x16e
#5 0xffffffff811210ce at os_remove_device+0x7e
#6 0xffffffff81125a3f at pqisrc_scan_devices+0xe7f
#7 0xffffffff8112736d at pqisrc_ack_all_events+0x16d
#8 0xffffffff80c2e87b at taskqueue_run_locked+0xab
#9 0xffffffff80c2e78d at taskqueue_run+0x4d
#10 0xffffffff80b8c9e6 at ithread_loop+0x256
#11 0xffffffff80b89910 at fork_exit+0x80
#12 0xffffffff8105f5ee at fork_trampoline+0xe
Uptime: 20d2h33m19s
Dumping 11245 out of 98100 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

__curthread () at /usr/home/jfc/freebsd/src/sys/amd64/include/pcpu_aux.h:55
55              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct
pcpu,
(kgdb) #0  __curthread ()
    at /usr/home/jfc/freebsd/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=<optimized out>)
    at /usr/home/jfc/freebsd/src/sys/kern/kern_shutdown.c:396
#2  0xffffffff80bced22 in kern_reboot (howto=260)
    at /usr/home/jfc/freebsd/src/sys/kern/kern_shutdown.c:484
#3  0xffffffff80bcf17f in vpanic (
    fmt=0xffffffff811f6ecd "free: called with spinlock or critical section
held", ap=ap@entry=0xfffffe01b7549a00)
    at /usr/home/jfc/freebsd/src/sys/kern/kern_shutdown.c:923
#4  0xffffffff80bcef13 in panic (fmt=<unavailable>)
    at /usr/home/jfc/freebsd/src/sys/kern/kern_shutdown.c:847
#5  0xffffffff80ba4b5f in free_dbg (mtp=0xffffffff81b49030 <M_SMARTPQI>,
    addrp=<optimized out>)
    at /usr/home/jfc/freebsd/src/sys/kern/kern_malloc.c:866
#6  free (addr=addr@entry=0xfffff80104da9700,
    mtp=0xffffffff81b49030 <M_SMARTPQI>)
    at /usr/home/jfc/freebsd/src/sys/kern/kern_malloc.c:904
#7  0xffffffff8112cef4 in os_mem_free (softs=softs@entry=0xfffffe01b852a000,
    addr=<unavailable>, addr@entry=0xfffff80104da9700 "", size=<unavailable>,
    size@entry=184)
    at /usr/home/jfc/freebsd/src/sys/dev/smartpqi/smartpqi_mem.c:192
#8  0xffffffff811247ee in pqisrc_device_mem_free (softs=0xfffffe01b852a000,
    device=0xfffff80104da9700)
    at /usr/home/jfc/freebsd/src/sys/dev/smartpqi/smartpqi_discovery.c:1432
#9  pqisrc_free_device (softs=softs@entry=0xfffffe01b852a000,
    device=device@entry=0xfffff80104da9700)
    at /usr/home/jfc/freebsd/src/sys/dev/smartpqi/smartpqi_discovery.c:1464
#10 0xffffffff811210ce in os_remove_device (softs=0xfffffe01b852a000,
    device=0xfffff80104da9700)
    at /usr/home/jfc/freebsd/src/sys/dev/smartpqi/smartpqi_cam.c:152
#11 0xffffffff81124673 in pqisrc_remove_device (softs=0x99d1f44cbec872bb,
    softs@entry=0xfffffe01b852a000, device=<unavailable>,
    device@entry=0xfffff80104da9700)
    at /usr/home/jfc/freebsd/src/sys/dev/smartpqi/smartpqi_discovery.c:1317
#12 0xffffffff81125a3f in pqisrc_update_device_list (
    softs=0xfffffe01b852a000, new_device_list=0xfffff802f4277b80,
    num_new_devices=9)
    at /usr/home/jfc/freebsd/src/sys/dev/smartpqi/smartpqi_discovery.c:1597
#13 pqisrc_scan_devices (softs=softs@entry=0xfffffe01b852a000)
    at /usr/home/jfc/freebsd/src/sys/dev/smartpqi/smartpqi_discovery.c:1992
#14 0xffffffff8112736d in pqisrc_rescan_devices (softs=0xfffffe01b852a000)
    at /usr/home/jfc/freebsd/src/sys/dev/smartpqi/smartpqi_event.c:42
#15 pqisrc_ack_all_events (arg1=0xfffffe01b852a000)
    at /usr/home/jfc/freebsd/src/sys/dev/smartpqi/smartpqi_event.c:123
#16 0xffffffff80c2e87b in taskqueue_run_locked (
    queue=queue@entry=0xfffff8010191be00)
    at /usr/home/jfc/freebsd/src/sys/kern/subr_taskqueue.c:514
#17 0xffffffff80c2e78d in taskqueue_run (queue=0xfffff8010191be00)
    at /usr/home/jfc/freebsd/src/sys/kern/subr_taskqueue.c:529
#18 0xffffffff80b8c9e6 in intr_event_execute_handlers (ie=0xfffff8010191bd00,
    p=<optimized out>) at /usr/home/jfc/freebsd/src/sys/kern/kern_intr.c:1169
#19 ithread_execute_handlers (ie=0xfffff8010191bd00, p=<optimized out>)
    at /usr/home/jfc/freebsd/src/sys/kern/kern_intr.c:1182
#20 ithread_loop (arg=arg@entry=0xfffff80101964c60)
    at /usr/home/jfc/freebsd/src/sys/kern/kern_intr.c:1270
#21 0xffffffff80b89910 in fork_exit (
    callout=0xffffffff80b8c790 <ithread_loop>, arg=0xfffff80101964c60,
    frame=0xfffffe01b7549f40)
    at /usr/home/jfc/freebsd/src/sys/kern/kern_fork.c:1094
#22 <signal handler called>

-- 
You are receiving this mail because:
You are the assignee for the bug.