Date:      Tue, 20 Jul 2021 15:43:17 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 257298] kernel panic with kern.cam.da.enable_uma_ccbs=1
Message-ID:  <bug-257298-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=257298

            Bug ID: 257298
           Summary: kernel panic with kern.cam.da.enable_uma_ccbs=1
           Product: Base System
           Version: CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: pr@aoek.com

Hi,
despite https://reviews.freebsd.org/D31054 I can still reproduce exactly the
same backtrace as reported on the mailing list:
https://lists.freebsd.org/archives/freebsd-current/2021-June/000267.html

In particular, with a GENERIC kernel I get:
panic: Duplicate free of 0xffffa02039d7a000 from zone 0xffff000166aec000(ada_ccb) slab 0xffffa02039d7afd8(0)
cpuid = 10
time = 1626781044
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x188
panic() at panic+0x44
uma_dbg_free() at uma_dbg_free+0x1e4
uma_zfree_arg() at uma_zfree_arg+0x358
ahci_end_transaction() at ahci_end_transaction+0x7a4
ahci_ch_intr_main() at ahci_ch_intr_main+0x660
ahci_ch_intr() at ahci_ch_intr+0x5c
ahci_intr() at ahci_intr+0xe4
ithread_loop() at ithread_loop+0x2a8
fork_exit() at fork_exit+0x74
fork_trampoline() at fork_trampoline+0x14
KDB: enter: panic
[ thread pid 12 tid 100137 ]
Stopped at      kdb_enter+0x48: undefined       f904411f

And with a GENERIC-NODEBUG kernel I get:
panic: vm_fault failed: ffff0000007a9b40 error 1
cpuid = 2
time = 1626773972
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x188
panic() at panic+0x44
data_abort() at data_abort+0x1e0
handle_el1h_sync() at handle_el1h_sync+0x74
--- exception, esr 0x96000044
zone_release() at zone_release+0x224
bucket_drain() at bucket_drain+0xe8
bucket_cache_reclaim_domain() at bucket_cache_reclaim_domain+0x3b0
zone_reclaim() at zone_reclaim+0x194
uma_reclaim_domain() at uma_reclaim_domain+0xbc
vm_pageout_worker() at vm_pageout_worker+0x594
vm_pageout() at vm_pageout+0x1e0
fork_exit() at fork_exit+0x94
fork_trampoline() at fork_trampoline+0x14
KDB: enter: panic
[ thread pid 33 tid 100222 ]
Stopped at      kdb_enter+0x48: undefined       f903c11f

This is with CURRENT as of 439097486ba0453e057c05d548fa306d91c784e5
Author: Jessica Clarke <jrtc27@FreeBSD.org>
Date:   Mon Jul 19 17:19:23 2021 +0100

(This is just where my tree is now; it has nothing to do with Jessica's commit.)

Environment:
# uname -a
FreeBSD asn 14.0-CURRENT FreeBSD 14.0-CURRENT #0 main-n248066-439097486ba0: Mon Jul 19 21:33:35 CEST 2021     root@asn:/usr/obj/usr/src/arm64.aarch64/sys/GENERIC-NODEBUG  arm64

(or GENERIC instead of GENERIC-NODEBUG)

I have a board that is known to have low signal levels in the SATA subsystem
and that frequently hits minor problems with ada disks, such as:
(ada0:ahcich1:0:0:0): READ_FPDMA_QUEUED. ACB: 60 08 c0 ce 36 40 06 00 00 00 00 00
(ada0:ahcich1:0:0:0): CAM status: Uncorrectable parity/CRC error
(ada0:ahcich1:0:0:0): Retrying command, 3 more tries remain

or

ahcich1: Timeout on slot 14 port 0
ahcich1: is 00000000 cs 0003c080 ss 0003c080 rs 0003c080 tfd 50 serr 00180000 cmd 0000c017
(ada0:ahcich1:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 20 e8 8f ed 40 08 00 00 00 00 00
(ada0:ahcich1:0:0:0): CAM status: Command timeout
(ada0:ahcich1:0:0:0): Retrying command, 3 more tries remain

This is OK: FreeBSD is solid enough to cope with that without any data loss at
the filesystem level.

While I understand that this setup is suboptimal, it exposes the bug that is
the subject of this report: the kernel panics with faulty hardware.

Interestingly, I can avoid the bug by setting kern.cam.da.enable_uma_ccbs=0

Note that I set the .da. sysctl, not the .ada. one. I have no da disks in the
system, only ada (two).

# sysctl -a | fgrep cbs
kern.cam.da.enable_uma_ccbs: 0
kern.cam.ada.enable_uma_ccbs: 0

I am unable to get a kernel dump with line numbers (RAM >> swap). Is there a
workaround for this?

Regarding the bug, I can test further; please suggest a direction.

--
You are receiving this mail because:
You are the assignee for the bug.


