Date: Wed, 27 Jan 2021 10:15:24 -0800
From: Steven Schlansker <stevenschlansker@gmail.com>
To: freebsd-fs@freebsd.org
Subject: Re: persistent integer divide fault panic in zfs_rmnode
Message-ID: <CAHjY6CV6viiZ-EbVUnzt94zwSVXF=kCgYePeAKHrv1w_mJWPMA@mail.gmail.com>
In-Reply-To: <CAHjY6CVPTfkgzZ2kwwkKxRemmRyn5DpVu4SY=4GCvmo62sircQ@mail.gmail.com>
References: <CAHjY6CVPTfkgzZ2kwwkKxRemmRyn5DpVu4SY=4GCvmo62sircQ@mail.gmail.com>

Does anybody have any suggestions as to what I can try next regarding this
panic?

At this point the only path forward I see is to declare the zpool corrupt
and attempt to move all the data off, destroy, and migrate back, and hope
the recreated pool does not tickle this bug.

That would be a pretty disappointing end to a long fatal-problem-free run
with ZFS.

Thanks,
Steven

On Fri, Jan 8, 2021 at 3:41 PM Steven Schlansker <stevenschlansker@gmail.com>
wrote:

> Hi freebsd-fs,
>
> I have a 8-way raidz2 system running FreeBSD 12.2-RELEASE-p1 GENERIC
> Approximately since upgrading to FreeBSD 12.2-RELEASE, I receive a nasty
> panic when trying to unlink any of a large number of files.
>
> Fatal trap 18: integer divide fault while in kernel mode
>
>
> The pool reports as healthy:
>
>   pool: universe
>  state: ONLINE
> status: One or more devices are configured to use a non-native block size.
>         Expect reduced performance.
> action: Replace affected devices with devices that support the
>         configured block size, or migrate data to a properly configured
>         pool.
>   scan: resilvered 416M in 0 days 00:08:35 with 0 errors on Thu Jan 7
>         02:16:03 2021
> When some files are unlinked, the system panics with a partial backtrace
> of:
>
> #6 0xffffffff82a148ce at zfs_rmnode+0x5e
> #7 0xffffffff82a35612 at zfs_freebsd_reclaim+0x42
> #8 0xffffffff812482db at VOP_RECLAIM_APV+0x7b
> #9 0xffffffff80c8e376 at vgonel+0x216
> #10 0xffffffff80c8e9c5 at vrecycle+0x45
>
> I captured a dump, and using kgdb extracted a full backtrace, and filed it
> as https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=250784
>
> #8 0xffffffff82963725 in get_next_chunk (dn=0xfffff804325045c0,
>     start=<optimized out>, minimum=0, l1blks=<optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:721
> warning: Source file is more recent than executable.
> 721         (roundup(*start, iblkrange) - (minimum / iblkrange * iblkrange)) /
> (kgdb) list
> 716      * L1 blocks in this range have data. If we can, we use this
> 717      * worst case value as an estimate so we can avoid having to look
> 718      * at the object's actual data.
> 719      */
> 720     uint64_t total_l1blks =
> 721         (roundup(*start, iblkrange) - (minimum / iblkrange * iblkrange)) /
> 722         iblkrange;
> 723     if (total_l1blks <= maxblks) {
> 724         *l1blks = total_l1blks;
> 725         *start = minimum;
> (kgdb) print iblkrange
> $1 = 0
> (kgdb) print minimum
> $2 = 0
>
> It looks like it is attempting to compute 0 / 0, causing the panic.
>
> How can I restore my zpool to a working state? Thank you for any
> assistance.
>
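
For illustration, the arithmetic at dmu.c lines 720-722 can be isolated in a few lines of C. This is only a sketch under assumptions, not the FreeBSD source and not a proposed patch: `roundup_u64` and `estimate_total_l1blks` are hypothetical names, and the zero check is added here purely to show where the division would trap. With `iblkrange == 0`, as printed by kgdb above, every division in the original expression has a zero divisor, which on amd64 raises the integer divide fault reported as trap 18.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round x up to the next multiple of y. The caller must ensure y != 0. */
static uint64_t
roundup_u64(uint64_t x, uint64_t y)
{
	return (((x + y - 1) / y) * y);
}

/*
 * Hypothetical stand-in for the worst-case L1 block estimate shown in the
 * kgdb listing. Returns false, leaving *total_l1blks untouched, when
 * iblkrange is 0, because every division below would otherwise trap.
 */
static bool
estimate_total_l1blks(uint64_t start, uint64_t minimum, uint64_t iblkrange,
    uint64_t *total_l1blks)
{
	if (iblkrange == 0)
		return (false);

	*total_l1blks = (roundup_u64(start, iblkrange) -
	    (minimum / iblkrange * iblkrange)) / iblkrange;
	return (true);
}
```

Calling `estimate_total_l1blks(start, 0, 0, &n)` returns false instead of faulting. The sketch says nothing about why the dnode ends up with a zero `iblkrange` in the first place, which is the underlying question for the pool.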