FreeBSD Mail Archives

Date:      Wed, 27 Jan 2021 10:15:24 -0800
From:      Steven Schlansker <stevenschlansker@gmail.com>
To:        freebsd-fs@freebsd.org
Subject:   Re: persistent integer divide fault panic in zfs_rmnode
Message-ID:  <CAHjY6CV6viiZ-EbVUnzt94zwSVXF=kCgYePeAKHrv1w_mJWPMA@mail.gmail.com>
In-Reply-To: <CAHjY6CVPTfkgzZ2kwwkKxRemmRyn5DpVu4SY=4GCvmo62sircQ@mail.gmail.com>
References:  <CAHjY6CVPTfkgzZ2kwwkKxRemmRyn5DpVu4SY=4GCvmo62sircQ@mail.gmail.com>

Does anybody have any suggestions as to what I can try next regarding this
panic?

At this point the only path forward I see is to declare the zpool corrupt,
attempt to move all the data off, destroy the pool, migrate the data back,
and hope the recreated pool does not tickle this bug.

That would be a pretty disappointing end to a long fatal-problem-free run
with ZFS.

Thanks,
Steven

On Fri, Jan 8, 2021 at 3:41 PM Steven Schlansker <stevenschlansker@gmail.com>
wrote:

> Hi freebsd-fs,
>
> I have an 8-way raidz2 system running FreeBSD 12.2-RELEASE-p1 GENERIC.
> Approximately since upgrading to FreeBSD 12.2-RELEASE, I have been getting
> a nasty panic when trying to unlink any of a large number of files.
>
> Fatal trap 18: integer divide fault while in kernel mode
>
> The pool reports as healthy:
>
>   pool: universe
>  state: ONLINE
> status: One or more devices are configured to use a non-native block size.
>         Expect reduced performance.
> action: Replace affected devices with devices that support the
>         configured block size, or migrate data to a properly configured
>         pool.
>   scan: resilvered 416M in 0 days 00:08:35 with 0 errors on Thu Jan  7 02:16:03 2021
>
> When some files are unlinked, the system panics with a partial backtrace
> of:
>
> #6 0xffffffff82a148ce at zfs_rmnode+0x5e
> #7 0xffffffff82a35612 at zfs_freebsd_reclaim+0x42
> #8 0xffffffff812482db at VOP_RECLAIM_APV+0x7b
> #9 0xffffffff80c8e376 at vgonel+0x216
> #10 0xffffffff80c8e9c5 at vrecycle+0x45
>
> I captured a dump, and using kgdb extracted a full backtrace, and filed it
> as https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=250784
>
> #8  0xffffffff82963725 in get_next_chunk (dn=0xfffff804325045c0, start=<optimized out>, minimum=0, l1blks=<optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:721
> warning: Source file is more recent than executable.
> 721                 (roundup(*start, iblkrange) - (minimum / iblkrange * iblkrange)) /
> (kgdb) list
> 716              * L1 blocks in this range have data. If we can, we use this
> 717              * worst case value as an estimate so we can avoid having to look
> 718              * at the object's actual data.
> 719              */
> 720             uint64_t total_l1blks =
> 721                 (roundup(*start, iblkrange) - (minimum / iblkrange * iblkrange)) /
> 722                 iblkrange;
> 723             if (total_l1blks <= maxblks) {
> 724                     *l1blks = total_l1blks;
> 725                     *start = minimum;
> (kgdb) print iblkrange
> $1 = 0
> (kgdb) print minimum
> $2 = 0
>
> It looks like, with iblkrange equal to 0, the expression on line 721 is
> attempting to compute 0 / 0 (minimum / iblkrange), causing the panic.
>
> How can I restore my zpool to a working state?  Thank you for any
> assistance.
>
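
For readers following the kgdb output quoted above, here is a minimal
standalone sketch of the arithmetic at dmu.c:720-722 and of why
iblkrange == 0 traps. It is not the FreeBSD code and not a proposed fix:
the function names estimate_total_l1blks and roundup_u64 are hypothetical,
the roundup helper is assumed to behave like the usual round-up-to-multiple
macro, and the guard only illustrates where the division would need
protecting. It says nothing about why iblkrange ended up as 0 on this pool.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Round x up to the next multiple of y. Assumed to mirror the behaviour of
 * the roundup() macro used in the quoted source; y must be nonzero.
 */
static uint64_t
roundup_u64(uint64_t x, uint64_t y)
{
	return (((x + (y - 1)) / y) * y);
}

/*
 * Hypothetical guarded version of the worst-case L1 block estimate shown in
 * the kgdb listing. Every division in the original expression has iblkrange
 * as its divisor, so iblkrange == 0 raises an integer divide fault on amd64.
 * Here we return false instead of dividing, leaving *total_l1blks untouched.
 */
static bool
estimate_total_l1blks(uint64_t start, uint64_t minimum, uint64_t iblkrange,
    uint64_t *total_l1blks)
{
	if (iblkrange == 0)
		return (false);	/* the state seen in the dump */

	*total_l1blks =
	    (roundup_u64(start, iblkrange) -
	    (minimum / iblkrange * iblkrange)) / iblkrange;
	return (true);
}
```

With the values from the dump (minimum = 0, iblkrange = 0) the sketch
returns false rather than faulting; with a nonzero iblkrange it performs the
same calculation as the quoted lines.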

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAHjY6CV6viiZ-EbVUnzt94zwSVXF=kCgYePeAKHrv1w_mJWPMA>
