Date:      Sun, 28 Nov 2021 12:22:46 +1100
From:      Peter Jeremy <peterj@freebsd.org>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        src-committers@freebsd.org, dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org
Subject:   Re: git: b19740f4ce7a - main - swap_pager: lock vnode in swapdev_strategy()
Message-ID:  <YaLZ5rcBo6SxCGQK@server.rulingia.com>
In-Reply-To: <YaFtGQ8vVXScXdjZ@kib.kiev.ua>
References:  <202111251935.1APJZA1e094731@gitrepo.freebsd.org> <YaC8j8ZwYotIKGSO@server.rulingia.com> <YaFtGQ8vVXScXdjZ@kib.kiev.ua>

On 2021-Nov-27 01:26:17 +0200, Konstantin Belousov <kostikbel@gmail.com> wrote:
>commit 9c62295373f728459c19138f5aa03d9cb8422554
>Author: Konstantin Belousov <kib@FreeBSD.org>
>Date:   Sat Nov 27 01:22:27 2021 +0200
>
>    swapoff_one(): only check free pages count manually turning swap off

That didn't work but I don't think the underlying bug is related to
your recent work on swap_pager - digging back through my logs, I've
found another similar panic in August last year.

Nov 28 09:40:17 rock64 syslogd: exiting on signal 15
Waiting (max 60 seconds) for system process `vnlru' to stop... done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining... 0 0 done
Waiting (max 60 seconds) for system thread `bufdaemon' to stop... done
Waiting (max 60 seconds) for system thread `bufspacedaemon-0' to stop... done
All buffers synced.
No strategy for buffer at 0xffff0000bf8dc000
vnode 0xffffa00009024a80: type VBAD
    usecount 2, writecount 0, refcount 33263 seqc users 1
    hold count flags ()
    flags (VIRF_DOOMED|VV_VMSIZEVNLOCK)
    lock type nfs: SHARED (count 1)
swap_pager: I/O error - pagein failed; blkno 241400,size 4096, error 45
panic: VOP_STRATEGY failed bp=0xffff0000bf8dc000 vp=0
cpuid = 0
time = 1638052821
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x178
panic() at panic+0x44
bufstrategy() at bufstrategy+0x80
swapdev_strategy() at swapdev_strategy+0xcc
swap_pager_getpages_locked() at swap_pager_getpages_locked+0x460
swapoff_one() at swapoff_one+0x3e4
swapoff_all() at swapoff_all+0x9c
bufshutdown() at bufshutdown+0x2ac
kern_reboot() at kern_reboot+0x240
sys_reboot() at sys_reboot+0x358
do_el0_sync() at do_el0_sync+0x4a4
handle_el0_sync() at handle_el0_sync+0x9c
--- exception, esr 0x56000000
KDB: enter: panic
[ thread pid 1 tid 100002 ]
Stopped at      kdb_enter+0x48: undefined       f900c11f
db>

This is the same traceback as in my previous mail.  Looking at the code
path, the test for whether there's enough RAM to swap in all the data
passes in both cases: if swapoff_one() had returned ENOMEM, then
swapoff_all() would have reported a "Cannot remove swap device" error
and kept going (without actually removing the swap device), and
that's not happening.

I think the important message is "No strategy for buffer at 0x..."
which comes from vop_nostrategy() and causes bufstrategy() to panic:

swapdev_strategy()
 => bstrategy()
 => BO_STRATEGY()
 => bufstrategy()
 => VOP_STRATEGY()
 => VOP_STRATEGY_APV()
 => vop_nostrategy()
    => bufdone() => swp_pager_async_iodone()
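
To illustrate the chain above, here is a minimal userland sketch of how
a vnode whose own operations vector no longer implements strategy can
fall through to a default handler that fails the buffer.  It is a model
only, not the kernel code: every model_* name is hypothetical, and the
assumption that a doomed vnode ends up on a vector without its own
strategy entry is taken from the hypothesis in this mail rather than
from the source tree.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userland stand-ins for struct buf and struct vnode. */
struct model_buf {
	int b_error;	/* error recorded on the buffer */
	int b_done;	/* set once the I/O has been "completed" */
};

struct model_vnode;
typedef int (*strategy_fn)(struct model_vnode *, struct model_buf *);

struct model_vops {
	const struct model_vops *vop_default;	/* fallback vector, or NULL */
	strategy_fn vop_strategy;		/* NULL: not implemented here */
};

struct model_vnode {
	const struct model_vops *v_op;
};

/* Models vop_nostrategy(): log, fail the buffer, complete it. */
static int
model_nostrategy(struct model_vnode *vp, struct model_buf *bp)
{
	(void)vp;
	printf("No strategy for buffer at %p\n", (void *)bp);
	bp->b_error = EOPNOTSUPP;
	bp->b_done = 1;
	return (EOPNOTSUPP);
}

/* Models a working filesystem strategy routine. */
static int
model_live_strategy(struct model_vnode *vp, struct model_buf *bp)
{
	(void)vp;
	bp->b_error = 0;
	bp->b_done = 1;
	return (0);
}

static const struct model_vops model_default_vops = { NULL, model_nostrategy };
static const struct model_vops model_live_vops =
    { &model_default_vops, model_live_strategy };
/* Assumption: the vector of a doomed vnode has no strategy of its own. */
static const struct model_vops model_dead_vops = { &model_default_vops, NULL };

/* Walk the vector chain until some level implements strategy. */
static int
model_vop_strategy(struct model_vnode *vp, struct model_buf *bp)
{
	const struct model_vops *vop;

	for (vop = vp->v_op; vop != NULL; vop = vop->vop_default)
		if (vop->vop_strategy != NULL)
			return (vop->vop_strategy(vp, bp));
	return (EOPNOTSUPP);
}
```

Switching a model_vnode from model_live_vops to model_dead_vops makes
model_vop_strategy() return EOPNOTSUPP, which is errno 45 on FreeBSD
and so matches the "error 45" in the log above.  A caller that treats
any non-zero return as fatal would then panic, as seems to happen in
bufstrategy().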

Presumably, stopping the network means there's no longer any way for
swap operations to complete, so the swap device has become associated
with default_vnodeops (though I haven't dug into the actual code
path that does that).

Moving up a level, does it really matter if swapoff_one() is skipped?
If it actually returned an error (e.g. if the free memory test failed),
then that's what would happen.  By this point in the shutdown, there's
no userland left (which makes me wonder why there's anything left in
swap in any case) and only the final cleanups remain before the kernel
shuts down.  What's really needed is a way to detect that the relevant
swap I/O provider has gone away and return to swapoff_all() without
panicking.
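
As a rough illustration of that idea, the following userland sketch
checks whether the backing store has gone away before attempting any
page-in, and lets the caller log the failure and carry on.  All names
(model_swdev, backing_doomed, model_swapoff_one, model_swapoff_all) are
hypothetical, and ENXIO is an arbitrary choice of error; in the kernel
the test might be something like checking VIRF_DOOMED on the swap
vnode, but I have not verified where such a check would belong.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model of one swap device. */
struct model_swdev {
	const char *name;
	bool backing_doomed;	/* backing vnode/provider has gone away */
	int pages_in_use;	/* pages still resident on this device */
};

/* Refuse early instead of issuing I/O that cannot complete. */
static int
model_swapoff_one(struct model_swdev *sp)
{
	if (sp->backing_doomed)
		return (ENXIO);
	sp->pages_in_use = 0;	/* pretend every page was paged in */
	return (0);
}

/* Report failures and keep going; returns the number of devices skipped. */
static size_t
model_swapoff_all(struct model_swdev *devs, size_t ndevs)
{
	size_t i, skipped = 0;
	int error;

	for (i = 0; i < ndevs; i++) {
		error = model_swapoff_one(&devs[i]);
		if (error != 0) {
			printf("Cannot remove swap device %s (error=%d), skipping\n",
			    devs[i].name, error);
			skipped++;
		}
	}
	return (skipped);
}
```

With one healthy device and one whose backing store is doomed,
model_swapoff_all() drains the first, skips the second and returns 1
rather than aborting, which is the behaviour wanted this late in the
shutdown.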

-- 
Peter Jeremy



