Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 8 Jul 2020 12:29:55 +0200
From:      Kajetan Staszkiewicz <vegeta@tuxpowered.net>
To:        net@FreeBSD.org
Subject:   Deadlocks when using pf tags or socket owner matching
Message-ID:  <48de3ee4-ba52-f6a7-b2d7-c7d77a635eb9@tuxpowered.net>

next in thread | raw e-mail | index | archive | help
This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--6Of8gmt13SoPjymQr9LyK3pjeHp5qDQwP
Content-Type: multipart/mixed; boundary="UkvR4H5J5VCKr7C31oruLwzrE00EhNufW"

--UkvR4H5J5VCKr7C31oruLwzrE00EhNufW
Content-Type: text/plain; charset=utf-8
Content-Language: en-GB
Content-Transfer-Encoding: quoted-printable

Hello group,

I've decided to give a try to two functions of pf: tags, so that I can
match packets on outgoing side of my loadbalancer to the ones which came
in and socket ownership, so that I can match packets originating on the
loadbalancer itself (dns, puppet and such). My kernel has some
additional patches so I'm starting here before reporting this in Bugzilla=
=2E

After a few hours runnig with the new ruleset some of my systems freeze.
I've built a kernel without KDB_UNATTENDED, and NMI from IPMI card got
me the debugger console where I dumped the memory and rebooted the
system. So far so good.

Now I'm having problems understading who holds the lock which causes
trouble.

Looking at what NIC queue threads do, I find that all of threads apart
from one are locked on backtraces similar (the exact path to reach
pf_test() might be a bit different, or it's pf_test6 instead) to this:

info threads:
 85 Thread 100096 (PID=3D12: intr/irq301: ixl0:q7)  sched_switch
(td=3D0xfffff800118c6620, newtd=3D0xfffff8001513b620, flags=3D<value opti=
mized
out>) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/sched_ule.c:2103

(kgdb) thr 85
[Switching to thread 85 (Thread 100096)]#0  sched_switch
(td=3D0xfffff800118c6620, newtd=3D0xfffff8001513b620, flags=3D<value opti=
mized
out>) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/sched_ule.c:2103
2103	/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/sched_ule.c: No
such file or directory.
	in /usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/sched_ule.c

(kgdb) bt
#0  sched_switch (td=3D0xfffff800118c6620, newtd=3D0xfffff8001513b620,
flags=3D<value optimized out>) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/sched_ule.c:2103
#1  0xffffffff806719d8 in mi_switch (flags=3D259, newtd=3D0x0) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/kern_synch.c:439
#2  0xffffffff806c3741 in turnstile_wait (ts=3D0xfffff8001503f180,
owner=3D<value optimized out>, queue=3D-500) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/subr_turnstile.c:751
#3  0xffffffff80647d81 in __mtx_lock_sleep (c=3D0xffffffff8184f278,
v=3D<value optimized out>) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/kern_mutex.c:637
#4  0xffffffff80661e53 in _rm_rlock (rm=3D0xffffffff8184f220,
tracker=3D0xfffffe0c04808a38, trylock=3D0) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/kern_rmlock.c:408
#5  0xffffffff81825604 in pf_test (dir=3D2, ifp=3D0xfffff80012f9a000,
m0=3D0xfffffe0c04808cc8, inp=3D0x0) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/netpfil/pf/pf.c:6006
#6  0xffffffff81826759 in pf_test (dir=3D1, ifp=3D<value optimized out>,
m0=3D<value optimized out>, inp=3D0x0) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/netpfil/pf/pf.c:5594
#7  0xffffffff818370ed in pf_check_in (arg=3D<value optimized out>,
m=3D0xfffffe0c04808dc0, ifp=3D<value optimized out>, dir=3D<value optimiz=
ed
out>, flags=3D<value optimized out>, inp=3D<value optimized out>)
    at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/netpfil/pf/pf_ioctl.c:3916=

#8  0xffffffff8077b647 in pfil_run_hooks (ph=3D0xffffffff81335108,
mp=3D<value optimized out>, ifp=3D0xfffff80012f2c000, dir=3D1, flags=3D0,=

inp=3D0x0) at /usr/home/kajetan.staszkiewicz/freebsd.git/sys/net/pfil.c:1=
09
=2E..

My pf.c, function pf_test():

6003
6004         pd.pf_mtag =3D pf_find_mtag(m);
6005
6006         PF_RULES_RLOCK();
6007
6008         if (ip_divert_ptr !=3D NULL &&
6009             ((ipfwtag =3D m_tag_locate(m, MTAG_IPFW_RULE, 0, NULL))
!=3D NULL)) {
6010                 struct ipfw_rule_ref *rr =3D (struct ipfw_rule_ref
*)(ipfwtag+1);

All those NIC queues threads and some pfctl processes (the loadbalancer
collects stats from pf labels or updates loadbalancing tables) are
waiting for one more (I hope that's the correct way to get this
information):

(kgdb) print rm->rm_activeReaders->lh_first->rmp_thread->td_name
$27 =3D 0xfffff800118c3294 "irq296: ixl0:q2"


[9:17:44] hwlb-foe-aw-02 ~/ # grep ixl0:q2 ps.core2
  90 Thread 100091 (PID=3D12: intr/irq296: ixl0:q2)  sched_switch
(td=3D0xfffff800118c3000, newtd=3D0xfffff8000b0d7000, flags=3D<value opti=
mized
out>) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/sched_ule.c:2103


And that one is waiting on:

(kgdb) thr 90
[Switching to thread 90 (Thread 100091)]#0  sched_switch
(td=3D0xfffff800118c3000, newtd=3D0xfffff8000b0d7000, flags=3D<value opti=
mized
out>) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/sched_ule.c:2103
2103	in /usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/sched_ule.c

(kgdb) bt
#0  sched_switch (td=3D0xfffff800118c3000, newtd=3D0xfffff8000b0d7000,
flags=3D<value optimized out>) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/sched_ule.c:2103
#1  0xffffffff806719d8 in mi_switch (flags=3D259, newtd=3D0x0) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/kern_synch.c:439
#2  0xffffffff806c3741 in turnstile_wait (ts=3D0xfffff8000b9ecb40,
owner=3D<value optimized out>, queue=3D1) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/subr_turnstile.c:751
#3  0xffffffff80663511 in __rw_rlock_hard (rw=3D0xffffffff81336188,
td=3D0xfffff800118c3000, v=3D<value optimized out>) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/kern_rwlock.c:601
#4  0xffffffff80795f61 in in_pcblookup_hash (pcbinfo=3D0xffffffff81336130=
,
faddr=3D{s_addr =3D 2208380372}, fport=3D13698, laddr=3D{s_addr =3D 27357=
72508},
lport=3D17822, lookupflags=3D2)
    at /usr/home/kajetan.staszkiewicz/freebsd.git/sys/netinet/in_pcb.c:19=
75
#5  0xffffffff818251c2 in pf_socket_lookup (direction=3D<value optimized
out>, pd=3D0xfffffe0c047efa98, m=3D0xfffff80015509200) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/netpfil/pf/pf.c:3052
#6  0xffffffff8182aa84 in pf_test_rule (rm=3D0xfffffe0c047efb40,
sm=3D0xfffffe0c047efb48, direction=3D2, kif=3D0xfffff80012e4bb00, m=3D<va=
lue
optimized out>, off=3D20, pd=3D0xfffffe0c047efa98, am=3D0xfffffe0c047efb2=
8,
rsm=3D0xfffffe0c047efb18, inp=3D0x0)
    at /usr/home/kajetan.staszkiewicz/freebsd.git/sys/netpfil/pf/pf.c:348=
9
#7  0xffffffff818269e2 in pf_test (dir=3D2, ifp=3D<value optimized out>,
m0=3D<value optimized out>, inp=3D0x0) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/netpfil/pf/pf.c:6115
#8  0xffffffff81826759 in pf_test (dir=3D1, ifp=3D<value optimized out>,
m0=3D<value optimized out>, inp=3D0x0) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/netpfil/pf/pf.c:5594
#9  0xffffffff818370ed in pf_check_in (arg=3D<value optimized out>,
m=3D0xfffffe0c047efdc0, ifp=3D<value optimized out>, dir=3D<value optimiz=
ed
out>, flags=3D<value optimized out>, inp=3D<value optimized out>)
    at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/netpfil/pf/pf_ioctl.c:3916=

#10 0xffffffff8077b647 in pfil_run_hooks (ph=3D0xffffffff81335108,
mp=3D<value optimized out>, ifp=3D0xfffff80012f2c000, dir=3D1, flags=3D0,=

inp=3D0x0) at /usr/home/kajetan.staszkiewicz/freebsd.git/sys/net/pfil.c:1=
09
#11 0xffffffff8079a8dd in ip_input (m=3D0xfffff80015509200) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/netinet/ip_input.c:604
=2E..

pf.c line 3052 is pf_socket_lookup():

3049         switch (pd->af) {
3050 #ifdef INET
3051         case AF_INET:
3052                 inp =3D in_pcblookup_mbuf(pi, saddr->v4, sport,
daddr->v4,
3053                     dport, INPLOOKUP_RLOCKPCB, NULL, m);



(kgdb) f 3
#3  0xffffffff80663511 in __rw_rlock_hard (rw=3D0xffffffff81336188,
td=3D0xfffff800118c3000, v=3D<value optimized out>) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/kern_rwlock.c:601
601	/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/kern_rwlock.c:
No such file or directory.
	in /usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/kern_rwlock.c

(kgdb) print *rw
$37 =3D {lock_object =3D {lo_name =3D 0xffffffff80b8490f "pcbinfohash",
lo_flags =3D 86179840, lo_data =3D 0, lo_witness =3D 0x0}, rw_lock =3D
18446735277970142758}

(kgdb) f 2
#2  0xffffffff806c3741 in turnstile_wait (ts=3D0xfffff8000b9ecb40,
owner=3D<value optimized out>, queue=3D1) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/subr_turnstile.c:751
751
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/subr_turnstile.c: No
such file or directory.
	in /usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/subr_turnstile.c

(kgdb) print *ts
$38 =3D {ts_lock =3D {lock_object =3D {lo_name =3D 0xffffffff80b73a17 "tu=
rnstile
lock", lo_flags =3D 720896, lo_data =3D 0, lo_witness =3D 0x0}, mtx_lock =
=3D 0},
ts_blocked =3D 0xfffff8000b9ecb60, ts_pending =3D {tqh_first =3D 0x0,
    tqh_last =3D 0xfffff8000b9ecb80}, ts_hash =3D {le_next =3D 0x0, le_pr=
ev =3D
0xffffffff8127cba8}, ts_link =3D {le_next =3D 0x0, le_prev =3D
0xfffff8001513b748}, ts_free =3D {lh_first =3D 0xfffff80015273000},
ts_lockobj =3D 0xffffffff81336188,
  ts_owner =3D 0xfffff8001513b620}


(kgdb) print *ts->ts_owner
=2E..
td_name =3D 0xfffff8001513b8b4 "unbound"
=2E..

[9:32:27] hwlb-foe-aw-02 ~/ # grep '(td=3D0xfffff8001513b620' ps.core2
  175 Thread 100215 (PID=3D1308: unbound)  sched_switch
(td=3D0xfffff8001513b620, newtd=3D0xfffff8000b0d9620, flags=3D<value opti=
mized
out>) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/sched_ule.c:2103

(kgdb) thr 175
[Switching to thread 175 (Thread 100215)]#0  sched_switch
(td=3D0xfffff8001513b620, newtd=3D0xfffff8000b0d9620, flags=3D<value opti=
mized
out>) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/sched_ule.c:2103
2103	/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/sched_ule.c: No
such file or directory.
	in /usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/sched_ule.c
(kgdb) bt
#0  sched_switch (td=3D0xfffff8001513b620, newtd=3D0xfffff8000b0d9620,
flags=3D<value optimized out>) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/sched_ule.c:2103
#1  0xffffffff806719d8 in mi_switch (flags=3D259, newtd=3D0x0) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/kern_synch.c:439
#2  0xffffffff806c3741 in turnstile_wait (ts=3D0xfffff8001503f180,
owner=3D<value optimized out>, queue=3D0) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/subr_turnstile.c:751
#3  0xffffffff80647d81 in __mtx_lock_sleep (c=3D0xffffffff8184f278,
v=3D<value optimized out>) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/kern_mutex.c:637
#4  0xffffffff80661e53 in _rm_rlock (rm=3D0xffffffff8184f220,
tracker=3D0xfffffe0c04aeea48, trylock=3D0) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/kern_rmlock.c:408
#5  0xffffffff8182e935 in pf_test6 (dir=3D2, pflags=3D0,
ifp=3D0xfffff80012f2c000, m0=3D0xfffffe0c04aeec40, inp=3D0xfffff800154911=
d0)
at /usr/home/kajetan.staszkiewicz/freebsd.git/sys/netpfil/pf/pf.c:6395
#6  0xffffffff818371ed in pf_check6_out (arg=3D<value optimized out>,
m=3D0xfffffe0c04aeec40, ifp=3D<value optimized out>, dir=3D<value optimiz=
ed
out>, flags=3D<value optimized out>, inp=3D<value optimized out>)
    at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/netpfil/pf/pf_ioctl.c:3976=

#7  0xffffffff8077b647 in pfil_run_hooks (ph=3D0xffffffff81338060,
mp=3D<value optimized out>, ifp=3D0xfffff80012f2c000, dir=3D2, flags=3D0,=

inp=3D0xfffff800154911d0) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/net/pfil.c:109
#8  0xffffffff8084fdbf in ip6_output (m0=3D<value optimized out>,
opt=3D0xfffff8010cf99400, ro=3D0xfffffe0c04aeee70, flags=3D0, im6o=3D0x0,=

ifpp=3D0x0, inp=3D0xfffff800154911d0)
    at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/netinet6/ip6_output.c:835
#9  0xffffffff80868b86 in udp6_send (so=3D<value optimized out>,
flags=3D<value optimized out>, m=3D<value optimized out>, addr=3D<value
optimized out>, control=3D<value optimized out>, td=3D<value optimized ou=
t>)
    at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/netinet6/udp6_usrreq.c:897=

#10 0xffffffff806f6ef4 in sosend_dgram (so=3D0xfffff80302b2b6d0,
addr=3D<value optimized out>, uio=3D<value optimized out>, top=3D<value
optimized out>, control=3D0x0, flags=3D<value optimized out>,
td=3D0xfffff8001513b620)
    at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/uipc_socket.c:1174
#11 0xffffffff806fdf85 in kern_sendit (td=3D<value optimized out>, s=3D20=
,
mp=3D<value optimized out>, flags=3D0, control=3D0x0, segflg=3DUIO_USERSP=
ACE) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/uipc_syscalls.c:884
#12 0xffffffff806fe2de in sendit (td=3D0xfffff8001513b620, s=3D20,
mp=3D0xfffffe0c04aef1c0, flags=3D0) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/uipc_syscalls.c:804
#13 0xffffffff806fe12d in sys_sendto (td=3D<value optimized out>,
uap=3D<value optimized out>) at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/kern/uipc_syscalls.c:935
#14 0xffffffff8095bb46 in amd64_syscall (td=3D0xfffff8001513b620,
traced=3D0) at freebsd.git/sys/amd64/amd64/../../kern/subr_syscall.c:132
#15 0xffffffff80939cdd in fast_syscall_common () at
/usr/home/kajetan.staszkiewicz/freebsd.git/sys/amd64/amd64/exception.S:49=
4
#16 0x00000008017de0aa in ?? ()
Previous frame inner to this frame (corrupt stack?)


pf.c 6395 is pf_test6():

6392         if (m->m_flags & M_SKIP_FIREWALL)
6393                 return (PF_PASS);
6394
6395         PF_RULES_RLOCK();
6396
6397         /* We do IP header normalization and packet reassembly here =
*/
6398         if (pf_normalize_ip6(m0, dir, kif, &reason, &pd) !=3D PF_PAS=
S) {
6399                 action =3D PF_DROP;


So as far as I understand:
- most NIC queues want to access "pf rulesets" lock
- which is locked held by one more NIC queue which tries to lock
  "pcbinfohash"
- which is held by unbound process
- which tries to lock "pf rulesets"

--=20
| pozdrawiam / greetings | Powered by macOS, Debian and FreeBSD |
|  Kajetan Staszkiewicz  |  www: http://vegeta.tuxpowered.net   |
`------------------------^--------------------------------------'


--UkvR4H5J5VCKr7C31oruLwzrE00EhNufW--

--6Of8gmt13SoPjymQr9LyK3pjeHp5qDQwP
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----

iF0EARECAB0WIQSOEQZObv2B8mf0JbnjtFCvbXs6FAUCXwWgJAAKCRDjtFCvbXs6
FIneAJ9ZUQR3DNic3s+Ztfh2Extfv+FUSQCeP6cCCPx/9HailKuFN8ywSb3Bzsw=
=nVvL
-----END PGP SIGNATURE-----

--6Of8gmt13SoPjymQr9LyK3pjeHp5qDQwP--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?48de3ee4-ba52-f6a7-b2d7-c7d77a635eb9>