Date:      Mon, 07 Aug 2023 22:09:06 +0000
From:      bugzilla-noreply@freebsd.org
To:        virtualization@FreeBSD.org
Subject:   [Bug 263062] tcp_inpcb leaking in VM environment
Message-ID:  <bug-263062-27103-ytNl4ubZ7w@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-263062-27103@https.bugs.freebsd.org/bugzilla/>
References:  <bug-263062-27103@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263062

--- Comment #6 from Max Khon <fjoe@FreeBSD.org> ---
I can confirm that switching the Hetzner VM to i440fx (rescale to an Intel plan,
then ask Hetzner support to switch to i440fx, as some Intel VMs are also
provisioned with the Q35 chipset) solves the issue (on the same 13.2-RELEASE
kernel):

--- cut here ---
ITEM                   SIZE   LIMIT     USED     FREE      REQ FAIL SLEEP XDOMAIN
udp_inpcb:              496, 510927,      12,    1516,   70257,   0,    0,      0
tcp_inpcb:              496, 510927,     649,    1383,  111267,   0,    0,      0
udplite_inpcb:          496, 510927,       0,       0,       0,   0,    0,      0
--- cut here ---

The difference between i440fx and Q35 is that the latter provides "modern"
virtio devices, while i440fx provides "legacy" virtio devices.
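
One way to check which flavour a guest was actually given is the PCI device ID
of its virtio devices. A minimal sketch follows, assuming the ID ranges from the
virtio 1.x specification (vendor 0x1af4; 0x1000-0x103f for legacy/transitional
devices, 0x1040-0x107f for modern-only devices) and assuming "pciconf -l" prints
"vendor=0x... device=0x..." tokens. The function names are hypothetical, and the
device ID only shows what the hypervisor offers, not which driver path the guest
selected.

```python
import re

VIRTIO_VENDOR = 0x1AF4  # PCI vendor ID used by virtio devices


def classify_virtio(vendor: int, device: int) -> str:
    """Classify a PCI ID pair using the virtio 1.x device ID ranges."""
    if vendor != VIRTIO_VENDOR:
        return "not-virtio"
    if 0x1000 <= device <= 0x103F:
        return "legacy/transitional"
    if 0x1040 <= device <= 0x107F:
        return "modern"
    return "unknown"


# Assumed token layout: "... vendor=0x1af4 device=0x1041 subvendor=..."
_ID_RE = re.compile(r"\bvendor=(0x[0-9a-fA-F]+)\s+device=(0x[0-9a-fA-F]+)")


def classify_pciconf_line(line: str) -> str:
    """Classify one line of pciconf-style output; 'unparsed' if no IDs found."""
    m = _ID_RE.search(line)
    if m is None:
        return "unparsed"
    return classify_virtio(int(m.group(1), 16), int(m.group(2), 16))
```

Under these assumptions, a virtio network device with ID 0x1000 would be
classified as "legacy/transitional" and one with ID 0x1041 as "modern".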

I suspect the problem is somewhere in the "modern" virtqueue or modern vtnet
implementation (which was added in FreeBSD 13). FreeBSD 12 does not even
boot on the Q35 chipset because it lacks "modern" support.

I would suggest not MFCing any "modern" virtio changes until this issue is
fixed.

On a side note: I have reproduced this issue with the Q35 chipset ("modern"
virtio) on a plain 13.2-RELEASE in a Hetzner Q35 VM (any AMD plan) with just
nginx serving static content (the default nginx page) and running "ab -c 100 -n
1000000000 http://x.y.z.w/" in a loop:

--- cut here ---
ITEM                   SIZE   LIMIT     USED     FREE      REQ FAIL SLEEP XDOMAIN
udp_inpcb:              496, 126863,   12187,     261,   12475,   0,    0,      0
tcp_inpcb:              496, 126863,   29245,     219, 1204697,   0,    0,      0
udplite_inpcb:          496, 126863,       0,       0,       0,   0,    0,      0
--- cut here ---
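
To watch the USED counter while the benchmark runs, the zone lines can be parsed
and compared between samples. This is a minimal sketch, not part of the original
report: it assumes the "name: size, limit, used, free, req, ..." line layout
shown in the listings here, the helper names are hypothetical, and SAMPLE is the
tcp_inpcb line from the Q35 listing with its wrapped line rejoined.

```python
def parse_vmstat_z(text: str) -> dict:
    """Parse zone lines of the form 'name: size, limit, used, free, req, ...'."""
    zones = {}
    for line in text.splitlines():
        # Split on the last colon; the numeric fields never contain one.
        name, sep, rest = line.rpartition(":")
        if not sep:
            continue  # header or blank line
        fields = [f.strip() for f in rest.split(",")]
        if len(fields) < 5 or not all(f.isdigit() for f in fields[:5]):
            continue  # not a zone line in the assumed layout
        size, limit, used, free, req = (int(f) for f in fields[:5])
        zones[name.strip()] = {
            "size": size, "limit": limit, "used": used, "free": free, "req": req,
        }
    return zones


def used_delta(before: dict, after: dict, zone: str) -> int:
    """Change in the USED counter of one zone between two parsed samples."""
    return after[zone]["used"] - before[zone]["used"]


SAMPLE = "tcp_inpcb:              496, 126863,   29245,     219, 1204697,   0,   0,   0"

if __name__ == "__main__":
    zones = parse_vmstat_z(SAMPLE)
    print(zones["tcp_inpcb"]["used"], "of", zones["tcp_inpcb"]["limit"])
```

Taking two samples some minutes apart and calling used_delta() on "tcp_inpcb"
gives a rough leak rate. A value that keeps rising after the load has stopped
would match the behaviour described here.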

Also, I noticed that the nginx process becomes unkillable (even with SIGKILL),
and the "ps axl | grep nginx" output is as follows:

--- cut here ---
   0  848    1 1  20  0 20024  7624 pause    Is    -     0:00.00 nginx: master process /usr/local/sbin/nginx
  80  898  848 0  33  0 20024  8480 -        R     -     1:44.88 nginx: worker process (nginx)
--- cut here ---

Notice that the nginx worker process does not have an MWCHAN. Also, trying to
run ktrace/truss on the nginx process, or attaching gdb to it, just hangs.
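
The same check can be scripted when watching several workers. The sketch below
is only a heuristic and not part of the original report: it assumes the 13-column
"ps axl" layout seen above (UID PID PPID C PRI NI VSZ RSS MWCHAN STAT TT TIME
COMMAND), the helper names are hypothetical, and a runnable process normally has
no wait channel, so a match only becomes suspicious together with an ignored
SIGKILL.

```python
def parse_ps_axl(text: str) -> list:
    """Parse 'ps axl' lines: UID PID PPID C PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 12)  # COMMAND keeps its embedded spaces
        if len(parts) < 13 or not parts[1].isdigit():
            continue  # header, blank, or truncated line
        rows.append({
            "pid": int(parts[1]),
            "ppid": int(parts[2]),
            "mwchan": parts[8],
            "stat": parts[9],
            "time": parts[11],
            "command": parts[12],
        })
    return rows


def runnable_without_wchan(rows: list) -> list:
    """Rows in state R with no wait channel (printed as '-')."""
    return [r for r in rows if r["mwchan"] == "-" and r["stat"].startswith("R")]


SAMPLE = """\
   0  848    1 1  20  0 20024  7624 pause    Is    -     0:00.00 nginx: master process /usr/local/sbin/nginx
  80  898  848 0  33  0 20024  8480 -        R     -     1:44.88 nginx: worker process (nginx)
"""

if __name__ == "__main__":
    for row in runnable_without_wchan(parse_ps_axl(SAMPLE)):
        print(row["pid"], row["time"], row["command"])
```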

Additionally, adding a simple Django application (just the default empty Django
application, run as "manage.py runserver") behind nginx increases the
probability of an inpcb leak (the USED counters grow faster). I use simple
reverse proxying like this:

--- cut here ---
        location / {
            proxy_pass http://localhost:8000;
        }
--- cut here ---

-- 
You are receiving this mail because:
You are the assignee for the bug.


