Date:      Sat, 10 Jul 2010 12:33:10 -0700
From:      Ali Mashtizadeh <mashtizadeh@gmail.com>
To:        Maxim Sobolev <sobomax@freebsd.org>
Cc:        freebsd-net@freebsd.org, Jack Vogel <jfvogel@gmail.com>, FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject:   Re: Sudden mbuf demand increase and shortage under the load
Message-ID:  <AANLkTimpaVuM1ScQCGN-y1r9ZHkx8gf93Vu2pAmq9CRx@mail.gmail.com>
In-Reply-To: <4B79297D.9080403@FreeBSD.org>
References:  <4B79297D.9080403@FreeBSD.org>

Hi Maxim,

I experienced the same issue recently on the 8-STABLE branch, and it
seems to have been fixed as of 8.1-RC2. I couldn't track down the root
cause in the code, nor could I find a commit that looks like the obvious
fix.

Thanks,
~ Ali

2010/2/15 Maxim Sobolev <sobomax@freebsd.org>:
> Hi,
>
> Our company has a FreeBSD-based product that consists of numerous
> interconnected processes and does some high-PPS UDP processing (30-50K
> PPS is not uncommon). We are seeing strange periodic failures under load
> on several such systems, which usually manifest as IPC (even over unix
> domain sockets) suddenly either breaking down or stalling, recovering
> only some time later (5-10 minutes). The only sign of failure I have
> managed to find is an increase of "requests for mbufs denied" in
> netstat -m, with the total number of mbuf clusters rising up to the
> nmbclusters limit.
>
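For reference, the counters mentioned above can be watched while an
incident is in progress; a rough sketch along these lines (the interval
and the log path are only illustrative, and the exact zone names shown by
vmstat -z vary slightly between releases):

    # Poll the mbuf statistics so the onset of a shortage can be lined up
    # with the IPC stalls. The "denied" counters are cumulative since boot,
    # so what matters is whether they grow between samples.
    while true; do
        date
        netstat -m | grep -E 'denied|mbuf clusters in use'
        vmstat -z | grep -E 'ITEM|mbuf'
        sleep 5
    done >> /var/tmp/mbuf-watch.log
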
> I have tried raising some network-related limits (most notably maxusers
> and nmbclusters), but it has not helped; the issue still happens from
> time to time. Below you can find netstat -m output taken a few minutes
> after such a shortage period - you can see that the system has somehow
> allocated a huge amount of memory for the network (about 700MB), with
> only a tiny fraction of it actually in use. This is with
> kern.ipc.nmbclusters set to 302400. Eventually the system reclaims all
> that memory and goes back to its normal use of 30-70MB.
>
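For completeness, both limits are normally raised via /boot/loader.conf;
a minimal sketch (the values below are only examples, not recommendations;
kern.maxusers only takes effect at boot, and whether nmbclusters can also
be changed at run time depends on the release):

    # /boot/loader.conf
    kern.ipc.nmbclusters=302400
    kern.maxusers=1024

The values currently in effect can be checked with:

    $ sysctl kern.ipc.nmbclusters kern.maxusers
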
> This problem is killing us, so any suggestions are greatly appreciated.
> My current hypothesis is that, due to some issue either in the network
> driver or in the network subsystem itself, the system goes haywire and
> "eats" mbufs all the way up to the nmbclusters limit. And since mbufs
> are shared between the network stack and local IPC, IPC goes down as
> well.
>
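Since unix domain socket data travels through the same mbufs, one way to
test that hypothesis is to snapshot the local sockets alongside the mbuf
statistics during an incident; a rough sketch (the log path is only
illustrative):

    # Local-socket queues backing up at the same moment the "denied"
    # counters grow would support the shared-mbuf-pool theory.
    ( date; netstat -m | grep denied; netstat -f unix ) >> /var/tmp/ipc-watch.log
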
> We observe this issue on systems using both the em(4) and igb(4)
> drivers. I believe both drivers share the same design, but I am not sure
> whether this is some kind of design flaw in the driver or part of a
> larger problem in the network subsystem.
>
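In case it helps to narrow down whether the driver is involved, the
per-interface drop counters and whatever statistics the driver exports
can be collected at the same time; a rough sketch (em0 and the unit
number are just examples, and the statistics exposed vary by driver
version):

    # Interface-level errors and drops, plus the driver's sysctl subtree.
    netstat -I em0 -d
    sysctl dev.em.0
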
> This happens on amd64 7.2-RELEASE and 7.3-PRERELEASE alike, with 8GB of
> memory. I have not tried upgrading to 8.0; this is a production system,
> so upgrading will not be easy. I don't believe there are any differences
> that give us hope this problem will go away after an upgrade, but I can
> try it as a last resort.
>
> As I said, this is a very critical issue, so I can provide any
> additional debug information upon request. We are even ready to pay
> somebody a reasonable amount of money to track down and resolve the
> issue.
>
> Regards,
> --
> Maksym Sobolyev
> Sippy Software, Inc.
> Internet Telephony (VoIP) Experts
> T/F: +1-646-651-1110
> Web: http://www.sippysoft.com
> MSN: sales@sippysoft.com
> Skype: SippySoft
>
>
> [ssp-root@ds-467 /usr/src]$ netstat -m
> 17061/417669/434730 mbufs in use (current/cache/total)
> 10420/291980/302400/302400 mbuf clusters in use (current/cache/total/max)
> 10420/0 mbuf+clusters out of packet secondary zone in use (current/cache)
> 19/1262/1281/51200 4k (page size) jumbo clusters in use
> (current/cache/total/max)
> 0/0/0/25600 9k jumbo clusters in use (current/cache/total/max)
> 0/0/0/12800 16k jumbo clusters in use (current/cache/total/max)
> 25181K/693425K/718606K bytes allocated to network (current/cache/total)
> 1246681/129567494/67681640 requests for mbufs denied
> (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> 0/0/0 sfbufs in use (current/peak/max)
> 0 requests for sfbufs denied
> 0 requests for sfbufs delayed
> 0 requests for I/O initiated by sendfile
> 0 calls to protocol drain routines
>
> [FEW MINUTES LATER]
>
> [ssp-root@ds-467 /usr/src]$ netstat -m
> 10001/84574/94575 mbufs in use (current/cache/total)
> 6899/6931/13830/302400 mbuf clusters in use (current/cache/total/max)
> 6899/6267 mbuf+clusters out of packet secondary zone in use (current/cache)
> 2/1151/1153/51200 4k (page size) jumbo clusters in use
> (current/cache/total/max)
> 0/0/0/25600 9k jumbo clusters in use (current/cache/total/max)
> 0/0/0/12800 16k jumbo clusters in use (current/cache/total/max)
> 16306K/39609K/55915K bytes allocated to network (current/cache/total)
> 1246681/129567494/67681640 requests for mbufs denied
> (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> 0/0/0 sfbufs in use (current/peak/max)
> 0 requests for sfbufs denied
> 0 requests for sfbufs delayed
> 0 requests for I/O initiated by sendfile
> 0 calls to protocol drain routines
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>



--
Ali Mashtizadeh
علی مشتی زاده


