Date:      Wed, 12 Feb 2014 23:49:56 -0500
From:      Garrett Wollman <wollman@bimajority.org>
To:        John Baldwin <jhb@freebsd.org>
Cc:        FreeBSD Net <freebsd-net@freebsd.org>
Subject:   Re: Use of contiguous physical memory in cxgbe driver
Message-ID:  <21244.20212.423983.960018@hergotha.csail.mit.edu>
In-Reply-To: <201402121446.19278.jhb@freebsd.org>
References:  <21216.22944.314697.179039@hergotha.csail.mit.edu> <201402111348.52135.jhb@freebsd.org> <CAJ-VmonCdNQPUCQwm0OhqQ3Kt_7x6-g-JwGVZQfzWTgrDYfmqw@mail.gmail.com> <201402121446.19278.jhb@freebsd.org>

<<On Wed, 12 Feb 2014 14:46:19 -0500, John Baldwin <jhb@freebsd.org> said:

> Is this because UMA keeps lots of mbufs cached in your workload?
> The physmem buddy allocator certainly seeks to minimize
> fragmentation.  However, it can't go yank memory out of UMA caches
> to do so.

It's not just UMA caches: there are TCP queues, interface queues, the
NFS request "cache", and elsewhere.  I first discovered this problem
in the NFS context: what happens is that you build up very large TCP
send buffers (NFS forces the socket buffers to 2M) for many clients
(easy if the server has a dedicated 10G link and the clients are all on
shared 1G links).  The NIC is eventually unable to replenish its receive
ring, and everything just stops.  Eventually, the TCP connections time
out, the buffers are freed, and the server mysteriously starts working
again.  (Actually, the last bit never happens in production.  It's
more like: Eventually, the users start filing trouble tickets, then
Nagios starts paging the sysadmins, then someone does a hard reset
because that's the fastest way to recover.  And then they blame me.)
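
To put numbers on it, here's a minimal back-of-the-envelope sketch
(illustrative values, not measurements: the client count and the
nmbclusters limit below are made up) of how quickly full 2 MB send
buffers pin down the 2 KB mbuf cluster zone:

/*
 * Worst case: every client's TCP send buffer fills to the 2 MB that
 * NFS reserves, with each buffer held entirely in 2 KB mbuf clusters.
 * The client count and nmbclusters value are hypothetical examples.
 */
#include <stdio.h>

#define	MCLBYTES	2048			/* standard mbuf cluster size */
#define	SNDBUF		(2L * 1024 * 1024)	/* 2 MB per-socket send buffer */

int
main(void)
{
	long clients = 500;		/* hypothetical client population */
	long nmbclusters = 262144;	/* hypothetical kern.ipc.nmbclusters */
	long pinned = clients * (SNDBUF / MCLBYTES);

	printf("clusters pinned in full send buffers: %ld\n", pinned);
	printf("fraction of the cluster zone: %.0f%%\n",
	    100.0 * pinned / nmbclusters);
	return (0);
}

With those made-up numbers that's 512,000 clusters wanted against a
262,144-cluster zone, so the zone is exhausted long before every
buffer fills, and the NIC has nothing left to refill its receive ring
with.  You can watch the cluster counts climb with netstat -m.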

-GAWollman



