Date: Wed, 12 Feb 2014 23:49:56 -0500
From: Garrett Wollman <wollman@bimajority.org>
To: John Baldwin <jhb@freebsd.org>
Cc: FreeBSD Net <freebsd-net@freebsd.org>
Subject: Re: Use of contiguous physical memory in cxgbe driver
Message-ID: <21244.20212.423983.960018@hergotha.csail.mit.edu>
In-Reply-To: <201402121446.19278.jhb@freebsd.org>
References: <21216.22944.314697.179039@hergotha.csail.mit.edu> <201402111348.52135.jhb@freebsd.org> <CAJ-VmonCdNQPUCQwm0OhqQ3Kt_7x6-g-JwGVZQfzWTgrDYfmqw@mail.gmail.com> <201402121446.19278.jhb@freebsd.org>
<<On Wed, 12 Feb 2014 14:46:19 -0500, John Baldwin <jhb@freebsd.org> said:

> Is this because UMA keeps lots of mbufs cached in your workload?
> The physmem buddy allocator certainly seeks to minimize
> fragmentation.  However, it can't go yank memory out of UMA caches
> to do so.

It's not just UMA caches: there are TCP queues, interface queues, the
NFS request "cache", and elsewhere.  I first discovered this problem
in the NFS context: what happens is that you build up very large TCP
send buffers (NFS forces the socket buffers to 2M) for many clients
(easy if the server is dedicated 10G and the clients are all on
shared 1G links).  The NIC is eventually unable to replenish its
receive ring, and everything just stops.  Eventually, the TCP
connections time out, the buffers are freed, and the server
mysteriously starts working again.

(Actually, the last bit never happens in production.  It's more like:
Eventually, the users start filing trouble tickets, then Nagios
starts paging the sysadmins, then someone does a hard reset because
that's the fastest way to recover.  And then they blame me.)

-GAWollman
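To make the failure mode concrete, here is a minimal sketch of a
receive-ring refill path on FreeBSD; the helper and its name are
hypothetical, not taken from the cxgbe driver.  m_getjcl() with
MJUM9BYTES asks for a 9K cluster backed by physically contiguous
pages, so under the fragmentation described above it can return NULL
even when plenty of scattered 4K pages remain free; a page-sized
cluster cannot fail for that reason.

    #include <sys/param.h>
    #include <sys/errno.h>
    #include <sys/mbuf.h>

    /*
     * Hypothetical refill helper (illustration only).  Try a 9K
     * jumbo cluster first, which needs physically contiguous pages;
     * if the free lists are too fragmented for that, fall back to a
     * single-page cluster, which any free page can satisfy.
     */
    static int
    refill_rx_slot(struct mbuf **slot)
    {
            struct mbuf *m;

            m = m_getjcl(M_NOWAIT, MT_DATA, M_PKTHDR, MJUM9BYTES);
            if (m == NULL)
                    m = m_getjcl(M_NOWAIT, MT_DATA, M_PKTHDR,
                        MJUMPAGESIZE);
            if (m == NULL)
                    return (ENOBUFS);  /* ring runs dry; RX stalls */
            *slot = m;
            return (0);
    }

In the scenario described above there is no fallback pressure-relief:
the contiguous allocations keep failing while the fragmenting memory
sits pinned in socket buffers and caches, so the ring stays empty
until those buffers are eventually freed.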