From owner-freebsd-net@FreeBSD.ORG Thu Feb 13 04:50:00 2014
Date: Wed, 12 Feb 2014 23:49:56 -0500
From: Garrett Wollman <wollman@hergotha.csail.mit.edu>
To: John Baldwin
Cc: FreeBSD Net <freebsd-net@freebsd.org>
Subject: Re: Use of contiguous physical memory in cxgbe driver
Message-ID: <21244.20212.423983.960018@hergotha.csail.mit.edu>
In-Reply-To: <201402121446.19278.jhb@freebsd.org>
References: <21216.22944.314697.179039@hergotha.csail.mit.edu> <201402111348.52135.jhb@freebsd.org> <201402121446.19278.jhb@freebsd.org>
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net@freebsd.org>
<201402121446.19278.jhb@freebsd.org> said:

> Is this because UMA keeps lots of mbufs cached in your workload?
> The physmem buddy allocator certainly seeks to minimize
> fragmentation.  However, it can't go yank memory out of UMA caches
> to do so.

It's not just UMA caches: there are also TCP queues, interface queues,
the NFS request "cache", and elsewhere.  I first discovered this
problem in the NFS context.  What happens is that very large TCP send
buffers build up for many clients (NFS forces the socket buffers to
2 MB), which is easy if the server has a dedicated 10G link and the
clients are all on shared 1G links.  The NIC eventually becomes unable
to replenish its receive ring, and everything just stops.  Eventually,
the TCP connections time out, the buffers are freed, and the server
mysteriously starts working again.

(Actually, that last bit never happens in production.  It's more like:
eventually, the users start filing trouble tickets, then Nagios starts
paging the sysadmins, then someone does a hard reset because that's
the fastest way to recover.  And then they blame me.)

-GAWollman