From owner-freebsd-stable@FreeBSD.ORG Mon Dec 17 10:59:19 2007 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C50E916A420; Mon, 17 Dec 2007 10:59:19 +0000 (UTC) (envelope-from dg@dglawrence.com) Received: from dglawrence.com (static-72-90-113-2.ptldor.fios.verizon.net [72.90.113.2]) by mx1.freebsd.org (Postfix) with ESMTP id B390513C45A; Mon, 17 Dec 2007 10:59:19 +0000 (UTC) (envelope-from dg@dglawrence.com) Received: from tnn.dglawrence.com (localhost [127.0.0.1]) by dglawrence.com (8.14.1/8.14.1) with ESMTP id lBHAOY1Y046966; Mon, 17 Dec 2007 02:24:34 -0800 (PST) (envelope-from dg@dglawrence.com) Received: (from dg@localhost) by tnn.dglawrence.com (8.14.1/8.14.1/Submit) id lBHAOX2L046965; Mon, 17 Dec 2007 02:24:33 -0800 (PST) (envelope-from dg@dglawrence.com) X-Authentication-Warning: tnn.dglawrence.com: dg set sender to dg@dglawrence.com using -f Date: Mon, 17 Dec 2007 02:24:33 -0800 From: David G Lawrence To: Mark Fullmer Message-ID: <20071217102433.GQ25053@tnn.dglawrence.com> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-3.0 (dglawrence.com [127.0.0.1]); Mon, 17 Dec 2007 02:24:34 -0800 (PST) Cc: freebsd-net@freebsd.org, freebsd-stable@freebsd.org Subject: Re: Packet loss every 30.999 seconds X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 17 Dec 2007 10:59:19 -0000 > While trying to diagnose a packet loss problem in a RELENG_6 snapshot > dated > November 8, 2007 it looks like I've stumbled across a broken driver or > kernel routine which stops interrupt processing long enough to severly > degrade network performance every 30.99 seconds. I noticed this as well some time ago. The problem has to do with the processing (syncing) of vnodes. When the total number of allocated vnodes in the system grows to tens of thousands, the ~31 second periodic sync process takes a long time to run. Try this patch and let people know if it helps your problem. It will periodically wait for one tick (1ms) every 500 vnodes of processing, which will allow other things to run. Index: ufs/ffs/ffs_vfsops.c =================================================================== RCS file: /home/ncvs/src/sys/ufs/ffs/ffs_vfsops.c,v retrieving revision 1.290.2.16 diff -c -r1.290.2.16 ffs_vfsops.c *** ufs/ffs/ffs_vfsops.c 9 Oct 2006 19:47:17 -0000 1.290.2.16 --- ufs/ffs/ffs_vfsops.c 25 Apr 2007 01:58:15 -0000 *************** *** 1109,1114 **** --- 1109,1115 ---- int softdep_deps; int softdep_accdeps; struct bufobj *bo; + int flushed_count = 0; fs = ump->um_fs; if (fs->fs_fmod != 0 && fs->fs_ronly != 0) { /* XXX */ *************** *** 1174,1179 **** --- 1175,1184 ---- allerror = error; vput(vp); MNT_ILOCK(mp); + if (flushed_count++ > 500) { + flushed_count = 0; + msleep(&flushed_count, MNT_MTX(mp), PZERO, "syncw", 1); + } } MNT_IUNLOCK(mp); /* -DG David G. Lawrence President Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500 The FreeBSD Project - http://www.freebsd.org Pave the road of life with opportunities.