From owner-freebsd-performance@FreeBSD.ORG  Wed Apr 28 14:27:07 2004
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id A09DF16A4D5
	for <freebsd-performance@freebsd.org>;
	Wed, 28 Apr 2004 14:27:07 -0700 (PDT)
Received: from svaha.com (svaha.com [38.113.6.50])
	by mx1.FreeBSD.org (Postfix) with ESMTP id ECFE243D2F
	for <freebsd-performance@freebsd.org>;
	Wed, 28 Apr 2004 14:27:06 -0700 (PDT)
	(envelope-from meconlen@obfuscated.net)
Received: from [10.140.1.78] (noc.neutelligent.com [64.156.25.3])
  (AUTH: LOGIN meconlen)
  by svaha.com with esmtp; Wed, 28 Apr 2004 17:27:05 -0400
Mime-Version: 1.0 (Apple Message framework v613)
Content-Transfer-Encoding: 7bit
Message-Id: <CB2CDF3D-995A-11D8-A291-00039367611E@obfuscated.net>
Content-Type: text/plain; charset=US-ASCII; format=flowed
To: freebsd-performance@freebsd.org
From: Michael Conlen <meconlen@obfuscated.net>
Date: Wed, 28 Apr 2004 17:27:04 -0400
X-Mailer: Apple Mail (2.613)
Subject: NFS Server
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Performance/tuning <freebsd-performance.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-performance>
List-Post: <mailto:freebsd-performance@freebsd.org>
List-Help: <mailto:freebsd-performance-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 28 Apr 2004 21:27:07 -0000

I've got an NFS server that's under fairly heavy load. It holds the
web pages, images and videos for a cluster of servers doing about 40
Mbit/sec of traffic (and 160 requests/second). The NFS server has been
averaging about 40 Mbit/sec in and about 10 Mbit/sec out per day for
over 45 days, and everything runs well.

Today I noticed that at *exactly* midnight the interrupt time went
through the roof on the system (from 5% to 20%). I checked the system
and noticed that it's actually going to the disks a lot: 2-7 MB/sec of
disk usage in systat -vmstat. My first thought was that something had
the inactive pages hosed, so I made a 2 GB file
(dd if=/dev/zero of=foo bs=1024k count=2048), removed it and ran
sync; sync; sync. Just like magic the inactive page count vaporized, as
expected. The disk usage, however, is the same as it was when there
were 1.6 GB of inactive pages. After running for about half an hour the
system still doesn't have much in the way of inactive pages. I've
included systat -vmstat output below, though it's difficult to read.
The main thing is that there's only about 3500 KB of inactive pages on
a system doing 2-7 MB/sec of disk activity, mostly read operations
(despite the network traffic, which I think is due to caching).

Now, the whole system performs like magic right now, so I'm not too
worried about it, at least until I put another 80 Mbit/sec of web
traffic and another 100 GB of files on the system. At that point I plan
to go to 4 GB of memory, on the idea that extra memory available for
inactive pages means less disk I/O than there would otherwise be, but
today's activity has me puzzled.

The only thing in the whole system that might cause this is the backup 
process which kicks off at... ...midnight! The catch is that it's been 
kicking off every midnight for weeks and it's never affected the CPU. 
The current backup process is (don't shoot me, please) that I mount the 
filesystems on another server and rsync them on that server to local 
filesystems. The process ran and finished as normal. The backup server 
has since been rebooted (to address other needs) and is fine.

Any thoughts as to why I've lost my inactive pages and have gone 
straight to disk for all operations?

Having written all this, the page counts are still:

Mem: 16M Active, 3496K Inact, 270M Wired, 92K Cache, 199M Buf, 1719M Free
Swap: 4079M Total, 48K Used, 4079M Free
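
As a rough cross-check of the memory line above, the categories can be
converted to bytes and summed. This is only a minimal Python sketch: the
function name parse_top_mem is made up for illustration, and leaving
Buf out of the total rests on the assumption that buffer space is
already counted under another category (it is not stated in the
message).

```python
import re

# Multipliers for the size suffixes used in the memory line
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

def parse_top_mem(line):
    """Return {label: bytes} for a top(1)-style 'Mem:' line (hypothetical helper)."""
    fields = {}
    for size, unit, label in re.findall(r"(\d+)([KMG])\s+(\w+)", line):
        fields[label] = int(size) * UNITS[unit]
    return fields

mem = parse_top_mem(
    "Mem: 16M Active, 3496K Inact, 270M Wired, 92K Cache, 199M Buf, 1719M Free"
)

# Assumption: Buf overlaps another category, so it is excluded from the sum
accounted = sum(size for label, size in mem.items() if label != "Buf")

print(f"accounted for: {accounted / 1024 ** 2:.1f} MB")
print(f"inactive share: {mem['Inact'] / accounted:.2%}")
print(f"free share: {mem['Free'] / accounted:.1%}")
```

Under that assumption the categories add up to roughly 2 GB, with well
over four fifths of it free and only a fraction of a percent inactive.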


What follows is systat -vmstat output


     4 users    Load  1.46  1.28  1.16                  Apr 28 17:14

Mem:KB    REAL            VIRTUAL                     VN PAGER  SWAP PAGER
         Tot   Share      Tot    Share    Free         in  out     in  out
Act    6096    2652    16900     3956 1790332 count
All  266596    3884  2431192     8064         pages
                                                                  Interrupts
Proc:r  p  d  s  w    Csw  Trp  Sys  Int  Sof  Flt        cow    8617 total
     15       11      2640    6  101 8617   42    7 246924 wire   8389 mux irq11
                                                     16236 act         ata1 irq15
22.5%Sys  23.2%Intr  0.0%User  0.0%Nice 54.2%Idl     3344 inact       fdc0 irq6
|    |    |    |    |    |    |    |    |    |         92 cache       atkbd0 irq
===========++++++++++++                           1790240 free        ppc0 irq7
                                                           daefr   100 clk irq0
Namei         Name-cache    Dir-cache                     prcfr   128 rtc irq8
     Calls     hits    %     hits    %                     react
      1050     1050  100                                   pdwake
                                           zfod            pdpgs
Disks aacd0  acd0   md0                   ofod            intrn
KB/t  16.26  0.00  0.00                   %slo-z   204096 buf
tps     467     0     0              1734 tfree       219 dirtybuf
MB/s   7.41  0.00  0.00                            134716 desiredvnodes
% busy   15     0     0                            121459 numvnodes
                                                    118582 freevnodes
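
As a quick sanity check on the systat figures above, the disk
throughput should be roughly KB/t times tps, and the per-device
interrupt rates should add up to the reported total. A minimal Python
sketch follows; the names are illustrative only, the numbers are copied
from the output above, and the interrupt figures are assumed to be
per-second rates.

```python
def disk_mb_per_s(kb_per_transfer, transfers_per_s):
    """Approximate throughput implied by average transfer size and rate."""
    return kb_per_transfer * transfers_per_s / 1024.0

# aacd0 column: KB/t = 16.26, tps = 467
aacd0_mb_s = disk_mb_per_s(16.26, 467)

# Non-zero interrupt rates from the right-hand column (assumed per second)
irq_rates = {"mux irq11": 8389, "clk irq0": 100, "rtc irq8": 128}
irq_total = sum(irq_rates.values())
irq11_share = irq_rates["mux irq11"] / irq_total

print(f"aacd0: {aacd0_mb_s:.2f} MB/s")
print(f"interrupts: {irq_total}/s, irq11 share {irq11_share:.1%}")
```

The computed throughput agrees with the reported 7.41 MB/s to within
rounding, and the three non-zero interrupt sources sum to the 8617
total, nearly all of it on the multiplexed irq11.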


From owner-freebsd-performance@FreeBSD.ORG  Sat May  1 05:08:25 2004
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id E2A8216A4D3
	for <freebsd-performance@freebsd.org>;
	Sat,  1 May 2004 05:08:25 -0700 (PDT)
Received: from rhid.com (rhid.com [200.46.204.134])
	by mx1.FreeBSD.org (Postfix) with ESMTP id EDFFF43D4C
	for <freebsd-performance@freebsd.org>;
	Sat,  1 May 2004 05:08:24 -0700 (PDT)	(envelope-from flaw@rhid.com)
Received: from void.rhid.com (rhid.com [200.46.204.134])
	by rhid.com (Postfix) with ESMTP
	id 18660720E86; Sat,  1 May 2004 12:08:18 +0000 (GMT)
Received: by void.rhid.com (Postfix, from userid 1000)
	id 2141A2C900; Sat,  1 May 2004 05:08:24 -0700 (MST)
Date: Sat, 1 May 2004 05:08:24 -0700
From: James William Pye <flaw@rhid.com>
To: Tim Traver <tt-list@simplenet.com>
Message-ID: <20040501120823.GB510@void.ph.cox.net>
References: <6.0.1.1.0.20040330113631.01ef7ec0@mail1.simplenet.com>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="cNdxnHkX5QqsyA0e"
Content-Disposition: inline
In-Reply-To: <6.0.1.1.0.20040330113631.01ef7ec0@mail1.simplenet.com>
Organization: rhid development
User-Agent: Mutt/1.5.5.1i
cc: freebsd-performance@freebsd.org
Subject: Re: shmem release
X-List-Received-Date: Sat, 01 May 2004 12:08:26 -0000


--cNdxnHkX5QqsyA0e
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

A little late with the reply, but I've had (iirc) a similar issue before,
and again not more than a few days ago.

The problem seems to be with world being somehow out of sync with
the kernel; I frequently install a newer kernel without touching world,
and this issue has shown its ugly little head to me twice now.

For me, just make'ing buildworld installworld squashes it.

On 03/30/04:13/2, Tim Traver wrote:
> shmget() failed:  No space left on device

--
Regards,
        James William Pye

--cNdxnHkX5QqsyA0e
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (FreeBSD)

iQEVAwUBQJOTN6ZpiPNPvu8yAQL2DQf/QF6GAyw30AMyKjBcW8fuCwcP6K2/awM+
dU4/hXCywZ2FJReVzDT27o2M3Njq0flD1bw0r4RkYPtiblSdSOQHrDzriDUTSKbZ
BEuYmP9d6WyYXLLYQFe4BRJPxA3538qxohbWNvUe9jFIRQ2WDWTflcudXMkEtubQ
LcWBVDi2+nce5AtyU1AkiCT9ktcCNnUzecJDQcXs3Y932tand08R8PAoZWPQEl2r
denZ1JPYWvaHwXf+KVTedzmzWImwtpCvbqtDirAaOb903p5yAP5tOuJqkjG1lrMk
m5IimCCTU6XtD0W43+7TRE/AevGOLPcG45D+YGu8eN0BO3bt4MiPyA==
=+VDm
-----END PGP SIGNATURE-----

--cNdxnHkX5QqsyA0e--