From: "Jim C. Nasby"
To: Uwe Doering
Date: Sun, 18 Apr 2004 21:20:43 -0500
Subject: Re: How does disk caching work?

On Sat, Apr 17, 2004 at 09:41:19AM +0200, Uwe Doering wrote:
> The disk i/o buffers you refer to (the 'Buf' column in 'top') are the
> actual interface between the VM system and the disk device drivers.
> For file and directory data, sets of VM pages get referred by and
> assigned to disk i/o buffers. There they are dealt with by a kernel
> daemon process that does the actual synchronization between VM and
> disks. That's where the soft updates algorithm is implemented, for
> instance.
>
> In the case of file and directory data, once the data has been
> written out to disk (if the memory pages were "dirty") the respective
> disk i/o buffer gets released immediately and can be recycled for
> other purposes, since it just referred to memory pages that continue
> to exist within the VM system.
>
> Metadata (inodes etc.) is a different matter, though. There is no VM
> representation for it, so for disk i/o it has to be cached in extra
> memory allocated for this purpose. A disk i/o buffer then refers to
> this memory range and tries to keep it around for as long as
> possible. A classical cache algorithm like LRU recycles these buffers
> and memory allocations eventually.
>
> As usual, the actual implementation is even more complex, but I think
> you've got a picture of how it works.

Yes, much clearer now, thanks! A few questions if I may...

What's a good way to tune the amount of space dedicated to I/O buffers?

What impact will vm_min|max_cache have on system performance? Is there
any advantage to setting it fairly high?

The machine I'm tuning is a dual Opteron box with 4 GB of RAM, a mirror
and a 6-disk RAID10. It's running PostgreSQL.

--
Jim C. Nasby, Database Consultant                    jim@nasby.net
Give your computer some brain candy!  www.distributed.net  Team #1828

Windows: "Where do you want to go today?"
Linux:   "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"
From: "Jim C. Nasby"
To: Aaron Seelye
Date: Sun, 18 Apr 2004 21:22:39 -0500
Subject: Re: command piped into bzip not using all available CPU

Perhaps I didn't make it clear, but this is on a dual-CPU machine. I
would expect either bzip2 or pgsql to hit 100% CPU, using one entire
CPU. The 47% idle indicates to me that it's not.

On Fri, Apr 16, 2004 at 03:48:19PM -0700, Aaron Seelye wrote:
> I would venture a guess that bzip is not multi-threaded and therefore
> isn't spreading the load around.
>
> -Aaron Seelye
>
> ----- Original Message -----
> From: "Jim C. Nasby"
> Sent: Friday, April 16, 2004 3:05 PM
> Subject: command piped into bzip not using all available CPU
>
> As you can see below, a command piped into bzip2 is only effectively
> using one CPU. It's not disk bound; both systat and gstat report less
> than 10% disk utilization. Why is this?
>
> The command I'm running is:
>
>     pg_dump -vZ0 ogr | bzip2 > ogr-20040416.sql.bz2
>
> last pid: 18345;  load averages: 1.17, 1.09, 0.81   up 8+22:12:27  17:00:56
> 66 processes:  2 running, 64 sleeping
> CPU states: 49.4% user, 0.0% nice, 3.7% system, 0.2% interrupt, 46.7% idle
> Mem: 67M Active, 2935M Inact, 359M Wired, 331M Cache, 255M Buf, 5576K Free
> Swap: 8192M Total, 64M Used, 8127M Free, 48K Out
>
>   PID USERNAME PRI NICE   SIZE    RES STATE  C   TIME   WCPU    CPU COMMAND
> 17334 decibel  109    0 10856K  7164K CPU0   0  11:05 65.77% 65.77% bzip2
> 17335 pgsql      4    0   154M   124M sbwait 0   5:54 34.03% 34.03% postgres
> 17333 decibel   -8    0 20128K  3236K pipdwt 0   0:46  2.88%  2.88% pg_dump
From: Uwe Doering
To: freebsd-performance@freebsd.org
Date: Mon, 19 Apr 2004 08:37:52 +0200
Subject: Re: How does disk caching work?

Jim C. Nasby wrote:
> On Sat, Apr 17, 2004 at 09:41:19AM +0200, Uwe Doering wrote:
> [...]
> A few questions if I may...
>
> What's a good way to tune the amount of space dedicated to I/O
> buffers?

You can tune the number of i/o buffers, and therefore indirectly the
amount of memory they may allocate, by using the variable 'kern.nbuf'
in '/boot/loader.conf'. Note that this number gets multiplied by 16384
(the default filesystem block size) to arrive at the amount of memory
it results in.

My experience is that with large amounts of RAM this area becomes
unduly big, though. It's not that you have to skimp on RAM in this
environment, but the disk i/o buffers eat away at the KVM region
(kernel virtual memory), which happens to be just 1 GB by default and
doesn't grow with the RAM size. So it can be a good idea to actually
reduce the number of disk i/o buffers (compared to the auto-scaled
default) on systems with plenty of RAM, since you don't need that many
buffers anyway due to the VM interaction I just described, and save
the available KVM for other purposes (kernel resources). Systems that
run out of KVM are prone to kernel panics, given the right combination
of circumstances.
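To make the arithmetic concrete, a minimal '/boot/loader.conf' sketch
(the value is illustrative only; 4096 is the figure mentioned later in
this thread for a 2 GB machine):

    # /boot/loader.conf
    # 4096 buffers * 16384 bytes (default filesystem block size)
    # = 64 MB of KVM dedicated to disk i/o buffers
    kern.nbuf="4096"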
> What impact will vm_min|max_cache have on system performance? Is
> there any advantage to setting it fairly high?

I'm not quite sure which variables you are referring to. In FreeBSD
there are 'vm.v_cache_min' and 'vm.v_cache_max'. I don't recommend
tuning them, though, without having a very deep and thorough look at
the kernel sources. Many of these variables don't really do what their
name suggests, and there are interdependencies between some of them.
You can lock up your server by tuning them improperly.

> The machine I'm tuning is a dual Opteron box with 4 GB of RAM, a
> mirror and a 6-disk RAID10. It's running PostgreSQL.

I'm not a PostgreSQL expert, but there have been discussions on this
mailing list and elsewhere about tuning PostgreSQL. I suggest taking a
look at the archives.

     Uwe
--
Uwe Doering  |  EscapeBox - Managed On-Demand UNIX Servers
gemini@geminix.org  |  http://www.escapebox.net

From: "Aaron Seelye"
To: "Jim C. Nasby"
Date: Mon, 19 Apr 2004 00:08:32 -0700
Subject: Re: command piped into bzip not using all available CPU

I'm not sure of the exact technical reason, but as I understand it,
that's 47% idle of the total CPU power of the machine, which would
indicate that one CPU was 100% busy and the other was at 3%, due to
system usage, i/o, or whatever else was running. This is quite normal
in my experience, and what you should expect to see.

-Aaron
From: "J.D. Bronson"
To: freebsd-performance@freebsd.org
Cc: freebsd-questions@freebsd.org
Date: Mon, 19 Apr 2004 08:45:50 -0500
Subject: etherchannel on 5.2.1 - possible?

I am looking for performance, not fail-over.

Does anyone have this working with either Intel or Broadcom NICs?

Does anyone have a good site that talks about what is needed to make
this work, as well? I do have a Cisco switch and it fully supports
this.

I need a little advice on setting this up...

Thanks in advance!
-JBD

From: "Jim C. Nasby"
To: Aaron Seelye
Date: Mon, 19 Apr 2004 09:09:21 -0500
Subject: Re: command piped into bzip not using all available CPU

Why would I expect to see it use only one CPU? It was CPU bound, not
disk bound. There were two CPU-intensive processes running; why
wouldn't they each use a different CPU?

On Mon, Apr 19, 2004 at 12:08:32AM -0700, Aaron Seelye wrote:
> I'm not sure of the exact technical reason, but as I understand it,
> that's 47% idle of the total CPU power of the machine, which would
> indicate that one CPU was 100% busy and the other was at 3%, due to
> system usage, i/o, or whatever else was running. This is quite normal
> in my experience, and what you should expect to see.
From: "Steven Hartland"
To: freebsd-performance@freebsd.org, "J.D. Bronson"
Date: Mon, 19 Apr 2004 15:12:35 +0100
Subject: Re: etherchannel on 5.2.1 - possible?
Bronson" References: <6.1.0.6.2.20040419084442.0244d618@localhost> Date: Mon, 19 Apr 2004 15:12:35 +0100 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1409 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1409 X-Spam-Processed: multiplay.co.uk, Mon, 19 Apr 2004 15:11:55 +0100 (not processed: message from valid local sender) X-MDRemoteIP: 212.135.219.179 X-Return-Path: killing@multiplay.co.uk X-MDAV-Processed: multiplay.co.uk, Mon, 19 Apr 2004 15:11:58 +0100 cc: freebsd-questions@freebsd.org Subject: Re: etherchannel on 5.2.1 - possible? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Apr 2004 14:12:51 -0000 When I checked the various methods a while back all resulted in performance drop not increase on a dual port intel etherxpress Pro 100. If anyone has different experiences I also would be interested. Also noted was that Gb performance on nge was also lower than that of fxp on 100Mb especially when using link0 on the fxp. Notes: Results using ftp ( proftpd ). Performance increase using nge was obtained when using single processor kernel and polling enabled without this interrupt rates appeared to be overloading the system. Steve ----- Original Message ----- From: "J.D. Bronson" To: Cc: Sent: Monday, April 19, 2004 2:45 PM Subject: etherchannel on 5.2.1 - possible? > I am looking for performance. Not fail-over.. > > Does anyone have this working with either > intel or broadcom nics? > > Anyone have any good site that talks about what is needed to make this work > as well? - I do have a Cisco switch and it fully supports this. > > I need a little advice on setting this up... > > Thanks in advance! > > -JBD > > _______________________________________________ > freebsd-performance@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-performance > To unsubscribe, send any mail to "freebsd-performance-unsubscribe@freebsd.org" > > ================================================ This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone (023) 8024 3137 or return the E.mail to postmaster@multiplay.co.uk. From owner-freebsd-performance@FreeBSD.ORG Mon Apr 19 08:16:20 2004 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1BB4616A4CE for ; Mon, 19 Apr 2004 08:16:20 -0700 (PDT) Received: from flake.decibel.org (flake.decibel.org [66.143.173.58]) by mx1.FreeBSD.org (Postfix) with SMTP id 9EE7E43D45 for ; Mon, 19 Apr 2004 08:16:19 -0700 (PDT) (envelope-from decibel@decibel.org) Received: (qmail 68482 invoked by uid 1001); 19 Apr 2004 15:16:17 -0000 Date: Mon, 19 Apr 2004 10:16:16 -0500 From: "Jim C. 
Nasby" To: Uwe Doering Message-ID: <20040419151616.GT87362@nasby.net> References: <20040416163845.GG87362@nasby.net> <20040416221211.GM87362@nasby.net> <4080DF9F.3040302@geminix.org> <20040419022043.GO87362@nasby.net> <408373C0.7080502@geminix.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <408373C0.7080502@geminix.org> X-Operating-System: FreeBSD 4.9-RELEASE-p3 i386 X-Distributed: Join the Effort! http://www.distributed.net User-Agent: Mutt/1.5.6i cc: freebsd-performance@freebsd.org Subject: Re: How does disk caching work? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Apr 2004 15:16:20 -0000 On Mon, Apr 19, 2004 at 08:37:52AM +0200, Uwe Doering wrote: > Jim C. Nasby wrote: > >On Sat, Apr 17, 2004 at 09:41:19AM +0200, Uwe Doering wrote: > >[...] > >A few questions if I may... > > > >What's a good way to tune amount of space dedicated to IO buffers? > > You can tune the number of i/o buffers, and therefore indirectly the > amount of memory they may allocate, by using the variable 'kern.nbuf' in > '/boot/loader.conf'. Note that this number gets multiplied by 16384 > (the default filesystem block size) to arrive at the amount of memory it > results in. > > My experience is that with large amounts of RAM this area becomes > unduely big, though. It's not that you have to skimp on RAM in this > enviroment, but the disk i/o buffers eat away at the KVM region (kernel > virtual memory), which happens to be just 1 GB by default and doesn't > grow with the RAM size. So it can be a good idea to actually reduce the > number of disk i/o buffers (compared to its auto-scaled default) on > systems with plenty of RAM (since you don't need that many buffers, > anyway, due to the VM interaction I just described) and save the > available KVM rather for other purposes (kernel resources). Systems > that run out of KVM are prone to kernel panics, given the right > combination of circumstances. Yes, I was thinking the same thing. What I don't know is what would be a good value to use. dirtybuf in systat -v is typically less than 3000, which makes 261,000 buffer seem wasteful, but of course that's neglecting the read caching aspect. > >What impact will vm_min|max_cache have on system performance? Is there > >any advantage to setting it fairly high? > > I'm not quite sure which variables you are referring to. In FreeBSD > there are 'vm.v_cache_min' and 'vm.v_cache_max'. I don't recommend > tuning them, though, without having a very deep and thorough look at the > kernel sources. Many of these variables don't really do what their name > suggests, and there are interdependencies between some of them. You can > lock up your server by tuning them improperly. Sorry, I shouldn't have been lazy and actually looked up the settings. Yes, those are the settings I was reffering to. Someone else had cranked them up so that the machine was maintaining about 1.7G in cache; he said that he'd noticed a reduction in disk IO when he did that. I haven't been able to see any difference in disk IO, though it seems logical that setting cache too high would hurt write caching and actually increase disk IO. It's currently set to whatever the kernel thought best, so I'll just leave it there. > >The machine I'm tuning is a dual Opteron box with 4G of ram, a mirror > >and a 6 disk RAID10. It's running PostgreSQL. 
> I'm not a PostgreSQL expert, but there have been discussions on this
> mailing list and elsewhere about tuning PostgreSQL. I suggest taking
> a look at the archives.

Yes, I'm familiar with them. The big question that always seems to
come up is how the disk caching actually works, but I think that's
been cleared up now.

From: Uwe Doering
To: freebsd-performance@freebsd.org
Date: Mon, 19 Apr 2004 18:25:47 +0200
Subject: Re: How does disk caching work?

Jim C. Nasby wrote:
> On Mon, Apr 19, 2004 at 08:37:52AM +0200, Uwe Doering wrote:
>> [...]
> Yes, I was thinking the same thing. What I don't know is what would
> be a good value to use. dirtybuf in systat -v is typically less than
> 3000, which makes 261,000 KB of buffers seem wasteful, but of course
> that's neglecting the read-caching aspect.

With regard to the VM interaction I explained earlier, the same goes
for read caching. File and directory data is kept in VM objects
attached to the internal vnodes (files etc.) once read in. So large
quantities of disk i/o buffers aren't needed for read caching, either.

We have 'kern.nbuf="4096"' on our production systems with 2 GB RAM,
which results in a 64 MB disk i/o cache. These machines are used for
server hosting purposes and therefore run all sorts of applications at
the same time. Look at the URL in my signature for more details.

I unfortunately don't know how much buffer space a dedicated database
server would need, but I suspect that you won't notice any difference
between 256 MB (the default) and 64 MB (kern.nbuf="4096").

More interesting is probably to crank up 'vfs.hirunningspace' and
'vfs.lorunningspace' (both 'sysctl' variables) in order not to stall
write operations when there are plenty of outstanding read requests
waiting for completion -- FreeBSD's classical bottleneck on disk i/o
oriented servers. In case your RAID controller has a large i/o buffer
of its own (16 MB or more) you may want to use values in this range:

    vfs.hirunningspace=8388608
    vfs.lorunningspace=6291456

These are in bytes (see the '/etc/sysctl.conf' sketch at the end of
this message). Also, disable aggressive read-ahead in the controller,
if possible. This feature is for MS Windows and would be
counter-productive with FreeBSD. FreeBSD knows better than the
controller if and when to read ahead.

>>> What impact will vm_min|max_cache have on system performance? Is
>>> there any advantage to setting it fairly high?
>>
>> I'm not quite sure which variables you are referring to. In FreeBSD
>> there are 'vm.v_cache_min' and 'vm.v_cache_max'. [...]
>
> Sorry, I shouldn't have been lazy; I should have actually looked up
> the settings. Yes, those are the settings I was referring to. Someone
> else had cranked them up so that the machine was maintaining about
> 1.7 GB in cache; he said that he'd noticed a reduction in disk IO
> when he did that. I haven't been able to see any difference in disk
> IO, though it seems logical that setting the cache too high would
> hurt write caching and actually increase disk IO. It's currently set
> to whatever the kernel thought best, so I'll just leave it there.

Well, I'm afraid your colleague must have been imagining things. The
cache queue (the 'Cache' column in 'top') is just a phase in the
laundering procedure (VM page recycling) between the inactive queue
('Inact' in 'top') and the free queue ('Free' in 'top'). So these
variables have nothing to do with disk i/o performance.
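Putting the suggested values into '/etc/sysctl.conf' form (a minimal
sketch; the numbers are the ones from this message and assume a RAID
controller with a 16 MB or larger buffer, so treat them as a starting
point rather than a recommendation):

    # /etc/sysctl.conf
    # allow up to 8 MB of writes in flight before writers stall...
    vfs.hirunningspace=8388608
    # ...and resume queuing writes once the in-flight total
    # drops below 6 MB
    vfs.lorunningspace=6291456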
     Uwe
--
Uwe Doering  |  EscapeBox - Managed On-Demand UNIX Servers
gemini@geminix.org  |  http://www.escapebox.net

From: "Jim C. Nasby"
To: Uwe Doering
Date: Mon, 19 Apr 2004 13:23:45 -0500
Subject: Re: How does disk caching work?

Thanks very much for all your help. The only remaining question I have
is: is there something we can monitor to gauge what impact (if any)
changes to these settings have? Will changes to hi|lorunningspace just
show up as increased KB/s or TPS write performance in gstat? How can
we tell if we've raised them too far?

Likewise, what can we measure regarding nbuf? If I'm understanding
things correctly, runningspace comes out of nbuf, so obviously nbuf
needs to be greater than that, but what symptoms will we see if it's
set too low?

On Mon, Apr 19, 2004 at 06:25:47PM +0200, Uwe Doering wrote:
> With regard to the VM interaction I explained earlier, the same goes
> for read caching. File and directory data is kept in VM objects
> attached to the internal vnodes (files etc.) once read in. So large
> quantities of disk i/o buffers aren't needed for read caching,
> either.
> [...]
> More interesting is probably to crank up 'vfs.hirunningspace' and
> 'vfs.lorunningspace' (both 'sysctl' variables) in order not to stall
> write operations when there are plenty of outstanding read requests
> waiting for completion. [...] These are in bytes.
> Also, disable aggressive read-ahead in the controller, if possible.
> This feature is for MS Windows and would be counter-productive with
> FreeBSD. FreeBSD knows better than the controller if and when to
> read ahead.

From: "Will Saxon"
To: "J.D. Bronson", freebsd-performance@freebsd.org
Date: Mon, 19 Apr 2004 10:06:45 -0400
Subject: RE: etherchannel on 5.2.1 - possible?

> -----Original Message-----
> From: J.D. Bronson
> Sent: Monday, April 19, 2004 9:46 AM
> Subject: etherchannel on 5.2.1 - possible?
>
> I am looking for performance, not fail-over.
>
> Does anyone have this working with either Intel or Broadcom NICs?
>
> I need a little advice on setting this up...

I have used the ng_fec netgraph module with both Broadcom 5703X and HP
NC7170 NICs (the latter uses the em driver).

This is how to set it up. First you have to have the ng_fec module
loaded. Then:

    # ngctl mkpeer fec dummy fec
    # ngctl msg fec0: add_iface '"bge0"'
    # ngctl msg fec0: add_iface '"bge1"'

Obviously, replace bge with em or whatever other driver you are using.
ng_fec supports up to 4 links.

At this point you will have a fec0 interface that you can manipulate
normally with ifconfig.
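The steps above, gathered into a single sketch (the interface names
and address are placeholders; the kldload line assumes ng_fec is built
as a module, which is the usual case):

    #!/bin/sh
    # load the netgraph Fast EtherChannel module
    kldload ng_fec

    # create the fec0 bundle and attach two physical links
    ngctl mkpeer fec dummy fec
    ngctl msg fec0: add_iface '"bge0"'
    ngctl msg fec0: add_iface '"bge1"'

    # then configure the bundle like any normal interface
    # (192.0.2.10 is a placeholder address)
    ifconfig fec0 inet 192.0.2.10 netmask 255.255.255.0 up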
I have noticed that sometimes I have to bring the interface up and
down a couple of times to get it to start passing traffic. Whenever
you 'ifconfig up' or assign an address to fec0, it resets the bundle.

One thing that is annoying is that ng_fec doesn't work with vlans.
There is an ng_vlan module that was recently released, but ng_fec
doesn't work with it because it isn't quite like other netgraph
modules. Almost all of my FreeBSD machines use vlans, so I am not
making heavy use of ng_fec. We aren't pushing enough data to make it
really necessary anyway.

There is also ng_one2many, which does implement failover and channel
bonding, but not using the etherchannel technique. I think it uses
round robin.

-Will

From: "Igor Shmukler"
To: "Uwe Doering"
Date: Tue, 20 Apr 2004 02:06:29 +0400
Subject: Re: How does disk caching work?

> Well, I'm afraid your colleague must have been imagining things. The
> cache queue (the 'Cache' column in 'top') is just a phase in the
> laundering procedure (VM page recycling) between the inactive queue
> ('Inact' in 'top') and the free queue ('Free' in 'top'). So these
> variables have nothing to do with disk i/o performance.

I am not sure you are correct here; I understand things very
differently. While it is a fact that the number of pages in the cache
queue does not affect IO throughput, changing vm settings such as
vm.stats.vm.v_cache_min, vm.stats.vm.v_cache_max,
vm.stats.vm.v_free_target and vm.stats.vm.v_free_min should have an
effect on disk IO.

The very reason JD came up with cache pages is to minimize IO traffic.
If we require a larger number of free pages, we cause the OS to remove
references at an earlier point.
This should cause the kernel to re-read some of the pages that would
otherwise just have been requeued to the active queue.

Having a larger cache queue would require the VM to start cleaning
dirty pages earlier, which results in some additional write traffic as
well. However, this is not that bad, because here it is a zero-sum
game: if pages are to become free, they have to be written out
regardless of the cache queue size, just at a later point. However,
there is a benefit to a larger cache bucket. The upside is that if the
machine often experiences bursts in memory demand (pretty much any
real-world server does), you are able to accommodate the changing load
without blocking.

IS.

From: Scott Lambert
To: freebsd-performance@freebsd.org
Date: Mon, 19 Apr 2004 23:27:16 -0400
Subject: Re: command piped into bzip not using all available CPU

On Mon, Apr 19, 2004 at 09:09:21AM -0500, Jim C. Nasby wrote:
> Why would I expect to see it use only one CPU? It was CPU bound, not
> disk bound. There were two CPU-intensive processes running; why
> wouldn't they each use a different CPU?

At the time you took the snapshot, both processes were running on the
same CPU; note that the 'C' column in your top output reads 0 for all
three processes.

FreeBSD 4.x or 5.2? If 5.2, SCHED_4BSD or SCHED_ULE?
> > > The command I'm running is:
> > >
> > >     pg_dump -vZ0 ogr | bzip2 > ogr-20040416.sql.bz2
> > >
> > >   PID USERNAME PRI NICE   SIZE    RES STATE  C   TIME   WCPU    CPU COMMAND
> > > 17334 decibel  109    0 10856K  7164K CPU0   0  11:05 65.77% 65.77% bzip2
> > > 17335 pgsql      4    0   154M   124M sbwait 0   5:54 34.03% 34.03% postgres
> > > 17333 decibel   -8    0 20128K  3236K pipdwt 0   0:46  2.88%  2.88% pg_dump

--
Scott Lambert  KC5MLE  Unix SysAdmin  lambert@lambertfam.org

From: Uwe Doering
To: freebsd-performance@freebsd.org
Date: Tue, 20 Apr 2004 07:45:33 +0200
Subject: Re: How does disk caching work?

Jim C. Nasby wrote:
> Thanks very much for all your help. The only remaining question I
> have is: is there something we can monitor to gauge what impact (if
> any) changes to these settings have? Will changes to
> hi|lorunningspace just show up as increased KB/s or TPS write
> performance in gstat? How can we tell if we've raised them too far?

I don't know of any simple means of monitoring the effect, short of
doing your own benchmark tests. If these variables are too low you
will notice that write throughput suffers during peak read demand.
That's when we started to examine the kernel sources and found the
reason.

To be on the safe side, don't make 'vfs.hirunningspace' larger than a
fraction of the overall disk i/o buffer space (derived from
'kern.nbuf'), and also not larger than the buffer space in the disk
controller. 'vfs.lorunningspace' is used to implement a hysteresis and
should be 1/2 to 3/4 of 'vfs.hirunningspace'.

> Likewise, what can we measure regarding nbuf? If I'm understanding
> things correctly, runningspace comes out of nbuf, so obviously nbuf
> needs to be greater than that, but what symptoms will we see if it's
> set too low?

Maybe just bad performance due to the complete flushing of the disk
i/o buffers (remember that metadata is cached in there), maybe a
system lockup. Can't tell.
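One quick way to check the relationship described above is to read the
relevant values back with sysctl (read-only here; kern.nbuf counts
buffers of 16384 bytes by default, while the runningspace values are
in bytes):

    sysctl kern.nbuf vfs.hirunningspace vfs.lorunningspace
    # e.g. kern.nbuf=4096 means 4096 * 16384 = 64 MB of buffer cache,
    # so an 8 MB vfs.hirunningspace stays comfortably below it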
Just make sure that these variables stay smaller than the disk i/o
buffer cache, and you won't have to bother with the consequences of
overdoing it.

     Uwe
--
Uwe Doering  |  EscapeBox - Managed On-Demand UNIX Servers
gemini@geminix.org  |  http://www.escapebox.net

From: Uwe Doering
To: freebsd-performance@freebsd.org
Date: Tue, 20 Apr 2004 08:17:08 +0200
Subject: Re: How does disk caching work?

Igor Shmukler wrote:
> I am not sure you are correct here; I understand things very
> differently. While it is a fact that the number of pages in the cache
> queue does not affect IO throughput, changing vm settings such as
> vm.stats.vm.v_cache_min, vm.stats.vm.v_cache_max,
> vm.stats.vm.v_free_target and vm.stats.vm.v_free_min should have an
> effect on disk IO.
>
> The very reason JD came up with cache pages is to minimize IO
> traffic. If we require a larger number of free pages, we cause the OS
> to remove references at an earlier point. This should cause the
> kernel to re-read some of the pages that would otherwise just have
> been requeued to the active queue.
>
> Having a larger cache queue would require the VM to start cleaning
> dirty pages earlier, which results in some additional write traffic
> as well. However, this is not that bad, because here it is a zero-sum
> game: if pages are to become free, they have to be written out
> regardless of the cache queue size, just at a later point. However,
> there is a benefit to a larger cache bucket.
The upside is that if the machine often experiences bursts in memory demand (pretty much any real-world server would), you are able to accommodate changing load without blocking. Well, I didn't claim that the cache queue was useless. It does have its merits. And there is a certain default amount configured by the kernel's auto-scaling code already. What I was trying to point out is that these variables don't necessarily do what their name suggests. Take 'vm.v_cache_max', for example. When you crank that up, instead of increasing the size of the cache queue it is actually the inactive queue that grows in size. This is because the kernel steals pages from the inactive queue when it temporarily runs out of pages in the cache queue, without having to block for i/o as long as there are clean (not written to or already laundered) pages in the inactive queue. When it finds dirty pages during this scan it schedules them for background synchronization with the disk, but again without blocking in the foreground. The reason for this algorithm is that it is better to keep pages in the inactive queue for as long as possible, rather than moving them over to the cache queue prematurely. Pages in the inactive queue can still be mapped into the memory space of processes, while pages in the cache queue have lost this association. So, quite naturally, when the VM system has to reactivate a page (put it back into the active queue) this operation tends to be less expensive when the page is still in the inactive queue. So, for reasons like these, I keep recommending that you either study the kernel sources before you try to tune the VM system, or leave these variables alone. Uwe -- Uwe Doering | EscapeBox - Managed On-Demand UNIX Servers gemini@geminix.org | http://www.escapebox.net From owner-freebsd-performance@FreeBSD.ORG Tue Apr 20 07:15:58 2004 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9C2CA16A4CE for ; Tue, 20 Apr 2004 07:15:58 -0700 (PDT) Received: from f20.mail.ru (f20.mail.ru [194.67.57.52]) by mx1.FreeBSD.org (Postfix) with ESMTP id EA89043D48 for ; Tue, 20 Apr 2004 07:15:57 -0700 (PDT) (envelope-from shmukler@mail.ru) Received: from mail by f20.mail.ru with local id 1BFw2W-000L2G-00; Tue, 20 Apr 2004 18:15:56 +0400 Received: from [24.184.137.35] by msg.mail.ru with HTTP; Tue, 20 Apr 2004 18:15:56 +0400 From: =?koi8-r?Q?=22?=Igor Shmukler=?koi8-r?Q?=22=20?= To: =?koi8-r?Q?=22?=Uwe Doering=?koi8-r?Q?=22=20?= Mime-Version: 1.0 X-Mailer: mPOP Web-Mail 2.19 X-Originating-IP: [24.184.137.35] Date: Tue, 20 Apr 2004 18:15:56 +0400 In-Reply-To: <4084C064.5080506@geminix.org> Content-Type: text/plain; charset=koi8-r Content-Transfer-Encoding: 8bit Message-Id: cc: freebsd-performance@freebsd.org Subject: Re[2]: How does disk caching work? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: =?koi8-r?Q?=22?=Igor Shmukler=?koi8-r?Q?=22=20?= List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Apr 2004 14:15:58 -0000 > >>>Sorry, I shouldn't have been lazy and actually looked up the settings. > >>>Yes, those are the settings I was referring to. Someone else had cranked > >>>them up so that the machine was maintaining about 1.7G in cache; he said > >>>that he'd noticed a reduction in disk IO when he did that.
I haven't > >>>been able to see any difference in disk IO, though it seems logical that > >>>setting cache too high would hurt write caching and actually increase > >>>disk IO. It's currently set to whatever the kernel thought best, so I'll > >>>just leave it there. > >> > >>Well, I'm afraid your colleague must have been imagining things. The > >>cache queue ('Cache' column in 'top') is just a phase in the laundering > >>procedure (VM page recycling) between the inactive queue ('Inact' in > >>'top') and the free queue ('Free' in 'top'). So these variables have > >>nothing to do with disk i/o performance. > > > > I am not sure you are correct here. I understand things very differently. > > While it is a fact that the number of pages in the cache queue does not affect IO throughput, changing vm settings such as: > > vm.stats.vm.v_cache_min, vm.stats.vm.v_cache_max, vm.stats.vm.v_free_target and vm.stats.vm.v_free_min should have an effect on disk IO. > > > > The very reason JD came up with cache pages is to minimize IO traffic. If we require a larger number of free pages we cause the OS to remove references at an earlier point. This should cause the kernel to re-read some of the pages that otherwise would just be requeued to the active queue. > > > > Having a larger cache queue would require the VM to start cleaning dirty pages earlier, which results in some additional write traffic as well. However, this is not that bad, because here it is a zero-sum game. If pages are to become free, they would have to be written out regardless of cache queue size, just at a later point. However, there is a benefit to a larger cache bucket. The upside is that if the machine often experiences bursts in memory demand (pretty much any real-world server would), you are able to accommodate changing load without blocking. > > Well, I didn't claim that the cache queue was useless. It does have > its merits. And there is a certain default amount configured by the > kernel's auto-scaling code already. Yes, kernel defaults for queue sizes should work for most of us. > What I was trying to point out is that these variables don't necessarily > do what their name suggests. Take 'vm.v_cache_max', for example. When > you crank that up, instead of increasing the size of the cache queue it > is actually the inactive queue that grows in size. > > This is because the kernel steals pages from the inactive queue when it > temporarily runs out of pages in the cache queue, without having to > block for i/o as long as there are clean (not written to or already > laundered) pages in the inactive queue. When it finds dirty pages > during this scan it schedules them for background synchronization with > the disk, but again without blocking in the foreground. > > The reason for this algorithm is that it is better to keep pages in the > inactive queue for as long as possible, rather than moving them over to > the cache queue prematurely. Pages in the inactive queue can still be > mapped into the memory space of processes, while pages in the cache > queue have lost this association. So, quite naturally, when the VM > system has to reactivate a page (put it back into the active queue) this > operation tends to be less expensive when the page is still in the > inactive queue. While you are correct that when the cache is empty the kernel will dip into the inactive queue, you are mistaken about other things. Pages on the cache queue still have the association. I wrote that in one of the previous posts. To sum it up: the cache queue is the same as the inactive queue except that it holds only clean pages.
If things were the way you suggest, the cache queue would be totally useless. I actually pretty much explained the whole rotation process. If you read my email again, you should understand what happens whenever a page is moved from inactive to cache and then to free. > So, for reasons like these, I keep recommending that you either study the > kernel sources before you try to tune the VM system, or leave these > variables alone. I am not sure whether studying kernel sources is really necessary. Virtually every UNIX (R) admin has had to tune a machine, despite sources not being available. From owner-freebsd-performance@FreeBSD.ORG Tue Apr 20 07:45:52 2004 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 803ED16A4CE for ; Tue, 20 Apr 2004 07:45:52 -0700 (PDT) Received: from flake.decibel.org (flake.decibel.org [66.143.173.58]) by mx1.FreeBSD.org (Postfix) with SMTP id 04FE743D2F for ; Tue, 20 Apr 2004 07:45:52 -0700 (PDT) (envelope-from decibel@decibel.org) Received: (qmail 43200 invoked by uid 1001); 20 Apr 2004 14:45:47 -0000 Date: Tue, 20 Apr 2004 09:45:47 -0500 From: "Jim C. Nasby" To: freebsd-performance@freebsd.org Message-ID: <20040420144547.GX87362@nasby.net> References: <20040416220556.GL87362@nasby.net> <002701c42404$e9dbecf0$3102a8c0@metallus> <20040419022239.GP87362@nasby.net> <001801c425dd$1fda0b00$3102a8c0@metallus> <20040419140921.GS87362@nasby.net> <20040420032716.GC56561@laptop.lambertfam.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040420032716.GC56561@laptop.lambertfam.org> X-Operating-System: FreeBSD 4.9-RELEASE-p3 i386 X-Distributed: Join the Effort! http://www.distributed.net User-Agent: Mutt/1.5.6i Subject: Re: command piped into bzip not using all available CPU X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Apr 2004 14:45:52 -0000 decibel@fritz.1[9:34]~:6>uname -a FreeBSD fritz.distributed.net 5.2.1-RELEASE FreeBSD 5.2.1-RELEASE #1: Wed Apr 7 18:42:52 CDT 2004 root@fritz.distributed.net:/usr/obj/usr/src/sys/FRITZ amd64 decibel@fritz.1[9:35]/usr/src/sys/amd64/conf:9>grep -i sched FRITZ options SCHED_4BSD #4BSD scheduler options _KPOSIX_PRIORITY_SCHEDULING #Posix P1003_1B real-time extensions Also, don't read anything into the fact that they were on the same CPU for that snapshot; here's one showing the exact opposite: PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU CPU COMMAND 10336 dnetc 139 20 1344K 856K RUN 1 80.5H 90.14% 90.14% dnetc 10702 decibel 108 0 10856K 7304K CPU0 1 0:13 23.86% 22.41% bzip2 10703 pgsql 4 0 154M 78004K sbwait 0 0:07 10.92% 10.25% postgres FWIW, I recall seeing this same behavior under FreeBSD 4.x as well. On Mon, Apr 19, 2004 at 11:27:16PM -0400, Scott Lambert wrote: > On Mon, Apr 19, 2004 at 09:09:21AM -0500, Jim C. Nasby wrote: > > Why would I expect to see it only use one CPU? It was CPU bound, not > > disk bound. There were two CPU-intensive processes running, why wouldn't > > they each use a different CPU?
> > > > On Mon, Apr 19, 2004 at 12:08:32AM -0700, Aaron Seelye wrote: > > > I'm not sure of the exact technical reason, but as I understand it, it's > > > 47% idle on the total cpu power of the machine, which would indicate > > > that one cpu was 100% full, and the other was 3%, due to system usage, > > > i/o, or whatever else was running. This is quite normal in my > > > experience, and what you should expect to see. > > At the time you took the snapshot, both processes were running on the > same CPU. FreeBSD 4.x or 5.2? If 5.2, SCHED_4BSD or SCHED_ULE? > > > > > The command I'm running is: > > > > pg_dump -vZ0 ogr | bzip2 > ogr-20040416.sql.bz2 > > > > > > > > PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU CPU COMMAND > > > > 17334 decibel 109 0 10856K 7164K CPU0 0 11:05 65.77% 65.77% bzip2 > > > > 17335 pgsql 4 0 154M 124M sbwait 0 5:54 34.03% 34.03% postgres > > > > 17333 decibel -8 0 20128K 3236K pipdwt 0 0:46 2.88% 2.88% pg_dump > > -- > Scott Lambert KC5MLE Unix SysAdmin > lambert@lambertfam.org -- Jim C. Nasby, Database Consultant jim@nasby.net Member: Triangle Fraternity, Sports Car Club of America Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?" From owner-freebsd-performance@FreeBSD.ORG Tue Apr 20 08:11:00 2004 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3A13816A55F for ; Tue, 20 Apr 2004 08:10:43 -0700 (PDT) Received: from gen129.n001.c02.escapebox.net (gen129.n001.c02.escapebox.net [213.73.91.129]) by mx1.FreeBSD.org (Postfix) with ESMTP id C964D43D55 for ; Tue, 20 Apr 2004 08:10:15 -0700 (PDT) (envelope-from gemini@geminix.org) Message-ID: <40853D53.1040906@geminix.org> Date: Tue, 20 Apr 2004 17:10:11 +0200 From: Uwe Doering Organization: Private UNIX Site User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.6) Gecko/20040119 X-Accept-Language: en-us, en MIME-Version: 1.0 To: freebsd-performance@freebsd.org References: In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Received: from gemini by geminix.org with asmtp (TLSv1:AES256-SHA:256) (Exim 3.36 #1) id 1BFwt4-0000WK-00; Tue, 20 Apr 2004 17:10:14 +0200 Subject: Re: How does disk caching work? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Apr 2004 15:11:00 -0000 Igor Shmukler wrote: >>What I was trying to point out is that these variables don't necessarily >>do what their name suggests. Take 'vm.v_cache_max', for example. When >>you crank that up, instead of increasing the size of the cache queue it >>is actually the inactive queue that grows in size. >> >>This is because the kernel steals pages from the inactive queue when it >>temporarily runs out of pages in the cache queue, without having to >>block for i/o as long as there are clean (not written to or already >>laundered) pages in the inactive queue.
When it finds dirty pages >>during this scan it schedules them for background synchronization with >>the disk, but again without blocking in the foreground. >> >>The reason for this algorithm is that it is better to keep pages in the >>inactive queue for as long as possible, rather than moving them over to >>the cache queue prematurely. Pages in the inactive queue can still be >>mapped into the memory space of processes, while pages in the cache >>queue have lost this association. So, quite naturally, when the VM >>system has to reactivate a page (put it back into the active queue) this >>operation tends to be less expensive when the page is still in the >>inactive queue. > > While you are correct that when the cache is empty the kernel will dip into the inactive queue, you are mistaken about other things. Pages on the cache queue still have the association. I wrote that in one of the previous posts. > > To sum it up: the cache queue is the same as the inactive queue except that it holds only clean pages. > > If things were the way you suggest, the cache queue would be totally useless. I think you're mixing up two different things here. The way I understand the kernel sources, the pages in the cache queue of course still have their association with the underlying VM object. Otherwise caching these pages would be useless. But they are no longer mapped into any process address space. If I may quote the relevant comment from vm_page_cache(): /* * Remove all pmaps and indicate that the page is not * writeable or mapped. */ vm_page_cache() is the function that moves the pages from the inactive to the cache queue once they are clean. Restoring the process address space mapping is what makes reactivating pages from the cache queue more expensive than just relinking them from the inactive queue, because a fault gets generated when the process tries to access the page. This fault then maps the page from the VM object into the process address space. This causes additional overhead. > I actually pretty much explained the whole rotation process. If you read my email again, you should understand what happens whenever a page is moved from inactive to cache and then to free. You may want to study the kernel sources some more, I'm afraid. >>So, for reasons like these, I keep recommending that you either study the >>kernel sources before you try to tune the VM system, or leave these >>variables alone. > > I am not sure whether studying kernel sources is really necessary. Virtually every UNIX (R) admin has had to tune a machine, despite sources not being available. Sorry, but you just proved my point ...
Uwe -- Uwe Doering | EscapeBox - Managed On-Demand UNIX Servers gemini@geminix.org | http://www.escapebox.net From owner-freebsd-performance@FreeBSD.ORG Tue Apr 20 11:04:13 2004 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C5D0D16A4D2 for ; Tue, 20 Apr 2004 11:04:13 -0700 (PDT) Received: from f16.mail.ru (f16.mail.ru [194.67.57.46]) by mx1.FreeBSD.org (Postfix) with ESMTP id 46EFC43D5F for ; Tue, 20 Apr 2004 11:04:13 -0700 (PDT) (envelope-from shmukler@mail.ru) Received: from mail by f16.mail.ru with local id 1BFzbP-000Pqh-00; Tue, 20 Apr 2004 22:04:11 +0400 Received: from [24.184.137.35] by msg.mail.ru with HTTP; Tue, 20 Apr 2004 22:04:11 +0400 From: =?koi8-r?Q?=22?=Igor Shmukler=?koi8-r?Q?=22=20?= To: =?koi8-r?Q?=22?=Uwe Doering=?koi8-r?Q?=22=20?= Mime-Version: 1.0 X-Mailer: mPOP Web-Mail 2.19 X-Originating-IP: [24.184.137.35] Date: Tue, 20 Apr 2004 22:04:11 +0400 In-Reply-To: <40853D53.1040906@geminix.org> Content-Type: text/plain; charset=koi8-r Content-Transfer-Encoding: 8bit Message-Id: cc: freebsd-performance@freebsd.org Subject: Re[2]: How does disk caching work? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: =?koi8-r?Q?=22?=Igor Shmukler=?koi8-r?Q?=22=20?= List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Apr 2004 18:04:14 -0000 > >>The reason for this algorithm is that it is better to keep pages in the > >>inactive queue for as long as possible, rather than moving them over to > >>the cache queue prematurely. Pages in the inactive queue can still be > >>mapped into the memory space of processes, while pages in the cache > >>queue have lost this association. So, quite naturally, when the VM > >>system has to reactivate a page (put it back into the active queue) this > >>operation tends to be less expensive when the page is still in the > >>inactive queue. > > > > While you are correct that when the cache is empty the kernel will dip into the inactive queue, you are mistaken about other things. Pages on the cache queue still have the association. I wrote that in one of the previous posts. > > > > To sum it up: the cache queue is the same as the inactive queue except that it holds only clean pages. > > > > If things were the way you suggest, the cache queue would be totally useless. > > I think you're mixing up two different things here. The way I > understand the kernel sources, the pages in the cache queue of course > still have their association with the underlying VM object. Otherwise > caching these pages would be useless. But they are no longer mapped > into any process address space. If I may quote the relevant comment > from vm_page_cache(): > > /* > * Remove all pmaps and indicate that the page is not > * writeable or mapped. > */ > > vm_page_cache() is the function that moves the pages from the inactive > to the cache queue once they are clean. Restoring the process address > space mapping is what makes reactivating pages from the cache queue more > expensive than just relinking them from the inactive queue, because a > fault gets generated when the process tries to access the page. This > fault then maps the page from the VM object into the process address > space. This causes additional overhead. > > > I actually pretty much explained the whole rotation process.
If you read my email again, you should understand what happens whenever a page is moved from inactive to cache and then to free. > > You may want to study the kernel sources some more, I'm afraid. First you explicitly write that "pages in the cache queue have lost this association," then you tell me that I don't understand how VM works. Are you trying to suggest that mapping a page and changing its permissions is comparable with reading the page back from backing store? Studying sources never hurts, but filtering the lingo is just as helpful. > >>So, for reasons like these, I keep recommending that you either study the > >>kernel sources before you try to tune the VM system, or leave these > >>variables alone. > > > > I am not sure whether studying kernel sources is really necessary. Virtually every UNIX (R) admin has had to tune a machine, despite sources not being available. > > Sorry, but you just proved my point ... What was that? In the first email you write that the size of the cache queue does not affect disk traffic. In the next you say, no, I did not mean that; I just wanted to say that the cache queue holds pages that lost their association. Now you say, no, of course there is an association, someone just has to study the kernel sources better. The last argument is valid only because it's a moot point. Studying kernel sources never hurts anyone ... I did not originally intend to flame you. I simply thought that some of your answers were not correct. If you had answered off the list, I would not bother, but this is a public mailing list and it is a source of knowledge for many people. VM in particular is a grey area for many developers. Everyone knows what it's for, but few programmers really understand VM or VFS. Now you start giving me funny advice. That's not wise. Sincerely, IS. From owner-freebsd-performance@FreeBSD.ORG Tue Apr 20 12:50:15 2004 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C1D0E16A4CF for ; Tue, 20 Apr 2004 12:50:15 -0700 (PDT) Received: from flake.decibel.org (flake.decibel.org [66.143.173.58]) by mx1.FreeBSD.org (Postfix) with SMTP id 6ED9B43D5D for ; Tue, 20 Apr 2004 12:50:15 -0700 (PDT) (envelope-from decibel@decibel.org) Received: (qmail 58438 invoked by uid 1001); 20 Apr 2004 19:50:10 -0000 Date: Tue, 20 Apr 2004 14:50:10 -0500 From: "Jim C. Nasby" To: freebsd-performance@freebsd.org Message-ID: <20040420195010.GZ87362@nasby.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Operating-System: FreeBSD 4.9-RELEASE-p3 i386 X-Distributed: Join the Effort! http://www.distributed.net User-Agent: Mutt/1.5.6i Subject: vfs.hirunningspace on a 3ware 8506 X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Apr 2004 19:50:15 -0000 Has anyone done any testing to see what value of vfs.hirunningspace is optimal for a 3ware 8506-8? -- Jim C. Nasby, Database Consultant jim@nasby.net Member: Triangle Fraternity, Sports Car Club of America Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?"
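P.S. For anyone who wants to check their own box before tweaking, both knobs can be queried in one go. The output below is illustrative only (the values shown are the stock defaults of this era, 1 MB and 512 KB), not a measurement from the 3ware box:

  sysctl vfs.hirunningspace vfs.lorunningspace
  vfs.hirunningspace: 1048576
  vfs.lorunningspace: 524288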
From owner-freebsd-performance@FreeBSD.ORG Tue Apr 20 13:23:00 2004 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 42C7816A4CE for ; Tue, 20 Apr 2004 13:23:00 -0700 (PDT) Received: from rms04.rommon.net (rms04.rommon.net [212.54.2.140]) by mx1.FreeBSD.org (Postfix) with ESMTP id DED3543D55 for ; Tue, 20 Apr 2004 13:22:58 -0700 (PDT) (envelope-from pete@he.iki.fi) Received: from he.iki.fi (h91.vuokselantie10.fi [193.64.42.145]) by rms04.rommon.net (8.12.10/8.12.9) with ESMTP id i3KKMxmo071131; Tue, 20 Apr 2004 23:22:59 +0300 (EEST) (envelope-from pete@he.iki.fi) Message-ID: <4085869E.7090306@he.iki.fi> Date: Tue, 20 Apr 2004 23:22:54 +0300 From: Petri Helenius User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6) Gecko/20040113 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "Jim C. Nasby" References: <20040420195010.GZ87362@nasby.net> In-Reply-To: <20040420195010.GZ87362@nasby.net> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit cc: freebsd-performance@freebsd.org Subject: Re: vfs.hirunningspace on a 3ware 8506 X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Apr 2004 20:23:00 -0000 Jim C. Nasby wrote: >Has anyone done any testing to see what value of vfs.hirunningspace is >optimal for a 3ware 8506-8? > > Do the 3ware controllers actually care about this value due to the onboard processing and cache? I thought all writes are satisfied immediately? Pete From owner-freebsd-performance@FreeBSD.ORG Tue Apr 20 15:06:40 2004 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6EABB16A4CE for ; Tue, 20 Apr 2004 15:06:40 -0700 (PDT) Received: from gen129.n001.c02.escapebox.net (gen129.n001.c02.escapebox.net [213.73.91.129]) by mx1.FreeBSD.org (Postfix) with ESMTP id 01C3943D45 for ; Tue, 20 Apr 2004 15:06:40 -0700 (PDT) (envelope-from gemini@geminix.org) Message-ID: <40859EED.7040300@geminix.org> Date: Wed, 21 Apr 2004 00:06:37 +0200 From: Uwe Doering Organization: Private UNIX Site User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.6) Gecko/20040119 X-Accept-Language: en-us, en MIME-Version: 1.0 To: freebsd-performance@freebsd.org References: In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Received: from gemini by geminix.org with asmtp (TLSv1:AES256-SHA:256) (Exim 3.36 #1) id 1BG3O2-0009Ls-00; Wed, 21 Apr 2004 00:06:39 +0200 Subject: Re: How does disk caching work? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Apr 2004 22:06:40 -0000 Igor Shmukler wrote: >>[...] >>>I actually pretty much explained the whole rotation process. If you read my email again, you should understand what happens whenever a page is moved from inactive to cache and then to free. >> >>You may want to study the kernel sources some more, I'm afraid. > > First you explicitly write that "pages in the cache queue have lost this association," then you tell me that I don't understand how VM works.
Quote from my initial email about this subject: "Pages in the inactive queue can still be mapped into the memory space of processes, while pages in the cache queue have lost this association." Which means that I specified right from the beginning which association I was referring to (process memory mapping), and that's what I repeated, in more detail, in my email following that. I didn't mention the association with the underlying VM object in this context, which remains intact. So I don't quite see what inconsistency you are accusing me of. >>>>So, for reasons like these, I keep recommending that you either study the >>>>kernel sources before you try to tune the VM system, or leave these >>>>variables alone. >>> >>>I am not sure whether studying kernel sources is really necessary. Virtually every UNIX (R) admin has had to tune a machine, despite sources not being available. >> >>Sorry, but you just proved my point ... > What was that? > > In the first email you write that the size of the cache queue does not affect disk traffic. In the next you say, no, I did not mean that; I just wanted to say that the cache queue holds pages that lost their association. Now you say, no, of course there is an association, someone just has to study the kernel sources better. > > The last argument is valid only because it's a moot point. Studying kernel sources never hurts anyone ... > > I did not originally intend to flame you. I simply thought that some of your answers were not correct. If you had answered off the list, I would not bother, but this is a public mailing list and it is a source of knowledge for many people. VM in particular is a grey area for many developers. Everyone knows what it's for, but few programmers really understand VM or VFS. Now you start giving me funny advice. That's not wise. Okay. I tried to be helpful and share my knowledge, but I don't insist on convincing anybody, at least not in my (scarce) spare time. Since I don't feel that this discussion is going to lead anywhere I suggest that we leave it at that and agree that we disagree.
Uwe -- Uwe Doering | EscapeBox - Managed On-Demand UNIX Servers gemini@geminix.org | http://www.escapebox.net From owner-freebsd-performance@FreeBSD.ORG Tue Apr 20 15:55:16 2004 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id A881C16A4CE for ; Tue, 20 Apr 2004 15:55:16 -0700 (PDT) Received: from gen129.n001.c02.escapebox.net (gen129.n001.c02.escapebox.net [213.73.91.129]) by mx1.FreeBSD.org (Postfix) with ESMTP id A4C7543D54 for ; Tue, 20 Apr 2004 15:55:15 -0700 (PDT) (envelope-from gemini@geminix.org) Message-ID: <4085AA4B.1020700@geminix.org> Date: Wed, 21 Apr 2004 00:55:07 +0200 From: Uwe Doering Organization: Private UNIX Site User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.6) Gecko/20040119 X-Accept-Language: en-us, en MIME-Version: 1.0 To: freebsd-performance@freebsd.org References: <20040420195010.GZ87362@nasby.net> <4085869E.7090306@he.iki.fi> In-Reply-To: <4085869E.7090306@he.iki.fi> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Received: from gemini by geminix.org with asmtp (TLSv1:AES256-SHA:256) (Exim 3.36 #1) id 1BG48z-000AIS-00; Wed, 21 Apr 2004 00:55:09 +0200 Subject: Re: vfs.hirunningspace on a 3ware 8506 X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Apr 2004 22:55:16 -0000 Petri Helenius wrote: > Jim C. Nasby wrote: > >> Has anyone done any testing to see what value of vfs.hirunningspace is >> optimal for a 3ware 8506-8? >> > Do the 3ware controllers actually care about this value due to the > onboard processing and cache? I thought all writes are satisfied > immediately? The controller itself doesn't care, but the kernel does. With the current implementation, the amount of memory associated with outstanding read requests is subtracted from vfs.hirunningspace. With many concurrent read requests there is no reserve left for write operations, so write performance can suffer substantially. This balancing effect is actually intended in order to give read requests some priority, but in high performance systems with fast, caching raid controllers the default value of said variable is too low and therefore poses a bottleneck. Uwe -- Uwe Doering | EscapeBox - Managed On-Demand UNIX Servers gemini@geminix.org | http://www.escapebox.net From owner-freebsd-performance@FreeBSD.ORG Wed Apr 21 10:06:08 2004 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C815216A4CE for ; Wed, 21 Apr 2004 10:06:08 -0700 (PDT) Received: from flake.decibel.org (flake.decibel.org [66.143.173.58]) by mx1.FreeBSD.org (Postfix) with SMTP id 53FFC43D55 for ; Wed, 21 Apr 2004 10:06:08 -0700 (PDT) (envelope-from decibel@decibel.org) Received: (qmail 48539 invoked by uid 1001); 21 Apr 2004 17:05:57 -0000 Date: Wed, 21 Apr 2004 12:05:56 -0500 From: "Jim C. Nasby" To: Uwe Doering Message-ID: <20040421170556.GB41429@nasby.net> References: <20040420195010.GZ87362@nasby.net> <4085869E.7090306@he.iki.fi> <4085AA4B.1020700@geminix.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4085AA4B.1020700@geminix.org> X-Operating-System: FreeBSD 4.9-RELEASE-p3 i386 X-Distributed: Join the Effort! 
http://www.distributed.net User-Agent: Mutt/1.5.6i cc: freebsd-performance@freebsd.org Subject: Re: vfs.hirunningspace on a 3ware 8506 X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 21 Apr 2004 17:06:08 -0000 On Wed, Apr 21, 2004 at 12:55:07AM +0200, Uwe Doering wrote: > Petri Helenius wrote: > >Jim C. Nasby wrote: > > > >>Has anyone done any testing to see what value of vfs.hirunningspace is > >>optimal for a 3ware 8506-8? > >> > >Do the 3ware controllers actually care about this value due to the > >onboard processing and cache? I thought all writes are satisfied > >immediately? > > The controller itself doesn't care, but the kernel does. With the > current implementation, the amount of memory associated with outstanding > read requests is subtracted from vfs.hirunningspace. With many > concurrent read requests there is no reserve left for write operations, > so write performance can suffer substantially. > > This balancing effect is actually intended in order to give read > requests some priority, but in high performance systems with fast, > caching raid controllers the default value of said variable is too low > and therefore poses a bottleneck. Unfortunately, it seems the 8500 series only has 1.8MB of cache, so it seems like the out-of-the-box setting of 1M may not be too far off. Is it normally advisable to set vfs.hirunningspace = whatever the controller's cache is? -- Jim C. Nasby, Database Consultant jim@nasby.net Member: Triangle Fraternity, Sports Car Club of America Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?" From owner-freebsd-performance@FreeBSD.ORG Wed Apr 21 11:42:48 2004 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id F2AF616A4CE for ; Wed, 21 Apr 2004 11:42:47 -0700 (PDT) Received: from gen129.n001.c02.escapebox.net (gen129.n001.c02.escapebox.net [213.73.91.129]) by mx1.FreeBSD.org (Postfix) with ESMTP id 8411B43D4C for ; Wed, 21 Apr 2004 11:42:47 -0700 (PDT) (envelope-from gemini@geminix.org) Message-ID: <4086C0A3.30407@geminix.org> Date: Wed, 21 Apr 2004 20:42:43 +0200 From: Uwe Doering Organization: Private UNIX Site User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.6) Gecko/20040119 X-Accept-Language: en-us, en MIME-Version: 1.0 To: freebsd-performance@freebsd.org References: <20040420195010.GZ87362@nasby.net> <4085869E.7090306@he.iki.fi> <4085AA4B.1020700@geminix.org> <20040421170556.GB41429@nasby.net> In-Reply-To: <20040421170556.GB41429@nasby.net> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Received: from gemini by geminix.org with asmtp (TLSv1:AES256-SHA:256) (Exim 3.36 #1) id 1BGMgH-0009x7-00; Wed, 21 Apr 2004 20:42:46 +0200 Subject: Re: vfs.hirunningspace on a 3ware 8506 X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 21 Apr 2004 18:42:48 -0000 Jim C. Nasby wrote: > On Wed, Apr 21, 2004 at 12:55:07AM +0200, Uwe Doering wrote: >>Petri Helenius wrote: >>>Jim C. 
Nasby wrote: >>> >>>>Has anyone done any testing to see what value of vfs.hirunningspace is >>>>optimal for a 3ware 8506-8? >>> >>>Do the 3ware controllers actually care about this value due to the >>>onboard processing and cache? I thought all writes are satisfied >>>immediately? >> >>The controller itself doesn't care, but the kernel does. With the >>current implementation, the amount of memory associated with outstanding >>read requests is subtracted from vfs.hirunningspace. With many >>concurrent read requests there is no reserve left for write operations, >>so write performance can suffer substantially. >> >>This balancing effect is actually intended in order to give read >>requests some priority, but in high performance systems with fast, >>caching raid controllers the default value of said variable is too low >>and therefore poses a bottleneck. > > Unfortunately, it seems the 8500 series only has 1.8MB of cache, so it > seems like the out-of-the-box setting of 1M may not be too far off. > > Is it normally advisable to set vfs.hirunningspace = whatever the > controller's cache is? Well, 1.8 MB is not much. The Adaptec controllers we use have 16 MB buffer space, so I set 'vfs.hirunningspace' to half of that (8 MB). Basically by the seat of my pants. My thinking was that in cases where all of this amount was used for write operations the kernel shouldn't be able to completely flush the controller's buffer in one go. Just a conservative approach. In your case, you'll probably get away with picking a value equal to or even larger than the controller's cache. It might result in a slight performance penalty, but if your database server really suffers from write starvation due to many concurrent read requests, removing this bottleneck is likely to outweigh that penalty by far. In any case, unless the effect of tweaking 'vfs.hirunningspace' is outright spectacular you will probably have to run benchmark tests in order to find the best value for your server. Uwe -- Uwe Doering | EscapeBox - Managed On-Demand UNIX Servers gemini@geminix.org | http://www.escapebox.net
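As a rough sketch of such a benchmark run (everything here is hypothetical -- the paths, file sizes, and candidate values are placeholders to adapt to your own array):

  #!/bin/sh
  # Sweep a few vfs.hirunningspace values and time a large sequential
  # write while a background read keeps the array busy.
  for hi in 1048576 2097152 4194304 8388608; do
      sysctl -w vfs.hirunningspace=$hi
      sysctl -w vfs.lorunningspace=$((hi / 2))  # keep the 1/2 ratio suggested earlier
      dd if=/raid/bigfile of=/dev/null bs=1m &  # background read load (placeholder path)
      /usr/bin/time dd if=/dev/zero of=/raid/testfile bs=1m count=1024
      wait
      rm -f /raid/testfile
  done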