Message-Id: <200203271940.g2RJes414681@freefall.freebsd.org>
Date: Wed, 27 Mar 2002 11:40:54 -0800 (PST)
From: Alexander Haderer
To: freebsd-gnats-submit@FreeBSD.org
X-Send-Pr-Version: www-1.0
Subject: kern/36381: ata + hw.ata.wc=1: high CPU load for large filesystems writes
Sender: owner-freebsd-bugs@FreeBSD.ORG

>Number:         36381
>Category:       kern
>Synopsis:       ata + hw.ata.wc=1: high CPU load for large filesystems writes
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Mar 27 11:50:09 PST 2002
>Closed-Date:
>Last-Modified:
>Originator:     Alexander Haderer
>Release:        4.5 Release @ i386
>Organization:
Charite Hospital Berlin Germany
>Environment:
FreeBSD marvin08.str.charite.de 4.5-RELEASE FreeBSD 4.5-RELEASE #0: Wed Mar 27 14:23:53 CET 2002 root@marvin08.str.charite.de:/usr/src/sys/compile/FBSD45_MARVIN i386
>Description:
We are running a fileserver with 8 IDE 100GB disks set up as a RAID 10
with vinum as one large drive. df -m shows:

Filesystem          1M-blocks   Used  Avail Capacity  Mounted on
/dev/vinum/marvin0     369862 353166  16695    95%    /xxx

All disks run at UDMA100 on four ATA channels. Booting is done from a
SCSI disk. /var/log/messages is silent.
ASUS TUSL2 board with PIII 1,x MHz

The dirs and files below /xxx are organized in a flat directory
hierarchy of depth 4. All files are stored in the level 4 dirs. Dirs at
level 4 contain 2..2000 files (organization is given). In other words:

  Level 1:    10 dirs
  Level 2:   900 dirs
  Level 3: 12500 dirs
  Level 4: 35000 dirs
  Files in Level 4 dirs: approx. 1.1 million

The files to be stored come in via NFS, so the problem was first noticed
as high CPU load of nfsd: top shows a load average of 8 and more, with
nfsd at 80% and system at about 100%.

By playing around with cp, the high CPU load for file access could be
narrowed down to the following scenario on the server:

    hw.ata.wc=1
    (everything local, no network access, no NFS)
    mkdir /xxx/scratch
    cd /xxx/scratch
    mkdir 300 dir hierarchies of depth 3 with no files at all
    cp -R /from/scsi/testtree /xxx/scratch

Where testtree is a 3-level hierarchy with approx. 200 files and 100MB
total disk usage.

The cp first runs fast with low CPU usage (top), but after writing some
megabytes the CPU load of cp climbs towards 100%, system jumps up to
100%, and the load average goes up to 8 and more. The whole system
becomes slow.

When the fs is not "full" enough (70% cap.?) cp is fast.
When disabling softupdates the effect comes a little bit later, and
top's system is about 5% lower when cp runs fast (soft update penalty).
When the scratch dir is empty cp is fast.
Switching UFS_DIRHASH in the kernel has no effect for cp (in the context
given here).
Mounting noatime has no effect for cp (in the context given here).

Switching off the ATA write cache with hw.ata.wc=0 in loader.conf makes
cp's CPU usage acceptable (below 10%, as well as system).
Write speed goes down as follows (which is OK for us):

before (hw.ata.wc=1):
    dd if=/dev/zero of=/xxx/100mb bs=1024k count=100   --> 25..30 MB/s
    cp -R /from/scsi/testtree /xxx                     --> 7 s

after (hw.ata.wc=0, softupdates on):
    dd if=/dev/zero of=/xxx/100mb bs=1024k count=100   --> 6..7 MB/s
    cp -R /from/scsi/testtree /xxx                     --> 18 s

See also PR 35151.
>How-To-Repeat:
Put a large filesystem on an ATA disk, fill it to 70% or more, set
hw.ata.wc=1, cp -R a directory hierarchy of some megabytes to the
filesystem and watch top. (One probably needs a filesystem that is very
large by today's standards.)
>Fix:
Switch the ATA write cache off with hw.ata.wc=0 in loader.conf. Write
access slows down by about a factor of 3; read access remains fast.
>Release-Note:
>Audit-Trail:
>Unformatted:
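
The How-To-Repeat steps could be scripted roughly as follows. This is only a sketch and not part of the original report: the variable names TARGET, SRC and COUNT, the helper make_empty_tree and the directory names d1/l2/l3 are made up for illustration. By default it works in a temporary directory, so it does nothing harmful; to test for real, point TARGET at the nearly full ATA-backed filesystem (/xxx in the report) and SRC at a test tree on another disk (/from/scsi/testtree in the report), then watch top in a second terminal.

```shell
#!/bin/sh
# Sketch of the reproduction steps; all names here are illustrative assumptions.
TARGET=${TARGET:-$(mktemp -d)}   # stands in for /xxx from the report
SRC=${SRC:-}                     # stands in for /from/scsi/testtree
COUNT=${COUNT:-300}              # number of empty hierarchies, as in the report

# Create $2 empty directory hierarchies of depth 3 below directory $1.
make_empty_tree() {
    base=$1
    n=$2
    i=1
    while [ "$i" -le "$n" ]; do
        mkdir -p "$base/d$i/l2/l3"
        i=$((i + 1))
    done
}

# Show the current write-cache setting where the sysctl exists (FreeBSD ata driver).
sysctl -n hw.ata.wc 2>/dev/null || echo "hw.ata.wc not available on this system"

mkdir -p "$TARGET/scratch"
make_empty_tree "$TARGET/scratch" "$COUNT"

# Copy the test tree only if one was supplied, and report the elapsed seconds.
if [ -n "$SRC" ] && [ -d "$SRC" ]; then
    start=$(date +%s)
    cp -R "$SRC" "$TARGET/scratch"
    end=$(date +%s)
    echo "cp -R took $((end - start)) s"
else
    echo "SRC not set; created empty hierarchies only in $TARGET/scratch"
fi
```

A real run would look like `TARGET=/xxx SRC=/from/scsi/testtree sh repro.sh`, once with hw.ata.wc=1 and once with hw.ata.wc=0 set in loader.conf, comparing the reported cp time and the CPU load shown by top.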