Message-Id: <200612111832.SAA23589@sopwith.solgatos.com>
To: freebsd-questions@freebsd.org
Date: Mon, 11 Dec 2006 10:32:53 +0000
From: Dieter
Reply-To: freebsd@sopwith.solgatos.com
Subject: processes not getting fair share of available disk I/O (was: Re: TCP parameters and interpreting tcpdump output )

> Did this problem start before you made port2file run with rtprio?

Yes. I only added rtprio because it wasn't working.

> Can you please include a copy of your kernel configuration file and dmesg?

I think you asked that before: :-)

> > OK, that's correct. Can you also provide details of your disk
> > hardware (e.g. dmesg) and kernel configuration?
>
> FreeBSD 6.0
>
> Kernel is stock except for addition of:
>
> device atapicam # needed to burn dvd
>
> /boot/loader.conf:
>
> console="comconsole"
> hw.ata.wc=0
> hw.ata.atapi_dma="1"
> kern.ipc.nmbclusters="256000"
>
> Mainboard: Tyan Tomcat k8e 2865
>
> CPU: AMD64 3000+
>
> Chipset: Nvidia nforce4 ultra
>
> Memory: 2 GB DDR400 ECC
>
> Disks: 4x Seagate 7200 rpm SATA
>        1x Seagate 7200 rpm PATA
>        1x LG CD/DVD
>
> atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xe000-0xe00f at device 6.0 on pci0
> ata0: on atapci0
> ata1: on atapci0
> atapci1: port 0x9f0-0x9f7,0xbf0-0xbf3,0x970-0x977,0xb70-0xb73,0xcc00-0xcc0f mem 0xfebfb000-0xfebfbfff irq 10 at device 7.0 on pci0
> ata2: on atapci1
> ata3: on atapci1
> atapci2: port 0x9e0-0x9e7,0xbe0-0xbe3,0x960-0x967,0xb60-0xb63,0xb800-0xb80f mem 0xfebfa000-0xfebfafff irq 11 at device 8.0 on pci0
> ata4: on atapci2
> ata5: on atapci2
> acd0: DVDR at ata0-master UDMA66
> ad2: 305245MB at ata1-master UDMA100
> ad4: 238475MB at ata2-master SATA150
> ad6: 238475MB at ata3-master SATA150
> ad8: 238475MB at ata4-master SATA150
> ad10: 305245MB at ata5-master SATA150
> cd0 at ata0 bus 0 target 0 lun 0

Since then I added another Seagate 7200 rpm PATA, connected via a
PATA-to-USB bridge, the idea being to get a different controller path
to a disk. I think all I/O has to go through the nforce one way or
another, though. This USB disk writes at about 15 MB/s instead of
the 6-7 MB/s, but otherwise they interfere with each other the same as
two disks connected directly to the nforce. Perhaps a clue in there
somewhere?

umass0: Prolific Technology Inc. ATAPI-6 Bridge Controller, rev 2.00/0.01, addr 2
da0 at umass-sim0 bus 0 target 0 lun 0
da0: Fixed Direct Access SCSI-0 device
da0: 40.000MB/s transfers
da0: 305245MB (625142449 512 byte sectors: 255H 63S/T 38913C)

The Ethernet is on the mainboard:

pcib5: at device 13.0 on pci0
pci5: on pcib5
bge0: mem 0xfe4f0000-0xfe4fffff irq 11 at device 0.0 on pci5
miibus1: on bge0
brgphy0: on miibus1
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto

The only stuff that says Giant or GIANT-LOCKED is

atkbd0  only used with firmware
usb     the new disk, otherwise not used
nve     not in use
fwe     not in use

Is Giant the only mutex/lock that could be a bottleneck across disks?

I can't figure out anything else that would create a common bottleneck
across drives. The nforce can read from all four SATA drives at once
as fast as the disks can go, 65-70 MB/s per drive at the fast end of
the platter. I assume that the nforce doesn't care about read vs write,
and is not the bottleneck. The filesystem has to allocate blocks and
such, but that shouldn't be common across drives. It does this without
the CPU being maxed out, assuming you believe the numbers from
systat -vmstat or top. Memory buffer cache? However they do that
these days...

I was thinking maybe part of port2file's circular buffer was getting
paged out, so I added mlock(2) of the buffer. Still fails. :-(

Writing to disk doesn't seem to hurt the Ethernet. If I direct the
output of port2file to /dev/null it works fine.

I don't suppose you happen to know how to enable SATA's NCQ queuing?

I did some experiments with rtprio and dd. rtprio reduces the effect
of other disk activity somewhat, but not enough. I noticed that the
transfer rates as reported by systat -vmstat varied more than I would
expect. First one disk would be faster for a few seconds, then the
other. Sometimes they would be about equal.
The sum of the two drives looked to be approx constant.
The sum was only slightly higher than a single drive by itself. It certainly smells like there is *some* single resource for writing that all the disks have to share.
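
The two-disk write comparison described above (write to one drive alone,
then to two drives at once, and compare the summed rate) can be sketched
roughly as follows. This is an illustrative Python sketch, not the dd
commands actually used: the names timed_write and concurrent_write, the
1 MB block size, and the temporary-directory paths in the demo are all
assumptions. For a real test the paths would need to point at files on
separate drives.

```python
import os
import tempfile
import threading
import time


def timed_write(path, total_bytes, block_size=1 << 20):
    """Sequentially write total_bytes of zeros to path, fsync, return MB/s."""
    block = b"\0" * block_size
    written = 0
    start = time.monotonic()
    with open(path, "wb") as f:
        while written < total_bytes:
            n = min(block_size, total_bytes - written)
            f.write(block[:n])
            written += n
        f.flush()
        os.fsync(f.fileno())  # include the time to reach the disk
    elapsed = time.monotonic() - start
    return written / (1024 * 1024) / max(elapsed, 1e-9)


def concurrent_write(paths, total_bytes, block_size=1 << 20):
    """Run timed_write on every path at the same time; return {path: MB/s}."""
    rates = {}

    def worker(p):
        rates[p] = timed_write(p, total_bytes, block_size)

    threads = [threading.Thread(target=worker, args=(p,)) for p in paths]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return rates


if __name__ == "__main__":
    # Demo only: both files land in one temporary directory (hypothetical
    # stand-ins for files on two different drives).
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a.bin")
        b = os.path.join(d, "b.bin")
        size = 8 * 1024 * 1024
        alone = timed_write(a, size)
        both = concurrent_write([a, b], size)
        print("alone: %.1f MB/s" % alone)
        print("both:  %.1f MB/s total" % sum(both.values()))
```

If the drives really were independent, the total for the concurrent run
would be expected to approach the sum of the individual rates; a total
close to the single-drive figure would match the behaviour reported here.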