From owner-freebsd-performance@FreeBSD.ORG Fri Dec 19 01:31:20 2008
From: Paul Patterson <pathiaki2@yahoo.com>
Date: Thu, 18 Dec 2008 17:04:37 -0800 (PST)
To: freebsd-performance@freebsd.org
Subject: ZFS, NFS and Network tuning

Hi,

I just set up my first machine with ZFS. (First: ZFS is nothing short of amazing.) I'm running FreeBSD 7.1-RC1 as an NFS server with ZFS striped across two volumes (just testing throughput for now).

Anyhow, I was benchmarking this box: 4GB of RAM, the volume on 2x146GB 10K RPM SAS drives, in an HP ProLiant DL360 with dual Gigabit interfaces (device bce). I believe I have tuned this box to the hilt with every parameter I can think of (it's at work right now, so I'll cut and paste all the sysctl and loader.conf parameters for ZFS and networking tomorrow), and it still seems to have some kind of bottleneck.

I have two Debian Linux clients that I use for benchmarking. I run a script that writes to the NFS mount and, after about 30 minutes, starts deleting the initial data, following behind itself, writing and deleting. Here's what's happening:

The "other" machine is a NetApp. It has 1GB of RAM and runs RAID-DP with 2 parity drives and 6 data drives, all 750GB 7200 RPM SATA drives, with dual Gigabit interfaces. The benchmark script manages to write lots of little files (all under 30KB) at a rate of 11,000 per minute; however, after 30 minutes, when the deleting starts, write throughput drops to 9,500 per minute and deletions run at 6,000 per minute. If I turn on the second client node, I get 17,000 writes per minute combined and about 11,000 deletions per minute combined. Either way, writes outpace deletes, so the filer will fill up in time. Not good.

Now, on to my pet project. :-) The FreeBSD/ZFS server is only able to maintain about 3,500 writes per minute, but it also deletes at that same rate!
(I would expect deletion to be at least as fast as writing.) The drives are only 20-35% busy while this is going on and are only putting down about 4-5 MB/sec each. So, against a Gigabit link's ~92 MB/sec practical maximum (is that about right?), something is wrong somewhere. I'm assuming it's the network. (I'll post all the tunings tomorrow.)

Thinking something was wrong, I mounted only one client to each server (the clients are identical and configured the same as the FreeBSD box) and ran a simple stream of:

dd if=/dev/zero of=/mnt/nfs bs=1m count=1000

The FreeBSD box wins?! It cranked the drives up to 45-50 MB/sec each and balanced them perfectly on transactions/sec, KB/sec, etc. in systat -vm. (Woohoo!) The NetApp's CPU was constantly over 35-40% (it does that while running the benchmark, too). I'll post the NetApp findings tomorrow, as I don't have them in front of me.

As for the client mount, it was with the options:

nfsvers=3,rsize=32768,wsize=32768,hard,intr,async,noatime

I'm trying to figure out why, on this benchmark, the NetApp with WAFL can nearly triple the FreeBSD/ZFS box.

Also, something strange happens when I mount the export from the FreeBSD server versus the NetApp: the FreeBSD mount will sometimes time out with an RPC error, while mounting the NetApp is instantaneous.

That's the beginning. If I get a list of things to check tomorrow, I'll work through it. I'd like to see the little machine that could kick the NetApp's butt. (No offense to NetApp. :-) )

Thank you for reading,

Paul
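A quick back-of-the-envelope check on that ~92 MB/sec figure, assuming standard 1500-byte frames and counting only Ethernet, IP and TCP overhead (NFS/RPC headers would lower it further):

  # 1 Gbit/s = 125 MB/s raw; each full-size segment carries 1448 payload
  # bytes (1500 MTU - 20 IP - 20 TCP - 12 RFC 1323 timestamps) per 1538
  # bytes on the wire (frame + FCS + preamble + inter-frame gap):
  echo "scale=1; 125 * 1448 / 1538" | bc
  # => 117.6 MB/s

So the hard TCP ceiling is closer to ~117 MB/sec; ~92 MB/sec is a reasonable real-world expectation for NFS somewhere below that.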
From owner-freebsd-performance@FreeBSD.ORG Fri Dec 19 14:48:00 2008
From: Paul Patterson <pathiaki2@yahoo.com>
Date: Fri, 19 Dec 2008 06:47:59 -0800 (PST)
To: Paul Patterson, freebsd-performance@freebsd.org
Subject: Re: ZFS, NFS and Network tuning

Hi,

As promised, the parameter tuning I have on the box. (Does anyone see anything wrong?)

/boot/loader.conf

kern.hz="100"
vm.kmem_size_max="1536M"
vm.kmem_size="1536M"
vfs.zfs.prefetch_disable=1

/etc/sysctl.conf

kern.ipc.maxsockbuf=16777216
kern.ipc.nmbclusters=32768
kern.ipc.somaxconn=8192
kern.maxfiles=65536
kern.maxfilesperproc=32768
kern.maxvnodes=600000
net.inet.tcp.delayed_ack=0
net.inet.tcp.inflight.enable=0
net.inet.tcp.path_mtu_discovery=0
net.inet.tcp.recvbuf_auto=1
net.inet.tcp.recvbuf_inc=16384
net.inet.tcp.recvbuf_max=16777216
net.inet.tcp.recvspace=65536
net.inet.tcp.rfc1323=1
net.inet.tcp.sendbuf_auto=1
net.inet.tcp.sendbuf_inc=8192
net.inet.tcp.sendspace=65536
net.inet.udp.maxdgram=57344
net.inet.udp.recvspace=65536
net.local.stream.recvspace=65536
net.inet.tcp.sendbuf_max=16777216
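One caveat with lists like this: a misspelled variable name in /etc/sysctl.conf or /boot/loader.conf fails more or less silently (at best an "unknown oid" console message at boot), so it is worth reading the values back at runtime. A minimal check, using the names above:

  # Read back a few tunables to confirm they were applied at boot;
  # an "unknown oid" reply means the name in the config file is wrong.
  sysctl vfs.zfs.prefetch_disable vm.kmem_size kern.maxvnodes
  sysctl kern.ipc.maxsockbuf net.inet.tcp.sendbuf_inc net.inet.tcp.rfc1323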
From owner-freebsd-performance@FreeBSD.ORG Fri Dec 19 18:03:16 2008
From: Paul Patterson <pathiaki2@yahoo.com>
Date: Fri, 19 Dec 2008 10:03:14 -0800 (PST)
To: Paul Patterson, freebsd-performance@freebsd.org
Subject: Re: ZFS, NFS and Network tuning

Hello all,

I guess I've got to send this, as I've already had about five responses claiming the same thing: this is not a disk bottleneck. The ZFS pool is capable of performing at the theoretical maximum of the drives, and the machine is currently moving less than 5 MB/sec combined. I'm assuming this is a problem with NFSv3 throughput.

I just dd'd 1000 1MB records (about 1GB) from the clients to their respective servers:

Client 1 to NetApp: three tests at 45.9, 45.1, 46.1 MB/sec. Pretty consistent.
Client 2 to FreeBSD/ZFS: three tests at 29.7, 12.5, 19.1 MB/sec. NOT consistent. (Also, the drives were lucky to hit 12% busy.)

I'm about to swap the servers between the clients and see if there's any variation (although the clients are hardware-configured the same and were bought at the same time). I'll write after this.
In the meantime, if more people could review the configuration posted earlier and flag anything glaring, please do. The lack of consistency suggests something is wrong on the network side.

P.
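For anyone reproducing the test, a minimal sketch run from one of the Linux clients (server name, export path and mount point are placeholders; GNU dd wants the block-size suffix spelled 1M):

  #!/bin/sh
  # Mount with the options from the first message, then stream 1GB
  # three times; dd's final status line reports the rate.
  mount -t nfs -o nfsvers=3,rsize=32768,wsize=32768,hard,intr,async,noatime \
      server:/export /mnt/nfs
  for i in 1 2 3; do
      dd if=/dev/zero of=/mnt/nfs/ddtest.$i bs=1M count=1000 2>&1 | tail -1
  done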
From owner-freebsd-performance@FreeBSD.ORG Fri Dec 19 18:59:54 2008
From: Paul Patterson <pathiaki2@yahoo.com>
Date: Fri, 19 Dec 2008 10:59:54 -0800 (PST)
To: Paul Patterson, freebsd-performance@freebsd.org
Subject: Re: ZFS, NFS and Network tuning

Hi,

Well, I got some input on things:

kern.ipc.somaxconn=32768
net.inet.tcp.mssdflt=1460

And for fstab:

rw,tcp,intr,noatime,nfsv3,-w=65536,-r=65536

I tried turning on polling with ifconfig bce0 polling; however, the flag didn't show up in the ifconfig bce0 output afterwards, so I don't believe it's active, or the card doesn't support it.
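That silent failure is consistent with polling simply not being available: ifconfig only accepts the flag if the kernel was built with DEVICE_POLLING, and then only for drivers listed in polling(4); as far as I can tell, bce(4) is not among them on 7.x. A quick check, assuming a custom kernel:

  # polling(4) needs kernel support compiled in:
  #   options DEVICE_POLLING   (kernel config)
  #   options HZ=1000          (recommended by polling(4))
  ifconfig bce0 polling              # request polling on the interface
  ifconfig bce0 | grep -c POLLING    # prints 1 only if the flag stuck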
I also removed async from the mounts.

These changes had a detrimental effect on the FreeBSD server. I now get 64K per transfer (systat -vm), but I'm still only getting about 4 MB/sec on the disks, and their utilization has dropped to about 5%. Throughput from both clients is ~8.5 MB/sec (the tests were run separately). The NetApp from each host was over 48.5 MB/sec. The FreeBSD host still has about 2 GB of memory free.

Paul
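For reference, those options in a FreeBSD-style /etc/fstab entry would look something like this (server name, export path and mount point are placeholders):

  # NFSv3 over TCP, interruptible, no atime updates, 64KB read/write sizes
  server:/export  /mnt/nfs  nfs  rw,tcp,intr,noatime,nfsv3,-w=65536,-r=65536  0  0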
From owner-freebsd-performance@FreeBSD.ORG Fri Dec 19 22:14:48 2008
From: Mike Tancsa <mike@sentex.net>
Date: Fri, 19 Dec 2008 17:01:46 -0500
To: freebsd-performance@freebsd.org
Subject: intel i7 and Hyperthreading

Just got our first board to play around with, and unlike in the past, having Hyperthreading enabled seems to help performance, at least in buildworld tests. Doing make -j4, -j6, -j8 and -j10 gives:

 -j   buildworld time   improvement over -j4
  4   13:57             --
  6   12:11             13%
  8   11:32             18%
 10   11:43             17%

dmesg of the hardware below. The CPU seems to run fairly cool, but the board has a lot of nasty hot heatsinks. E.g., running 8 burnP6 processes:

0[ns3c]# sysctl -a | grep temperature
dev.cpu.0.temperature: 67
dev.cpu.1.temperature: 67
dev.cpu.2.temperature: 65
dev.cpu.3.temperature: 65
dev.cpu.4.temperature: 66
dev.cpu.5.temperature: 66
dev.cpu.6.temperature: 64
dev.cpu.7.temperature: 64

vs. idle:

dev.cpu.0.temperature: 46
dev.cpu.1.temperature: 46
dev.cpu.2.temperature: 42
dev.cpu.3.temperature: 42
dev.cpu.4.temperature: 44
dev.cpu.5.temperature: 44
dev.cpu.6.temperature: 40
dev.cpu.7.temperature: 40
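For anyone wanting to repeat the comparison, a minimal sketch (the clean-object-tree step and log paths are assumptions, and coretemp(4) must be loaded for the dev.cpu.N.temperature sysctls to exist):

  #!/bin/sh
  # Time buildworld at several -j levels, sampling core temperatures
  # before each run; output lands in /tmp/bw.<j>.log.
  cd /usr/src || exit 1
  for j in 4 6 8 10; do
      rm -rf /usr/obj/usr                  # clean object tree for a fair run
      sysctl dev.cpu | grep temperature    # baseline reading
      /usr/bin/time -h make -j$j buildworld > /tmp/bw.$j.log 2>&1
  done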
Copyright (c) 1992-2008 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.1-PRERELEASE #0: Fri Dec 19 19:48:15 EST 2008
    mdtancsa@ns3c.recycle.net:/usr/obj/usr/src/sys/recycle
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz (2666.78-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x106a4  Stepping = 4
  Features=0xbfebfbff
  Features2=0x98e3bd
  AMD Features=0x28100000
  AMD Features2=0x1
  Cores per package: 8
  Logical CPUs per core: 2
real memory  = 2138992640 (2039 MB)
avail memory = 2084880384 (1988 MB)
ACPI APIC Table:
FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
 cpu2 (AP): APIC ID:  2
 cpu3 (AP): APIC ID:  3
 cpu4 (AP): APIC ID:  4
 cpu5 (AP): APIC ID:  5
 cpu6 (AP): APIC ID:  6
 cpu7 (AP): APIC ID:  7
ACPI Warning (tbfadt-0505): Optional field "Pm2ControlBlock" has zero address or length: 0 450/0 [20070320]
ioapic0 irqs 0-23 on motherboard
lapic0: Forcing LINT1 to edge trigger
kbd1 at kbdmux0
acpi0: on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
acpi_button0: on acpi0
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
pcib1: irq 16 at device 1.0 on pci0
pci1: on pcib1
pcib2: irq 16 at device 3.0 on pci0
pci2: on pcib2
3ware device driver for 9000 series storage controllers, version: 3.70.05.001
twa0: <3ware 9000 series Storage Controller> port 0x3000-0x30ff mem 0xf4000000-0xf5ffffff,0xe4300000-0xe4300fff irq 16 at device 0.0 on pci2
twa0: [ITHREAD]
twa0: INFO: (0x04: 0x003C): Initialize paused: unit=0
twa0: INFO: (0x15: 0x1300): Controller details:: Model 9650SE-2LP, 2 ports, Firmware FE9X 4.06.00.004, BIOS BE9X 4.05.00.015
pcib3: irq 16 at device 7.0 on pci0
pci3: on pcib3
pci0: at device 16.0 (no driver attached)
pci0: at device 16.1 (no driver attached)
pci0: at device 20.0 (no driver attached)
pci0: at device 20.1 (no driver attached)
pci0: at device 20.2 (no driver attached)
pci0: at device 20.3 (no driver attached)
em0: port 0x40e0-0x40ff mem 0xe4400000-0xe441ffff,0xe4422000-0xe4422fff irq 20 at device 25.0 on pci0
em0: Using MSI interrupt
em0: [FILTER]
em0: Ethernet address: 00:1c:c0:92:d5:83
uhci0: port 0x40c0-0x40df irq 16 at device 26.0 on pci0
uhci0: [GIANT-LOCKED]
uhci0: [ITHREAD]
usb0: on uhci0
usb0: USB revision 1.0
uhub0: on usb0
uhub0: 2 ports with 2 removable, self powered
uhci1: port 0x40a0-0x40bf irq 21 at device 26.1 on pci0
uhci1: [GIANT-LOCKED]
uhci1: [ITHREAD]
usb1: on uhci1
usb1: USB revision 1.0
uhub1: on usb1
uhub1: 2 ports with 2 removable, self powered
uhci2: port 0x4080-0x409f irq 19 at device 26.2 on pci0
uhci2: [GIANT-LOCKED]
uhci2: [ITHREAD]
usb2: on uhci2
usb2: USB revision 1.0
uhub2: on usb2
uhub2: 2 ports with 2 removable, self powered
ehci0: mem 0xe4421000-0xe44213ff irq 18 at device 26.7 on pci0
ehci0: [GIANT-LOCKED]
ehci0: [ITHREAD]
usb3: EHCI version 1.0
usb3: companion controllers, 2 ports each: usb0 usb1 usb2
usb3: on ehci0
usb3: USB revision 2.0
uhub3: on usb3
uhub3: 6 ports with 6 removable, self powered
pci0: at device 27.0 (no driver attached)
pcib4: irq 17 at device 28.0 on pci0
pci4: on pcib4
pcib5: irq 16 at device 28.1 on pci0
pci5: on pcib5
bge0: mem 0xe4200000-0xe420ffff irq 17 at device 0.0 on pci5
miibus0: on bge0
brgphy0: PHY 1 on miibus0
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
bge0: Ethernet address: 00:10:18:14:15:12
bge0: [ITHREAD]
pcib6: irq 17 at device 28.4 on pci0
pci6: on pcib6
atapci0: port 0x2018-0x201f,0x2024-0x2027,0x2010-0x2017,0x2020-0x2023,0x2000-0x200f mem 0xe4100000-0xe41003ff irq 16 at device 0.0 on pci6
atapci0: [ITHREAD]
ata2: on atapci0
ata2: [ITHREAD]
ata3: on atapci0
ata3: [ITHREAD]
uhci3: port 0x4060-0x407f irq 23 at device 29.0 on pci0
uhci3: [GIANT-LOCKED]
uhci3: [ITHREAD]
usb4: on uhci3
usb4: USB revision 1.0
uhub4: on usb4
uhub4: 2 ports with 2 removable, self powered
uhci4: port 0x4040-0x405f irq 19 at device 29.1 on pci0
uhci4: [GIANT-LOCKED]
uhci4: [ITHREAD]
usb5: on uhci4
usb5: USB revision 1.0
uhub5: on usb5
uhub5: 2 ports with 2 removable, self powered
uhci5: port 0x4020-0x403f irq 18 at device 29.2 on pci0
uhci5: [GIANT-LOCKED]
uhci5: [ITHREAD]
usb6: on uhci5
usb6: USB revision 1.0
uhub6: on usb6
uhub6: 2 ports with 2 removable, self powered
ehci1: mem 0xe4420000-0xe44203ff irq 23 at device 29.7 on pci0
ehci1: [GIANT-LOCKED]
ehci1: [ITHREAD]
usb7: EHCI version 1.0
usb7: companion controllers, 2 ports each: usb4 usb5 usb6
usb7: on ehci1
usb7: USB revision 2.0
uhub7: on usb7
uhub7: 6 ports with 6 removable, self powered
pcib7: at device 30.0 on pci0
pci7: on pcib7
vgapci0: port 0x1000-0x10ff mem 0xe2000000-0xe3ffffff,0xe0000000-0xe1ffffff irq 18 at device 2.0 on pci7
fwohci0: mem 0xe4004000-0xe40047ff,0xe4000000-0xe4003fff irq 19 at device 3.0 on pci7
fwohci0: [FILTER]
fwohci0: OHCI version 1.10 (ROM=0)
fwohci0: No. of Isochronous channels is 4.
fwohci0: EUI64 00:90:27:00:02:39:70:e3
fwohci0: Phy 1394a available S400, 2 ports.
fwohci0: Link S400, max_rec 2048 bytes.
firewire0: on fwohci0
fwe0: on firewire0
if_fwe0: Fake Ethernet address: 02:90:27:39:70:e3
fwe0: Ethernet address: 02:90:27:39:70:e3
fwip0: on firewire0
fwip0: Firewire address: 00:90:27:00:02:39:70:e3 @ 0xfffe00000000, S400, maxrec 2048
dcons_crom0: on firewire0
dcons_crom0: bus_addr 0x102c000
sbp0: on firewire0
fwohci0: Initiate bus reset
fwohci0: BUS reset
fwohci0: node_id=0xc800ffc0, gen=1, CYCLEMASTER mode
isab0: at device 31.0 on pci0
isa0: on isab0
atapci1: port 0x4158-0x415f,0x416c-0x416f,0x4150-0x4157,0x4168-0x416b,0x4130-0x413f,0x4120-0x412f irq 19 at device 31.2 on pci0
atapci1: [ITHREAD]
ata4: on atapci1
ata4: [ITHREAD]
ata5: on atapci1
ata5: [ITHREAD]
pci0: at device 31.3 (no driver attached)
atapci2: port 0x4148-0x414f,0x4164-0x4167,0x4140-0x4147,0x4160-0x4163,0x4110-0x411f,0x4100-0x410f irq 18 at device 31.5 on pci0
atapci2: [ITHREAD]
ata6: on atapci2
ata6: [ITHREAD]
ata7: on atapci2
ata7: [ITHREAD]
cpu0: on acpi0
est0: on cpu0
p4tcc0: on cpu0
cpu1: on acpi0
est1: on cpu1
p4tcc1: on cpu1
cpu2: on acpi0
est2: on cpu2
p4tcc2: on cpu2
cpu3: on acpi0
est3: on cpu3
p4tcc3: on cpu3
cpu4: on acpi0
est4: on cpu4
p4tcc4: on cpu4
cpu5: on acpi0
est5: on cpu5
p4tcc5: on cpu5
cpu6: on acpi0
est6: on cpu6
p4tcc6: on cpu6
cpu7: on acpi0
est7: on cpu7
p4tcc7: on cpu7
orm0: at iomem 0xc0000-0xc7fff,0xc8000-0xc9fff,0xca000-0xcafff,0xcb000-0xccfff,0xcd800-0xce7ff pnpid ORM0000 on isa0
ata0 at port 0x1f0-0x1f7,0x3f6 irq 14 on isa0
ata0: [ITHREAD]
ata1 at port 0x170-0x177,0x376 irq 15 on isa0
ata1: [ITHREAD]
atkbdc0: at port 0x60,0x64 on isa0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounters tick every 1.000 msec
firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me)
firewire0: bus manager 0 (me)
ad8: 76319MB at ata4-master SATA150
da0 at twa0 bus 0 target 0 lun 0
da0: Fixed Direct Access SCSI-5 device
da0: 100.000MB/s transfers
da0: 953664MB (1953103872 512 byte sectors: 255H 63S/T 121575C)
lapic3: Forcing LINT1 to edge trigger
SMP: AP CPU #3 Launched!
lapic1: Forcing LINT1 to edge trigger
SMP: AP CPU #1 Launched!
lapic4: Forcing LINT1 to edge trigger
SMP: AP CPU #4 Launched!
lapic6: Forcing LINT1 to edge trigger
SMP: AP CPU #6 Launched!
lapic5: Forcing LINT1 to edge trigger
SMP: AP CPU #5 Launched!
lapic2: Forcing LINT1 to edge trigger
SMP: AP CPU #2 Launched!
lapic7: Forcing LINT1 to edge trigger
SMP: AP CPU #7 Launched!
ukbd0: on uhub4
kbd2 at ukbd0
uhid0: on uhub4
Trying to mount root from ufs:/dev/da0s1a
twa0: INFO: (0x04: 0x000C): Initialize started: unit=0
ukbd0: at uhub4 port 1 (addr 2) disconnected
ukbd0: detached
uhid0: at uhub4 port 1 (addr 2) disconnected
uhid0: detached
ukbd0: on uhub4
kbd2 at ukbd0
uhid0: on uhub4

--------------------------------------------------------------------
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, mike@sentex.net
Providing Internet since 1994    www.sentex.net
Cambridge, Ontario Canada        www.sentex.net/mike