From: Nikolay Denev
Date: Sat, 13 Oct 2012 18:22:50 +0300
Subject: Re: NFS server bottlenecks
Message-Id: <302BF685-4B9D-49C8-8000-8D0F6540C8F7@gmail.com>
In-Reply-To: <937460294.2185822.1350093954059.JavaMail.root@erie.cs.uoguelph.ca>
To: Rick Macklem
Cc: FreeBSD Hackers, Garrett Wollman

On Oct 13, 2012, at 5:05 AM, Rick Macklem wrote:

> I wrote:
>> Oops, I didn't get the "readahead" option description
>> quite right in the last post. The default read ahead
>> is 1, which does result in "rsize * 2", since there is
>> the read + 1 readahead.
>>
>> "rsize * 16" would actually be for the option "readahead=15"
>> and for "readahead=16" the calculation would be "rsize * 17".
>>
>> However, the example was otherwise ok, I think? rick
>
> I've attached the patch drc3.patch (it assumes drc2.patch has already been
> applied) that replaces the single mutex with one for each hash list
> for tcp. It also increases the size of NFSRVCACHE_HASHSIZE to 200.
>
> These patches are also at:
> http://people.freebsd.org/~rmacklem/drc2.patch
> http://people.freebsd.org/~rmacklem/drc3.patch
> in case the attachments don't get through.
>
> rick
> ps: I haven't tested drc3.patch a lot, but I think it's ok?

drc3.patch applied and built cleanly, and it shows a nice improvement!

I've done a quick benchmark using iozone over the NFS mount from the Linux host.

drc2.patch (but with NFSRVCACHE_HASHSIZE=500)

TEST WITH 8K
-------------------------------------------------------------------------------------------------
        Auto Mode
        Using Minimum Record Size 8 KB
        Using Maximum Record Size 8 KB
        Using minimum file size of 2097152 kilobytes.
        Using maximum file size of 2097152 kilobytes.
        O_DIRECT feature enabled
        SYNC Mode.
        OPS Mode. Output is in operations per second.
        Command line used: iozone -a -y 8k -q 8k -n 2g -g 2g -C -I -o -O -i 0 -i 1 -i 2
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.

                                                  random  random    bkwd  record  stride
      KB  reclen   write rewrite    read  reread    read   write    read rewrite    read  fwrite frewrite   fread freread
 2097152       8    1919    1914    2356    2321    2335    1706

TEST WITH 1M
-------------------------------------------------------------------------------------------------
        Auto Mode
        Using Minimum Record Size 1024 KB
        Using Maximum Record Size 1024 KB
        Using minimum file size of 2097152 kilobytes.
        Using maximum file size of 2097152 kilobytes.
        O_DIRECT feature enabled
        SYNC Mode.
        OPS Mode. Output is in operations per second.
        Command line used: iozone -a -y 1m -q 1m -n 2g -g 2g -C -I -o -O -i 0 -i 1 -i 2
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.

                                                  random  random    bkwd  record  stride
      KB  reclen   write rewrite    read  reread    read   write    read rewrite    read  fwrite frewrite   fread freread
 2097152    1024      73      64     477     486     496      61

drc3.patch

TEST WITH 8K
-------------------------------------------------------------------------------------------------
        Auto Mode
        Using Minimum Record Size 8 KB
        Using Maximum Record Size 8 KB
        Using minimum file size of 2097152 kilobytes.
        Using maximum file size of 2097152 kilobytes.
        O_DIRECT feature enabled
        SYNC Mode.
        OPS Mode. Output is in operations per second.
        Command line used: iozone -a -y 8k -q 8k -n 2g -g 2g -C -I -o -O -i 0 -i 1 -i 2
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.

                                                  random  random    bkwd  record  stride
      KB  reclen   write rewrite    read  reread    read   write    read rewrite    read  fwrite frewrite   fread freread
 2097152       8    2108    2397    3001    3013    3010    2389

TEST WITH 1M
-------------------------------------------------------------------------------------------------
        Auto Mode
        Using Minimum Record Size 1024 KB
        Using Maximum Record Size 1024 KB
        Using minimum file size of 2097152 kilobytes.
        Using maximum file size of 2097152 kilobytes.
        O_DIRECT feature enabled
        SYNC Mode.
        OPS Mode. Output is in operations per second.
        Command line used: iozone -a -y 1m -q 1m -n 2g -g 2g -C -I -o -O -i 0 -i 1 -i 2
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.

                                                  random  random    bkwd  record  stride
      KB  reclen   write rewrite    read  reread    read   write    read rewrite    read  fwrite frewrite   fread freread
 2097152    1024      80      79     521     536     528      75

Also, with drc3 the CPU usage on the server is noticeably lower. Most of
the time I could see only the geom{g_up}/{g_down} threads and a few nfsd
threads; before that, the nfsd threads were much more prominent.

I guess that under heavier load the performance improvement could be bigger.

I'll run some more tests with heavier loads this week.

Thanks,
Nikolay
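
For a rough sense of the difference between the two patch levels, the ops/sec
figures from the four runs above can be compared column by column. The sketch
below is only an illustration: the names RESULTS, COLUMNS, percent_change and
compare are made up for this example, the column labels assume iozone's usual
ordering for "-i 0 -i 1 -i 2", and these are single runs, so small differences
should not be over-read.

```python
# Ops/sec copied from the iozone runs quoted in this message.
# Assumption: the six values map to these columns, in this order.
COLUMNS = ["write", "rewrite", "read", "reread", "random read", "random write"]

RESULTS = {
    "8k": {
        "drc2": [1919, 1914, 2356, 2321, 2335, 1706],
        "drc3": [2108, 2397, 3001, 3013, 3010, 2389],
    },
    "1m": {
        "drc2": [73, 64, 477, 486, 496, 61],
        "drc3": [80, 79, 521, 536, 528, 75],
    },
}


def percent_change(before, after):
    """Relative change from 'before' to 'after', in percent."""
    return (after - before) / before * 100.0


def compare(record_size):
    """Map each iozone column to its drc2 -> drc3 change in percent."""
    runs = RESULTS[record_size]
    return {
        name: percent_change(old, new)
        for name, old, new in zip(COLUMNS, runs["drc2"], runs["drc3"])
    }


if __name__ == "__main__":
    for size in RESULTS:
        for name, pct in compare(size).items():
            print(f"{size:>3} {name:<13} {pct:+6.1f}%")
```

Every column comes out positive for both record sizes. The largest relative
gain is in the 8K random write case, at roughly 40%.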