Date: Thu, 25 Apr 2002 17:41:39 +0100
From: Brian Candler <B.Candler@pobox.com>
To: freebsd-small@freebsd.org
Subject: NFS root and unnecessary NFS operations
Message-ID: <20020425174139.A871@linnet.org>
This is a diskless/NFS question - I hope this is the most appropriate
place to post it.

When setting up some diskless servers in the past, just using the
standard rc.diskless1/2 way (NFS root, ramdisk on /tmp and /var), I found
that the load generated on the NFS server was much higher than I
expected. I built some mailservers with:

- read-only NFS root
- local disk for /var
- a read-write NFS directory for maildrops

I found that the number of NFS operations/second generated was very
high, but reduced greatly if the root was on local disk. As a result, I
was forced to put system disks back into the machines. The measurement of
ops/second was crude: the front-panel LCD display on a Network Appliance
fileserver, and nfsstat on the machines themselves.

Anyway, I've now decided to look into this a bit further. Here are some
tests on a diskless machine running FreeBSD 4.1.1 (I know it's ancient
:-) which has booted from a separate DHCP/TFTP server. It has its root
partition on the Netapp, using NFSv3 because that's the only way I could
get the mode bits to be seen properly, mounted read-only.

(1) The first test invokes some read-only accesses to the Netapp
(/usr/bin/touch and /bin/rm, I would have thought they were cached) and
some read-write accesses to a ramdisk.

On console 1:

# perl -e 'for ($i=0;$i<1000;$i++) { system("touch /var/tmp/xxx"); system("rm /var/tmp/xxx");}'

On console 2:

# nfsstat -c -W 1
 GtAttr Lookup Rdlink   Read  Write Rename Access  Rddir
   Attr   Lkup   BioR   BioW   Accs   BioD
      0      0      0      0      0      0      0      0
      -      -      -      -      -      -
   7363   5323      2    187      0      0   6065      0
    90%    86%   100%      -    87%      -
  11496   8336      0    287      0      0   9487      0
    89%    86%   100%      -    87%      -
  11471   8318      0    287      0      0   9462      0
    90%    86%   100%      -    87%      -
   9756   7069      0    244      0      0   8048      0
    89%    86%   100%      -    87%      -
      0      0      0      0      0      0      0      0
      -      -      -      -      -      -

Ouch. That's a lot of NFS operations.
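To put a rough number on those nfsstat samples, they can be added up and divided by the loop count. The sketch below assumes the four non-zero one-second samples cover the whole 1000-iteration run (the sampling interval may not line up exactly with the start and end of the loop). The figures are copied from the output above; the names samples, iterations and total_ops are only illustrative.

```shell
#!/bin/sh
# Per-second client op counts copied from the nfsstat samples above
# (columns: GtAttr Lookup Rdlink Read Write Rename Access Rddir).
samples='7363 5323 2 187 0 0 6065 0
11496 8336 0 287 0 0 9487 0
11471 8318 0 287 0 0 9462 0
9756 7069 0 244 0 0 8048 0'

iterations=1000   # loop count used in the Perl one-liner

# Add up every column of every sample to get the op count for the run.
total_ops() {
    printf '%s\n' "$samples" |
        awk '{ for (i = 1; i <= NF; i++) sum += $i } END { print sum }'
}

total=$(total_ops)
echo "total ops counted by nfsstat: $total"
echo "ops per touch+rm iteration:   $((total / iterations))"
```

With these samples it reports 103201 operations, or about 103 per touch+rm pair. That is the client-side nfsstat counter, not a count of packets on the wire.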
Now, you would hope there is some caching going on, but if I look at
'tcpdump udp port 2049' on the interface which connects to the Netapp I
can see a lot of physical traffic is being generated:

# tcpdump -n -i fxp0 udp port 2049 >/dev/null
tcpdump: listening on fxp0
^C
16014 packets received by filter
0 packets dropped by kernel

(that's for the 1000 iterations of that Perl script).

(2) Now, if I rewrite the Perl script to remove the forks and execs and
accesses to /bin and /usr/bin, the problem goes away:

# perl -e 'for ($i=0;$i<1000;$i++) { open F,">/var/tmp/xxx"; print F "x"; close F; unlink "/var/tmp/xxx";}'

# tcpdump -n -i fxp0 udp port 2049 >/dev/null
tcpdump: listening on fxp0
^C
14 packets received by filter
0 packets dropped by kernel

That's great, but it's not a realistic simulation of a server machine,
where (for example) sendmail or exim is constantly forking and execing.

(3) Another test, just open /bin/rm for read:

# perl -e 'for ($i=0;$i<1000;$i++) { open F,"</bin/rm"; $x=<F>; close F;}'

# tcpdump -n -i fxp0 udp port 2049 >/dev/null
tcpdump: listening on fxp0
^C
2014 packets received by filter
0 packets dropped by kernel

That's part-way between. It looks like I am getting one packet exchange
each time the file is opened:

17:25:53.015554 192.168.0.91.649103617 > 192.168.0.1.2049: 108 access [|nfs]
17:25:53.015698 192.168.0.1.2049 > 192.168.0.91.649103617: reply ok 120 access c xxxxxxxx

I tried "mount -u -o ro,noatime /" but that didn't make any difference.

So, I'm interested to know if anyone can explain why so many NFS
operations are being generated, and whether there's a workaround. I can
see two possible solutions:

- Run with root as a ramdisk, and mount /usr on the Netapp. This should
  at least ensure that an access to /var/xxx (where /var is a local
  disk) cannot generate any NFS traffic. I think that
  {bin,etc,modules,sbin,stand} will fit into about 24MB.
  But I guess it will still generate NFS traffic when I access
  /usr/lib/sendmail or whatever, so I would have to try and put all my
  applications in ramdisk too.

- Turn on some optimisation that I am missing, or disable some checking
  that the kernel is doing, when repeatedly opening the same file (such
  as /bin/rm)

Any observations gratefully received. I guess I'll have to set up a
FreeBSD-4.5 image to boot from too...

Cheers,

Brian Candler.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-small" in the body of the message