From owner-freebsd-bugs Wed Oct 9 00:07:52 1996
To: freebsd-bugs@freebsd.org
Cc: freebsd-hackers@freebsd.org, freebsd-current@freebsd.org
Subject: Re: NFS from Solaris Server
Date: Wed, 09 Oct 1996 16:06:19 +0900
Message-ID: <15438.844844779@sat.t.u-tokyo.ac.jp>
From: Hidetoshi Shimokawa

Hi,

>Is Hosed, on 2.1.0 NFS will write to a sun server at a whopping 100K/sec over
>ethernet, but it goes along happily just slowly.
>On 2.2-961006-SNAP we also write to NFS at a whopping 100K/sec, however in 8MB
>increments we write out at 600K/sec (apparently into some memory buffer??) at
>which point the buffer fills, all NFS operations are locked out on the system
>until the buffer is drained. This takes an eternity. How do I undo this
>buffering mechanism or make it a tunable?

I'm also not satisfied with the NFS performance of FreeBSD-current.
2.1.5 seems to be noticeably faster than current.

>Also an interesting note, a SGI to the SUN writes at 900K/sec via NFS with no
>problems. The SGI is apparently doing:
>
>    NFS v3 Proc 6, Proc 7 (Data)
>
>The FreeBSD box does:
>
>    NFS v2 Proc 8 (Data)
>      or
>    NFS v3 Proc 7 (Data)
>
>Both are horrendously slow. Im going to attempt to figure out what the hell
>Proc 6 is (everything I see says read, which doesnt make alot of sense).
>In any case Im not much of a kernel hacker, so any assistance or someone with a
>solution, please raise your hand! :)

I have been looking into this problem since yesterday by adding some
debugging code to the kernel. Here are some results.

The system is:

  PentiumPro          100M Ether       Sun Ultra     Wide SCSI
  FreeBSD-current  <-------------->  Solaris 2.5  ---------  Disk
  NFSv2 client                       NFSv2 server            6MB/s (iozone)

I measured the performance with iozone.

With the default setting (4 nfsiods), I get only 300KB/s.
(- With NFSv3, I got around 500-600KB/s.
 - SS20 <-> Sun Ultra gets around 800KB/s.)

1) A faster client (Pentium class) gets lower performance than a slower
   client (486 class).

2) After I killed all the async daemons (nfsiod), I got 400KB/s, which
   seems funny :-).

3) By adding some debugging code, I found that the performance drop
   happens when all the nfsiods are busy and the buffer is marked
   B_DELWRI (delayed write) in nfs_asyncio() (/sys/nfs/nfs_bio.c).
   This explains 1).

4) I changed the code so that nfs_asyncio() returns EIO before it marks
   the buffer B_DELWRI, and then I got 800-900KB/s.
   I think this algorithm is essentially the same as in 2.1.5 or BSDI 2.1.

5) Interestingly, the change above doesn't improve v3 performance.

6) I don't know how efficient the delayed write scheme is supposed to be,
   but at this point it is a bottleneck. The reason is that, once one
   nfsiod starts processing the delayed write buffer in nfssvc_iod()
   (nfs_syscalls.c), the other nfsiods stop working.
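Purely as an illustration of 3) and 4), and not the actual kernel code,
the decision can be pictured with the toy Python model below. The names
Buf, toy_asyncio and toy_write, the flag value, and the assumption that
the caller falls back to a synchronous write when it gets an error are
all made up for this sketch.

```python
import errno
from dataclasses import dataclass

B_DELWRI = 0x80  # placeholder bit for this sketch, not the kernel's value


@dataclass
class Buf:
    """Stand-in for a buffer header; only the flags matter here."""
    flags: int = 0


def toy_asyncio(buf, idle_iods, fail_when_busy):
    """Toy version of the choice described in 3) and 4).

    Returns 0 when the buffer is accepted for asynchronous handling,
    or errno.EIO when the caller has to deal with it itself.
    """
    if idle_iods > 0:
        return 0                    # an idle nfsiod takes the buffer
    if fail_when_busy:
        return errno.EIO            # behaviour after the change in 4)
    buf.flags |= B_DELWRI           # behaviour in current: delay the write
    return 0


def toy_write(buf, idle_iods, fail_when_busy):
    """Assumed caller: write synchronously when toy_asyncio() fails."""
    if toy_asyncio(buf, idle_iods, fail_when_busy) != 0:
        return "sync"
    return "delayed" if buf.flags & B_DELWRI else "async"


if __name__ == "__main__":
    # (idle nfsiods, fail_when_busy) for the three cases of interest
    for idle, fail in [(1, False), (0, False), (0, True)]:
        print(idle, fail, toy_write(Buf(), idle, fail))
```

With all nfsiods busy, the unmodified path gives "delayed" and the
modified one gives "sync". That is the only difference the sketch is
meant to show; it says nothing about the resulting throughput.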
I confirmed the stall described in 6) with debugging code in the kernel,
but it can also be observed easily with:

% iozone & sleep 2; cat /proc/"nfsiod's pid"/status
nfsiod 15057 1 15056 0 -1,-1 noflags 844842823,10788 0,0 0,78906 sbwait 0 0 0,0,0,5,2,3,4,20,31
nfsiod 15058 1 15056 0 -1,-1 noflags 844842823,10874 0,0 0,30899 nfsrcvlk 0 0 0,0,0,5,2,3,4,20,31
nfsiod 15059 1 15056 0 -1,-1 noflags 844842823,10939 0,0 0,3486 nfsrcvlk 0 0 0,0,0,5,2,3,4,20,31
nfsiod 15060 1 15056 0 -1,-1 noflags 844842823,11001 0,0 0,1694 nfsrcvlk 0 0 0,0,0,5,2,3,4,20,31
nfsiod 15061 1 15056 0 -1,-1 noflags 844842823,11061 0,0 0,1395 nfsrcvlk 0 0 0,0,0,5,2,3,4,20,31
nfsiod 15062 1 15056 0 -1,-1 noflags 844842823,11119 0,0 0,1601 nfsrcvlk 0 0 0,0,0,5,2,3,4,20,31
nfsiod 15063 1 15056 0 -1,-1 noflags 844842823,11176 0,0 0,1339 nfsrcvlk 0 0 0,0,0,5,2,3,4,20,31
nfsiod 15064 1 15056 0 -1,-1 noflags 844842823,11233 0,0 0,1277 nfsrcvlk 0 0 0,0,0,5,2,3,4,20,31

All nfsiods except one are blocked in nfs_rcvlock() (nfs_socket.c) for a
long time. According to tcpdump, the server did reply, but the nfsiods
cannot process the replies because they are waiting for the lock.

I'm not familiar with NFS or kernel programming, but it seems that the
current code has a locking problem.
I don't know how to fix it. Please help, NFS and kernel experts!

(Doesn't the code between nfs_send() and nfs_rcvlock() in nfs_request()
and nfs_reply() need to be non-interruptible?)

I would also like to know why the NFSv3 client is so slow.

/\ Hidetoshi Shimokawa
\/ simokawa@sat.t.u-tokyo.ac.jp
PGP public key: finger -l simokawa@sat.t.u-tokyo.ac.jp