From owner-freebsd-hackers Mon Oct 8 7:37:54 2001
Delivered-To: freebsd-hackers@freebsd.org
Received: from mail.eimg.com.tw (mail.eimg.com.tw [211.22.11.115])
	by hub.freebsd.org (Postfix) with ESMTP id E22DA37B405;
	Mon, 8 Oct 2001 07:37:41 -0700 (PDT)
Received: by mail.eimg.com.tw (Postfix, from userid 1005)
	id 74943DF05D; Mon, 8 Oct 2001 22:43:45 +0800 (CST)
Date: Mon, 8 Oct 2001 22:43:45 +0800
From: buggy@mail.eimg.com.tw
To: freebsd-hackers@freebsd.org
Cc: mike@freebsd.org, mcl@skysoft.com.tw
Subject: processes stuck in 'ffsvgt' states (kern/19479)
Message-ID: <20011008224345.A6052@mail.eimg.com.tw>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

Greetings all,

I have two boxes that ran into the problem reported as kern/19479: the
system allocated the full 102400K of FFS nodes and hung.  Both are SMP
boxes with 2.5G of physical memory, and maxusers is 384.  The problem
happens on 4.3-RC#0 and 4.4-RC#0.

I guess the problem shows up on machines that have a lot of physical
memory and need to access many files (our machines have millions of
files).

It is simple to reproduce the bug with 'cp': in single-user mode, use
the native 'cp' to copy a directory tree with about 34000
subdirectories to another partition (cp -Rp), and the number of FFS
nodes keeps growing without bound.  After a while the system hangs,
with processes stuck in the 'ffsvgt' and 'FFS no' states.  'top' still
runs, but any process that does disk I/O blocks.  (A rough sketch of
the exact steps follows the single-user stats below.)

If I kill the 'cp' process before the system runs out of vnodes, the
number of FFS nodes does not go down, even after 10 minutes.  If I
umount the target partition that I am copying to, the number of FFS
nodes drops back to a normal value immediately.  It does not matter
whether the partitions are mounted with soft updates or not; some of
them are on vinum.

Besides single-user mode, it has also happened many times in
multi-user mode when there is more than 1G of Inact memory (fewer than
3000 processes), but I have never seen the problem at peak load, when
the system has less than 200M of Inact memory (5000 to 6000 processes
running).

The problem seems to occur more often since we added more RAM.  We
never noticed it when we had only 2G of RAM; after we upgraded to
2.5G, both machines crashed every one or two days.  I guess that
FreeBSD might use too many vnodes when there is a lot of free memory.

ps: we have applied the kernel patch that increases the kernel address
space to 2G.

=====
Below are some sys stats while moving directories in single-user mode.

$ while true; do vmstat -m | grep 'FFS node' ; sleep 3; done
FFS node 272660  68165K  68166K 102400K   660495    0     0  256
FFS node 272767  68192K  68192K 102400K   660950    0     0  256
<>
FFS node 272803  68201K  68201K 102400K   661206    0     0  256
FFS node 272803  68201K  68201K 102400K   661206    0     0  256
FFS node 103352  51676K  54947K 102400K   340554    0     0  512
<>
FFS node    171     86K  54947K 102400K   340554    0     0  512
FFS node    171     86K  54947K 102400K   340554    0     0  512

$ sysctl -a | grep vnode
kern.maxvnodes: 134881
vm.stats.vm.v_vnodein: 210149
vm.stats.vm.v_vnodeout: 0
vm.stats.vm.v_vnodepgsin: 357224
vm.stats.vm.v_vnodepgsout: 0
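For completeness, here is a rough sketch of how the single-user stats
above were produced; /mnt/src and /mnt/dst are placeholders for the real
mount points, and the monitoring loop is simply backgrounded so that
everything runs from one console:

  # Placeholders: /mnt/src holds a tree with ~34000 subdirectories,
  # /mnt/dst is another FFS partition with enough free space.
  # (Both are assumed to have entries in /etc/fstab.)
  mount /mnt/src
  mount /mnt/dst

  # Log the 'FFS node' malloc bucket every 3 seconds in the background.
  (while true; do vmstat -m | grep 'FFS node'; sleep 3; done) &

  # Start the copy; the FFS node count keeps climbing until it reaches
  # the 102400K limit and processes block in 'ffsvgt' / 'FFS no' states.
  cp -Rp /mnt/src /mnt/dst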
=====
Below are some sys stats taken before a crash in multi-user mode, when
many online users go to sleep, the number of processes drops, and the
Inact memory keeps increasing.

$ while true; do vmstat -m | grep 'FFS node' ; sleep 3; done
<< InUse vnodes stay unchanged at 99200K for many hours >>
FFS node 198399  99200K  99200K 102400K  3963021    0     0  512
FFS node 198399  99200K  99200K 102400K  3963274    0     0  512
FFS node 198399  99200K  99200K 102400K  3963538    0     0  512
FFS node 198395  99198K  99200K 102400K  3963795    0     0  512
FFS node 198399  99200K  99200K 102400K  3964102    0     0  512
<< InUse vnodes start to increase >>
FFS node 198510  99255K  99256K 102400K  3964476    0     0  512
FFS node 198678  99339K  99339K 102400K  3964712    0     0  512
FFS node 198861  99431K  99431K 102400K  3964958    0     0  512
FFS node 199049  99525K  99525K 102400K  3965276    0     0  512
FFS node 199224  99612K  99612K 102400K  3965603    0     0  512
FFS node 199452  99726K  99726K 102400K  3965913    0     0  512
FFS node 199636  99818K  99818K 102400K  3966161    0     0  512
FFS node 199753  99877K  99877K 102400K  3966392    0     0  512
FFS node 199939  99970K  99970K 102400K  3966648    0     0  512
FFS node 200142 100071K 100071K 102400K  3966954    0     0  512
FFS node 200346 100173K 100173K 102400K  3967210    0     0  512
FFS node 200547 100274K 100274K 102400K  3967433    0     0  512
FFS node 200719 100360K 100360K 102400K  3967645    0     0  512
FFS node 200901 100451K 100451K 102400K  3967952    0     0  512
FFS node 201059 100530K 100530K 102400K  3968182    0     0  512
FFS node 201246 100623K 100623K 102400K  3968424    0     0  512
FFS node 201446 100723K 100724K 102400K  3968682    0     0  512
FFS node 201593 100797K 100797K 102400K  3968953    0     0  512
FFS node 201735 100868K 100868K 102400K  3969226    0     0  512
FFS node 201803 100902K 100902K 102400K  3969531    0     0  512
FFS node 201949 100975K 100975K 102400K  3969806    0     0  512
FFS node 202152 101076K 101077K 102400K  3970090    0     0  512
FFS node 202317 101159K 101159K 102400K  3970385    0     0  512
FFS node 202506 101253K 101253K 102400K  3970657    0     0  512
FFS node 202655 101328K 101328K 102400K  3970866    0     0  512
FFS node 202794 101397K 101397K 102400K  3971235    0     0  512
FFS node 202986 101493K 101494K 102400K  3971481    0     0  512
FFS node 203026 101513K 101514K 102400K  3971732    0     0  512
FFS node 203293 101647K 101647K 102400K  3972030    0     0  512
FFS node 203540 101770K 101770K 102400K  3972329    0     0  512
FFS node 203971 101986K 101986K 102400K  3972931    0     0  512
FFS node 204203 102102K 102102K 102400K  3973277    0     0  512
FFS node 204637 102319K 102319K 102400K  3973805    0     0  512
FFS node 204787 102394K 102394K 102400K  3974095    0     0  512
FFS node 204800 102400K 102400K 102400K  3974162    1     0  512
FFS node 204800 102400K 102400K 102400K  3974165    3     0  512
FFS node 204800 102400K 102400K 102400K  3974165    3     0  512
FFS node 204800 102400K 102400K 102400K  3974165    3     0  512
FFS node 204800 102400K 102400K 102400K  3974165    3     0  512
FFS node 204800 102400K 102400K 102400K  3974165    3     0  512
FFS node 204800 102400K 102400K 102400K  3974165    3     0  512
FFS node 204800 102400K 102400K 102400K  3974165    3     0  512
FFS node 204800 102400K 102400K 102400K  3974165    3     0  512
<< 2 minutes after InUse vnodes exceeded 99200K, crashed >>

<< top output >>
5225 processes:  3 running, 5210 sleeping, 3 stopped, 9 zombie
CPU states:  0.4% user,  0.0% nice,  3.6% system,  0.4% interrupt, 95.7% idle
Mem: 684M Active, 1183M Inact, 462M Wired, 140M Cache, 265M Buf, 36M Free
Swap: 1024M Total, 1024M Free

--
Justin Chuang

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message