Date:      Mon, 8 Oct 2001 22:43:45 +0800
From:      buggy@mail.eimg.com.tw
To:        freebsd-hackers@freebsd.org
Cc:        mike@freebsd.org, mcl@skysoft.com.tw
Subject:   processes stuck in 'ffsvgt' states (kern/19479)
Message-ID:  <20011008224345.A6052@mail.eimg.com.tw>

Greetings all,

I have two boxes that have run into the problem reported as kern/19479.
Each system consumed the full 102400K of FFS node memory and then hung.

They are two SMP boxes, each with 2.5G of physical memory and maxusers
set to 384. The problem happens on both 4.3-RC#0 and 4.4-RC#0.

My guess is that the problem hits machines that have a large amount of
physical memory and need to access many files
(our machines hold millions of files).

It is simple to reproduce the bug with 'cp':
in single-user mode, use the native 'cp' to copy a directory tree with
about 34000 subdirectories to another partition (cp -Rp); the number of
FFS nodes then keeps growing without stopping.
After a while, the system hangs with processes stuck in the 'ffsvgt' and 'FFS no' states.
'top' keeps running, but any process that does disk I/O blocks.
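
For example, the sequence I run is roughly the following (the /usr/data
and /backup paths are only placeholders for our real partitions):

$ while true; do vmstat -m | grep 'FFS node' ; sleep 3; done &
$ cp -Rp /usr/data/manydirs /backup/manydirs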

If I kill the 'cp' process before the system runs out of vnodes,
the number of FFS nodes still has not gone down 10 minutes later.

If I umount the target partition that I'm copying the directories to,
the number of FFS nodes drops back to a normal value immediately.

It doesn't matter whether these partitions are mounted with soft-updates 
or not. Some of the partitions are using vinum.
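
(To double-check which partitions actually have soft updates enabled, I
just look at the mount flags, e.g.:

$ mount | grep soft-updates

which should list only the soft-updates file systems.)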

In addition to single-user mode, it has also happened many times in multi-user
mode when there is more than 1G of Inact mem (fewer than 3000 processes),
but I have never seen this problem at peak load when the system
has less than 200M of Inact mem (5000 to 6000 processes running).

The problem seems to occur more frequently since we added more RAM.
We didn't notice it when we had only 2G of RAM; after we upgraded to
2.5G, both of our machines started crashing every one or two days.

My guess is that FreeBSD may allocate too many vnodes when there is a lot of
free memory.
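
If that is the case, one stopgap we may try (just my guess, and the value
below is an arbitrary example, not something we have tested) is to lower
kern.maxvnodes, currently 134881 on these boxes, so that the FFS node pool
stays well below its 102400K limit:

$ sysctl kern.maxvnodes
$ sysctl -w kern.maxvnodes=100000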

PS: we have applied the kernel patch that increases the kernel address space to 2G.

   =====

Below are some system stats taken while moving directories in single-user mode.

$ while true; do vmstat -m | grep 'FFS node' ; sleep 3; done 

  << columns: Type, InUse, MemUse, HighUse, Limit, Requests, Type Limit, Kern Limit, Size(s) >>
     FFS node  272660   68165K   68166K  102400K   660495    0     0  256
     FFS node  272767   68192K   68192K  102400K   660950    0     0  256
  << kill the 'cp' process; the InUse count doesn't change after 10 minutes >>
     FFS node  272803   68201K   68201K  102400K   661206    0     0  256
     FFS node  272803   68201K   68201K  102400K   661206    0     0  256

     FFS node  103352   51676K   54947K  102400K   340554    0     0  512
  << umount the target partition; InUse vnodes go down immediately >>
     FFS node     171      86K   54947K  102400K   340554    0     0  512
     FFS node     171      86K   54947K  102400K   340554    0     0  512

$ sysctl -a | grep vnode
kern.maxvnodes: 134881
vm.stats.vm.v_vnodein: 210149
vm.stats.vm.v_vnodeout: 0
vm.stats.vm.v_vnodepgsin: 357224
vm.stats.vm.v_vnodepgsout: 0
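
(A side note on the numbers, in case it helps: the 102400K limit divided by
the 512-byte FFS node size is 102400 * 1024 / 512 = 204800 nodes, which is
exactly the InUse count at which the boxes wedge in the log below, and well
above kern.maxvnodes (134881). So if I read this right, inactive vnodes are
not being reclaimed before the malloc limit for FFS nodes is exhausted.)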

   =====

  Below are some system stats taken before a crash in multi-user mode, at a
  time when many online users have gone to sleep, the number of processes is
  dropping, and the Inact mem keeps increasing.

$ while true; do vmstat -m | grep 'FFS node' ; sleep 3; done 
        << InUse vnodes stay unchanged at 99200K for many hours >>
     FFS node  198399   99200K   99200K  102400K  3963021    0     0  512
     FFS node  198399   99200K   99200K  102400K  3963274    0     0  512
     FFS node  198399   99200K   99200K  102400K  3963538    0     0  512
     FFS node  198395   99198K   99200K  102400K  3963795    0     0  512
     FFS node  198399   99200K   99200K  102400K  3964102    0     0  512
        << InUse vnodes start to increase >>
     FFS node  198510   99255K   99256K  102400K  3964476    0     0  512
     FFS node  198678   99339K   99339K  102400K  3964712    0     0  512
     FFS node  198861   99431K   99431K  102400K  3964958    0     0  512
     FFS node  199049   99525K   99525K  102400K  3965276    0     0  512
     FFS node  199224   99612K   99612K  102400K  3965603    0     0  512
     FFS node  199452   99726K   99726K  102400K  3965913    0     0  512
     FFS node  199636   99818K   99818K  102400K  3966161    0     0  512
     FFS node  199753   99877K   99877K  102400K  3966392    0     0  512
     FFS node  199939   99970K   99970K  102400K  3966648    0     0  512
     FFS node  200142  100071K  100071K  102400K  3966954    0     0  512
     FFS node  200346  100173K  100173K  102400K  3967210    0     0  512
     FFS node  200547  100274K  100274K  102400K  3967433    0     0  512
     FFS node  200719  100360K  100360K  102400K  3967645    0     0  512
     FFS node  200901  100451K  100451K  102400K  3967952    0     0  512
     FFS node  201059  100530K  100530K  102400K  3968182    0     0  512
     FFS node  201246  100623K  100623K  102400K  3968424    0     0  512
     FFS node  201446  100723K  100724K  102400K  3968682    0     0  512
     FFS node  201593  100797K  100797K  102400K  3968953    0     0  512
     FFS node  201735  100868K  100868K  102400K  3969226    0     0  512
     FFS node  201803  100902K  100902K  102400K  3969531    0     0  512
     FFS node  201949  100975K  100975K  102400K  3969806    0     0  512
     FFS node  202152  101076K  101077K  102400K  3970090    0     0  512
     FFS node  202317  101159K  101159K  102400K  3970385    0     0  512
     FFS node  202506  101253K  101253K  102400K  3970657    0     0  512
     FFS node  202655  101328K  101328K  102400K  3970866    0     0  512
     FFS node  202794  101397K  101397K  102400K  3971235    0     0  512
     FFS node  202986  101493K  101494K  102400K  3971481    0     0  512
     FFS node  203026  101513K  101514K  102400K  3971732    0     0  512
     FFS node  203293  101647K  101647K  102400K  3972030    0     0  512
     FFS node  203540  101770K  101770K  102400K  3972329    0     0  512
     FFS node  203971  101986K  101986K  102400K  3972931    0     0  512
     FFS node  204203  102102K  102102K  102400K  3973277    0     0  512
     FFS node  204637  102319K  102319K  102400K  3973805    0     0  512
     FFS node  204787  102394K  102394K  102400K  3974095    0     0  512
     FFS node  204800  102400K  102400K  102400K  3974162    1     0  512
     FFS node  204800  102400K  102400K  102400K  3974165    3     0  512
     FFS node  204800  102400K  102400K  102400K  3974165    3     0  512
     FFS node  204800  102400K  102400K  102400K  3974165    3     0  512
     FFS node  204800  102400K  102400K  102400K  3974165    3     0  512
     FFS node  204800  102400K  102400K  102400K  3974165    3     0  512
     FFS node  204800  102400K  102400K  102400K  3974165    3     0  512
     FFS node  204800  102400K  102400K  102400K  3974165    3     0  512
     FFS node  204800  102400K  102400K  102400K  3974165    3     0  512
        << 2 minutes after InUse vnodes exceeded 99200K, the system crashed >>

   << top output>>
5225 processes:  3 running, 5210 sleeping, 3 stopped, 9 zombie
CPU states:  0.4% user,  0.0% nice,  3.6% system,  0.4% interrupt, 95.7% idle
Mem: 684M Active, 1183M Inact, 462M Wired, 140M Cache, 265M Buf, 36M Free
Swap: 1024M Total, 1024M Free

--
  Justin Chuang
