Date: Sat, 30 Oct 2004 13:36:14 -0300 (ADT)
From: "Marc G. Fournier" <scrappy@hub.org>
To: freebsd-stable@freebsd.org
Subject: Re: vnode 'leak' in 4.x ...
Message-ID: <20041030133328.V6085@ganymede.hub.org>
In-Reply-To: <20041030133044.O6085@ganymede.hub.org>
References: <20041030131002.O6085@ganymede.hub.org> <20041030133044.O6085@ganymede.hub.org>
And, just before rebooting the server in question, the process listing looks like:

USER      PID %CPU %MEM    VSZ   RSS  TT  STAT STARTED     TIME COMMAND
ipaudit 34190  0.0  0.0      0     0  ??  Z     1:30PM  0:00.00 (sh)
root    34521  0.0  0.0    444   280  p3  R+    1:33PM  0:00.00 ps aux
root    34520  0.0  0.0   3080  1452  ??  S     1:33PM  0:00.00 sendmail: startup with [218.17.67.38] (sendmail)
root    34222  0.0  0.1   5024  4644  ??  S     1:30PM  0:00.04 /usr/local/ipaudit/bin/ipaudit -g /usr/local/ipaudit/ipaudit-web.conf -o /usr/local/ipaudit/data/30min/2004-10-30-13:30
ipaudit 34221  0.0  0.0    636   256  ??  I     1:30PM  0:00.00 /bin/sh cron/cron30min
ipaudit 34220  0.0  0.0    636   256  ??  I     1:30PM  0:00.00 /bin/sh cron/cron30min
root    34186  0.0  0.0   1032   632  ??  I     1:30PM  0:00.00 cron: running job (cron)
root    26779  0.0  0.0   1324   900  p0  Ss+   1:15PM  0:00.12 -csh (csh)
root    26777  0.0  0.0   5296  1668  ??  S     1:15PM  0:00.16 sshd: root@ttyp0 (sshd)
root    26725  0.0  0.0   1080   604  p2  S+    1:15PM  0:00.02 grep vnode
root    26724  0.0  0.0    916   416  p2  S+    1:15PM  0:00.04 tail -f /var/log/syswatch
root    19666  0.0  0.0   3116  1828  ??  I     1:05PM  0:00.02 sendmail: server [219.236.18.233] cmd read (sendmail)
root    18305  0.0  0.0   1352   948  p3  Ss   12:59PM  0:00.40 -csh (csh)
root    18303  0.0  0.0   5296  1668  ??  S    12:59PM  0:00.62 sshd: root@ttyp3 (sshd)
root    18291  0.0  0.1   5328  3056  ??  Ss   12:59PM  0:00.26 /usr/local/sbin/named
root    18038  0.0  0.0   1328   924  p2  Is   12:58PM  0:00.52 -csh (csh)
root    18036  0.0  0.0   5296  1668  ??  S    12:58PM  0:00.30 sshd: root@ttyp2 (sshd)
root      208  0.0  0.0    956     8  v7  Is+   5Sep04  0:00.00 /usr/libexec/getty Pc ttyv7
root      207  0.0  0.0    956     8  v6  Is+   5Sep04  0:00.00 /usr/libexec/getty Pc ttyv6
root      206  0.0  0.0    956     8  v5  Is+   5Sep04  0:00.00 /usr/libexec/getty Pc ttyv5
root      205  0.0  0.0    956     8  v4  Is+   5Sep04  0:00.00 /usr/libexec/getty Pc ttyv4
root      204  0.0  0.0    956     8  v3  Is+   5Sep04  0:00.00 /usr/libexec/getty Pc ttyv3
root      203  0.0  0.0    956     8  v2  Is+   5Sep04  0:00.00 /usr/libexec/getty Pc ttyv2
root      202  0.0  0.0    956     8  v1  Is+   5Sep04  0:00.00 /usr/libexec/getty Pc ttyv1
root      201  0.0  0.0    956     8  v0  Is+   5Sep04  0:00.01 /usr/libexec/getty Pc ttyv0
root      190  0.0  0.0   1980   416  ??  I     5Sep04  0:26.67 /usr/local/sbin/upclient
smmsp     145  0.0  0.0   2936   652  ??  Is    5Sep04  0:05.20 sendmail: Queue runner@00:30:00 for /var/spool/clientmqueue (sendmail)
root      142  0.0  0.0   3056  1048  ??  Ss    5Sep04 12:31.00 sendmail: accepting connections (sendmail)
root      111  0.0  0.0   2596   672  ??  Is    5Sep04  3:07.15 /usr/sbin/sshd
root      109  0.0  0.0   1032   528  ??  Ss    5Sep04  2:05.55 /usr/sbin/cron
daemon    103  0.0  0.0    912   364  ??  Ss    5Sep04  1:17.51 rwhod
root      101  0.0  0.0 263152   428  ??  Is    5Sep04  2:29.86 rpc.statd
root       99  0.0  0.0    360     0  ??  I     5Sep04 21:19.41 nfsd: server (nfsd)
root       98  0.0  0.0    360     0  ??  I     5Sep04 105:39.79 nfsd: server (nfsd)
root       97  0.0  0.0    360     0  ??  I     5Sep04 291:27.28 nfsd: server (nfsd)
root       96  0.0  0.0    360     0  ??  I     5Sep04 1453:56.62 nfsd: server (nfsd)
root       95  0.0  0.0    368     0  ??  Is    5Sep04  0:00.00 nfsd: master (nfsd)
root       92  0.0  0.0    588   252  ??  Is    5Sep04  2:30.06 mountd -r
daemon     90  0.0  0.0   1012   460  ??  Is    5Sep04  2:35.51 /usr/sbin/portmap
root       85  0.0  0.0    996   388  ??  Ss    5Sep04 22:39.76 /usr/sbin/syslogd -ss
root        7  0.0  0.0      0     0  ??  DL    5Sep04 655:25.39 (vnlru)
root        6  0.0  0.0      0     0  ??  DL    5Sep04 1088:06.50 (syncer)
root        5  0.0  0.0      0     0  ??  DL    5Sep04  2:57.05 (bufdaemon)
root        4  0.0  0.0      0     0  ??  DL    5Sep04  0:00.00 (vmdaemon)
root        3  0.0  0.0      0     0  ??  DL    5Sep04 23:36.87 (pagedaemon)
root        2  0.0  0.0      0     0  ??  DL    5Sep04  0:00.00 (taskqueue)
root        1  0.0  0.0    552    72  ??  SLs   5Sep04  1:54.30 /sbin/init --
root        0  0.0  0.0      0     0  ??  DLs   5Sep04  0:00.00 (swapper)
root    34522  0.0  0.0    344   184  p3  R+    1:33PM  0:00.00 less

Shutting down all the other processes on the server, and umounting everything but the required file systems (i.e. umounting the heavily used one), freed up about 30k vnodes, after which it hovered around 55k free ... once I restarted everything, it fell back down to the 20k mark or so, and vnlru was constantly in a vlrup state :(

Oct 30 13:21:00 venus root: debug.numvnodes: 522265 - debug.freevnodes: 19384 - debug.vnlru_nowhere: 209881 - vlrup
Oct 30 13:22:01 venus root: debug.numvnodes: 522265 - debug.freevnodes: 19935 - debug.vnlru_nowhere: 209901 - vlrup
Oct 30 13:23:01 venus root: debug.numvnodes: 522265 - debug.freevnodes: 22739 - debug.vnlru_nowhere: 209920 - vlrup
Oct 30 13:24:00 venus root: debug.numvnodes: 522265 - debug.freevnodes: 22031 - debug.vnlru_nowhere: 209940 - vlrup
Oct 30 13:25:00 venus root: debug.numvnodes: 522265 - debug.freevnodes: 31552 - debug.vnlru_nowhere: 209960 - vlrup
Oct 30 13:26:00 venus root: debug.numvnodes: 522265 - debug.freevnodes: 26440 - debug.vnlru_nowhere: 209980 - vlrup
Oct 30 13:27:00 venus root: debug.numvnodes: 522265 - debug.freevnodes: 50454 - debug.vnlru_nowhere: 209986 - vlrup
Oct 30 13:28:01 venus root: debug.numvnodes: 522265 - debug.freevnodes: 52263 - debug.vnlru_nowhere: 210005 - vlruwt
Oct 30 13:29:01 venus root: debug.numvnodes: 522265 - debug.freevnodes: 51269 - debug.vnlru_nowhere: 210017 - vlrup
Oct 30 13:30:01 venus root: debug.numvnodes: 522265 - debug.freevnodes: 52146 - debug.vnlru_nowhere: 210027 - vlruwt
Oct 30 13:31:00 venus root: debug.numvnodes: 522265 - debug.freevnodes: 54789 - debug.vnlru_nowhere: 210027 - vlruwt
Oct 30 13:32:00 venus root: debug.numvnodes: 522265 - debug.freevnodes: 54938 - debug.vnlru_nowhere: 210027 - vlruwt
Oct 30 13:33:00 venus root: debug.numvnodes: 522265 - debug.freevnodes: 54932 - debug.vnlru_nowhere: 210027 - vlruwt
Oct 30 13:34:00 venus root: debug.numvnodes: 522265 - debug.freevnodes: 54935 - debug.vnlru_nowhere: 210027 - vlruwt

On Sat, 30 Oct 2004, Marc G. Fournier wrote:

> Just to give an idea of what a second server, with less uptime, looks
> like, with approx. the same # of VMs on her:
>
> Oct 30 13:29:00 neptune root: debug.numvnodes: 462882 - debug.freevnodes: 132826 - debug.vnlru_nowhere: 0 - vlruwt
> Oct 30 13:30:00 neptune root: debug.numvnodes: 462882 - debug.freevnodes: 151976 - debug.vnlru_nowhere: 0 - vlruwt
>
> But she's only been up 7 days so far ...
>
> On Sat, 30 Oct 2004, Marc G. Fournier wrote:
>
>> A little while ago, I reported a suspicion that vnodes just weren't being
>> freed up on long-running servers ... after 55 days of uptime on one of my
>> servers, here is what I'm dealing with ...
>>
>> 793 'samples' today (one every minute)
>> 786 with vnlru in a vlrup state
>>
>> I shut down all of the VMs running on the large hard drive (the only
>> place unionfs is being used) and umount'd the drive ... it was suggested
>> back then that this might/should free everything back up again ... but
>> it didn't:
>>
>> Oct 30 13:06:02 venus root: debug.numvnodes: 522265 - debug.freevnodes: 57966 - debug.vnlru_nowhere: 209679 - vlruwt
>> Oct 30 13:07:00 venus root: debug.numvnodes: 522265 - debug.freevnodes: 57268 - debug.vnlru_nowhere: 209679 - vlruwt
>> Oct 30 13:08:00 venus root: debug.numvnodes: 522265 - debug.freevnodes: 52335 - debug.vnlru_nowhere: 209679 - vlruwt
>> Oct 30 13:09:00 venus root: debug.numvnodes: 522265 - debug.freevnodes: 50228 - debug.vnlru_nowhere: 209682 - vlrup
>> Oct 30 13:10:01 venus root: debug.numvnodes: 522265 - debug.freevnodes: 44407 - debug.vnlru_nowhere: 209690 - vlrup
>> Oct 30 13:11:00 venus root: debug.numvnodes: 522265 - debug.freevnodes: 35424 - debug.vnlru_nowhere: 209697 - vlrup
>> Oct 30 13:12:02 venus root: debug.numvnodes: 522265 - debug.freevnodes: 34626 - debug.vnlru_nowhere: 209708 - vlrup
>> Oct 30 13:13:00 venus root: debug.numvnodes: 522265 - debug.freevnodes: 29214 - debug.vnlru_nowhere: 209727 - vlrup
>> Oct 30 13:14:00 venus root: debug.numvnodes: 522265 - debug.freevnodes: 24414 - debug.vnlru_nowhere: 209746 - vlrup
>> Oct 30 13:15:00 venus root: debug.numvnodes: 522265 - debug.freevnodes: 26994 - debug.vnlru_nowhere: 209766 - vlrup
>>
>> The 'vlruwt' states above are from while I had everything shut down ...
>> the vlrup's all started again once I mounted the drive and began
>> restarting the VMs themselves ...
>>
>> I expect a high # of vnodes to be used ... that isn't the issue ... the
>> issue is that even after getting rid of the major mount point, so that
>> only /, /tmp, /usr and /var are left mounted, the large # of vnodes that
>> were in use on that mount point aren't being freed by vnlru :(
>>
>> I hate to reboot the server, but it looks like I've got no choice at
>> this point ... is there something else that I can do, in 50 days or so,
>> to provide more information?
>>
>> Thanks ...

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664
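[Editor's note: the per-minute syswatch sampler behind the log lines above was not posted in the thread. The sketch below is a hypothetical reconstruction, assuming the FreeBSD 4.x sysctls quoted in the logs (debug.numvnodes, debug.freevnodes, debug.vnlru_nowhere) and reading the vnlru wait channel (vlrup/vlruwt) from ps; the function name fmt_sample is invented for illustration.]

```shell
#!/bin/sh
# Hypothetical sketch of a once-a-minute vnode monitor producing the
# "debug.numvnodes: ... - vlrup" lines seen in the thread.

# Format one sample in the same layout as the quoted syswatch log lines.
fmt_sample() {
    # $1=numvnodes  $2=freevnodes  $3=vnlru_nowhere  $4=vnlru wait channel
    printf 'debug.numvnodes: %s - debug.freevnodes: %s - debug.vnlru_nowhere: %s - %s\n' \
        "$1" "$2" "$3" "$4"
}

# On the FreeBSD host itself this would be wired up roughly as (run from cron):
#   numv=$(sysctl -n debug.numvnodes)
#   freev=$(sysctl -n debug.freevnodes)
#   nowhere=$(sysctl -n debug.vnlru_nowhere)
#   wchan=$(ps -axo wchan,comm | awk '/vnlru/ {print $1; exit}')
#   fmt_sample "$numv" "$freev" "$nowhere" "$wchan" | logger
# Demo with the values from the first log line above:
fmt_sample 522265 19384 209881 vlrup
# -> debug.numvnodes: 522265 - debug.freevnodes: 19384 - debug.vnlru_nowhere: 209881 - vlrup
```

Logging the vnlru wait channel alongside the counters is what lets the thread distinguish the healthy idle state (vlruwt) from the stuck reclaim loop (vlrup).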