Date:      Tue, 18 Dec 2012 20:58:36 -0800
From:      Hub- Marketing <marketing@hub.org>
To:        "freebsd-stable@freebsd.org" <freebsd-stable@freebsd.org>
Subject:   9-STABLE -> NFS -> NetAPP: 
Message-ID:  <B7529290-01FC-4E14-ACE5-1EBFCF2367C3@hub.org>



I'm running a few servers on top of a NetApp file server … everything runs great, but periodically I'm getting:

nfs_getpages: error 13
vm_fault: pager read error, pid 11355 (https)

errors on my screen … not always the same pid … the annoying part is that it always seems to affect the same jail .. if I shut down all jails on that physical server, everything shuts down except for that *one* jail, with a ps listing looking like:

USER   PID %CPU %MEM    VSZ   RSS TT  STAT STARTED    TIME COMMAND
root  6670  0.0  0.0   9936  1372 ??  DsJ   3:00AM 0:00.01 newsyslog
root  6815  0.0  0.0   9936  1288 ??  DsJ   3:00AM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
root  8361  0.0  0.1 220740 11400 ??  DsJ   7:33PM 0:01.25 /usr/local/sbin/httpd -DNOHTTPACCEPT
www   8364  0.0  0.0      0     0 ??  ZJ    7:33PM 0:00.00 <defunct>
www  11866  0.0  0.1 318444 16792 ??  TJ    7:36PM 0:00.03 /usr/local/sbin/httpd -DNOHTTPACCEPT
www  11872  0.0  0.1 297964 14008 ??  TJ    7:36PM 0:00.01 /usr/local/sbin/httpd -DNOHTTPACCEPT
www  11873  0.0  0.1 306156 15028 ??  DEJ   7:36PM 0:00.02 /usr/local/sbin/httpd -DNOHTTPACCEPT
root 17190  0.0  0.0   9936  1240 ??  DsJ   8:00PM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
root 24864  0.0  0.0   9936  1392 ??  DsJ   4:00AM 0:00.01 newsyslog
root 24910  0.0  0.0   9936  1336 ??  DsJ   4:00AM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
root 29972  0.0  0.0   9936  1240 ??  DsJ   9:00PM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
root 34221  0.0  0.0  51480  4332 ??  DsJ   4:47AM 0:00.02 sshd: root@pts/1 (sshd)
root 42452  0.0  0.0   9936  1296 ??  DsJ  10:00PM 0:00.01 newsyslog
root 42522  0.0  0.0   9936  1240 ??  DsJ  10:00PM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
root 55179  0.0  0.0   9936  1296 ??  DsJ  11:00PM 0:00.01 newsyslog
root 55244  0.0  0.0   9936  1240 ??  DsJ  11:00PM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
root 67592  0.0  0.0   9936  1336 ??  DsJ  12:00AM 0:00.01 newsyslog
root 67762  0.0  0.0   9936  1288 ??  DsJ  12:00AM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
root 81603  0.0  0.0   9936  1340 ??  DsJ   1:00AM 0:00.01 newsyslog
root 81640  0.0  0.0   9936  1284 ??  DsJ   1:00AM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
root 93792  0.0  0.0   9936  1344 ??  DsJ   2:00AM 0:00.01 newsyslog
root 93815  0.0  0.0   9936  1288 ??  DsJ   2:00AM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
root 34228  0.0  0.0  67960  4464  1  Ds+J  4:47AM 0:00.00 sshd: root@pts/1 (sshd)
root 38473  0.0  0.0  17556  3272  3  SJ    4:53AM 0:00.02 /bin/tcsh
root 38475  0.0  0.0  14212  1512  3  R+J   4:53AM 0:00.00 ps aux
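
For reading a listing like the one above, a rough sketch (not part of the original report) that tallies the first letter of the STAT column. It assumes the 11-column `ps aux` layout shown here; the helper name `summarize_states` and the shortened sample are made up for illustration. Per FreeBSD ps(1), a leading D means uninterruptible (disk) wait, T stopped, Z zombie; the trailing J only marks a jailed process.

```python
from collections import Counter

# A few lines copied from the listing above (shortened sample).
SAMPLE = """\
USER   PID %CPU %MEM    VSZ   RSS TT  STAT STARTED    TIME COMMAND
root  6670  0.0  0.0   9936  1372 ??  DsJ   3:00AM 0:00.01 newsyslog
root  8361  0.0  0.1 220740 11400 ??  DsJ   7:33PM 0:01.25 /usr/local/sbin/httpd -DNOHTTPACCEPT
www   8364  0.0  0.0      0     0 ??  ZJ    7:33PM 0:00.00 <defunct>
www  11866  0.0  0.1 318444 16792 ??  TJ    7:36PM 0:00.03 /usr/local/sbin/httpd -DNOHTTPACCEPT
root 38475  0.0  0.0  14212  1512  3  R+J   4:53AM 0:00.00 ps aux
"""

def summarize_states(ps_text):
    """Count processes by the first character of the STAT column."""
    counts = Counter()
    for line in ps_text.splitlines():
        # Split into at most 11 fields so COMMAND keeps its spaces.
        fields = line.split(None, 10)
        if len(fields) < 11 or fields[0] == "USER":
            continue  # skip the header and anything malformed
        counts[fields[7][0]] += 1  # fields[7] is STAT, e.g. "DsJ"
    return counts

if __name__ == "__main__":
    for state, n in sorted(summarize_states(SAMPLE).items()):
        print(state, n)
```

Processes stuck in D are waiting inside the kernel, which would be consistent with them ignoring signals until the outstanding I/O completes or fails.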

I can do a 'jexec <JID> /bin/tcsh' to get into the jail, and I can run ps commands, etc … I just can't get those processes to shut down …

everything within the jail is 'up to date' … I've updated the userland and ports … I've checked over the NetApp, but everything appears fine, and it only seems to repeatedly affect that one jail, on that same physical server ...

I have no idea what / how to debug this … thoughts?  help?
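
One possible starting point, offered as a sketch rather than a diagnosis: the 13 in "nfs_getpages: error 13" is presumably a plain errno value, and it can be decoded with Python's standard library. The helper name `describe_errno` is hypothetical.

```python
import errno
import os

def describe_errno(code):
    """Return a short 'number = NAME (message)' string for an errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return "{} = {} ({})".format(code, name, os.strerror(code))

if __name__ == "__main__":
    # 13 is the value from the nfs_getpages line in the report.
    print(describe_errno(13))
```

If that reading is right, 13 is EACCES, so the page-in is being refused on permission grounds rather than timing out, which may point toward export or credential settings for that jail's files on the filer; that is an assumption to check, not a confirmed cause.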

thx




