From: Hub- Marketing
Date: Tue, 18 Dec 2012 20:58:36 -0800
To: "freebsd-stable@freebsd.org"
Subject: 9-STABLE -> NFS -> NetAPP:

I'm running a few servers sitting on top of a NetApp file server ... everything runs great, but periodically I'm getting:

nfs_getpages: error 13
vm_fault: pager read error, pid 11355 (https)

errors on my screen ... not always the same pid ... the annoying part is that it seems to always affect the same jail that is running ..

if I shut down all jails on that physical server, everything shuts down except for that *one* jail, with a ps listing looking like:

USER   PID %CPU %MEM    VSZ   RSS TT  STAT STARTED    TIME COMMAND
root  6670  0.0  0.0   9936  1372 ??  DsJ   3:00AM 0:00.01 newsyslog
root  6815  0.0  0.0   9936  1288 ??  DsJ   3:00AM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
root  8361  0.0  0.1 220740 11400 ??  DsJ   7:33PM 0:01.25 /usr/local/sbin/httpd -DNOHTTPACCEPT
www   8364  0.0  0.0      0     0 ??  ZJ    7:33PM 0:00.00
www  11866  0.0  0.1 318444 16792 ??  TJ    7:36PM 0:00.03 /usr/local/sbin/httpd -DNOHTTPACCEPT
www  11872  0.0  0.1 297964 14008 ??  TJ    7:36PM 0:00.01 /usr/local/sbin/httpd -DNOHTTPACCEPT
www  11873  0.0  0.1 306156 15028 ??  DEJ   7:36PM 0:00.02 /usr/local/sbin/httpd -DNOHTTPACCEPT
root 17190  0.0  0.0   9936  1240 ??  DsJ   8:00PM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
root 24864  0.0  0.0   9936  1392 ??  DsJ   4:00AM 0:00.01 newsyslog
root 24910  0.0  0.0   9936  1336 ??  DsJ   4:00AM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
root 29972  0.0  0.0   9936  1240 ??  DsJ   9:00PM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
root 34221  0.0  0.0  51480  4332 ??  DsJ   4:47AM 0:00.02 sshd: root@pts/1 (sshd)
root 42452  0.0  0.0   9936  1296 ??  DsJ  10:00PM 0:00.01 newsyslog
root 42522  0.0  0.0   9936  1240 ??  DsJ  10:00PM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
root 55179  0.0  0.0   9936  1296 ??  DsJ  11:00PM 0:00.01 newsyslog
root 55244  0.0  0.0   9936  1240 ??  DsJ  11:00PM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
root 67592  0.0  0.0   9936  1336 ??  DsJ  12:00AM 0:00.01 newsyslog
root 67762  0.0  0.0   9936  1288 ??  DsJ  12:00AM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
root 81603  0.0  0.0   9936  1340 ??  DsJ   1:00AM 0:00.01 newsyslog
root 81640  0.0  0.0   9936  1284 ??  DsJ   1:00AM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
root 93792  0.0  0.0   9936  1344 ??  DsJ   2:00AM 0:00.01 newsyslog
root 93815  0.0  0.0   9936  1288 ??  DsJ   2:00AM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
root 34228  0.0  0.0  67960  4464 1   Ds+J  4:47AM 0:00.00 sshd: root@pts/1 (sshd)
root 38473  0.0  0.0  17556  3272 3   SJ    4:53AM 0:00.02 /bin/tcsh
root 38475  0.0  0.0  14212  1512 3   R+J   4:53AM 0:00.00 ps aux

I can do a 'jexec /bin/tcsh' to get into the jail, I can perform ps commands, etc ... I just can't get those processes to shut down ...

everything within the jail is 'up to date' ... updated the userland and ports ...

I've checked over the NetApp, but everything appears fine, and it only seems to repeatedly affect that one jail, on that same physical server ...

I have no ideas on what / how to debug this ... thoughts? help?

thx