Date: Tue, 31 May 2011 19:43:47 +0300
From: Andriy Gapon
To: Patrick Proniewski
Cc: FreeBSD Filesystems
Subject: Re: No physical znode address

on 31/05/2011 18:21 Patrick Proniewski said the following:
> Hi all,
>
> I'm running a FreeBSD 8.2 server, with Apache 2.2 hosting around 260 web sites. It's a virtual machine, running on top of ESXi and a SAN storage.
> The OS is installed on UFS, and a dedicated ZFS disk holds every web sites. Each web site is a ZFS volume created from the zpool "tank".
>
> # zpool list
> NAME   SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
> tank   149G  53.6G  95.4G    35%  ONLINE  -
>
> # zpool status
>   pool: tank
>  state: ONLINE
>  scrub: scrub completed after 0h19m with 0 errors on Fri May 13 22:57:10 2011
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         tank        ONLINE       0     0     0
>           da1       ONLINE       0     0     0
>
> errors: No known data errors
>
> Today, I've noticed an httpd process, stuck, using 100% CPU for hours. It looks like the process has opened non-existing files. Here is a part of the output of lsof:
>
> # lsof -p 10453
> COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
> httpd   10453  www  cwd        No physical znode address: 0xffffff0013019c60
> httpd   10453  www  rtd   VDIR   0,87      512      2 /
> httpd   10453  www  txt   VREG   0,87  1321703 406618 /usr/local/sbin/httpd
> httpd   10453  www  txt   VREG   0,87   246776 235521 /libexec/ld-elf.so.1
> httpd   10453  www  txt   VREG   0,87   154320 659461 /lib/libm.so.5
> ../..
> httpd   10453  www 120r        No physical znode address: 0xffffff00132e2840
> httpd   10453  www 121r        No physical znode address: 0xffffff0013019c60
> httpd   10453  www 122r        No physical znode address: 0xffffff00132e2840
> httpd   10453  www 123r        No physical znode address: 0xffffff0013019c60
> httpd   10453  www 124r        No physical znode address: 0xffffff00132e2840
> httpd   10453  www 125r        No physical znode address: 0xffffff0013019c60
> httpd   10453  www 126r        No physical znode address: 0xffffff00132e2840
> httpd   10453  www 127r        No physical znode address: 0xffffff0013019c60
> httpd   10453  www 128r        No physical znode address: 0xffffff00132e2840
> httpd   10453  www 129r        No physical znode address: 0xffffff0013019c60
> httpd   10453  www 130r        No physical znode address: 0xffffff00132e2840
> httpd   10453  www 131r        No physical znode address: 0xffffff0013019c60
> ../..
>
> Reading a part of lsof's source code, it seems to relate to ZFS (dnode2.c - FreeBSD ZFS node functions for lsof).

I wonder if fstat would have worked any differently from lsof here.
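
For reference, a minimal sketch of the fstat invocation, assuming the same stuck httpd PID (10453) as in the lsof output above. fstat(1) is part of the FreeBSD base system, but whether it resolves these ZFS vnodes any better than lsof has not been verified here. The `command -v` guard is only an assumption-friendly convenience so that the snippet does nothing on a system without fstat.

```shell
#!/bin/sh
# PID of the stuck httpd process, taken from the lsof output above.
pid=10453

# fstat -p restricts the report to the files opened by a single process.
if command -v fstat >/dev/null 2>&1; then
    fstat -p "$pid"
else
    echo "fstat is not available on this system"
fi
```

If fstat prints real inode numbers and mount points for descriptors 120 and up, the "No physical znode address" lines would point at a limitation in lsof rather than at the file system.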
> Using truss, I've discovered that the process is trying to stat a non-existing file, with a way too long path. truss output is a infinite repetition of: > > stat("/Sites/sites//spip-core/sites/spip-core/sites/spip-core/sites/spip-core/sites/spip-core/sites/spip-core/sites/spip-core/sites/spip-core/sites/spip-core/sites/spip-core/sites/spip-core/sites/spip-core/sites/spip-core/sites/spip-core/sites/spip-core/sites/spip-core/sites/spip-core/sites/spip-core/sites/spip-core/sites/spip-core/sites/spip-core/sites/spip-core/sites/spip-core/sites/spip-core/sites/spip-core/sites/spip-core/sites/spip-core/sites/spip-core/sites/spip-core/sites/spip-core/sites/spip-core/sites/spip-core-vh/sites/edhum/bd/.Trashes//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////// //////////////////////////////////////////",0x7fffffffcc90) ERR#63 'File name too long' > > (I had to kill -9 truss process...) I wonder if with ktrace/kdump you wouldn't have to use kill. > Obviously, there is something wrong with this particular web site. But I'm afraid it could come from the file system, or impact the FS. > Any idea is welcome. Unfortunately, nothing for such a mysterious problem. Maybe you could investigate why apache tries to use that overlong path in the first place. Like some undesirable recursion... -- Andriy Gapon