From: Kai Gallasch <gallasch@free.de>
Date: Sat, 9 Oct 2010 15:37:04 +0200
To: freebsd-fs@freebsd.org
Subject: Re: Locked up processes after upgrade to ZFS v15
In-Reply-To: <20101009111241.GA58948@icarus.home.lan>

On 09.10.2010 at 13:12, Jeremy Chadwick wrote:

> On Wed, Oct 06, 2010 at 02:28:31PM +0200, Kai Gallasch wrote:
>> Two days ago I upgraded my server to 8.1-STABLE (amd64) and upgraded ZFS from v14 to v15.
>> After the zpool & zfs upgrade the server was running stable for about half a day, but then apache processes running inside jails would lock up and could not be terminated any more.

> On RELENG_7, the system used ZFS v14, had the same tunings, and had an
> uptime of 221 days w/out issue.

8.0 and 8.1-STABLE + ZFS v14 also ran very solid on my servers - dang!

> With RELENG_8, the system lasted approximately 12 hours (about half a
> day) before getting into a state that looks almost identical to Kai's
> system: existing processes were stuck (unkillable, even with -9). New
> processes could be spawned (including ones which used the ZFS
> filesystems), and commands executed successfully.

Same here. I can provoke the locked-process problem by starting one of my webserver jails. The first httpd process locks up after at most 30 minutes.

The problem is that after many httpd forks apache cannot fork any more child processes, and the stuck (unkillable) httpd processes all hold an open socket bound to the IP address of the webserver. So a restart of apache is not possible, because $IP:80 is already occupied.

The jail also cannot be stopped or started in this state. The only choices are: restart the whole jail-host server (some processes would not die - "ps axl advised" - plus unclean unmounts of UFS partitions), or delete the IP address from the network interface and migrate the jail to another server (zfs send/receive). No fun at all. BTW: zfs destroy does not work here either.

> init complained about wedged processes when the system was rebooted:

I use 'procstat -k -k -a | grep faul' to look for this condition. It finds all processes in the process table whose kernel stack contains 'trap_pfault'.

> Oct 9 02:00:56 init: some processes would not die; ps axl advised
>
> No indication of any hardware issues on the console.

Here too.
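In case it helps anyone check for the same symptoms, this is roughly what I run on the jail host once the lockup hits. The grep patterns and the port are only examples - adjust them to your own setup:

  # list kernel stacks of all processes and keep the ones stuck in a page fault
  procstat -k -k -a | grep trap_pfault

  # confirm that the stuck processes are the ones still holding $IP:80
  sockstat -4 -l | grep :80

The second command is just to verify that the unkillable httpd children are the ones keeping the jail's port 80 occupied.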
> The administrator who was handling the issue did not use "ps -l", "top",
> nor "procstat -k", so we don't have any indication of what the process
> state was in, nor what the kernel calling stack looked like that lead up
> to the wedging. All he stated was that the processes were in D/I
> states, which doesn't help since that's what they're in normally anyway.
> If I was around I would have forced DDB and done "call doadump" to
> investigate things post-mortem.

Another sign is an increased process count in 'top'.

> Monitoring graphs of the system during this time don't indicate any
> signs of memory thrashing (though bsnmp-ucd doesn't provide as much
> granularity as top does); the system looks normal except for a slightly
> decreased load average (probably as a result of the deadlocked
> processes).

My server currently has 28 GB RAM, with less than 60% in use and no special ZFS tuning in loader.conf - although I did try setting vm.pmap.pg_ps_enabled="0" to find out whether the locked processes had anything to do with it. Setting it did not prevent the problem from recurring.

> Aside from the top/procstat/kernel dump aspect, what other information
> would kernel folks be interested in? Is "call doadump" sufficient for
> post-mortem investigation? I need to know since if/when this happens
> again (likely), I want to get folks as much information as possible.

I'm also willing to help, but I need explicit instructions. I could provoke such a lockup on one of my servers, but I don't have much time to leave a server in that state - so there is only a small window in which to collect the wanted debug data.

> Also, a question for Kai: what did you end up doing to resolve this
> problem? Did you roll back to an older FreeBSD, or...?

This bug struck me really hard, because the affected server is not part of a cluster and hosts about 50 jails (mail, web, databases).

The problem is that sockets held open by locked processes cannot be closed, so a restart of a jammed service is not possible.

Theoretically I had the option of booting into the old world/kernel, but I'm sure the old zfs.ko could not mount a ZFS v15 filesystem. AFAIK there is no zfs downgrade command or utility.

Of course a bare-metal recovery of the whole server from tape was also a last option. But really??

My 'solution' (example commands are sketched in the P.S. below):

- move the most unstable jails to other servers and restore them to UFS partitions
- move everything else in the zpool temporarily to other servers running ZFS (zfs send/receive)
- zfs destroy -r
- zpool destroy
- gpart create ...
- gpart add -t freebsd-ufs ...
- restore all jails from ZFS to UFS

So the server is now reverted to UFS - just for my peace of mind - even though I waste around 50% of the RAID capacity on reserved FS allocation and accept all the other disadvantages compared to a volume manager. I will still use ZFS on several machines, but for some time not for critical data. ZFS is a nifty thing, but I really depend on a stable FS. (Of course, for other people ZFS v15 may be running smoothly.)

I must repeat: I offer my help if someone wants to dig into the locking problem.

Regards, Kai.
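P.S. For completeness, here is a rough sketch of the commands behind the migration steps listed above. The pool, dataset, snapshot, host, device and mountpoint names (tank, tank/jails, @evacuate, otherhost, backup, da0, /jails) are only examples - adjust them to your own layout before trying anything like this.

  # snapshot everything recursively and stream it to a temporary host
  zfs snapshot -r tank@evacuate
  zfs send -R tank@evacuate | ssh otherhost zfs receive -d backup

  # once the data is confirmed on the other side, tear down the pool
  zfs destroy -r tank/jails
  zpool destroy tank

  # repartition the former pool disk for UFS and create the filesystem
  gpart create -s gpt da0
  gpart add -t freebsd-ufs -l jails da0
  newfs -U /dev/da0p1
  mkdir -p /jails
  mount /dev/da0p1 /jails

  # copy the jails back from the temporary host onto UFS
  ssh otherhost 'tar -cf - -C /backup/jails .' | tar -xpf - -C /jails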