From: Daniel Andersen <dja@caida.org>
Date: Mon, 18 Aug 2014 12:29:06 -0700
To: freebsd-fs@freebsd.org
Subject: Process enters unkillable state and somewhat wedges zfs

We are currently experiencing a strange problem that effectively locks up one
of our ZFS pools. This is on a FreeBSD 10 machine. Let me give a rough layout
of our system to better describe what is happening.

We have two pools, tank and work, mounted as /tank and /work respectively.
Within these pools we have a variety of datasets. Each of these datasets is
then nullfs-mounted into other locations, in an effort to present a common
directory structure to users. For example:

    /tank/a    -> /data/a
    /tank/a    -> /export/a
    /tank/b    -> /data/b
    /work/home -> /home

and so on.

Occasionally, something goes horribly wrong with a process (or sometimes one
thread within a process). It enters a state where it is running, pegging a
CPU at 100%, and is unkillable. As I understand it, the process is accessing
data on both pools, but only through the nullfs mounts.

When a process enters this state, the /tank pool becomes inaccessible: any
process attempting to touch it enters the traditional 'D' state and itself
becomes unkillable. However, all of the data is *still* accessible through
the nullfs mounts. So while 'ls /tank/a' wedges, 'ls /data/a' works fine.

Typically this happens when the machine is under high load, with ARC memory
usage often at 140+ GB (out of 192 GB total). It has happened under low load
once, but I suspect there was still substantial I/O load at the time: we had
been running many benchmarks trying to trigger this problem, and the cache
was likely still flushing to disk.

Initially we thought this was triggered by a process attempting to dump core,
since all of the processes that originally wedged were doing so. However,
after disabling core dumps, we just had a case where 'sudo -u user su - user'
wedged. This has me baffled.

If anyone has any hints as to where to even start debugging this, I'd
appreciate it greatly. For now I've tuned down the ARC's maximum memory
usage, just as a guess: I saw some bugs mentioned in another thread here
about FreeBSD not giving memory back from the ARC correctly when needed. I
don't know whether that could cause a process to enter this state, but it
seemed worth a try.

Many thanks,
Daniel Andersen
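
P.S. In case the mount setup matters: the nullfs mounts are just ordinary
ones, set up via /etc/fstab entries along these lines (the paths are
illustrative, not our exact layout):

    /tank/a     /data/a    nullfs  rw  0  0
    /tank/a     /export/a  nullfs  rw  0  0
    /tank/b     /data/b    nullfs  rw  0  0
    /work/home  /home      nullfs  rw  0  0

or, equivalently, by hand with 'mount_nullfs /tank/a /data/a'. Nothing
exotic: no union mounts, no nullfs mount options beyond rw.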
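
The next time it wedges, I plan to grab kernel stack traces of the spinning
thread and of one of the 'D'-state processes, since I assume that is the most
useful thing to post here. Roughly (the PID is made up):

    # find stuck processes and what they are sleeping on
    ps -axO wchan,state | grep -w D

    # kernel stack traces for every thread of a stuck process
    procstat -kk 1234

If there is something more useful to capture than procstat -kk output, please
let me know.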
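
For reference, the ARC cap I mentioned was set as a loader tunable in
/boot/loader.conf; the value below is just my guess, not a recommendation
(it can also be given in plain bytes):

    # /boot/loader.conf
    vfs.zfs.arc_max="128G"

and I have been watching the current ARC size with:

    sysctl kstat.zfs.misc.arcstats.size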