From: Lev Serebryakov
Reply-To: lev@FreeBSD.org
To: freebsd-stable@freebsd.org
Date: Mon, 06 Apr 2015 11:55:09 +0300
Subject: 10-STABLE live locks, looks like VM-related
Message-ID: <552249ED.4050700@FreeBSD.org>

I have had several live locks of my server in a row (3 in one week). It is amd64 10-STABLE r277307.

The live locks look VM-related, but they manifest themselves under multi-threaded mixed CPU + I/O load (CrashPlan backup + torrents + openjdk8 rebuild, for example).

This system doesn't have ZFS, but it has several UFS2 SU+J file systems with different block sizes (16 KB and 32 KB)! As far as I remember, several versions ago a "16 KB + 64 KB" mix was a killer (known bug), but "16 + 32" has worked well for several years.

I have INVARIANTS and WITNESS in the kernel, but they don't help: the only report is "bufwait/dirhash" right after booting.

When the system hangs, I can ping it, but NFS, SMB, ssh, the local console, and everything else userland-related stop answering. I can break into the kernel debugger on the console and "panic" manually, so I have two "crash dumps" of the system in this state.

Many processes are in the "vmwait" or "pfault" state according to DDB's "ps" output.

Here are logs from the two latest crashes:

http://lev.serebryakov.spb.ru/freebsd/ll/

I've tested memory with memtest86 for 12 hours without errors.

-- 
// Lev Serebryakov AKA Black Lion
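
For anyone looking at the logs: a rough way to see how many entries in a saved DDB "ps" capture sit in each wait state is a small script like the one below. It is only a sketch and does not assume any particular column layout. It just counts lines that contain the wait-message names mentioned above ("vmwait", "pfault") as whole words. The function name count_wait_states is made up for this illustration and is not part of any FreeBSD tool.

```python
import re
from collections import Counter

# Wait-message names quoted in the report above; extend the tuple as needed.
WAIT_STATES = ("vmwait", "pfault")


def count_wait_states(text, states=WAIT_STATES):
    """Count the lines of `text` that mention each wait state as a whole word.

    No column layout is assumed: a line is counted for a state if the
    state's name appears anywhere on that line as a separate word.
    """
    patterns = {s: re.compile(r"\b" + re.escape(s) + r"\b") for s in states}
    counts = Counter()
    for line in text.splitlines():
        for state, pattern in patterns.items():
            if pattern.search(line):
                counts[state] += 1
    return counts
```

To use it, read the saved console log into a string and pass it in, for example count_wait_states(open("ddb_ps.txt").read()), where "ddb_ps.txt" is a hypothetical file name for the captured output.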