From: Dennis Kögel
To: freebsd-stable@freebsd.org
Subject: Weird I/O hangs (9.1R, arcsas, interrupt spikes on uhci0)
Date: Wed, 19 Jun 2013 15:01:14 +0200

Hi,

very regularly, we see I/O hangs for about 10 seconds, roughly once per minute.

Each time this happens, the I/O rate simply drops to zero and all disk access hangs; this is also very noticeable on the shell, for NFS clients, etc. Everything else (networking, kernel, …) seems to continue normally.

Environment: FreeBSD 9.1R GENERIC on amd64, using ZFS, on an ARC1320 PCIe with 24x Seagate ST33000650SS (3rd party arcsas.ko driver).

It's easy to observe these hangs under write load, e.g.
with 'zpool iostat 1':

void        22.4T  42.6T     34  2.73K  1.07M   293M
void        22.4T  42.6T     20  2.74K   623K   289M
void        22.4T  42.6T    144  2.62K  4.83M   279M
void        22.4T  42.6T     13  2.60K   437K   283M
void        22.4T  42.6T      0      0      0      0   <-- hang starts
void        22.4T  42.6T      0      0      0      0
void        22.4T  42.6T      0      0      0      0
void        22.4T  42.6T      0      0      0      0
void        22.4T  42.6T      0      0      0      0
void        22.4T  42.6T      0      0      0      0
void        22.4T  42.6T      0      0      0      0
void        22.4T  42.6T      0      0      0      0
void        22.4T  42.6T      0    296  4.00K  34.2M   <-- hang ends
void        22.4T  42.6T      2  2.64K  73.8K   288M
void        22.4T  42.6T      8  3.12K   278K   329M

Each time this happens, there is a completely unexplained spike of interrupts on uhci0: 'systat -vm' then displays numbers around 270k.

# vmstat -i | grep -E '(arcsas|uhci0|Total)'
irq16: uhci0                  1227020890      67708
irq24: arcsas0                  12045211        664
Total                         1266417827      69882

Things to note:
- Booting a USB-less kernel or disabling all USB in the BIOS doesn't change a thing (no interrupt spikes to be seen, but the hangs remain)
- The hangs / interrupt spikes happen just as often when the system is idle
- Board is a Supermicro X8DTH
- There are two igb cards
- Root is ZFS as well (separate pool though)
- BIOS, Areca firmware and driver are already at their latest versions
- Moving the controller to a different slot doesn't change the behaviour
- We have two identical systems and both show the exact same symptoms, so flaky hardware is probably not the issue

Any ideas would be appreciated.

Thanks,
D.
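
For anyone who wants to log these stalls rather than watch for them, here is a minimal sketch that scans captured 'zpool iostat 1' output for runs of samples with zero read and zero write operations. It is an illustration only, not something from the setup described above: the names classify and find_stalls and the threshold min_rows=3 are assumptions, and it relies on data rows having the seven columns shown in the output above.

```python
import sys


def classify(line):
    """Return True (stalled), False (active), or None (not a data row)."""
    fields = line.split()
    # Data rows: pool, alloc, free, read ops, write ops, read bw, write bw.
    # Header and separator rows have no digits in the two ops columns.
    if len(fields) < 7 or not (fields[3][0].isdigit() and fields[4][0].isdigit()):
        return None
    return fields[3] == "0" and fields[4] == "0"


def find_stalls(lines, min_rows=3):
    """Return (first_data_row, row_count) for each run of >= min_rows stalled rows."""
    states = [s for s in map(classify, lines) if s is not None]
    stalls, run_start = [], None
    # The trailing False is a sentinel that closes a run still open at the end.
    for index, stalled in enumerate(states + [False]):
        if stalled and run_start is None:
            run_start = index
        elif not stalled and run_start is not None:
            if index - run_start >= min_rows:
                stalls.append((run_start, index - run_start))
            run_start = None
    return stalls


if __name__ == "__main__":
    # Reads until end of input, so give zpool iostat a sample count.
    for start, count in find_stalls(sys.stdin.read().splitlines()):
        print(f"stall: data rows {start}..{start + count - 1} ({count} samples)")
```

With the script saved under a hypothetical name such as find_stalls.py, it could be fed a bounded capture, e.g. 'zpool iostat void 1 120 | python3 find_stalls.py'. Row indices count data rows only, and with a 1-second interval the sample count approximates the stall length in seconds.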