Date: Wed, 19 Jun 2013 15:01:14 +0200
From: Dennis Kögel <dk@neveragain.de>
To: freebsd-stable@freebsd.org
Subject: Weird I/O hangs (9.1R, arcsas, interrupt spikes on uhci0)
Message-ID: <C2AA9591-CBF4-4956-BABE-08BD8994FF8C@neveragain.de>

Hi,

at very regular intervals, roughly once per minute, we see I/O hangs lasting about 10 seconds.

Each time this happens, the I/O rate simply drops to zero and all disk access hangs; this is also very noticeable on the shell, for NFS clients etc. Everything else (networking, kernel, ...) seems to continue normally.

Environment: FreeBSD 9.1R GENERIC on amd64, using ZFS, on an ARC1320 PCIe with 24x Seagate ST33000650SS (3rd party arcsas.ko driver).

It's easy to observe these hangs under write load, e.g. with 'zpool iostat 1':

    void        22.4T  42.6T     34  2.73K  1.07M   293M
    void        22.4T  42.6T     20  2.74K   623K   289M
    void        22.4T  42.6T    144  2.62K  4.83M   279M
    void        22.4T  42.6T     13  2.60K   437K   283M
    void        22.4T  42.6T      0      0      0      0   <-- hang starts
    void        22.4T  42.6T      0      0      0      0
    void        22.4T  42.6T      0      0      0      0
    void        22.4T  42.6T      0      0      0      0
    void        22.4T  42.6T      0      0      0      0
    void        22.4T  42.6T      0      0      0      0
    void        22.4T  42.6T      0      0      0      0
    void        22.4T  42.6T      0      0      0      0
    void        22.4T  42.6T      0    296  4.00K  34.2M   <-- hang ends
    void        22.4T  42.6T      2  2.64K  73.8K   288M
    void        22.4T  42.6T      8  3.12K   278K   329M

Each time this happens, there is a completely unexplained spike of interrupts on uhci0: 'systat -vm' then displays numbers around 270k.

    # vmstat -i | grep -E '(arcsas|uhci0|Total)'
    irq16: uhci0                  1227020890      67708
    irq24: arcsas0                  12045211        664
    Total                         1266417827      69882

Things to note:

- Booting a USB-less kernel or disabling all USB in the BIOS doesn't change a thing (no interrupt spikes to be seen, but the hangs remain)
- The hangs / interrupt spikes happen just as often when the system is idle
- Board is a Supermicro X8DTH
- There are two igb cards
- Root is ZFS as well (separate pool though)
- BIOS, Areca FW and driver are already at their latest versions
- Moving the controller to a different slot doesn't change the behaviour
- We have two identical systems and both show the exact same symptoms, so flaky hardware is probably not the issue

Any ideas would be appreciated.

Thanks,
D.
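
For anyone who wants to measure how often and how long the stalls occur, rather than watching the terminal, the following is a minimal sketch that scans captured 'zpool iostat 1' output for runs of all-zero samples. It is not part of the original report: the function name find_stalls and its min_len threshold are hypothetical, and it assumes the plain seven-column output shown above (pool, alloc, free, two operation counts, two bandwidth figures) with one sample per second.

```python
def find_stalls(lines, pool="void", min_len=3):
    """Return (start_sample, length) for each run of all-zero I/O samples.

    `lines` is any iterable of text lines from `zpool iostat 1`.
    Sample numbers count only the rows that belong to `pool`.
    `min_len` (hypothetical threshold) filters out short idle moments.
    """
    stalls = []
    run_start = None
    idx = 0  # index of the current sample for the chosen pool
    for line in lines:
        fields = line.split()
        # Skip headers, separator rows, blank lines and other pools.
        if len(fields) != 7 or fields[0] != pool:
            continue
        # A stalled sample reports zero operations and zero bandwidth.
        idle = fields[3:] == ["0", "0", "0", "0"]
        if idle and run_start is None:
            run_start = idx
        elif not idle and run_start is not None:
            if idx - run_start >= min_len:
                stalls.append((run_start, idx - run_start))
            run_start = None
        idx += 1
    # Close a run that is still open when the input ends.
    if run_start is not None and idx - run_start >= min_len:
        stalls.append((run_start, idx - run_start))
    return stalls


if __name__ == "__main__":
    import sys
    # Usage (assumed): zpool iostat void 1 > iostat.log; python stalls.py < iostat.log
    for start, length in find_stalls(sys.stdin):
        print(f"stall at sample {start}, about {length} s")
```

Because the function only reports once the input is exhausted, it suits a saved log better than a live pipe. At one sample per second, the run length approximates the stall duration in seconds.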