Date:      Tue, 20 Mar 2012 15:56:52 +1030
From:      Matt Thyer <matt.thyer@gmail.com>
To:        freebsd-stable@freebsd.org
Subject:   157k interrupts per second causing 60% CPU load on idle system
Message-ID:  <CACM2%2B-46zHafjZo0O1dNNvEJm%2B2sUcYboBWwhJ8NxVhXyvpBZQ@mail.gmail.com>

I've upgraded my FreeBSD-STABLE NAS from r225723 (22nd Sept 2011) to
r232477 (4th Mar 2012) and am finding that a system process called "intr"
is now constantly using about 60% of 1 CPU starting a short time after
reboot (possibly triggered by use of the samba server).

When this starts, systat -vm 1 says that the system is 85% idle and 14%
interrupt handling.
It says that there's around 157k interrupts per second.

After a reboot the system is back to its normal state, handling between 3 and
250 or so interrupts per second.
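
One way to see which interrupt source accounts for the jump is to take two
snapshots of "vmstat -i" a few seconds apart and compare the cumulative
totals. The following is only a minimal sketch, assuming the usual three
column layout (interrupt name, total, rate); the function names are
hypothetical and not part of any existing tool.

```python
def parse_vmstat_i(text):
    """Map interrupt source name -> cumulative total from `vmstat -i` output.

    Assumes each data line ends in two integers (total, then rate).
    """
    totals = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and anything that does not end in two integers.
        if len(fields) < 3:
            continue
        if not (fields[-1].isdigit() and fields[-2].isdigit()):
            continue
        name = " ".join(fields[:-2]).rstrip(":")
        if name == "Total":
            continue  # the summary line is not an interrupt source
        totals[name] = int(fields[-2])
    return totals


def busiest_sources(before, after, seconds, top=3):
    """Return the `top` sources by interrupts per second between two snapshots."""
    rates = {
        name: (after[name] - before.get(name, 0)) / seconds
        for name in after
    }
    return sorted(rates.items(), key=lambda kv: kv[1], reverse=True)[:top]
```

Feeding it one snapshot taken before the problem starts and one taken after
should show whether a single device or the timer is responsible for the bulk
of the interrupts.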

The hardware is an Intel Core i3-530 (dual core @ 2.93 GHz with
Hyperthreading) with 8 GB RAM (2x4GB) on a Gigabyte H55M-D2H rev 1.3
motherboard running the latest BIOS (F4).

The system runs a GENERIC kernel with the following significant items in
/boot/loader.conf:

zfs_load="YES"
aio_load="YES"
ahci_load="YES"
geom_mirror_load="YES"
vfs.root.mountfrom="zfs:zroot"
vboxdrv_load="YES"

It has 2 x 300 GB system disks with GPT partitioning and a ZFS mirror for
the OS, as described at http://wiki.freebsd.org/RootOnZFS/GPTZFSBoot/Mirror
Swap is on a gmirror because I want swap to survive the loss of one system
disk.

The NAS data is on a raidz2 pool of 8 disks connected to a SuperMicro
AOC-USAS2-L8i (flashed to behave as an AOC-USAS2-L8e).

The system is basically a CIFS NAS with ports/net/samba36 built with
AIO_SUPPORT and configured like:

   socket options = SO_RCVBUF=131072 SO_SNDBUF=131072 TCP_NODELAY
   min receivefile size=16384
   use sendfile=true
   aio read size = 16384
   aio write size = 16384
   aio write behind = true

The only other interesting workload on the box is a java Minecraft server
using ports/java/jdk16.

I'm going to try to reproduce the problem in a VM and binary search down to
the revision where it started as soon as I can work out a reliable way to
trigger the behaviour (as it doesn't start at boot time).
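
The binary search itself can be sketched as below. This is only an
illustration under stated assumptions: is_bad is a hypothetical callback that
builds and boots the given revision, runs the trigger, and reports whether
the interrupt storm appears, and every revision from the regression onwards
is assumed to show the problem. Revision numbers that do not touch the
stable branch would simply test the same as their predecessor.

```python
def first_bad_revision(good, bad, is_bad):
    """Return the lowest revision in (good, bad] for which is_bad() is true.

    good   -- a revision known to behave normally
    bad    -- a later revision known to show the problem
    is_bad -- hypothetical callback: test one revision, return True or False
    """
    if good >= bad:
        raise ValueError("good revision must be older than bad revision")
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(mid):
            bad = mid   # regression is at or before mid
        else:
            good = mid  # regression is after mid
    return bad
```

Between r225723 and r232477 there are 6754 revisions, so this needs at most
13 build-and-test cycles.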

Any idea what could be the cause?


