Date: Thu, 12 Jun 2014 20:49:28 +0000
From: Steve Wills
To: virtualization@freebsd.org
Subject: Re: interrupt storm on ahci
Message-ID: <20140612204924.GA20784@mouf.net>
References: <20140607212440.GB3163@mouf.net> <539383F3.2060307@freebsd.org> <20140611091816.GB61572@meatwad.mouf.net>
In-Reply-To: <20140611091816.GB61572@meatwad.mouf.net>

On Wed, Jun 11, 2014 at 09:18:17AM +0000, Steve Wills wrote:
> On Sat, Jun 07, 2014 at 02:53:05PM -0700, Neel Natu wrote:
> > Hi Steve,
> >
> > On Sat, Jun 7, 2014 at 2:28 PM, Peter Grehan wrote:
> > > Hi Steve,
> > >
> > >
> > >> I'm running a FreeBSD guest in bhyve on a FreeBSD host. Both are running
> > >> FreeBSD CURRENT, r266947. I've gotten this message about 12 times since
> > >> boot:
> > >>
> > >> interrupt storm detected on "irq268:"; throttling interrupt source
> > >>
> > >> vmstat -i shows:
> > >>
> > >> irq268: ahci1 236514222 839
> > >>
> > >> ahci1 is the second disk connected to the system:
> > >>
> > >> ahci1: mem 0xc0002400-0xc00027ff irq 18
> > >> at device 4.0 on pci0
> > >>
> > >> The VM itself runs poudriere and was building a bunch of packages. At the
> > >> moment, the VM seems to be in a rather odd state. The poudriere jails are
> > >> running, but not doing anything. Ideas?
> > >
> > >
> > > Is this an 8.* host ? I don't believe AHCI has MSI support on that version,
> > > and AHCI legacy interrupts haven't had a huge amount of testing under load.
> > >
> > > If it is 8.*, I'd recommend using virtio-blk for the block device until we
> > > can work out what's going wrong.
> > >
> >
> > The KTR trace would be useful to figure out what's happening.
> >
> > To do that you can compile the host kernel and vmm.ko with the
> > following options:
> > options KTR
> > options KTR_MASK=(KTR_GEN)
> > options KTR_ENTRIES=(4*1024*1024)
> >
> > And when you see the interrupt storm message in the guest you can execute:
> > sudo ktrdump -cto /tmp/ktrdump.out
> >
>
> This was added to the kernel config, the kernel was built and installed and the
> host was rebooted.
> I started the VM and almost as soon as the workload was put
> back, the messages started again. So, I ran the ktrdump command above. I'll
> mail you privately with the location of the output.

Replying to myself to note for the record that the interrupt storm was
harmless: the throttling only reduced performance. (I thought another issue I
was having at the same time was a result of it, but that was just a
coincidence.) The default interrupt storm threshold of 1000 interrupts per
second may be too low in some cases, such as this one. I disabled the
throttling by setting the hw.intr_storm_threshold sysctl to 0 and haven't had
any issues since.

Steve
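
For anyone finding this thread later, the change described above can be
sketched as the fragment below. It is an illustration, not the exact steps
used: it assumes the value is set in the guest that logged the storm message
and that /etc/sysctl.conf is used so the setting survives a reboot. Running
"sysctl hw.intr_storm_threshold=0" as root applies the same value immediately.

```
# /etc/sysctl.conf in the guest (assumed place to persist the setting)
# A value of 0 disables interrupt storm throttling.
hw.intr_storm_threshold=0
```

Since this turns storm protection off entirely, raising the threshold above
the default instead may be worth trying first.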