From: Nicolai
Date: Tue, 17 Mar 2009 13:53:17 +0000
Delivered-To: freebsd-stable@freebsd.org
Subject: Crazy "interrupt storm detected" on atapic0

Hi all,

I have had this problem since day 1 on my new server. It has been running since November 15th 2008 and serves approx. 10 GB worth of web traffic per month for the main site, plus some 40 domains with mail and small web pages (hence it's NOT that busy yet).

I started with 7.1-RELEASE-pX, and I didn't have problems straight off, but that didn't last long. After a few days of running, the interrupt storm on atapci0 starts to show.
It slowly builds up and continues. When it reaches 150-200k/sec, I reboot just to be on the safe side. I have also upgraded to 7.1-STABLE to get all the ATA driver changes S.O.S. has been including. Still no visible change.

To give you an impression of its impact, I will let the numbers speak for themselves:

$ uname -v
FreeBSD 7.1-STABLE #1: Thu Mar 12 14:22:49 CET 2009

$ uname -m
amd64

$ uptime
2:36PM up 4 days, 22:12, 5 users, load averages: 0.28, 0.40, 0.19

$ tail -10 messages
Mar 17 13:42:37 box last message repeated 600 times
Mar 17 13:52:37 box last message repeated 600 times
Mar 17 14:02:37 box last message repeated 600 times
Mar 17 14:12:37 box last message repeated 600 times
Mar 17 14:22:37 box last message repeated 600 times
Mar 17 14:32:22 box last message repeated 585 times
Mar 17 14:32:23 box kernel: pid 78195 (try), uid 0: exited on signal 10 (core dumped)
Mar 17 14:32:23 box kernel: interrupt storm detected on "irq22:"; throttling interrupt source
Mar 17 14:32:54 box last message repeated 31 times
Mar 17 14:34:55 box last message repeated 121 times

$ vmstat -i
interrupt                          total       rate
irq1: atkbd0                           3          0
irq9: acpi0                            1          0
irq16: ohci0                           1          0
irq17: ohci1 ohci3                     1          0
irq18: ohci2 ohci4                     1          0
irq22: atapci0               57317362717     134713
cpu0: timer                    850996016       2000
cpu1: timer                    850995703       2000
Total                        59019354443     138713

[root@box /etc]# atacontrol mode ad4
current mode = SATA300
[root@box /etc]# atacontrol mode ad6
current mode = SATA300

Some relevant lines from dmesg:

atapci0: port 0xb000-0xb007,0xa000-0xa003,0x9000-0x9007,0x8000-0x8003,0x7000-0x700f mem 0xfe7ff800-0xfe7ffbff irq 22 at device 18.0 on pci0
atapci0: [ITHREAD]
atapci0: AHCI Version 01.10 controller with 4 ports detected
ata2: on atapci0
ata2: [ITHREAD]
ata3: on atapci0
ata3: [ITHREAD]
ata4: on atapci0
ata4: [ITHREAD]
ata5: on atapci0
ata5: [ITHREAD]

And a few lines from pciconf:

atapci0@pci0:0:18:0: class=0x01018f card=0x73271462 chip=0x43801002 rev=0x00 hdr=0x00
    vendor     = 'ATI Technologies Inc'
    device     = 'IXP SB600 Serial ATA Controller'
    class      = mass storage
    subclass   = ATA

...so this is where I'm at. The interrupt storm goes through the roof in just 3 days and continues to rise. Just for kicks I tried disabling AHCI with nextboot, but that made the box not boot. Also, I'm 1000 km away from the box, so I'm a little limited in testing fancy boot options, apart from things that can go in nextboot.conf.

If anyone has any hints on how to proceed, I would be grateful.

Thank you in advance
- Nicolai
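
A note on reading the numbers: the rate column of `vmstat -i` is an average since boot (the timer line, 850996016 interrupts at 2000/sec, matches the reported uptime of roughly 4 days 22 hours), so it understates the current rate once the storm has built up. A minimal sketch of measuring the current rate by diffing two `vmstat -i` samples follows. The function names `parse_vmstat_i`, `current_rates` and `sample` are hypothetical and not part of any FreeBSD tool, and the parser assumes the three-column layout shown in this post.

```python
import re
import subprocess
import time

# Assumed layout: "<source>: <device names>  <total>  <rate>", as in this post.
LINE_RE = re.compile(r"^(?P<name>\S+:.*?)\s+(?P<total>\d+)\s+(?P<rate>\d+)\s*$")


def parse_vmstat_i(text):
    """Return {source name: total interrupt count} from `vmstat -i` output.

    The header line and the "Total" line contain no colon and are skipped.
    """
    totals = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            totals[match.group("name").strip()] = int(match.group("total"))
    return totals


def current_rates(before, after, seconds):
    """Interrupts per second per source between two parsed samples."""
    return {
        name: (after[name] - before[name]) / seconds
        for name in after
        if name in before
    }


def sample(interval=10):
    """Run `vmstat -i` twice, `interval` seconds apart, and diff the totals."""
    first = parse_vmstat_i(
        subprocess.run(["vmstat", "-i"], capture_output=True, text=True).stdout
    )
    time.sleep(interval)
    second = parse_vmstat_i(
        subprocess.run(["vmstat", "-i"], capture_output=True, text=True).stdout
    )
    return current_rates(first, second, interval)
```

Calling `sample()` on the affected box and looking at the `irq22: atapci0` entry would give the rate over the last few seconds instead of the since-boot average, which makes it easier to see how fast the storm is growing between reboots.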