From owner-freebsd-net@FreeBSD.ORG Tue Sep 24 08:16:59 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id ECC22442 for ; Tue, 24 Sep 2013 08:16:59 +0000 (UTC) (envelope-from adpacifyer@gmail.com) Received: from mail-we0-x22f.google.com (mail-we0-x22f.google.com [IPv6:2a00:1450:400c:c03::22f]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 549B12A10 for ; Tue, 24 Sep 2013 08:16:59 +0000 (UTC) Received: by mail-we0-f175.google.com with SMTP id q59so4079637wes.6 for ; Tue, 24 Sep 2013 01:16:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=ey3Tqlx8a+wJC0m51dBU3ujD0HkE+xLMiHT/GYzDPnI=; b=utU/bUT8SKLrAYNtX5EDWieRJzQeax9796HsPgLU3SuOR4q3mLKMmOLjDd5SSUAJuS BkDCTWKLiHok8pIyUzH6Qlt2EVrI9jedWcz9GcZfjMEJC+AQgYqq5K2ACKhXWrfqy0Ar fAZpBHncheWo1UZKXqBRQjXQhxsxOk5TSlims/Vcs+yG8spbCN197BuN9GfvaKx93TXi WjTwZ6WEwIBcB2jQSOAZWG7oStcEjdF+M8HRs7fz4gYopLnnwMzvzelJW9szvTN7Bir4 5bGnUvDM6fLixmOA/+Op6yZi6+a3X9UNXXQocEf4nk9I+SyXnDVCYcbz/SUMfIMFGS/3 s6oA== MIME-Version: 1.0 X-Received: by 10.180.98.105 with SMTP id eh9mr16953941wib.56.1380010617725; Tue, 24 Sep 2013 01:16:57 -0700 (PDT) Received: by 10.217.113.72 with HTTP; Tue, 24 Sep 2013 01:16:57 -0700 (PDT) Date: Tue, 24 Sep 2013 11:16:57 +0300 Message-ID: Subject: FreeBSD 9 em driver high inerrupt load + network crashes From: Oleg Tarasov To: freebsd-net@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 24 Sep 2013 08:17:00 -0000 Hello, I am currently experiencing a problem on one of my servers. Interestingly it is the only one with such problem however it is the most loaded one. Sometimes on this server network becomes completely unresponsitive forcing me to attach console and reboot the system. There are no records in the logs that could possibly point me to any direction. However I am seing unexpected CPU interrupt load on this server: CPU: 0.7% user, 0.0% nice, 0.2% system, 18.1% interrupt, 80.9% idle # vmstat -i interrupt total rate irq16: em0 aac0+ 8776771724 99009 <--- This is definitely a problem irq23: ehci1 177381 2 cpu0:timer 8442700 95 irq264: em1:rx 0 12525811 141 irq265: em1:tx 0 10743747 121 irq266: em1:link 3893 0 cpu1:timer 5513791 62 cpu2:timer 5937620 66 cpu3:timer 84101501 948 Total 8904218168 100446 This server is using ipfw firewall with ipfw nat. The kernel is compiled with IPSEC support which has 8 rules in SPD table. The network card causing high load is em0 facing internal network. In ipfw there are just plain allow rules on it. All nat conversions and ipsec operations are done on another interface (em1) facing internet. I see both em0 and aac0 sharing irq 16 so maybe it is a RAID controller causing conflicts or whatever. Currently I can't find a solution or a method of more in-depth diagnostics of the matter. Please advice what can possibly cause the irq overload and how can I fight it? Kernel config: ========================================= cpu HAMMER ident SGP makeoptions DEBUG=-g # Build kernel with gdb(1) debug symbols # IPFW options IPFIREWALL #firewall options IPFIREWALL_VERBOSE #enable logging to syslogd(8) options IPFIREWALL_VERBOSE_LIMIT=300 #limit verbosity options IPFIREWALL_DEFAULT_TO_ACCEPT #allow everything by default options IPFIREWALL_FORWARD #packet destination changes options IPFIREWALL_NAT #ipfw kernel nat support options IPDIVERT #divert sockets # NAT options LIBALIAS # libalias library, performing NAT # IPSEC options IPSEC device crypto # Below is a common GENERIC config ========================================= dmesg output: ========================================= FreeBSD 9.1-RELEASE-p7 #2: Mon Sep 23 10:02:56 EEST 2013 ******@****************:/usr/obj/usr/src/sys/SGP amd64 CPU: Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz (3392.37-MHz K8-class CPU) Origin = "GenuineIntel" Id = 0x206a7 Family = 6 Model = 2a Stepping = 7 Features=0xbfebfbff Features2=0x1f9ae3bf AMD Features=0x28100800 AMD Features2=0x1 TSC: P-state invariant, performance statistics real memory = 8589934592 (8192 MB) avail memory = 8141856768 (7764 MB) Event timer "LAPIC" quality 600 ACPI APIC Table: FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs FreeBSD/SMP: 1 package(s) x 4 core(s) cpu0 (BSP): APIC ID: 0 cpu1 (AP): APIC ID: 2 cpu2 (AP): APIC ID: 4 cpu3 (AP): APIC ID: 6 ioapic0 irqs 0-23 on motherboard kbd1 at kbdmux0 ctl: CAM Target Layer loaded cryptosoft0: on motherboard acpi0: on motherboard acpi0: Power Button (fixed) cpu0: on acpi0 cpu1: on acpi0 cpu2: on acpi0 cpu3: on acpi0 attimer0: port 0x40-0x43 irq 0 on acpi0 Timecounter "i8254" frequency 1193182 Hz quality 0 Event timer "i8254" frequency 1193182 Hz quality 100 atrtc0: port 0x70-0x71 irq 8 on acpi0 Event timer "RTC" frequency 32768 Hz quality 0 hpet0: iomem 0xfed00000-0xfed003ff on acpi0 Timecounter "HPET" frequency 14318180 Hz quality 950 Event timer "HPET" frequency 14318180 Hz quality 550 Event timer "HPET1" frequency 14318180 Hz quality 440 Event timer "HPET2" frequency 14318180 Hz quality 440 Event timer "HPET3" frequency 14318180 Hz quality 440 Event timer "HPET4" frequency 14318180 Hz quality 440 Timecounter "ACPI-fast" frequency 3579545 Hz quality 900 acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0 pcib0: port 0xcf8-0xcff on acpi0 pci0: on pcib0 pcib1: irq 16 at device 1.0 on pci0 pci1: on pcib1 aac0: mem 0xf7800000-0xf79fffff irq 16 at device 0.0 on pci1 aac0: Enabling 64-bit address support aac0: Enable Raw I/O aac0: Enable 64-bit array aac0: New comm. interface enabled aac0: Adaptec 2405, aac driver 2.1.9-1 aacp0 on aac0 aacp1 on aac0 aacp2 on aac0 vgapci0: port 0xf000-0xf03f mem 0xf7400000-0xf77fffff,0xe0000000-0xefffffff irq 16 at device 2.0 on pci0 agp0: on vgapci0 agp0: aperture size is 256M, detected 65532k stolen memory pci0: at device 22.0 (no driver attached) ehci0: mem 0xf7e02000-0xf7e023ff irq 16 at device 26.0 on pci0 usbus0: EHCI version 1.0 usbus0 on ehci0 pcib2: irq 17 at device 28.0 on pci0 pci2: on pcib2 pcib3: irq 16 at device 0.0 on pci2 pci3: on pcib3 em0: port 0xe000-0xe03f mem 0xf7d40000-0xf7d5ffff,0xf7d20000-0xf7d3ffff irq 16 at device 0.0 on pci3 em0: Ethernet address: 90:e2:ba:17:59:d5 pcib4: irq 18 at device 28.2 on pci0 pci4: on pcib4 em1: port 0xd000-0xd01f mem 0xf7cc0000-0xf7cdffff,0xf7c00000-0xf7c7ffff,0xf7ce0000-0xf7ce3fff irq 18 at device 0.0 on pci4 em1: Using MSIX interrupts with 3 vectors em1: Ethernet address: 68:05:ca:05:75:34 ehci1: mem 0xf7e01000-0xf7e013ff irq 23 at device 29.0 on pci0 usbus1: EHCI version 1.0 usbus1 on ehci1 isab0: at device 31.0 on pci0 isa0: on isab0 pci0: at device 31.3 (no driver attached) acpi_button0: on acpi0 orm0: at iomem 0xc0000-0xcdfff,0xce000-0xd47ff,0xd4800-0xd57ff,0xd5800-0xd67ff on isa0 sc0: at flags 0x100 on isa0 sc0: VGA <16 virtual consoles, flags=0x300> vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 atkbdc0: at port 0x60,0x64 on isa0 atkbd0: irq 1 on atkbdc0 kbd0 at atkbd0 atkbd0: [GIANT-LOCKED] ppc0: cannot reserve I/O port range est0: on cpu0 p4tcc0: on cpu0 est1: on cpu1 p4tcc1: on cpu1 est2: on cpu2 p4tcc2: on cpu2 est3: on cpu3 p4tcc3: on cpu3 Timecounters tick every 1.000 msec IPsec: Initialized Security Association Processing. ipfw2 (+ipv6) initialized, divert enabled, nat enabled, rule-based forwarding enabled, default to accept, logging disabled aacd0 on aac0 aacd0: 476150MB (975155200 sectors) usbus0: 480Mbps High Speed USB v2.0 usbus1: 480Mbps High Speed USB v2.0 ugen0.1: at usbus0 uhub0: on usbus0 ugen1.1: at usbus1 uhub1: on usbus1 uhub0: 2 ports with 2 removable, self powered uhub1: 2 ports with 2 removable, self powered ugen0.2: at usbus0 uhub2: on usbus0 ugen1.2: at usbus1 uhub3: on usbus1 uhub2: 6 ports with 6 removable, self powered uhub3: 8 ports with 8 removable, self powered ugen0.3: at usbus0 ukbd0: on usbus0 kbd2 at ukbd0 ums0: on usbus0 ums0: 5 buttons and [XYZ] coordinates ID=1 pass0 at aacp0 bus 0 scbus0 target 0 lun 0 pass0: < ST3500630AS 3.AA> Fixed Uninstalled SCSI-5 device (offline) pass0: 3.300MB/s transfers pass1 at aacp0 bus 0 scbus0 target 1 lun 0 pass1: < ST3500630AS 3.AA> Fixed Uninstalled SCSI-5 device (offline) pass1: 3.300MB/s transfers SMP: AP CPU #1 Launched! SMP: AP CPU #2 Launched! SMP: AP CPU #3 Launched! ========================================= # sysctl -a | grep dev.em dev.em.0.%desc: Intel(R) PRO/1000 Legacy Network Connection 1.0.4 dev.em.0.%driver: em dev.em.0.%location: slot=0 function=0 dev.em.0.%pnpinfo: vendor=0x8086 device=0x107c subvendor=0x8086 subdevice=0x1376 class=0x020000 dev.em.0.%parent: pci3 dev.em.0.nvm: -1 dev.em.0.rx_int_delay: 0 dev.em.0.tx_int_delay: 66 dev.em.0.rx_abs_int_delay: 66 dev.em.0.tx_abs_int_delay: 66 dev.em.0.rx_processing_limit: 100 dev.em.0.flow_control: 3 dev.em.0.mbuf_alloc_fail: 0 dev.em.0.cluster_alloc_fail: 0 dev.em.0.dropped: 0 dev.em.0.tx_dma_fail: 0 dev.em.0.tx_desc_fail1: 0 dev.em.0.tx_desc_fail2: 35079 dev.em.0.rx_overruns: 0 dev.em.0.watchdog_timeouts: 0 dev.em.0.device_control: 1077674561 dev.em.0.rx_control: 32770 dev.em.0.fc_high_water: 47104 dev.em.0.fc_low_water: 45604 dev.em.0.fifo_workaround: 0 dev.em.0.fifo_reset: 0 dev.em.0.txd_head: 83 dev.em.0.txd_tail: 83 dev.em.0.rxd_head: 2 dev.em.0.rxd_tail: 1 dev.em.0.mac_stats.excess_coll: 0 dev.em.0.mac_stats.single_coll: 0 dev.em.0.mac_stats.multiple_coll: 0 dev.em.0.mac_stats.late_coll: 0 dev.em.0.mac_stats.collision_count: 0 dev.em.0.mac_stats.symbol_errors: 0 dev.em.0.mac_stats.sequence_errors: 0 dev.em.0.mac_stats.defer_count: 0 dev.em.0.mac_stats.missed_packets: 0 dev.em.0.mac_stats.recv_no_buff: 0 dev.em.0.mac_stats.recv_undersize: 0 dev.em.0.mac_stats.recv_fragmented: 0 dev.em.0.mac_stats.recv_oversize: 0 dev.em.0.mac_stats.recv_jabber: 0 dev.em.0.mac_stats.recv_errs: 0 dev.em.0.mac_stats.crc_errs: 0 dev.em.0.mac_stats.alignment_errs: 0 dev.em.0.mac_stats.coll_ext_errs: 0 dev.em.0.mac_stats.xon_recvd: 0 dev.em.0.mac_stats.xon_txd: 0 dev.em.0.mac_stats.xoff_recvd: 0 dev.em.0.mac_stats.xoff_txd: 0 dev.em.0.mac_stats.total_pkts_recvd: 15546581 dev.em.0.mac_stats.good_pkts_recvd: 15546581 dev.em.0.mac_stats.bcast_pkts_recvd: 74727 dev.em.0.mac_stats.mcast_pkts_recvd: 0 dev.em.0.mac_stats.rx_frames_64: 2240358 dev.em.0.mac_stats.rx_frames_65_127: 2026067 dev.em.0.mac_stats.rx_frames_128_255: 1667746 dev.em.0.mac_stats.rx_frames_256_511: 757367 dev.em.0.mac_stats.rx_frames_512_1023: 1081371 dev.em.0.mac_stats.rx_frames_1024_1522: 7773672 dev.em.0.mac_stats.good_octets_recvd: 11071669736 dev.em.0.mac_stats.good_octets_txd: 3680329427 dev.em.0.mac_stats.total_pkts_txd: 12769896 dev.em.0.mac_stats.good_pkts_txd: 12769896 dev.em.0.mac_stats.bcast_pkts_txd: 90 dev.em.0.mac_stats.mcast_pkts_txd: 2802292 dev.em.0.mac_stats.tx_frames_64: 5525466 dev.em.0.mac_stats.tx_frames_65_127: 2387324 dev.em.0.mac_stats.tx_frames_128_255: 2280501 dev.em.0.mac_stats.tx_frames_256_511: 337074 dev.em.0.mac_stats.tx_frames_512_1023: 204906 dev.em.0.mac_stats.tx_frames_1024_1522: 2034625 dev.em.0.mac_stats.tso_txd: 0 dev.em.0.mac_stats.tso_ctx_fail: 0 dev.em.1.%desc: Intel(R) PRO/1000 Network Connection 7.3.2 dev.em.1.%driver: em dev.em.1.%location: slot=0 function=0 dev.em.1.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x8086 subdevice=0xa01f class=0x020000 dev.em.1.%parent: pci4 dev.em.1.nvm: -1 dev.em.1.debug: -1 dev.em.1.fc: 3 dev.em.1.rx_int_delay: 0 dev.em.1.tx_int_delay: 66 dev.em.1.rx_abs_int_delay: 66 dev.em.1.tx_abs_int_delay: 66 dev.em.1.rx_processing_limit: 100 dev.em.1.eee_control: 1 dev.em.1.link_irq: 3924 dev.em.1.mbuf_alloc_fail: 0 dev.em.1.cluster_alloc_fail: 0 dev.em.1.dropped: 0 dev.em.1.tx_dma_fail: 3 dev.em.1.rx_overruns: 0 dev.em.1.watchdog_timeouts: 0 dev.em.1.device_control: 1074790984 dev.em.1.rx_control: 67141634 dev.em.1.fc_high_water: 18432 dev.em.1.fc_low_water: 16932 dev.em.1.queue0.txd_head: 502 dev.em.1.queue0.txd_tail: 506 dev.em.1.queue0.tx_irq: 10924208 dev.em.1.queue0.no_desc_avail: 0 dev.em.1.queue0.rxd_head: 130 dev.em.1.queue0.rxd_tail: 129 dev.em.1.queue0.rx_irq: 12733678 dev.em.1.mac_stats.excess_coll: 142 dev.em.1.mac_stats.single_coll: 759301 dev.em.1.mac_stats.multiple_coll: 290042 dev.em.1.mac_stats.late_coll: 1049816 dev.em.1.mac_stats.collision_count: 1767409 dev.em.1.mac_stats.symbol_errors: 0 dev.em.1.mac_stats.sequence_errors: 0 dev.em.1.mac_stats.defer_count: 1959182 dev.em.1.mac_stats.missed_packets: 4794 dev.em.1.mac_stats.recv_no_buff: 0 dev.em.1.mac_stats.recv_undersize: 0 dev.em.1.mac_stats.recv_fragmented: 0 dev.em.1.mac_stats.recv_oversize: 0 dev.em.1.mac_stats.recv_jabber: 0 dev.em.1.mac_stats.recv_errs: 0 dev.em.1.mac_stats.crc_errs: 0 dev.em.1.mac_stats.alignment_errs: 0 dev.em.1.mac_stats.coll_ext_errs: 0 dev.em.1.mac_stats.xon_recvd: 0 dev.em.1.mac_stats.xon_txd: 0 dev.em.1.mac_stats.xoff_recvd: 0 dev.em.1.mac_stats.xoff_txd: 0 dev.em.1.mac_stats.total_pkts_recvd: 13461258 dev.em.1.mac_stats.good_pkts_recvd: 13456464 dev.em.1.mac_stats.bcast_pkts_recvd: 156733 dev.em.1.mac_stats.mcast_pkts_recvd: 0 dev.em.1.mac_stats.rx_frames_64: 1001246 dev.em.1.mac_stats.rx_frames_65_127: 5655384 dev.em.1.mac_stats.rx_frames_128_255: 3983776 dev.em.1.mac_stats.rx_frames_256_511: 539762 dev.em.1.mac_stats.rx_frames_512_1023: 260024 dev.em.1.mac_stats.rx_frames_1024_1522: 2016272 dev.em.1.mac_stats.good_octets_recvd: 4428966188 dev.em.1.mac_stats.good_octets_txd: 12327752567 dev.em.1.mac_stats.total_pkts_txd: 16190870 dev.em.1.mac_stats.good_pkts_txd: 16190870 dev.em.1.mac_stats.bcast_pkts_txd: 5 dev.em.1.mac_stats.mcast_pkts_txd: 0 dev.em.1.mac_stats.tx_frames_64: 547777 dev.em.1.mac_stats.tx_frames_65_127: 2630586 dev.em.1.mac_stats.tx_frames_128_255: 2404829 dev.em.1.mac_stats.tx_frames_256_511: 1283850 dev.em.1.mac_stats.tx_frames_512_1023: 1104051 dev.em.1.mac_stats.tx_frames_1024_1522: 8219777 dev.em.1.mac_stats.tso_txd: 234428 dev.em.1.mac_stats.tso_ctx_fail: 0 dev.em.1.interrupts.asserts: 1990 dev.em.1.interrupts.rx_pkt_timer: 0 dev.em.1.interrupts.rx_abs_timer: 0 dev.em.1.interrupts.tx_pkt_timer: 0 dev.em.1.interrupts.tx_abs_timer: 0 dev.em.1.interrupts.tx_queue_empty: 0 dev.em.1.interrupts.tx_queue_min_thresh: 0 dev.em.1.interrupts.rx_desc_min_thresh: 0 dev.em.1.interrupts.rx_overrun: 0