Date: Thu, 10 Nov 2011 01:50:41 -0800
From: Jeremy Chadwick <freebsd@jdc.parodius.com>
To: Willem Jan Withagen <wjw@digiware.nl>
Cc: "stable@freebsd.org" <stable@freebsd.org>, "Vogel, Jack" <jack.vogel@intel.com>
Subject: Re: em0 watchdog timeout
Message-ID: <20111110095041.GA73812@icarus.home.lan>
In-Reply-To: <4EBB97DF.3020803@digiware.nl>
References: <4EBB97DF.3020803@digiware.nl>
On Thu, Nov 10, 2011 at 10:22:39AM +0100, Willem Jan Withagen wrote:
> Still running this file server on ZFS, and every now and then em0
> goes down, and is not revivable.... Nothing goes in or out the
> box...
>
> Any suggestions as how to (help) fix this?

CC'ing Jack Vogel of Intel.

We need "pciconf -lvbc" output (-lv by itself isn't sufficient in this
regard).

Also, please do "sysctl dev.em.0.debug=1", which will show nothing
useful in the output, however "dmesg" shortly after should have a bunch
of driver-level debugging information that should help (output starts
with "Interface is ..."). Please provide that too.

> Nov 10 09:07:41 zfs kernel: em0: Watchdog timeout -- resetting
> Nov 10 09:07:41 zfs kernel: em0: Queue(0) tdh = 187, hw tdt = 189
> Nov 10 09:07:41 zfs kernel: em0: TX(0) desc avail = 1022,Next TX to Clean = 187
> Nov 10 09:11:32 zfs kernel: em0: Watchdog timeout -- resetting
> Nov 10 09:11:32 zfs kernel: em0: Queue(0) tdh = 139, hw tdt = 151
> Nov 10 09:11:32 zfs kernel: em0: TX(0) desc avail = 1012,Next TX to Clean = 139
> Nov 10 09:16:05 zfs kernel: em0: Watchdog timeout -- resetting
> Nov 10 09:16:05 zfs kernel: em0: Queue(0) tdh = 152, hw tdt = 163
> Nov 10 09:16:05 zfs kernel: em0: TX(0) desc avail = 1013,Next TX to Clean = 152
> Nov 10 09:33:10 zfs kernel: em0: Watchdog timeout -- resetting
> Nov 10 09:33:10 zfs kernel: em0: Queue(0) tdh = 161, hw tdt = 176
> Nov 10 09:33:10 zfs kernel: em0: TX(0) desc avail = 1008,Next TX to Clean = 160
> Nov 10 09:53:18 zfs kernel: em0: Watchdog timeout -- resetting
> Nov 10 09:53:18 zfs kernel: em0: Queue(0) tdh = 157, hw tdt = 172
> Nov 10 09:53:18 zfs kernel: em0: TX(0) desc avail = 1009,Next TX to Clean = 157
>
> Device is:
> Nov 10 10:07:27 zfs kernel: em0: <Intel(R) PRO/1000 Network Connection 7.2.3> port 0x1820-0x183f mem 0xdf900000-0xdf91ffff,0xdf924000-0xdf924fff irq 16 at device 25.0 on pci0
> Nov 10 10:07:27 zfs kernel: em0: Using an MSI interrupt
> Nov 10 10:07:27 zfs kernel: em0: [FILTER]
>
> pciconf -lv:
> em0@pci0:0:25:0: class=0x020000 card=0x10bd15d9
> chip=0x10bd8086 rev=0x02 hdr=0x00
>     vendor     = 'Intel Corporation'
>     device     = 'Intel 82566DM Gigabit Ethernet Adapter (82566DM)'
>     class      = network
>     subclass   = ethernet
>
> uname:
> 8.2-STABLE FreeBSD 8.2-STABLE #12: Sun Oct 2 13:36:55 CEST 2011
> amd64
>
> sysctl -a | grep em.0:
> dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.2.3
> dev.em.0.%driver: em
> dev.em.0.%location: slot=25 function=0 handle=\_SB_.PCI0.LAN_
> dev.em.0.%pnpinfo: vendor=0x8086 device=0x10bd subvendor=0x15d9
> subdevice=0x10bd class=0x020000
> dev.em.0.%parent: pci0
> dev.em.0.nvm: -1
> dev.em.0.debug: -1
> dev.em.0.rx_int_delay: 0
> dev.em.0.tx_int_delay: 66
> dev.em.0.rx_abs_int_delay: 66
> dev.em.0.tx_abs_int_delay: 66
> dev.em.0.rx_processing_limit: 100
> dev.em.0.flow_control: 3
> dev.em.0.eee_control: 0
> dev.em.0.link_irq: 0
> dev.em.0.mbuf_alloc_fail: 0
> dev.em.0.cluster_alloc_fail: 0
> dev.em.0.dropped: 0
> dev.em.0.tx_dma_fail: 0
> dev.em.0.rx_overruns: 6
> dev.em.0.watchdog_timeouts: 5
> dev.em.0.device_control: 1074790976
> dev.em.0.rx_control: 67141634
> dev.em.0.fc_high_water: 8192
> dev.em.0.fc_low_water: 6692
> dev.em.0.queue0.txd_head: 78
> dev.em.0.queue0.txd_tail: 78
> dev.em.0.queue0.tx_irq: 0
> dev.em.0.queue0.no_desc_avail: 0
> dev.em.0.queue0.rxd_head: 376
> dev.em.0.queue0.rxd_tail: 375
> dev.em.0.queue0.rx_irq: 0
> dev.em.0.mac_stats.excess_coll: 0
> dev.em.0.mac_stats.single_coll: 0
> dev.em.0.mac_stats.multiple_coll: 0
> dev.em.0.mac_stats.late_coll: 0
> dev.em.0.mac_stats.collision_count: 0
> dev.em.0.mac_stats.symbol_errors: 0
> dev.em.0.mac_stats.sequence_errors: 0
> dev.em.0.mac_stats.defer_count: 0
> dev.em.0.mac_stats.missed_packets: 9
> dev.em.0.mac_stats.recv_no_buff: 0
> dev.em.0.mac_stats.recv_undersize: 0
> dev.em.0.mac_stats.recv_fragmented: 0
> dev.em.0.mac_stats.recv_oversize: 0
> dev.em.0.mac_stats.recv_jabber: 0
> dev.em.0.mac_stats.recv_errs: 1
> dev.em.0.mac_stats.crc_errs: 1
> dev.em.0.mac_stats.alignment_errs: 0
> dev.em.0.mac_stats.coll_ext_errs: 0
> dev.em.0.mac_stats.xon_recvd: 0
> dev.em.0.mac_stats.xon_txd: 0
> dev.em.0.mac_stats.xoff_recvd: 0
> dev.em.0.mac_stats.xoff_txd: 0
> dev.em.0.mac_stats.total_pkts_recvd: 160062850
> dev.em.0.mac_stats.good_pkts_recvd: 160062840
> dev.em.0.mac_stats.bcast_pkts_recvd: 79648
> dev.em.0.mac_stats.mcast_pkts_recvd: 10220
> dev.em.0.mac_stats.rx_frames_64: 0
> dev.em.0.mac_stats.rx_frames_65_127: 0
> dev.em.0.mac_stats.rx_frames_128_255: 0
> dev.em.0.mac_stats.rx_frames_256_511: 0
> dev.em.0.mac_stats.rx_frames_512_1023: 0
> dev.em.0.mac_stats.rx_frames_1024_1522: 0
> dev.em.0.mac_stats.good_octets_recvd: 107143604749
> dev.em.0.mac_stats.good_octets_txd: 129876768158
> dev.em.0.mac_stats.total_pkts_txd: 179010567
> dev.em.0.mac_stats.good_pkts_txd: 179010567
> dev.em.0.mac_stats.bcast_pkts_txd: 14608
> dev.em.0.mac_stats.mcast_pkts_txd: 206
> dev.em.0.mac_stats.tx_frames_64: 0
> dev.em.0.mac_stats.tx_frames_65_127: 0
> dev.em.0.mac_stats.tx_frames_128_255: 0
> dev.em.0.mac_stats.tx_frames_256_511: 0
> dev.em.0.mac_stats.tx_frames_512_1023: 0
> dev.em.0.mac_stats.tx_frames_1024_1522: 0
> dev.em.0.mac_stats.tso_txd: 3691806
> dev.em.0.mac_stats.tso_ctx_fail: 0
> dev.em.0.interrupts.asserts: 130023913
> dev.em.0.interrupts.rx_pkt_timer: 0
> dev.em.0.interrupts.rx_abs_timer: 0
> dev.em.0.interrupts.tx_pkt_timer: 0
> dev.em.0.interrupts.tx_abs_timer: 0
> dev.em.0.interrupts.tx_queue_empty: 0
> dev.em.0.interrupts.tx_queue_min_thresh: 0
> dev.em.0.interrupts.rx_desc_min_thresh: 0
> dev.em.0.interrupts.rx_overrun: 0
> dev.em.0.wake: 0

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US |
| Making life hard for others since 1977.               PGP 4BD6C0CB |
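
Editorial aside, not part of the original message: the quoted watchdog
lines report the TX ring head (tdh) and tail (tdt) at each reset. A
minimal Python sketch for pulling those values out of a log and
computing how many descriptors were still outstanding might look like
the following. The names RING_SIZE, QUEUE_RE, pending_descriptors and
parse_watchdog_lines are hypothetical helpers, and the ring size of
1024 is an assumption: it is not stated in the thread, only consistent
with the first reported pair (1022 available with 2 outstanding).

```python
import re

# Assumed TX ring size; not stated in the thread, only inferred from the
# "desc avail" figures (e.g. 1022 available with 2 outstanding).
RING_SIZE = 1024

# Matches lines such as:
#   "... kernel: em0: Queue(0) tdh = 187, hw tdt = 189"
QUEUE_RE = re.compile(r"em0: Queue\((\d+)\) tdh = (\d+), hw tdt = (\d+)")


def pending_descriptors(tdh, tdt, ring_size=RING_SIZE):
    """Descriptors between head and tail, allowing for ring wrap-around."""
    return (tdt - tdh) % ring_size


def parse_watchdog_lines(lines):
    """Yield (queue, tdh, tdt, pending) for each 'Queue(n)' log line."""
    for line in lines:
        match = QUEUE_RE.search(line)
        if match:
            queue, tdh, tdt = (int(group) for group in match.groups())
            yield queue, tdh, tdt, pending_descriptors(tdh, tdt)


# Two of the lines quoted in the message above.
sample = [
    "Nov 10 09:07:41 zfs kernel: em0: Queue(0) tdh = 187, hw tdt = 189",
    "Nov 10 09:11:32 zfs kernel: em0: Queue(0) tdh = 139, hw tdt = 151",
]
results = list(parse_watchdog_lines(sample))
for queue, tdh, tdt, pending in results:
    print(f"queue {queue}: tdh={tdh} tdt={tdt} pending={pending}")
```

This only summarises what the log already says; it does not replace the
"pciconf -lvbc" and dev.em.0.debug output requested above.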