From: Jeremy Chadwick
To: Aniruddha
Cc: PYUN Yong-Hyeon, freebsd-questions@freebsd.org
Date: Wed, 15 Oct 2008 04:31:01 -0700
Subject: Re: Under heavy load internet gets killed, only a reboot can bring it back up
Message-ID: <20081015113101.GA76278@icarus.home.lan>
References: <1224054780.4011.20.camel@debian> <20081015072630.GA70901@icarus.home.lan> <1224069478.4247.7.camel@debian>
In-Reply-To: <1224069478.4247.7.camel@debian>

On Wed, Oct 15, 2008 at 01:17:58PM +0200, Aniruddha wrote:
> On Wed, 2008-10-15 at 00:26 -0700, Jeremy Chadwick wrote:
> > On Wed, Oct 15, 2008 at 09:13:00AM +0200, Aniruddha wrote:
> > > Each time my internet connection is under heavy load it gets killed
> > > after a minute or 10. I tried the following commands to get the internet
> > > back up, but nothing helped:
> > >
> > > /etc/rc.d/netif restart
> > > ifconfig mynic down
> > > ifconfig mynic up
> > >
> > > Even worse the last time I issued a '/etc/rc.d/netif restart' my whole
> > > system hardlocked (wasn't responding to capslock presses). So far the
> > > only solution has been to reboot the computer. Is there any way I can
> > > prevent my internet connection from getting killed? How do I get it back
> > > up after it has been killed? Thanks in advance!
> >
> > What network card are you using? Can you provide output from the
> > following commands?
> >
> > dmesg
> > vmstat -i
> > netstat -in
>
> I have a Marvell Yukon onboard nic.
>
> Here's the output:
>
> netstat -in
>
> Name     Mtu Network       Address         Ipkts Ierrs Opkts Oerrs  Coll
> msk0    1500                                  29     0    25     0     0
> msk0    1500 :                                 0     -     5     -     -
> msk0    1500 192.168.2.0/2 192.168.2.111      16     -    14     -     -
> fwe0*   1500                                   0     0     0     0     0
> fwip0   1500                                   0     0     0     0     0
> lo0    16384                                   0     0     0     0     0
> lo0    16384 ::1/128       ::1                 0     -     0     -     -
> lo0    16384 ::1/64                            0     -     0     -     -
> lo0    16384 127.0.0.0/8   127.0.0.1           0     -     0     -     -

This looks okay. I see no interface errors, which is good.

> vmstat -i
> interrupt                          total       rate
> irq17: atapci0+                       13          0
> irq18: atapci1+                     1045          5
> irq20: uhci0 ehci0                 13462         69
> irq21: fwohci0                         3          0
> irq23: atapci3                    102718        529
> cpu0: timer                       386229       1990
> irq256: mskc0                         46          0
> cpu1: timer                       376453       1940
> Total                             879969       4535

msk(4) appears to be using MSI/MSI-X here. One thing worth trying would
be to disable MSI/MSI-X.
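The MSI/MSI-X observation rests on the "irq256: mskc0" entry in the
vmstat -i output: mskc0 sits on IRQ 256, while every other device is on
an IRQ within the 0-23 range that dmesg reports for ioapic0. Below is a
minimal Python sketch of that check, assuming that FreeBSD on x86 numbers
MSI/MSI-X vectors from 256 upward. The names msi_devices and
FIRST_MSI_IRQ are made up for this illustration and are not part of any
FreeBSD tool.

```python
import re

# Sample copied from the vmstat -i output quoted in this thread.
VMSTAT_OUTPUT = """\
interrupt                          total       rate
irq17: atapci0+                       13          0
irq18: atapci1+                     1045          5
irq20: uhci0 ehci0                 13462         69
irq21: fwohci0                         3          0
irq23: atapci3                    102718        529
cpu0: timer                       386229       1990
irq256: mskc0                         46          0
cpu1: timer                       376453       1940
Total                             879969       4535
"""

# Assumption (not stated in the thread): the first MSI vector is 256.
FIRST_MSI_IRQ = 256

# Matches "irqN: <device names> <total> <rate>"; timer and Total lines are skipped.
IRQ_LINE = re.compile(r"^irq(\d+):\s+(.*?)\s+(\d+)\s+(\d+)\s*$")


def msi_devices(vmstat_text):
    """Return {device names: irq} for entries whose IRQ is in the assumed MSI range."""
    found = {}
    for line in vmstat_text.splitlines():
        match = IRQ_LINE.match(line)
        if match is None:
            continue
        irq = int(match.group(1))
        if irq >= FIRST_MSI_IRQ:
            found[match.group(2)] = irq
    return found


if __name__ == "__main__":
    print(msi_devices(VMSTAT_OUTPUT))
```

If MSI/MSI-X ends up disabled after a reboot, one would expect mskc0 to
show up on a legacy IRQ instead (dmesg lists it at irq 19), in which case
this check would return an empty result.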
You can disable MSI and MSI-X by adding the following to your
/boot/loader.conf:

hw.pci.enable_msix="0"
hw.pci.enable_msi="0"

> Copyright (c) 1992-2008 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 7.1-BETA #0: Sun Sep 7 13:49:18 UTC 2008
> root@logan.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz (3001.18-MHz 686-class CPU)
> Origin = "GenuineIntel" Id = 0x10676 Stepping = 6
> Features=0xbfebfbff
> Features2=0x8e3fd>
> AMD Features=0x20000000
> AMD Features2=0x1
> Cores per package: 2
> real memory = 3220701184 (3071 MB)
> avail memory = 3146145792 (3000 MB)
> ACPI APIC Table:
> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
> cpu0 (BSP): APIC ID: 0
> cpu1 (AP): APIC ID: 1
> ioapic0 irqs 0-23 on motherboard
> kbd1 at kbdmux0
> ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
> acpi0: on motherboard
> acpi0: [ITHREAD]
> acpi0: Power Button (fixed)
> acpi0: reservation of 0, a0000 (3) failed
> acpi0: reservation of 100000, bff00000 (3) failed
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
> pcib0: port 0xcf8-0xcff on acpi0
> pci0: on pcib0
> pcib1: irq 16 at device 1.0 on pci0
> pci5: on pcib1
> vgapci0: port 0xc800-0xc8ff mem 0xd0000000-0xdfffffff,0xff9f0000-0xff9fffff irq 16 at device 0.0 on pci5
> pci5: at device 0.1 (no driver attached)
> pci0: at device 27.0 (no driver attached)
> pcib2: irq 16 at device 28.0 on pci0
> pci4: on pcib2
> pcib3: irq 19 at device 28.3 on pci0
> pci3: on pcib3
> mskc0: port 0xb800-0xb8ff mem 0xff8fc000-0xff8fffff irq 19 at device 0.0 on pci3
> msk0: on mskc0
> msk0: Ethernet address: 00:1e:8c:5a:62:da
> miibus0: on msk0
> e1000phy0: PHY 0 on miibus0
> e1000phy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX-FDX, auto
> mskc0: [FILTER]
> pcib4: irq 17 at device 28.5 on pci0
> pci2: on pcib4
> atapci0: mem 0xff7fe000-0xff7fffff irq 17 at device 0.0 on pci2
> atapci0: [ITHREAD]
> atapci0: AHCI Version 01.00 controller with 2 ports detected
> ata2: on atapci0
> ata2: [ITHREAD]
> ata3: on atapci0
> ata3: [ITHREAD]
> atapci1: port 0xac00-0xac07,0xa880-0xa883,0xa800-0xa807,0xa480-0xa483,0xa400-0xa40f at device 0.1 on pci2
> atapci1: [ITHREAD]
> ata4: on atapci1
> ata4: [ITHREAD]
> uhci0: port 0xe480-0xe49f irq 20 at device 29.0 on pci0
> uhci0: [GIANT-LOCKED]
> uhci0: [ITHREAD]
> usb0: on uhci0
> usb0: USB revision 1.0
> uhub0: on usb0
> uhub0: 2 ports with 2 removable, self powered
> uhci1: port 0xe800-0xe81f irq 17 at device 29.1 on pci0
> uhci1: [GIANT-LOCKED]
> uhci1: [ITHREAD]
> usb1: on uhci1
> usb1: USB revision 1.0
> uhub1: on usb1
> uhub1: 2 ports with 2 removable, self powered
> uhci2: port 0xe880-0xe89f irq 18 at device 29.2 on pci0
> uhci2: [GIANT-LOCKED]
> uhci2: [ITHREAD]
> usb2: on uhci2
> usb2: USB revision 1.0
> uhub2: on usb2
> uhub2: 2 ports with 2 removable, self powered
> uhci3: port 0xec00-0xec1f irq 19 at device 29.3 on pci0
> uhci3: [GIANT-LOCKED]
> uhci3: [ITHREAD]
> usb3: on uhci3
> usb3: USB revision 1.0
> uhub3: on usb3
> uhub3: 2 ports with 2 removable, self powered
> ehci0: mem 0xffafbc00-0xffafbfff irq 20 at device 29.7 on pci0
> ehci0: [GIANT-LOCKED]
> ehci0: [ITHREAD]
> usb4: EHCI version 1.0
> usb4: companion controllers, 2 ports each: usb0 usb1 usb2 usb3
> usb4: on ehci0
> usb4: USB revision 2.0
> uhub4: on usb4
> uhub4: 8 ports with 8 removable, self powered
> uhub5: on uhub4
> uhub5: single transaction translator
> uhub5: 4 ports with 4 removable, self powered
> umass0: on uhub5
> umass1: on uhub4
> pcib5: at device 30.0 on pci0
> pci1: on pcib5
> fwohci0: mem 0xff6ff800-0xff6fffff,0xff6f8000-0xff6fbfff irq 21 at device 3.0 on pci1
> fwohci0: [FILTER]
> fwohci0: OHCI version 1.10 (ROM=1)
> fwohci0: No. of Isochronous channels is 4.
> fwohci0: EUI64 00:1e:8c:00:00:15:36:44
> fwohci0: Phy 1394a available S400, 2 ports.
> fwohci0: Link S400, max_rec 2048 bytes.
> firewire0: on fwohci0
> fwe0: on firewire0
> if_fwe0: Fake Ethernet address: 02:1e:8c:15:36:44
> fwe0: Ethernet address: 02:1e:8c:15:36:44
> fwip0: on firewire0
> fwip0: Firewire address: 00:1e:8c:00:00:15:36:44 @ 0xfffe00000000, S400, maxrec 2048
> sbp0: on firewire0
> dcons_crom0: on firewire0
> dcons_crom0: bus_addr 0x1468000
> fwohci0: Initiate bus reset
> fwohci0: BUS reset
> fwohci0: node_id=0xc000ffc0, gen=1, CYCLEMASTER mode
> isab0: at device 31.0 on pci0
> isa0: on isab0
> atapci2: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 31.1 on pci0
> ata0: on atapci2
> ata0: [ITHREAD]
> ata1: on atapci2
> ata1: [ITHREAD]
> atapci3: port 0xe400-0xe407,0xe080-0xe083,0xe000-0xe007,0xdc00-0xdc03,0xd880-0xd88f mem 0xffafb800-0xffafbbff irq 23 at device 31.2 on pci0
> atapci3: [ITHREAD]
> atapci3: AHCI Version 01.10 controller with 4 ports detected
> ata5: on atapci3
> ata5: [ITHREAD]
> ata6: on atapci3
> ata6: [ITHREAD]
> ata7: on atapci3
> ata7: [ITHREAD]
> ata8: on atapci3
> ata8: [ITHREAD]
> pci0: at device 31.3 (no driver attached)
> acpi_hpet0: iomem 0xfed00000-0xfed003ff on acpi0
> Timecounter "HPET" frequency 14318180 Hz quality 900
> cpu0: on acpi0
> est0: on cpu0
> est: CPU supports Enhanced Speedstep, but is not recognized.
> est: cpu_vendor GenuineIntel, msr 61a092006000920
> device_attach: est0 attach returned 6
> p4tcc0: on cpu0
> cpu1: on acpi0
> est1: on cpu1
> est: CPU supports Enhanced Speedstep, but is not recognized.
> est: cpu_vendor GenuineIntel, msr 61a092006000920
> device_attach: est1 attach returned 6
> p4tcc1: on cpu1
> acpi_button0: on acpi0
> sio0: configured irq 4 not in bitmap of probed irqs 0
> sio0: port may not be enabled
> sio0: configured irq 4 not in bitmap of probed irqs 0
> sio0: port may not be enabled
> sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
> sio0: type 16550A
> sio0: [FILTER]
> pmtimer0 on isa0
> orm0: at iomem 0xcf800-0xd27ff pnpid ORM0000 on isa0
> atkbdc0: at port 0x60,0x64 on isa0
> atkbd0: irq 1 on atkbdc0
> kbd0 at atkbd0
> atkbd0: [GIANT-LOCKED]
> atkbd0: [ITHREAD]
> ppc0: parallel port not found.
> sc0: at flags 0x100 on isa0
> sc0: VGA <16 virtual consoles, flags=0x300>
> sio1: configured irq 3 not in bitmap of probed irqs 0
> sio1: port may not be enabled
> vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
> ums0: on uhub0
> ums0: 7 buttons and Z dir.
> ukbd0: on uhub0
> kbd2 at ukbd0
> ukbd1: on uhub0
> kbd3 at ukbd1
> uhid0: on uhub0
> uhid1: on uhub1
> Timecounters tick every 1.000 msec
> firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me)
> firewire0: bus manager 0 (me)
> acd0: DVDR at ata4-master UDMA66
> ad10: 953869MB at ata5-master SATA300
> ad14: 953869MB at ata7-master SATA300
> ad16: 152627MB at ata8-master SATA300
> GEOM_LABEL: Label for provider ad10s1 is ext2fs/data2.
> GEOM_LABEL: Label for provider ad14s2 is ext2fs/root.
> GEOM_LABEL: Label for provider ad14s3 is ext2fs/home.
> GEOM_LABEL: Label for provider ad14s4 is ext2fs/data.
> da0 at umass-sim0 bus 0 target 0 lun 0
> da0: Removable Direct Access SCSI-2 device
> da0: 40.000MB/s transfers
> da0: 7712MB (15794176 512 byte sectors: 255H 63S/T 983C)
> SMP: AP CPU #1 Launched!
> da1 at umass-sim1 bus 1 target 0 lun 0
> da1: Removable Direct Access SCSI-0 device
> da1: 40.000MB/s transfers
> da1: Attempt to query device size failed: NOT READY, Medium not present
> da2 at umass-sim1 bus 1 target 0 lun 1
> da2: Removable Direct Access SCSI-0 device
> da2: 40.000MB/s transfers
> da2: Attempt to query device size failed: NOT READY, Medium not present
> da3 at umass-sim1 bus 1 target 0 lun 2
> da3: Removable Direct Access SCSI-0 device
> da3: 40.000MB/s transfers
> da3: Attempt to query device size failed: NOT READY, Medium not present
> da4 at umass-sim1 bus 1 target 0 lun 3
> da4: Removable Direct Access SCSI-0 device
> da4: 40.000MB/s transfers
> da4: Attempt to query device size failed: NOT READY, Medium not present
> Trying to mount root from ufs:/dev/ad16s3a
> WARNING: / was not properly dismounted
> GEOM_LABEL: Label ext2fs/home removed.
> GEOM_LABEL: Label ext2fs/data removed.
> mskc0: Uncorrectable PCI Express error
> mskc0: Uncorrectable PCI Express error

Those errors at the end of your dmesg don't look good; could be the
sign of a NIC or motherboard that's going bad, or possibly a very
strange driver problem.

Adding Yong-Hyeon PYUN to this thread, since he helps maintain the
msk(4) driver.

Yong-Hyeon, do you know of any conditions where heavy network I/O
could cause msk(4) to lock up or stop transmitting traffic, or
possibly hard-lock on ifconfig down/up?

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |