Date: Wed, 15 Oct 2008 04:31:01 -0700
From: Jeremy Chadwick <koitsu@FreeBSD.org>
To: Aniruddha <mailing_list@orange.nl>
Cc: PYUN Yong-Hyeon <pyunyh@gmail.com>, freebsd-questions@freebsd.org
Subject: Re: Under heavy load internet gets killed, only a reboot can bring it back up
Message-ID: <20081015113101.GA76278@icarus.home.lan>
In-Reply-To: <1224069478.4247.7.camel@debian>
References: <1224054780.4011.20.camel@debian> <20081015072630.GA70901@icarus.home.lan> <1224069478.4247.7.camel@debian>
On Wed, Oct 15, 2008 at 01:17:58PM +0200, Aniruddha wrote:
> On Wed, 2008-10-15 at 00:26 -0700, Jeremy Chadwick wrote:
> > On Wed, Oct 15, 2008 at 09:13:00AM +0200, Aniruddha wrote:
> > > Each time my internet connection is under heavy load it gets killed
> > > after a minute or 10. I tried the following commands to get the internet
> > > back up, but nothing helped:
> > >
> > > /etc/rc.d/netif restart
> > > ifconfig mynic down
> > > ifconfig mynic up
> > >
> > > Even worse the last time I issued a '/etc/rc.d/netif restart' my whole
> > > system hardlocked (wasn't responding to capslock presses). So far the
> > > only solution has been to reboot the computer. Is there any way I can
> > > prevent my internet connection from getting killed? How do I get it back
> > > up after it has been killed? Thanks in advance!
> >
> > What network card are you using? Can you provide output from the
> > following commands?
> >
> > dmesg
> > vmstat -i
> > netstat -in
> >
> I have a Marvell Yukon onboard nic.
>
> Here's the output:
>
> netstat -in
>
> Name    Mtu Network       Address        Ipkts Ierrs Opkts Oerrs  Coll
> msk0   1500 <Link#1>                        29     0    25     0     0
> msk0   1500 :                                0     -     5     -     -
> msk0   1500 192.168.2.0/2 192.168.2.111     16     -    14     -     -
> fwe0*  1500 <Link#2>                         0     0     0     0     0
> fwip0  1500 <Link#3>                         0     0     0     0     0
> lo0   16384 <Link#4>                         0     0     0     0     0
> lo0   16384 ::1/128       ::1                0     -     0     -     -
> lo0   16384 ::1/64                           0     -     0     -     -
> lo0   16384 127.0.0.0/8   127.0.0.1          0     -     0     -     -

This looks okay. I see no interface errors, which is good.

> vmstat -i
> interrupt                 total   rate
> irq17: atapci0+              13      0
> irq18: atapci1+            1045      5
> irq20: uhci0 ehci0        13462     69
> irq21: fwohci0                3      0
> irq23: atapci3           102718    529
> cpu0: timer              386229   1990
> irq256: mskc0                46      0
> cpu1: timer              376453   1940
> Total                    879969   4535

msk(4) appears to be using MSI/MSI-X here. One thing worth trying would
be to disable MSI/MSI-X.
You can disable these by adding the following to your /boot/loader.conf:

hw.pci.enable_msix="0"
hw.pci.enable_msi="0"

> Copyright (c) 1992-2008 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 7.1-BETA #0: Sun Sep 7 13:49:18 UTC 2008
> root@logan.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz (3001.18-MHz 686-class CPU)
> Origin = "GenuineIntel" Id = 0x10676 Stepping = 6
> Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
> Features2=0x8e3fd<SSE3,RSVD2,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,<b19>>
> AMD Features=0x20000000<LM>
> AMD Features2=0x1<LAHF>
> Cores per package: 2
> real memory = 3220701184 (3071 MB)
> avail memory = 3146145792 (3000 MB)
> ACPI APIC Table: <A_M_I_ OEMAPIC >
> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
> cpu0 (BSP): APIC ID: 0
> cpu1 (AP): APIC ID: 1
> ioapic0 <Version 2.0> irqs 0-23 on motherboard
> kbd1 at kbdmux0
> ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
> acpi0: <A_M_I_ OEMRSDT> on motherboard
> acpi0: [ITHREAD]
> acpi0: Power Button (fixed)
> acpi0: reservation of 0, a0000 (3) failed
> acpi0: reservation of 100000, bff00000 (3) failed
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
> pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
> pci0: <ACPI PCI bus> on pcib0
> pcib1: <ACPI PCI-PCI bridge> irq 16 at device 1.0 on pci0
> pci5: <ACPI PCI bus> on pcib1
> vgapci0: <VGA-compatible display> port 0xc800-0xc8ff mem 0xd0000000-0xdfffffff,0xff9f0000-0xff9fffff irq 16 at device 0.0 on pci5
> pci5: <multimedia> at device 0.1 (no driver attached)
> pci0: <multimedia> at device 27.0 (no driver attached)
> pcib2: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
> pci4: <ACPI PCI bus> on pcib2
> pcib3: <ACPI PCI-PCI bridge> irq 19 at device 28.3 on pci0
> pci3: <ACPI PCI bus> on pcib3
> mskc0: <Marvell Yukon 88E8053 Gigabit Ethernet> port 0xb800-0xb8ff mem 0xff8fc000-0xff8fffff irq 19 at device 0.0 on pci3
> msk0: <Marvell Technology Group Ltd. Yukon EC Id 0xb6 Rev 0x02> on mskc0
> msk0: Ethernet address: 00:1e:8c:5a:62:da
> miibus0: <MII bus> on msk0
> e1000phy0: <Marvell 88E1111 Gigabit PHY> PHY 0 on miibus0
> e1000phy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX-FDX, auto
> mskc0: [FILTER]
> pcib4: <ACPI PCI-PCI bridge> irq 17 at device 28.5 on pci0
> pci2: <ACPI PCI bus> on pcib4
> atapci0: <JMicron AHCI controller> mem 0xff7fe000-0xff7fffff irq 17 at device 0.0 on pci2
> atapci0: [ITHREAD]
> atapci0: AHCI Version 01.00 controller with 2 ports detected
> ata2: <ATA channel 0> on atapci0
> ata2: [ITHREAD]
> ata3: <ATA channel 1> on atapci0
> ata3: [ITHREAD]
> atapci1: <JMicron JMB363 UDMA133 controller> port 0xac00-0xac07,0xa880-0xa883,0xa800-0xa807,0xa480-0xa483,0xa400-0xa40f at device 0.1 on pci2
> atapci1: [ITHREAD]
> ata4: <ATA channel 0> on atapci1
> ata4: [ITHREAD]
> uhci0: <UHCI (generic) USB controller> port 0xe480-0xe49f irq 20 at device 29.0 on pci0
> uhci0: [GIANT-LOCKED]
> uhci0: [ITHREAD]
> usb0: <UHCI (generic) USB controller> on uhci0
> usb0: USB revision 1.0
> uhub0: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb0
> uhub0: 2 ports with 2 removable, self powered
> uhci1: <UHCI (generic) USB controller> port 0xe800-0xe81f irq 17 at device 29.1 on pci0
> uhci1: [GIANT-LOCKED]
> uhci1: [ITHREAD]
> usb1: <UHCI (generic) USB controller> on uhci1
> usb1: USB revision 1.0
> uhub1: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb1
> uhub1: 2 ports with 2 removable, self powered
> uhci2: <UHCI (generic) USB controller> port 0xe880-0xe89f irq 18 at device 29.2 on pci0
> uhci2: [GIANT-LOCKED]
> uhci2: [ITHREAD]
> usb2: <UHCI (generic) USB controller> on uhci2
> usb2: USB revision 1.0
> uhub2: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb2
> uhub2: 2 ports with 2 removable, self powered
> uhci3: <UHCI (generic) USB controller> port 0xec00-0xec1f irq 19 at device 29.3 on pci0
> uhci3: [GIANT-LOCKED]
> uhci3: [ITHREAD]
> usb3: <UHCI (generic) USB controller> on uhci3
> usb3: USB revision 1.0
> uhub3: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb3
> uhub3: 2 ports with 2 removable, self powered
> ehci0: <Intel 82801GB/R (ICH7) USB 2.0 controller> mem 0xffafbc00-0xffafbfff irq 20 at device 29.7 on pci0
> ehci0: [GIANT-LOCKED]
> ehci0: [ITHREAD]
> usb4: EHCI version 1.0
> usb4: companion controllers, 2 ports each: usb0 usb1 usb2 usb3
> usb4: <Intel 82801GB/R (ICH7) USB 2.0 controller> on ehci0
> usb4: USB revision 2.0
> uhub4: <Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1> on usb4
> uhub4: 8 ports with 8 removable, self powered
> uhub5: <vendor 0x05e3 USB2.0 Hub, class 9/0, rev 2.00/7.02, addr 2> on uhub4
> uhub5: single transaction translator
> uhub5: 4 ports with 4 removable, self powered
> umass0: <USB 2.0 USB Flash Drive, class 0/0, rev 2.00/1.00, addr 3> on uhub5
> umass1: <Generic Mass Storage Device, class 0/0, rev 2.00/1.00, addr 4> on uhub4
> pcib5: <ACPI PCI-PCI bridge> at device 30.0 on pci0
> pci1: <ACPI PCI bus> on pcib5
> fwohci0: <Texas Instruments TSB43AB22/A> mem 0xff6ff800-0xff6fffff,0xff6f8000-0xff6fbfff irq 21 at device 3.0 on pci1
> fwohci0: [FILTER]
> fwohci0: OHCI version 1.10 (ROM=1)
> fwohci0: No. of Isochronous channels is 4.
> fwohci0: EUI64 00:1e:8c:00:00:15:36:44
> fwohci0: Phy 1394a available S400, 2 ports.
> fwohci0: Link S400, max_rec 2048 bytes.
> firewire0: <IEEE1394(FireWire) bus> on fwohci0
> fwe0: <Ethernet over FireWire> on firewire0
> if_fwe0: Fake Ethernet address: 02:1e:8c:15:36:44
> fwe0: Ethernet address: 02:1e:8c:15:36:44
> fwip0: <IP over FireWire> on firewire0
> fwip0: Firewire address: 00:1e:8c:00:00:15:36:44 @ 0xfffe00000000, S400, maxrec 2048
> sbp0: <SBP-2/SCSI over FireWire> on firewire0
> dcons_crom0: <dcons configuration ROM> on firewire0
> dcons_crom0: bus_addr 0x1468000
> fwohci0: Initiate bus reset
> fwohci0: BUS reset
> fwohci0: node_id=0xc000ffc0, gen=1, CYCLEMASTER mode
> isab0: <PCI-ISA bridge> at device 31.0 on pci0
> isa0: <ISA bus> on isab0
> atapci2: <Intel ICH7 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 31.1 on pci0
> ata0: <ATA channel 0> on atapci2
> ata0: [ITHREAD]
> ata1: <ATA channel 1> on atapci2
> ata1: [ITHREAD]
> atapci3: <Intel AHCI controller> port 0xe400-0xe407,0xe080-0xe083,0xe000-0xe007,0xdc00-0xdc03,0xd880-0xd88f mem 0xffafb800-0xffafbbff irq 23 at device 31.2 on pci0
> atapci3: [ITHREAD]
> atapci3: AHCI Version 01.10 controller with 4 ports detected
> ata5: <ATA channel 0> on atapci3
> ata5: [ITHREAD]
> ata6: <ATA channel 1> on atapci3
> ata6: [ITHREAD]
> ata7: <ATA channel 2> on atapci3
> ata7: [ITHREAD]
> ata8: <ATA channel 3> on atapci3
> ata8: [ITHREAD]
> pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
> acpi_hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
> Timecounter "HPET" frequency 14318180 Hz quality 900
> cpu0: <ACPI CPU> on acpi0
> est0: <Enhanced SpeedStep Frequency Control> on cpu0
> est: CPU supports Enhanced Speedstep, but is not recognized.
> est: cpu_vendor GenuineIntel, msr 61a092006000920
> device_attach: est0 attach returned 6
> p4tcc0: <CPU Frequency Thermal Control> on cpu0
> cpu1: <ACPI CPU> on acpi0
> est1: <Enhanced SpeedStep Frequency Control> on cpu1
> est: CPU supports Enhanced Speedstep, but is not recognized.
> est: cpu_vendor GenuineIntel, msr 61a092006000920
> device_attach: est1 attach returned 6
> p4tcc1: <CPU Frequency Thermal Control> on cpu1
> acpi_button0: <Power Button> on acpi0
> sio0: configured irq 4 not in bitmap of probed irqs 0
> sio0: port may not be enabled
> sio0: configured irq 4 not in bitmap of probed irqs 0
> sio0: port may not be enabled
> sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
> sio0: type 16550A
> sio0: [FILTER]
> pmtimer0 on isa0
> orm0: <ISA Option ROM> at iomem 0xcf800-0xd27ff pnpid ORM0000 on isa0
> atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
> atkbd0: <AT Keyboard> irq 1 on atkbdc0
> kbd0 at atkbd0
> atkbd0: [GIANT-LOCKED]
> atkbd0: [ITHREAD]
> ppc0: parallel port not found.
> sc0: <System console> at flags 0x100 on isa0
> sc0: VGA <16 virtual consoles, flags=0x300>
> sio1: configured irq 3 not in bitmap of probed irqs 0
> sio1: port may not be enabled
> vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
> ums0: <Razer Razer Lachesis, class 0/0, rev 1.10/21.00, addr 2> on uhub0
> ums0: 7 buttons and Z dir.
> ukbd0: <Razer Razer Lachesis, class 0/0, rev 1.10/21.00, addr 2> on uhub0
> kbd2 at ukbd0
> ukbd1: <Logitech USB Multimedia Keyboard, class 0/0, rev 1.10/0.70, addr 3> on uhub0
> kbd3 at ukbd1
> uhid0: <Logitech USB Multimedia Keyboard, class 0/0, rev 1.10/0.70, addr 3> on uhub0
> uhid1: <Logitech Logitech Attack 3, class 0/0, rev 1.10/2.05, addr 2> on uhub1
> Timecounters tick every 1.000 msec
> firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me)
> firewire0: bus manager 0 (me)
> acd0: DVDR <ATAPI DVD A DH20A4P/9P58> at ata4-master UDMA66
> ad10: 953869MB <SAMSUNG HD103UJ 1AA01109> at ata5-master SATA300
> ad14: 953869MB <SAMSUNG HD103UJ 1AA01109> at ata7-master SATA300
> ad16: 152627MB <WDC WD1600AAJS-00WAA0 58.01D58> at ata8-master SATA300
> GEOM_LABEL: Label for provider ad10s1 is ext2fs/data2.
> GEOM_LABEL: Label for provider ad14s2 is ext2fs/root.
> GEOM_LABEL: Label for provider ad14s3 is ext2fs/home.
> GEOM_LABEL: Label for provider ad14s4 is ext2fs/data.
> da0 at umass-sim0 bus 0 target 0 lun 0
> da0: <USB 2.0 USB Flash Drive 0.00> Removable Direct Access SCSI-2 device
> da0: 40.000MB/s transfers
> da0: 7712MB (15794176 512 byte sectors: 255H 63S/T 983C)
> SMP: AP CPU #1 Launched!
> da1 at umass-sim1 bus 1 target 0 lun 0
> da1: <Generic USB SD Reader 1.00> Removable Direct Access SCSI-0 device
> da1: 40.000MB/s transfers
> da1: Attempt to query device size failed: NOT READY, Medium not present
> da2 at umass-sim1 bus 1 target 0 lun 1
> da2: <Generic USB CF Reader 1.01> Removable Direct Access SCSI-0 device
> da2: 40.000MB/s transfers
> da2: Attempt to query device size failed: NOT READY, Medium not present
> da3 at umass-sim1 bus 1 target 0 lun 2
> da3: <Generic USB SM Reader 1.02> Removable Direct Access SCSI-0 device
> da3: 40.000MB/s transfers
> da3: Attempt to query device size failed: NOT READY, Medium not present
> da4 at umass-sim1 bus 1 target 0 lun 3
> da4: <Generic USB MS Reader 1.03> Removable Direct Access SCSI-0 device
> da4: 40.000MB/s transfers
> da4: Attempt to query device size failed: NOT READY, Medium not present
> Trying to mount root from ufs:/dev/ad16s3a
> WARNING: / was not properly dismounted
> GEOM_LABEL: Label ext2fs/home removed.
> GEOM_LABEL: Label ext2fs/data removed.
> mskc0: Uncorrectable PCI Express error
> mskc0: Uncorrectable PCI Express error

Those errors at the end of your dmesg don't look good; they could be a
sign of a NIC or motherboard that's going bad, or possibly a very
strange driver problem.

Adding Yong-Hyeon PYUN to this thread, since he helps maintain the
msk(4) driver.

Yong-Hyeon, do you know of any conditions where heavy network I/O could
cause msk(4) to lock up or stop transmitting traffic, or possibly
hard-lock on ifconfig down/up?
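If you do try the loader.conf change and reboot, one rough way to confirm it took effect is to see which IRQ mskc0 lands on in vmstat -i. What follows is only a minimal sketch, not anything taken from the msk(4) documentation: the helper name irq_kind is made up for illustration, and it assumes that IRQ numbers of 256 and above are MSI/MSI-X vectors on this release, which is what the irq256 entry in your output suggests. The sample input is a few lines of your own vmstat -i output.

```shell
#!/bin/sh
# irq_kind DEVICE (hypothetical helper): read `vmstat -i` output on stdin
# and print one of:
#   msi      DEVICE is listed on an IRQ numbered 256 or higher
#   legacy   DEVICE is listed on a lower-numbered IRQ
#   unknown  DEVICE does not appear on any "irqN:" line
# Assumption (not stated in the original mail): IRQ >= 256 means MSI/MSI-X.
irq_kind() {
    awk -v dev="$1" '
        $1 ~ /^irq[0-9]+:$/ {
            # Fields 2..NF-2 are device names; the last two are total and rate.
            for (i = 2; i <= NF - 2; i++) {
                name = $i
                sub(/\+$/, "", name)       # drop the trailing "+" marker
                if (name == dev) {
                    n = $1
                    gsub(/[^0-9]/, "", n)  # "irq256:" becomes "256"
                    kind = (n + 0 >= 256) ? "msi" : "legacy"
                    found = 1
                    exit                   # jumps to END
                }
            }
        }
        END { if (found) print kind; else print "unknown" }
    '
}

# Sample run against lines from the vmstat -i output quoted in this thread.
irq_kind mskc0 <<'EOF'
irq20: uhci0 ehci0 13462 69
irq23: atapci3 102718 529
cpu0: timer 386229 1990
irq256: mskc0 46 0
EOF
```

On the live system the equivalent would be `vmstat -i | irq_kind mskc0`. If it still reports msi after the reboot, the tunables were probably not picked up.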
-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |