Date: Wed, 6 Jun 2007 08:58:24 -0600
From: "Cory Marsh" <cory@clearwateranalytics.com>
To: <freebsd-geom@freebsd.org>
Subject: Gmirror, broken ggate locks system
Message-ID: <FFC5D7C34B550A4DB09CAE727D8D7EE3014EB5@maildb1.arbfund.com>
I am experiencing a gmirror issue on my mirrored partitions. These partitions work great replicating data over a gmirror interface to another machine. Everything goes just fine until the ggate interface in the gmirror goes down (backup machine reboot, network problem, etc.). At that point, the machine with the gmirror locks up. Any process that is currently running will continue to run, so long as it does not access the disk in any way. As soon as a disk request happens, that process locks hard. This forces me to shut down the machine ungracefully.

Is this the expected behavior? Shouldn't gmirror detect the stale (unresponsive) component and deactivate it? Is it a problem because my primary consumer is the ggate device? Is there a better configuration to achieve the same result?

Any ideas/suggestions would be appreciated.

Thanks!
-Cory

%uname -a
FreeBSD cwanfs1.arbfund.com 6.2-RELEASE FreeBSD 6.2-RELEASE #0: Fri Jan 12 23:34:43 MST 2007 root@:/usr/obj/usr/src/sys/GENERIC amd64

%gmirror list data
Geom name: data
State: COMPLETE
Components: 2
Balance: prefer
Slice: 4096
Flags: NOAUTOSYNC
GenID: 2
SyncID: 21
ID: 1381569007
Providers:
1. Name: mirror/data
   Mediasize: 10737417728 (10G)
   Sectorsize: 512
   Mode: r1w1e1
Consumers:
1. Name: ggate0
   Mediasize: 10737418240 (10G)
   Sectorsize: 512
   Mode: r1w1e1
   State: ACTIVE
   Priority: 1
   Flags: NONE
   GenID: 2
   SyncID: 21
   ID: 1578386556
2. Name: ar0s1g
   Mediasize: 10737418240 (10G)
   Sectorsize: 512
   Mode: r1w1e1
   State: ACTIVE
   Priority: 100
   Flags: NONE
   GenID: 2
   SyncID: 21
   ID: 1982490913

Here is some information about the problem that locked the machine. I got these messages for 20 minutes (about 100 of them) before I noticed the machine was down; it could have locked up right after the first message. It looks like a network card issue disconnected the ggate devices, and then the machine locked.

/var/log/messages:
...
Jun 5 17:10:15 cwanfs1 kernel: nfe0: watchdog timeout (missed Tx interrupts) -- recovering
Jun 5 17:10:28 cwanfs1 ggatec: Lost connection 1.
Jun 5 17:10:28 cwanfs1 ggatec: Disconnected [10.10.10.2 /dev/ar0s1g]. Connecting...
Jun 5 17:10:59 cwanfs1 kernel: nfe0: watchdog timeout (missed Tx interrupts) -- recovering
...

%dmesg
Copyright (c) 1992-2007 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 6.2-RELEASE #0: Fri Jan 12 23:34:43 MST 2007
    root@:/usr/obj/usr/src/sys/GENERIC
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 5600+ (2800.13-MHz K8-class CPU)
  Origin = "AuthenticAMD"  Id = 0x40f33  Stepping = 3
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x2001<SSE3,CX16>
  AMD Features=0xea500800<SYSCALL,NX,MMX+,FFXSR,RDTSCP,LM,3DNow+,3DNow>
  AMD Features2=0x1f<LAHF,CMP,<b2>,<b3>,CR8>
  Cores per package: 2
real memory = 2147287040 (2047 MB)
avail memory = 2065846272 (1970 MB)
ACPI APIC Table: <A M I OEMAPIC >
ioapic0 <Version 1.1> irqs 0-23 on motherboard
kbd1 at kbdmux0
acpi0: <A M I OEMXSDT> on motherboard
acpi0: Power Button (fixed)
acpi0: reservation of fec00000, 1000 (3) failed
acpi0: reservation of fee00000, 1000 (3) failed
Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x2008-0x200b on acpi0
cpu0: <ACPI CPU> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pci0: <memory, RAM> at device 0.0 (no driver attached)
isab0: <PCI-ISA bridge> at device 1.0 on pci0
isa0: <ISA bus> on isab0
pci0: <serial bus, SMBus> at device 1.1 (no driver attached)
ohci0: <OHCI (generic) USB controller> mem 0xfeaf7000-0xfeaf7fff irq 21 at device 2.0 on pci0
ohci0: [GIANT-LOCKED]
usb0: OHCI version 1.0, legacy support
usb0: SMM does not respond, resetting
usb0: <OHCI (generic) USB controller> on ohci0
usb0: USB revision 1.0
uhub0: nVidia OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 10 ports with 10 removable, self powered
ehci0: <EHCI (generic) USB 2.0 controller> mem 0xfeaf6c00-0xfeaf6cff irq 22 at device 2.1 on pci0
ehci0: [GIANT-LOCKED]
usb1: EHCI version 1.0
usb1: companion controller, 10 ports each: usb0
usb1: <EHCI (generic) USB 2.0 controller> on ehci0
usb1: USB revision 2.0
uhub1: nVidia EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub1: 10 ports with 10 removable, self powered
atapci0: <nVidia nForce MCP55 UDMA133 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 4.0 on pci0
ata0: <ATA channel 0> on atapci0
ata1: <ATA channel 1> on atapci0
atapci1: <nVidia nForce MCP55 SATA300 controller> port 0xd480-0xd487,0xd400-0xd403,0xd080-0xd087,0xd000-0xd003,0xcc00-0xcc0f mem 0xfeaf5000-0xfeaf5fff irq 23 at device 5.0 on pci0
ata2: <ATA channel 0> on atapci1
ata3: <ATA channel 1> on atapci1
atapci2: <nVidia nForce MCP55 SATA300 controller> port 0xc880-0xc887,0xc800-0xc803,0xc480-0xc487,0xc400-0xc403,0xc080-0xc08f mem 0xfeaf4000-0xfeaf4fff irq 20 at device 5.1 on pci0
ata4: <ATA channel 0> on atapci2
ata5: <ATA channel 1> on atapci2
atapci3: <nVidia nForce MCP55 SATA300 controller> port 0xc000-0xc007,0xbc00-0xbc03,0xb880-0xb887,0xb800-0xb803,0xb480-0xb48f mem 0xfeaf3000-0xfeaf3fff irq 21 at device 5.2 on pci0
ata6: <ATA channel 0> on atapci3
ata7: <ATA channel 1> on atapci3
pcib1: <ACPI PCI-PCI bridge> at device 6.0 on pci0
pci1: <ACPI PCI bus> on pcib1
pci1: <display, VGA> at device 10.0 (no driver attached)
nfe0: <NVIDIA nForce MCP55 Networking Adapter> port 0xb400-0xb407 mem 0xfeaf2000-0xfeaf2fff,0xfeaf6800-0xfeaf68ff,0xfeaf6400-0xfeaf640f irq 22 at device 8.0 on pci0
miibus0: <MII bus> on nfe0
ukphy0: <Generic IEEE 802.3u media interface> on miibus0
ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
nfe0: Ethernet address: 00:e0:81:75:4d:fc
nfe0: [FAST]
nfe1: <NVIDIA nForce MCP55 Networking Adapter> port 0xb080-0xb087 mem 0xfeaf1000-0xfeaf1fff,0xfeaf6000-0xfeaf60ff,0xfeaf0c00-0xfeaf0c0f irq 23 at device 9.0 on pci0
miibus1: <MII bus> on nfe1
ukphy1: <Generic IEEE 802.3u media interface> on miibus1
ukphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
nfe1: Ethernet address: 00:e0:81:75:4d:fd
nfe1: [FAST]
pcib2: <ACPI PCI-PCI bridge> at device 10.0 on pci0
pci2: <ACPI PCI bus> on pcib2
pcib3: <ACPI PCI-PCI bridge> at device 11.0 on pci0
pci3: <ACPI PCI bus> on pcib3
pcib4: <ACPI PCI-PCI bridge> at device 12.0 on pci0
pci4: <ACPI PCI bus> on pcib4
pcib5: <ACPI PCI-PCI bridge> at device 13.0 on pci0
pci5: <ACPI PCI bus> on pcib5
pcib6: <ACPI PCI-PCI bridge> at device 14.0 on pci0
pci6: <ACPI PCI bus> on pcib6
pcib7: <ACPI PCI-PCI bridge> at device 15.0 on pci0
pci7: <ACPI PCI bus> on pcib7
acpi_button0: <Power Button> on acpi0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model IntelliMouse Explorer, device ID 4
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc97ff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounter "TSC" frequency 2800129550 Hz quality 800
Timecounters tick every 1.000 msec
acd0: CDROM <CD-224E-N/1.AA> at ata0-slave UDMA33
ad4: 305245MB <Seagate ST3320620NS 3.AEG> at ata2-master SATA300
ad6: 305245MB <Seagate ST3320620NS 3.AEG> at ata3-master SATA300
ar0: 305245MB <nVidia MediaShield RAID1> status: READY
ar0: disk0 READY (master) using ad4 at ata2-master
ar0: disk1 READY (mirror) using ad6 at ata3-master