Date:      Wed, 6 Jun 2007 08:58:24 -0600
From:      "Cory Marsh" <cory@clearwateranalytics.com>
To:        <freebsd-geom@freebsd.org>
Subject:   Gmirror, broken ggate locks system
Message-ID:  <FFC5D7C34B550A4DB09CAE727D8D7EE3014EB5@maildb1.arbfund.com>

I am experiencing a gmirror issue on my mirrored partitions.  These
partitions work great replicating data through gmirror to another
machine over a ggate device.  Everything goes fine until the ggate
component of the gmirror goes down (backup machine reboot, network
problem, etc.).  At that point, the machine running gmirror locks up.
Any process that is already running continues to run as long as it does
not access the disk in any way.  As soon as a process issues a disk
request, it locks hard.  This forces me to shut the machine down
ungracefully.

Is this the expected behavior?  Shouldn't gmirror detect the stale
(unresponsive) component and deactivate it?  Is it a problem because my
primary consumer is the ggate device?  Is there a better configuration
to achieve the same result?

Any ideas/suggestions would be appreciated.  Thanks!

-Cory

%uname -a
FreeBSD cwanfs1.arbfund.com 6.2-RELEASE FreeBSD 6.2-RELEASE #0: Fri Jan 12 23:34:43 MST 2007     root@:/usr/obj/usr/src/sys/GENERIC  amd64

%gmirror list data
Geom name: data
State: COMPLETE
Components: 2
Balance: prefer
Slice: 4096
Flags: NOAUTOSYNC
GenID: 2
SyncID: 21
ID: 1381569007
Providers:
1. Name: mirror/data
   Mediasize: 10737417728 (10G)
   Sectorsize: 512
   Mode: r1w1e1
Consumers:
1. Name: ggate0
   Mediasize: 10737418240 (10G)
   Sectorsize: 512
   Mode: r1w1e1
   State: ACTIVE
   Priority: 1
   Flags: NONE
   GenID: 2
   SyncID: 21
   ID: 1578386556
2. Name: ar0s1g
   Mediasize: 10737418240 (10G)
   Sectorsize: 512
   Mode: r1w1e1
   State: ACTIVE
   Priority: 100
   Flags: NONE
   GenID: 2
   SyncID: 21
   ID: 1982490913

Information about the problem that locked the machine follows.  I got
these messages for 20 minutes (about 100 of them) before I noticed the
machine was down; it could have been locked since the first message.  It
looks like a network card issue disconnected the ggate devices and then
the machine locked.

/var/log/messages:
...
Jun  5 17:10:15 cwanfs1 kernel: nfe0: watchdog timeout (missed Tx interrupts) -- recovering
Jun  5 17:10:28 cwanfs1 ggatec: Lost connection 1.
Jun  5 17:10:28 cwanfs1 ggatec: Disconnected [10.10.10.2 /dev/ar0s1g]. Connecting...
Jun  5 17:10:59 cwanfs1 kernel: nfe0: watchdog timeout (missed Tx interrupts) -- recovering
...
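
To make the ordering of events in an excerpt like this easier to read, here is a minimal Python sketch that parses the four lines shown and reports how long after the first nfe0 watchdog timeout ggatec lost its connection.  The year (2007, taken from the message date, since syslog lines do not carry one), the regular expression, and the helper names `parse` and `first` are assumptions made for illustration; they are not part of any FreeBSD tool.

```python
import re
from datetime import datetime

# The four log lines from the excerpt, with wrapped lines rejoined.
LOG = """\
Jun  5 17:10:15 cwanfs1 kernel: nfe0: watchdog timeout (missed Tx interrupts) -- recovering
Jun  5 17:10:28 cwanfs1 ggatec: Lost connection 1.
Jun  5 17:10:28 cwanfs1 ggatec: Disconnected [10.10.10.2 /dev/ar0s1g]. Connecting...
Jun  5 17:10:59 cwanfs1 kernel: nfe0: watchdog timeout (missed Tx interrupts) -- recovering
"""

# Assumed layout: "<Mon> <day> <HH:MM:SS> <host> <source>: <message>"
LINE_RE = re.compile(r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) (\S+) ([^:]+): (.*)$")


def parse(text, year=2007):
    """Return (timestamp, source, message) tuples for lines that match."""
    events = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # skip "..." and anything else that is not a log line
        stamp, _host, source, message = m.groups()
        stamp = " ".join(stamp.split())  # collapse the padded day field
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        events.append((when, source, message))
    return events


def first(events, source, needle):
    """Timestamp of the first event from `source` whose message contains `needle`."""
    for when, src, msg in events:
        if src == source and needle in msg:
            return when
    return None


events = parse(LOG)
watchdog = first(events, "kernel", "watchdog timeout")
lost = first(events, "ggatec", "Lost connection")
gap = (lost - watchdog).total_seconds()
print(f"ggatec lost its connection {gap:.0f} s after the first nfe0 watchdog timeout")
```

The same functions could be pointed at the full /var/log/messages text to count how many watchdog timeouts were logged during the 20 minutes described.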

%dmesg
Copyright (c) 1992-2007 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 6.2-RELEASE #0: Fri Jan 12 23:34:43 MST 2007
    root@:/usr/obj/usr/src/sys/GENERIC
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 5600+ (2800.13-MHz K8-class CPU)
  Origin = "AuthenticAMD"  Id = 0x40f33  Stepping = 3
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x2001<SSE3,CX16>
  AMD Features=0xea500800<SYSCALL,NX,MMX+,FFXSR,RDTSCP,LM,3DNow+,3DNow>
  AMD Features2=0x1f<LAHF,CMP,<b2>,<b3>,CR8>
  Cores per package: 2
real memory  = 2147287040 (2047 MB)
avail memory = 2065846272 (1970 MB)
ACPI APIC Table: <A M I  OEMAPIC >
ioapic0 <Version 1.1> irqs 0-23 on motherboard
kbd1 at kbdmux0
acpi0: <A M I OEMXSDT> on motherboard
acpi0: Power Button (fixed)
acpi0: reservation of fec00000, 1000 (3) failed
acpi0: reservation of fee00000, 1000 (3) failed
Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x2008-0x200b on acpi0
cpu0: <ACPI CPU> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pci0: <memory, RAM> at device 0.0 (no driver attached)
isab0: <PCI-ISA bridge> at device 1.0 on pci0
isa0: <ISA bus> on isab0
pci0: <serial bus, SMBus> at device 1.1 (no driver attached)
ohci0: <OHCI (generic) USB controller> mem 0xfeaf7000-0xfeaf7fff irq 21 at device 2.0 on pci0
ohci0: [GIANT-LOCKED]
usb0: OHCI version 1.0, legacy support
usb0: SMM does not respond, resetting
usb0: <OHCI (generic) USB controller> on ohci0
usb0: USB revision 1.0
uhub0: nVidia OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 10 ports with 10 removable, self powered
ehci0: <EHCI (generic) USB 2.0 controller> mem 0xfeaf6c00-0xfeaf6cff irq 22 at device 2.1 on pci0
ehci0: [GIANT-LOCKED]
usb1: EHCI version 1.0
usb1: companion controller, 10 ports each: usb0
usb1: <EHCI (generic) USB 2.0 controller> on ehci0
usb1: USB revision 2.0
uhub1: nVidia EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub1: 10 ports with 10 removable, self powered
atapci0: <nVidia nForce MCP55 UDMA133 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 4.0 on pci0
ata0: <ATA channel 0> on atapci0
ata1: <ATA channel 1> on atapci0
atapci1: <nVidia nForce MCP55 SATA300 controller> port 0xd480-0xd487,0xd400-0xd403,0xd080-0xd087,0xd000-0xd003,0xcc00-0xcc0f mem 0xfeaf5000-0xfeaf5fff irq 23 at device 5.0 on pci0
ata2: <ATA channel 0> on atapci1
ata3: <ATA channel 1> on atapci1
atapci2: <nVidia nForce MCP55 SATA300 controller> port 0xc880-0xc887,0xc800-0xc803,0xc480-0xc487,0xc400-0xc403,0xc080-0xc08f mem 0xfeaf4000-0xfeaf4fff irq 20 at device 5.1 on pci0
ata4: <ATA channel 0> on atapci2
ata5: <ATA channel 1> on atapci2
atapci3: <nVidia nForce MCP55 SATA300 controller> port 0xc000-0xc007,0xbc00-0xbc03,0xb880-0xb887,0xb800-0xb803,0xb480-0xb48f mem 0xfeaf3000-0xfeaf3fff irq 21 at device 5.2 on pci0
ata6: <ATA channel 0> on atapci3
ata7: <ATA channel 1> on atapci3
pcib1: <ACPI PCI-PCI bridge> at device 6.0 on pci0
pci1: <ACPI PCI bus> on pcib1
pci1: <display, VGA> at device 10.0 (no driver attached)
nfe0: <NVIDIA nForce MCP55 Networking Adapter> port 0xb400-0xb407 mem 0xfeaf2000-0xfeaf2fff,0xfeaf6800-0xfeaf68ff,0xfeaf6400-0xfeaf640f irq 22 at device 8.0 on pci0
miibus0: <MII bus> on nfe0
ukphy0: <Generic IEEE 802.3u media interface> on miibus0
ukphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
nfe0: Ethernet address: 00:e0:81:75:4d:fc
nfe0: [FAST]
nfe1: <NVIDIA nForce MCP55 Networking Adapter> port 0xb080-0xb087 mem 0xfeaf1000-0xfeaf1fff,0xfeaf6000-0xfeaf60ff,0xfeaf0c00-0xfeaf0c0f irq 23 at device 9.0 on pci0
miibus1: <MII bus> on nfe1
ukphy1: <Generic IEEE 802.3u media interface> on miibus1
ukphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
nfe1: Ethernet address: 00:e0:81:75:4d:fd
nfe1: [FAST]
pcib2: <ACPI PCI-PCI bridge> at device 10.0 on pci0
pci2: <ACPI PCI bus> on pcib2
pcib3: <ACPI PCI-PCI bridge> at device 11.0 on pci0
pci3: <ACPI PCI bus> on pcib3
pcib4: <ACPI PCI-PCI bridge> at device 12.0 on pci0
pci4: <ACPI PCI bus> on pcib4
pcib5: <ACPI PCI-PCI bridge> at device 13.0 on pci0
pci5: <ACPI PCI bus> on pcib5
pcib6: <ACPI PCI-PCI bridge> at device 14.0 on pci0
pci6: <ACPI PCI bus> on pcib6
pcib7: <ACPI PCI-PCI bridge> at device 15.0 on pci0
pci7: <ACPI PCI bus> on pcib7
acpi_button0: <Power Button> on acpi0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model IntelliMouse Explorer, device ID 4
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc97ff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounter "TSC" frequency 2800129550 Hz quality 800
Timecounters tick every 1.000 msec
acd0: CDROM <CD-224E-N/1.AA> at ata0-slave UDMA33
ad4: 305245MB <Seagate ST3320620NS 3.AEG> at ata2-master SATA300
ad6: 305245MB <Seagate ST3320620NS 3.AEG> at ata3-master SATA300
ar0: 305245MB <nVidia MediaShield RAID1> status: READY
ar0: disk0 READY (master) using ad4 at ata2-master
ar0: disk1 READY (mirror) using ad6 at ata3-master



