Date:      Sun, 2 Jun 2013 08:48:32 -0700
From:      Jeremy Chadwick <jdc@koitsu.org>
To:        Alban Hertroys <haramrae@gmail.com>
Cc:        Warren Block <wblock@wonkity.com>, Kimmo Paasiala <kpaasial@gmail.com>, freebsd-stable@freebsd.org
Subject:   Re: Corrupt GPT header on disk from twa array - fixable?
Message-ID:  <20130602154832.GA23072@icarus.home.lan>
In-Reply-To: <3659A498-F0EA-4AF3-80EA-40038DCA9CC7@gmail.com>
References:  <EA2DCEC2-8B07-434B-8B60-8AB15B3788F7@gmail.com> <7ABBEE71A96E411793E41BD97DA72BCE@multiplay.co.uk> <CA%2B7WWSe7O9%2Bxq3UEJ%2B%2BtM1d3tphf7pWU=n4DoQY8XZq39RRScQ@mail.gmail.com> <2943982C-719E-45D0-9B26-43B725738F83@gmail.com> <alpine.BSF.2.00.1306020834050.8625@wonkity.com> <3659A498-F0EA-4AF3-80EA-40038DCA9CC7@gmail.com>

On Sun, Jun 02, 2013 at 05:12:48PM +0200, Alban Hertroys wrote:
> 
> On Jun 2, 2013, at 16:46, Warren Block <wblock@wonkity.com> wrote:
> 
> > On Sun, 2 Jun 2013, Alban Hertroys wrote:
> > 
> >> On Jun 2, 2013, at 16:12, Kimmo Paasiala <kpaasial@gmail.com> wrote:
> >>> 
> >>> Looking at the gpart(8) output it seems that only 20GBs of the disk is
> >>> recognized by the disk driver but the GPT table still shows the full
> >>> capacity 910GB. I'd say that the GPT table is in fact correct and if
> >>> you can somehow get the disks to be recognized with full capacity they
> >>> should be usable as they are. What does dmesg(8) say about the disks?
> >> 
> >> From dmesg:
> >> 
> >> ada2 at ahcich2 bus 0 scbus2 target 0 lun 0
> >> ada2: <ST3500418AS CC34> ATA-8 SATA 2.x device
> >> usb_alloc_device: set address 2 failed (USB_ERR_IOERROR, ignored)
> >> usbd_setup_device_desc: getting device descriptor at addr 2 failed, USB_ERR_IOERROR
> >> ada2: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
> >> ada2: Command Queueing enabled
> >> ada2: 476940MB (976773168 512 byte sectors: 16H 63S/T 16383C)
> >> ada2: Previously was known as ad8
> >> ada3 at ahcich3 bus 0 scbus3 target 0 lun 0
> >> ada3: <ST3500418AS CC34> ATA-8 SATA 2.x device
> >> ada3: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
> >> ada3: Command Queueing enabled
> >> ada3: 476940MB (976773168 512 byte sectors: 16H 63S/T 16383C)
> >> ada3: Previously was known as ad10
> >> ada4 at ahcich4 bus 0 scbus4 target 0 lun 0
> >> ada4: <Hitachi HDS721010CLA332 JP4OA39C> ATA-8 SATA 2.x device
> >> usbd_req_re_enumerate: addr=2, set address failed! (USB_ERR_IOERROR, ignored)
> >> ada4: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
> >> usbd_setup_device_desc: getting device descriptor at addr 2 failed, USB_ERR_IOERROR
> >> ada4: Command Queueing enabled
> >> ada4: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
> >> ada4: Previously was known as ad12
> >> ada5 at ahcich5 bus 0 scbus5 target 0 lun 0
> >> ada5: <WDC WD1002FAEX-00Z3A0 05.01D05> ATA-8 SATA 3.x device
> >> ada5: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
> >> ada5: Command Queueing enabled
> >> ada5: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
> >> ada5: Previously was known as ad14
> >> SMP: AP CPU #1 Launched!
> >> Timecounter "TSC-low" frequency 13371081 Hz quality 800
> >> GEOM: ada2: the secondary GPT header is not in the last LBA.
> >> GEOM: ada3: the secondary GPT header is not in the last LBA.
> >> GEOM_MIRROR: Device mirror/boot launched (2/2).
> >> GEOM_MIRROR: Device mirror/swap launched (2/2).
> >> GEOM_MIRROR: Device mirror/root launched (2/2).
> >> GEOM: ada4: the secondary GPT header is not in the last LBA.
> >> GEOM: ada5: the secondary GPT header is not in the last LBA.
> > 
> > There is a lot of stuff going on there.
> > 
> > You switched from a hardware RAID card to something else in the new machine.  Maybe a different card, or maybe just the motherboard.  The old controller may have put metadata on the drives and hidden it.  On a new controller, that metadata is not hidden.  This would explain the "secondary GPT header is not in the last LBA" message.
> > 
> > If the old controller "split" the combined drives into virtual volumes, there may be another GPT somewhere in the remainder of the drive.  If you could find that, gnop(8) could be used with an offset to mount it. This could be another explanation for the GPT being "corrupt": the GPT at the start of the drive is for the first volume, the backup GPT at the end of the drive is for the second volume.
> 
> It did indeed! I just sent a message about that, as I realised that wasn't clear from my original description. I think gnop(8) is the answer to my question.
> 
> I've never worked with gnop before; is this a safe approach?:
> 
> # kldload geom_nop
> # gnop create -v -o 41943006 -S 512 ada4
> # mount /dev/ada4.nop /mnt
> 
> I get the impression that gnop might be non-destructive, but that's not entirely clear from the man page.
> 
> I tried the above on ada5 (the other half of the mirror that I applied gpart recover to earlier), but it spews:
> 
> gnop: Invalid offset for provider ada5.
> 
> What number does it expect for that offset? And what exactly is gpart show showing? I was under the assumption that both would be sectors (which judging from the numbers would be 512 bytes in size).
> 
> > Finally, GPT and gmirror are combined.  That's a problematic combination because both want metadata in the last block of the drive. The new section in the Handbook about RAID1 (gmirror) describes that in the "Metadata Issues" section:
> > http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/GEOM-mirror.html
> 
> I'm pretty sure the disks on the controller had nothing to do with gmirror ever.
> 
> Gmirror is only applied to a pair of new disks that I put in the (new) server to be able to copy my data over. I hadn't expected to be able to rely on those original disks to be readable at all without the controller, so I needed some place to store the data. I like the redundancy of a mirror, so I used gmirror for (only) the new disks.

I think you're missing what Warren is telling you, because you have
several separate problems to deal with at once.

You haven't provided any details about your gmirror setup either.  All
we know at this point:

> >> GEOM_MIRROR: Device mirror/boot launched (2/2).
> >> GEOM_MIRROR: Device mirror/swap launched (2/2).
> >> GEOM_MIRROR: Device mirror/root launched (2/2).

My gut feeling is that ada2 and ada3 make up the mirror, and that the
mirror is at the disk level (adaX, not adaXpX).  I'm basing this on
evidence presented earlier in the thread.  Without "gmirror status"
output, we have to make assumptions.

Now, what Warren is telling you: gmirror and GPT do not play well
together, because both want the last sector of the disk.  This is a
design flaw** on the part of gmirror.  If you want to use gmirror with
disks using GPT, your only options are to mirror the partitions (adaXpX)
instead of the disk (adaX), which has its own set of caveats, or to use
the MBR scheme (and if these are 4K-sector disks, or you plan on using
those, you're even more screwed).  I will not bring ZFS into this
discussion since that also opens up a can of worms -- I'm trying to stay
focused.
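
A minimal sketch of the partition-level approach, assuming the mirror
members really are ada2 and ada3 (a guess, as noted above) and that both
disks already carry identical GPT partitions.  The indices p1 to p3 and
their mapping to the names boot, swap and root are hypothetical;
substitute whatever "gpart show" reports.  The script only prints the
commands so they can be reviewed before anything is run as root:

```sh
#!/bin/sh
# Dry run: print one "gmirror label" command per partition pair.
# ASSUMPTION: ada2/ada3 are the mirror members and p1..p3 exist on both.
DISK_A=ada2
DISK_B=ada3

mirror_cmds() {
    idx=1
    for name in boot swap root; do
        # Mirror adaXpN, not adaX, so gmirror's metadata lands in the last
        # sector of the partition and leaves the backup GPT header alone.
        echo "gmirror label -v ${name} /dev/${DISK_A}p${idx} /dev/${DISK_B}p${idx}"
        idx=$((idx + 1))
    done
}

mirror_cmds
```

Each mirror then loses one sector of its partition to gmirror metadata,
while the backup GPT header stays in the last LBA of the disk.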

The errors you see on ada4 and ada5 about the backup GPT header can be
dealt with in a different manner.
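
On the gnop question quoted above: as far as I read gnop(8), the -o
offset is given in bytes, not sectors, and it has to be a multiple of the
provider's sector size.  That would explain "Invalid offset", since
41943006 is not divisible by 512.  A small sketch of the conversion,
assuming 41943006 is the start sector reported by "gpart show" and that
sectors are 512 bytes (both are assumptions taken from the quoted text):

```sh
#!/bin/sh
# Convert a start sector into the byte offset passed to "gnop create -o".
# ASSUMPTION: 41943006 is a sector number and sectors are 512 bytes.
START_SECTOR=41943006
SECTOR_SIZE=512

byte_offset() {
    # $1 = start sector, $2 = sector size in bytes
    echo $(( $1 * $2 ))
}

OFFSET=$(byte_offset "$START_SECTOR" "$SECTOR_SIZE")

# A non-zero remainder means the raw value is not sector-aligned as bytes.
echo "remainder of raw value: $((START_SECTOR % SECTOR_SIZE))"
# Print the command instead of running it, so it can be checked first.
echo "gnop create -v -o ${OFFSET} -S ${SECTOR_SIZE} ada4"
```

gnop keeps no metadata on the disk, so creating the .nop device should
not write anything by itself; mounting the result read-only is still the
cautious choice.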

But for (again, assuming) ada2 and ada3, you will see GPT "backup header
corruption" messages indefinitely because of the above flaw.
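
To make the collision concrete, here is a rough calculation using the
sector count from the ada2 dmesg line quoted above.  It assumes a
disk-level mirror with the GPT written inside the mirror device, and that
gmirror reserves exactly the last sector of the provider for its
metadata; if the mirrors are actually built on partitions, this does not
apply:

```sh
#!/bin/sh
# Sector count taken from the dmesg line for ada2.
SECTORS=976773168

last_lba() {
    # LBAs are zero-based, so the last one is count - 1.
    echo $(( $1 - 1 ))
}

# Where GEOM expects the backup GPT header when it looks at the raw disk.
DISK_LAST=$(last_lba "$SECTORS")

# ASSUMPTION: gmirror keeps its metadata in the final sector, so the
# mirror device is one sector smaller than the raw disk.
MIRROR_SECTORS=$((SECTORS - 1))
MIRROR_LAST=$(last_lba "$MIRROR_SECTORS")

echo "last LBA of raw disk:      ${DISK_LAST}"
echo "last LBA of mirror device: ${MIRROR_LAST}"
```

Under those assumptions the backup header written through the mirror sits
one sector short of the end of the raw disk, which is what the "not in
the last LBA" message complains about.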

** -- I will not get into a debate about terminology.  I am aware of the
history (which came first), and so on.  It's a flaw.  Linux md had the
same problem when GPT was introduced, and it has since been
fixed/addressed.

-- 
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Mountain View, CA, US                                            |
| Making life hard for others since 1977.             PGP 4BD6C0CB |



