Date:      Fri, 13 Dec 2019 18:26:08 -0500
From:      Matthew Pounsett <matt@conundrum.com>
To:        Morgan Wesström <freebsd-database@pp.dyndns.biz>
Cc:        freebsd-questions@freebsd.org
Subject:   Re: Root volume renumbered unexpectedly, no longer boots
Message-ID:  <CAAiTEH8rmZYi7pSqWHB_qd5kUGTKXuTY6p0B=_2jn8qZYm9frA@mail.gmail.com>
In-Reply-To: <ae530ed2-2590-ef57-2c3b-6ad0654cba55@pp.dyndns.biz>
References:  <CAAiTEH94JZFf6XpmXbAUFrWbjA8CXF-EpH231huzmxX%2BcjkvVQ@mail.gmail.com> <28a92269-832b-61d0-3d25-68be2439dd9c@pp.dyndns.biz> <CAAiTEH9iZqTDxyib3fnh4Cjxavg=-cxMtE6Gs2n8nfJaBRZWxg@mail.gmail.com> <ae530ed2-2590-ef57-2c3b-6ad0654cba55@pp.dyndns.biz>

On Fri, 13 Dec 2019 at 17:31, Morgan Wesström
<freebsd-database@pp.dyndns.biz> wrote:

> > The SSD is on the mainboard controller.  I have no idea what the SATA
> > controller's original mode was, but just now it was set to IDE.  I tried
> > switching it to AHCI, but that didn't improve anything, and generated a
> > new "AHCI BIOS not installed" error during boot, so I switched it back to
> > IDE.  Either of the RAID settings seem like bad choices, since I want
> > direct access to the physical drives for ZFS.
>
> I've seen systems that emulate 2-port master/slave controllers in
> IDE/Legacy mode and renumber drives differently in that mode than in
> AHCI mode.
>

Yeah.. I didn't see any difference moving back and forth though.


>
> That BIOS setting only affects the mainboard controller and with a 24
> disk ZFS pool I assume those disks are connected to some other
> controller. Regardless, I'd stay away from the RAID mode too though
> since there's a risk of overwriting some sectors of your system disk. It
> would've been interesting to know if that setting has changed though...
>

48 disks, actually. :)   Half of those are on an external JBOD connected
via an LSI FC controller.  This server is significantly older than my
association with it, so I'm uncertain about how the internal 24 drives are
connected.  If it helps, dmesg only reports ses0 and ses1 drivers.  The
boot disk and the first 24 ZFS drives are all on ses0.  I think that
implies only two controllers in use, not three.

> Is there any indication that the BIOS settings were reset during the
> disk swap? (Date/time being way off for example).


None that I can see.  And, da4 vanished from the BIOS list of bootable
devices several reboots after powering the machine back on, so I don't have
any reason to think that was directly related to powering the machine off,
or to the drive reordering.  Its disappearance was coincident with booting
the system off a USB installer, but timing is the only connection I can
think of between those two things.  Before anyone goes there... I did not
start the installer, so I have no reason to think it has modified da4's
MBR.  I'm only booting off that stick to get to a usable shell.

The only symptom of a problem that resulted during the power-off/drive swap
was that da0 moved to da4 in the first place, and that trying to boot
ufs:/dev/da4p2 from the boot prompt seems to fail.


> Did you verify that
> the SSD is still connected to the lowest numbered port? ATA_STATIC_ID
> was removed back in 2015 if I remember correctly so unless you run an
> ancient FreeBSD version I wouldn't rely on that mechanism.
>

Was it?  Okay.. I checked the options file in the 11.2 source code earlier
today and it still appears in that file, so I thought it was still in use.
I haven't had cause to manually compile a kernel since about 2009, so I'm a
bit out of date on what the options are.

The boot drive wasn't moved... it's mounted inside the chassis, and we were
swapping drives in the front-facing array of the JBOD (not even in the main
chassis).  So, at least physically, it's still connected to the same port.
Looks like I was wrong about it being an SSD, but I think that's beside
the point.

dmesg currently reports the following things about da4 (hand-retyped, not
cut and pasted, because I'm on a Java console):
ses0: da4,pass4: Element descriptor: 'Slot 01'
ses0: da4,pass4: SAS Device Slot Element: 1 Phys at Slot 0

>
> Perhaps you can tell us what kind of system this is in case someone on
> the list has a similar system and knows its quirks. Also, if you're
> running a GENERIC kernel or not.
>

Ah, sorry.  I thought I mentioned that this is running 11.2-p7 GENERIC.  I
apparently forgot to mention the generic part.  It has been progressively
upgraded over the years.  My guess is that the last time it saw a
completely fresh install was sometime around the 9.x era, but that's just a
guess.

The system mainboard is a Supermicro X8DT3.  The BIOS is significantly out
of date (1.1 dated 2010.. current is 2.2) but we haven't had any BIOS
related issues to cause us to want to update.

More (possibly relevant) hand transcription from dmesg:
mps0: <Avago Technologies (LSI) SAS2004> port 0xd000-0xd0ff mem
0xfae3c000-0xfa3ffff,0xfae40000-0xfae7ffff irq 16 at device 0.0 on pci2
mpt0: <LSILogic SAS/SATA Adapter> port 0xfc000-0xc0ff mem
0xfabec000-0xfabeffff,0xfabf0000-0xfabfffff irq 16 at device 0.0 on pci5
ses0: <LSI SAS2X36 0e12> Fixed Enclosure Services SPC-3 SCSI device
ses1: <LSI SAS2X36 0e12> Fixed Enclosure Services SPC-3 SCSI device


