Date:      Tue, 27 Jun 2006 06:34:50 -0500 (CDT)
From:      Sergey Babkin <babkin@verizon.net>
To:        Peter Jeremy <peterjeremy@optushome.com.au>, Pawel Jakub Dawidek <pjd@freebsd.org>
Cc:        Poul-Henning Kamp <phk@phk.freebsd.dk>, freebsd-arch@freebsd.org
Subject:   Re: Re: Accessing disks via their serial numbers.
Message-ID:  <9415836.2290041151408090117.JavaMail.root@vms073.mailsrvcs.net>

>From: Peter Jeremy <peterjeremy@optushome.com.au>

>On Mon, 2006-Jun-26 18:46:19 +0000, Poul-Henning Kamp wrote:
>>>> 4.	It prevents cold-state swapout of disk drives.
>>>
>>>Why?
>>
>>Because /etc/fstab contains the serial number of the disk you just
>>junked and the new one has a different serial number.
>
>I've used a couple of OSs that derived their logical disk name (ie
>/dev/disk/dsk5) from the WWID by keeping a magic database to map
>the WWID to the name.  None of them have good solutions to telling
>the OS that WWID-x has died and I want WWID-y to now map to the same
>logical device as WWID-x used to.

I can describe my experience with this kind of thing in
UnixWare. AFAIK the primary motivation for tracking the
disks by serial numbers was Multi-Path I/O: having the
same disk accessible through two controllers, so that if
one access path fails, the OS can switch transparently
to the other path. (Well, in reality such a disk would
probably be not just a disk but a logical volume in a
RAID box with 2 redundant controllers in it, each connected
to the host through a separate SCSI or FibreChannel bus.)

To do this you need ways to discover that both paths
lead to the same disk. UnixWare does this by writing
a randomly generated unique ID into the UnixWare partition.
When the disks are enumerated, the partition (VTOC)
code reads this ID and checks it against the database
of known IDs in the resource manager. When a match
is found, the disk gets connected to an existing 
logical name. As the resource manager database gets
saved between reboots, this also gives you the persistence
of disk names between reboots.
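
The lookup described above can be sketched roughly as follows.
This is only an illustration in Python, not UnixWare code: the
DiskRegistry class, the JSON file standing in for the resource
manager database, and the "diskN" names are all made up for the
example.

```python
import json
import uuid
from pathlib import Path


class DiskRegistry:
    """Hypothetical stand-in for the resource manager's ID database."""

    def __init__(self, db_path):
        self.db_path = Path(db_path)
        # Load the mapping saved before the last "reboot", if any.
        if self.db_path.exists():
            self.names = json.loads(self.db_path.read_text())
        else:
            self.names = {}

    def attach(self, disk_id):
        """Return the logical name for disk_id, allocating one if unknown."""
        if disk_id not in self.names:
            # Entries are never deleted here, so the counter stays unique.
            self.names[disk_id] = f"disk{len(self.names)}"
            self.db_path.write_text(json.dumps(self.names))
        return self.names[disk_id]


def new_disk_id():
    """Random unique ID, of the kind written into the partition once."""
    return uuid.uuid4().hex
```

Two access paths that read the same ID from the partition both
call attach() with it and get the same logical name back, which
is the property Multi-Path I/O needs. A registry reopened on the
same file returns the names it handed out before.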

As a good side effect, it let you move disks around
between the slots and connections in any way you like
while keeping the same logical name.

As a bad side effect, you would never know what some
logical name refers to until you check the mapping table.
Worse yet, since the logical names have been kept in
the same format as the physical names (cNbNtNdN), if you
happen to have a disk in c1b1t1d1 and then move it to
another slot, it keeps the name, and when you put another
disk into the original slot, you don't see it at all
(I think, unless I forget something) until you reset the
mappings or unplug the original disk. Then the name
magically transfers to the new disk. And since the
mappings persist, they tend to accumulate if you swap
the disks often. It's not really an issue with the
concept itself, it's just that the logical names should
have a different format than the physical ones to
avoid confusion.
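
A toy model of the name clash, to make the point concrete. It
follows my description above, not the actual UnixWare logic;
assign_names() and the "ldN" logical format are hypothetical.

```python
def assign_names(bindings, slots, distinct_format):
    """Return {logical_name: disk_id} for the disks the admin can see.

    bindings: persistent {disk_id: logical_name}, updated in place
    slots:    {physical_name: disk_id} for disks present right now
    """
    visible = {}
    # Disks with a remembered logical name claim that name first.
    for slot, disk_id in slots.items():
        if disk_id in bindings:
            visible[bindings[disk_id]] = disk_id
    # Previously unseen disks get a new name.
    for slot, disk_id in slots.items():
        if disk_id in bindings:
            continue
        # Same-format scheme: the logical name is the slot's physical name.
        name = f"ld{len(bindings)}" if distinct_format else slot
        if name in visible:
            continue  # name held by a disk that moved: new disk is hidden
        bindings[disk_id] = name
        visible[name] = disk_id
    return visible
```

With distinct_format=False, moving disk A out of c1b1t1d1 and
putting disk B there leaves only A visible, still named c1b1t1d1,
and B takes over the name once A is unplugged. Stale entries also
stay in bindings, which is the accumulation mentioned above. With
distinct_format=True both disks show up under separate names.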

The mapping-controlling tools are kind of poor too, but
again this is solely an issue with the tools, not with
the concept. If you are creative with direct manipulation
of the resource manager entries, you can do many
things that the tools won't allow. The resource
manager manipulation tools are kind of poor in UnixWare
too, but again it's not an issue with the concept itself.

>Actually having the WWID (or similar) as the logical name would make
>handling a disk swap really nasty.

That's not a problem in UnixWare. If the original disk
has gone away completely, the new disk in the same SCSI slot
would just get the same logical name.

Things get more interesting for USB: to do the Right Thing,
sometimes it would be necessary to attach the same
name to any device (of the same type) in the same slot,
and sometimes to the same device in any slot.
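
The two USB cases differ only in what the name is keyed on. A
minimal sketch, with BindBy, binding_key() and logical_name()
invented for the illustration:

```python
from enum import Enum


class BindBy(Enum):
    SLOT = "slot"      # any device of this type in this slot
    DEVICE = "device"  # this particular device in any slot


def binding_key(policy, slot, device_type, device_id):
    """Build the lookup key that decides which name a device gets."""
    if policy is BindBy.SLOT:
        return ("slot", slot, device_type)
    return ("device", device_id)


def logical_name(names, policy, slot, device_type, device_id):
    """Look up the name for the key, allocating a new one if needed."""
    key = binding_key(policy, slot, device_type, device_id)
    return names.setdefault(key, f"{device_type}{len(names)}")
```

The policy would have to be chosen per device or per port by the
admin, since the system cannot guess which of the two is wanted.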

>Stating that the sysadmin knows about the change doesn't address the
>issue:  The sysadmin changed the device because the old one failed.
>There may or may not have been advance notice of the replacement.

There should just be a way for the sysadmin to communicate
this knowledge to the system. For a UnixWare example again,
it has a command saying "I'm about to hot swap this disk,
do whatever is necessary to stop the old disk, and then
start and recognize the new one". Of course, in any
serious datacenter the swap would happen inside a RAID
box and the OS won't see it at all.
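
On the mapping side, such a command boils down to moving the
logical name from the old ID to the new one. A sketch only:
hot_swap() is a made-up helper, not the UnixWare command, and it
leaves out stopping and starting the disks.

```python
def hot_swap(bindings, old_id, new_id):
    """Move old_id's logical name to new_id and forget old_id.

    bindings: persistent {disk_id: logical_name}, updated in place
    """
    if new_id in bindings:
        raise ValueError(f"{new_id} already has a logical name")
    name = bindings.pop(old_id)  # KeyError if old_id was never known
    bindings[new_id] = name
    return name
```

Dropping the old entry at the same time also keeps the mapping
table from accumulating dead IDs.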

-SB


