Date:      Sun, 15 Nov 1998 19:13:37 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        mike@smith.net.au (Mike Smith)
Cc:        tlambert@primenet.com, mike@smith.net.au, peter.jeremy@auss2.alcatel.com.au, hackers@FreeBSD.ORG
Subject:   Re: [Vinum] Stupid benchmark: newfsstone
Message-ID:  <199811151913.MAA27668@usr07.primenet.com>
In-Reply-To: <199811150651.WAA10105@dingo.cdrom.com> from "Mike Smith" at Nov 14, 98 10:51:17 pm

> > Actually, I could see it being *very* useful for copying data between
> > plexes on different volumes for the purpose of replication.
> 
> How?  Please note the quantum effects I've mentioned previously, and 
> note that the only useful way to overcome them is to enable buffering 
> at both ends of the queue.  Please also study the internal architecture 
> of most SCSI disks and note the almost total decoupling between the 
> head/spindle location and the availability of a request for 
> presentation to the bus.
> 
> Spindle sync was useful back when data off the drive was basically the 
> output of the head preamp or the data separator.  It's not noticeably 
> useful as soon as you allow the drive to do things for its own reasons.

It's not clear to me that spindle sync, with various drive
performance features enabled, actually refers to anything more
than cylinder selection; i.e., that you couldn't argue that
drive-level optimizations, e.g., write caching, are lower level
than spindle sync.  In other words, at what level do you actually
engage in synchronization?


> > Bringing a new disk online in a RAID 5 array could also use this
> > to advantage.
> 
> What you want is the SCSI COPY command.

Yes.  And if the second drive never had to seek except when the
first was also seeking, and seeking took the same time because
the rotation rates were the same and the seek was between the
same cylinders...


> > Finally, if a disk could be made to talk to itself (unlikely, but
> > some vendor probably neglected to prevent it by checking for the
> > same source/target ID, I hope 8-)), it would be useful for various
> > intra-disk transfers of data, as well.
> 
> This is beyond far-fetched.

Well, hence the "I hope" and the smiley.  It'd be nice to be able
to abuse SCSI COPY for this.


> > Compared to the normal recalculation time on a new hot-swap on a
> > large array, I would think that even though it's expensive, it
> > would be less expensive than the alternative of having to do the
> > same thing anyway, *and* transfer the SCSI data in and out of host
> > memory over, best case, a PCI bus.
> 
> This has nothing to do with array reconstruction.  Having just watched 
> Event Horizon, I can imagine how your brain got here, but you're 
> completely crossed up.  Go back.

OK, say that I was reconstructing, not from parity, but from copies
of plexes on other disks, and I guaranteed that any given plex would
exist on two disks in the array; the one that failed, and the one
that didn't.  Say further that I calculated my parity across plexes,
not across units, so the parity was across a virtual unit instead of
a real unit, such that I didn't have to recalculate parity in the
default case of a single unit failure.
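
To make that concrete, here is a rough sketch of the idea, not
anything Vinum actually does.  The layout, the disk names and the
function names are all made up for illustration; the only things
taken from the text above are that every plex exists on two disks
and that parity is computed across plexes rather than across units.

```python
from functools import reduce

# Hypothetical layout: every plex lives on exactly two disks.
PLACEMENT = {"p0": ("da0", "da1"), "p1": ("da1", "da2"), "p2": ("da2", "da0")}
# Hypothetical plex contents, one tiny block per plex.
DATA = {"p0": b"\x01\x02", "p1": b"\x10\x20", "p2": b"\xff\x00"}


def xor_blocks(a, b):
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def plex_parity(data):
    """Parity across plexes (the virtual unit), not across disks."""
    return reduce(xor_blocks, (data[p] for p in sorted(data)))


def rebuild_plan(placement, failed):
    """List (plex, source_disk) copies needed after one disk fails."""
    plan = []
    for plex in sorted(placement):
        disks = placement[plex]
        if failed in disks:
            survivors = [d for d in disks if d != failed]
            if not survivors:
                raise ValueError("no surviving copy of " + plex)
            plan.append((plex, survivors[0]))
    return plan
```

With this layout, rebuild_plan(PLACEMENT, "da1") is a list of plain
copies from the surviving disks, and plex_parity(DATA) is the same
before and after the failure, because every plex is still readable
from its second copy.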


> > > That's something that the user should take care of.  Any power-of-2 
> > > stripe size is likely to work out OK, as CG's are currently 32MB.
> > 
> > Hm.  Maybe the tools should limit this, or at least "bitch", in the
> > same sense that disklabel puts up little asterisks...
> 
> Disklabel should stop putting them up.  They're not meaningful anymore. 
> It's arguable as to whether it actually matters whether CG's fragment 
> across stripe boundaries either.

Well, that's rather irrelevant.  I meant that the stripe size setting
tool should put up asterisks on non-powers of two, if you didn't want
it to outright disallow them.
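
Something along these lines is what I have in mind.  This is only a
sketch: the function names are invented, no existing tool is being
quoted, and the alignment check assumes that stripes and CGs are
both laid out from offset 0, which a real filesystem need not do.
The 32MB figure is the CG size mentioned above.

```python
CG_SIZE = 32 * 1024 * 1024  # 32MB cylinder groups, as quoted above


def is_power_of_two(n):
    """True for 1, 2, 4, 8, ...; False for zero and negatives."""
    return n > 0 and (n & (n - 1)) == 0


def stripe_marker(stripe_bytes):
    """Return '*' for a stripe size the tool should flag, else ''."""
    return "" if is_power_of_two(stripe_bytes) else "*"


def boundaries_align(stripe_bytes, cg_bytes=CG_SIZE):
    """True if the smaller of the two sizes evenly divides the larger.

    Assumption: both are laid out from offset 0.
    """
    small, large = sorted((stripe_bytes, cg_bytes))
    return small > 0 and large % small == 0


def format_stripe(stripe_bytes):
    """One listing line: the size, plus an asterisk if it is flagged."""
    return f"{stripe_bytes:>10}{stripe_marker(stripe_bytes)}"
```

A 64K stripe gets no marker and divides a 32MB CG evenly; a 96K
stripe gets the asterisk and does not.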


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message


