Date:      Fri, 10 May 1996 02:54:32 -0700 (PDT)
From:      asami@cs.berkeley.edu (Satoshi Asami)
To:        jgreco@brasil.moneng.mei.com
Cc:        bde@zeta.org.au, current@freebsd.org, nisha@cs.berkeley.edu
Subject:   Re: more than 32 scsi disks on a single machine ?
Message-ID:  <199605100954.CAA00908@silvia.HIP.Berkeley.EDU>
In-Reply-To: <199605091422.JAA00522@brasil.moneng.mei.com> (message from Joe Greco on Thu, 9 May 1996 09:22:11 -0500 (CDT))

Bruce Evans:

 * > This limits us to 16777216 disks, or only 8388608 disks if we avoid using
 * > the high bit to avoid sign extension bugs.  The limit is in dkunit() in
 * > <sys/disklabel.h>

 * > There are lots of things to change.  The encoding would have to be really
 * > ugly to preserve compatibility with existing device nodes.

Joe Greco:

 * Hmm.
 * 
 * What if we don't give a damn about existing device nodes?  :-)  "Take the
 * plunge".

Well, we may try that too I guess, but I didn't want to spend hours
trying to figure out the right mix in the /dev directory (oh devfs,
weren't you supposed to get rid of the major/minor numbers?), so I went
in with the quick and dirty hack.

===
Index: sys/sys/disklabel.h
===================================================================
RCS file: /usr/cvs/src/sys/sys/disklabel.h,v
retrieving revision 1.21
diff -u -r1.21 disklabel.h
--- 1.21	1996/05/03 05:38:34
+++ disklabel.h	1996/05/10 00:12:30
@@ -374,17 +374,19 @@
     -----------------------------------------------------------------
     |      TYPE     |PART2| SLICE   |  MAJOR?       |  UNIT   |PART | <-soon
     -----------------------------------------------------------------
+    |  TYPE | UNIT2 |PART2| SLICE   |  MAJOR?       |  UNIT   |PART | <-new
+    -----------------------------------------------------------------
 
 	I want 3 more part bits (taken from 'TYPE' (useless as it is) (JRE)
 */
 #define	dkmakeminor(unit, slice, part) \
-				(((slice) << 16) | ((unit) << 3) | (part))
+				(((slice) << 16) | (((unit) & 0x1f) << 3) | (((unit) & 0x1e0) << 19) | (part))
 #define	dkmodpart(dev, part)	(((dev) & ~(dev_t)7) | (part))
 #define	dkmodslice(dev, slice)	(((dev) & ~(dev_t)0x1f0000) | ((slice) << 16))
 #define	dkpart(dev)		(minor(dev) & 7)
 #define	dkslice(dev)		((minor(dev) >> 16) & 0x1f)
 #define	dktype(dev)       	((minor(dev) >> 21) & 0x7ff)
-#define	dkunit(dev)		((minor(dev) >> 3) & 0x1f)
+#define	dkunit(dev)		(((minor(dev) >> 3) & 0x1f) | ((minor(dev) >> 19) & 0x1e0))
 
 #ifdef KERNEL
 /*
Index: sys/etc/etc.i386/MAKEDEV
===================================================================
RCS file: /usr/cvs/src/etc/etc.i386/MAKEDEV,v
retrieving revision 1.118
diff -u -r1.118 MAKEDEV
--- 1.118	1996/05/03 05:37:34
+++ MAKEDEV	1996/05/10 04:47:22
@@ -131,7 +131,7 @@
 # Convert disk (type, unit, slice, partition) to minor number
 dkminor()
 {
-	echo $((32 * 65536 * $1 + 8 * $2 + 65536 * $3 + $4))
+	echo $((32 * 65536 * $1 + 8 * ($2 % 32) + 256 * 65536 * ($2 / 32) + 65536 * $3 + $4))
 }
 
 # Convert the last character of a tty name to a minor number.
@@ -254,7 +254,7 @@
 	slice=`expr $i : '..[0-9]*s\([0-9]*\)'`
 	part=`expr $i : '..[0-9]*s[0-9]*\(.*\)'`
 	case $unit in
-	[0-9]|[1-2][0-9]|30|31)
+	[0-9]|[1-9][0-9]|[1-4][0-9][0-9]|50[0-9]|510|511)
 		case $slice in
 		[0-9]|[1-2][0-9]|30)
 			oldslice=$slice
@@ -414,15 +414,16 @@
 	esac
 	unit=`expr $i : '..\(.*\)'`
 	case $unit in
-	[0-9]|[1-2][0-9]|30|31)
+	[0-9]|[1-9][0-9]|[1-4][0-9][0-9]|50[0-9]|510|511)
 		for slicepartname in s0h s1 s2 s3 s4
 		do
 			sh MAKEDEV $name$unit$slicepartname
 		done
+		nunit=$((($unit / 32) * 32 * 65536 + ($unit % 32)))
 		case $name in
 		od|sd)
 			rm -f r${name}${unit}.ctl
-			mknod r${name}${unit}.ctl c $chr `expr $unit '*' 8 + $scsictl `
+			mknod r${name}${unit}.ctl c $chr `expr $nunit '*' 8 + $scsictl `
 			chmod 600 r${name}${unit}.ctl
 			;;
 		esac
===

I grabbed 4 bits from the ever-shrinking "type" field (did Julian
actually grab the "part2"?  the commit for "reservation" was on 04/95)
and changed the macros to reflect that.  Then I compiled the kernel,
rebooted the machine, modified MAKEDEV to handle the >32 disk case (up
to 512 now, since we have 9 bits) and poof!  It worked!  I now have a
160GB (40 x 4GB) filesystem! :)
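
For anyone who wants to check the arithmetic, here is a minimal C sketch
of the new encoding.  It restates the patched dkmakeminor() and dkunit()
macros as plain functions and compares them with the formula from the
patched dkminor() in MAKEDEV.  The function names (make_minor, unit_of,
makedev_minor, check_all_units) are made up for illustration and are not
in disklabel.h, and the sketch works on bare minor numbers rather than a
real dev_t.

```c
#include <assert.h>
#include <stdint.h>

/* Layout assumed from the patched macros: PART in bits 0-2, low 5 unit
 * bits in bits 3-7, SLICE in bits 16-20, high 4 unit bits in bits 24-27. */

/* Function form of the patched dkmakeminor() (illustrative name). */
static uint32_t make_minor(uint32_t unit, uint32_t slice, uint32_t part)
{
    assert(unit < 512 && slice < 32 && part < 8);
    return (slice << 16) | ((unit & 0x1f) << 3) | ((unit & 0x1e0) << 19) | part;
}

/* Function form of the patched dkunit(), applied to a bare minor number. */
static uint32_t unit_of(uint32_t minor_no)
{
    return ((minor_no >> 3) & 0x1f) | ((minor_no >> 19) & 0x1e0);
}

/* Same arithmetic as the patched dkminor() shell function in MAKEDEV. */
static uint32_t makedev_minor(uint32_t type, uint32_t unit,
                              uint32_t slice, uint32_t part)
{
    return 32u * 65536u * type + 8u * (unit % 32u)
         + 256u * 65536u * (unit / 32u) + 65536u * slice + part;
}

/* Returns 1 if every (unit, slice, part) combination round-trips and the
 * macro and MAKEDEV encodings agree (with type 0), otherwise 0. */
static int check_all_units(void)
{
    for (uint32_t unit = 0; unit < 512; unit++) {
        for (uint32_t slice = 0; slice < 32; slice++) {
            for (uint32_t part = 0; part < 8; part++) {
                uint32_t m = make_minor(unit, slice, part);
                if (unit_of(m) != unit)
                    return 0;
                if (((m >> 16) & 0x1f) != slice || (m & 7) != part)
                    return 0;
                if (m != makedev_minor(0, unit, slice, part))
                    return 0;
            }
        }
    }
    return 1;
}
```

Calling check_all_units() from a small test program covers all 512 unit
numbers; unit_of(make_minor(500, 2, 1)) should give back 500.  This only
checks the bit arithmetic, not what the driver does with the unit
number afterwards.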

However, further review showed that actually only 256 of them are
usable.  With the first disk (bus 0, target 0) wired down to sd500,
what I got was:

===
# This listing automatically generated by lsdev(1)
 :
ahc0	at pci2:4 # int a irq 9
sd244 at SCSI bus 0:0:0
 :
sd255 at SCSI bus 0:15:0
ahc1	at pci2:5 # int a irq 10
sd0 at SCSI bus 1:0:0
 :
sd7 at SCSI bus 1:7:0
ahc2	at pci1:4 # int a irq 10
sd8 at SCSI bus 2:4:0
 :
sd27 at SCSI bus 3:7:0
===

Note that 244 = 500 (mod 256).  It seems like it's wrapping around at
256, which probably means there's an unsigned char somewhere in the sd
(or generic disk) code.
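
A wrap at 256 is what you would see if the unit number passes through an
8-bit quantity at some point.  The following C sketch only illustrates
that effect; the function name is hypothetical, and where (or whether)
the real sd code does this has not been tracked down.  It assumes an
8-bit unsigned char.

```c
#include <assert.h>

/* Hypothetical: what a wired-down unit number looks like after being
 * stored in an 8-bit field.  Not taken from the sd driver. */
static int unit_through_8bit_field(int wired_unit)
{
    assert(wired_unit >= 0);
    /* Conversion to unsigned char keeps only the low 8 bits (mod 256). */
    unsigned char stored = (unsigned char)wired_unit;
    return stored;
}
```

With this, a unit wired to 500 comes out as 244, matching the listing
above, while anything in the range 0 to 255 comes through unchanged.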

So I wired the first disk to 216, and got:

===
# This listing automatically generated by lsdev(1)
 :
ahc0	at pci2:4 # int a irq 9
sd216 at SCSI bus 0:0:0
 :
sd227 at SCSI bus 0:15:0
ahc1	at pci2:5 # int a irq 10
sd228 at SCSI bus 1:0:0
 :
sd235 at SCSI bus 1:7:0
ahc2	at pci1:4 # int a irq 10
sd236 at SCSI bus 2:4:0
 :
sd247 at SCSI bus 2:15:0
ahc3	at pci1:5 # int a irq 11
sd248 at SCSI bus 3:0:0
 :
sd255 at SCSI bus 3:7:0
 :
===

I actually tested this whole array and it worked fine, so unit numbers
up to 255 are usable.

What do people think?  I'm sure people will agree with Joe that
limiting power users to 32 disks (or even fewer if not contiguous) will
seriously damage FreeBSD's reputation as a "server" OS.

With Julian and Peter's impending modified driver coming up some time
in the future (how long til that?  one year?), maybe we shouldn't try
to move around things too much.

I don't think someone who tries to stick more than 32 disks into a
machine would attempt to mknod them herself, so I guess the apparent
ugliness is ok as long as we ship the matching MAKEDEV....

Satoshi


