From: Chen-Chuan Lin
Date: Sat, 7 Nov 2009 15:48:23 +0800
To: freebsd-current@freebsd.org, freebsd-fs@freebsd.org
Cc: xclin@cs.nctu.edu.tw
Subject: ZFSBoot with zpool inside bsdlabel
Message-ID: <20091107074823.GA78260@cs.nctu.edu.tw>

Hi all,

I want to build a ZFS-only system that multiboots with Microsoft
Windows, so my only choice is MBR plus zfsboot. I found that zfsboot
doesn't work in my environment.

I created 2 slices: slice 1 for Windows and slice 2 for FreeBSD.
In slice 2 I created 2 partitions: A for ZFS and B for swap.
If partition A's offset is 0, zfsboot works fine, but if partition A
is placed behind B, zfsboot1 can't locate zfsboot2. Also, zfsboot2
can't find my zpool.

0      1                 n        n+512    n+1024         (sector relative to
+------+------+----------+--------+--------+-------+----   current slice)
| boot | disk | ... swap | vdev   | vdev   | Boot  |....
| code | label|          | label0 | label1 | Block |
+------+------+----------+--------+--------+-------+---
|      Partition B       | Partition A
+------------------------+------------------------------

The original zfsboot only supports

0               512     1024
+------+--------+-------+-------+----
| boot |  vdev  | vdev  | Boot  |
+------+  label | label | Block | .....
|         0     | 1     |       |
+---------------+-------+-------+----

where boot code is zfsboot1 and boot block is zfsboot2.

I found that zfsboot doesn't read any disklabel and assumes the zpool
occupies the whole slice instead of a single partition. So I modified
zfsboot.c (part of zfsboot2) and zfsldr.S (zfsboot1).

In zfsboot.c, probe_drive() now scans for ZFS metadata not only in the
slice but also in partition A, if there is a disklabel in the slice.

In zfsldr.S there is not enough space for additional code to scan the
disklabel, so I made a change. Everything before main.5 is unchanged.
After that, the code reads the disklabel, checks the magic number and
the partition type, and calculates the offset of partition A. Finally,
it reads BTX and the BTX client from there, instead of from sector 1024
of the slice.
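
For readers who prefer C to 16-bit assembly, the disklabel check done in
zfsldr.S can be sketched roughly as below. This is only an illustration
and not part of the patch: the names boot2_base_lba() and rd_le32() are
made up, the byte offsets are the DL_* constants from the zfsldr.S hunk,
and 27 is the ZFS partition type that hunk compares against.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offsets inside the label sector, taken from the zfsldr.S patch. */
#define DL_DISKMAGIC    0x82564557u
#define DL_MAGIC1       0x00    /* d_magic */
#define DL_MAGIC2       0x84    /* d_magic2 */
#define DL_PARTA_SIZE   0x94    /* partition A size (DL_SECT in the patch) */
#define DL_PARTA_OFFSET 0x98    /* partition A offset */
#define DL_PARTA_TYPE   0xa0    /* partition A fstype */
#define DL_RAW_OFFSET   0xb8    /* raw partition offset */
#define FS_ZFS          27

/* Hypothetical helper: read a little-endian 32-bit value. */
static uint32_t
rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
        (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/*
 * Return the LBA that "sector 1024" should be counted from: the start
 * of partition A when the label is valid and A is a ZFS partition,
 * otherwise the unchanged slice start.
 */
uint32_t
boot2_base_lba(const uint8_t *label, uint32_t slice_start)
{
    assert(label != NULL);
    if (rd_le32(label + DL_MAGIC1) != DL_DISKMAGIC ||
        rd_le32(label + DL_MAGIC2) != DL_DISKMAGIC)
        return slice_start;             /* no disklabel */
    if (rd_le32(label + DL_PARTA_SIZE) == 0)
        return slice_start;             /* partition A is empty */
    if (label[DL_PARTA_TYPE] != FS_ZFS)
        return slice_start;             /* partition A is not ZFS */
    /* slice start + (offset of A - offset of the raw partition) */
    return slice_start + rd_le32(label + DL_PARTA_OFFSET) -
        rd_le32(label + DL_RAW_OFFSET);
}
```

For example, with a slice starting at sector 63 and partition A placed
4096 sectors after the raw partition, the base becomes 63 + 4096 = 4159,
and boot2 is then read from sector 1024 relative to that base.
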
The rest (BTX relocation and enabling A20) has been moved into
zfsboot1.5, which is stored in sector 2 of the booting slice and is
loaded at 0xA00 (I don't know whether this memory location is OK or
not). zfsboot1 loads boot1.5, loads BTX and then jumps to boot1.5.
Boot1.5 does the rest and jumps to BTX to continue the boot procedure.

The steps to install the boot code have become (assuming the zpool is
in ad0s2a):

	dd if=zfsboot1 of=/dev/ad0s2
	dd if=zfsboot1.5 of=/dev/ad0s2 seek=2
	dd if=zfsboot2 of=/dev/ad0s2a seek=1024

There is a sample zfsboot at http://140.113.17.225/~xclin/zfsboot
(with some debug messages, and only for i386/amd64).
The first 512 bytes are zfsboot1, the second 512 bytes are zfsboot1.5,
and the remaining 32K are zfsboot2:

	dd if=zfsboot of=/dev/ad0s2 count=1
	dd if=zfsboot of=/dev/ad0s2 count=1 skip=1 seek=2
	dd if=zfsboot of=/dev/ad0s2a skip=2 seek=1024

I don't know whether this function is necessary or not...

The following are the patches and zfsboot1.5's code
(or http://140.113.17.225/~xclin/zfsboot.patch)

--- sys/boot/i386/zfsboot/Makefile.orig	2009-10-30 23:48:15.356651674 +0800
+++ sys/boot/i386/zfsboot/Makefile	2009-10-30 23:49:59.515556507 +0800
@@ -15,6 +15,7 @@

 REL1=	0x700
 ORG1=	0x7c00
+ORG15=	0xa00
 ORG2=	0x2000

 CFLAGS=	-Os -g \
@@ -45,8 +46,8 @@

 CLEANFILES=	zfsboot

-zfsboot: zfsboot1 zfsboot2
-	cat zfsboot1 zfsboot2 > zfsboot
+zfsboot: zfsboot1 zfsboot1.5 zfsboot2
+	cat zfsboot1 zfsboot1.5 zfsboot2 > zfsboot

 CLEANFILES+=	zfsboot1 zfsldr.out zfsldr.o

@@ -56,6 +57,14 @@
 zfsldr.out: zfsldr.o
 	${LD} ${LDFLAGS} -e start -Ttext ${ORG1} -o ${.TARGET} zfsldr.o

+CLEANFILES+=	zfsboot1.5 boot15.out boot15.o
+
+zfsboot1.5: boot15.out
+	objcopy -S -O binary boot15.out ${.TARGET}
+
+boot15.out: boot15.o
+	${LD} ${LDFLAGS} -e start -Ttext ${ORG15} -o ${.TARGET} boot15.o
+
 CLEANFILES+=	zfsboot2 zfsboot.ld zfsboot.ldr zfsboot.bin zfsboot.out \
 		zfsboot.o zfsboot.s zfsboot.s.tmp zfsboot.h sio.o
--- sys/boot/i386/zfsboot/boot15.S.orig	1970-01-01 08:00:00.000000000 +0800
+++ sys/boot/i386/zfsboot/boot15.S	2009-10-30 23:37:11.310256951 +0800
@@ -0,0 +1,93 @@
+
+/* Memory Locations */
+		.set MEM_ORG_15,0xa00		# Origin of boot15
+		.set MEM_BUF,0x8000		# Load area
+		.set MEM_BTX,0x9000		# BTX start
+		.set MEM_JMP,0x9010		# BTX entry point
+		.set MEM_USR,0xa000		# Client start
+		.set BDA_BOOT,0x472		# Boot howto flag
+
+/* Misc. Constants */
+		.set SIZ_PAG,0x1000		# Page size
+		.set SIZ_SEC,0x200		# Sector size
+		.set NSECT,0x40
+
+		.global start
+		.code16
+
+		.set NREADP, 0x7c27
+
+start:		jmp main
+
+main:
+		mov $msg_hello, %si
+		callw putstr
+
+/*
+ * We have already loaded BTX and the client into $MEM_BUF in boot1. The
+ * only thing boot1.5 needs to do is relocate BTX and the client and
+ * then jump to the entry point.
+ */
+
+main.7:		mov $MEM_BUF,%si		# BTX (before reloc)
+		mov 0xa(%si),%bx		# Get BTX length and set
+		mov $NSECT*SIZ_SEC-1,%di	# Size of load area (less one)
+		mov %di,%si			# End of load
+		add $MEM_BUF,%si		#  area
+		sub %bx,%di			# End of client, 0xc000 rel
+		mov %di,%cx			# Size of
+		inc %cx				#  client
+		mov $(MEM_USR+2*SIZ_PAG)>>4,%dx	# Segment
+		mov %dx,%es			#  addressing 0xc000
+		std				# Move with decrement
+		rep				# Relocate
+		movsb				#  client
+		mov %ds,%dx			# Back to
+		mov %dx,%es			#  zero segment
+		mov $MEM_BUF,%si		# BTX (before reloc)
+		mov $MEM_BTX,%di		# BTX
+		mov %bx,%cx			# Get BTX length
+		cld				# Increment this time
+		rep				# Relocate
+		movsb				#  BTX
+
+/*
+ * Enable A20 so we can access memory above 1 meg.
+ * Use the zero-valued %cx as a timeout for embedded hardware which do not
+ * have a keyboard controller.
+ */
+seta20:		cli				# Disable interrupts
+seta20.1:	dec %cx				# Timeout?
+		jz seta20.3			# Yes
+		inb $0x64,%al			# Get status
+		testb $0x2,%al			# Busy?
+		jnz seta20.1			# Yes
+		movb $0xd1,%al			# Command: Write
+		outb %al,$0x64			#  output port
+seta20.2:	inb $0x64,%al			# Get status
+		testb $0x2,%al			# Busy?
+		jnz seta20.2			# Yes
+		movb $0xdf,%al			# Enable
+		outb %al,$0x60			#  A20
+seta20.3:	sti				# Enable interrupts
+
+
+		jmp start+MEM_JMP-MEM_ORG_15	# Start BTX
+
+putstr.0:	mov $0x7,%bx
+		movb $0xe,%ah
+		int $0x10
+putstr:		lodsb
+		testb %al,%al
+		jne putstr.0
+
+ereturn:	movb $0x1,%ah
+		stc
+return:		retw
+
+
+msg_hello:	.asciz "welcom to boot1.5\r\n"
+
+		.org 0x1FE, 0x90
+/* unused, but checks that the code is no more than 512 bytes */
+part_magic:	.word 0xaa55
--- sys/boot/i386/zfsboot/zfsboot.c.orig	2009-10-30 23:48:15.355651268 +0800
+++ sys/boot/i386/zfsboot/zfsboot.c	2009-10-30 23:50:44.761515700 +0800
@@ -22,6 +22,7 @@
 #ifdef GPT
 #include
 #endif
+#include
 #include
 #include

@@ -154,6 +155,7 @@
 struct dmadat {
 	char rdbuf[READ_BUF_SIZE];	/* for reading large things */
 	char secbuf[READ_BUF_SIZE];	/* for MBR/disklabel */
+	char labbuf[READ_BUF_SIZE];	/* for MBR/disklabel */
 };
 static struct dmadat *dmadat;

@@ -524,6 +526,32 @@
 		 */
 		dsk = copy_dsk(dsk);
 	    }
+	    else if(dp[i].dp_typ == DOSPTYP_386BSD)
+	    {
+		struct disklabel *label;
+		char *lhdr = dmadat->labbuf;
+		/* save slice offset if we want to scan all partitions */
+		u_int32_t dp_start = dp[i].dp_start;
+
+		if(drvread(dsk, lhdr, 1 ,1))
+		    continue;
+
+		label = lhdr;
+
+		if(label->d_magic !=DISKMAGIC || label->d_magic2 != DISKMAGIC)
+		    continue;
+
+		if(!label->d_partitions[0].p_size)
+		    continue;
+
+		dsk->start += label->d_partitions[0].p_offset;
+		dsk->start -= label->d_partitions[RAW_PART].p_offset;
+
+		if(vdev_probe(vdev_read, dsk, spap) == 0) {
+		    spap =0;
+		    dsk = copy_dsk(dsk);
+		}
+	    }
 	}
 }

--- sys/boot/i386/zfsboot/zfsldr.S.orig	2009-10-30 23:48:15.353651293 +0800
+++ sys/boot/i386/zfsboot/zfsldr.S	2009-10-30 23:51:13.431489398 +0800
@@ -18,6 +18,7 @@
 /* Memory Locations */
 		.set MEM_REL,0x700		# Relocation address
 		.set MEM_ARG,0x900		# Arguments
+		.set MEM_BOOT15,0xa00		# Boot1.5 Location
 		.set MEM_ORG,0x7c00		# Origin
 		.set MEM_BUF,0x8000		# Load area
 		.set MEM_BTX,0x9000		# BTX start
@@ -38,6 +39,16 @@
 		.set SIZ_SEC,0x200		# Sector size

 		.set NSECT,0x40
+
+/* Disklabel Constants */
+		.set DL_DISKMAGIC,0x82564557	# Disklabel Magic
+		.set DL_MAGIC1,0x0		# Magic1
+		.set DL_MAGIC2,0x84		# Magic2
+		.set DL_SECT,0x94		# Number of sector
+		.set DL_PARTA_OFFSET,0x98	# PartA offset
+		.set DL_PARTA_TYPE,0xA0		# PartA Type
+		.set DL_RAW_OFFSET,0xb8		# Raw Part offset
+
 		.globl start
 		.globl xread
 		.code16
@@ -194,9 +205,64 @@
  * area and target area do not overlap.
  */
 main.5:		mov %dx,MEM_ARG			# Save args
+		movw 0x8(%si),%ax		# Backup
+		movw %ax,MEM_BUF+SIZ_SEC+0x8	#  original
+		movw 0xa(%si),%ax		#  slice
+		movw %ax,MEM_BUF+SIZ_SEC+0xa	#  start
+/*
+ * Because we don't have enough space, relocating BTX will take
+ * place in boot 1.5. Load it from sector 2 and relocate into 0xa00
+ */
+		movb $1,%dh			# Load
+		movw $2,%ax			# Boot 1.5 from
+		callw nread.1			# Read disk
+		mov $MEM_BUF,%si		# Boot 1.5 (before)
+		mov $MEM_BOOT15,%di		# Boot 1.5
+		mov $SIZ_SEC,%cx		# Size
+		rep				# Relocate
+		movsb				#  Boot 1.5
+/*
+ * Read sector 1 from slice to check whether there is a disklabel
+ * or not. If so, check magic, partition type and size of partition A
+ * And then calculate partition A's offset and put it in
+ * MEM_BUF+SIZ_SEC+0x8, otherwise, it remains unchanged
+ */
+		movw $MEM_BUF+SIZ_SEC,%si	# offset
+		movb $1,%dh			# Sector count
+		movw $1,%ax			# Offset to disklabel
+		callw nread.1			# Read disk
+		mov $MEM_BUF,%si		# Disklabel
+		cmpl $DL_DISKMAGIC,0x0(%si)	# Check
+		jne main.6			#  magic1
+		cmpl $DL_DISKMAGIC,0x84(%si)	# Check
+		jne main.6			#  magic2
+		cmpl $0,DL_SECT(%si)		# Check label a
+		je main.6			#  size
+		cmpb $27,DL_PARTA_TYPE(%si)	# Check label a
+		jne main.6			#  type ZFS=27
+
+		movw MEM_BUF+SIZ_SEC+0x8,%ax	# Partition A
+		movw MEM_BUF+SIZ_SEC+0xa,%cx	#  start at
+		addw DL_PARTA_OFFSET(%si),%ax	#  slice
+		adcw DL_PARTA_OFFSET+2(%si),%cx	#  offset
+
+		subw DL_RAW_OFFSET(%si),%ax	# part a
+		sbbw DL_RAW_OFFSET+2(%si),%cx	#  offset -
+		movw %ax,MEM_BUF+SIZ_SEC+0x8	#  raw part
+		movw %cx,MEM_BUF+SIZ_SEC+0xa	#  offset
+/*
+ * We have boot partition's offset in $MEM_BUF+SIZ_SEC+8, put it in
+ * %si and load. And then read boot2 from disk. The rest of things
+ * will be done in boot 1.5
+ */
+main.6:
+		movw $MEM_BUF+SIZ_SEC,%si	# offset
 		movb $NSECT,%dh			# Sector count
 		movw $1024,%ax			# Offset to boot2
 		callw nread.1			# Read disk
+
+		jmp start+MEM_BOOT15-MEM_ORG	# Goto Boot1.5
+
+#if 0 /* below is the original boot code */
 main.6:		mov $MEM_BUF,%si		# BTX (before reloc)
 		mov 0xa(%si),%bx		# Get BTX length and set
 		mov $NSECT*SIZ_SEC-1,%di	# Size of load area (less one)
@@ -241,14 +307,16 @@


 		jmp start+MEM_JMP-MEM_ORG	# Start BTX
+#endif

 /*
  * Trampoline used to call read from within boot1.
  */
 nread:		xor %ax,%ax			# Sector offset in partition
 nread.1:	mov $MEM_BUF,%bx		# Transfer buffer
+		xor %cx,%cx			# Clear
 		add 0x8(%si),%ax		# Get
-		mov 0xa(%si),%cx		#  LBA
+		adc 0xa(%si),%cx		#  LBA
 		push %cs			# Read from
 		callw xread.1			#  disk
 		jnc return			# If success, return
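
A note on the last hunk: nread.1 now clears %cx and uses adc, so a carry
out of the low word of the LBA reaches the high word, where the old code
simply copied the high word of the slice start. A rough C equivalent is
sketched below. It is an illustration only: struct lba32 and lba_add()
are made-up names, and the LBA is assumed to be kept as two 16-bit
halves, as it is in %ax/%cx.

```c
#include <stdint.h>

/* A 32-bit LBA held as two 16-bit halves, like %ax (lo) and %cx (hi). */
struct lba32 {
    uint16_t lo;
    uint16_t hi;
};

/* Add a 16-bit sector offset to a base LBA, propagating the carry. */
struct lba32
lba_add(struct lba32 base, uint16_t sector)
{
    struct lba32 r;
    uint32_t sum = (uint32_t)base.lo + sector;  /* add 0x8(%si),%ax */

    r.lo = (uint16_t)sum;
    r.hi = (uint16_t)(base.hi + (sum >> 16));   /* adc 0xa(%si),%cx */
    return r;
}
```

With a base of 0x0001fc00 and an offset of 1024 sectors, the low word
wraps to 0x0000 and the high word becomes 0x0002; without the carry the
read would have gone to 0x00010000.
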