Date: Sun, 7 Mar 2004 16:07:04 +0200
From: Valentin Nechayev <netch@ivb.nn.kiev.ua>
To: hackers@freebsd.org
Subject: ATA large disks & EDD at boot
Message-ID: <20040307140703.GA310@iv.nn.kiev.ua>

Hi,

It seems that the current selection between the traditional and the new (packet, aka EDD, aka Int13x) disk read interface in the boot blocks (boot1 and mbr) is obsolete and leads to unbootable systems.

The main factor that kills the old access method is the strange BIOS translation for disks larger than 32G. For my two home disks:

    Model: IC35L040AVER07-0
    LBA size: 80418240 blocks
    BIOS geometry in "LBA" mode:    5005*255*63
    BIOS geometry in "NORMAL" mode: 19710*16*255
    BIOS geometry in "LARGE" mode:  1314*240*255

    Model: SAMSUNG SP1203N
    LBA size: 234493056 blocks
    BIOS geometry in "LBA" mode:    14596*255*63
    BIOS geometry in "NORMAL" mode: 57473*16*255
    BIOS geometry in "LARGE" mode:  3831*240*255

At first it may seem that this is specific to one BIOS version, but I have checked the reported translations on a number of other motherboards and BIOSes and saw the identical translation.

The sabotaging factor here is 255 sectors. It does not fit in the 6 bits allowed by the B-1302 interface (which is used in all boot blocks when EDD is considered unnecessary), and it is not correctly reported by the B-1308 interface (i.e. int 0x13 with AH=0x08), which gets drive parameters.

I checked the behaviour of this interface under MS-DOS using a program which prints the reported drive parameters and then compares the checksum of a block read through the old (CHS) interface with the checksums of blocks read through LBA, assuming either 63 or 255 sectors per track. BIOS translation was set to "NORMAL" (and the BIOS said that the geometry is 19710*16*255). The testing code in question:

    [...]
    for( head = 0; head <= 15 && head < nhead ; ++head ) {
      for( sect = 1; sect <= 63 && sect <= nsect; ++sect ) {
        unsigned sum1, sum2, sum3;
        sum1 = sum_chs( cyl, head, sect );
        sum2 = sum_lba( (unsigned long) cyl*nhead*63 +
            (unsigned long) head*63 + (sect-1) );
        sum3 = sum_lba( (unsigned long) cyl*nhead*255 +
            (unsigned long) head*255 + (sect-1) );
        printf( "%u:%u:%u -> 0x%04X 0x%04X 0x%04X\n",
            cyl, head, sect, sum1, sum2, sum3 );
    [...]

(The sum_xxx() functions read the specified block and return its checksum.)

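The comparison in the test code rests on the usual CHS-to-block mapping. A minimal C sketch of that mapping, and of the 6-bit sector limit of the classic interface, might look like this. The function names chs_to_lba and fits_classic_sector_field are illustrative only; they do not come from the test program or from the boot code.

```c
#include <assert.h>

/*
 * Map a cylinder/head/sector triple to an absolute block number for a
 * geometry with nhead heads and nsect sectors per track.
 * Sectors are numbered from 1, cylinders and heads from 0.
 * (Hypothetical helper, written for illustration.)
 */
unsigned long chs_to_lba(unsigned long cyl, unsigned long head,
                         unsigned long sect,
                         unsigned long nhead, unsigned long nsect)
{
    assert(head < nhead);
    assert(sect >= 1 && sect <= nsect);
    return (cyl * nhead + head) * nsect + (sect - 1);
}

/*
 * The classic (non-EDD) read call has only 6 bits for the sector
 * number, so valid values are 1..63. Returns 1 if sect fits.
 */
int fits_classic_sector_field(unsigned long sect)
{
    return sect >= 1 && sect <= 0x3f;
}
```

With the 16 heads that B-1308 reports, CHS 2:0:1 corresponds to block 2016 under 63 sectors per track and to block 8160 under 255 sectors per track, so sum2 and sum3 in the test are taken over different blocks.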
The test program reports:

    B-1308: geometry 1024*16*63
    2:0:1 -> 0x28D0 0x0646 0x28D0
    2:0:2 -> 0xCE22 0x0128 0xCE22
    2:0:3 -> 0xEBBD 0x3DD5 0xEBBD
    2:0:4 -> 0x53A6 0xADBC 0x53A6
    2:0:5 -> 0x04D3 0xD57B 0x04D3
    2:0:6 -> 0xB0E6 0xD842 0xB0E6
    2:0:7 -> 0xCE3B 0xA763 0xCE3B
    2:0:8 -> 0xB988 0xE4F2 0xB988
    2:0:9 -> 0x7370 0x3B86 0x7370
    [... and so on...]

So sum1 == sum3 for every head and sector tested, and the same holds over the whole tested block range; this means that the BIOS really assumes 255 sectors per such logical track. But, as seen above, B-1308 reports only 63 sectors :( The result is a translation error and incorrect data for any block with absolute number >= 63.

/boot/boot1 uses the B-1308 BIOS call to determine whether EDD is needed (boot1.s, read() function):

    read:   push %dx                // Save
            movb $0x8,%ah           // BIOS: Get drive
            int $0x13               //  parameters
            [... divide block number to get head and sector count ...]
            pop %dx                 // Restore
            cmpl $0x3ff,%eax        // Cylinder number supportable?
            sti                     // Enable interrupts
            ja read.7               // No, try EDD

So it cannot boot a system on a disk >32G with "NORMAL" or "LARGE" BIOS translation. This is confirmed in practice: I set up a system in this mode and saw the boot fail completely :(

In contrast to the "NORMAL" and "LARGE" modes, "LBA" seems to be free of such problems: it is standardized with 63 sectors. See, e.g., http://www.firmware.com/support/bios/over4gb.htm. So, with LBA translation in the BIOS, /boot/boot1 can boot the system (unless 128G is crossed); with another translation, this is possible only for disks smaller than 32G.

But /boot/mbr can be faulty even for disks smaller than 32G, because it decides whether to use EDD calls based on rather strange factors:

            movb 0x1(%si),%dh       # Load head
            movw 0x2(%si),%cx       # Load cylinder:sector
            movw $LOAD,%bx          # Transfer buffer
            cmpb $0xff,%dh          # Might we need to use LBA?
            jnz main.7              # No.
            cmpw $0xffff,%cx        # Do we need to use LBA?
            jnz main.7              # No.

So EDD calls are used only when the beginning address of the slice is 1023:255:63 (head byte 0xff and cylinder:sector word 0xffff). I don't know of any partition (slice) editor which sets such values.

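To make the mbr.s test concrete, here is a minimal C sketch that packs a CHS address into the three bytes of a partition table entry and then applies the same comparison as the assembly above. The helper names encode_chs and mbr_would_use_edd are mine and hypothetical; only the byte layout and the 0xff / 0xffff comparison are taken from the code quoted in this message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pack cylinder/head/sector into the 3 CHS bytes of an MBR partition
 * entry: b[0] = head, b[1] = sector (bits 0-5) plus cylinder bits 8-9
 * (bits 6-7), b[2] = cylinder bits 0-7.
 * (Hypothetical helper, written for illustration.)
 */
void encode_chs(unsigned cyl, unsigned head, unsigned sect, uint8_t b[3])
{
    assert(cyl <= 1023 && head <= 255 && sect >= 1 && sect <= 63);
    b[0] = (uint8_t)head;
    b[1] = (uint8_t)(sect | ((cyl >> 2) & 0xc0));
    b[2] = (uint8_t)(cyl & 0xff);
}

/*
 * Mirror of the unpatched mbr.s decision: %dh is loaded from b[0],
 * %cx from b[1] (low byte) and b[2] (high byte). EDD is tried only
 * if %dh == 0xff and %cx == 0xffff. Returns 1 if EDD would be tried.
 */
int mbr_would_use_edd(const uint8_t b[3])
{
    uint16_t cx = (uint16_t)(b[1] | ((unsigned)b[2] << 8));
    return b[0] == 0xff && cx == 0xffff;
}
```

Under this sketch a start address of 1023:255:63 selects EDD, while 1023:254:63 goes down the CHS path even though the real start of the slice may lie beyond cylinder 1023.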
Linux fdisk and the MS Windows fdisks I have seen write 1023:254:63 when a partition's beginning or end does not fit in 1024 cylinders in the current geometry; the boot1 default partition table (used in dedicated mode) also now has 254 heads, not 255. This MBR will fail on such blocks.

/boot/boot0 is free of such problems because it has packet mode (i.e. EDD calls) enabled by default. For comparison with other boot loaders, Linux LILO and GRUB now use EDD by default and do not try to fall back to CHS for the beginning of the disk.

My proposal to solve these problems is to replicate the /boot/boot0 approach: use an externally configured flag that says whether to use packet mode (EDD calls), and set the default value of this flag to true; EDD is used when it is available and returns correct data, otherwise the code switches to CHS access. The patches below are tested on my home system.

It is interesting that the comment on read() in boot1.s says that it uses the same logic, but in the real logic EDD usage is reduced to a corner case. After the following patch, code and comment will agree.

I have checked the full CVS history for these boot blocks, but none of the revisions showed the real reason for any change in the read function selection logic. I also could not find any essential information in the mailing list archives. So this post is based only on the latest state of the code and my own experiments.

==={{{
--- boot1.s.orig        Sun Mar  7 11:18:42 2004
+++ boot1.s     Sun Mar  7 11:24:04 2004
@@ -265,7 +265,9 @@
 // %dl   - byte     - drive number
 // stack - 10 bytes - EDD Packet
 //
-read:          push %dx                        // Save
+read:          testb $FL_PACKET,%cs:MEM_REL+flags-start // LBA support enabled?
+               jnz read.7                      // Yes, go to LBA code
+read.1:        push %dx                        // Save
                movb $0x8,%ah                   // BIOS: Get drive
                int $0x13                       //  parameters
                movb %dh,%ch                    // Max head number
@@ -288,7 +290,7 @@
                pop %dx                         // Restore
                cmpl $0x3ff,%eax                // Cylinder number supportable?
                sti                             // Enable interrupts
-               ja read.7                       // No, try EDD
+               ja ereturn                      // No, stop attempts
                xchgb %al,%ah                   // Set up cylinder
                rorb $0x2,%al                   //  number
                orb %ch,%al                     // Merge
@@ -326,21 +328,20 @@
                sub %al,0x2(%bp)                // block count
                ja read                         // If not done
 read.6:        retw                            // To caller
-read.7:        testb $FL_PACKET,%cs:MEM_REL+flags-start // LBA support enabled?
-               jz ereturn                      // No, so return an error
-               mov $0x55aa,%bx                 // Magic
+read.7:        mov $0x55aa,%bx                 // Magic
                push %dx                        // Save
                movb $0x41,%ah                  // BIOS: Check
                int $0x13                       //  extensions present
                pop %dx                         // Restore
-               jc return                       // If error, return an error
+               jc read.1                       // if no, go to CHS read
                cmp $0xaa55,%bx                 // Magic?
-               jne ereturn                     // No, so return an error
+               jne read.1                      // if no, go to CHS read
                testb $0x1,%cl                  // Packet interface?
-               jz ereturn                      // No, so return an error
+               jz read.1                       // if no, go to CHS read
                mov %bp,%si                     // Disk packet
                movb $0x42,%ah                  // BIOS: Extended
                int $0x13                       //  read
+               jc read.1                       // last resort attempt
                retw                            // To caller
 
 // Messages
===}}}

The MBR patch is similar, but requires adding a flag byte and a corresponding change in the Makefile:

==={{{
--- mbr.s.orig  Sun Mar  7 11:31:12 2004
+++ mbr.s       Sun Mar  7 11:43:33 2004
@@ -88,10 +88,8 @@
                movb 0x1(%si),%dh               # Load head
                movw 0x2(%si),%cx               # Load cylinder:sector
                movw $LOAD,%bx                  # Transfer buffer
-               cmpb $0xff,%dh                  # Might we need to use LBA?
-               jnz main.7                      # No.
-               cmpw $0xffff,%cx                # Do we need to use LBA?
-               jnz main.7                      # No.
+               testb $0x1,%cs:flags+EXEC-start # EDD is allowed?
+               jz main.6                       # No.
                pushw %cx                       # Save %cx
                pushw %bx                       # Save %bx
                movw $0x55aa,%bx                # Magic
@@ -150,6 +148,7 @@
 msg_pt:        .asciz "Invalid partition table"
 msg_rd:        .asciz "Error loading operating system"
 msg_os:        .asciz "Missing operating system"
+flags:         .byte MBRFLAGS
 
                .org PT_OFF
 
--- Makefile.orig       Sun Mar  7 11:37:25 2004
+++ Makefile    Sun Mar  7 11:40:09 2004
@@ -6,6 +6,10 @@
 BINDIR?=       /boot
 BINMODE=       444
 
+MBRFLAGS?=     0x01
+
+AFLAGS +=      --defsym MBRFLAGS=${MBRFLAGS}
+
 ORG=   0x600
 
 mbr: mbr.o
===}}}

All patches were based on 5.2-release code.

-netch-