From owner-svn-src-stable-9@FreeBSD.ORG Sun Nov 18 17:09:30 2012 Return-Path: Delivered-To: svn-src-stable-9@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 60589EAB; Sun, 18 Nov 2012 17:09:30 +0000 (UTC) (envelope-from ae@FreeBSD.org) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) by mx1.freebsd.org (Postfix) with ESMTP id 4276E8FC08; Sun, 18 Nov 2012 17:09:30 +0000 (UTC) Received: from svn.freebsd.org (localhost [127.0.0.1]) by svn.freebsd.org (8.14.5/8.14.5) with ESMTP id qAIH9Uld072344; Sun, 18 Nov 2012 17:09:30 GMT (envelope-from ae@svn.freebsd.org) Received: (from ae@localhost) by svn.freebsd.org (8.14.5/8.14.5/Submit) id qAIH9UYN072337; Sun, 18 Nov 2012 17:09:30 GMT (envelope-from ae@svn.freebsd.org) Message-Id: <201211181709.qAIH9UYN072337@svn.freebsd.org> From: "Andrey V. Elsukov" Date: Sun, 18 Nov 2012 17:09:30 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-9@freebsd.org Subject: svn commit: r243243 - in stable/9/sys/boot: arm/uboot common i386/libi386 i386/loader i386/pmbr powerpc/uboot sparc64/loader uboot/common uboot/lib userboot userboot/test userboot/userboot zfs X-SVN-Group: stable-9 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-9@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: SVN commit messages for only the 9-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 18 Nov 2012 17:09:30 -0000 Author: ae Date: Sun Nov 18 17:09:29 2012 New Revision: 243243 URL: http://svnweb.freebsd.org/changeset/base/243243 Log: MFC 239054,239057,239058,239060,239066,239067,239068,239070,239073, 239087,239088,239127,239210,239211,239230,239231,239232,239243, 239292,239293,239294,239325,240272,240273,240274,240275,240276, 240277,240335,240481,241023,241047,241053,241065,241068,241069, 241070,241164,241809,241876 239054: Create the interface to work with various partition tables from the loader(8). The following partition tables are supported: BSD label, GPT, MBR, EBR and VTOC8. 239057: Remove unused variables. 239058: Introduce new API to work with disks from the loader's drivers. It uses new API from the part.c to work with partition tables. 239060: When GPT signature is invalid in the primary GPT header, then try to read backup GPT header. 239066: Add offset field to the i386_devdesc structure to be compatible with disk_devdesc structure. Update biosdisk driver to the new disk API. 239067: Remove unneeded flag. 239068: Teach the ZFS use new partitions API when probing. Note: now ZFS does probe only for partitions with type "freebsd-zfs" and "freebsd". 239070: Add simple test program that uses the partition tables handling code. It is useful to test and debug how boot loader handles partition tables metadata. 239073: Bump USERBOOT_VERSION. 239087: Add to the debug output the offset from the parent partitioning scheme. 239088: Fix start offset calculation for the EBR partitions. 239127: As it turned out, there are some installations, where BSD label contains partitions with type zero. And it has worked. So, allow detect these partitions. 239210: Add more debug messages. 239211: Add another debug message. 239230: Unbreak booting from the true dedicated disks. When we open the disk, check the type of partition table, that has been detected. If this is BSD label, then we assume this is DD mode. 239231: Remove colons from the debug message, device name returned by the disk_fmtdev() already has the colons. 239232: Restore the old behaviour. If requested partition is a BSD slice, but d_partition isn't explicitly set, then try to open BSD label and its first partition. 239243: After r239066, reinitialize v86.ctl and v86.addr for int 13 EDD probing in sys/boot/i386/libi386/biosdisk.c. Otherwise, when DISK_DEBUG is enabled, the DEBUG() macros will clobber those fields, and cause the probing to always fail mysteriously when debugging is enabled. 239292: Explicitly terminate the string after strncpy(3). 239293: Rework r239232 to unbreak ZFS detection on MBR slices. 239294: Some BIOSes return incorrect number of sectors, make checks less strictly, to do not lost some partitions. 239325: Add comment why the code has been disabled. 240272: Make struct uboot_devdesc compatible with struct disk_devdesc. 240273: Use disk_fmtdev() and disk_parsedev() functions from the new DISK API. 240274: Update uboot's disk driver to use new DISK API. 240275: Build disk.c only when DISK_SUPPORT is enabled. 240276: Update according to the change of struct uboot_devdesc. 240277: Handle LOADER_NO_DISK_SUPPORT knob in the arm and powerpc ubldr. 240335: Slightly reduce an overhead for the open() call in the zfsloader. libstand(3) tries to detect file system in the predefined order, but zfsloader usually is used for the booting from ZFS, and there is no need to try detect several file system types for each open() call. 240481: The MBR data is not necessarily aligned. This is a problem on ARM. 241023: Make the loader a bit smarter, when it tries to open disk and the slice number is not exactly specified. When the disk has MBR, also try to read BSD label after ptable_getpart() call. When the disk has GPT, also set d_partition to 255. Mostly, this is how it worked before. 241047: Disable splitfs support, since we aren't support floppies for a long time. This slightly reduces an overhead, when loader tries to open file that doesn't exist. 241053: Almost each time when loader opens a file, this leads to calling disk_open(). Very often this is called several times for one file. This leads to reading partition table metadata for each call. To reduce the number of disk I/O we have a simple block cache, but it is very dumb and more than half of I/O operations related to reading metadata, misses this cache. Introduce new cache layer to resolve this problem. It is independent and doesn't need initialization like bcache, and will work by default for all loaders which use the new DISK API. A successful disk_open() call to each new disk or partition produces new entry in the cache. Even more, when disk was already open, now opening of any nested partitions does not require reading top level partition table. So, if without this cache, partition table metadata was read around 20-50 times during boot, now it reads only once. This affects the booting from GPT and MBR from the UFS. 241065: Fix disk_cleanup() to work without DISK_DEBUG too. 241068: Reduce the number of attempts to detect proper kld format for the amd64 loader. 241069: Remember the file format of the last loaded module and try to use it for next files. 241070: Fix the style. 241164: Replace all references to loader_callbacks_v1 with loader_callbacks. 241809: Add the flags parameter to the disk_open() function and DISK_F_NOCACHE flag, that disables the caching of partition tables metadata. Use this flag for floppies in the libi386/biosdisk driver. 241876: When loader tries to open GPT partition, but partition table is not GPT, then try automatically detect an appropriate partition type. Added: stable/9/sys/boot/common/part.c - copied, changed from r239054, head/sys/boot/common/part.c stable/9/sys/boot/common/part.h - copied unchanged from r239054, head/sys/boot/common/part.h Modified: stable/9/sys/boot/arm/uboot/Makefile stable/9/sys/boot/common/Makefile.inc stable/9/sys/boot/common/disk.c stable/9/sys/boot/common/disk.h stable/9/sys/boot/common/module.c stable/9/sys/boot/i386/libi386/Makefile stable/9/sys/boot/i386/libi386/biosdisk.c stable/9/sys/boot/i386/libi386/devicename.c stable/9/sys/boot/i386/libi386/libi386.h stable/9/sys/boot/i386/loader/Makefile stable/9/sys/boot/i386/loader/conf.c stable/9/sys/boot/i386/loader/main.c stable/9/sys/boot/i386/pmbr/pmbr.s stable/9/sys/boot/powerpc/uboot/Makefile stable/9/sys/boot/sparc64/loader/main.c stable/9/sys/boot/uboot/common/main.c stable/9/sys/boot/uboot/lib/Makefile stable/9/sys/boot/uboot/lib/devicename.c stable/9/sys/boot/uboot/lib/disk.c stable/9/sys/boot/uboot/lib/libuboot.h stable/9/sys/boot/userboot/test/test.c stable/9/sys/boot/userboot/userboot.h stable/9/sys/boot/userboot/userboot/Makefile stable/9/sys/boot/userboot/userboot/bootinfo32.c stable/9/sys/boot/userboot/userboot/copy.c stable/9/sys/boot/userboot/userboot/devicename.c stable/9/sys/boot/userboot/userboot/libuserboot.h stable/9/sys/boot/userboot/userboot/main.c stable/9/sys/boot/userboot/userboot/userboot_disk.c stable/9/sys/boot/zfs/zfs.c Directory Properties: stable/9/sys/ (props changed) stable/9/sys/boot/ (props changed) Modified: stable/9/sys/boot/arm/uboot/Makefile ============================================================================== --- stable/9/sys/boot/arm/uboot/Makefile Sun Nov 18 16:58:08 2012 (r243242) +++ stable/9/sys/boot/arm/uboot/Makefile Sun Nov 18 17:09:29 2012 (r243243) @@ -11,7 +11,11 @@ WARNS?= 1 # Architecture-specific loader code SRCS= start.S conf.c vers.c +.if !defined(LOADER_NO_DISK_SUPPORT) LOADER_DISK_SUPPORT?= yes +.else +LOADER_DISK_SUPPORT= no +.endif LOADER_UFS_SUPPORT?= yes LOADER_CD9660_SUPPORT?= no LOADER_EXT2FS_SUPPORT?= no Modified: stable/9/sys/boot/common/Makefile.inc ============================================================================== --- stable/9/sys/boot/common/Makefile.inc Sun Nov 18 16:58:08 2012 (r243242) +++ stable/9/sys/boot/common/Makefile.inc Sun Nov 18 17:09:29 2012 (r243243) @@ -1,6 +1,6 @@ # $FreeBSD$ -SRCS+= boot.c commands.c console.c devopen.c disk.c interp.c +SRCS+= boot.c commands.c console.c devopen.c interp.c SRCS+= interp_backslash.c interp_parse.c ls.c misc.c SRCS+= module.c panic.c @@ -24,6 +24,18 @@ SRCS+= load_elf64.c reloc_elf64.c SRCS+= dev_net.c .endif +.if !defined(LOADER_NO_DISK_SUPPORT) +SRCS+= disk.c part.c +CFLAGS+= -DLOADER_DISK_SUPPORT +.if !defined(LOADER_NO_GPT_SUPPORT) +SRCS+= crc32.c +CFLAGS+= -DLOADER_GPT_SUPPORT +.endif +.if !defined(LOADER_NO_MBR_SUPPORT) +CFLAGS+= -DLOADER_MBR_SUPPORT +.endif +.endif + .if defined(HAVE_BCACHE) SRCS+= bcache.c .endif Modified: stable/9/sys/boot/common/disk.c ============================================================================== --- stable/9/sys/boot/common/disk.c Sun Nov 18 16:58:08 2012 (r243242) +++ stable/9/sys/boot/common/disk.c Sun Nov 18 17:09:29 2012 (r243243) @@ -1,5 +1,6 @@ /*- * Copyright (c) 1998 Michael Smith + * Copyright (c) 2012 Andrey V. Elsukov * All rights reserved. * * Redistribution and use in source and binary forms, with or without @@ -27,26 +28,12 @@ #include __FBSDID("$FreeBSD$"); -/* - * MBR/GPT partitioned disk device handling. - * - * Ideas and algorithms from: - * - * - NetBSD libi386/biosdisk.c - * - FreeBSD biosboot/disk.c - * - */ - +#include +#include #include - -#include -#include -#include - #include -#include - #include +#include #include "disk.h" @@ -56,53 +43,118 @@ __FBSDID("$FreeBSD$"); # define DEBUG(fmt, args...) #endif -/* - * Search for a slice with the following preferences: - * - * 1: Active FreeBSD slice - * 2: Non-active FreeBSD slice - * 3: Active Linux slice - * 4: non-active Linux slice - * 5: Active FAT/FAT32 slice - * 6: non-active FAT/FAT32 slice - */ -#define PREF_RAWDISK 0 -#define PREF_FBSD_ACT 1 -#define PREF_FBSD 2 -#define PREF_LINUX_ACT 3 -#define PREF_LINUX 4 -#define PREF_DOS_ACT 5 -#define PREF_DOS 6 -#define PREF_NONE 7 +struct open_disk { + struct ptable *table; + off_t mediasize; + u_int sectorsize; + u_int flags; + int rcnt; +}; -#ifdef LOADER_GPT_SUPPORT +struct print_args { + struct disk_devdesc *dev; + const char *prefix; + int verbose; +}; -struct gpt_part { - int gp_index; - uuid_t gp_type; - uint64_t gp_start; - uint64_t gp_end; +struct dentry { + const struct devsw *d_dev; + int d_unit; + int d_slice; + int d_partition; + + struct open_disk *od; + off_t d_offset; + STAILQ_ENTRY(dentry) entry; +#ifdef DISK_DEBUG + uint32_t count; +#endif }; -static uuid_t efi = GPT_ENT_TYPE_EFI; -static uuid_t freebsd_boot = GPT_ENT_TYPE_FREEBSD_BOOT; -static uuid_t freebsd_ufs = GPT_ENT_TYPE_FREEBSD_UFS; -static uuid_t freebsd_swap = GPT_ENT_TYPE_FREEBSD_SWAP; -static uuid_t freebsd_zfs = GPT_ENT_TYPE_FREEBSD_ZFS; -static uuid_t ms_basic_data = GPT_ENT_TYPE_MS_BASIC_DATA; +static STAILQ_HEAD(, dentry) opened_disks = + STAILQ_HEAD_INITIALIZER(opened_disks); +static int +disk_lookup(struct disk_devdesc *dev) +{ + struct dentry *entry; + int rc; + + rc = ENOENT; + STAILQ_FOREACH(entry, &opened_disks, entry) { + if (entry->d_dev != dev->d_dev || + entry->d_unit != dev->d_unit) + continue; + dev->d_opendata = entry->od; + if (entry->d_slice == dev->d_slice && + entry->d_partition == dev->d_partition) { + dev->d_offset = entry->d_offset; + DEBUG("%s offset %lld", disk_fmtdev(dev), + dev->d_offset); +#ifdef DISK_DEBUG + entry->count++; +#endif + return (0); + } + rc = EAGAIN; + } + return (rc); +} + +static void +disk_insert(struct disk_devdesc *dev) +{ + struct dentry *entry; + + entry = (struct dentry *)malloc(sizeof(struct dentry)); + if (entry == NULL) { + DEBUG("no memory"); + return; + } + entry->d_dev = dev->d_dev; + entry->d_unit = dev->d_unit; + entry->d_slice = dev->d_slice; + entry->d_partition = dev->d_partition; + entry->od = (struct open_disk *)dev->d_opendata; + entry->od->rcnt++; + entry->d_offset = dev->d_offset; +#ifdef DISK_DEBUG + entry->count = 1; #endif + STAILQ_INSERT_TAIL(&opened_disks, entry, entry); + DEBUG("%s cached", disk_fmtdev(dev)); +} + +#ifdef DISK_DEBUG +COMMAND_SET(dcachestat, "dcachestat", "get disk cache stats", + command_dcachestat); + +static int +command_dcachestat(int argc, char *argv[]) +{ + struct disk_devdesc dev; + struct dentry *entry; -#if defined(LOADER_GPT_SUPPORT) || defined(LOADER_MBR_SUPPORT) + STAILQ_FOREACH(entry, &opened_disks, entry) { + dev.d_dev = (struct devsw *)entry->d_dev; + dev.d_unit = entry->d_unit; + dev.d_slice = entry->d_slice; + dev.d_partition = entry->d_partition; + printf("%s %d => %p [%d]\n", disk_fmtdev(&dev), entry->count, + entry->od, entry->od->rcnt); + } + return (CMD_OK); +} +#endif /* DISK_DEBUG */ -/* Given a size in 512 byte sectors, convert it to a human-readable number. */ +/* Convert size to a human-readable number. */ static char * -display_size(uint64_t size) +display_size(uint64_t size, u_int sectorsize) { static char buf[80]; char unit; - size /= 2; + size = size * sectorsize / 1024; unit = 'K'; if (size >= 10485760000LL) { size /= 1073741824; @@ -114,687 +166,328 @@ display_size(uint64_t size) size /= 1024; unit = 'M'; } - sprintf(buf, "%.6ld%cB", (long)size, unit); + sprintf(buf, "%ld%cB", (long)size, unit); return (buf); } -#endif - -#ifdef LOADER_MBR_SUPPORT - -static void -disk_checkextended(struct disk_devdesc *dev, - struct dos_partition *slicetab, int slicenum, int *nslicesp) +static int +ptblread(void *d, void *buf, size_t blocks, off_t offset) { - uint8_t buf[DISK_SECSIZE]; - struct dos_partition *dp; - uint32_t base; - int rc, i, start, end; - - dp = &slicetab[slicenum]; - start = *nslicesp; - - if (dp->dp_size == 0) - goto done; - if (dp->dp_typ != DOSPTYP_EXT) - goto done; - rc = dev->d_dev->dv_strategy(dev, F_READ, dp->dp_start, DISK_SECSIZE, - (char *) buf, NULL); - if (rc) - goto done; - if (buf[0x1fe] != 0x55 || buf[0x1ff] != 0xaa) { - DEBUG("no magic in extended table"); - goto done; - } - base = dp->dp_start; - dp = (struct dos_partition *) &buf[DOSPARTOFF]; - for (i = 0; i < NDOSPART; i++, dp++) { - if (dp->dp_size == 0) - continue; - if (*nslicesp == NEXTDOSPART) - goto done; - dp->dp_start += base; - bcopy(dp, &slicetab[*nslicesp], sizeof(*dp)); - (*nslicesp)++; - } - end = *nslicesp; + struct disk_devdesc *dev; + struct open_disk *od; - /* - * now, recursively check the slices we just added - */ - for (i = start; i < end; i++) - disk_checkextended(dev, slicetab, i, nslicesp); -done: - return; + dev = (struct disk_devdesc *)d; + od = (struct open_disk *)dev->d_opendata; + return (dev->d_dev->dv_strategy(dev, F_READ, offset, + blocks * od->sectorsize, (char *)buf, NULL)); } -static int -disk_readslicetab(struct disk_devdesc *dev, - struct dos_partition **slicetabp, int *nslicesp) +#define PWIDTH 35 +static void +ptable_print(void *arg, const char *pname, const struct ptable_entry *part) { - struct dos_partition *slicetab = NULL; - int nslices, i; - int rc; - uint8_t buf[DISK_SECSIZE]; - - /* - * Find the slice in the DOS slice table. - */ - rc = dev->d_dev->dv_strategy(dev, F_READ, 0, DISK_SECSIZE, - (char *) buf, NULL); - if (rc) { - DEBUG("error reading MBR"); - return (rc); - } + struct print_args *pa, bsd; + struct open_disk *od; + struct ptable *table; + char line[80]; - /* - * Check the slice table magic. - */ - if (buf[0x1fe] != 0x55 || buf[0x1ff] != 0xaa) { - DEBUG("no slice table/MBR (no magic)"); - return (rc); + pa = (struct print_args *)arg; + od = (struct open_disk *)pa->dev->d_opendata; + sprintf(line, " %s%s: %s", pa->prefix, pname, + parttype2str(part->type)); + if (pa->verbose) + sprintf(line, "%-*s%s", PWIDTH, line, + display_size(part->end - part->start + 1, + od->sectorsize)); + strcat(line, "\n"); + pager_output(line); + if (part->type == PART_FREEBSD) { + /* Open slice with BSD label */ + pa->dev->d_offset = part->start; + table = ptable_open(pa->dev, part->end - part->start + 1, + od->sectorsize, ptblread); + if (table == NULL) + return; + sprintf(line, " %s%s", pa->prefix, pname); + bsd.dev = pa->dev; + bsd.prefix = line; + bsd.verbose = pa->verbose; + ptable_iterate(table, &bsd, ptable_print); + ptable_close(table); } - - /* - * copy the partition table, then pick up any extended partitions. - */ - slicetab = malloc(NEXTDOSPART * sizeof(struct dos_partition)); - bcopy(buf + DOSPARTOFF, slicetab, - sizeof(struct dos_partition) * NDOSPART); - nslices = NDOSPART; /* extended slices start here */ - for (i = 0; i < NDOSPART; i++) - disk_checkextended(dev, slicetab, i, &nslices); - - *slicetabp = slicetab; - *nslicesp = nslices; - return (0); } +#undef PWIDTH -/* - * Search for the best MBR slice (typically the first FreeBSD slice). - */ -static int -disk_bestslice(struct dos_partition *slicetab, int nslices) +void +disk_print(struct disk_devdesc *dev, char *prefix, int verbose) { - struct dos_partition *dp; - int pref, preflevel; - int i, prefslice; - - prefslice = 0; - preflevel = PREF_NONE; - - dp = &slicetab[0]; - for (i = 0; i < nslices; i++, dp++) { - switch (dp->dp_typ) { - case DOSPTYP_386BSD: /* FreeBSD */ - pref = dp->dp_flag & 0x80 ? PREF_FBSD_ACT : PREF_FBSD; - break; - - case DOSPTYP_LINUX: - pref = dp->dp_flag & 0x80 ? PREF_LINUX_ACT : PREF_LINUX; - break; - - case 0x01: /* DOS/Windows */ - case 0x04: - case 0x06: - case 0x0b: - case 0x0c: - case 0x0e: - pref = dp->dp_flag & 0x80 ? PREF_DOS_ACT : PREF_DOS; - break; + struct open_disk *od; + struct print_args pa; - default: - pref = PREF_NONE; - } - if (pref < preflevel) { - preflevel = pref; - prefslice = i + 1; - } - } - return (prefslice); + /* Disk should be opened */ + od = (struct open_disk *)dev->d_opendata; + pa.dev = dev; + pa.prefix = prefix; + pa.verbose = verbose; + ptable_iterate(od->table, &pa, ptable_print); } -static int -disk_openmbr(struct disk_devdesc *dev) +int +disk_open(struct disk_devdesc *dev, off_t mediasize, u_int sectorsize, + u_int flags) { - struct dos_partition *slicetab = NULL, *dptr; - int nslices, sector, slice; - int rc; - uint8_t buf[DISK_SECSIZE]; - struct disklabel *lp; - - /* - * Following calculations attempt to determine the correct value - * for dev->d_offset by looking for the slice and partition specified, - * or searching for reasonable defaults. - */ - rc = disk_readslicetab(dev, &slicetab, &nslices); - if (rc) - return (rc); + struct open_disk *od; + struct ptable *table; + struct ptable_entry part; + int rc, slice, partition; - /* - * if a slice number was supplied but not found, this is an error. - */ - if (dev->d_slice > 0) { - slice = dev->d_slice - 1; - if (slice >= nslices) { - DEBUG("slice %d not found", slice); - rc = EPART; - goto out; - } - } - - /* - * Check for the historically bogus MBR found on true dedicated disks - */ - if (slicetab[3].dp_typ == DOSPTYP_386BSD && - slicetab[3].dp_start == 0 && slicetab[3].dp_size == 50000) { - sector = 0; - goto unsliced; - } - - /* - * Try to auto-detect the best slice; this should always give - * a slice number - */ - if (dev->d_slice == 0) { - slice = disk_bestslice(slicetab, nslices); - if (slice == -1) { - rc = ENOENT; - goto out; - } - dev->d_slice = slice; + rc = 0; + if ((flags & DISK_F_NOCACHE) == 0) { + rc = disk_lookup(dev); + if (rc == 0) + return (0); } - - /* - * Accept the supplied slice number unequivocally (we may be looking - * at a DOS partition). - * Note: we number 1-4, offsets are 0-3 - */ - dptr = &slicetab[dev->d_slice - 1]; - sector = dptr->dp_start; - DEBUG("slice entry %d at %d, %d sectors", - dev->d_slice - 1, sector, dptr->dp_size); - -unsliced: /* - * Now we have the slice offset, look for the partition in the - * disklabel if we have a partition to start with. - * - * XXX we might want to check the label checksum. + * While we are reading disk metadata, make sure we do it relative + * to the start of the disk */ - if (dev->d_partition < 0) { - /* no partition, must be after the slice */ - DEBUG("opening raw slice"); - dev->d_offset = sector; - rc = 0; - goto out; - } - - rc = dev->d_dev->dv_strategy(dev, F_READ, sector + LABELSECTOR, - DISK_SECSIZE, (char *) buf, NULL); - if (rc) { - DEBUG("error reading disklabel"); - goto out; - } - - lp = (struct disklabel *) buf; - - if (lp->d_magic != DISKMAGIC) { - DEBUG("no disklabel"); - rc = ENOENT; - goto out; - } - if (dev->d_partition >= lp->d_npartitions) { - DEBUG("partition '%c' exceeds partitions in table (a-'%c')", - 'a' + dev->d_partition, - 'a' + lp->d_npartitions); - rc = EPART; + dev->d_offset = 0; + table = NULL; + slice = dev->d_slice; + partition = dev->d_partition; + if (rc == EAGAIN) { + /* + * This entire disk was already opened and there is no + * need to allocate new open_disk structure and open the + * main partition table. + */ + od = (struct open_disk *)dev->d_opendata; + DEBUG("%s unit %d, slice %d, partition %d => %p (cached)", + disk_fmtdev(dev), dev->d_unit, dev->d_slice, + dev->d_partition, od); + goto opened; + } else { + od = (struct open_disk *)malloc(sizeof(struct open_disk)); + if (od == NULL) { + DEBUG("no memory"); + return (ENOMEM); + } + dev->d_opendata = od; + od->rcnt = 0; + } + od->mediasize = mediasize; + od->sectorsize = sectorsize; + od->flags = flags; + DEBUG("%s unit %d, slice %d, partition %d => %p", + disk_fmtdev(dev), dev->d_unit, dev->d_slice, dev->d_partition, od); + + /* Determine disk layout. */ + od->table = ptable_open(dev, mediasize / sectorsize, sectorsize, + ptblread); + if (od->table == NULL) { + DEBUG("Can't read partition table"); + rc = ENXIO; goto out; } - - dev->d_offset = - lp->d_partitions[dev->d_partition].p_offset - - lp->d_partitions[RAW_PART].p_offset + - sector; +opened: rc = 0; - -out: - if (slicetab) - free(slicetab); - return (rc); -} - -/* - * Print out each valid partition in the disklabel of a FreeBSD slice. - * For size calculations, we assume a 512 byte sector size. - */ -static void -disk_printbsdslice(struct disk_devdesc *dev, daddr_t offset, - char *prefix, int verbose) -{ - char line[80]; - char buf[DISK_SECSIZE]; - struct disklabel *lp; - int i, rc, fstype; - - /* read disklabel */ - rc = dev->d_dev->dv_strategy(dev, F_READ, offset + LABELSECTOR, - DISK_SECSIZE, (char *) buf, NULL); - if (rc) - return; - lp =(struct disklabel *)(&buf[0]); - if (lp->d_magic != DISKMAGIC) { - sprintf(line, "%s: FFS bad disklabel\n", prefix); - pager_output(line); - return; - } - - /* Print partitions */ - for (i = 0; i < lp->d_npartitions; i++) { - /* - * For each partition, make sure we know what type of fs it - * is. If not, then skip it. - */ - fstype = lp->d_partitions[i].p_fstype; - if (fstype != FS_BSDFFS && - fstype != FS_SWAP && - fstype != FS_VINUM) - continue; - - /* Only print out statistics in verbose mode */ - if (verbose) - sprintf(line, " %s%c: %s %s (%d - %d)\n", - prefix, 'a' + i, - (fstype == FS_SWAP) ? "swap " : - (fstype == FS_VINUM) ? "vinum" : - "FFS ", - display_size(lp->d_partitions[i].p_size), - lp->d_partitions[i].p_offset, - (lp->d_partitions[i].p_offset - + lp->d_partitions[i].p_size)); + if (ptable_gettype(od->table) == PTABLE_BSD && + partition >= 0) { + /* It doesn't matter what value has d_slice */ + rc = ptable_getpart(od->table, &part, partition); + if (rc == 0) + dev->d_offset = part.start; + } else if (slice >= 0) { + /* Try to get information about partition */ + if (slice == 0) + rc = ptable_getbestpart(od->table, &part); else - sprintf(line, " %s%c: %s\n", prefix, 'a' + i, - (fstype == FS_SWAP) ? "swap" : - (fstype == FS_VINUM) ? "vinum" : - "FFS"); - pager_output(line); - } -} - -static void -disk_printslice(struct disk_devdesc *dev, int slice, - struct dos_partition *dp, char *prefix, int verbose) -{ - char stats[80]; - char line[80]; - - if (verbose) - sprintf(stats, " %s (%d - %d)", display_size(dp->dp_size), - dp->dp_start, dp->dp_start + dp->dp_size); - else - stats[0] = '\0'; - - switch (dp->dp_typ) { - case DOSPTYP_386BSD: - disk_printbsdslice(dev, (daddr_t)dp->dp_start, - prefix, verbose); - return; - case DOSPTYP_LINSWP: - sprintf(line, "%s: Linux swap%s\n", prefix, stats); - break; - case DOSPTYP_LINUX: + rc = ptable_getpart(od->table, &part, slice); + if (rc != 0) /* Partition doesn't exist */ + goto out; + dev->d_offset = part.start; + slice = part.index; + if (ptable_gettype(od->table) == PTABLE_GPT) { + partition = 255; + goto out; /* Nothing more to do */ + } else if (partition == 255) { + /* + * When we try to open GPT partition, but partition + * table isn't GPT, reset d_partition value to -1 + * and try to autodetect appropriate value. + */ + partition = -1; + } /* - * XXX - * read the superblock to confirm this is an ext2fs partition? + * If d_partition < 0 and we are looking at a BSD slice, + * then try to read BSD label, otherwise return the + * whole MBR slice. */ - sprintf(line, "%s: ext2fs%s\n", prefix, stats); - break; - case 0x00: /* unused partition */ - case DOSPTYP_EXT: - return; - case 0x01: - sprintf(line, "%s: FAT-12%s\n", prefix, stats); - break; - case 0x04: - case 0x06: - case 0x0e: - sprintf(line, "%s: FAT-16%s\n", prefix, stats); - break; - case 0x07: - sprintf(line, "%s: NTFS/HPFS%s\n", prefix, stats); - break; - case 0x0b: - case 0x0c: - sprintf(line, "%s: FAT-32%s\n", prefix, stats); - break; - default: - sprintf(line, "%s: Unknown fs: 0x%x %s\n", prefix, dp->dp_typ, - stats); - } - pager_output(line); -} - -static int -disk_printmbr(struct disk_devdesc *dev, char *prefix, int verbose) -{ - struct dos_partition *slicetab; - int nslices, i; - int rc; - char line[80]; - - rc = disk_readslicetab(dev, &slicetab, &nslices); - if (rc) - return (rc); - for (i = 0; i < nslices; i++) { - sprintf(line, "%ss%d", prefix, i + 1); - disk_printslice(dev, i, &slicetab[i], line, verbose); - } - free(slicetab); - return (0); -} - -#endif - -#ifdef LOADER_GPT_SUPPORT - -static int -disk_readgpt(struct disk_devdesc *dev, struct gpt_part **gptp, int *ngptp) -{ - struct dos_partition *dp; - struct gpt_hdr *hdr; - struct gpt_ent *ent; - struct gpt_part *gptab = NULL; - int entries_per_sec, rc, i, part; - daddr_t lba, elba; - uint8_t gpt[DISK_SECSIZE], tbl[DISK_SECSIZE]; - - /* - * Following calculations attempt to determine the correct value - * for dev->d_offset by looking for the slice and partition specified, - * or searching for reasonable defaults. - */ - rc = 0; - - /* First, read the MBR and see if we have a PMBR. */ - rc = dev->d_dev->dv_strategy(dev, F_READ, 0, DISK_SECSIZE, - (char *) tbl, NULL); - if (rc) { - DEBUG("error reading MBR"); - return (EIO); - } - - /* Check the slice table magic. */ - if (tbl[0x1fe] != 0x55 || tbl[0x1ff] != 0xaa) - return (ENXIO); - - /* Check for GPT slice. */ - part = 0; - dp = (struct dos_partition *)(tbl + DOSPARTOFF); - for (i = 0; i < NDOSPART; i++) { - if (dp[i].dp_typ == 0xee) - part++; - else if ((part != 1) && (dp[i].dp_typ != 0x00)) - return (EINVAL); - } - if (part != 1) - return (EINVAL); - - /* Read primary GPT table header. */ - rc = dev->d_dev->dv_strategy(dev, F_READ, 1, DISK_SECSIZE, - (char *) gpt, NULL); - if (rc) { - DEBUG("error reading GPT header"); - return (EIO); - } - hdr = (struct gpt_hdr *)gpt; - if (bcmp(hdr->hdr_sig, GPT_HDR_SIG, sizeof(hdr->hdr_sig)) != 0 || - hdr->hdr_lba_self != 1 || hdr->hdr_revision < 0x00010000 || - hdr->hdr_entsz < sizeof(*ent) || - DISK_SECSIZE % hdr->hdr_entsz != 0) { - DEBUG("Invalid GPT header\n"); - return (EINVAL); - } - - /* Walk the partition table to count valid partitions. */ - part = 0; - entries_per_sec = DISK_SECSIZE / hdr->hdr_entsz; - elba = hdr->hdr_lba_table + hdr->hdr_entries / entries_per_sec; - for (lba = hdr->hdr_lba_table; lba < elba; lba++) { - rc = dev->d_dev->dv_strategy(dev, F_READ, lba, DISK_SECSIZE, - (char *) tbl, NULL); - if (rc) { - DEBUG("error reading GPT table"); - return (EIO); + if (partition == -1 && + part.type != PART_FREEBSD) + goto out; + /* Try to read BSD label */ + table = ptable_open(dev, part.end - part.start + 1, + od->sectorsize, ptblread); + if (table == NULL) { + DEBUG("Can't read BSD label"); + rc = ENXIO; + goto out; } - for (i = 0; i < entries_per_sec; i++) { - ent = (struct gpt_ent *)(tbl + i * hdr->hdr_entsz); - if (uuid_is_nil(&ent->ent_type, NULL) || - ent->ent_lba_start == 0 || - ent->ent_lba_end < ent->ent_lba_start) - continue; - part++; + /* + * If slice contains BSD label and d_partition < 0, then + * assume the 'a' partition. Otherwise just return the + * whole MBR slice, because it can contain ZFS. + */ + if (partition < 0) { + if (ptable_gettype(table) != PTABLE_BSD) + goto out; + partition = 0; } + rc = ptable_getpart(table, &part, partition); + if (rc != 0) + goto out; + dev->d_offset += part.start; } +out: + if (table != NULL) + ptable_close(table); - /* Save the important information about all the valid partitions. */ - if (part != 0) { - gptab = malloc(part * sizeof(struct gpt_part)); - part = 0; - for (lba = hdr->hdr_lba_table; lba < elba; lba++) { - rc = dev->d_dev->dv_strategy(dev, F_READ, lba, DISK_SECSIZE, - (char *) tbl, NULL); - if (rc) { - DEBUG("error reading GPT table"); - free(gptab); - return (EIO); - } - for (i = 0; i < entries_per_sec; i++) { - ent = (struct gpt_ent *)(tbl + i * hdr->hdr_entsz); - if (uuid_is_nil(&ent->ent_type, NULL) || - ent->ent_lba_start == 0 || - ent->ent_lba_end < ent->ent_lba_start) - continue; - gptab[part].gp_index = (lba - hdr->hdr_lba_table) * - entries_per_sec + i + 1; - gptab[part].gp_type = ent->ent_type; - gptab[part].gp_start = ent->ent_lba_start; - gptab[part].gp_end = ent->ent_lba_end; - part++; - } - } + if (rc != 0) { + if (od->rcnt < 1) { + if (od->table != NULL) + ptable_close(od->table); + free(od); + } + DEBUG("%s could not open", disk_fmtdev(dev)); + } else { + if ((flags & DISK_F_NOCACHE) == 0) + disk_insert(dev); + /* Save the slice and partition number to the dev */ + dev->d_slice = slice; + dev->d_partition = partition; + DEBUG("%s offset %lld => %p", disk_fmtdev(dev), + dev->d_offset, od); } - - *gptp = gptab; - *ngptp = part; - return (0); + return (rc); } -static struct gpt_part * -disk_bestgpt(struct gpt_part *gpt, int ngpt) +int +disk_close(struct disk_devdesc *dev) { - struct gpt_part *gp, *prefpart; - int i, pref, preflevel; + struct open_disk *od; - prefpart = NULL; - preflevel = PREF_NONE; - - gp = gpt; - for (i = 0; i < ngpt; i++, gp++) { - /* Windows. XXX: Also Linux. */ - if (uuid_equal(&gp->gp_type, &ms_basic_data, NULL)) - pref = PREF_DOS; - /* FreeBSD */ - else if (uuid_equal(&gp->gp_type, &freebsd_ufs, NULL) || - uuid_equal(&gp->gp_type, &freebsd_zfs, NULL)) - pref = PREF_FBSD; - else - pref = PREF_NONE; - if (pref < preflevel) { - preflevel = pref; - prefpart = gp; - } + od = (struct open_disk *)dev->d_opendata; + DEBUG("%s closed => %p [%d]", disk_fmtdev(dev), od, od->rcnt); + if (od->flags & DISK_F_NOCACHE) { + ptable_close(od->table); + free(od); } - return (prefpart); + return (0); } -static int -disk_opengpt(struct disk_devdesc *dev) +void +disk_cleanup(const struct devsw *d_dev) { - struct gpt_part *gpt = NULL, *gp; - int rc, ngpt, i; - - rc = disk_readgpt(dev, &gpt, &ngpt); - if (rc) - return (rc); - - /* Is this a request for the whole disk? */ - if (dev->d_slice < 0) { - dev->d_offset = 0; - rc = 0; - goto out; - } - - /* - * If a partition number was supplied, then the user is trying to use - * an MBR address rather than a GPT address, so fail. - */ - if (dev->d_partition != 0xff) { - rc = ENOENT; - goto out; - } - - /* If a slice number was supplied but not found, this is an error. */ - gp = NULL; - if (dev->d_slice > 0) { - for (i = 0; i < ngpt; i++) { - if (gpt[i].gp_index == dev->d_slice) { - gp = &gpt[i]; - break; - } - } - if (gp == NULL) { - DEBUG("partition %d not found", dev->d_slice); - rc = ENOENT; - goto out; - } - } +#ifdef DISK_DEBUG + struct disk_devdesc dev; +#endif + struct dentry *entry, *tmp; - /* Try to auto-detect the best partition. */ - if (dev->d_slice == 0) { - gp = disk_bestgpt(gpt, ngpt); - if (gp == NULL) { - rc = ENOENT; - goto out; + STAILQ_FOREACH_SAFE(entry, &opened_disks, entry, tmp) { + if (entry->d_dev != d_dev) *** DIFF OUTPUT TRUNCATED AT 1000 LINES ***