From owner-svn-src-all@freebsd.org Mon Feb 6 18:29:44 2017 Return-Path: Delivered-To: svn-src-all@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 902D0CD337B; Mon, 6 Feb 2017 18:29:44 +0000 (UTC) (envelope-from tsoome@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 62A9832F; Mon, 6 Feb 2017 18:29:44 +0000 (UTC) (envelope-from tsoome@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v16IThsU001304; Mon, 6 Feb 2017 18:29:43 GMT (envelope-from tsoome@FreeBSD.org) Received: (from tsoome@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v16ITh1o001303; Mon, 6 Feb 2017 18:29:43 GMT (envelope-from tsoome@FreeBSD.org) Message-Id: <201702061829.v16ITh1o001303@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: tsoome set sender to tsoome@FreeBSD.org using -f From: Toomas Soome Date: Mon, 6 Feb 2017 18:29:43 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: svn commit: r313348 - head/sys/boot/i386/libi386 X-SVN-Group: head MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 06 Feb 2017 18:29:44 -0000 Author: tsoome Date: Mon Feb 6 18:29:43 2017 New Revision: 313348 URL: https://svnweb.freebsd.org/changeset/base/313348 Log: loader: biosdisk fix for 2+TB disks This fix is implementing partition based boundary check for disk IO and updates disk mediasize (if needed), based on information from partition table. As it appeared, the signed int based approach still has corner cases, and the wrapover based behavior is non-standard. The idea for this fix is based on two assumptions: The bug about media size is hitting large (2+TB) disks, lesser disks hopefully, are not affected. Large disks are using GPT (which does include information about disk size). Since our concern is about boot support and boot disks are partitioned, implementing partition boundaries based IO verification should make the media size issues mostly disappear. However, for large disk case, we do have the disk size available from GPT table. If non-GPT cases will appear, we still can make approximate calculation about disk size based on defined partition(s), however, this is not the objective of this patch, and can be added later if there is any need. This patch does implement disk media size adjustment (if needed) in bd_open(), and boundary check in bd_realstrategy(). Reviewed by: allanjude Approved by: allanjude (mentor) Differential Revision: https://reviews.freebsd.org/D8595 Modified: head/sys/boot/i386/libi386/biosdisk.c Modified: head/sys/boot/i386/libi386/biosdisk.c ============================================================================== --- head/sys/boot/i386/libi386/biosdisk.c Mon Feb 6 17:50:09 2017 (r313347) +++ head/sys/boot/i386/libi386/biosdisk.c Mon Feb 6 18:29:43 2017 (r313348) @@ -39,6 +39,7 @@ __FBSDID("$FreeBSD$"); */ #include +#include #include #include #include @@ -302,15 +303,28 @@ bd_int13probe(struct bdinfo *bd) if (!V86_CY(v86.efl)) { uint64_t total; - if (params.sectors != 0) - bd->bd_sectors = params.sectors; + /* + * Sector size must be a multiple of 512 bytes. + * An alternate test would be to check power of 2, + * powerof2(params.sector_size). + */ + if (params.sector_size % BIOSDISK_SECSIZE) + bd->bd_sectorsize = BIOSDISK_SECSIZE; + else + bd->bd_sectorsize = params.sector_size; + + total = bd->bd_sectorsize * params.sectors; + if (params.sectors != 0) { + /* Only update if we did not overflow. */ + if (total > params.sectors) + bd->bd_sectors = params.sectors; + } total = (uint64_t)params.cylinders * params.heads * params.sectors_per_track; if (bd->bd_sectors < total) bd->bd_sectors = total; - bd->bd_sectorsize = params.sector_size; ret = 1; } DEBUG("unit 0x%x flags %x, sectors %llu, sectorsize %u", @@ -377,8 +391,10 @@ static int bd_open(struct open_file *f, ...) { struct disk_devdesc *dev, rdev; + struct disk_devdesc disk; int err, g_err; va_list ap; + uint64_t size; va_start(ap, f); dev = va_arg(ap, struct disk_devdesc *); @@ -389,6 +405,33 @@ bd_open(struct open_file *f, ...) BD(dev).bd_open++; if (BD(dev).bd_bcache == NULL) BD(dev).bd_bcache = bcache_allocate(); + + /* + * Read disk size from partition. + * This is needed to work around buggy BIOS systems returning + * wrong (truncated) disk media size. + * During bd_probe() we tested if the mulitplication of bd_sectors + * would overflow so it should be safe to perform here. + */ + disk.d_dev = dev->d_dev; + disk.d_type = dev->d_type; + disk.d_unit = dev->d_unit; + disk.d_opendata = NULL; + disk.d_slice = -1; + disk.d_partition = -1; + disk.d_offset = 0; + if (disk_open(&disk, BD(dev).bd_sectors * BD(dev).bd_sectorsize, + BD(dev).bd_sectorsize, (BD(dev).bd_flags & BD_FLOPPY) ? + DISK_F_NOCACHE: 0) == 0) { + + if (disk_ioctl(&disk, DIOCGMEDIASIZE, &size) == 0) { + size /= BD(dev).bd_sectorsize; + if (size > BD(dev).bd_sectors) + BD(dev).bd_sectors = size; + } + disk_close(&disk); + } + err = disk_open(dev, BD(dev).bd_sectors * BD(dev).bd_sectorsize, BD(dev).bd_sectorsize, (BD(dev).bd_flags & BD_FLOPPY) ? DISK_F_NOCACHE: 0); @@ -486,8 +529,14 @@ static int bd_ioctl(struct open_file *f, u_long cmd, void *data) { struct disk_devdesc *dev; + int rc; dev = (struct disk_devdesc *)f->f_devdata; + + rc = disk_ioctl(dev, cmd, data); + if (rc != ENOTTY) + return (rc); + switch (cmd) { case DIOCGSECTORSIZE: *(u_int *)data = BD(dev).bd_sectorsize; @@ -521,7 +570,8 @@ bd_realstrategy(void *devdata, int rw, d char *buf, size_t *rsize) { struct disk_devdesc *dev = (struct disk_devdesc *)devdata; - int blks, remaining; + uint64_t disk_blocks; + int blks; #ifdef BD_SUPPORT_FRAGS /* XXX: sector size */ char fragbuf[BIOSDISK_SECSIZE]; size_t fragsize; @@ -533,19 +583,43 @@ bd_realstrategy(void *devdata, int rw, d #endif DEBUG("open_disk %p", dev); + + /* + * Check the value of the size argument. We do have quite small + * heap (64MB), but we do not know good upper limit, so we check against + * INT_MAX here. This will also protect us against possible overflows + * while translating block count to bytes. + */ + if (size > INT_MAX) { + DEBUG("too large read: %zu bytes", size); + return (EIO); + } + blks = size / BD(dev).bd_sectorsize; + if (dblk > dblk + blks) + return (EIO); + if (rsize) *rsize = 0; + /* Get disk blocks, this value is either for whole disk or for partition */ + if (disk_ioctl(dev, DIOCGMEDIASIZE, &disk_blocks)) { + /* DIOCGMEDIASIZE does return bytes. */ + disk_blocks /= BD(dev).bd_sectorsize; + } else { + /* We should not get here. Just try to survive. */ + disk_blocks = BD(dev).bd_sectors - dev->d_offset; + } + + /* Validate source block address. */ + if (dblk < dev->d_offset || dblk >= dev->d_offset + disk_blocks) + return (EIO); + /* - * Perform partial read to prevent read-ahead crossing - * the end of disk - or any 32 bit aliases of the end. - * Signed arithmetic is used to handle wrap-around cases - * like we do for TCP sequence numbers. + * Truncate if we are crossing disk or partition end. */ - remaining = (int)(BD(dev).bd_sectors - dblk); /* truncate */ - if (remaining > 0 && remaining < blks) { - blks = remaining; + if (dblk + blks >= dev->d_offset + disk_blocks) { + blks = dev->d_offset + disk_blocks - dblk; size = blks * BD(dev).bd_sectorsize; DEBUG("short read %d", blks); }