From owner-svn-src-all@freebsd.org Sun Feb 12 21:05:45 2017 Return-Path: Delivered-To: svn-src-all@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id F315CCDCA38; Sun, 12 Feb 2017 21:05:45 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id B233213B7; Sun, 12 Feb 2017 21:05:45 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v1CL5i6B057662; Sun, 12 Feb 2017 21:05:44 GMT (envelope-from kib@FreeBSD.org) Received: (from kib@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v1CL5iV4057656; Sun, 12 Feb 2017 21:05:44 GMT (envelope-from kib@FreeBSD.org) Message-Id: <201702122105.v1CL5iV4057656@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: kib set sender to kib@FreeBSD.org using -f From: Konstantin Belousov Date: Sun, 12 Feb 2017 21:05:44 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: svn commit: r313690 - in head: lib/libc/sys sys/kern sys/vm X-SVN-Group: head MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 12 Feb 2017 21:05:46 -0000 Author: kib Date: Sun Feb 12 21:05:44 2017 New Revision: 313690 URL: https://svnweb.freebsd.org/changeset/base/313690 Log: Consistently handle negative or wrapping offsets in the mmap(2) syscalls. For regular files and posix shared memory, POSIX requires that [offset, offset + size) range is legitimate. At the maping time, check that offset is not negative. Allowing negative offsets might expose the data that filesystem put into vm_object for internal use, esp. due to OFF_TO_IDX() signess treatment. Fault handler verifies that the mapped range is valid, assuming that mmap(2) checked that arithmetic gives no undefined results. For device mappings, leave the semantic of negative offsets to the driver. Correct object page index calculation to not erronously propagate sign. In either case, disallow overflow of offset + size. Update mmap(2) man page to explain the requirement of the range validity, and behaviour when the range becomes invalid after mapping. Reported and tested by: royger (previous version) Reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Modified: head/lib/libc/sys/mmap.2 head/sys/kern/uipc_shm.c head/sys/kern/vfs_vnops.c head/sys/vm/device_pager.c head/sys/vm/sg_pager.c head/sys/vm/vm_object.h Modified: head/lib/libc/sys/mmap.2 ============================================================================== --- head/lib/libc/sys/mmap.2 Sun Feb 12 21:00:12 2017 (r313689) +++ head/lib/libc/sys/mmap.2 Sun Feb 12 21:05:44 2017 (r313690) @@ -28,7 +28,7 @@ .\" @(#)mmap.2 8.4 (Berkeley) 5/11/95 .\" $FreeBSD$ .\" -.Dd November 25, 2016 +.Dd February 4, 2017 .Dt MMAP 2 .Os .Sh NAME @@ -53,11 +53,37 @@ starting at byte offset .Fa offset . If .Fa len -is not a multiple of the pagesize, the mapped region may extend past the +is not a multiple of the page size, the mapped region may extend past the specified range. Any such extension beyond the end of the mapped object will be zero-filled. .Pp If +.Fa fd +references a regular file or a shared memory object, the range of +bytes starting at +.Fa offset +and continuing for +.Fa len +bytes must be legitimate for the possible (not necessarily +current) offsets in the object. +In particular, the +.Fa offset +value cannot be negative. +If the object is truncated and the process later accesses a pages that +is wholly within the truncated region, the access is aborted and a +.Dv SIGBUS +signal is delivered to the process. +.Pp +If +.Fa fd +references a device file, the interpretation of the +.Fa offset +value is device specific and defined by the device driver. +The virtual memory subsystem does not impose any restrictitions on the +.Fa offset +value in this case, passing it unchanged to the driver. +.Pp +If .Fa addr is non-zero, it is used as a hint to the system. (As a convenience to the system, the actual address of the region may differ @@ -157,7 +183,7 @@ If .Dv MAP_FIXED is specified, .Fa addr -must be a multiple of the pagesize. +must be a multiple of the page size. If .Dv MAP_EXCL is not specified, a successful @@ -361,6 +387,12 @@ The argument is not a valid open file descriptor. .It Bq Er EINVAL +An invalid (negative) value was passed in the +.Fa offset +argument, when +.Fa fd +referenced a regular file or shared memory. +.It Bq Er EINVAL An invalid value was passed in the .Fa prot argument. Modified: head/sys/kern/uipc_shm.c ============================================================================== --- head/sys/kern/uipc_shm.c Sun Feb 12 21:00:12 2017 (r313689) +++ head/sys/kern/uipc_shm.c Sun Feb 12 21:05:44 2017 (r313690) @@ -891,20 +891,20 @@ shm_mmap(struct file *fp, vm_map_t map, return (EACCES); maxprot &= cap_maxprot; + /* See comment in vn_mmap(). */ + if ( +#ifdef _LP64 + objsize > OFF_MAX || +#endif + foff < 0 || foff > OFF_MAX - objsize) + return (EINVAL); + #ifdef MAC error = mac_posixshm_check_mmap(td->td_ucred, shmfd, prot, flags); if (error != 0) return (error); #endif - /* - * XXXRW: This validation is probably insufficient, and subject to - * sign errors. It should be fixed. - */ - if (foff >= shmfd->shm_size || - foff + objsize > round_page(shmfd->shm_size)) - return (EINVAL); - mtx_lock(&shm_timestamp_lock); vfs_timestamp(&shmfd->shm_atime); mtx_unlock(&shm_timestamp_lock); Modified: head/sys/kern/vfs_vnops.c ============================================================================== --- head/sys/kern/vfs_vnops.c Sun Feb 12 21:00:12 2017 (r313689) +++ head/sys/kern/vfs_vnops.c Sun Feb 12 21:05:44 2017 (r313690) @@ -2455,6 +2455,24 @@ vn_mmap(struct file *fp, vm_map_t map, v } maxprot &= cap_maxprot; + /* + * For regular files and shared memory, POSIX requires that + * the value of foff be a legitimate offset within the data + * object. In particular, negative offsets are invalid. + * Blocking negative offsets and overflows here avoids + * possible wraparound or user-level access into reserved + * ranges of the data object later. In contrast, POSIX does + * not dictate how offsets are used by device drivers, so in + * the case of a device mapping a negative offset is passed + * on. + */ + if ( +#ifdef _LP64 + size > OFF_MAX || +#endif + foff < 0 || foff > OFF_MAX - size) + return (EINVAL); + writecounted = FALSE; error = vm_mmap_vnode(td, size, prot, &maxprot, &flags, vp, &foff, &object, &writecounted); Modified: head/sys/vm/device_pager.c ============================================================================== --- head/sys/vm/device_pager.c Sun Feb 12 21:00:12 2017 (r313689) +++ head/sys/vm/device_pager.c Sun Feb 12 21:05:44 2017 (r313690) @@ -139,8 +139,18 @@ cdev_pager_allocate(void *handle, enum o if (foff & PAGE_MASK) return (NULL); + /* + * Treat the mmap(2) file offset as an unsigned value for a + * device mapping. This, in effect, allows a user to pass all + * possible off_t values as the mapping cookie to the driver. At + * this point, we know that both foff and size are a multiple + * of the page size. Do a check to avoid wrap. + */ size = round_page(size); - pindex = OFF_TO_IDX(foff + size); + pindex = UOFF_TO_IDX(foff) + UOFF_TO_IDX(size); + if (pindex > OBJ_MAX_SIZE || pindex < UOFF_TO_IDX(foff) || + pindex < UOFF_TO_IDX(size)) + return (NULL); if (ops->cdev_pg_ctor(handle, size, prot, foff, cred, &color) != 0) return (NULL); Modified: head/sys/vm/sg_pager.c ============================================================================== --- head/sys/vm/sg_pager.c Sun Feb 12 21:00:12 2017 (r313689) +++ head/sys/vm/sg_pager.c Sun Feb 12 21:05:44 2017 (r313690) @@ -96,8 +96,9 @@ sg_pager_alloc(void *handle, vm_ooffset_ * to map beyond that. */ size = round_page(size); - pindex = OFF_TO_IDX(foff + size); - if (pindex > npages) + pindex = UOFF_TO_IDX(foff) + UOFF_TO_IDX(size); + if (pindex > npages || pindex < UOFF_TO_IDX(foff) || + pindex < UOFF_TO_IDX(size)) return (NULL); /* Modified: head/sys/vm/vm_object.h ============================================================================== --- head/sys/vm/vm_object.h Sun Feb 12 21:00:12 2017 (r313689) +++ head/sys/vm/vm_object.h Sun Feb 12 21:05:44 2017 (r313690) @@ -183,8 +183,23 @@ struct vm_object { #define OBJ_DISCONNECTWNT 0x4000 /* disconnect from vnode wanted */ #define OBJ_TMPFS 0x8000 /* has tmpfs vnode allocated */ +/* + * Helpers to perform conversion between vm_object page indexes and offsets. + * IDX_TO_OFF() converts an index into an offset. + * OFF_TO_IDX() converts an offset into an index. Since offsets are signed + * by default, the sign propagation in OFF_TO_IDX(), when applied to + * negative offsets, is intentional and returns a vm_object page index + * that cannot be created by a userspace mapping. + * UOFF_TO_IDX() treats the offset as an unsigned value and converts it + * into an index accordingly. Use it only when the full range of offset + * values are allowed. Currently, this only applies to device mappings. + * OBJ_MAX_SIZE specifies the maximum page index corresponding to the + * maximum unsigned offset. + */ #define IDX_TO_OFF(idx) (((vm_ooffset_t)(idx)) << PAGE_SHIFT) #define OFF_TO_IDX(off) ((vm_pindex_t)(((vm_ooffset_t)(off)) >> PAGE_SHIFT)) +#define UOFF_TO_IDX(off) (((vm_pindex_t)(off)) >> PAGE_SHIFT) +#define OBJ_MAX_SIZE (UOFF_TO_IDX(UINT64_MAX) + 1) #ifdef _KERNEL