From owner-svn-src-all@freebsd.org Thu Jun 25 20:30:31 2020
Message-Id: <202006252030.05PKUUnB009382@repo.freebsd.org>
From: Mark Johnston
Date: Thu, 25 Jun 2020 20:30:30 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r362631 - head/sys/compat/linux
X-SVN-Group: head
X-SVN-Commit-Author: markj
X-SVN-Commit-Paths: head/sys/compat/linux
X-SVN-Commit-Revision: 362631
X-SVN-Commit-Repository: base
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Author: markj
Date: Thu Jun 25 20:30:30 2020
New Revision: 362631
URL: https://svnweb.freebsd.org/changeset/base/362631

Log:
  Implement an approximation of Linux MADV_DONTNEED semantics.

  Linux MADV_DONTNEED is not advisory: it has side effects for anonymous
  memory, and some system software depends on that. In particular,
  MADV_DONTNEED causes anonymous pages to be discarded. If the mapping is
  a private mapping of a named object then subsequent faults are to
  repopulate the range from that object, otherwise pages will be
  zero-filled. For mappings of non-anonymous objects, Linux MADV_DONTNEED
  can be implemented in the same way as our MADV_DONTNEED.

  This implementation differs from Linux semantics in its handling of
  private mappings, inherited through fork(), of non-anonymous objects.
  After applying MADV_DONTNEED, subsequent faults will repopulate the
  mapping from the parent object rather than the root of the shadow chain.
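As an illustration of the behaviour the log describes, here is a minimal userspace sketch, not part of the commit. It assumes a Linux kernel, or a Linux binary run under an emulation layer that implements these semantics. The helper name dontneed_discards_anon() is hypothetical. It dirties one private anonymous page, applies MADV_DONTNEED, and reports whether the page reads back as zeros.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if a private anonymous page reads back
 * as zero after madvise(MADV_DONTNEED), 0 if the old contents survived,
 * and -1 if mmap() or madvise() failed.
 */
static int
dontneed_discards_anon(void)
{
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	unsigned char *p;
	int zeroed;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
	    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return (-1);

	/* Dirty the page so there is something to discard. */
	memset(p, 0xa5, len);

	if (madvise(p, len, MADV_DONTNEED) != 0) {
		munmap(p, len);
		return (-1);
	}

	/* With Linux semantics the next access faults in a zero-filled page. */
	zeroed = (p[0] == 0 && p[len - 1] == 0);
	munmap(p, len);
	return (zeroed);
}
```

A caller would check for a return value of 1. A system that treats MADV_DONTNEED as purely advisory may return 0, which is the incompatibility this change addresses for Linux binaries.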
  PR:		230160
  Reviewed by:	alc, kib
  Sponsored by:	The FreeBSD Foundation
  MFC after:	2 weeks
  Differential Revision:	https://reviews.freebsd.org/D25330

Modified:
  head/sys/compat/linux/linux_mmap.c

Modified: head/sys/compat/linux/linux_mmap.c
==============================================================================
--- head/sys/compat/linux/linux_mmap.c	Thu Jun 25 20:29:29 2020	(r362630)
+++ head/sys/compat/linux/linux_mmap.c	Thu Jun 25 20:30:30 2020	(r362631)
@@ -38,9 +38,11 @@ __FBSDID("$FreeBSD$");
 #include
 #include
 #include
+#include
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -48,6 +50,7 @@ __FBSDID("$FreeBSD$");
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -242,6 +245,98 @@ linux_mprotect_common(struct thread *td, uintptr_t add
 	return (kern_mprotect(td, addr, len, prot));
 }
 
+/*
+ * Implement Linux madvise(MADV_DONTNEED), which has unusual semantics: for
+ * anonymous memory, pages in the range are immediately discarded.
+ */
+static int
+linux_madvise_dontneed(struct thread *td, vm_offset_t start, vm_offset_t end)
+{
+	vm_map_t map;
+	vm_map_entry_t entry;
+	vm_object_t backing_object, object;
+	vm_offset_t estart, eend;
+	vm_pindex_t pstart, pend;
+	int error;
+
+	map = &td->td_proc->p_vmspace->vm_map;
+
+	if (!vm_map_range_valid(map, start, end))
+		return (EINVAL);
+	start = trunc_page(start);
+	end = round_page(end);
+
+	error = 0;
+	vm_map_lock_read(map);
+	if (!vm_map_lookup_entry(map, start, &entry))
+		entry = vm_map_entry_succ(entry);
+	for (; entry->start < end; entry = vm_map_entry_succ(entry)) {
+		if ((entry->eflags & MAP_ENTRY_IS_SUB_MAP) != 0)
+			continue;
+
+		if (entry->wired_count != 0) {
+			error = EINVAL;
+			break;
+		}
+
+		object = entry->object.vm_object;
+		if (object == NULL)
+			continue;
+
+		pstart = OFF_TO_IDX(entry->offset);
+		if (start > entry->start) {
+			pstart += atop(start - entry->start);
+			estart = start;
+		} else {
+			estart = entry->start;
+		}
+		pend = OFF_TO_IDX(entry->offset) +
+		    atop(entry->end - entry->start);
+		if (entry->end > end) {
+			pend -= atop(entry->end - end);
+			eend = end;
+		} else {
+			eend = entry->end;
+		}
+
+		if ((object->flags & (OBJ_ANON | OBJ_ONEMAPPING)) ==
+		    (OBJ_ANON | OBJ_ONEMAPPING)) {
+			/*
+			 * Singly-mapped anonymous memory is discarded. This
+			 * does not match Linux's semantics when the object
+			 * belongs to a shadow chain of length > 1, since
+			 * subsequent faults may retrieve pages from an
+			 * intermediate anonymous object. However, handling
+			 * this case correctly introduces a fair bit of
+			 * complexity.
+			 */
+			VM_OBJECT_WLOCK(object);
+			if ((object->flags & OBJ_ONEMAPPING) != 0) {
+				vm_object_collapse(object);
+				vm_object_page_remove(object, pstart, pend, 0);
+				backing_object = object->backing_object;
+				if (backing_object != NULL &&
+				    (backing_object->flags & OBJ_ANON) != 0)
+					linux_msg(td,
+					    "possibly incorrect MADV_DONTNEED");
+				VM_OBJECT_WUNLOCK(object);
+				continue;
+			}
+			VM_OBJECT_WUNLOCK(object);
+		}
+
+		/*
+		 * Handle shared mappings. Remove them outright instead of
+		 * calling pmap_advise(), for consistency with Linux.
+		 */
+		pmap_remove(map->pmap, estart, eend);
+		vm_object_madvise(object, pstart, pend, MADV_DONTNEED);
+	}
+	vm_map_unlock_read(map);
+
+	return (error);
+}
+
 int
 linux_madvise_common(struct thread *td, uintptr_t addr, size_t len, int behav)
 {
@@ -256,7 +351,7 @@ linux_madvise_common(struct thread *td, uintptr_t addr
 	case LINUX_MADV_WILLNEED:
 		return (kern_madvise(td, addr, len, MADV_WILLNEED));
 	case LINUX_MADV_DONTNEED:
-		return (kern_madvise(td, addr, len, MADV_DONTNEED));
+		return (linux_madvise_dontneed(td, addr, addr + len));
 	case LINUX_MADV_FREE:
 		return (kern_madvise(td, addr, len, MADV_FREE));
 	case LINUX_MADV_REMOVE:
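
For readers following the pstart/pend/estart/eend arithmetic in linux_madvise_dontneed(), the clamping step can be modelled in userspace roughly as below. This is a sketch under stated assumptions, not kernel code. The names struct sketch_entry, struct sketch_range, sketch_clamp() and SKETCH_PAGE_SHIFT are hypothetical. It assumes 4 KiB pages, a page-aligned request and a page-aligned object offset, and a request that overlaps the entry. The shifts stand in for atop() and OFF_TO_IDX().

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for the fields of a map entry used by the loop. */
struct sketch_entry {
	uint64_t start;		/* first virtual address covered */
	uint64_t end;		/* one past the last address covered */
	uint64_t offset;	/* byte offset of 'start' in the backing object */
};

struct sketch_range {
	uint64_t estart, eend;	/* request clamped to the entry (addresses) */
	uint64_t pstart, pend;	/* the same range as object page indices */
};

/*
 * Clamp the page-aligned request [start, end) to the entry and translate
 * the result into page indices within the backing object.
 */
static struct sketch_range
sketch_clamp(const struct sketch_entry *e, uint64_t start, uint64_t end)
{
	struct sketch_range r;

	r.pstart = e->offset >> SKETCH_PAGE_SHIFT;
	if (start > e->start) {
		/* Request begins inside the entry: skip the leading pages. */
		r.pstart += (start - e->start) >> SKETCH_PAGE_SHIFT;
		r.estart = start;
	} else {
		r.estart = e->start;
	}

	r.pend = (e->offset >> SKETCH_PAGE_SHIFT) +
	    ((e->end - e->start) >> SKETCH_PAGE_SHIFT);
	if (e->end > end) {
		/* Request ends inside the entry: drop the trailing pages. */
		r.pend -= (e->end - end) >> SKETCH_PAGE_SHIFT;
		r.eend = end;
	} else {
		r.eend = e->end;
	}
	return (r);
}
```

For example, take an entry covering [0x10000, 0x20000) at object offset 0x4000. A request for [0x12000, 0x15000) yields estart = 0x12000, eend = 0x15000, pstart = 6 and pend = 9. The three pages in the address range therefore correspond to three page indices in the object.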