From owner-svn-src-head@freebsd.org Sun Dec 24 19:45:17 2017 Return-Path: Delivered-To: svn-src-head@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8A8F2EA6DF8; Sun, 24 Dec 2017 19:45:17 +0000 (UTC) (envelope-from markj@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4850567157; Sun, 24 Dec 2017 19:45:17 +0000 (UTC) (envelope-from markj@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id vBOJjGHI051699; Sun, 24 Dec 2017 19:45:16 GMT (envelope-from markj@FreeBSD.org) Received: (from markj@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id vBOJjGV5051696; Sun, 24 Dec 2017 19:45:16 GMT (envelope-from markj@FreeBSD.org) Message-Id: <201712241945.vBOJjGV5051696@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: markj set sender to markj@FreeBSD.org using -f From: Mark Johnston Date: Sun, 24 Dec 2017 19:45:16 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: svn commit: r327168 - head/sys/vm X-SVN-Group: head X-SVN-Commit-Author: markj X-SVN-Commit-Paths: head/sys/vm X-SVN-Commit-Revision: 327168 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 24 Dec 2017 19:45:17 -0000 Author: markj Date: Sun Dec 24 19:45:16 2017 New Revision: 327168 URL: https://svnweb.freebsd.org/changeset/base/327168 Log: Fix two problems with the page daemon control loop. Both issues caused the page daemon to erroneously go to sleep when applications are consuming free pages at a high rate, leaving the application threads blocked in VM_WAIT. 1) After completing an inactive queue scan, concurrent allocations may have prevented the page daemon from meeting the v_free_min threshold. In this case, the page daemon was going to sleep even when the inactive queue contained plenty of clean pages. 2) pagedaemon_wakeup() may be called without the free queues lock held. This can lead to a lost wakeup if a call occurs after the page daemon clears vm_pageout_wanted but before going to sleep. Fix 1) by ensuring that we start a new inactive queue scan immediately if v_free_count < v_free_min after a prior scan. Fix 2) by adding a new subroutine, pagedaemon_wait(), called from vm_wait() and vm_waitpfault(). It wakes up the page daemon if either vm_pages_needed or vm_pageout_wanted is false, and atomically sleeps on v_free_count. Reported by: jeff Reviewed by: alc MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D13424 Modified: head/sys/vm/vm_page.c head/sys/vm/vm_pageout.c head/sys/vm/vm_pageout.h Modified: head/sys/vm/vm_page.c ============================================================================== --- head/sys/vm/vm_page.c Sun Dec 24 19:17:15 2017 (r327167) +++ head/sys/vm/vm_page.c Sun Dec 24 19:45:16 2017 (r327168) @@ -2661,15 +2661,9 @@ _vm_wait(void) msleep(&vm_pageout_pages_needed, &vm_page_queue_free_mtx, PDROP | PSWP, "VMWait", 0); } else { - if (__predict_false(pageproc == NULL)) + if (pageproc == NULL) panic("vm_wait in early boot"); - if (!vm_pageout_wanted) { - vm_pageout_wanted = true; - wakeup(&vm_pageout_wanted); - } - vm_pages_needed = true; - msleep(&vm_cnt.v_free_count, &vm_page_queue_free_mtx, PDROP | PVM, - "vmwait", 0); + pagedaemon_wait(PVM, "vmwait"); } } @@ -2699,7 +2693,6 @@ vm_page_alloc_fail(vm_object_t object, int req) atomic_add_int(&vm_pageout_deficit, max((u_int)req >> VM_ALLOC_COUNT_SHIFT, 1)); - pagedaemon_wakeup(); if (req & (VM_ALLOC_WAITOK | VM_ALLOC_WAITFAIL)) { if (object != NULL) VM_OBJECT_WUNLOCK(object); @@ -2708,8 +2701,10 @@ vm_page_alloc_fail(vm_object_t object, int req) VM_OBJECT_WLOCK(object); if (req & VM_ALLOC_WAITOK) return (EAGAIN); - } else + } else { mtx_unlock(&vm_page_queue_free_mtx); + pagedaemon_wakeup(); + } return (0); } @@ -2728,13 +2723,7 @@ vm_waitpfault(void) { mtx_lock(&vm_page_queue_free_mtx); - if (!vm_pageout_wanted) { - vm_pageout_wanted = true; - wakeup(&vm_pageout_wanted); - } - vm_pages_needed = true; - msleep(&vm_cnt.v_free_count, &vm_page_queue_free_mtx, PDROP | PUSER, - "pfault", 0); + pagedaemon_wait(PUSER, "pfault"); } struct vm_pagequeue * Modified: head/sys/vm/vm_pageout.c ============================================================================== --- head/sys/vm/vm_pageout.c Sun Dec 24 19:17:15 2017 (r327167) +++ head/sys/vm/vm_pageout.c Sun Dec 24 19:45:16 2017 (r327168) @@ -1829,10 +1829,14 @@ vm_pageout_worker(void *arg) pass++; } else { /* - * Yes. Sleep until pages need to be reclaimed or + * Yes. If threads are still sleeping in VM_WAIT + * then we immediately start a new scan. Otherwise, + * sleep until the next wakeup or until pages need to * have their reference stats updated. */ - if (mtx_sleep(&vm_pageout_wanted, + if (vm_pages_needed) { + mtx_unlock(&vm_page_queue_free_mtx); + } else if (mtx_sleep(&vm_pageout_wanted, &vm_page_queue_free_mtx, PDROP | PVM, "psleep", hz) == 0) { VM_CNT_INC(v_pdwakeups); @@ -1940,17 +1944,42 @@ vm_pageout(void) } /* - * Unless the free page queue lock is held by the caller, this function - * should be regarded as advisory. Specifically, the caller should - * not msleep() on &vm_cnt.v_free_count following this function unless - * the free page queue lock is held until the msleep() is performed. + * Perform an advisory wakeup of the page daemon. */ void pagedaemon_wakeup(void) { + mtx_assert(&vm_page_queue_free_mtx, MA_NOTOWNED); + if (!vm_pageout_wanted && curthread->td_proc != pageproc) { vm_pageout_wanted = true; wakeup(&vm_pageout_wanted); } +} + +/* + * Wake up the page daemon and wait for it to reclaim free pages. + * + * This function returns with the free queues mutex unlocked. + */ +void +pagedaemon_wait(int pri, const char *wmesg) +{ + + mtx_assert(&vm_page_queue_free_mtx, MA_OWNED); + + /* + * vm_pageout_wanted may have been set by an advisory wakeup, but if the + * page daemon is running on a CPU, the wakeup will have been lost. + * Thus, deliver a potentially spurious wakeup to ensure that the page + * daemon has been notified of the shortage. + */ + if (!vm_pageout_wanted || !vm_pages_needed) { + vm_pageout_wanted = true; + wakeup(&vm_pageout_wanted); + } + vm_pages_needed = true; + msleep(&vm_cnt.v_free_count, &vm_page_queue_free_mtx, PDROP | pri, + wmesg, 0); } Modified: head/sys/vm/vm_pageout.h ============================================================================== --- head/sys/vm/vm_pageout.h Sun Dec 24 19:17:15 2017 (r327167) +++ head/sys/vm/vm_pageout.h Sun Dec 24 19:45:16 2017 (r327168) @@ -96,11 +96,12 @@ extern bool vm_pages_needed; * Signal pageout-daemon and wait for it. */ -extern void pagedaemon_wakeup(void); +void pagedaemon_wait(int pri, const char *wmesg); +void pagedaemon_wakeup(void); #define VM_WAIT vm_wait() #define VM_WAITPFAULT vm_waitpfault() -extern void vm_wait(void); -extern void vm_waitpfault(void); +void vm_wait(void); +void vm_waitpfault(void); #ifdef _KERNEL int vm_pageout_flush(vm_page_t *, int, int, int, int *, boolean_t *);