From: Alan Cox
Date: Fri, 27 May 2016 19:15:46 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r300865 - in head/sys: sys vm
Message-Id: <201605271915.u4RJFkEl013139@repo.freebsd.org>

Author: alc
Date: Fri May 27 19:15:45 2016
New Revision: 300865
URL: https://svnweb.freebsd.org/changeset/base/300865

Log:
  The flag "vm_pages_needed" has long served two distinct purposes: (1) to
  indicate that threads are waiting for free pages to become available and
  (2) to indicate whether a wakeup call has been sent to the page daemon.
  The trouble is that a single flag cannot really serve both purposes,
  because we have two distinct targets for when to wakeup threads waiting
  for free pages versus when the page daemon has completed its work.  In
  particular, the flag will be cleared by vm_page_free() before the page
  daemon has met its target, and this can lead to the OOM killer being
  invoked prematurely.  To address this problem, a new flag
  "vm_pageout_wanted" is introduced.

  Discussed with:	jeff
  Reviewed by:	kib, markj
  Tested by:	markj
  Sponsored by:	EMC / Isilon Storage Division

Modified:
  head/sys/sys/vmmeter.h
  head/sys/vm/vm_page.c
  head/sys/vm/vm_pageout.c
  head/sys/vm/vm_pageout.h

Modified: head/sys/sys/vmmeter.h
==============================================================================
--- head/sys/sys/vmmeter.h	Fri May 27 18:52:58 2016	(r300864)
+++ head/sys/sys/vmmeter.h	Fri May 27 19:15:45 2016	(r300865)
@@ -76,7 +76,7 @@ struct vmmeter {
 	u_int v_vnodepgsout;	/* (p) vnode pager pages paged out */
 	u_int v_intrans;	/* (p) intransit blocking page faults */
 	u_int v_reactivated;	/* (f) pages reactivated from free list */
-	u_int v_pdwakeups;	/* (f) times daemon has awaken from sleep */
+	u_int v_pdwakeups;	/* (p) times daemon has awaken from sleep */
 	u_int v_pdpages;	/* (p) pages analyzed by daemon */
 
 	u_int v_tcached;	/* (p) total pages cached */

Modified: head/sys/vm/vm_page.c
==============================================================================
--- head/sys/vm/vm_page.c	Fri May 27 18:52:58 2016	(r300864)
+++ head/sys/vm/vm_page.c	Fri May 27 19:15:45 2016	(r300865)
@@ -2700,10 +2700,11 @@ vm_wait(void)
 		msleep(&vm_pageout_pages_needed, &vm_page_queue_free_mtx,
 		    PDROP | PSWP, "VMWait", 0);
 	} else {
-		if (!vm_pages_needed) {
-			vm_pages_needed = 1;
-			wakeup(&vm_pages_needed);
+		if (!vm_pageout_wanted) {
+			vm_pageout_wanted = true;
+			wakeup(&vm_pageout_wanted);
 		}
+		vm_pages_needed = true;
 		msleep(&vm_cnt.v_free_count, &vm_page_queue_free_mtx,
 		    PDROP | PVM, "vmwait", 0);
 	}
@@ -2724,10 +2725,11 @@ vm_waitpfault(void)
 {
 
 	mtx_lock(&vm_page_queue_free_mtx);
-	if (!vm_pages_needed) {
-		vm_pages_needed = 1;
-		wakeup(&vm_pages_needed);
+	if (!vm_pageout_wanted) {
+		vm_pageout_wanted = true;
+		wakeup(&vm_pageout_wanted);
 	}
+	vm_pages_needed = true;
 	msleep(&vm_cnt.v_free_count, &vm_page_queue_free_mtx,
 	    PDROP | PUSER, "pfault", 0);
 }
@@ -2908,7 +2910,7 @@ vm_page_free_wakeup(void)
 	 * lots of memory. this process will swapin processes.
 	 */
 	if (vm_pages_needed && !vm_page_count_min()) {
-		vm_pages_needed = 0;
+		vm_pages_needed = false;
 		wakeup(&vm_cnt.v_free_count);
 	}
 }

Modified: head/sys/vm/vm_pageout.c
==============================================================================
--- head/sys/vm/vm_pageout.c	Fri May 27 18:52:58 2016	(r300864)
+++ head/sys/vm/vm_pageout.c	Fri May 27 19:15:45 2016	(r300865)
@@ -156,10 +156,11 @@ SYSINIT(vmdaemon, SI_SUB_KTHREAD_VM, SI_
 #endif
 
 
-int vm_pages_needed;	/* Event on which pageout daemon sleeps */
 int vm_pageout_deficit;	/* Estimated number of pages deficit */
 int vm_pageout_wakeup_thresh;
 static int vm_pageout_oom_seq = 12;
+bool vm_pageout_wanted;	/* Event on which pageout daemon sleeps */
+bool vm_pages_needed;	/* Are threads waiting for free pages? */
 
 #if !defined(NO_SWAPPING)
 static int vm_pageout_req_swapout;	/* XXX */
@@ -1550,48 +1551,65 @@ vm_pageout_worker(void *arg)
 	 * The pageout daemon worker is never done, so loop forever.
 	 */
 	while (TRUE) {
+		mtx_lock(&vm_page_queue_free_mtx);
+
 		/*
-		 * If we have enough free memory, wakeup waiters.  Do
-		 * not clear vm_pages_needed until we reach our target,
-		 * otherwise we may be woken up over and over again and
-		 * waste a lot of cpu.
+		 * Generally, after a level >= 1 scan, if there are enough
+		 * free pages to wakeup the waiters, then they are already
+		 * awake.  A call to vm_page_free() during the scan awakened
+		 * them.  However, in the following case, this wakeup serves
+		 * to bound the amount of time that a thread might wait.
+		 * Suppose a thread's call to vm_page_alloc() fails, but
+		 * before that thread calls VM_WAIT, enough pages are freed by
+		 * other threads to alleviate the free page shortage.  The
+		 * thread will, nonetheless, wait until another page is freed
+		 * or this wakeup is performed.
 		 */
-		mtx_lock(&vm_page_queue_free_mtx);
 		if (vm_pages_needed && !vm_page_count_min()) {
-			if (!vm_paging_needed())
-				vm_pages_needed = 0;
+			vm_pages_needed = false;
 			wakeup(&vm_cnt.v_free_count);
 		}
-		if (vm_pages_needed) {
+
+		/*
+		 * Do not clear vm_pageout_wanted until we reach our target.
+		 * Otherwise, we may be awakened over and over again, wasting
+		 * CPU time.
+		 */
+		if (vm_pageout_wanted && !vm_paging_needed())
+			vm_pageout_wanted = false;
+
+		/*
+		 * Might the page daemon receive a wakeup call?
+		 */
+		if (vm_pageout_wanted) {
 			/*
-			 * We're still not done.  Either vm_pages_needed was
-			 * set by another thread during the previous scan
-			 * (typically, this happens during a level 0 scan) or
-			 * vm_pages_needed was already set and the scan failed
-			 * to free enough pages.  If we haven't yet performed
-			 * a level >= 2 scan (unlimited dirty cleaning), then
-			 * upgrade the level and scan again now.  Otherwise,
-			 * sleep a bit and try again later.  While sleeping,
-			 * vm_pages_needed can be cleared.
+			 * No.  Either vm_pageout_wanted was set by another
+			 * thread during the previous scan, which must have
+			 * been a level 0 scan, or vm_pageout_wanted was
+			 * already set and the scan failed to free enough
+			 * pages.  If we haven't yet performed a level >= 2
+			 * scan (unlimited dirty cleaning), then upgrade the
+			 * level and scan again now.  Otherwise, sleep a bit
+			 * and try again later.
 			 */
+			mtx_unlock(&vm_page_queue_free_mtx);
 			if (domain->vmd_pass > 1)
-				msleep(&vm_pages_needed,
-				    &vm_page_queue_free_mtx, PVM, "psleep",
-				    hz / 2);
+				pause("psleep", hz / 2);
+			domain->vmd_pass++;
 		} else {
 			/*
-			 * Good enough, sleep until required to refresh
-			 * stats.
+			 * Yes.  Sleep until pages need to be reclaimed or
+			 * have their reference stats updated.
 			 */
-			msleep(&vm_pages_needed, &vm_page_queue_free_mtx,
-			    PVM, "psleep", hz);
+			if (mtx_sleep(&vm_pageout_wanted,
+			    &vm_page_queue_free_mtx, PDROP | PVM, "psleep",
+			    hz) == 0) {
+				PCPU_INC(cnt.v_pdwakeups);
+				domain->vmd_pass = 1;
+			} else
+				domain->vmd_pass = 0;
 		}
-		if (vm_pages_needed) {
-			vm_cnt.v_pdwakeups++;
-			domain->vmd_pass++;
-		} else
-			domain->vmd_pass = 0;
-		mtx_unlock(&vm_page_queue_free_mtx);
+
 		vm_pageout_scan(domain, domain->vmd_pass);
 	}
 }
@@ -1688,9 +1706,9 @@ void
 pagedaemon_wakeup(void)
 {
 
-	if (!vm_pages_needed && curthread->td_proc != pageproc) {
-		vm_pages_needed = 1;
-		wakeup(&vm_pages_needed);
+	if (!vm_pageout_wanted && curthread->td_proc != pageproc) {
+		vm_pageout_wanted = true;
+		wakeup(&vm_pageout_wanted);
 	}
 }
 

Modified: head/sys/vm/vm_pageout.h
==============================================================================
--- head/sys/vm/vm_pageout.h	Fri May 27 18:52:58 2016	(r300864)
+++ head/sys/vm/vm_pageout.h	Fri May 27 19:15:45 2016	(r300865)
@@ -72,9 +72,10 @@
  */
 
 extern int vm_page_max_wired;
-extern int vm_pages_needed;	/* should be some "event" structure */
 extern int vm_pageout_deficit;
 extern int vm_pageout_page_count;
+extern bool vm_pageout_wanted;
+extern bool vm_pages_needed;
 
 /*
  * Swap out requests
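
Illustration (not part of the commit): the separation of the two flags described in the log can be sketched as a small userspace model. Everything below is a simplified assumption made for illustration only. struct vm_sim, the sim_*() functions, and the FREE_MIN and FREE_TARGET thresholds are hypothetical stand-ins for the kernel's vm_cnt state, vm_page_count_min(), and vm_paging_needed(); locking, sleeping, and wakeup() calls are omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical thresholds, chosen only for this illustration. */
enum { FREE_MIN = 4, FREE_TARGET = 8 };

struct vm_sim {
	int  free_count;	/* stand-in for vm_cnt.v_free_count */
	bool pages_needed;	/* are threads waiting for free pages? */
	bool pageout_wanted;	/* has the page daemon been asked to run? */
};

/* Stand-in for vm_page_count_min(): free pages are critically low. */
bool
sim_count_min(const struct vm_sim *s)
{
	assert(s != NULL);
	return (s->free_count <= FREE_MIN);
}

/* Stand-in for vm_paging_needed(): the daemon's target is not yet met. */
bool
sim_paging_needed(const struct vm_sim *s)
{
	assert(s != NULL);
	return (s->free_count < FREE_TARGET);
}

/* Models the else branch of vm_wait(): request the daemon, note a waiter. */
void
sim_wait(struct vm_sim *s)
{
	if (!s->pageout_wanted)
		s->pageout_wanted = true;	/* kernel also calls wakeup() */
	s->pages_needed = true;
}

/* Models vm_page_free_wakeup(): only the waiter flag is cleared here. */
void
sim_page_free(struct vm_sim *s)
{
	s->free_count++;
	if (s->pages_needed && !sim_count_min(s))
		s->pages_needed = false;	/* kernel also wakes waiters */
}

/* Models the checks at the top of the vm_pageout_worker() loop. */
void
sim_daemon_check(struct vm_sim *s)
{
	if (s->pages_needed && !sim_count_min(s))
		s->pages_needed = false;
	if (s->pageout_wanted && !sim_paging_needed(s))
		s->pageout_wanted = false;
}
```

In this model, a single sim_page_free() that lifts free_count above FREE_MIN clears pages_needed, while pageout_wanted stays set until sim_daemon_check() sees free_count reach FREE_TARGET. With one shared flag, the first event would have erased the daemon's request as well, which is the premature clearing the log describes.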