Subject: Re: Is kern.sched.preempt_thresh=0 a sensible default?
From: Stefan Esser
To: Andriy Gapon, "M. Warner Losh"
Cc: FreeBSD Current
Date: Thu, 5 Apr 2018 14:31:24 +0200
Message-ID: <93efc3e1-7ac3-fedc-a71e-66c99f8e8c1e@freebsd.org>
In-Reply-To: <32d6305b-3d57-4d37-ba1b-51631e994520@FreeBSD.org>
References: <1d188cb0-ebc8-075f-ed51-57641ede1fd6@freebsd.org>
 <49fa8de4-e164-0642-4e01-a6188992c32e@freebsd.org>
 <32d6305b-3d57-4d37-ba1b-51631e994520@FreeBSD.org>
List-Id: Discussions about the use of FreeBSD-current

On 04.04.18 at 18:45, Andriy Gapon wrote:
> On 04/04/2018 16:19, Stefan Esser wrote:
>> I have identified the cause of the extremely low I/O performance (2 to 6 read
>> operations scheduled per second).
>>
>> The default value of kern.sched.preempt_thresh=0 does not give any CPU to the
>> I/O bound process unless a (long) time slice expires (kern.sched.quantum=94488
>> on my system with HZ=1000) or one of the CPU bound processes voluntarily gives
>> up the CPU (or exits).
>>
>> Any non-zero value of preempt_thresh lets the system perform I/O in parallel
>> with the CPU bound processes, again.
>
> Let me guess... you have a custom kernel configuration and, unlike GENERIC
> (assuming x86), it does not have 'options PREEMPTION'?

Yes, thank you for pointing that out!!!

I used to have PREEMPTION and FULL_PREEMPTION in my kernel configuration,
and apparently deleted both options when only FULL_PREEMPTION was supposed
to go ...

After looking at sched_ule.c and top/machine.c, it appears that the value
of preempt_thresh corresponds to the PRI value as shown by top (or ps -l)
plus PZERO, which is calculated as (PRI_MIN_KERN=80) + 20.
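
To make that correspondence concrete, here is a minimal sketch of the arithmetic in Python. It is not code from the kernel or from top: the constants are the ones quoted in this message, the helper names top_pri_to_kernel and kernel_to_top_pri are made up for illustration, and the sample PRI values are the ones reported in this message.

```python
# Constants as quoted in this message (PRI_MIN_KERN=80, PZERO=PRI_MIN_KERN+20).
PRI_MIN_KERN = 80
PZERO = PRI_MIN_KERN + 20  # 100


def top_pri_to_kernel(pri_shown: int) -> int:
    """Hypothetical helper: PRI as shown by top or ps -l -> kernel priority."""
    return pri_shown + PZERO


def kernel_to_top_pri(kernel_pri: int) -> int:
    """Hypothetical helper: kernel priority (the scale used by
    preempt_thresh) -> PRI as shown by top or ps -l."""
    return kernel_pri - PZERO


if __name__ == "__main__":
    samples = [
        ("interactive", 20),
        ("niced interactive", 52),
        ("batch, low end", 80),
        ("batch, high end", 100),
    ]
    for label, pri in samples:
        print(f"{label:>18}: top PRI {pri:3d} -> kernel priority "
              f"{top_pri_to_kernel(pri)}")
```

Read the other way round, a threshold of 124 on the kernel scale would correspond to a displayed PRI of kernel_to_top_pri(124), i.e. 24.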

What I do not understand, though, is that the decision about a preemption
is based only on the calculated new priority of the thread, and not at all
on the priority of the other running threads (except the idle thread).

On my system, a "real" batch job (i.e. one that does not voluntarily give
up the CPU due to I/O) seems to have a PRI value of 80 to 100 (growing over
time), while an interactive process has a PRI of 20 and a maximally "niced"
interactive process has 52.

So, I'd expect a reasonable default value of preempt_thresh to be slightly
above 120 (e.g. 124), to prevent I/O-heavy threads from stealing the CPU
from each other too often, and to prevent "niced" processes from doing the
same ...

The two values configured into the kernel (80 for PREEMPTION and 255 for
FULL_PREEMPTION) seem to be extremes, but something in between (e.g. 124)
is not offered. Such a value can only be configured via sysctl, and the
only place I've found that documents the correspondence between the
threshold value and the PRI value is the kernel sources.

Is PRI_MIN_KERN=80 really a good default value for the preemption threshold?

Regards, STefan