From owner-freebsd-current@freebsd.org Thu May 3 09:41:06 2018
From: Andriy Gapon
Subject: Is kern.sched.preempt_thresh=0 a sensible default?
To: Stefan Esser, "M. Warner Losh"
Cc: FreeBSD Current
References: <1d188cb0-ebc8-075f-ed51-57641ede1fd6@freebsd.org>
 <49fa8de4-e164-0642-4e01-a6188992c32e@freebsd.org>
 <32d6305b-3d57-4d37-ba1b-51631e994520@FreeBSD.org>
 <93efc3e1-7ac3-fedc-a71e-66c99f8e8c1e@freebsd.org>
Message-ID: <9aaec961-e604-303a-52f3-ee24e3a435d0@FreeBSD.org>
Date: Thu, 3 May 2018 12:41:01 +0300
In-Reply-To:
 <93efc3e1-7ac3-fedc-a71e-66c99f8e8c1e@freebsd.org>

On 05/04/2018 15:31, Stefan Esser wrote:
> After looking at sched_ule.c and top/machine.c it appears, that the value
> of preempt_thresh corresponds to the PRI value as shown by top (or ps -l)
> plus PZERO which is calculated as (PRI_MIN_KERN=80) + 20.

The kernel defines priorities from zero to 255.
top shows the same priorities with 100 subtracted.
At least that's how I look at it.  I think we said the same thing, just in
different words.

> What I do not understand, though, is that the decision about a preemption
> is only based on the calculated new priority of the thread, but not at all
> on the priority of other running threads (except the idle thread).

I don't understand this statement.
The next thread to run is picked based on the priorities of all runnable
threads.
The preemption decision does take into account the priority of the currently
running thread as well as that of the new thread.

> On my system, a "real" batch job (i.e. one that does not voluntarily give
> up the CPU due to I/O) seems to have a PRI value of 80 to 100 (growing
> over time), while an interactive process has a PRI of 20, a maximally
> "niced" interactive process has 52.
>
> So, I'd expect a reasonable default value of preempt_thresh to be slightly
> above 120 (e.g. 124) to prevent I/O heavy threads from stealing each other
> the CPU too often, and to prevent "niced" processes from doing the same ...
>
> The two values configured into the kernel (80 for PREEMPTION and 255 for
> FULL_PREEMPTION) seem to be extremes, but something in between (e.g. 124)
> is not offered (can only be configured via sysctl without any information
> for the correspondence between the threshold value and the PRI value in
> any document I've found, besides the kernel sources ...).
>
>
> Is PRI_MIN_KERN=80 really a good default value for the preemption threshold?

Yeah, a good question...
I am not really sure about this.  In my opinion it would be better to set
preempt_thresh to at least PRI_MAX_KERN, so that all threads running in the
kernel are allowed to preempt userland threads.  But that would also allow
kernel threads (with priorities between PRI_MIN_KERN and PRI_MAX_KERN) to
preempt other kernel threads, and I am not sure that's always okay.  The same
argument applies to higher values of preempt_thresh as well.

Perhaps a single preempt_thresh is not expressive enough?

Just a thought... maybe we need two thresholds, where one says that threads
with better priority are potentially allowed to preempt other threads and the
other says that threads with worse priority can be preempted.

For example:
- may_preempt_prio=PRI_MAX_INTERACT
- may_be_preempted_prio=PRI_MIN_BATCH

This says that realtime, kernel and interactive threads are allowed to preempt
other threads if other conditions are met, and that only batch and idle
threads can actually be preempted.

Probably even the above is not flexible enough.
I think that we need preemption policies that might not be expressible as one
or two numbers.  A policy could be something like this:
- interrupt threads can preempt only threads from "lower" classes: real-time,
  kernel, timeshare, idle;
- interrupt threads cannot preempt other interrupt threads
- real-time threads can preempt other real-time threads and threads from
  "lower" classes: kernel, timeshare, idle
- kernel threads can preempt only threads from lower classes: timeshare, idle
- interactive timeshare threads can only preempt batch and idle threads
- batch threads can only preempt idle threads

-- 
Andriy Gapon
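
To make the two ideas above concrete, here is a rough userland sketch in C of
both the two-threshold check and the class-based policy.  It is not code from
sched_ule.c and not a proposed patch.  Only PRI_MIN_KERN=80 is stated in this
thread; the other band boundaries are values assumed for illustration and
should be checked against sys/priority.h and sched_ule.c.  The names
thr_class, class_of(), two_thresh_preempt() and policy_preempt() are
hypothetical, and "other conditions are met" is reduced here to "the new
thread has a numerically lower (better) priority".  Classifying a thread
purely by its priority band is also a simplification, and the list does not
say what idle threads may do, so the sketch lets them preempt nothing.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Assumed priority bands; a lower number means a better priority.
 * Only PRI_MIN_KERN = 80 comes from the thread, the rest are
 * illustrative values and may not match a given kernel tree.
 */
enum {
	PRI_MIN_REALTIME  = 48,
	PRI_MIN_KERN      = 80,
	PRI_MIN_TIMESHARE = 120,
	PRI_MAX_INTERACT  = 151,
	PRI_MIN_BATCH     = 152,
	PRI_MIN_IDLE      = 224,
	PRI_MAX           = 255
};

/* Hypothetical classes, ordered from "highest" to "lowest". */
enum thr_class {
	CL_ITHD, CL_REALTIME, CL_KERN, CL_INTERACT, CL_BATCH, CL_IDLE
};

/* Map a priority to its class by band (a simplification). */
static enum thr_class
class_of(int pri)
{
	assert(pri >= 0 && pri <= PRI_MAX);
	if (pri < PRI_MIN_REALTIME)
		return (CL_ITHD);
	if (pri < PRI_MIN_KERN)
		return (CL_REALTIME);
	if (pri < PRI_MIN_TIMESHARE)
		return (CL_KERN);
	if (pri < PRI_MIN_BATCH)
		return (CL_INTERACT);
	if (pri < PRI_MIN_IDLE)
		return (CL_BATCH);
	return (CL_IDLE);
}

/*
 * Two-threshold variant: the new thread must be good enough to
 * preempt, and the running thread must be bad enough to be preempted.
 */
static bool
two_thresh_preempt(int newpri, int curpri, int may_preempt_prio,
    int may_be_preempted_prio)
{
	return (newpri < curpri &&
	    newpri <= may_preempt_prio &&
	    curpri >= may_be_preempted_prio);
}

/*
 * Class-based variant: a thread may preempt only strictly "lower"
 * classes, except that real-time may also preempt real-time.
 */
static bool
policy_preempt(int newpri, int curpri)
{
	enum thr_class n = class_of(newpri);
	enum thr_class c = class_of(curpri);

	if (newpri >= curpri)
		return (false);
	if (n == CL_REALTIME && c == CL_REALTIME)
		return (true);
	return (n < c);
}
```

With the example thresholds from above, two_thresh_preempt(85, 180,
PRI_MAX_INTERACT, PRI_MIN_BATCH) allows a kernel thread to preempt a batch
thread, while the same call with curpri=130 refuses because an interactive
thread is not preemptible.  The class-based variant differs mainly in that
interactive threads can still be preempted by kernel, real-time and interrupt
threads.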