From owner-freebsd-bugs@FreeBSD.ORG Fri Jun 11 13:10:25 2004
Date: Fri, 11 Jun 2004 15:04:00 +0200
From: Uwe Doering
To: FreeBSD-gnats-submit@FreeBSD.org
Reply-To: Uwe Doering
Subject: kern/67830: CPU affinity problem with forked child processes (SMP)
X-Send-Pr-Version: 3.113

>Number:         67830
>Category:       kern
>Synopsis:       CPU affinity problem with forked child processes (SMP)
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Jun 11 13:10:03 GMT 2004
>Closed-Date:
>Last-Modified:
>Originator:     Uwe Doering
>Release:        FreeBSD 4.5-RELEASE i386
>Organization:
EscapeBox - Managed On-Demand UNIX Servers
>Environment:
System: FreeBSD geminix.org 4.5-RELEASE FreeBSD 4.5-RELEASE #3: Thu Jun 10 14:06:47 GMT 2004 root@localhost:/STABLE_Enhanced_Edition i386
>Description:
In SMP kernels there is a problem with the 'struct proc' variables 'p_oncpu' and 'p_lastcpu' being uninitialized (zeroed) when a forked child process is put onto the run queue near the end of fork1(). Other kernel functions expect these variables to be set up properly even before the new child process has had its first process switch.

What happens in the current implementation is that all forked child processes get an initial affinity to CPU0, regardless of which CPU fork1() ran on. I believe this is unintended and in violation of the SMP design goals.

The right thing to do, IMHO, would be to give the child process the CPU affinity of the parent process. Since parent processes tend to block shortly after returning from fork1(), in order to wait for some event to happen, chooseproc() will most likely pick up the new child process right away and make it the next process to be executed. Without switching CPUs, that is.

On lightly loaded systems the change I propose makes no difference since idle CPUs pick up processes "belonging" to their peers, anyway (CPU migration). On busy systems, however, CPUs regularly favor processes with a matching affinity, as long as there is sufficient supply.
In this situation the initial CPU affinity of forked child processes _does_ matter if the goal is to spread the work load evenly over multiple CPUs.

Although I cannot provide hard evidence to prove it, I believe it is likely that this change, or rather correction, will improve the overall performance of SMP systems that deal with a lot of forks, like busy email servers, web servers with plenty of CGI invocations etc.
>How-To-Repeat:
Since this is about a performance issue rather than a malfunction, there is nothing to repeat. The algorithmic deficiency becomes apparent from looking at the sources.
>Fix:
I suggest initializing 'p_oncpu' and 'p_lastcpu' of the forked child process as if it had already been through one process switch. chooseproc() and other kernel functions will then work as expected. Please consider the following patch:

--- kern_fork.c.diff begins here ---
--- src/sys/kern/kern_fork.c.orig	Wed Apr 21 09:23:06 2004
+++ src/sys/kern/kern_fork.c	Thu Jun 10 16:05:03 2004
@@ -559,6 +559,10 @@
 	microtime(&(p2->p_stats->p_start));
 	p2->p_acflag = AFORK;
 	s = splhigh();
+#ifdef SMP
+	p2->p_oncpu = 0xff;	/* idle */
+	p2->p_lastcpu = p1->p_oncpu;
+#endif
 	p2->p_stat = SRUN;
 	setrunqueue(p2);
 	splx(s);
--- kern_fork.c.diff ends here ---

>Release-Note:
>Audit-Trail:
>Unformatted:
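
For readers without the 4.x sources at hand, the effect described in the report can be illustrated with a small user-space model. This is only a sketch under stated assumptions: 'struct model_proc', the 'model_*' functions and the two-CPU limit are hypothetical names made up for illustration, and model_pick() is a simplified stand-in for an affinity-favoring scheduler, not the kernel's actual chooseproc(). Only the field names 'p_oncpu' and 'p_lastcpu' and the value 0xff are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NCPU   2      /* hypothetical CPU count for this model */
#define MODEL_NOCPU  0xff   /* "not on any CPU", same value as in the patch */

/* Only the two fields discussed in the report; not the real struct proc. */
struct model_proc {
	unsigned char p_oncpu;   /* CPU currently running the process */
	unsigned char p_lastcpu; /* CPU the process last ran on */
};

/* Behaviour described in the report: the child's fields stay zeroed. */
static struct model_proc
model_fork_zeroed(const struct model_proc *parent)
{
	struct model_proc child = { 0, 0 };

	(void)parent;            /* parent's CPU is ignored in this case */
	return (child);
}

/* Behaviour with the proposed patch: inherit the parent's current CPU. */
static struct model_proc
model_fork_inherit(const struct model_proc *parent)
{
	struct model_proc child;

	child.p_oncpu = MODEL_NOCPU;
	child.p_lastcpu = parent->p_oncpu;
	return (child);
}

/*
 * Simplified, hypothetical affinity-favoring pick: return the index of
 * the first queued process whose p_lastcpu matches 'cpu'; if none
 * matches, fall back to the queue head.  Returns -1 for an empty queue.
 */
static int
model_pick(const struct model_proc *q, size_t n, unsigned char cpu)
{
	size_t i;

	assert(cpu < MODEL_NCPU);
	if (n == 0)
		return (-1);
	for (i = 0; i < n; i++)
		if (q[i].p_lastcpu == cpu)
			return ((int)i);
	return (0);
}
```

In this model, a parent running on CPU 1 produces a child with 'p_lastcpu' 0 through model_fork_zeroed() and with 'p_lastcpu' 1 through model_fork_inherit(). With both children queued, model_pick() hands the first one to CPU 0 and the second one to CPU 1, which is the difference in initial affinity the report is about.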