From owner-freebsd-hackers  Thu Apr 27 18:16:50 2000
Delivered-To: freebsd-hackers@freebsd.org
Received: from localhost (localhost [127.0.0.1])
	by hub.freebsd.org (Postfix) with ESMTP id BFCED37BB73;
	Thu, 27 Apr 2000 18:16:43 -0700 (PDT)
	(envelope-from green@FreeBSD.org)
Date: Thu, 27 Apr 2000 21:16:40 -0400 (EDT)
From: Brian Fundakowski Feldman
X-Sender: green@green.dyndns.org
To: Luoqi Chen
Cc: hackers@FreeBSD.ORG
Subject: Re: lock-ups due to the scheduler
In-Reply-To: <200004270547.e3R5lfi24004@lor.watermarkgroup.com>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Thu, 27 Apr 2000, Luoqi Chen wrote:

> This is quite interesting. I'm no scheduler expert, but my understanding
> is that a priority < PUSER won't degrade and is only set in kernel mode
> after waking up from a sleep. In user mode, processes should always have
> priority p_usrpri >= PUSER; that is obviously not true for a negative
> nice value:

That was my take on it. There were multiple tests for <= PUSER which were
really tests for whether or not the process was in SRUN. However, changing
these tests to SRUN tests didn't prevent lockups, so the problem seems to
be deeper than that. They definitely do keep p_priority < PUSER processes
from updating their p_priority to their newly calculated p_usrpri, which
is a very large bug.

The deeper problem seems to be that no matter what the process does, it
never accrues enough estcpu to classify it as a hog: a process I start
with a niceness of -20 cycles through priorities from 10 (at the very
beginning) to 27 at the very highest. This _shouldn't_ be too much of a
problem, but the priority never reaches 50 (PUSER), so the process never
gets rescheduled properly... and this seems to be most of what's causing
the lockups.

> > 	newpriority = PUSER + p->p_estcpu / INVERSE_ESTCPU_WEIGHT +
> > 	    NICE_WEIGHT * p->p_nice;
>
> We should probably offset p->p_nice by PRIO_MIN:
>
> > 	newpriority = PUSER + p->p_estcpu / INVERSE_ESTCPU_WEIGHT +
> > 	    NICE_WEIGHT * (p->p_nice - PRIO_MIN);
>
> To fully utilize the 20 out of 32 run queues for user priorities, we
> might want to change NICE_WEIGHT from 2 to 1, and the upper limit of
> p_estcpu to
>
> #define ESTCPULIM(e) \
>     min((e), \
>         INVERSE_ESTCPU_WEIGHT * (NICE_WEIGHT * (PRIO_MAX - PRIO_MIN) - PPQ) + \
>         INVERSE_ESTCPU_WEIGHT - 1)
>
> so that a cpu hog at nice 0 would have about the same priority as a low
> cpu usage nice +20 process.

Yes, this seems right. Letting niceness pull the priority below 50 is a
bad idea. I think that if we make that modification (which is another
thing I tried) of subtracting PRIO_MIN from the niceness so that no value
can drop below PUSER, it would fix the bugs we have. When I did it, I
missed changing ESTCPULIM, which probably explains why things didn't (I
believe) lock up, but (I believe) still seemed veerrry bad... Also,
decreasing NICE_WEIGHT would be a good idea, so I'll try all of this out
and report back later.

> -lq

--
 Brian Fundakowski Feldman           \  FreeBSD: The Power to Serve!  /
 green@FreeBSD.org                    `------------------------------'

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message
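
For reference, below is a minimal standalone sketch of the arithmetic
discussed in the message, comparing the then-current resetpriority()
formula against Luoqi's proposed one. The constant values (PUSER = 50,
INVERSE_ESTCPU_WEIGHT = 8, PPQ = 4, and the old NICE_WEIGHT of 2) are
assumptions drawn from 4.x-era <sys/proc.h> and kern_synch.c rather than
from this thread itself, so verify them against an actual source tree.

/*
 * Sketch of the scheduler priority calculation discussed above.
 * Constants are assumed from 4.x-era sources, not quoted from the
 * thread; check a real tree before trusting the exact numbers.
 */
#include <stdio.h>

#define PUSER                 50    /* base user-mode priority (assumed) */
#define PRIO_MIN              (-20)
#define PRIO_MAX              20
#define OLD_NICE_WEIGHT       2     /* value before the proposed change */
#define NICE_WEIGHT           1     /* Luoqi's proposed value */
#define INVERSE_ESTCPU_WEIGHT 8
#define PPQ                   4     /* priorities per run queue (assumed) */

#define min(a, b) ((a) < (b) ? (a) : (b))

/* Luoqi's proposed cap on p_estcpu. */
#define ESTCPULIM(e) \
	min((e), \
	    INVERSE_ESTCPU_WEIGHT * (NICE_WEIGHT * (PRIO_MAX - PRIO_MIN) - PPQ) + \
	    INVERSE_ESTCPU_WEIGHT - 1)

/* Old formula: a negative nice pulls the result below PUSER. */
static int
prio_old(int estcpu, int nice)
{
	return (PUSER + estcpu / INVERSE_ESTCPU_WEIGHT + OLD_NICE_WEIGHT * nice);
}

/* Proposed formula: the PRIO_MIN offset keeps the result >= PUSER. */
static int
prio_new(int estcpu, int nice)
{
	return (PUSER + ESTCPULIM(estcpu) / INVERSE_ESTCPU_WEIGHT +
	    NICE_WEIGHT * (nice - PRIO_MIN));
}

int
main(void)
{
	/* The nice -20 case from the message: the old formula yields 10. */
	printf("old: nice -20, estcpu   0 -> %3d\n", prio_old(0, -20));
	printf("new: nice -20, estcpu   0 -> %3d\n", prio_new(0, -20));
	/* A long-running hog at nice 0 vs. an idle nice +20 process. */
	printf("new: nice   0, estcpu max -> %3d\n", prio_new(1 << 20, 0));
	printf("new: nice +20, estcpu   0 -> %3d\n", prio_new(0, 20));
	return (0);
}

With these assumed constants, the old formula reproduces the priority of
10 described in the message, and the proposed one never drops below
PUSER; how closely a hog at nice 0 and an idle nice +20 process actually
land depends on the exact constants chosen.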