Date: Wed, 26 Apr 2000 23:27:11 -0400 (EDT)
From: Brian Fundakowski Feldman <green@FreeBSD.org>
To: hackers@FreeBSD.org
Subject: lock-ups due to the scheduler
Message-ID: <Pine.BSF.4.21.0004262241010.718-100000@green.dyndns.org>
I dropped hints about three weeks ago that there might be issues, as my machine had locked up for no apparent reason, and I had no idea why until recently. It seems to have everything to do with running things that use lots of CPU at a very high priority (I use -20).

I've been struggling with this for a few days. What happens is that the kernel never executes any user processes (or does so so rarely that I can't detect it in any application). I'm not really sure what's happening, but it definitely involves the scheduler: if I cap the scheduler's priority computation on the lower end to keep user processes from running with a p_priority < PUSER, the system can get slightly unresponsive, but it does not lock up. The modifications I made to allow this follow my signature.

This is a deadlock-type situation, and I can reproduce it at will, so I'll try to explain the steps I use to reproduce it.

1. Start XMMS at -20 priority, and play something.
2. XMMS is decoding audio and doing other random things, nothing huge, at about 15-20% CPU. XMMS sends decoded mp3 (archives of CDs I own) to the EsounD daemon (esd), which takes 2-3% CPU or so, and XFree86 itself takes a good 5-10% CPU.
3. Start a "Visualization" plugin, which basically takes XMMS to full CPU usage (as much as it can get), and things lock up.

XMMS is the curproc every single time I've polled it (using DDB, for example), and I stop hearing audio. XFree86 would be doing the X11 serving, and esd would mostly be writing to the audio device or reading from its socket, so usually at PRIBIO or PSOCK.

At this point, the system is really locked up, and there's nothing I can do. I can, however, get a coredump and have the entire system state at this point. I'm certain that other people here will be able to try the same tests, on 5.0-CURRENT of course, and possibly reproduce the lock-up exactly as it happens for me.
I can grovel in a coredump to get information about the system as it was running at the time, so if anyone can provide hints as to where to look for what makes things lock up nowadays, I'll be grateful, and I'll be able to try almost anything to get this fixed. If you're familiar with the scheduler area of the system, please help. I have no one's arm to twist or anything of the sort, so I'm really going out on a limb hoping someone will be able to help me fix this.

Note that I've taken my HZ=1000 line out of my kernel config, so I'm running at a standard hz = 100 and a kern.quantum of 20000.

--
 Brian Fundakowski Feldman           \  FreeBSD: The Power to Serve!  /
 green@FreeBSD.org                    `------------------------------'

Index: kern_synch.c
===================================================================
RCS file: /usr2/ncvs/src/sys/kern/kern_synch.c,v
retrieving revision 1.89
diff -u -u -r1.89 kern_synch.c
--- kern_synch.c	2000/03/28 18:06:42	1.89
+++ kern_synch.c	2000/04/27 00:55:21
@@ -903,6 +903,10 @@
 	maybe_resched(p);
 }
 
+static int priority_lower_cap = 0;
+SYSCTL_INT(_debug, OID_AUTO, enable_priority_lower_cap, CTLFLAG_RW,
+    &priority_lower_cap, 0, "");
+
 /*
  * Compute the priority of a process when running in user mode.
  * Arrange to reschedule if the resulting priority is better
@@ -917,6 +921,12 @@
 	if (p->p_rtprio.type == RTP_PRIO_NORMAL) {
 		newpriority = PUSER + p->p_estcpu / INVERSE_ESTCPU_WEIGHT +
 		    NICE_WEIGHT * p->p_nice;
+		if (priority_lower_cap && newpriority < PUSER) {
+			if (p == curproc)
+				uprintf("kernel: tried to use priority %d\n",
+				    newpriority);
+			newpriority = PUSER;
+		}
 		newpriority = min(newpriority, MAXPRI);
 		p->p_usrpri = newpriority;
 	}

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message
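To make the arithmetic behind the patch concrete, here is a small user-space sketch of the priority formula from the diff above. It is not kernel code. The function names (user_priority, print_priority_table) are made up for illustration, and the constant values are assumptions that the message itself does not state, so check them against sys/param.h and sys/proc.h in your own tree. Under those assumed values, a nice of -20 contributes -40, so a process with little accumulated estcpu computes to a priority well below PUSER unless the cap is enabled.

```c
#include <assert.h>
#include <stdio.h>

/*
 * ASSUMED values: the message does not give them. Verify against
 * sys/param.h and sys/proc.h in the tree you are running.
 */
#define PUSER                 50   /* assumed: best priority intended for user mode */
#define MAXPRI                127  /* assumed: numerically largest (worst) priority */
#define NICE_WEIGHT           2    /* assumed: priority steps per nice level */
#define INVERSE_ESTCPU_WEIGHT 8    /* assumed: estcpu units per priority step */

/*
 * Hypothetical user-space mirror of the computation shown in the diff.
 * lower_cap plays the role of the priority_lower_cap sysctl variable.
 */
static int
user_priority(int estcpu, int nice, int lower_cap)
{
	int newpriority;

	assert(estcpu >= 0);
	assert(nice >= -20 && nice <= 20);

	newpriority = PUSER + estcpu / INVERSE_ESTCPU_WEIGHT +
	    NICE_WEIGHT * nice;
	/* The patch: never let a normal process go below PUSER. */
	if (lower_cap && newpriority < PUSER)
		newpriority = PUSER;
	/* Same upper clamp as min(newpriority, MAXPRI) in the kernel. */
	if (newpriority > MAXPRI)
		newpriority = MAXPRI;
	return (newpriority);
}

/* Print uncapped and capped results for a few estcpu values at nice -20. */
static void
print_priority_table(void)
{
	static const int estcpu[] = { 0, 80, 160, 320 };
	size_t i;

	for (i = 0; i < sizeof(estcpu) / sizeof(estcpu[0]); i++)
		printf("estcpu=%3d  uncapped=%3d  capped=%3d\n", estcpu[i],
		    user_priority(estcpu[i], -20, 0),
		    user_priority(estcpu[i], -20, 1));
}
```

Calling print_priority_table() from a main() shows how far below PUSER the uncapped value sits for a nice -20 process, and that with the cap enabled the result never drops under PUSER. Whether that priority range is what actually starves the other processes is the open question in this message; the sketch only illustrates the numbers.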