Date: Sat, 23 Nov 1996 14:47:16 -0700
From: Steve Passe <smp@csn.net>
To: freebsd-smp@freefall.freebsd.org
Cc: Peter Wemm <peter@spinner.dialix.com>
Subject: Re: SMP -current merge
Message-ID: <199611232147.OAA19699@clem.systemsix.com>
In-Reply-To: Your message of "Fri, 22 Nov 1996 20:50:46 MST." <199611230350.UAA13108@clem.systemsix.com>

Hi,

I think I just made an important breakthrough on SMP efficiency. It has
always bugged me that many things take longer to do with 2 CPUs running than
with just 1 active. While looking for the current SMP "brokenness" (still
haven't got a clue...) I noticed in sys/kern/init_main.c:smp_idleloop() that
the line:

    if (whichqs || whichrtqs || whichidqs) {

appears to look in the queue of idle procs. My debugs seem to show the 2nd
CPU switching back and forth from one idle proc to the other as a result. It
appears that the 2nd CPU is spending A LOT of time grabbing the lock,
switching idle procs, and releasing the lock again! This seems both
unnecessary and a waste of bus cycles.

So I added the following patch (to the older, NON-broken SMP src):

---------------------------------- cut ---------------------------------------
*** init_main.c~        Fri Nov  1 22:50:41 1996
--- init_main.c Sat Nov 23 12:29:17 1996
***************
*** 839,845 ****
                        rel_mplock();
                }
  
!               if (whichqs || whichrtqs || whichidqs) {
                        /* grab lock for kernel "entry" */
                        get_mplock();
  
--- 839,845 ----
                        rel_mplock();
                }
  
!               if (whichqs || whichrtqs) {
                        /* grab lock for kernel "entry" */
                        get_mplock();
  
***************
*** 847,853 ****
                        /* We need to retest due to the spin lock */
                        __asm __volatile("" : : : "memory");
  
!                       if (whichqs || whichrtqs || whichidqs){
                                splhigh();
                                if (curproc)
                                        setrunqueue(curproc);
--- 847,853 ----
                        /* We need to retest due to the spin lock */
                        __asm __volatile("" : : : "memory");
  
!                       if (whichqs || whichrtqs) {
                                splhigh();
                                if (curproc)
                                        setrunqueue(curproc);
---------------------------------- cut ---------------------------------------

---
Summary of kernel build times:

1 CPU active, non SMP kernel, for reference:
    446.65s real    337.68s user     22.54s system

2 CPUs active, idlequeue search in smp_idleloop(), i.e. before patch:
    597.45s real    373.91s user    147.68s system

2 CPUs active, NO idlequeue search in smp_idleloop(), i.e. after patch:
    433.87s real    332.11s user     44.69s system

Now it seems to be working like you would expect!!!

---
And some times with PARALLEL make:

2 CPUs active, NO idlequeue search in smp_idleloop():

# time make -j 2
...
    326.28s real    400.29s user     67.71s system

Note that we get 400s of user time in 326s of real time!!!

# time make -j 3

This dies, I think it's a PARALLELISM problem:

--- param.c ---
--- aic7xxx_asm ---
--- genassym.o ---
rm -f param.c
--- param.c ---
cp ../../conf/param.c .
--- aic7xxx_asm ---
cc -Wall -o aic7xxx_asm ../../dev/aic7xxx/aic7xxx_asm.c
--- cd9660_bmap.o ---
--- genassym.o ---
cc -c -O -Wreturn-type -Wcomment -Wredundant-decls -Wimplicit -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -nostdinc -I- -I. -I../.. -I/usr/include -DFAILSAFE -DCOMPAT_43 -DCD9660 -DMSDOSFS -DNFS -DFFS -DINET -DMY_DEBUG -DKERNEL -DMAXUSERS=64 -UKERNEL ../../i386/i386/genassym.c
--- cd9660_bmap.o ---
cc -c -O -Wreturn-type -Wcomment -Wredundant-decls -Wimplicit -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -nostdinc -I- -I. -I../.. -I/usr/include -DFAILSAFE -DCOMPAT_43 -DCD9660 -DMSDOSFS -DNFS -DFFS -DINET -DMY_DEBUG -DKERNEL ../../isofs/cd9660/cd9660_bmap.c
vnode_if.h: In function `VOP_FSYNC':
In file included from ../../sys/vnode.h:379,
                 from ../../isofs/cd9660/cd9660_bmap.c:46:
vnode_if.h:403: parse error before `struct'
vnode_if.h:408: warning: control reaches end of non-void function
...

--
Steve Passe     | powered by
smp@csn.net     |            FreeBSD
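
As a rough cross-check of the build times quoted in this message, the arithmetic can be sketched in a few lines of C. This is only an illustration: the struct and the helper names (build_time, cpu_utilization, percent_change, print_report) are made up here and are not part of the kernel source or the patch. The only real data are the time(1) figures copied from the message. "CPUs kept busy" is taken to be (user + system) / real, which is an assumption about how to read those figures.

```c
#include <stdio.h>

/* One row of time(1) output, in seconds (hypothetical helper type). */
struct build_time {
    const char *label;
    double real, user, sys;
};

/* Average number of CPUs kept busy, assumed to be (user + system) / real. */
double cpu_utilization(const struct build_time *t)
{
    return (t->user + t->sys) / t->real;
}

/* Percent change going from 'before' to 'after'; negative means a drop. */
double percent_change(double before, double after)
{
    return 100.0 * (after - before) / before;
}

/* Figures copied from the kernel build times quoted in the message. */
const struct build_time runs[] = {
    { "1 CPU, non-SMP kernel",           446.65, 337.68,  22.54 },
    { "2 CPUs, before patch",            597.45, 373.91, 147.68 },
    { "2 CPUs, after patch",             433.87, 332.11,  44.69 },
    { "2 CPUs, after patch, make -j 2",  326.28, 400.29,  67.71 },
};

/* Print one line per run, then the before/after deltas for the patch. */
void print_report(void)
{
    size_t i;

    for (i = 0; i < sizeof(runs) / sizeof(runs[0]); i++)
        printf("%-34s %.2f CPUs busy\n", runs[i].label,
               cpu_utilization(&runs[i]));

    printf("real time change from patch:   %+.1f%%\n",
           percent_change(runs[1].real, runs[2].real));
    printf("system time change from patch: %+.1f%%\n",
           percent_change(runs[1].sys, runs[2].sys));
}
```

Calling print_report() from a main() prints one line per run. On these figures the patch cuts real time by roughly 27% and system time by roughly 70%, and the make -j 2 run keeps about 1.43 CPUs busy on average.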