Date: Tue, 21 May 1996 10:22:17 -0700 (MST)
From: Terry Lambert <terry@lambert.org>
To: davem@caip.rutgers.edu (David S. Miller)
Cc: terry@lambert.org, jehamby@lightside.com, jkh@time.cdrom.com, current@freebsd.org, hackers@freebsd.org
Subject: Re: Congrats on CURRENT 5/1 SNAP...
Message-ID: <199605211722.KAA01411@phaeton.artisoft.com>
In-Reply-To: <199605210823.EAA07997@huahaga.rutgers.edu> from "David S. Miller" at May 21, 96 04:23:57 am

> > The SunOS LWP's are pretty easy.
>
> Actually SunOS does do lwp scheduling where it checks for AST's etc.
> although I don't know how relevant that is to whats being discussed.

Yes.  It uses aioread/aiowrite/aiowait/aiocancel; these are closer to
an event flag cluster than AST's.

> Furthermore, the way Solaris does threads in the kernel has been
> proven to be a lose (pre-emption, a billion mutexes in the kernel,
> another thousand read writer locks) and expect the industry to move in
> "another" direction.  Computer science has proven that current smp
> technology (read as: what SVR4.2MP based kernels do right now) cannot
> scale past 32 cpu's without an exponential loss in performance.

This is an artifact of their VM implementation, and the number is
generally acknowledged to be 8 processors.

It's possible to get a modified NUMA for an SMP environment using
per processor page allocation pools.  You're free to put SLAB
allocators on top of those pages.  This means that the allocation
mutex need only be held when the per processor page pool is
refilled/released to the general page pool.

Using a hierarchical lock manager and computation of transitive
closure over the lock hierarchy (treating it as a directed graph),
coupled with intention mode locking, there should be a significant
decrease in bus overhead.

I *don't* think you'd want a non-symmetric implementation.

It should be noted that multithreading UFS in SVR4 (UnixWare)
resulted in a 160% performance improvement -- even after the
performance loss for using mutexes for locking was subtracted
out.

> Clustering is the answer and can scale to more CPU's than you can
> count in an unsigned char. ;-)

So can the scheme described above.


                                        Terry Lambert
                                        terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
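
A minimal user-space sketch of the per processor page pool idea described in the message above, under stated assumptions: none of these names (pool_init, page_alloc, page_free, NCPU, BATCH, POOL_MAX) come from the original post or from any real kernel, and a pthread mutex stands in for the kernel allocation mutex. It is meant only to show that the mutex is taken on refill and release, not on every allocation.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* All names and sizes below are hypothetical, chosen for illustration. */
#define NCPU     4    /* number of processors in this toy model */
#define POOL_MAX 8    /* pages a processor may cache before releasing */
#define BATCH    4    /* pages moved per refill or release */
#define NPAGES   64   /* total pages in the toy system */

struct page { struct page *next; };
static struct page page_store[NPAGES];

/* General page pool: the only state guarded by the allocation mutex. */
static struct page *global_free;
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long global_lock_count;  /* times the mutex was taken */

/* Per processor pool: assumed to be touched only by its owning processor
 * (a real kernel would also have to prevent preemption here). */
struct cpu_pool { struct page *free; int count; };
static struct cpu_pool pools[NCPU];

void pool_init(void)
{
    global_free = NULL;
    global_lock_count = 0;
    for (int i = 0; i < NPAGES; i++) {
        page_store[i].next = global_free;
        global_free = &page_store[i];
    }
    for (int c = 0; c < NCPU; c++) {
        pools[c].free = NULL;
        pools[c].count = 0;
    }
}

/* Move up to BATCH pages from the general pool into one processor's pool. */
static void pool_refill(struct cpu_pool *p)
{
    pthread_mutex_lock(&global_lock);
    global_lock_count++;
    for (int i = 0; i < BATCH && global_free != NULL; i++) {
        struct page *pg = global_free;
        global_free = pg->next;
        pg->next = p->free;
        p->free = pg;
        p->count++;
    }
    pthread_mutex_unlock(&global_lock);
}

/* Hand up to BATCH pages from one processor's pool back to the general pool. */
static void pool_release(struct cpu_pool *p)
{
    pthread_mutex_lock(&global_lock);
    global_lock_count++;
    for (int i = 0; i < BATCH && p->free != NULL; i++) {
        struct page *pg = p->free;
        p->free = pg->next;
        p->count--;
        pg->next = global_free;
        global_free = pg;
    }
    pthread_mutex_unlock(&global_lock);
}

/* Fast path takes no lock; the mutex is taken only when the pool is empty. */
struct page *page_alloc(int cpu)
{
    assert(cpu >= 0 && cpu < NCPU);
    struct cpu_pool *p = &pools[cpu];
    if (p->free == NULL)
        pool_refill(p);
    struct page *pg = p->free;
    if (pg != NULL) {
        p->free = pg->next;
        p->count--;
    }
    return pg;  /* NULL if the whole system is out of pages */
}

/* Fast path takes no lock; the mutex is taken only when the pool overflows. */
void page_free(int cpu, struct page *pg)
{
    assert(cpu >= 0 && cpu < NCPU);
    struct cpu_pool *p = &pools[cpu];
    pg->next = p->free;
    p->free = pg;
    p->count++;
    if (p->count > POOL_MAX)
        pool_release(p);
}
```

With BATCH set to 4, four consecutive allocations on one processor take the mutex once rather than four times; a slab-style allocator could be layered on top of page_alloc/page_free without changing that property.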