Date: Thu, 15 Jul 1999 19:36:51 +0000 (GMT)
From: Terry Lambert <tlambert@primenet.com>
To: unknown@riverstyx.net (Tani Hosokawa)
Cc: davids@webmaster.com, chat@FreeBSD.ORG
Subject: Re: Known MMAP() race conditions ... ?
Message-ID: <199907151936.MAA02676@usr07.primenet.com>
In-Reply-To: <Pine.LNX.4.10.9907142007120.2799-100000@avarice.riverstyx.net> from "Tani Hosokawa" at Jul 14, 99 08:07:33 pm
tani hosokawa wrote:
>
> On Wed, 14 Jul 1999, David Schwartz wrote:
> > >
> > > The current model is a hybrid thread/process model, with a number
> > > of processes, each with a large number of threads in each, each
> > > thread processing one request.  From what I've seen, 64
> > > threads/process is about right.  So, in one Apache daemon, you can
> > > expect to see >1000 threads, running inside 10-20 processes.  Does
> > > that count as a large number?
> >
> > Yes.  And it's bad design.
>
> I'm curious.  How would you do it?

I can't speak for David, but the process architecture I did for the
NetWare for UNIX product used multiple processes (not threads) with a
single shared memory region for client context records, and a shared
file descriptor table.

This was chosen over threads for the standard context switch thrashing
reasons, the lack of threads support on one of our reference platforms,
the inability to autogrow the threads stack (even though Steve Baumel
put the capability into the SVR4.2 VM system, it was not utilized by
the threads people), and, finally, the ability to do "hot engine
scheduling".

This last used a streams mux to arbitrate, in LIFO order, incoming
packets to the "hottest" work-to-do engine, on the theory that it would
be the most likely, of all engines, to have its pages in core (remember
that SVR4.2 did not have a unified VM and buffer cache, though this is
equally applicable to data pages).  Using a threads implementation
would have resulted in each kernel thread engaging in paging
operations, generally at the expense of other kernel threads' data.

Finally, by using a shared user space context, the process context
switch overhead did not go up at all, assuming that you had dedicated
the machine as a server: the same engine was run repeatedly, with the
other engines only coming active when there was sufficient load to
merit their participation, based on I/O interleaving (this was a
decision of the mux, which also knew how to automatically ACK -- Novell
calls this a "server busy" response -- requests from a client that
already had a request in progress).

So...  I personally would use an anonymous work-to-do engine model,
with shared memory, either via a single process with N:M kernel/user
threads (M >= N) or via multiple processes with a shared context for
representing client state, and asynchronous I/O for I/O interleaving,
probably using mmap'ed regions for static content to zero-copy the
writes, with a lazy discard policy utilizing LRU.

The benefit of the anonymous work-to-do engine is that you can use
platform-appropriate technology to implement your shared context
region, be that a SYSV shared memory segment, an mmap'ed file, a vfork
shared process context, or global memory in a threads-based system.

Sure, it's a little more thoughtful work to get right, but our code
(compiled C) *did* outperform native NetWare (hand-coded assembly with
non-preemptive cooperative multitasking: a true embedded system if ever
there was one) on identical hardware.  8-).


                                        Terry Lambert
                                        terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
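
A minimal sketch, assuming POSIX fork() and an anonymous shared mmap()
region, of the multiple-process, shared-context idea described above.
The record layout, names, and counts are invented for illustration; note
also that plain fork() only inherits descriptors open at fork time (on
FreeBSD, rfork(2) without RFFDG would share the descriptor table
outright, closer to the original design):

    #include <sys/mman.h>
    #include <sys/wait.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define NENGINES  4
    #define NCLIENTS  256

    struct client_ctx {          /* hypothetical client context record */
        int  in_use;
        int  fd;                 /* client socket, inherited via fork() */
        char state[64];
    };

    int
    main(void)
    {
        struct client_ctx *ctx;
        int i;

        /* One shared region, visible to all engines after fork(). */
        ctx = mmap(NULL, NCLIENTS * sizeof(*ctx),
                   PROT_READ | PROT_WRITE,
                   MAP_ANON | MAP_SHARED, -1, 0);
        if (ctx == MAP_FAILED) {
            perror("mmap");
            exit(1);
        }

        for (i = 0; i < NENGINES; i++) {
            if (fork() == 0) {
                /* An engine body would loop on work-to-do here. */
                ctx[i].in_use = 1;
                ctx[i].fd = -1;          /* no client yet */
                snprintf(ctx[i].state, sizeof(ctx[i].state),
                    "engine %d up", i);
                _exit(0);
            }
        }
        while (wait(NULL) > 0)
            ;
        for (i = 0; i < NENGINES; i++)   /* parent sees children's writes */
            printf("%s\n", ctx[i].state);
        return 0;
    }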
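
The "hot engine scheduling" policy itself is just a LIFO of idle
engines.  A toy sketch of the dispatch decision the streams mux made;
the real mux lived in the kernel and spoke the wire protocol, none of
which is shown, and all names here are invented:

    #include <stdio.h>

    #define NENGINES 4

    static int idle_stack[NENGINES];   /* LIFO of idle engine ids */
    static int idle_top = 0;           /* number of idle engines */

    static void
    engine_idle(int id)                /* engine finished its request */
    {
        idle_stack[idle_top++] = id;
    }

    static int
    dispatch(void)                     /* hand work to hottest engine */
    {
        if (idle_top == 0)
            return -1;                 /* all busy: "server busy" ACK */
        return idle_stack[--idle_top]; /* most recently idle = hottest */
    }

    int
    main(void)
    {
        int i;

        for (i = 0; i < NENGINES; i++)
            engine_idle(i);            /* engine 3 is now "hottest" */

        printf("request 1 -> engine %d\n", dispatch());  /* 3 */
        printf("request 2 -> engine %d\n", dispatch());  /* 2 */
        engine_idle(3);                /* 3 finishes, back on top */
        printf("request 3 -> engine %d\n", dispatch());  /* 3 again */
        return 0;
    }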
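
And a minimal sketch of the mmap'ed static content idea: map the file
once and write() straight out of the mapping, so no per-request copy
into a private user-space buffer is made (the kernel may still copy on
the way out, so "zero-copy" is an approximation here).  The LRU-driven
lazy discard of cold mappings, e.g. via munmap(), is omitted:

    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int
    main(int argc, char **argv)
    {
        struct stat st;
        void *p;
        int fd;

        if (argc != 2) {
            fprintf(stderr, "usage: %s file\n", argv[0]);
            exit(1);
        }
        fd = open(argv[1], O_RDONLY);
        if (fd < 0 || fstat(fd, &st) < 0) {
            perror(argv[1]);
            exit(1);
        }
        p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED) {
            perror("mmap");
            exit(1);
        }
        /* In a server, this write() would go to the client socket. */
        if (write(STDOUT_FILENO, p, (size_t)st.st_size) < 0)
            perror("write");
        munmap(p, (size_t)st.st_size);
        close(fd);
        return 0;
    }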