Date: Fri, 27 Apr 2001 13:09:46 -0600 (MDT)
From: Nate Williams <nate@yogotech.com>
To: Matt Dillon <dillon@earth.backplane.com>
Cc: Nate Williams <nate@yogotech.com>, "David O'Brien" <obrien@FreeBSD.ORG>, Julian Elischer <julian@elischer.org>, Arch@FreeBSD.ORG, Daniel Eischen <eischen@vigrid.com>
Subject: Re: KSE threading support (first parts)
Message-ID: <15081.50170.297579.938254@nomad.yogotech.com>
In-Reply-To: <200104271717.f3RHHGp05457@earth.backplane.com>
References: <3AE71067.FF4BD029@elischer.org> <20010425110940.L1790@fw.wintelcom.net> <3AE85776.92D6BD90@elischer.org> <20010426120630.A92915@dragon.nuxi.com> <200104270015.f3R0FAi62512@earth.backplane.com> <15081.39397.944224.776391@nomad.yogotech.com> <200104271701.f3RH1Tk05185@earth.backplane.com> <15081.42735.860662.876478@nomad.yogotech.com> <200104271717.f3RHHGp05457@earth.backplane.com>
> :I read it, and this is what I hear you saying in a nutshell.
> :
> :KSEs belonging to the same process are serialized, and can not be run
> :concurrently.
> :
> :What I'm saying:
> :
> :KSEs belonging to the same process can be run concurrently if we have
> :multiple processors.
> :
> :Where did I miss what you were saying?
> :
> :Nate
>
> You seem to believe that not being able to run KSE's for the same
> process concurrently somehow kills the whole concept of SMP.

No, it kills one of the biggest reasons for supporting KSE. Otherwise, a
single process can only take advantage of a single processor.

> Well, that's complete bullshit. KSE's are extremely short-running
> affairs in kernel mode, especially when you consider the most likely
> asynchronizing case (a simple blocking situation that will most commonly
> be in a read() or write()).

Not necessarily. My experience with developing and running applications
on Solaris says that having multiple KSE's/process is a *huge* win.

> Serializing them within the context of a single process will
> actually *IMPROVE* SMP performance, not make it worse.

Why?

> Running multiple kernel contexts for the same process on different
> cpu's concurrently means that you must now lock every single aspect
> of the 'current process' concept

Which has to be done anyway, since the processor will be running
multiple processes in any case, and a process may migrate to a
different processor depending on process load. Affinity is a goal, but
there's no guarantee that a process will *always* execute on the same
processor.

In essence, you're limiting the design of a threaded program to
serialized processes, which is completely bogus.

> Well, that's just plain insane. You will wind up with so many fragging
> locks and mutexes in the kernel that what performance gain you might
> have thought you could get is now completely blown away by the locking
> overhead.

See above. This has to be done in any case, and is done now.
The problem is no more difficult with the addition of KSE's, and
serializing them removes one of the single biggest advantages of using
KSE's.

Out of curiosity, have you read the KSE papers at all? They are able to
deal with concurrency without all of the complexity you imply must
exist.

> This is another aspect of the problem you run into when you start
> trying to preempt a process running in the kernel arbitrarily. Suddenly
> all the assumptions you were able to make before that resulted in
> optimal code paths now must be thrown out the window and replaced with
> a godawful number of locks to protect kernel contexts from unexpected
> interruptions.

*sarcasm on*
Heck, then we should just throw out KSE's, since they are way too
complex, and just stick with the current 'BGL' model, right?
*sarcasm off*

It doesn't come for free. There is no way to have progress without some
additional complexity. The question we must ask is whether the
complexity we add buys us anything. I believe it does, as do many other
people. Certainly Solaris's ability to scale shows that there is
something to be said for having a pre-emptive kernel.

> That's insane as well. You are introducing a 'solution' to a
> problem that doesn't exist

Matt, honestly, there's no reason to change the existing FreeBSD model
at all, if we're running on a single processor. It's not broken in any
way.

However, the current model does not scale with multiple processors. One
of the stated goals of the later releases of FreeBSD is to create an OS
that scales better on multiple processors, so the current 'model' is not
adequate.

It's a solution to a new problem, one that *does* exist in BSD if we
accept the fact that we want to run better on multiple processors.

Hence, the KSE model, which is one of many solutions to the scaling
problem, and the solution that was decided to be a good solution.

Another 'goal' is the ability to write threaded programs that run
efficiently on both UP and SMP hardware.
KSE's can help with this, but a 'serialized KSE' model won't allow an
I/O-intensive application to benefit from adding multiple CPU's.

An example of such an application is one that does the following (UDP
packets were used in this example, for streaming...)

1) One thread is in kernel context in select(), waiting for packets,
   which are thrown onto a queue back in userland and the thread
   returns to kernel land.
2) Another thread processes these packets into two classes, and these
   packets are put onto two different queues.
   a) Data packets
   b) Query packets
3) The data queue is read by another thread, which writes them out to
   disk.
4) The query packets are processed by another thread, which reads the
   information off the disk (the data may be old, or new, so there is
   some contention between threads 3/4), and sticks it onto the 'send'
   queue.
5) A final thread reads information from the send queue, and sends it
   out to the requestors as BW is available.

Not only is this example not made up, it's very similar to a project I
completed over 2 years ago. It's a bit more complicated than this, but
you get the general picture.

Not only did this application scale well (on Solaris), it also had very
few bottlenecks since we were able to minimize thread contention with
some clever data structures.

In our case, the # of packets sent/received was the biggest bottleneck,
so the limit wasn't one of hardware (in terms of I/O bandwidth), but CPU
processing of the packets. Adding more CPU's to the mix allowed us to
create an application that ran faster by throwing more CPU at it (if CPU
was a bottleneck). If CPU wasn't a bottleneck, then the application had
no scaling issues on modern hardware.

> If we were writing a kernel completely from scratch we could probably
> construct it to allow these things, but trying to do it with the current
> base is impossible -- you will never get something reliable or efficient
> at the end of this road.
I believe that in the end, many parts of the system will be re-written,
or at least revamped to support multi-tasking to some degree. Even with
serialized KSE's, there's still an issue of pre-emption, since multiple
processes may be accessing the same data structures (on different
CPU's).

> Or perhaps I should phrase it: The only way
> you will get anything close to reliable will be to effectively revert
> the system to the days of the single giant lock, because you will need
> so many fraggin locks to deal with the consequences you might as well
> have a single big giant lock.

I'm not so naive as to suggest that it's going to be simple. If it were
going to be a simple task, it would have been done already. However,
just because it's difficult and time consuming doesn't mean it's not
worthwhile.

Nate

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message
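
For illustration only, the five-thread pipeline described in the message
above could be sketched roughly as follows with POSIX threads. This is
not the original application: every name here (queue_t, pipeline_t,
run_pipeline, the STOP sentinel) is hypothetical, packets are plain
integers, the even/odd classification rule is arbitrary, and the
select() loop and disk I/O are replaced by counters. The disk contention
between threads 3 and 4 is not modeled. Each stage blocks independently
on its own queue, which is the property that would let the stages run on
separate CPU's when kernel contexts of one process are not serialized.

```c
#include <assert.h>
#include <pthread.h>

#define QCAP 64
#define STOP (-1)               /* sentinel meaning "no more packets" */

/* Bounded blocking queue guarded by one mutex and two conditions. */
typedef struct {
    int buf[QCAP];
    int head, count;
    pthread_mutex_t mu;
    pthread_cond_t not_empty, not_full;
} queue_t;

static void q_init(queue_t *q) {
    q->head = q->count = 0;
    pthread_mutex_init(&q->mu, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
}

static void q_put(queue_t *q, int v) {
    pthread_mutex_lock(&q->mu);
    while (q->count == QCAP)                 /* block while full */
        pthread_cond_wait(&q->not_full, &q->mu);
    q->buf[(q->head + q->count) % QCAP] = v;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->mu);
}

static int q_get(queue_t *q) {
    pthread_mutex_lock(&q->mu);
    while (q->count == 0)                    /* block while empty */
        pthread_cond_wait(&q->not_empty, &q->mu);
    int v = q->buf[q->head];
    q->head = (q->head + 1) % QCAP;
    q->count--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->mu);
    return v;
}

typedef struct {
    int n_packets;
    queue_t in, data, query, send;
    int stored;                 /* touched only by the writer thread */
    int sent;                   /* touched only by the sender thread */
} pipeline_t;

/* 1) Stand-in for the select() loop: generates packet ids. */
static void *receiver(void *arg) {
    pipeline_t *p = arg;
    for (int i = 0; i < p->n_packets; i++) q_put(&p->in, i);
    q_put(&p->in, STOP);
    return NULL;
}

/* 2) Classifier: even ids count as data, odd ids as queries. */
static void *classifier(void *arg) {
    pipeline_t *p = arg;
    int v;
    while ((v = q_get(&p->in)) != STOP)
        q_put((v % 2 == 0) ? &p->data : &p->query, v);
    q_put(&p->data, STOP);
    q_put(&p->query, STOP);
    return NULL;
}

/* 3) Writer: counts packets instead of writing them to disk. */
static void *writer(void *arg) {
    pipeline_t *p = arg;
    while (q_get(&p->data) != STOP) p->stored++;
    return NULL;
}

/* 4) Query processor: turns each query into a reply on the send queue. */
static void *query_proc(void *arg) {
    pipeline_t *p = arg;
    int v;
    while ((v = q_get(&p->query)) != STOP) q_put(&p->send, v);
    q_put(&p->send, STOP);
    return NULL;
}

/* 5) Sender: counts replies instead of transmitting them. */
static void *sender(void *arg) {
    pipeline_t *p = arg;
    while (q_get(&p->send) != STOP) p->sent++;
    return NULL;
}

/* Runs all five stages to completion; returns 0 and fills the counters. */
int run_pipeline(int n_packets, int *stored, int *sent) {
    pipeline_t p;
    void *(*stage[5])(void *) =
        { receiver, classifier, writer, query_proc, sender };
    pthread_t t[5];

    p.n_packets = n_packets;
    p.stored = p.sent = 0;
    q_init(&p.in); q_init(&p.data); q_init(&p.query); q_init(&p.send);
    for (int i = 0; i < 5; i++) {
        int rc = pthread_create(&t[i], NULL, stage[i], &p);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < 5; i++) pthread_join(t[i], NULL);
    *stored = p.stored;         /* safe to read: all threads joined */
    *sent = p.sent;
    return 0;
}
```

Built with something like "cc -pthread", a call such as
run_pipeline(10, &stored, &sent) should leave stored == 5 and sent == 5
under the even/odd rule assumed here. A real version would need error
handling, queue teardown, and the shared on-disk state that the message
mentions as the point of contention between threads 3 and 4.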
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?15081.50170.297579.938254>