From owner-freebsd-arch Sun Nov 28 14:28:42 1999
Date: Sun, 28 Nov 1999 14:28:00 -0800 (PST)
From: Julian Elischer <julian@whistle.com>
To: arch@freebsd.org
Subject: Re: Which is the truth? (sycalls and traps) (fwd)

Peter says:
:I was rather surprised when I found out just how expensive kernel entry was
:some time ago. What I was doing was a reentrant syscall that acquired no
:locks and ran about 5 instructions in kernel context. Anyway, it took
:something like 300 times longer to do that (called via int $0x81) than to
:do a 'call' to equivalent code in userland. Anyway, with overheads on that
:scale, whether we push 5 or 8 or whatever registers in the handler is
:almost lost in the noise.
:
:Cheers,
:-Peter

Matt says:
Well, it could be 300x, but that's like comparing a cache hit to a cache
miss - in real terms a UP syscall takes, what, 1-3 uS? An SMP syscall
takes 6 uS. This on a PIII-450.
Both times can be cut down to less than 500 nS with fairly simple
optimizations. Unless you are doing hundreds of thousands of context
switches a second, the overhead is in the noise in real terms, and
*definitely* in the noise if you tack on a task switch in the middle of
that. Having the kernel do the context switch between threads has a huge
number of advantages that should outweigh, or at least equal, the minor
increase in overhead.

A couple of points that have been brought up in recent emails:

* blockages due to VM faults

All VM faults that do not occur with the SP in the UTS's stack (a quick
way of finding out if the UTS is running) can be telegraphed to the UTS,
which should be able to schedule another thread. (If the UTS is running
then we just block the entire process.)

* blockages due to file I/O (not even network I/O)

There is no need for the kernel to handle this. The UTS can be notified
and can schedule a new task with a lot more knowledge of what is needed
than the kernel has. Of course there is always the case of co-operative
scheduling, where the UTS decides and the kernel 'does'.

* disk parallelism (thread A reads file block from kernel cache, thread B
  reads file block and has a cache miss)

Once again, I don't think this requires the kernel to do the switch.

* event synchronization

What events?

* kernel state

Kernel state? Kernel state is probably going to be associated with the
process, and not with the KSEs that are sharing its quantum.

Even if one were to use an asynchronous call gate, one then has to deal
with the additional overhead imposed by the async call gate when a
syscall could have been run from the disk cache (that is, not blocked).
Personally speaking, I think async call gates are a huge mistake without
a prioritized, vectorable software interrupt mechanism to go along with
them. The current unix signal mechanism is simply not up to the task.

I don't think there is too much overhead... a copyout() of the syscall
return values.
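The "is the SP inside the UTS's stack?" test mentioned for the VM-fault case above amounts to a simple range check. The struct uts_region and sp_in_uts() names below are hypothetical, not from any actual kernel:

```c
/* Sketch of the check described above: a VM fault can be "telegraphed"
 * to the UTS only if the faulting stack pointer is NOT already inside
 * the UTS's own stack.  struct uts_region and sp_in_uts() are
 * hypothetical names for illustration. */
#include <stdint.h>
#include <stddef.h>

struct uts_region {
    uintptr_t base;   /* lowest address of the UTS stack */
    size_t    size;   /* length of the UTS stack in bytes */
};

/* Nonzero if sp falls inside the UTS stack, i.e. the UTS itself is
 * running and the fault must block the entire process. */
int sp_in_uts(const struct uts_region *uts, uintptr_t sp)
{
    return sp >= uts->base && sp - uts->base < uts->size;
}
```

The point of using the SP rather than a flag is that the kernel already has the faulting SP in hand at trap time, so the test costs two compares.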
There are serious issues with async call gates, including potential
resource hogging issues that frankly scare the hell out of me. I would
prefer a kernel stack for each thread, and I would prefer a syscall to
set a thread runnable/not-runnable. Such a syscall could specify an
optional CPU and optional run interval.

You don't need a kernel stack for a thread that is not doing I/O. You
only need to keep one available per thread that enters the kernel. When
the thread enters userspace again, you can keep the same stack hanging
around. If the thread in user space changes, and the new thread does a
syscall, then it comes back and you still have the same stack sitting
around... 1 stack, N threads... until one blocks. Then you grab a new
one (or block if you can't, but you should keep a cache of them sitting
around). Most threading programs that have thousands of threads don't
use them to do I/O but to implement active objects of some sort. They
expect the thread-switch overhead to be minuscule, and they can probably
make do with 10 KSEs for 1000 threads.

There are simply too many things that a UTS does not have access to -
such as knowing whether a syscall can complete without blocking or not -
to allow the UTS to actually perform the context switch.

You don't care if it is GOING to block... you handle that when it
happens.

-Matt
	Matthew Dillon
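The "cache of kernel stacks sitting around" scheme Julian sketches above could look like a simple free list: a thread grabs a stack on kernel entry, and as long as nobody blocks, the same stack serves every thread in turn. The kstack_get()/kstack_put() names, the sizes, and malloc() standing in for a kernel allocator are all illustrative assumptions.

```c
/* Sketch of a per-process cache of kernel stacks.  kstack_get()/
 * kstack_put(), KSTACK_SIZE/KSTACK_CACHE, and malloc() in place of a
 * kernel allocator are made up for illustration. */
#include <stdlib.h>

#define KSTACK_SIZE  16384
#define KSTACK_CACHE 10        /* e.g. ~10 KSEs serving 1000 threads */

struct kstack { struct kstack *next; };

static struct kstack *free_list;   /* stacks parked between uses */
static int cached;

/* A thread entering the kernel takes a stack: reuse a cached one if
 * possible, otherwise allocate (a real kernel might block here). */
void *kstack_get(void)
{
    if (free_list != NULL) {
        struct kstack *ks = free_list;
        free_list = ks->next;
        cached--;
        return ks;
    }
    return malloc(KSTACK_SIZE);
}

/* When the thread using a stack returns to userland without having
 * blocked, park the stack so the next syscall from any thread reuses
 * it: 1 stack, N threads. */
void kstack_put(void *p)
{
    if (cached < KSTACK_CACHE) {
        struct kstack *ks = p;
        ks->next = free_list;
        free_list = ks;
        cached++;
    } else {
        free(p);
    }
}
```

A new stack is only taken off the list (or allocated) when a thread actually blocks in the kernel while holding one, which is what keeps the count near the number of KSEs rather than the number of threads.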