From: Terry Lambert
Date: Thu, 03 Apr 2003 03:19:40 -0800
To: Igor Sysoev
cc: freebsd-arch@freebsd.org
Subject: Re: libthr and 1:1 threading.
Message-ID: <3E8C18CC.AF2C6B7F@mindspring.com>

Igor Sysoev wrote:
> If a process caused a page fault or memory mapping fault at user level
> where do you suppose to return in user space after a fault was just queued ?
> To the same instruction that caused this fault ?

Yes.  And then return as if the fault had failed.  Only it's not
fatal, because you only return out of the code if the fd was
marked async; otherwise you go to sleep on the buffer getting
filled in by an I/O initiated by the fault.
If you do return, you don't crash the process; you return with
EAGAIN.  The trap handler provides the context for the delayed
operation.

> With threads you can run another thread in such situation.

Yes.  And save the 20ms per fault that Robert Watson estimated
was the reason for the performance difference between libc_r
("user space threads") and libthr ("1:1 kernel threads").

> BTW what do you mean by 'async fd' in Solaris ?
> O_ASYNC ?  I do not see it in Solaris 8.
> O_NONBLOCK ?  It does not matter for disk files.

O_NONBLOCK.  Examining the Solaris 8 sources, it seems to have
been removed from the disk I/O modules, and applies only to
socktpi.c, ptm.c, audio_mc.c, ecpp.c, and envctrl.c.  Apparently,
it's also missing from the tty code, which goes against what
Matt claimed, actually.

This is unfortunate, in that it leaves me without an easily
accessible example, unless you have a USL UNIX source license?

Is there any chance you have legal access to the Solaris 2.2 or
even the 2.4 source code, which immediately followed the
integration project between USL and SunSoft?  Or the USL
SVR4.0.2 or SVR4.2 source code?

> aioread() or aio_read() ?  They are library calls that implemented
> via additional LWP for regular disk files.

I know this.  This was basically what Julian and Matt had
discussed as a means of implementing AIO in FreeBSD, rather
than using system calls.

> >> Certainly, you can argue that the application should be structured
> >> to make all I/O explicit and asynchronous, but for various reasons,
> >> that's not the case :-).
> >
> >The mmap'ed file case is obviously not something that can be
> >handled without an explicit contract between user and kernel
> >for notification of the pagein temporary failure (I would use
> >a signal for that, probably, as a gross first approximation,
> >but per-process signal handling is currently not happy...).
>
> And what do you suppose to do in a signal handler ?
> Using some non-reentrant library functions ?

No.
Call the user thread scheduler as a result of a fault that is
normally not trappable because it resulted from a memory access
to an mmap()'ed region of the address space, rather than
resulting from an explicit system call.  There is no system call
context when a trap like that occurs; there is only a trap
context.

A signal would allow you to force a user threads context switch
for a thread that can't run only because running it would
result in a page fault and delay all the other runnable threads
that aren't waiting on a condition that would result in a page
fault.

The signal is just to get back to user space so you can force
the faulting thread to yield and restart the operation by being
rescheduled later, after the fault has been satisfied by the
kernel's I/O subsystem.

-- Terry
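
A rough sketch of the "return with EAGAIN instead of sleeping"
contract described above, as seen from user space.  This is only an
illustration under stated assumptions: since O_NONBLOCK has no effect
on regular disk files on most systems (the point Igor raises), a
non-blocking pipe or socket descriptor stands in for the "async fd".
The names set_nonblocking(), read_or_yield(), uthread_yield() and
yields_seen are hypothetical and are not part of libc_r, libthr, or
any kernel interface.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for the user-space thread scheduler.  A real
 * user threads library would switch to another runnable thread here;
 * this stub only counts how often it was asked to. */
static int yields_seen = 0;

static void uthread_yield(void)
{
    yields_seen++;
}

/* Put fd into non-blocking mode.  Returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Attempt the read once.  If the data is not available yet, the kernel
 * fails the call with EAGAIN rather than putting the whole process to
 * sleep; we then yield to another user thread and tell the caller to
 * retry later, once the thread has been rescheduled. */
static ssize_t read_or_yield(int fd, void *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);

    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
        uthread_yield();
        errno = EAGAIN;     /* normalize for the caller */
    }
    return n;
}
```

With an empty non-blocking pipe, read_or_yield() returns -1 with
errno set to EAGAIN and the scheduler stub is invoked once; after
data is written to the pipe, the same call returns the byte count
without yielding.  The mmap() case discussed above has no such call
to fail, which is why a signal (or some other explicit user/kernel
contract) would be needed there.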