Message-ID: <3E3FB756.AFA5ABFD@mindspring.com>
Date: Tue, 04 Feb 2003 04:51:34 -0800
From: Terry Lambert
To: rmkml
Cc: freebsd-hackers@FreeBSD.ORG
Subject: Re: vfork / execve / accept lock

rmkml wrote:
> Thank for you answer.

Sorry that it probably was not the answer you wanted. 8-(.

> It is difficult to find anything concerning the signal model
> of BSD implementation. In particular, for threaded applications.
> If you can give me some advise or documentation to read, it will
> be very helpfull to me .

I recommend the POSIX standard, the "Go Solo 2" book (I'm afraid
it's out of print, now), and the O'Reilly book.

The rule of thumb is that anything that is "undefined" or
"implementation defined", or is an extension -- don't use it.

BSD attempts to implement strict POSIX 1003.1 signals. It does not
implement the POSIX "reliable signal delivery" mechanism at this
time.

With regard to threads, signals are delivered to the process. This
may mean that any thread that happens to be running at the time, or
the threads scheduler, gets the actual signal.

In general, the user space threads implementation is what's called
a "call conversion" implementation. The way this operates is by
trading a blocking call for a non-blocking call, plus an entry into
the user space threads scheduler. If there are other threads that
are pending execution (not blocked on resources), then the user
space threads scheduler schedules them to run.

The scheduler function is _thread_kern_sched(); this is the pthreads
"kernel", in user space. It's located in libc_r, in the source file
/usr/src/lib/libc_r/uthread/uthread_kern.c; if there is only one
thread, then it sets a timer and retries. Because everything is
non-blocking, this is the only way to convert a non-blocking call in
a single thread to "block" pending completion of the operation
(really, it polls, and the timer is to keep it from swamping the CPU
with a buzz-loop in the scheduler). The time for this wait is tiny,
so it's not going to be the timeout you saw.

It's tricky, because handling the signal is different in the
scheduler from elsewhere, since it's an interval timer signal, and
you might be using interval timers in your program. But basically,
this means that on return, when signals are unmasked, they are
delivered to the unmasking process. This happens either
automatically, or as a result of the siglongjmp from the scheduler.
So signals which are not caught may end up delivered in the context
of any thread, at random.

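
To make the "call conversion" idea above concrete, here is a minimal sketch in C. It is not the libc_r code: toy_sched_yield() is a hypothetical stand-in for the entry into the user space scheduler (the real routine is _thread_kern_sched()), and converted_read() and demo_converted_read() are names invented for this illustration. The sketch only shows the shape of the technique, a non-blocking call retried after handing control to a scheduler.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the user space scheduler entry point.
 * A real threads library would switch to another runnable thread here;
 * this toy version only yields the CPU. */
static void toy_sched_yield(void)
{
    sched_yield();
}

/* A "blocking" read built from a non-blocking read plus a scheduler
 * entry, which is the trade the call conversion approach makes. */
static ssize_t converted_read(int fd, void *buf, size_t len)
{
    int flags;

    assert(buf != NULL);
    flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;

    for (;;) {
        ssize_t n = read(fd, buf, len);
        if (n >= 0)
            return n;                 /* data (or EOF) is available */
        if (errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)
            return -1;                /* a real error */
        toy_sched_yield();            /* nothing yet: let others run */
    }
}

/* Demo: a child writes two bytes after a short delay; the parent's
 * converted read polls until they arrive. Returns the byte count. */
static ssize_t demo_converted_read(void)
{
    int pfd[2];
    char buf[8];
    pid_t pid;
    ssize_t n;

    if (pipe(pfd) == -1)
        return -1;
    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        close(pfd[0]);
        usleep(10000);                /* make the parent poll for a while */
        if (write(pfd[1], "ok", 2) != 2)
            _exit(1);
        _exit(0);
    }
    close(pfd[1]);
    n = converted_read(pfd[0], buf, sizeof buf);
    close(pfd[0]);
    waitpid(pid, NULL, 0);
    return n;
}
```

Calling demo_converted_read() from a main() should return 2 once the child has written. The caller sees a read that appears to block, while the descriptor itself never blocks in the kernel.
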
Since signals run on their own stack, and have their own context
(sort of), you can do everything you'd normally do in a signal
handler, except assume context other than process context. So if you
have a thread allocated (or auto) variable that you set a global
pointer to, you can't access it from a signal handler, and assume
that the handler will only fire in the thread that originally
registered it, etc.

The best documentation is Chris Provenzano's documentation on "MIT
pthreads". He wrote a couple of papers on the pthreads system he
wrote. It's a distant ancestor of the FreeBSD user space pthreads
implementation.

> In the first part of the answer, do you want to say that a threaded
> application can't use vfork/fork/rfork command because the result
> will be undefined ?

No. vfork() is just a wrapper for fork(), in threads. That's a
library implementation detail. The source file is
/usr/src/lib/libc_r/uthread/uthread_vfork.c; the rfork() call
doesn't exist at all.

If you are expecting the main process execution to suspend from the
vfork() to the execve(), as documented for vfork(), well, it won't.

> In this case is there a solution to launch external cmd ?

The fork()/execve() combination and system() work. I really
recommend using fork/execve. The reason for this is that the
system() is a cancellation point. Basically, it will suspend until
the command returns. For reapable status, use fork() and system(),
rather than trying to roll your own.

> In your opinion, can directly the application create a new thread,
> this thread used execve, and the parent thread waitpid() ?

Not without causing a cancellation point, which means that the
thread you care about isn't waiting on the status, and the SIGCLD
will interrupt it, and then your SIGCLD will hit the handler and not
the wait. You need to use wait4().
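
A minimal sketch of the fork()/execve() plus wait4() approach recommended above, in C. The function name run_command() is invented for this illustration, and the usleep() in the polling loop is an assumption standing in for the point where a user space threads library would enter its scheduler; this is not the libc_r implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Run argv[0] with the given arguments and return its exit status,
 * 127 if the exec failed, or -1 on error or abnormal termination.
 * run_command() is a hypothetical helper for this sketch. */
static int run_command(char *const argv[])
{
    int status = 0;
    pid_t pid;

    assert(argv != NULL && argv[0] != NULL);

    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* Child: do nothing between fork() and execve() except exec. */
        execve(argv[0], argv, environ);
        _exit(127);                   /* only reached if execve() failed */
    }

    /* Parent: poll with WNOHANG instead of blocking in the kernel,
     * the same trade a call conversion library makes internally. */
    for (;;) {
        pid_t r = wait4(pid, &status, WNOHANG, NULL);
        if (r == pid)
            break;                    /* child has been reaped */
        if (r == -1 && errno != EINTR)
            return -1;
        usleep(1000);                 /* assumption: stand-in for a scheduler entry */
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, run_command() with the argument vector { "/bin/sh", "-c", "exit 3", NULL } should return 3. Because the child is reaped by an explicit wait4() on its own pid, the status cannot be stolen by an unrelated wait elsewhere in the program.
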
This is probably a bug in the implementation, since waitpid() is
defined to be identical to wait4(), with an rusage value of 0, so
the threads waitpid() needs to call the threads wait4(), which uses
the WNOHANG to convert from a blocking call to a non-blocking call,
and avoid the cancellation point.

Regular wait() also introduces a cancellation point. I'm not
positive that it can be implemented in terms of wait4(), since the
man page doesn't note an equivalence. I would *think* it could be
wait4(-1, status, 0, 0). If so, this would avoid the cancellation
point there, too.

This is mostly a "signal thing"; sigwait() is also suspect, but with
nothing to do about it.

You can grep for _thread_enter_cancellation_point() in all the
source files in /usr/src/lib/libc_r/uthread/*.c; this will basically
tell you every place you need to be worried about a conversion from
a blocking to a non-blocking system call. I don't think there are
really any semantic issues, except for the signal related calls, and
things that could be bad if they started and then were restarted
(e.g. system() of most programs that took any time to run at all
would be a terrible idea), for what that's worth.

Cancellation points are problematic; they are permitted by the
standard, but you really want to avoid them. There are also places
you could get bit, that probably should have them, but don't,
because they're not really supported, like all the System V IPC
stuff, mmap, and so on.

My personal recommendation with regard to signals:

Preestablish signal handlers for all signals, and then trampoline
them to a signal handling thread using explicit calls to send the
signal to the thread, using pthread_kill(), in the signal handler
(but don't do it for SIG_IGN or SIG_DFL, if the default action is to
ignore). If you control the signal routing yourself, it will not
bite you on the butt.

-- Terry