Date: Sun, 2 Sep 2007 14:17:02 +1000 (EST)
From: Bruce Evans
To: Jilles Tjoelker
Cc: Joe Marcus Clarke, freebsd-arch@freebsd.org
Subject: Re: Understanding interrupted system calls
Message-ID: <20070902131910.H46281@delplex.bde.org>
In-Reply-To: <20070901224025.GA97796@stack.nl>
References: <1188600721.1255.11.camel@shumai.marcuscom.com>
 <20070901112600.GA33832@stack.nl>
 <1188660782.41727.5.camel@shumai.marcuscom.com>
 <20070901224025.GA97796@stack.nl>

On Sun, 2 Sep 2007, Jilles Tjoelker wrote:

> On Sat, Sep 01, 2007 at 11:33:02AM -0400, Joe Marcus Clarke wrote:
>> However, I'm curious as to my other point in this thread.  Why should
>> one need to re-register the default signal handlers to get a syscall to
>> return EINTR?
>> Or should ERESTART be caught and turned into EINTR then
>> return to the caller (as in kern_connect())?
>
> It is intended that most blocking system calls are not interrupted by
> signals. This saves the programmer some checks on EINTR.

Yes, I think this is just the historical default for BSD.  Not restarting
syscalls is very unusual in BSD, and this was implemented by defaulting
ps_sigintr to all unset (?).  Before sigaction(2) existed, signal(3) was
probably signal(2), and most programs used only signal() and got the
default.  siginterrupt(3) was more probably siginterrupt(2), and the few
programs that understood this stuff and wanted to change the default had
to use it to get EINTR.

Now with sigaction(2), the default doesn't apply to userland, since
setting up a signal catcher requires using sigaction(2), which sets the
ps_sigintr bit to the inverse of the SA_RESTART bit in the sigaction data.
However, the default might still affect operation in the kernel for
programs that never call sigaction().  I think it shouldn't have any
effect except to return to near the syscall entry point to decide what to
do.  Lower levels should see ERESTART and return to near the syscall entry
point.  Similarly if sigaction() is used to change the default.

> Some system calls, e.g. connect(), read/write/etc that have already
> committed some data and under BSD also select/poll/kqueue do not restart
> and always return EINTR or partial success. In the kernel code, this
> appears as changing ERESTART to EINTR.

This translation happens at a high level, but the translation of
ps_sigintr happens at the [*]sleep() level.  So ps_sigintr has a
significant effect at a low level, but only (?) to select between ERESTART
and EINTR.
From Jilles' previous reply:

>>> The problem seems to be the following code in
>>> src/sys/dev/syscons/syscons.c, in case VT_WAITACTIVE in scioctl():
>>>
>>>         while ((error=tsleep(&scp->smode, PZERO|PCATCH,
>>>                              "waitvt", 0)) == ERESTART) ;
>>>
>>> If a signal is caught and system call restart is enabled for that
>>> signal, this makes it spin in a tight loop, waiting in vain for the
>>> signal to go away. The idea of ERESTART is that the syscall function
>>> returns it and then the signal handler is entered. If and when the
>>> signal handler returns, it will return to the system call instruction,
>>> restarting it (perhaps this is optimized to avoid the switch to userland
>>> and back). With EINTR, the signal handler would return to directly
>>> after the system call instruction.
>>>
>>> The fixed version would then be
>>>
>>>         error = tsleep(&scp->smode, PZERO|PCATCH, "waitvt", 0);

I think this is right.  The kernel should never loop on ERESTART like
this.  Please fix the remaining style bug in it (missing spaces around
binary operator).

Another problem here is that tty drivers should rarely or never use
tsleep().  They should use ttysleep() so as to check for the tty being
revoked.  After revoke(), ttysleep() returns ERESTART to make lower levels
return to near the syscall entry point (where the syscall is normally
restarted and fails because the file descriptor has been moved to deadfs)
provided there are no broken lower levels that loop on ERESTART like the
above.  Not using ttysleep() doesn't seem to cause any problems here (it
actually avoids the buggy loop).

Bruce