Date: Thu, 30 May 2013 19:14:35 +0000 (UTC) From: John Baldwin <jhb@FreeBSD.org> To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-9@freebsd.org Subject: svn commit: r251147 - in stable/9/sys: fs/nfs fs/nfsclient kern nfsclient sys tools Message-ID: <201305301914.r4UJEZuU017310@svn.freebsd.org>
next in thread | raw e-mail | index | archive | help
Author: jhb Date: Thu May 30 19:14:34 2013 New Revision: 251147 URL: http://svnweb.freebsd.org/changeset/base/251147 Log: MFC 246417,247116,248584: Rework the handling of stop signals in the NFS client. The changes in 195702, 195703, and 195821 prevented a thread from suspending while holding locks inside of NFS by forcing the thread to fail sleeps with EINTR or ERESTART but defer the thread suspension to the user boundary. However, this had the effect that stopping a process during an NFS request could abort the request and trigger EINTR errors that were visible to userland processes (previously the thread would have suspended and completed the request once it was resumed). This change instead effectively masks stop signals while in the NFS client. It uses the existing TDF_SBDRY flag to effect this since SIGSTOP cannot be masked directly. Instead of setting PBDRY on individual sleeps, change the VFS_*() and VOP_*() methods to defer stop signals for filesystems which request this behavior via a new VFCF_SBDRY flag. Note that this has to be a VFC flag rather than a MNTK flag so that it works properly with VFS_MOUNT() when the mount is not yet fully constructed. For now, only the NFS clients set this new flag in VFS_SET(). A few other related changes: - Add an assertion to ensure that TDF_SBDRY doesn't leak to userland. - When a lookup request uses VOP_READLINK() to follow a symlink, mark the request as being on behalf of the thread performing the lookup (cnp_thread) rather than using a NULL thread pointer. This causes NFS to properly handle signals during this VOP on an interruptible mount. - Ignore thread suspend requests due to SIGSTOP if stop signals are currently deferred. This can occur if a process is stopped via SIGSTOP while a thread is running or runnable but before it has set TDF_SBDRY. Modified: stable/9/sys/fs/nfs/nfs_commonkrpc.c stable/9/sys/fs/nfsclient/nfs_clvfsops.c stable/9/sys/kern/kern_sig.c stable/9/sys/kern/kern_thread.c stable/9/sys/kern/subr_sleepqueue.c stable/9/sys/kern/subr_trap.c stable/9/sys/kern/vfs_export.c stable/9/sys/kern/vfs_lookup.c stable/9/sys/nfsclient/nfs_krpc.c stable/9/sys/nfsclient/nfs_vfsops.c stable/9/sys/sys/mount.h stable/9/sys/sys/signalvar.h stable/9/sys/tools/vnode_if.awk Directory Properties: stable/9/sys/ (props changed) stable/9/sys/fs/ (props changed) stable/9/sys/sys/ (props changed) Modified: stable/9/sys/fs/nfs/nfs_commonkrpc.c ============================================================================== --- stable/9/sys/fs/nfs/nfs_commonkrpc.c Thu May 30 17:24:36 2013 (r251146) +++ stable/9/sys/fs/nfs/nfs_commonkrpc.c Thu May 30 19:14:34 2013 (r251147) @@ -952,7 +952,6 @@ int newnfs_sig_set[] = { SIGTERM, SIGHUP, SIGKILL, - SIGSTOP, SIGQUIT }; @@ -973,7 +972,7 @@ nfs_sig_pending(sigset_t set) /* * The set/restore sigmask functions are used to (temporarily) overwrite - * the process p_sigmask during an RPC call (for example). These are also + * the thread td_sigmask during an RPC call (for example). These are also * used in other places in the NFS client that might tsleep(). */ void @@ -1002,8 +1001,9 @@ newnfs_set_sigmask(struct thread *td, si SIGDELSET(newset, newnfs_sig_set[i]); } mtx_unlock(&p->p_sigacts->ps_mtx); + kern_sigprocmask(td, SIG_SETMASK, &newset, oldset, + SIGPROCMASK_PROC_LOCKED); PROC_UNLOCK(p); - kern_sigprocmask(td, SIG_SETMASK, &newset, oldset, 0); } void Modified: stable/9/sys/fs/nfsclient/nfs_clvfsops.c ============================================================================== --- stable/9/sys/fs/nfsclient/nfs_clvfsops.c Thu May 30 17:24:36 2013 (r251146) +++ stable/9/sys/fs/nfsclient/nfs_clvfsops.c Thu May 30 19:14:34 2013 (r251147) @@ -133,7 +133,7 @@ static struct vfsops nfs_vfsops = { .vfs_unmount = nfs_unmount, .vfs_sysctl = nfs_sysctl, }; -VFS_SET(nfs_vfsops, nfs, VFCF_NETWORK); +VFS_SET(nfs_vfsops, nfs, VFCF_NETWORK | VFCF_SBDRY); /* So that loader and kldload(2) can find us, wherever we are.. */ MODULE_VERSION(nfs, 1); Modified: stable/9/sys/kern/kern_sig.c ============================================================================== --- stable/9/sys/kern/kern_sig.c Thu May 30 17:24:36 2013 (r251146) +++ stable/9/sys/kern/kern_sig.c Thu May 30 19:14:34 2013 (r251147) @@ -2352,6 +2352,13 @@ tdsigwakeup(struct thread *td, int sig, } /* + * Don't awaken a sleeping thread for SIGSTOP if the + * STOP signal is deferred. + */ + if ((prop & SA_STOP) && (td->td_flags & TDF_SBDRY)) + goto out; + + /* * Give low priority threads a better chance to run. */ if (td->td_priority > PUSER) @@ -2392,12 +2399,13 @@ sig_suspend_threads(struct thread *td, s if ((TD_IS_SLEEPING(td2) || TD_IS_SWAPPED(td2)) && (td2->td_flags & TDF_SINTR)) { if (td2->td_flags & TDF_SBDRY) { - if (TD_IS_SUSPENDED(td2)) - wakeup_swapper |= - thread_unsuspend_one(td2); - if (TD_ON_SLEEPQ(td2)) - wakeup_swapper |= - sleepq_abort(td2, ERESTART); + /* + * Once a thread is asleep with + * TDF_SBDRY set, it should never + * become suspended due to this check. + */ + KASSERT(!TD_IS_SUSPENDED(td2), + ("thread with deferred stops suspended")); } else if (!TD_IS_SUSPENDED(td2)) { thread_suspend_one(td2); } @@ -2517,6 +2525,40 @@ tdsigcleanup(struct thread *td) } /* + * Defer the delivery of SIGSTOP for the current thread. Returns true + * if stops were deferred and false if they were already deferred. + */ +int +sigdeferstop(void) +{ + struct thread *td; + + td = curthread; + if (td->td_flags & TDF_SBDRY) + return (0); + thread_lock(td); + td->td_flags |= TDF_SBDRY; + thread_unlock(td); + return (1); +} + +/* + * Permit the delivery of SIGSTOP for the current thread. This does + * not immediately suspend if a stop was posted. Instead, the thread + * will suspend either via ast() or a subsequent interruptible sleep. + */ +void +sigallowstop() +{ + struct thread *td; + + td = curthread; + thread_lock(td); + td->td_flags &= ~TDF_SBDRY; + thread_unlock(td); +} + +/* * If the current process has received a signal (should be caught or cause * termination, should interrupt current syscall), return the signal number. * Stop signals with default action are processed immediately, then cleared; @@ -2548,7 +2590,7 @@ issignal(struct thread *td, int stop_all SIGSETOR(sigpending, p->p_sigqueue.sq_signals); SIGSETNAND(sigpending, td->td_sigmask); - if (p->p_flag & P_PPWAIT) + if (p->p_flag & P_PPWAIT || td->td_flags & TDF_SBDRY) SIG_STOPSIGMASK(sigpending); if (SIGISEMPTY(sigpending)) /* no signal to send */ return (0); @@ -2663,10 +2705,6 @@ issignal(struct thread *td, int stop_all (p->p_pgrp->pg_jobc == 0 && prop & SA_TTYSTOP)) break; /* == ignore */ - - /* Ignore, but do not drop the stop signal. */ - if (stop_allowed != SIG_STOP_ALLOWED) - return (sig); mtx_unlock(&ps->ps_mtx); WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK, &p->p_mtx.lock_object, "Catching SIGSTOP"); Modified: stable/9/sys/kern/kern_thread.c ============================================================================== --- stable/9/sys/kern/kern_thread.c Thu May 30 17:24:36 2013 (r251146) +++ stable/9/sys/kern/kern_thread.c Thu May 30 19:14:34 2013 (r251147) @@ -796,6 +796,17 @@ thread_suspend_check(int return_instead) return (ERESTART); /* + * Ignore suspend requests for stop signals if they + * are deferred. + */ + if (P_SHOULDSTOP(p) == P_STOPPED_SIG && + td->td_flags & TDF_SBDRY) { + KASSERT(return_instead, + ("TDF_SBDRY set for unsafe thread_suspend_check")); + return (0); + } + + /* * If the process is waiting for us to exit, * this thread should just suicide. * Assumes that P_SINGLE_EXIT implies P_STOPPED_SINGLE. Modified: stable/9/sys/kern/subr_sleepqueue.c ============================================================================== --- stable/9/sys/kern/subr_sleepqueue.c Thu May 30 17:24:36 2013 (r251146) +++ stable/9/sys/kern/subr_sleepqueue.c Thu May 30 19:14:34 2013 (r251147) @@ -351,8 +351,6 @@ sleepq_add(void *wchan, struct lock_obje if (flags & SLEEPQ_INTERRUPTIBLE) { td->td_flags |= TDF_SINTR; td->td_flags &= ~TDF_SLEEPABORT; - if (flags & SLEEPQ_STOP_ON_BDRY) - td->td_flags |= TDF_SBDRY; } thread_unlock(td); } @@ -599,7 +597,7 @@ sleepq_check_signals(void) /* We are no longer in an interruptible sleep. */ if (td->td_flags & TDF_SINTR) - td->td_flags &= ~(TDF_SINTR | TDF_SBDRY); + td->td_flags &= ~TDF_SINTR; if (td->td_flags & TDF_SLEEPABORT) { td->td_flags &= ~TDF_SLEEPABORT; @@ -746,7 +744,7 @@ sleepq_resume_thread(struct sleepqueue * td->td_wmesg = NULL; td->td_wchan = NULL; - td->td_flags &= ~(TDF_SINTR | TDF_SBDRY); + td->td_flags &= ~TDF_SINTR; CTR3(KTR_PROC, "sleepq_wakeup: thread %p (pid %ld, %s)", (void *)td, (long)td->td_proc->p_pid, td->td_name); Modified: stable/9/sys/kern/subr_trap.c ============================================================================== --- stable/9/sys/kern/subr_trap.c Thu May 30 17:24:36 2013 (r251146) +++ stable/9/sys/kern/subr_trap.c Thu May 30 19:14:34 2013 (r251147) @@ -141,6 +141,8 @@ userret(struct thread *td, struct trapfr ("userret: Returning with %d locks held.", td->td_locks)); KASSERT(td->td_vp_reserv == 0, ("userret: Returning while holding vnode reservation")); + KASSERT((td->td_flags & TDF_SBDRY) == 0, + ("userret: Returning with stop signals deferred")); #ifdef VIMAGE /* Unfortunately td_vnet_lpush needs VNET_DEBUG. */ VNET_ASSERT(curvnet == NULL, Modified: stable/9/sys/kern/vfs_export.c ============================================================================== --- stable/9/sys/kern/vfs_export.c Thu May 30 17:24:36 2013 (r251146) +++ stable/9/sys/kern/vfs_export.c Thu May 30 19:14:34 2013 (r251147) @@ -49,6 +49,7 @@ __FBSDID("$FreeBSD$"); #include <sys/mutex.h> #include <sys/rwlock.h> #include <sys/refcount.h> +#include <sys/signalvar.h> #include <sys/socket.h> #include <sys/systm.h> #include <sys/vnode.h> Modified: stable/9/sys/kern/vfs_lookup.c ============================================================================== --- stable/9/sys/kern/vfs_lookup.c Thu May 30 17:24:36 2013 (r251146) +++ stable/9/sys/kern/vfs_lookup.c Thu May 30 19:14:34 2013 (r251147) @@ -348,7 +348,7 @@ namei(struct nameidata *ndp) auio.uio_offset = 0; auio.uio_rw = UIO_READ; auio.uio_segflg = UIO_SYSSPACE; - auio.uio_td = (struct thread *)0; + auio.uio_td = td; auio.uio_resid = MAXPATHLEN; error = VOP_READLINK(ndp->ni_vp, &auio, cnp->cn_cred); if (error) { Modified: stable/9/sys/nfsclient/nfs_krpc.c ============================================================================== --- stable/9/sys/nfsclient/nfs_krpc.c Thu May 30 17:24:36 2013 (r251146) +++ stable/9/sys/nfsclient/nfs_krpc.c Thu May 30 19:14:34 2013 (r251147) @@ -699,7 +699,6 @@ int nfs_sig_set[] = { SIGTERM, SIGHUP, SIGKILL, - SIGSTOP, SIGQUIT }; @@ -720,7 +719,7 @@ nfs_sig_pending(sigset_t set) /* * The set/restore sigmask functions are used to (temporarily) overwrite - * the process p_sigmask during an RPC call (for example). These are also + * the thread td_sigmask during an RPC call (for example). These are also * used in other places in the NFS client that might tsleep(). */ void @@ -749,8 +748,9 @@ nfs_set_sigmask(struct thread *td, sigse SIGDELSET(newset, nfs_sig_set[i]); } mtx_unlock(&p->p_sigacts->ps_mtx); + kern_sigprocmask(td, SIG_SETMASK, &newset, oldset, + SIGPROCMASK_PROC_LOCKED); PROC_UNLOCK(p); - kern_sigprocmask(td, SIG_SETMASK, &newset, oldset, 0); } void Modified: stable/9/sys/nfsclient/nfs_vfsops.c ============================================================================== --- stable/9/sys/nfsclient/nfs_vfsops.c Thu May 30 17:24:36 2013 (r251146) +++ stable/9/sys/nfsclient/nfs_vfsops.c Thu May 30 19:14:34 2013 (r251147) @@ -146,7 +146,7 @@ static struct vfsops nfs_vfsops = { .vfs_unmount = nfs_unmount, .vfs_sysctl = nfs_sysctl, }; -VFS_SET(nfs_vfsops, oldnfs, VFCF_NETWORK); +VFS_SET(nfs_vfsops, oldnfs, VFCF_NETWORK | VFCF_SBDRY); /* So that loader and kldload(2) can find us, wherever we are.. */ MODULE_VERSION(oldnfs, 1); Modified: stable/9/sys/sys/mount.h ============================================================================== --- stable/9/sys/sys/mount.h Thu May 30 17:24:36 2013 (r251146) +++ stable/9/sys/sys/mount.h Thu May 30 19:14:34 2013 (r251147) @@ -517,6 +517,7 @@ struct ovfsconf { #define VFCF_UNICODE 0x00200000 /* stores file names as Unicode */ #define VFCF_JAIL 0x00400000 /* can be mounted from within a jail */ #define VFCF_DELEGADMIN 0x00800000 /* supports delegated administration */ +#define VFCF_SBDRY 0x01000000 /* defer stop requests */ typedef uint32_t fsctlop_t; @@ -654,31 +655,6 @@ struct vfsops { vfs_statfs_t __vfs_statfs; -#define VFS_MOUNT(MP) (*(MP)->mnt_op->vfs_mount)(MP) -#define VFS_UNMOUNT(MP, FORCE) (*(MP)->mnt_op->vfs_unmount)(MP, FORCE) -#define VFS_ROOT(MP, FLAGS, VPP) \ - (*(MP)->mnt_op->vfs_root)(MP, FLAGS, VPP) -#define VFS_QUOTACTL(MP, C, U, A) \ - (*(MP)->mnt_op->vfs_quotactl)(MP, C, U, A) -#define VFS_STATFS(MP, SBP) __vfs_statfs((MP), (SBP)) -#define VFS_SYNC(MP, WAIT) (*(MP)->mnt_op->vfs_sync)(MP, WAIT) -#define VFS_VGET(MP, INO, FLAGS, VPP) \ - (*(MP)->mnt_op->vfs_vget)(MP, INO, FLAGS, VPP) -#define VFS_FHTOVP(MP, FIDP, FLAGS, VPP) \ - (*(MP)->mnt_op->vfs_fhtovp)(MP, FIDP, FLAGS, VPP) -#define VFS_CHECKEXP(MP, NAM, EXFLG, CRED, NUMSEC, SEC) \ - (*(MP)->mnt_op->vfs_checkexp)(MP, NAM, EXFLG, CRED, NUMSEC, SEC) -#define VFS_EXTATTRCTL(MP, C, FN, NS, N) \ - (*(MP)->mnt_op->vfs_extattrctl)(MP, C, FN, NS, N) -#define VFS_SYSCTL(MP, OP, REQ) \ - (*(MP)->mnt_op->vfs_sysctl)(MP, OP, REQ) -#define VFS_SUSP_CLEAN(MP) \ - ({if (*(MP)->mnt_op->vfs_susp_clean != NULL) \ - (*(MP)->mnt_op->vfs_susp_clean)(MP); }) -#define VFS_RECLAIM_LOWERVP(MP, VP) \ - ({if (*(MP)->mnt_op->vfs_reclaim_lowervp != NULL) \ - (*(MP)->mnt_op->vfs_reclaim_lowervp)((MP), (VP)); }) - #define VFS_NEEDSGIANT_(MP) \ ((MP) != NULL && ((MP)->mnt_kern_flag & MNTK_MPSAFE) == 0) @@ -714,9 +690,127 @@ vfs_statfs_t __vfs_statfs; mtx_assert(&Giant, MA_OWNED); \ } while (0) +#define VFS_PROLOGUE(MP) do { \ + int _enable_stops; \ + \ + _enable_stops = ((MP) != NULL && \ + ((MP)->mnt_vfc->vfc_flags & VFCF_SBDRY) && sigdeferstop()) + +#define VFS_EPILOGUE(MP) \ + if (_enable_stops) \ + sigallowstop(); \ +} while (0) + +#define VFS_MOUNT(MP) ({ \ + int _rc; \ + \ + VFS_PROLOGUE(MP); \ + _rc = (*(MP)->mnt_op->vfs_mount)(MP); \ + VFS_EPILOGUE(MP); \ + _rc; }) + +#define VFS_UNMOUNT(MP, FORCE) ({ \ + int _rc; \ + \ + VFS_PROLOGUE(MP); \ + _rc = (*(MP)->mnt_op->vfs_unmount)(MP, FORCE); \ + VFS_EPILOGUE(MP); \ + _rc; }) + +#define VFS_ROOT(MP, FLAGS, VPP) ({ \ + int _rc; \ + \ + VFS_PROLOGUE(MP); \ + _rc = (*(MP)->mnt_op->vfs_root)(MP, FLAGS, VPP); \ + VFS_EPILOGUE(MP); \ + _rc; }) + +#define VFS_QUOTACTL(MP, C, U, A) ({ \ + int _rc; \ + \ + VFS_PROLOGUE(MP); \ + _rc = (*(MP)->mnt_op->vfs_quotactl)(MP, C, U, A); \ + VFS_EPILOGUE(MP); \ + _rc; }) + +#define VFS_STATFS(MP, SBP) ({ \ + int _rc; \ + \ + VFS_PROLOGUE(MP); \ + _rc = __vfs_statfs((MP), (SBP)); \ + VFS_EPILOGUE(MP); \ + _rc; }) + +#define VFS_SYNC(MP, WAIT) ({ \ + int _rc; \ + \ + VFS_PROLOGUE(MP); \ + _rc = (*(MP)->mnt_op->vfs_sync)(MP, WAIT); \ + VFS_EPILOGUE(MP); \ + _rc; }) + +#define VFS_VGET(MP, INO, FLAGS, VPP) ({ \ + int _rc; \ + \ + VFS_PROLOGUE(MP); \ + _rc = (*(MP)->mnt_op->vfs_vget)(MP, INO, FLAGS, VPP); \ + VFS_EPILOGUE(MP); \ + _rc; }) + +#define VFS_FHTOVP(MP, FIDP, FLAGS, VPP) ({ \ + int _rc; \ + \ + VFS_PROLOGUE(MP); \ + _rc = (*(MP)->mnt_op->vfs_fhtovp)(MP, FIDP, FLAGS, VPP); \ + VFS_EPILOGUE(MP); \ + _rc; }) + +#define VFS_CHECKEXP(MP, NAM, EXFLG, CRED, NUMSEC, SEC) ({ \ + int _rc; \ + \ + VFS_PROLOGUE(MP); \ + _rc = (*(MP)->mnt_op->vfs_checkexp)(MP, NAM, EXFLG, CRED, NUMSEC,\ + SEC); \ + VFS_EPILOGUE(MP); \ + _rc; }) + +#define VFS_EXTATTRCTL(MP, C, FN, NS, N) ({ \ + int _rc; \ + \ + VFS_PROLOGUE(MP); \ + _rc = (*(MP)->mnt_op->vfs_extattrctl)(MP, C, FN, NS, N); \ + VFS_EPILOGUE(MP); \ + _rc; }) + +#define VFS_SYSCTL(MP, OP, REQ) ({ \ + int _rc; \ + \ + VFS_PROLOGUE(MP); \ + _rc = (*(MP)->mnt_op->vfs_sysctl)(MP, OP, REQ); \ + VFS_EPILOGUE(MP); \ + _rc; }) + +#define VFS_SUSP_CLEAN(MP) do { \ + if (*(MP)->mnt_op->vfs_susp_clean != NULL) { \ + VFS_PROLOGUE(MP); \ + (*(MP)->mnt_op->vfs_susp_clean)(MP); \ + VFS_EPILOGUE(MP); \ + } \ +} while (0) + +#define VFS_RECLAIM_LOWERVP(MP, VP) do { \ + if (*(MP)->mnt_op->vfs_reclaim_lowervp != NULL) { \ + VFS_PROLOGUE(MP); \ + (*(MP)->mnt_op->vfs_reclaim_lowervp)((MP), (VP)); \ + VFS_EPILOGUE(MP); \ + } \ +} while (0) + #define VFS_UNLINK_LOWERVP(MP, VP) do { \ if (*(MP)->mnt_op->vfs_unlink_lowervp != NULL) { \ + VFS_PROLOGUE(MP); \ (*(MP)->mnt_op->vfs_unlink_lowervp)((MP), (VP)); \ + VFS_EPILOGUE(MP); \ } \ } while (0) Modified: stable/9/sys/sys/signalvar.h ============================================================================== --- stable/9/sys/sys/signalvar.h Thu May 30 17:24:36 2013 (r251146) +++ stable/9/sys/sys/signalvar.h Thu May 30 19:14:34 2013 (r251147) @@ -328,6 +328,8 @@ extern struct mtx sigio_lock; #define SIGPROCMASK_PS_LOCKED 0x0004 int cursig(struct thread *td, int stop_allowed); +int sigdeferstop(void); +void sigallowstop(void); void execsigs(struct proc *p); void gsignal(int pgid, int sig, ksiginfo_t *ksi); void killproc(struct proc *p, char *why); Modified: stable/9/sys/tools/vnode_if.awk ============================================================================== --- stable/9/sys/tools/vnode_if.awk Thu May 30 17:24:36 2013 (r251146) +++ stable/9/sys/tools/vnode_if.awk Thu May 30 19:14:34 2013 (r251147) @@ -172,6 +172,7 @@ if (cfile) { "#include <sys/kernel.h>\n" \ "#include <sys/mount.h>\n" \ "#include <sys/sdt.h>\n" \ + "#include <sys/signalvar.h>\n" \ "#include <sys/systm.h>\n" \ "#include <sys/vnode.h>\n" \ "\n" \ @@ -378,10 +379,12 @@ while ((getline < srcfile) > 0) { for (i = 0; i < numargs; ++i) add_debug_code(name, args[i], "Entry", "\t"); add_pre(name); + printc("\tVFS_PROLOGUE(a->a_" args[0]"->v_mount);") printc("\tif (vop->"name" != NULL)") printc("\t\trc = vop->"name"(a);") printc("\telse") printc("\t\trc = vop->vop_bypass(&a->a_gen);") + printc("\tVFS_EPILOGUE(a->a_" args[0]"->v_mount);") printc(ctrstr); printc("\tSDT_PROBE(vfs, vop, " name ", return, a->a_" args[0] ", a, rc, 0, 0);\n"); printc("\tif (rc == 0) {");
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201305301914.r4UJEZuU017310>