From owner-svn-src-stable@FreeBSD.ORG Tue May 5 14:53:58 2009
Delivered-To: svn-src-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id D1B60106566B;
	Tue, 5 May 2009 14:53:58 +0000 (UTC)
	(envelope-from dchagin@FreeBSD.org)
Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:4f8:fff6::2c])
	by mx1.freebsd.org (Postfix) with ESMTP id BBEF88FC21;
	Tue, 5 May 2009 14:53:58 +0000 (UTC)
	(envelope-from dchagin@FreeBSD.org)
Received: from svn.freebsd.org (localhost [127.0.0.1])
	by svn.freebsd.org (8.14.3/8.14.3) with ESMTP id n45Erw07084373;
	Tue, 5 May 2009 14:53:58 GMT (envelope-from dchagin@svn.freebsd.org)
Received: (from dchagin@localhost)
	by svn.freebsd.org (8.14.3/8.14.3/Submit) id n45ErwaK084362;
	Tue, 5 May 2009 14:53:58 GMT (envelope-from dchagin@svn.freebsd.org)
Message-Id: <200905051453.n45ErwaK084362@svn.freebsd.org>
From: Dmitry Chagin
Date: Tue, 5 May 2009 14:53:58 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org,
	svn-src-stable@freebsd.org, svn-src-stable-7@freebsd.org
X-SVN-Group: stable-7
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Subject: svn commit: r191820 - in stable/7/sys: . amd64/linux32
	compat/linux contrib/pf dev/ath/ath_hal dev/cxgb i386/linux
X-BeenThere: svn-src-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: SVN commit messages for all the -stable branches of the src tree
X-List-Received-Date: Tue, 05 May 2009 14:53:59 -0000

Author: dchagin
Date: Tue May 5 14:53:58 2009
New Revision: 191820
URL: http://svn.freebsd.org/changeset/base/191820

Log:
  Merge from HEAD to stable/7:

  r178976 (rdivacky):
  Implement robust futexes. Most of the code is modelled after
  what Linux does. This is because robust futexes are mostly
  userspace thing which we cannot alter. Two syscalls maintain
  pointer to userspace list and when process exits a routine
  walks this list waking up processes sleeping on futexes from
  that list.

  r183871:
  Make robust futexes work on linux32/amd64. Use PTRIN to read
  user-mode pointers. Change types used in the structures
  definitions to properly-sized architecture-specific types.

  r185002:
  In the robust futexes list head, futex_offset shall be signed,
  and glibc actually supplies negative offsets. Change l_ulong
  to l_long.

  Approved by:	kib (mentor)

Modified:
  stable/7/sys/   (props changed)
  stable/7/sys/amd64/linux32/linux.h
  stable/7/sys/amd64/linux32/linux32_dummy.c
  stable/7/sys/amd64/linux32/syscalls.master
  stable/7/sys/compat/linux/linux_emul.c
  stable/7/sys/compat/linux/linux_emul.h
  stable/7/sys/compat/linux/linux_futex.c
  stable/7/sys/compat/linux/linux_futex.h
  stable/7/sys/compat/linux/linux_misc.c
  stable/7/sys/contrib/pf/   (props changed)
  stable/7/sys/dev/ath/ath_hal/   (props changed)
  stable/7/sys/dev/cxgb/   (props changed)
  stable/7/sys/i386/linux/linux.h
  stable/7/sys/i386/linux/linux_dummy.c
  stable/7/sys/i386/linux/syscalls.master

Modified: stable/7/sys/amd64/linux32/linux.h
==============================================================================
--- stable/7/sys/amd64/linux32/linux.h	Tue May 5 14:46:18 2009	(r191819)
+++ stable/7/sys/amd64/linux32/linux.h	Tue May 5 14:53:58 2009	(r191820)
@@ -886,4 +886,15 @@ typedef int l_mqd_t;
 	(LINUX_CLONE_VM | LINUX_CLONE_FS | LINUX_CLONE_FILES |	\
 	LINUX_CLONE_SIGHAND | LINUX_CLONE_THREAD)
 
+/* robust futexes */
+struct linux_robust_list {
+	l_uintptr_t			next;
+};
+
+struct linux_robust_list_head {
+	struct linux_robust_list	list;
+	l_long				futex_offset;
+	l_uintptr_t			pending_list;
+};
+
 #endif /* !_AMD64_LINUX_H_ */

Modified: stable/7/sys/amd64/linux32/linux32_dummy.c
==============================================================================
--- stable/7/sys/amd64/linux32/linux32_dummy.c	Tue May 5 14:46:18 2009	(r191819)
+++ stable/7/sys/amd64/linux32/linux32_dummy.c	Tue May 5 14:53:58 2009	(r191820)
@@ -111,8 +111,6 @@ DUMMY(faccessat);
 DUMMY(pselect6);
 DUMMY(ppoll);
 DUMMY(unshare);
-DUMMY(set_robust_list);
-DUMMY(get_robust_list);
 DUMMY(splice);
 DUMMY(sync_file_range);
 DUMMY(tee);

Modified: stable/7/sys/amd64/linux32/syscalls.master
==============================================================================
--- stable/7/sys/amd64/linux32/syscalls.master	Tue May 5 14:46:18 2009	(r191819)
+++ stable/7/sys/amd64/linux32/syscalls.master	Tue May 5 14:53:58 2009	(r191820)
@@ -482,8 +482,10 @@
 308	AUE_NULL	STD	{ int linux_pselect6(void); }
 309	AUE_NULL	STD	{ int linux_ppoll(void); }
 310	AUE_NULL	STD	{ int linux_unshare(void); }
-311	AUE_NULL	STD	{ int linux_set_robust_list(void); }
-312	AUE_NULL	STD	{ int linux_get_robust_list(void); }
+311	AUE_NULL	STD	{ int linux_set_robust_list(struct linux_robust_list_head *head, \
+					l_size_t len); }
+312	AUE_NULL	STD	{ int linux_get_robust_list(l_int pid, struct linux_robust_list_head *head, \
+					l_size_t *len); }
 313	AUE_NULL	STD	{ int linux_splice(void); }
 314	AUE_NULL	STD	{ int linux_sync_file_range(void); }
 315	AUE_NULL	STD	{ int linux_tee(void); }

Modified: stable/7/sys/compat/linux/linux_emul.c
==============================================================================
--- stable/7/sys/compat/linux/linux_emul.c	Tue May 5 14:46:18 2009	(r191819)
+++ stable/7/sys/compat/linux/linux_emul.c	Tue May 5 14:53:58 2009	(r191820)
@@ -44,9 +44,6 @@ __FBSDID("$FreeBSD$");
 #include
 #include
 
-#include
-#include
-
 #ifdef COMPAT_LINUX32
 #include
 #include
@@ -55,6 +52,9 @@ __FBSDID("$FreeBSD$");
 #include
 #include
 #endif
 
+#include
+#include
+
 struct sx emul_shared_lock;
 struct mtx emul_lock;
@@ -86,6 +86,7 @@ linux_proc_init(struct thread *td, pid_t
 		em = malloc(sizeof *em, M_LINUX, M_WAITOK | M_ZERO);
 		em->pid = child;
 		em->pdeath_signal = 0;
+		em->robust_futexes = NULL;
 		if (flags & LINUX_CLONE_THREAD) {
 			/* handled later in the code */
 		} else {
@@ -161,6 +162,8 @@ linux_proc_exit(void *arg __unused, stru
 	if (__predict_true(p->p_sysent != &elf_linux_sysvec))
 		return;
 
+	release_futexes(p);
+
 	/* find the emuldata */
 	em = em_find(p, EMUL_DOLOCK);
 

Modified: stable/7/sys/compat/linux/linux_emul.h
==============================================================================
--- stable/7/sys/compat/linux/linux_emul.h	Tue May 5 14:46:18 2009	(r191819)
+++ stable/7/sys/compat/linux/linux_emul.h	Tue May 5 14:53:58 2009	(r191820)
@@ -31,6 +31,8 @@
 #ifndef _LINUX_EMUL_H_
 #define _LINUX_EMUL_H_
 
+#include
+
 struct linux_emuldata_shared {
 	int refs;
 	pid_t group_pid;
@@ -52,6 +54,8 @@ struct linux_emuldata {
 
 	int pdeath_signal;		/* parent death signal */
 
+	struct linux_robust_list_head *robust_futexes;
+
 	LIST_ENTRY(linux_emuldata) threads;	/* list of linux threads */
 };
 

Modified: stable/7/sys/compat/linux/linux_futex.c
==============================================================================
--- stable/7/sys/compat/linux/linux_futex.c	Tue May 5 14:46:18 2009	(r191819)
+++ stable/7/sys/compat/linux/linux_futex.c	Tue May 5 14:53:58 2009	(r191820)
@@ -45,8 +45,11 @@ __KERNEL_RCSID(1, "$NetBSD: linux_futex.
 #include
 #include
 #include
+#include
 #include
 #include
+#include
+#include
 #include
 #include
 
@@ -57,6 +60,7 @@ __KERNEL_RCSID(1, "$NetBSD: linux_futex.
 #include
 #include
 #endif
+#include
 #include
 
 struct futex;
@@ -533,3 +537,160 @@ futex_atomic_op(struct thread *td, int e
 		return (-ENOSYS);
 	}
 }
+
+int
+linux_set_robust_list(struct thread *td, struct linux_set_robust_list_args *args)
+{
+	struct linux_emuldata *em;
+
+#ifdef DEBUG
+	if (ldebug(set_robust_list))
+		printf(ARGS(set_robust_list, ""));
+#endif
+	if (args->len != sizeof(struct linux_robust_list_head))
+		return (EINVAL);
+
+	em = em_find(td->td_proc, EMUL_DOLOCK);
+	em->robust_futexes = args->head;
+	EMUL_UNLOCK(&emul_lock);
+
+	return (0);
+}
+
+int
+linux_get_robust_list(struct thread *td, struct linux_get_robust_list_args *args)
+{
+	struct linux_emuldata *em;
+	struct linux_robust_list_head *head;
+	l_size_t len = sizeof(struct linux_robust_list_head);
+	int error = 0;
+
+#ifdef DEBUG
+	if (ldebug(get_robust_list))
+		printf(ARGS(get_robust_list, ""));
+#endif
+
+	if (!args->pid) {
+		em = em_find(td->td_proc, EMUL_DONTLOCK);
+		head = em->robust_futexes;
+	} else {
+		struct proc *p;
+
+		p = pfind(args->pid);
+		if (p == NULL)
+			return (ESRCH);
+
+		em = em_find(p, EMUL_DONTLOCK);
+		/* XXX: ptrace? */
+		if (priv_check(td, PRIV_CRED_SETUID) ||
+		    priv_check(td, PRIV_CRED_SETEUID) ||
+		    p_candebug(td, p))
+			return (EPERM);
+		head = em->robust_futexes;
+
+		PROC_UNLOCK(p);
+	}
+
+	error = copyout(&len, args->len, sizeof(l_size_t));
+	if (error)
+		return (EFAULT);
+
+	error = copyout(head, args->head, sizeof(struct linux_robust_list_head));
+
+	return (error);
+}
+
+static int
+handle_futex_death(void *uaddr, pid_t pid, int pi)
+{
+	int uval, nval, mval;
+	struct futex *f;
+
+retry:
+	if (copyin(uaddr, &uval, 4))
+		return (EFAULT);
+
+	if ((uval & FUTEX_TID_MASK) == pid) {
+		mval = (uval & FUTEX_WAITERS) | FUTEX_OWNER_DIED;
+		nval = casuword32(uaddr, uval, mval);
+
+		if (nval == -1)
+			return (EFAULT);
+
+		if (nval != uval)
+			goto retry;
+
+		if (!pi && (uval & FUTEX_WAITERS)) {
+			f = futex_get(uaddr, FUTEX_UNLOCKED);
+			futex_wake(f, 1, NULL, 0);
+		}
+	}
+
+	return (0);
+}
+
+static int
+fetch_robust_entry(struct linux_robust_list **entry,
+    struct linux_robust_list **head, int *pi)
+{
+	l_ulong uentry;
+
+	if (copyin((const void *)head, &uentry, sizeof(l_ulong)))
+		return (EFAULT);
+
+	*entry = (void *)(uentry & ~1UL);
+	*pi = uentry & 1;
+
+	return (0);
+}
+
+/* This walks the list of robust futexes releasing them. */
+void
+release_futexes(struct proc *p)
+{
+	struct linux_robust_list_head *head = NULL;
+	struct linux_robust_list *entry, *next_entry, *pending;
+	unsigned int limit = 2048, pi, next_pi, pip;
+	struct linux_emuldata *em;
+	l_long futex_offset;
+	int rc;
+
+	em = em_find(p, EMUL_DONTLOCK);
+	head = em->robust_futexes;
+
+	if (head == NULL)
+		return;
+
+	if (fetch_robust_entry(&entry, PTRIN(&head->list.next), &pi))
+		return;
+
+	if (copyin(&head->futex_offset, &futex_offset, sizeof(futex_offset)))
+		return;
+
+	if (fetch_robust_entry(&pending, PTRIN(&head->pending_list), &pip))
+		return;
+
+	while (entry != &head->list) {
+		rc = fetch_robust_entry(&next_entry, PTRIN(&entry->next), &next_pi);
+
+		if (entry != pending)
+			if (handle_futex_death((char *)entry + futex_offset,
+			    p->p_pid, pi))
+				return;
+
+		if (rc)
+			return;
+
+		entry = next_entry;
+		pi = next_pi;
+
+		if (!--limit)
+			break;
+
+		sched_relinquish(curthread);
+	}
+
+	if (pending)
+		handle_futex_death((char *) pending + futex_offset,
+		    p->p_pid, pip);
+}

Modified: stable/7/sys/compat/linux/linux_futex.h
==============================================================================
--- stable/7/sys/compat/linux/linux_futex.h	Tue May 5 14:46:18 2009	(r191819)
+++ stable/7/sys/compat/linux/linux_futex.h	Tue May 5 14:53:58 2009	(r191820)
@@ -63,4 +63,10 @@
 #define FUTEX_OP_CMP_GT		4	/* if (oldval > CMPARG) wake */
 #define FUTEX_OP_CMP_GE		5	/* if (oldval >= CMPARG) wake */
 
+#define FUTEX_WAITERS		0x80000000
+#define FUTEX_OWNER_DIED	0x40000000
+#define FUTEX_TID_MASK		0x3fffffff
+
+void release_futexes(struct proc *);
+
 #endif /* !_LINUX_FUTEX_H */

Modified: stable/7/sys/compat/linux/linux_misc.c
==============================================================================
--- stable/7/sys/compat/linux/linux_misc.c	Tue May 5 14:46:18 2009	(r191819)
+++ stable/7/sys/compat/linux/linux_misc.c	Tue May 5 14:53:58 2009	(r191820)
@@ -75,10 +75,6 @@ __FBSDID("$FreeBSD$");
 #include
 #include
 
-#include
-#include
-#include
-
 #ifdef COMPAT_LINUX32
 #include
 #include
@@ -90,6 +86,9 @@ __FBSDID("$FreeBSD$");
 #include
 #include
 #include
+#include
+#include
+#include
 
 #define BSD_TO_LINUX_SIGNAL(sig)	\
 	(((sig) <= LINUX_SIGTBLSZ) ? bsd_to_linux_signal[_SIG_IDX(sig)] : sig)

Modified: stable/7/sys/i386/linux/linux.h
==============================================================================
--- stable/7/sys/i386/linux/linux.h	Tue May 5 14:46:18 2009	(r191819)
+++ stable/7/sys/i386/linux/linux.h	Tue May 5 14:53:58 2009	(r191820)
@@ -851,4 +851,15 @@ typedef int l_mqd_t;
 	(LINUX_CLONE_VM | LINUX_CLONE_FS | LINUX_CLONE_FILES |	\
 	LINUX_CLONE_SIGHAND | LINUX_CLONE_THREAD)
 
+/* robust futexes */
+struct linux_robust_list {
+	struct linux_robust_list	*next;
+};
+
+struct linux_robust_list_head {
+	struct linux_robust_list	list;
+	l_long				futex_offset;
+	struct linux_robust_list	*pending_list;
+};
+
 #endif /* !_I386_LINUX_H_ */

Modified: stable/7/sys/i386/linux/linux_dummy.c
==============================================================================
--- stable/7/sys/i386/linux/linux_dummy.c	Tue May 5 14:46:18 2009	(r191819)
+++ stable/7/sys/i386/linux/linux_dummy.c	Tue May 5 14:53:58 2009	(r191820)
@@ -102,8 +102,6 @@ DUMMY(faccessat);
 DUMMY(pselect6);
 DUMMY(ppoll);
 DUMMY(unshare);
-DUMMY(set_robust_list);
-DUMMY(get_robust_list);
 DUMMY(splice);
 DUMMY(sync_file_range);
 DUMMY(tee);

Modified: stable/7/sys/i386/linux/syscalls.master
==============================================================================
--- stable/7/sys/i386/linux/syscalls.master	Tue May 5 14:46:18 2009	(r191819)
+++ stable/7/sys/i386/linux/syscalls.master	Tue May 5 14:53:58 2009	(r191820)
@@ -492,8 +492,10 @@
 308	AUE_NULL	STD	{ int linux_pselect6(void); }
 309	AUE_NULL	STD	{ int linux_ppoll(void); }
 310	AUE_NULL	STD	{ int linux_unshare(void); }
-311	AUE_NULL	STD	{ int linux_set_robust_list(void); }
-312	AUE_NULL	STD	{ int linux_get_robust_list(void); }
+311	AUE_NULL	STD	{ int linux_set_robust_list(struct linux_robust_list_head *head, \
+					l_size_t len); }
+312	AUE_NULL	STD	{ int linux_get_robust_list(l_int pid, struct linux_robust_list_head **head, \
+					l_size_t *len); }
 313	AUE_NULL	STD	{ int linux_splice(void); }
 314	AUE_NULL	STD	{ int linux_sync_file_range(void); }
 315	AUE_NULL	STD	{ int linux_tee(void); }