From: Dmitry Chagin
Date: Sat, 9 Jan 2016 16:48:50 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-10@freebsd.org
Subject: svn commit: r293549 - in stable/10/sys: amd64/linux amd64/linux32 compat/linux i386/linux sys
Message-Id: <201601091648.u09GmoQq044900@repo.freebsd.org>

Author: dchagin
Date: Sat Jan 9 16:48:50 2016
New Revision: 293549
URL: https://svnweb.freebsd.org/changeset/base/293549

Log:
  MFC r283444:

  Implement eventfd system call.

Modified:
  stable/10/sys/amd64/linux/linux_dummy.c
  stable/10/sys/amd64/linux/syscalls.master
  stable/10/sys/amd64/linux32/linux32_dummy.c
  stable/10/sys/amd64/linux32/syscalls.master
  stable/10/sys/compat/linux/linux_event.c
  stable/10/sys/compat/linux/linux_event.h
  stable/10/sys/i386/linux/linux_dummy.c
  stable/10/sys/i386/linux/syscalls.master
  stable/10/sys/sys/file.h
Directory Properties:
  stable/10/ (props changed)

Modified: stable/10/sys/amd64/linux/linux_dummy.c
==============================================================================
--- stable/10/sys/amd64/linux/linux_dummy.c	Sat Jan 9 16:47:36 2016	(r293548)
+++ stable/10/sys/amd64/linux/linux_dummy.c	Sat Jan 9 16:48:50 2016	(r293549)
@@ -103,12 +103,10 @@ DUMMY(utimensat);
 DUMMY(epoll_pwait);
 DUMMY(signalfd);
 DUMMY(timerfd);
-DUMMY(eventfd);
 DUMMY(fallocate);
 DUMMY(timerfd_settime);
 DUMMY(timerfd_gettime);
 DUMMY(signalfd4);
-DUMMY(eventfd2);
 DUMMY(inotify_init1);
 DUMMY(preadv);
 DUMMY(pwritev);

Modified: stable/10/sys/amd64/linux/syscalls.master
==============================================================================
--- stable/10/sys/amd64/linux/syscalls.master	Sat Jan 9 16:47:36 2016	(r293548)
+++ stable/10/sys/amd64/linux/syscalls.master	Sat Jan 9 16:48:50 2016	(r293549)
@@ -472,14 +472,14 @@
 					l_int maxevents, l_int timeout, l_sigset_t *mask); }
 282	AUE_NULL	STD	{ int linux_signalfd(void); }
 283	AUE_NULL	STD	{ int linux_timerfd(void); }
-284	AUE_NULL	STD	{ int linux_eventfd(void); }
+284	AUE_NULL	STD	{ int linux_eventfd(l_uint initval); }
 285	AUE_NULL	STD	{ int linux_fallocate(void); }
 286	AUE_NULL	STD	{ int linux_timerfd_settime(void); }
 287	AUE_NULL	STD	{ int linux_timerfd_gettime(void); }
 288	AUE_ACCEPT	STD	{ int linux_accept4(l_int s, l_uintptr_t addr, \
 					l_uintptr_t namelen, int flags); }
 289	AUE_NULL	STD	{ int linux_signalfd4(void); }
-290	AUE_NULL	STD	{ int linux_eventfd2(void); }
+290	AUE_NULL	STD	{ int linux_eventfd2(l_uint initval, l_int flags); }
 291	AUE_NULL	STD	{ int linux_epoll_create1(l_int flags); }
 292	AUE_NULL	STD	{ int linux_dup3(l_int oldfd, \
 					l_int newfd, l_int flags); }

Modified: stable/10/sys/amd64/linux32/linux32_dummy.c
==============================================================================
--- stable/10/sys/amd64/linux32/linux32_dummy.c	Sat Jan 9 16:47:36 2016	(r293548)
+++ stable/10/sys/amd64/linux32/linux32_dummy.c	Sat Jan 9 16:48:50 2016	(r293549)
@@ -109,7 +109,6 @@ DUMMY(epoll_pwait);
 DUMMY(utimensat);
 DUMMY(signalfd);
 DUMMY(timerfd_create);
-DUMMY(eventfd);
 /* linux 2.6.23: */
 DUMMY(fallocate);
 /* linux 2.6.25: */
@@ -117,7 +116,6 @@ DUMMY(timerfd_settime);
 DUMMY(timerfd_gettime);
 /* linux 2.6.27: */
 DUMMY(signalfd4);
-DUMMY(eventfd2);
 DUMMY(inotify_init1);
 /* linux 2.6.30: */
 DUMMY(preadv);

Modified: stable/10/sys/amd64/linux32/syscalls.master
==============================================================================
--- stable/10/sys/amd64/linux32/syscalls.master	Sat Jan 9 16:47:36 2016	(r293548)
+++ stable/10/sys/amd64/linux32/syscalls.master	Sat Jan 9 16:48:50 2016	(r293549)
@@ -535,7 +535,7 @@
 320	AUE_NULL	STD	{ int linux_utimensat(void); }
 321	AUE_NULL	STD	{ int linux_signalfd(void); }
 322	AUE_NULL	STD	{ int linux_timerfd_create(void); }
-323	AUE_NULL	STD	{ int linux_eventfd(void); }
+323	AUE_NULL	STD	{ int linux_eventfd(l_uint initval); }
 ; linux 2.6.23:
 324	AUE_NULL	STD	{ int linux_fallocate(void); }
 ; linux 2.6.25:
@@ -543,7 +543,7 @@
 326	AUE_NULL	STD	{ int linux_timerfd_gettime(void); }
 ; linux 2.6.27:
 327	AUE_NULL	STD	{ int linux_signalfd4(void); }
-328	AUE_NULL	STD	{ int linux_eventfd2(void); }
+328	AUE_NULL	STD	{ int linux_eventfd2(l_uint initval, l_int flags); }
 329	AUE_NULL	STD	{ int linux_epoll_create1(l_int flags); }
 330	AUE_NULL	STD	{ int linux_dup3(l_int oldfd, \
 					l_int newfd, l_int flags); }

Modified: stable/10/sys/compat/linux/linux_event.c
==============================================================================
--- stable/10/sys/compat/linux/linux_event.c	Sat Jan 9 16:47:36 2016	(r293548)
+++ stable/10/sys/compat/linux/linux_event.c	Sat Jan 9 16:48:50 2016	(r293549)
@@ -43,7 +43,9 @@ __FBSDID("$FreeBSD$");
 #include
 #include
 #include
+#include
 #include
+#include
 #include
 #include
 #include
@@ -114,6 +116,57 @@ struct epoll_copyout_args {
 	int			error;
 };
 
+/* eventfd */
+typedef uint64_t	eventfd_t;
+
+static fo_rdwr_t	eventfd_read;
+static fo_rdwr_t	eventfd_write;
+static fo_truncate_t	eventfd_truncate;
+static fo_ioctl_t	eventfd_ioctl;
+static fo_poll_t	eventfd_poll;
+static fo_kqfilter_t	eventfd_kqfilter;
+static fo_stat_t	eventfd_stat;
+static fo_close_t	eventfd_close;
+
+static struct fileops eventfdops = {
+	.fo_read = eventfd_read,
+	.fo_write = eventfd_write,
+	.fo_truncate = eventfd_truncate,
+	.fo_ioctl = eventfd_ioctl,
+	.fo_poll = eventfd_poll,
+	.fo_kqfilter = eventfd_kqfilter,
+	.fo_stat = eventfd_stat,
+	.fo_close = eventfd_close,
+	.fo_chmod = invfo_chmod,
+	.fo_chown = invfo_chown,
+	.fo_sendfile = invfo_sendfile,
+	.fo_flags = DFLAG_PASSABLE
+};
+
+static void	filt_eventfddetach(struct knote *kn);
+static int	filt_eventfdread(struct knote *kn, long hint);
+static int	filt_eventfdwrite(struct knote *kn, long hint);
+
+static struct filterops eventfd_rfiltops = {
+	.f_isfd = 1,
+	.f_detach = filt_eventfddetach,
+	.f_event = filt_eventfdread
+};
+static struct filterops eventfd_wfiltops = {
+	.f_isfd = 1,
+	.f_detach = filt_eventfddetach,
+	.f_event = filt_eventfdwrite
+};
+
+struct eventfd {
+	eventfd_t	efd_count;
+	uint32_t	efd_flags;
+	struct selinfo	efd_sel;
+	struct mtx	efd_lock;
+};
+
+static int	eventfd_create(struct thread *td, uint32_t initval, int flags);
+
 static void
 epoll_fd_install(struct thread *td, int fd, epoll_udata_t udata)
@@ -498,3 +551,280 @@ epoll_delete_all_events(struct thread *t
 	/* report any errors we got */
 	return (error1 == 0 ? error2 : error1);
 }
+
+static int
+eventfd_create(struct thread *td, uint32_t initval, int flags)
+{
+	struct filedesc *fdp;
+	struct eventfd *efd;
+	struct file *fp;
+	int fflags, fd, error;
+
+	fflags = 0;
+	if ((flags & LINUX_O_CLOEXEC) != 0)
+		fflags |= O_CLOEXEC;
+
+	fdp = td->td_proc->p_fd;
+	error = falloc(td, &fp, &fd, fflags);
+	if (error)
+		return (error);
+
+	efd = malloc(sizeof(*efd), M_EPOLL, M_WAITOK | M_ZERO);
+	efd->efd_flags = flags;
+	efd->efd_count = initval;
+	mtx_init(&efd->efd_lock, "eventfd", NULL, MTX_DEF);
+
+	knlist_init_mtx(&efd->efd_sel.si_note, &efd->efd_lock);
+
+	fflags = FREAD | FWRITE;
+	if ((flags & LINUX_O_NONBLOCK) != 0)
+		fflags |= FNONBLOCK;
+
+	finit(fp, fflags, DTYPE_LINUXEFD, efd, &eventfdops);
+	fdrop(fp, td);
+
+	td->td_retval[0] = fd;
+	return (error);
+}
+
+int
+linux_eventfd(struct thread *td, struct linux_eventfd_args *args)
+{
+
+	return (eventfd_create(td, args->initval, 0));
+}
+
+int
+linux_eventfd2(struct thread *td, struct linux_eventfd2_args *args)
+{
+
+	if ((args->flags & ~(LINUX_O_CLOEXEC|LINUX_O_NONBLOCK|LINUX_EFD_SEMAPHORE)) != 0)
+		return (EINVAL);
+
+	return (eventfd_create(td, args->initval, args->flags));
+}
+
+static int
+eventfd_close(struct file *fp, struct thread *td)
+{
+	struct eventfd *efd;
+
+	efd = fp->f_data;
+	if (fp->f_type != DTYPE_LINUXEFD || efd == NULL)
+		return (EBADF);
+
+	seldrain(&efd->efd_sel);
+	knlist_destroy(&efd->efd_sel.si_note);
+
+	fp->f_ops = &badfileops;
+	mtx_destroy(&efd->efd_lock);
+	free(efd, M_EPOLL);
+
+	return (0);
+}
+
+static int
+eventfd_read(struct file *fp, struct uio *uio, struct ucred *active_cred,
+    int flags, struct thread *td)
+{
+	struct eventfd *efd;
+	eventfd_t count;
+	int error;
+
+	efd = fp->f_data;
+	if (fp->f_type != DTYPE_LINUXEFD || efd == NULL)
+		return (EBADF);
+
+	if (uio->uio_resid < sizeof(eventfd_t))
+		return (EINVAL);
+
+	error = 0;
+	mtx_lock(&efd->efd_lock);
+retry:
+	if (efd->efd_count == 0) {
+		if ((efd->efd_flags & LINUX_O_NONBLOCK) != 0) {
+			mtx_unlock(&efd->efd_lock);
+			return (EAGAIN);
+		}
+		error = mtx_sleep(&efd->efd_count, &efd->efd_lock, PCATCH, "lefdrd", 0);
+		if (error == 0)
+			goto retry;
+	}
+	if (error == 0) {
+		if ((efd->efd_flags & LINUX_EFD_SEMAPHORE) != 0) {
+			count = 1;
+			--efd->efd_count;
+		} else {
+			count = efd->efd_count;
+			efd->efd_count = 0;
+		}
+		KNOTE_LOCKED(&efd->efd_sel.si_note, 0);
+		selwakeup(&efd->efd_sel);
+		wakeup(&efd->efd_count);
+		mtx_unlock(&efd->efd_lock);
+		error = uiomove(&count, sizeof(eventfd_t), uio);
+	} else
+		mtx_unlock(&efd->efd_lock);
+
+	return (error);
+}
+
+static int
+eventfd_write(struct file *fp, struct uio *uio, struct ucred *active_cred,
+    int flags, struct thread *td)
+{
+	struct eventfd *efd;
+	eventfd_t count;
+	int error;
+
+	efd = fp->f_data;
+	if (fp->f_type != DTYPE_LINUXEFD || efd == NULL)
+		return (EBADF);
+
+	if (uio->uio_resid < sizeof(eventfd_t))
+		return (EINVAL);
+
+	error = uiomove(&count, sizeof(eventfd_t), uio);
+	if (error)
+		return (error);
+	if (count == UINT64_MAX)
+		return (EINVAL);
+
+	mtx_lock(&efd->efd_lock);
+retry:
+	if (UINT64_MAX - efd->efd_count <= count) {
+		if ((efd->efd_flags & LINUX_O_NONBLOCK) != 0) {
+			mtx_unlock(&efd->efd_lock);
+			return (EAGAIN);
+		}
+		error = mtx_sleep(&efd->efd_count, &efd->efd_lock,
+		    PCATCH, "lefdwr", 0);
+		if (error == 0)
+			goto retry;
+	}
+	if (error == 0) {
+		efd->efd_count += count;
+		KNOTE_LOCKED(&efd->efd_sel.si_note, 0);
+		selwakeup(&efd->efd_sel);
+		wakeup(&efd->efd_count);
+	}
+	mtx_unlock(&efd->efd_lock);
+
+	return (error);
+}
+
+static int
+eventfd_poll(struct file *fp, int events, struct ucred *active_cred,
+    struct thread *td)
+{
+	struct eventfd *efd;
+	int revents = 0;
+
+	efd = fp->f_data;
+	if (fp->f_type != DTYPE_LINUXEFD || efd == NULL)
+		return (POLLERR);
+
+	mtx_lock(&efd->efd_lock);
+	if ((events & (POLLIN|POLLRDNORM)) && efd->efd_count > 0)
+		revents |= events & (POLLIN|POLLRDNORM);
+	if ((events & (POLLOUT|POLLWRNORM)) && UINT64_MAX - 1 > efd->efd_count)
+		revents |= events & (POLLOUT|POLLWRNORM);
+	if (revents == 0)
+		selrecord(td, &efd->efd_sel);
+	mtx_unlock(&efd->efd_lock);
+
+	return (revents);
+}
+
+/*ARGSUSED*/
+static int
+eventfd_kqfilter(struct file *fp, struct knote *kn)
+{
+	struct eventfd *efd;
+
+	efd = fp->f_data;
+	if (fp->f_type != DTYPE_LINUXEFD || efd == NULL)
+		return (EINVAL);
+
+	mtx_lock(&efd->efd_lock);
+	switch (kn->kn_filter) {
+	case EVFILT_READ:
+		kn->kn_fop = &eventfd_rfiltops;
+		break;
+	case EVFILT_WRITE:
+		kn->kn_fop = &eventfd_wfiltops;
+		break;
+	default:
+		mtx_unlock(&efd->efd_lock);
+		return (EINVAL);
+	}
+
+	kn->kn_hook = efd;
+	knlist_add(&efd->efd_sel.si_note, kn, 1);
+	mtx_unlock(&efd->efd_lock);
+
+	return (0);
+}
+
+static void
+filt_eventfddetach(struct knote *kn)
+{
+	struct eventfd *efd = kn->kn_hook;
+
+	mtx_lock(&efd->efd_lock);
+	knlist_remove(&efd->efd_sel.si_note, kn, 1);
+	mtx_unlock(&efd->efd_lock);
+}
+
+/*ARGSUSED*/
+static int
+filt_eventfdread(struct knote *kn, long hint)
+{
+	struct eventfd *efd = kn->kn_hook;
+	int ret;
+
+	mtx_assert(&efd->efd_lock, MA_OWNED);
+	ret = (efd->efd_count > 0);
+
+	return (ret);
+}
+
+/*ARGSUSED*/
+static int
+filt_eventfdwrite(struct knote *kn, long hint)
+{
+	struct eventfd *efd = kn->kn_hook;
+	int ret;
+
+	mtx_assert(&efd->efd_lock, MA_OWNED);
+	ret = (UINT64_MAX - 1 > efd->efd_count);
+
+	return (ret);
+}
+
+/*ARGSUSED*/
+static int
+eventfd_truncate(struct file *fp, off_t length, struct ucred *active_cred,
+    struct thread *td)
+{
+
+	return (ENXIO);
+}
+
+/*ARGSUSED*/
+static int
+eventfd_ioctl(struct file *fp, u_long cmd, void *data,
+    struct ucred *active_cred, struct thread *td)
+{
+
+	return (ENXIO);
+}
+
+/*ARGSUSED*/
+static int
+eventfd_stat(struct file *fp, struct stat *st, struct ucred *active_cred,
+    struct thread *td)
+{
+
+	return (ENXIO);
+}

Modified: stable/10/sys/compat/linux/linux_event.h
==============================================================================
--- stable/10/sys/compat/linux/linux_event.h	Sat Jan 9 16:47:36 2016	(r293548)
+++ stable/10/sys/compat/linux/linux_event.h	Sat Jan 9 16:48:50 2016	(r293549)
@@ -55,4 +55,6 @@
 #define	LINUX_EPOLL_CTL_DEL	2
 #define	LINUX_EPOLL_CTL_MOD	3
 
+#define	LINUX_EFD_SEMAPHORE	(1 << 0)
+
 #endif	/* !_LINUX_EVENT_H_ */

Modified: stable/10/sys/i386/linux/linux_dummy.c
==============================================================================
--- stable/10/sys/i386/linux/linux_dummy.c	Sat Jan 9 16:47:36 2016	(r293548)
+++ stable/10/sys/i386/linux/linux_dummy.c	Sat Jan 9 16:48:50 2016	(r293549)
@@ -105,7 +105,6 @@ DUMMY(epoll_pwait);
 DUMMY(utimensat);
 DUMMY(signalfd);
 DUMMY(timerfd_create);
-DUMMY(eventfd);
 /* linux 2.6.23: */
 DUMMY(fallocate);
 /* linux 2.6.25: */
@@ -113,7 +112,6 @@ DUMMY(timerfd_settime);
 DUMMY(timerfd_gettime);
 /* linux 2.6.27: */
 DUMMY(signalfd4);
-DUMMY(eventfd2);
 DUMMY(inotify_init1);
 /* linux 2.6.30: */
 DUMMY(preadv);

Modified: stable/10/sys/i386/linux/syscalls.master
==============================================================================
--- stable/10/sys/i386/linux/syscalls.master	Sat Jan 9 16:47:36 2016	(r293548)
+++ stable/10/sys/i386/linux/syscalls.master	Sat Jan 9 16:48:50 2016	(r293549)
@@ -543,7 +543,7 @@
 320	AUE_NULL	STD	{ int linux_utimensat(void); }
 321	AUE_NULL	STD	{ int linux_signalfd(void); }
 322	AUE_NULL	STD	{ int linux_timerfd_create(void); }
-323	AUE_NULL	STD	{ int linux_eventfd(void); }
+323	AUE_NULL	STD	{ int linux_eventfd(l_uint initval); }
 ; linux 2.6.23:
 324	AUE_NULL	STD	{ int linux_fallocate(void); }
 ; linux 2.6.25:
@@ -551,7 +551,7 @@
 326	AUE_NULL	STD	{ int linux_timerfd_gettime(void); }
 ; linux 2.6.27:
 327	AUE_NULL	STD	{ int linux_signalfd4(void); }
-328	AUE_NULL	STD	{ int linux_eventfd2(void); }
+328	AUE_NULL	STD	{ int linux_eventfd2(l_uint initval, l_int flags); }
 329	AUE_NULL	STD	{ int linux_epoll_create1(l_int flags); }
 330	AUE_NULL	STD	{ int linux_dup3(l_int oldfd, \
 					l_int newfd, l_int flags); }

Modified: stable/10/sys/sys/file.h
==============================================================================
--- stable/10/sys/sys/file.h	Sat Jan 9 16:47:36 2016	(r293548)
+++ stable/10/sys/sys/file.h	Sat Jan 9 16:48:50 2016	(r293549)
@@ -65,6 +65,7 @@ struct socket;
 #define	DTYPE_PTS	10	/* pseudo teletype master device */
 #define	DTYPE_DEV	11	/* Device specific fd type */
 #define	DTYPE_PROCDESC	12	/* process descriptor */
+#define	DTYPE_LINUXEFD	13	/* emulation eventfd type */
 
 #ifdef _KERNEL
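
For readers following the counter logic in eventfd_read() and eventfd_write() in the linux_event.c hunk, here is a minimal user-space sketch of the non-blocking paths only. It is not part of this commit: struct efd_model, efd_model_read(), efd_model_write() and MODEL_EFD_SEMAPHORE are hypothetical names made up for illustration, and the sketch assumes a single caller, so the mutex, mtx_sleep() and the wakeup/knote notifications are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for LINUX_EFD_SEMAPHORE from linux_event.h. */
#define	MODEL_EFD_SEMAPHORE	(1 << 0)

/* Hypothetical model of the counter part of struct eventfd (no lock). */
struct efd_model {
	uint64_t	count;
	uint32_t	flags;
};

/*
 * Non-blocking read: EAGAIN while the counter is zero.  In semaphore
 * mode the reader gets 1 and the counter drops by one; otherwise the
 * reader gets the whole counter and it is reset to zero.
 */
static int
efd_model_read(struct efd_model *efd, uint64_t *out)
{

	assert(efd != NULL && out != NULL);
	if (efd->count == 0)
		return (EAGAIN);
	if ((efd->flags & MODEL_EFD_SEMAPHORE) != 0) {
		*out = 1;
		--efd->count;
	} else {
		*out = efd->count;
		efd->count = 0;
	}
	return (0);
}

/*
 * Non-blocking write: UINT64_MAX is rejected outright, and a value that
 * would push the counter to UINT64_MAX or beyond yields EAGAIN.  The
 * comparison is the same one eventfd_write() uses.
 */
static int
efd_model_write(struct efd_model *efd, uint64_t val)
{

	assert(efd != NULL);
	if (val == UINT64_MAX)
		return (EINVAL);
	if (UINT64_MAX - efd->count <= val)
		return (EAGAIN);
	efd->count += val;
	return (0);
}
```

With these checks the model counter never exceeds UINT64_MAX - 1. A caller would start from something like struct efd_model e = { .count = 0, .flags = MODEL_EFD_SEMAPHORE } and treat EAGAIN as the point where the kernel code would instead sleep for a blocking descriptor.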