From owner-svn-src-stable-12@freebsd.org Tue Aug 27 03:05:58 2019 Return-Path: Delivered-To: svn-src-stable-12@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 6BA48C68EE; Tue, 27 Aug 2019 03:05:58 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 46HYfZ29z2z4G1f; Tue, 27 Aug 2019 03:05:58 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 2DBD6184F4; Tue, 27 Aug 2019 03:05:58 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id x7R35vw8064195; Tue, 27 Aug 2019 03:05:57 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id x7R35vlN064194; Tue, 27 Aug 2019 03:05:57 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201908270305.x7R35vlN064194@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Tue, 27 Aug 2019 03:05:57 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-12@freebsd.org Subject: svn commit: r351526 - stable/12/sys/kern X-SVN-Group: stable-12 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/12/sys/kern X-SVN-Commit-Revision: 351526 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-12@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: SVN commit messages for only the 12-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 27 Aug 2019 03:05:58 -0000 Author: mav Date: Tue Aug 27 03:05:57 2019 New Revision: 351526 URL: https://svnweb.freebsd.org/changeset/base/351526 Log: MFC r351348 (by markj): Modify pipe_poll() to properly check for pending direct writes. With r349546, it is a responsibility of the writer to clear PIPE_DIRECTW after pinned data has been read. In particular, once a reader has drained this data, there is a small window where the pipe is empty but PIPE_DIRECTW is set. pipe_poll() was using the presence of PIPE_DIRECTW to determine whether to return POLLIN, so in this window it would claim that data was available to read when this was not the case. Fix this by modifying several checks for PIPE_DIRECTW to instead look at the number of residual bytes in data pinned by a direct writer. In some cases we really do want to check for PIPE_DIRECTW, since the presence of this flag indicates that any attempt to write to the pipe will block on the existing direct writer. Modified: stable/12/sys/kern/sys_pipe.c Directory Properties: stable/12/ (props changed) Modified: stable/12/sys/kern/sys_pipe.c ============================================================================== --- stable/12/sys/kern/sys_pipe.c Tue Aug 27 01:40:38 2019 (r351525) +++ stable/12/sys/kern/sys_pipe.c Tue Aug 27 03:05:57 2019 (r351526) @@ -711,11 +711,9 @@ pipe_read(struct file *fp, struct uio *uio, struct ucr /* * Direct copy, bypassing a kernel buffer. */ - } else if ((size = rpipe->pipe_map.cnt) && - (rpipe->pipe_state & PIPE_DIRECTW)) { + } else if ((size = rpipe->pipe_map.cnt) != 0) { if (size > uio->uio_resid) size = (u_int) uio->uio_resid; - PIPE_UNLOCK(rpipe); error = uiomove_fromphys(rpipe->pipe_map.ms, rpipe->pipe_map.pos, size, uio); @@ -821,32 +819,33 @@ pipe_build_write_buffer(struct pipe *wpipe, struct uio u_int size; int i; - PIPE_LOCK_ASSERT(wpipe, MA_NOTOWNED); - KASSERT(wpipe->pipe_state & PIPE_DIRECTW, - ("Clone attempt on non-direct write pipe!")); + PIPE_LOCK_ASSERT(wpipe, MA_OWNED); + KASSERT((wpipe->pipe_state & PIPE_DIRECTW) == 0, + ("%s: PIPE_DIRECTW set on %p", __func__, wpipe)); + KASSERT(wpipe->pipe_map.cnt == 0, + ("%s: pipe map for %p contains residual data", __func__, wpipe)); if (uio->uio_iov->iov_len > wpipe->pipe_buffer.size) size = wpipe->pipe_buffer.size; else size = uio->uio_iov->iov_len; - if ((i = vm_fault_quick_hold_pages(&curproc->p_vmspace->vm_map, + wpipe->pipe_state |= PIPE_DIRECTW; + PIPE_UNLOCK(wpipe); + i = vm_fault_quick_hold_pages(&curproc->p_vmspace->vm_map, (vm_offset_t)uio->uio_iov->iov_base, size, VM_PROT_READ, - wpipe->pipe_map.ms, PIPENPAGES)) < 0) + wpipe->pipe_map.ms, PIPENPAGES); + PIPE_LOCK(wpipe); + if (i < 0) { + wpipe->pipe_state &= ~PIPE_DIRECTW; return (EFAULT); + } -/* - * set up the control block - */ wpipe->pipe_map.npages = i; wpipe->pipe_map.pos = ((vm_offset_t) uio->uio_iov->iov_base) & PAGE_MASK; wpipe->pipe_map.cnt = size; -/* - * and update the uio data - */ - uio->uio_iov->iov_len -= size; uio->uio_iov->iov_base = (char *)uio->uio_iov->iov_base + size; if (uio->uio_iov->iov_len == 0) @@ -866,6 +865,8 @@ pipe_destroy_write_buffer(struct pipe *wpipe) PIPE_LOCK_ASSERT(wpipe, MA_OWNED); KASSERT((wpipe->pipe_state & PIPE_DIRECTW) != 0, ("%s: PIPE_DIRECTW not set on %p", __func__, wpipe)); + KASSERT(wpipe->pipe_map.cnt == 0, + ("%s: pipe map for %p contains residual data", __func__, wpipe)); wpipe->pipe_state &= ~PIPE_DIRECTW; vm_page_unhold_pages(wpipe->pipe_map.ms, wpipe->pipe_map.npages); @@ -891,6 +892,7 @@ pipe_clone_write_buffer(struct pipe *wpipe) size = wpipe->pipe_map.cnt; pos = wpipe->pipe_map.pos; + wpipe->pipe_map.cnt = 0; wpipe->pipe_buffer.in = size; wpipe->pipe_buffer.out = 0; @@ -948,7 +950,6 @@ retry: else goto retry; } - wpipe->pipe_map.cnt = 0; /* transfer not ready yet */ if (wpipe->pipe_buffer.cnt > 0) { if (wpipe->pipe_state & PIPE_WANTR) { wpipe->pipe_state &= ~PIPE_WANTR; @@ -965,19 +966,15 @@ retry: goto retry; } - wpipe->pipe_state |= PIPE_DIRECTW; - - PIPE_UNLOCK(wpipe); error = pipe_build_write_buffer(wpipe, uio); - PIPE_LOCK(wpipe); if (error) { - wpipe->pipe_state &= ~PIPE_DIRECTW; pipeunlock(wpipe); goto error1; } while (wpipe->pipe_map.cnt != 0) { if (wpipe->pipe_state & PIPE_EOF) { + wpipe->pipe_map.cnt = 0; pipe_destroy_write_buffer(wpipe); pipeselwakeup(wpipe); pipeunlock(wpipe); @@ -1131,7 +1128,7 @@ pipe_write(struct file *fp, struct uio *uio, struct uc * pipe buffer. We break out if a signal occurs or the * reader goes away. */ - if (wpipe->pipe_state & PIPE_DIRECTW) { + if (wpipe->pipe_map.cnt != 0) { if (wpipe->pipe_state & PIPE_WANTR) { wpipe->pipe_state &= ~PIPE_WANTR; wakeup(wpipe); @@ -1349,7 +1346,7 @@ pipe_ioctl(struct file *fp, u_long cmd, void *data, st PIPE_UNLOCK(mpipe); return (0); } - if (mpipe->pipe_state & PIPE_DIRECTW) + if (mpipe->pipe_map.cnt != 0) *(int *)data = mpipe->pipe_map.cnt; else *(int *)data = mpipe->pipe_buffer.cnt; @@ -1405,14 +1402,13 @@ pipe_poll(struct file *fp, int events, struct ucred *a goto locked_error; #endif if (fp->f_flag & FREAD && events & (POLLIN | POLLRDNORM)) - if ((rpipe->pipe_state & PIPE_DIRECTW) || - (rpipe->pipe_buffer.cnt > 0)) + if (rpipe->pipe_map.cnt > 0 || rpipe->pipe_buffer.cnt > 0) revents |= events & (POLLIN | POLLRDNORM); if (fp->f_flag & FWRITE && events & (POLLOUT | POLLWRNORM)) if (wpipe->pipe_present != PIPE_ACTIVE || (wpipe->pipe_state & PIPE_EOF) || - (((wpipe->pipe_state & PIPE_DIRECTW) == 0) && + ((wpipe->pipe_state & PIPE_DIRECTW) == 0 && ((wpipe->pipe_buffer.size - wpipe->pipe_buffer.cnt) >= PIPE_BUF || wpipe->pipe_buffer.size == 0))) revents |= events & (POLLOUT | POLLWRNORM); @@ -1487,7 +1483,7 @@ pipe_stat(struct file *fp, struct stat *ub, struct ucr bzero(ub, sizeof(*ub)); ub->st_mode = S_IFIFO; ub->st_blksize = PAGE_SIZE; - if (pipe->pipe_state & PIPE_DIRECTW) + if (pipe->pipe_map.cnt != 0) ub->st_size = pipe->pipe_map.cnt; else ub->st_size = pipe->pipe_buffer.cnt; @@ -1729,7 +1725,7 @@ filt_piperead(struct knote *kn, long hint) PIPE_LOCK_ASSERT(rpipe, MA_OWNED); kn->kn_data = rpipe->pipe_buffer.cnt; - if ((kn->kn_data == 0) && (rpipe->pipe_state & PIPE_DIRECTW)) + if (kn->kn_data == 0) kn->kn_data = rpipe->pipe_map.cnt; if ((rpipe->pipe_state & PIPE_EOF) ||