From owner-svn-src-head@freebsd.org Thu Feb 20 03:01:28 2020
Return-Path:
Delivered-To: svn-src-head@mailman.nyi.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
	by mailman.nyi.freebsd.org (Postfix) with ESMTP id 2F4C825032A;
	Thu, 20 Feb 2020 03:01:28 +0000 (UTC)
	(envelope-from hrs@FreeBSD.org)
Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	server-signature RSA-PSS (4096 bits)
	client-signature RSA-PSS (4096 bits) client-digest SHA256)
	(Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK))
	by mx1.freebsd.org (Postfix) with ESMTPS id 48NK9h0TFqz4M9t;
	Thu, 20 Feb 2020 03:01:28 +0000 (UTC)
	(envelope-from hrs@FreeBSD.org)
Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client did not present a certificate)
	by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 0604C1B110;
	Thu, 20 Feb 2020 03:01:28 +0000 (UTC)
	(envelope-from hrs@FreeBSD.org)
Received: from repo.freebsd.org ([127.0.1.37])
	by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id 01K31RuX043427;
	Thu, 20 Feb 2020 03:01:27 GMT
	(envelope-from hrs@FreeBSD.org)
Received: (from hrs@localhost)
	by repo.freebsd.org (8.15.2/8.15.2/Submit) id 01K31RTk043426;
	Thu, 20 Feb 2020 03:01:27 GMT
	(envelope-from hrs@FreeBSD.org)
Message-Id: <202002200301.01K31RTk043426@repo.freebsd.org>
X-Authentication-Warning: repo.freebsd.org: hrs set sender to hrs@FreeBSD.org using -f
From: Hiroki Sato
Date: Thu, 20 Feb 2020 03:01:27 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r358152 - head/bin/sh
X-SVN-Group: head
X-SVN-Commit-Author: hrs
X-SVN-Commit-Paths: head/bin/sh
X-SVN-Commit-Revision: 358152
X-SVN-Commit-Repository: base
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-BeenThere: svn-src-head@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: SVN commit messages for the src tree for head/-current
X-List-Received-Date: Thu, 20 Feb 2020 03:01:28 -0000

Author: hrs
Date: Thu Feb 20 03:01:27 2020
New Revision: 358152
URL: https://svnweb.freebsd.org/changeset/base/358152

Log:
  Improve performance of "read" built-in command when using a seekable fd.

  The read built-in command calls read(2) with a 1-byte buffer because
  newline characters need to be detected even on a byte stream which
  comes from a non-seekable file descriptor.  Because of this, the
  following script makes more than 6,000 read(2) calls to show a 6 KiB
  file:

   while read IN; do echo "$IN"; done < /COPYRIGHT

  When the input byte stream is seekable, it is possible to read a data
  block and then reposition the file pointer to where a newline
  character was found.  This change adds a small buffer to do this and
  reduces the number of read(2) calls.

  Theoretically, multiple built-in commands reading the same seekable
  byte stream in a single pipe chain can share the buffer.  However,
  for simplicity, this change just makes each invocation of the read
  built-in allocate a buffer and deallocate it every time.  Although
  this causes read(2) to read the same regions multiple times, the
  performance penalty should be small compared to the reduction in
  read(2) calls.

  Reviewed by:	jilles
  MFC after:	1 week
  Differential Revision:	https://reviews.freebsd.org/D23747

Modified:
  head/bin/sh/miscbltin.c

Modified: head/bin/sh/miscbltin.c
==============================================================================
--- head/bin/sh/miscbltin.c	Thu Feb 20 01:45:55 2020	(r358151)
+++ head/bin/sh/miscbltin.c	Thu Feb 20 03:01:27 2020	(r358152)
@@ -66,10 +66,79 @@ __FBSDID("$FreeBSD$");
 
 #undef eflag
 
+#define	READ_BUFLEN	1024
+struct fdctx {
+	int	fd;
+	size_t	off;	/* offset in buf */
+	size_t	buflen;
+	char	*ep;	/* tail pointer */
+	char	buf[READ_BUFLEN];
+};
+
+static void fdctx_init(int, struct fdctx *);
+static void fdctx_destroy(struct fdctx *);
+static ssize_t fdgetc(struct fdctx *, char *);
 int readcmd(int, char **);
 int umaskcmd(int, char **);
 int ulimitcmd(int, char **);
 
+static void
+fdctx_init(int fd, struct fdctx *fdc)
+{
+	off_t cur;
+
+	/* Check if fd is seekable. */
+	cur = lseek(fd, 0, SEEK_CUR);
+	*fdc = (struct fdctx){
+		.fd = fd,
+		.buflen = (cur != -1) ? READ_BUFLEN : 1,
+		.ep = &fdc->buf[0],	/* No data */
+	};
+}
+
+static ssize_t
+fdgetc(struct fdctx *fdc, char *c)
+{
+	ssize_t nread;
+
+	if (&fdc->buf[fdc->off] == fdc->ep) {
+		nread = read(fdc->fd, fdc->buf, fdc->buflen);
+		if (nread > 0) {
+			fdc->off = 0;
+			fdc->ep = fdc->buf + nread;
+		} else
+			return (nread);
+	}
+	*c = fdc->buf[fdc->off++];
+
+	return (1);
+}
+
+static void
+fdctx_destroy(struct fdctx *fdc)
+{
+	size_t residue;
+
+	if (fdc->buflen > 1) {
+		/*
+		 * Reposition the file offset.  Here is the layout of buf:
+		 *
+		 *     | off
+		 *     v
+		 * |*****************|-------|
+		 * buf               ep   buf+buflen
+		 *     |<- residue ->|
+		 *
+		 * off: current character
+		 * ep:  offset just after read(2)
+		 * residue: length for reposition
+		 */
+		residue = (fdc->ep - fdc->buf) - fdc->off;
+		if (residue > 0)
+			(void) lseek(fdc->fd, -residue, SEEK_CUR);
+	}
+}
+
 /*
  * The read builtin.  The -r option causes backslashes to be treated like
  * ordinary characters.
@@ -108,6 +177,7 @@ readcmd(int argc __unused, char **argv __unused)
 	fd_set ifds;
 	ssize_t nread;
 	int sig;
+	struct fdctx fdctx;
 
 	rflag = 0;
 	prompt = NULL;
@@ -173,8 +243,9 @@ readcmd(int argc __unused, char **argv __unused)
 	backslash = 0;
 	STARTSTACKSTR(p);
 	lastnonifs = lastnonifsws = -1;
+	fdctx_init(STDIN_FILENO, &fdctx);
 	for (;;) {
-		nread = read(STDIN_FILENO, &c, 1);
+		nread = fdgetc(&fdctx, &c);
 		if (nread == -1) {
 			if (errno == EINTR) {
 				sig = pendingsig;
@@ -260,6 +331,7 @@ readcmd(int argc __unused, char **argv __unused)
 		STARTSTACKSTR(p);
 		lastnonifs = lastnonifsws = -1;
 	}
+	fdctx_destroy(&fdctx);
 	STACKSTRNUL(p);
 
 	/*
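
The following standalone sketch illustrates the technique described in the log: read a block from a seekable descriptor, stop at the first newline, and seek back over the bytes that were read but not consumed, so that the next reader starts at the following line. It is not the committed code. The names sketch_readline, sketch_demo and SKETCH_BUFLEN are hypothetical, and the sketch casts the residue to off_t before negating it, which is a choice of this example. The diff above negates a size_t instead, and on a platform where size_t is narrower than off_t that expression would convert to a large positive offset.

```c
#include <sys/types.h>
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define	SKETCH_BUFLEN	1024	/* hypothetical; mirrors READ_BUFLEN */

/*
 * Read one line (newline stripped) from fd into line[0..cap-1].
 * On a seekable fd, read a block and seek back over the unconsumed
 * tail; otherwise fall back to one byte per read(2).  *nreads is
 * incremented for every read(2) call issued.
 * Returns the line length, or -1 on EOF/error before any byte.
 */
static ssize_t
sketch_readline(int fd, char *line, size_t cap, int *nreads)
{
	char buf[SKETCH_BUFLEN];
	size_t want, off = 0, len = 0;
	ssize_t nread = 0;
	int seekable, done = 0;

	assert(cap > 0);
	/* Same probe as the diff: lseek fails on pipes and sockets. */
	seekable = (lseek(fd, 0, SEEK_CUR) != -1);
	want = seekable ? sizeof(buf) : 1;

	while (!done) {
		nread = read(fd, buf, want);
		(*nreads)++;
		if (nread <= 0)
			break;
		for (off = 0; off < (size_t)nread; ) {
			char c = buf[off++];
			if (c == '\n') {
				done = 1;
				break;
			}
			if (len + 1 < cap)
				line[len++] = c;
		}
	}
	line[len] = '\0';

	/* Give back the bytes that were read past the newline. */
	if (done && seekable && (size_t)nread > off)
		(void)lseek(fd, -(off_t)((size_t)nread - off), SEEK_CUR);
	if (!done && len == 0)
		return (-1);
	return ((ssize_t)len);
}

/*
 * Write two lines to a temporary file, read them back one call at a
 * time, and return the number of read(2) calls used (or -1 on failure).
 */
int
sketch_demo(void)
{
	char path[] = "/tmp/fdctx-sketch.XXXXXX";
	char a[64], b[64];
	const char *text = "first\nsecond\n";
	int fd, nreads = 0, ok;

	fd = mkstemp(path);
	if (fd == -1)
		return (-1);
	(void)unlink(path);
	if (write(fd, text, strlen(text)) != (ssize_t)strlen(text) ||
	    lseek(fd, 0, SEEK_SET) == -1) {
		(void)close(fd);
		return (-1);
	}
	ok = sketch_readline(fd, a, sizeof(a), &nreads) == 5 &&
	    sketch_readline(fd, b, sizeof(b), &nreads) == 6 &&
	    strcmp(a, "first") == 0 && strcmp(b, "second") == 0;
	(void)close(fd);
	return (ok ? nreads : -1);
}
```

Calling sketch_demo() from a small main() should return 2: each line costs a single read(2), and the second call sees "second" only because the first call seeked back over the seven bytes it had read ahead. With the one-byte fallback the same input would take 13 calls.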