From: Devin Teske
Date: Tue, 11 Dec 2018 12:35:46 -0800
Subject: Re: svn commit: r341803 - head/libexec/rc
To: cem@freebsd.org
Cc: Devin Teske, Warner Losh, src-committers, svn-src-all@freebsd.org,
    svn-src-head@freebsd.org
List-Id: SVN commit messages for the src tree for head/-current

> On Dec 11, 2018, at 11:57 AM, Conrad Meyer wrote:
>
> On Tue, Dec 11, 2018 at 10:04 AM Warner Losh wrote:
>> On Tue,
>> Dec 11, 2018, 9:55 AM John Baldwin wrote:
>>
>>> The 'read' builtin in sh can't use buffering, so it is always going
>>> to be slow
>>
>> It can't use it because of pipes. The example from the parts of this
>> that were on IRC was basically:
>>
>> foo | (read bar; baz)
>>
>> Which reads one line into the bar variable and then sends the rest to
>> the baz command.
>
> It can't trivially, but it's not impossible. sh could play games and
> buffer its own use of stdin, and then open a fresh pipe for stdin of
> subsequent non-builtins, writing out unused portions of the buffer.[A]
>
> Some other alternatives that would require kernel support but are
> things we've talked about doing in the kernel before anyway:
>
> * If we had something like eBPF programs attached to IO, maybe sh's
> read built-in could push a small eBPF program into the kernel that
> determined how many bytes could be read from the pipe in a single
> syscall without reading too far. It's fairly trivial. Simply
> returning a number of bytes up to and including the first '\n' would
> be a fine, if sometimes conservative, amount. (Input lines can be
> continued with a trailing backslash, except in -r mode, but as a
> first-cut approximation, reading-until-newline is probably good
> enough.)[B]
>
> * Heck, even just a read_until_newline(2) syscall would work and
> probably be more broadly useful than just sh(1).
> I don't think it
> passes the sniff test -- not general enough, and probably not something
> you want beginners stumbling across instead of fgets(3) -- but it'd be
> fine, and there are pipe-abusing programs other than sh(1) that care
> about reading ASCII text lines without overconsuming input.[C]
>
> * If we had something like Linux's tee(2) system call (which is as it
> sounds -- tee(1) for pipes), sh(1)'s read built-in could tee(2) for
> buffering without impacting stdin, and read(2) stdin only when it knew
> how many bytes were consumed (or when the pipe buffer became full).[D]
>
> I suspect (C) would be the easiest to implement correctly, followed by
> (D). (B) requires some architectural design and bikeshedding and
> the details on the kernel side are tricky. (A) would be a little
> tricky and probably require extensive changes to sh(1) itself, which
> is a risk to the base system. But it would not impact the kernel.
>
> Is there any interest in a tee(2)-like syscall?
>

Linux has vmsplice(2). I know jmg@ also expressed interest in having a
vmsplice in FreeBSD.

As for sh not being able to read more than a single byte at a time
because it could be reading from a pipe, what if it read into a buffer
and returned a line from the buffer? A subsequent read would return more
data from the buffer, ad nauseam until the buffer runs out -- in which
case another chunk is read to augment the data.

This buffer could be expunged when stdin collapses (e.g., when the
sub-shell completes).
-- 
Devin
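
A rough sketch of the trade-off discussed in this thread, written in
Python for brevity rather than the C that sh(1) is implemented in. It is
not how sh(1) works internally; read_line_bytewise, BufferedLineReader
and demo are hypothetical names made up for illustration. The first
helper issues one read per byte and so never consumes past the newline;
the second reads a chunk and hands out one line per call from its own
buffer, which means whatever it over-read is no longer in the pipe for a
later consumer of the same descriptor.

```python
import os


def read_line_bytewise(fd):
    """Read one line using one read call per byte.

    Slow, but nothing past the newline is consumed from fd.
    """
    out = bytearray()
    while True:
        b = os.read(fd, 1)
        if not b:  # EOF before a newline was seen
            break
        out += b
        if b == b"\n":
            break
    return bytes(out)


class BufferedLineReader:
    """Hypothetical helper: read in chunks, return one line per call."""

    def __init__(self, fd, chunk=4096):
        self.fd = fd
        self.chunk = chunk
        self.buf = b""  # data already taken out of the pipe

    def read_line(self):
        # Refill only when the buffer holds no complete line.
        while b"\n" not in self.buf:
            data = os.read(self.fd, self.chunk)
            if not data:  # EOF: return whatever is left
                line, self.buf = self.buf, b""
                return line
            self.buf += data
        line, _, self.buf = self.buf.partition(b"\n")
        return line + b"\n"


def demo():
    payload = b"first\nsecond\nthird\n"

    # Byte-at-a-time: the remaining lines are still in the pipe.
    r, w = os.pipe()
    os.write(w, payload)
    os.close(w)
    line = read_line_bytewise(r)
    rest_bytewise = os.read(r, 4096)
    os.close(r)

    # Buffered: the first read_line drains the pipe into the buffer.
    r, w = os.pipe()
    os.write(w, payload)
    os.close(w)
    reader = BufferedLineReader(r)
    first = reader.read_line()
    rest_buffered = os.read(r, 4096)  # pipe is already at EOF
    second = reader.read_line()       # served from the buffer
    os.close(r)

    return line, rest_bytewise, first, rest_buffered, second
```

In the buffered case the second line is still available to the reader
itself, but a command that inherits the descriptor afterwards (baz in
the example above) would see nothing, which is the gap that option (A)
tries to close by writing the unused buffer to a fresh pipe.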