Date:      Tue, 11 Dec 2018 09:54:58 -0800
From:      John Baldwin <jhb@FreeBSD.org>
To:        Devin Teske <dteske@FreeBSD.org>
Cc:        Conrad Meyer <cem@FreeBSD.org>, src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   Re: svn commit: r341803 - head/libexec/rc
Message-ID:  <dafbcc18-146f-2e4f-e1e9-346d7c05b096@FreeBSD.org>
In-Reply-To: <98481565-CDD7-4301-B86B-072D5B984AF7@FreeBSD.org>
References:  <201812110138.wBB1cp1p006660@repo.freebsd.org> <2a76b295-b2da-3015-c201-dbe0ec63ca5a@FreeBSD.org> <98481565-CDD7-4301-B86B-072D5B984AF7@FreeBSD.org>

On 12/11/18 9:40 AM, Devin Teske wrote:
> 
> 
>> On Dec 11, 2018, at 9:23 AM, John Baldwin <jhb@FreeBSD.org> wrote:
>>
>> On 12/10/18 5:38 PM, Conrad Meyer wrote:
>>> Author: cem
>>> Date: Tue Dec 11 01:38:50 2018
>>> New Revision: 341803
>>> URL: https://svnweb.freebsd.org/changeset/base/341803
>>>
>>> Log:
>>>  rc.subr: Implement list_vars without using 'read'
>>>
>>>  'read' pessimistically read(2)s one byte at a time, which can be quite
>>>  silly for large environments in slow emulators.
>>>
>>>  In my boring user environment, truss shows that the number of read()
>>>  syscalls to source rc.subr and invoke list_vars is reduced by something like
>>>  3400 to 60.  ministat(1) shows a significant time difference of about -71%
>>>  for my environment.
>>>
>>>  Suggested by:          jilles
>>>  Discussed with:        dteske, jhb, jilles
>>>  Differential Revision: https://reviews.freebsd.org/D18481
>>
>> For some background, one of my colleagues reported that it was taking hours in
>> an (admittedly slow) CPU simulator to get through '/etc/rc.d/netif start'.
>> I ended up running that script under truss in a RISC-V qemu machine.  The
>> entire run took 212 seconds (truss did slow it down quite a bit).  Of that
>> 212 seconds, the read side of each list_vars invocation took ~25.5 seconds,
>> and with lo0 and vtnet0 there were 8 list_vars invocations, so 204 out of
>> the 212 seconds were spent in the single-byte read() syscalls in 'while read'.
>>
>> Even on qemu without truss during bootup 'netif start' took a couple of
>> seconds (long enough to get 2-3 Ctrl-T's in) before this change and is now
>> similar to bare metal with the change.  list_vars is rarely used outside of
>> 'netif', so it probably doesn't make a measurable difference on bare metal.
>>
> 
> Thank you for the background which was lost by the time I got to the phab.
> 
> I can't help but ask though,...
> 
> If it was noticed that read(2) processes the stream one byte at a time,
> why not just optimize read(2)?
> 
> I'm afraid of the prospect of having to hunt down every instance of while-read,
> but if we can fix the underlying read(2) inefficiency then we make while-read OK.

It's a system call.  A CPU emulator has to do a lot of work for a system call
because it involves two mode switches (user -> kernel and back again).  You
can't "fix" that as it's just a part of the CPU architecture.  There's a reason
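that stdio uses buffering by default: system calls have overhead.
The 'read' builtin in sh can't use buffering, because it must not consume
any input past the end of the line (whatever follows belongs to the next
command reading the same descriptor), so it is always going to be
inefficient.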
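
A minimal sketch of why the builtin can't read ahead, assuming a POSIX sh
and a pipe as input.  The variable names are made up for illustration and
are not taken from rc.subr:

```shell
#!/bin/sh
# Illustration only: 'first' and 'rest' are hypothetical names.
# 'read' consumes the first line of the pipe.  'cat' then reads from the
# same pipe and must still find the second line there.  If 'read' had
# pulled a whole buffer out of the pipe, 'two' would be lost to 'cat'.
rest=$(printf 'one\ntwo\n' | { read -r first; cat; })
echo "left for the next command: $rest"
```

Since a pipe can't be seeked backwards, the only way to stop exactly at
the newline is to ask the kernel for one byte at a time, which is where
the per-byte read() calls come from.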

-- 
John Baldwin

                                                                            


