Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 14 Oct 2020 14:40:42 +0100
From:      Jessica Clarke <jrtc27@freebsd.org>
To:        Mateusz Guzik <mjguzik@gmail.com>
Cc:        Alex Richardson <arichardson@freebsd.org>, src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   Re: svn commit: r366697 - head/usr.bin/xinstall
Message-ID:  <E0AD1261-DEB7-4049-AA54-9BBBE764E02F@freebsd.org>
In-Reply-To: <CAGudoHE=4NxWKh1OfvNy_y4pBBcR4p=rL%2BQs5-GERzuW4WECjg@mail.gmail.com>
References:  <202010141228.09ECSg0D023438@repo.freebsd.org> <CAGudoHE=4NxWKh1OfvNy_y4pBBcR4p=rL%2BQs5-GERzuW4WECjg@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On 14 Oct 2020, at 14:28, Mateusz Guzik <mjguzik@gmail.com> wrote:
>=20
> This should use copy_file_range (also available on Linux).

I assume this is a bootstrap tool and hence the system OS and version
is relevant. macOS does not have copy_file_range, and FreeBSD only has
it in -CURRENT so that would break building on 11.x and 12.x. So any
use would need to be guarded by preprocessor checks (and there are
still actively-supported Linux distributions out there that are based
on too-old versions of the kernel and/or glibc to include it).

(FYI macOS's equivalent is copyfile(3)... maybe one day it will adopt
the copy_file_range(2) interface too)

Jess

> On 10/14/20, Alex Richardson <arichardson@freebsd.org> wrote:
>> Author: arichardson
>> Date: Wed Oct 14 12:28:41 2020
>> New Revision: 366697
>> URL: https://svnweb.freebsd.org/changeset/base/366697
>>=20
>> Log:
>>  install(1): Avoid unncessary fstatfs() calls and use mmap() based on =
size
>>=20
>>  According to git blame the trymmap() function was added in 1996 to =
skip
>>  mmap() calls for NFS file systems. However, nowadays mmap() should =
be
>>  perfectly safe even on NFS. Importantly, onl ufs and cd9660 file =
systems
>>  were whitelisted so we don't use mmap() on ZFS. It also prevents the =
use
>>  of mmap() when bootstrapping from macOS/Linux since on those systems =
the
>>  trymmap() function was always returning zero due to the missing
>> MFSNAMELEN
>>  define.
>>=20
>>  This change keeps the trymmap() function but changes it to check =
whether
>>  using mmap() can reduce the number of system calls that are =
required.
>>  Using mmap() only reduces the number of system calls if we need =
multiple
>> read()
>>  syscalls, i.e. if the file size is > MAXBSIZE. However, mmap() is =
more
>> expensive
>>  than read() so this sets the threshold at 4 fewer syscalls. =
Additionally,
>> for
>>  larger file size mmap() can significantly increase the number of =
page
>> faults,
>>  so avoid it in that case.
>>=20
>>  It's unclear whether using mmap() is ever faster than a read with an
>> appropriate
>>  buffer size, but this change at least removes two unnecessary system
>> calls
>>  for every file that is installed.
>>=20
>>  Reviewed By:	markj
>>  Differential Revision: https://reviews.freebsd.org/D26041
>>=20
>> Modified:
>>  head/usr.bin/xinstall/xinstall.c
>>=20
>> Modified: head/usr.bin/xinstall/xinstall.c
>> =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D
>> --- head/usr.bin/xinstall/xinstall.c	Wed Oct 14 10:12:39 2020	=
(r366696)
>> +++ head/usr.bin/xinstall/xinstall.c	Wed Oct 14 12:28:41 2020	=
(r366697)
>> @@ -148,7 +148,7 @@ static void	metadata_log(const char *, const =
char *, s
>> 		    const char *, const char *, off_t);
>> static int	parseid(const char *, id_t *);
>> static int	strip(const char *, int, const char *, char **);
>> -static int	trymmap(int);
>> +static int	trymmap(size_t);
>> static void	usage(void);
>>=20
>> int
>> @@ -1087,7 +1087,7 @@ compare(int from_fd, const char *from_name =
__unused,
>> s
>> 		if (do_digest)
>> 			digest_init(&ctx);
>> 		done_compare =3D 0;
>> -		if (trymmap(from_fd) && trymmap(to_fd)) {
>> +		if (trymmap(from_len) && trymmap(to_len)) {
>> 			p =3D mmap(NULL, from_len, PROT_READ, =
MAP_SHARED,
>> 			    from_fd, (off_t)0);
>> 			if (p =3D=3D MAP_FAILED)
>> @@ -1248,13 +1248,8 @@ copy(int from_fd, const char *from_name, int =
to_fd,
>> co
>>=20
>> 	digest_init(&ctx);
>>=20
>> -	/*
>> -	 * Mmap and write if less than 8M (the limit is so we don't =
totally
>> -	 * trash memory on big files.  This is really a minor hack, but =
it
>> -	 * wins some CPU back.
>> -	 */
>> 	done_copy =3D 0;
>> -	if (size <=3D 8 * 1048576 && trymmap(from_fd) &&
>> +	if (trymmap((size_t)size) &&
>> 	    (p =3D mmap(NULL, (size_t)size, PROT_READ, MAP_SHARED,
>> 		    from_fd, (off_t)0)) !=3D MAP_FAILED) {
>> 		nw =3D write(to_fd, p, size);
>> @@ -1523,20 +1518,23 @@ usage(void)
>>  *	return true (1) if mmap should be tried, false (0) if not.
>>  */
>> static int
>> -trymmap(int fd)
>> +trymmap(size_t filesize)
>> {
>> -/*
>> - * The ifdef is for bootstrapping - f_fstypename doesn't exist in
>> - * pre-Lite2-merge systems.
>> - */
>> -#ifdef MFSNAMELEN
>> -	struct statfs stfs;
>> -
>> -	if (fstatfs(fd, &stfs) !=3D 0)
>> -		return (0);
>> -	if (strcmp(stfs.f_fstypename, "ufs") =3D=3D 0 ||
>> -	    strcmp(stfs.f_fstypename, "cd9660") =3D=3D 0)
>> -		return (1);
>> -#endif
>> -	return (0);
>> +	/*
>> +	 * This function existed to skip mmap() for NFS file systems =
whereas
>> +	 * nowadays mmap() should be perfectly safe. Nevertheless, using =
mmap()
>> +	 * only reduces the number of system calls if we need multiple =
read()
>> +	 * syscalls, i.e. if the file size is > MAXBSIZE. However, =
mmap() is
>> +	 * more expensive than read() so set the threshold at 4 fewer =
syscalls.
>> +	 * Additionally, for larger file size mmap() can significantly =
increase
>> +	 * the number of page faults, so avoid it in that case.
>> +	 *
>> +	 * Note: the 8MB limit is not based on any meaningful =
benchmarking
>> +	 * results, it is simply reusing the same value that was used =
before
>> +	 * and also matches bin/cp.
>> +	 *
>> +	 * XXX: Maybe we shouldn't bother with mmap() at all, since we =
use
>> +	 * MAXBSIZE the syscall overhead of read() shouldn't be too =
high?
>> +	 */
>> +	return (filesize > 4 * MAXBSIZE && filesize < 8 * 1024 * 1024);
>> }
>> _______________________________________________
>> svn-src-all@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/svn-src-all
>> To unsubscribe, send any mail to =
"svn-src-all-unsubscribe@freebsd.org"
>>=20
>=20
>=20
> --=20
> Mateusz Guzik <mjguzik gmail.com>



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?E0AD1261-DEB7-4049-AA54-9BBBE764E02F>