Date:      Sat, 5 Jul 2008 09:53:15 +1000
From:      Peter Jeremy <peterjeremy@optushome.com.au>
To:        Marcus Reid <marcus@blazingdot.com>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: msync() differences between Linux and FreeBSD
Message-ID:  <20080704235315.GE29380@server.vk2pj.dyndns.org>
In-Reply-To: <20080702120002.GB65355@blazingdot.com>
References:  <20080702120002.GB65355@blazingdot.com>

On 2008-Jul-02 05:00:02 -0700, Marcus Reid <marcus@blazingdot.com> wrote:
>  It seems that in FreeBSD, msync() waits for bits to be
>committed to disk even when MS_ASYNC is specified.

Your previous ktrace output suggests that, at least for the way
rrdtool is using mmap(2), physical I/O is being performed by msync(2).
It's not clear whether FreeBSD is ignoring the MS_ASYNC flag (the code
suggests it isn't), is blocking on previously queued I/O or is blocking
for some other reason.

>First off, I don't know how frequently msync() is used, and whether changing
>its behavior would impact the performance of many things.

The behaviour of msync(2) is defined in the Single Unix Specification
and FreeBSD adheres to SUS unless there is a very good reason.

>    media... i.e. issue real I/O.  So msync() can't be a NOP if you go by
>    the OpenGroup specification.
>
>Is there a spec that FreeBSD is adhering to that prevents msync() with
>MS_ASYNC from being a NOP, seeing as munmap() does the job?

As per Matt's response that you quoted, yes.

>  And does this
>really matter for the real-world performance of some apps?

IMO, rrdtool is using mmap()/madvise()/msync()/munmap() in an unusual
fashion and it should be fixed, rather than changing FreeBSD to match
rrdtool.  I believe a more usual approach would be to mmap() a file
(or part thereof), optionally call madvise(), perform a series of
accesses and maybe msync()s of any updated regions then a single
munmap() before exiting.  Performing mmap()/msync()/munmap() (where
the msync() specifies the entire file) for each update maximises
system overheads for no obvious benefit.
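
To illustrate the "more usual approach" described above, here is a minimal
sketch in C. It is not rrdtool code: apply_updates() and sync_region() are
hypothetical names, and fixed-size records at caller-supplied offsets are an
assumption made for the example. The file is mapped once, each update flushes
only the pages it touched, and there is a single munmap() at the end.

```c
#define _DEFAULT_SOURCE         /* exposes madvise() on glibc; harmless elsewhere */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: queue a flush of only the pages covering [off, off+len). */
static int sync_region(char *base, size_t map_len, size_t off, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t start = off & ~(page - 1);   /* msync() wants a page-aligned address */

    assert(off + len <= map_len);
    return msync(base + start, (off + len) - start, MS_ASYNC);
}

/* Map the file once, apply n updates, unmap once.  Returns 0 or -1. */
int apply_updates(const char *path, const size_t *offsets, size_t n,
                  const void *rec, size_t rec_len)
{
    struct stat st;
    int fd = open(path, O_RDWR);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }

    size_t map_len = (size_t)st.st_size;
    char *base = mmap(NULL, map_len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                          /* the mapping stays valid after close() */
    if (base == MAP_FAILED)
        return -1;

    (void)madvise(base, map_len, MADV_RANDOM);  /* optional hint, errors ignored */

    int rc = 0;
    for (size_t i = 0; i < n && rc == 0; i++) {
        /* Reject any update that would run past the end of the mapping. */
        if (offsets[i] > map_len || rec_len > map_len - offsets[i]) {
            rc = -1;
            break;
        }
        memcpy(base + offsets[i], rec, rec_len);
        rc = sync_region(base, map_len, offsets[i], rec_len);
    }

    if (munmap(base, map_len) < 0)      /* one munmap() for the whole session */
        rc = -1;
    return rc;
}
```

A caller would pass the path of an existing file plus an array of byte
offsets, e.g. apply_updates("data.rrd", offs, 2, "abc", 3), where "data.rrd"
is a placeholder name.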

Also, you mentioned hitting a "brick wall" between 940MB and 1161MB.
That straddles 1GB.  You may also be running into system or process
boundary conditions (how much RAM do you have and what tuning have you
done?).  You might like to write a tool to simulate the rrdtool
behaviour with varying DB sizes and identify exactly what you are
hitting.
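
A rough starting point for such a tool might look like the sketch below. It
assumes the access pattern discussed in this thread (mmap(), one small
update, msync() over the entire file, munmap(), repeated per update) and it
is not derived from the rrdtool source. time_cycles() is a hypothetical
name, and the file it creates with ftruncate() is sparse, which a real RRD
file would not be, so the test file may need to be filled with data first
to get representative numbers.

```c
#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/*
 * Run `iters` map/update/sync/unmap cycles against a scratch file of
 * db_size bytes.  Returns the elapsed wall-clock seconds, or -1.0 on error.
 * The scratch file is removed before returning.
 */
double time_cycles(const char *path, size_t db_size, unsigned iters)
{
    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0600);

    if (fd < 0)
        return -1.0;
    if (db_size == 0 || ftruncate(fd, (off_t)db_size) < 0) {
        close(fd);
        unlink(path);
        return -1.0;
    }

    struct timespec t0, t1;
    double elapsed = -1.0;
    unsigned i;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < iters; i++) {
        char *p = mmap(NULL, db_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED)
            break;
        /* Dirty one byte, stepping 4096 bytes per iteration to spread the writes. */
        p[((size_t)i * 4096u) % db_size] = (char)i;
        int rc = msync(p, db_size, MS_ASYNC);   /* whole file, as in the thread */
        if (munmap(p, db_size) < 0 || rc < 0)
            break;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    if (i == iters)     /* only report a time if every cycle succeeded */
        elapsed = (double)(t1.tv_sec - t0.tv_sec)
                + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;

    close(fd);
    unlink(path);
    return elapsed;
}
```

Calling this in a loop with db_size stepping from roughly 900MB to 1200MB
and comparing the per-cycle times should show whether the slowdown appears
at a particular size.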

-- 
Peter Jeremy
Please excuse any delays as the result of my ISP's inability to implement
an MTA that is either RFC2821-compliant or matches their claimed behaviour.



