Date:      Wed, 3 Feb 2016 07:07:38 +1100
From:      Peter Jeremy <peter@rulingia.com>
To:        Hajimu UMEMOTO <ume@mahoroba.org>
Cc:        stable@FreeBSD.org, mckusick@FreeBSD.org
Subject:   Re: 10-STABLE hangups frequently
Message-ID:  <20160202200738.GA78969@server.rulingia.com>
In-Reply-To: <ygeegcvpmv1.wl-ume@mahoroba.org>
References:  <ygeegcvpmv1.wl-ume@mahoroba.org>

On 2016-Feb-02 16:55:46 +0900, Hajimu UMEMOTO <ume@mahoroba.org> wrote:
>I'm disturbed by a frequent hangup of my 10-STABLE boxes since this
>year.  It seems occur during running the periodic daily scripts.
>I've narrowed which commit causes this problem.  It seems r292895
>causes it.  I see many `Resource temporarily unavailable' message just
>before hangup occurs.
>Any idea?

As others have said, you need to provide lots more detail on your
configuration.

That said, I'm seeing something potentially similar on a Google
Compute Engine f1-micro instance (1 vCPU, 0.6GB RAM) that is running
FreeBSD 10-stable/amd64 with ZFS but basically idle.  (Yes, I realize
that's very little RAM for ZFS but I previously had no problems with
things like buildworld).

There were no problems at r290231 but after I upgraded to r295005, I
started seeing "out of swap" errors and hangs during the periodic
daily runs.  I'm not seeing this on 1GB instances - though they are
all running UFS.

Some experimentation suggested that just "find /" was enough to wedge
my system.  Further experimenting showed that the following loader
config was enough to prevent the hang:
vfs.zfs.arc_max="128M"
vfs.zfs.arc_meta_limit="50M"
vfs.zfs.arc_min="25M"
(previously, I had no ZFS tuning at all).
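
A minimal sketch, in Python, of how those three values relate to each
other and to the instance's RAM.  The helper names (parse_size,
parse_loader_conf, check_arc) and the checks themselves are illustrative
assumptions, not FreeBSD tooling or documented ZFS constraints, and
0.6GB is taken to mean 0.6 * 2^30 bytes.

```python
# Hypothetical helpers for illustration only; not part of FreeBSD or ZFS.
UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}


def parse_size(text):
    """Convert a loader.conf style size such as "128M" into bytes."""
    text = text.strip().strip('"').upper()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def parse_loader_conf(lines):
    """Return {tunable name: size in bytes} for name="value" lines."""
    tunables = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        tunables[name.strip()] = parse_size(value)
    return tunables


def check_arc(tunables, ram_bytes):
    """Return a list of problems found (empty if none).

    The three checks are assumptions made for this sketch:
    arc_min <= arc_max, arc_meta_limit <= arc_max, arc_max < RAM.
    """
    arc_max = tunables["vfs.zfs.arc_max"]
    arc_min = tunables["vfs.zfs.arc_min"]
    meta_limit = tunables["vfs.zfs.arc_meta_limit"]
    problems = []
    if arc_min > arc_max:
        problems.append("arc_min exceeds arc_max")
    if meta_limit > arc_max:
        problems.append("arc_meta_limit exceeds arc_max")
    if arc_max >= ram_bytes:
        problems.append("arc_max is not below physical RAM")
    return problems


if __name__ == "__main__":
    conf = parse_loader_conf([
        'vfs.zfs.arc_max="128M"',
        'vfs.zfs.arc_meta_limit="50M"',
        'vfs.zfs.arc_min="25M"',
    ])
    ram = int(0.6 * (1 << 30))  # f1-micro: 0.6GB, assumed binary GB
    for name, size in sorted(conf.items()):
        print(f"{name}: {size} bytes ({size / ram:.0%} of RAM)")
    print("problems:", check_arc(conf, ram) or "none")
```

This only checks that the values are mutually consistent; it says
nothing about whether they are the right values for a given workload.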

One oddity was that I would semi-regularly see:
 kernel: pid 67431 (ntpd), uid 0, was killed: out of swap space
I haven't worked out why the OOM killer preferred ntpd to anything else -
it didn't seem to be any bigger than the other processes.  And I didn't
see any signs that swap space was being consumed (though I haven't done
a scientific examination).  (Note that swap is on a raw partition.)

The behaviour is definitely a regression and my initial suspicion is ZFS,
though I haven't identified any smoking gun.

Unfortunately, GCE only offers read access to the console, so I can't
use DDB to poke around after it wedges.

-- 
Peter Jeremy
