From owner-freebsd-stable@FreeBSD.ORG Sat Jan 19 20:25:07 2013
Date: Sat, 19 Jan 2013 12:19:14 -0800
From: John <john@theusgroup.com>
To: Marin Atanasov Nikolov
Cc: freebsd-stable@freebsd.org
Subject: Re: Spontaneous reboots on Intel i5 and FreeBSD 9.0
Message-Id: <20130119201914.84B761CB@server.theusgroup.com>
References: <1358527685.32417.237.camel@revolution.hippie.lan> <20130118173602.GA76438@neutralgood.org>
Comments: In-reply-to Marin Atanasov Nikolov message dated "Sat, 19 Jan 2013 12:30:17 +0200."

> At 03:00am I can see that periodic(8) runs, but I don't see what could
> have taken so much of the free memory. I'm also running this system on
> ZFS and have daily rotating ZFS snapshots created - currently the number
> of ZFS snapshots is > 1000, and I'm not sure if that could be causing
> this. Here's a list of the periodic(8) daily scripts that run at 03:00am:
>
> % ls -1 /etc/periodic/daily
> 800.scrub-zfs
>
> % ls -1 /usr/local/etc/periodic/daily
> 402.zfSnap
> 403.zfSnap_delete

On a couple of my ZFS machines, I've found that running a scrub alongside
other heavy file-system users is a problem. I therefore run scrub from cron
and schedule it so it doesn't overlap with periodic (see the crontab sketch
at the end of this message).

I also found that on a machine with an i3 and 4 GB of RAM, overlapping
scrubs and snapshot destroys would grind the machine to the point of being
unresponsive. This was not a problem when the machine was new, but became
one as the pool got larger (dedup is off and the pool is at 45% capacity).

I use my own ZFS management script; it prevents snapshot destroys from
overlapping scrubs, and with a lockfile it prevents a new destroy from
being initiated while an old one is still running (a sketch of that guard
is below as well). zfSnap has its -S switch to prevent actions during a
scrub, which you should use if you haven't already.

Since making these changes, a machine that used to need rebooting several
times a week has now been up 61 days.

John Theus
TheUs Group
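
P.S. Here's roughly how the staggering looks on my end. The pool name
("tank") and the day/time are placeholders; the only real constraint is
keeping the scrub well clear of the 03:00 periodic daily run. If I remember
the knob's name right, daily_scrub_zfs_enable is what turns off the stock
800.scrub-zfs script so cron is the only thing starting scrubs:

    # /etc/periodic.conf -- stop periodic from also starting scrubs
    daily_scrub_zfs_enable="NO"

    # /etc/crontab -- run the scrub once a week, away from periodic at 03:00
    # minute hour mday month wday who   command
    0        13   *    *     6    root  /sbin/zpool scrub tank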
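
The destroy guard in my script is along these lines. The pool name and lock
path are made up for the example; lockf(1) is the stock FreeBSD utility:

    #!/bin/sh
    # destroy_snapshot.sh -- sketch of the guard logic described above.
    # Usage: destroy_snapshot.sh pool/fs@snapname
    POOL=tank                      # placeholder pool name
    SNAP=$1

    # Don't destroy anything while the pool is being scrubbed.
    if zpool status "$POOL" | grep -q "scrub in progress"; then
        exit 0
    fi

    # lockf(1) serializes destroys: -t 0 gives up immediately if a
    # previous destroy still holds the lock, instead of piling up.
    lockf -t 0 /var/run/zfs_destroy.lock /sbin/zfs destroy "$SNAP"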
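
And for zfSnap, the -S switch just goes in whatever flags your periodic
scripts pass to it. From memory, so double-check against the man page; the
TTL and pool are placeholders:

    # take recursive snapshots with a one-week TTL, but skip any
    # pool that currently has a scrub running (-S)
    zfSnap -S -a 1w -r tank

    # same guard on the cleanup pass that deletes expired snapshots
    zfSnap -S -d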