From owner-freebsd-stable@FreeBSD.ORG Sat Jan 19 20:25:07 2013
Date: Sat, 19 Jan 2013 12:19:14 -0800
From: John <john@theusgroup.com>
To: Marin Atanasov Nikolov
Cc: freebsd-stable@freebsd.org
Subject: Re: Spontaneous reboots on Intel i5 and FreeBSD 9.0
Message-Id: <20130119201914.84B761CB@server.theusgroup.com>
References: <1358527685.32417.237.camel@revolution.hippie.lan> <20130118173602.GA76438@neutralgood.org>
Comments: In-reply-to Marin Atanasov Nikolov message dated "Sat, 19 Jan 2013 12:30:17 +0200."

> At 03:00am I can see that periodic(8) runs, but I don't see what could
> have taken so much of the free memory. I'm also running this system on
> ZFS and have daily rotating ZFS snapshots created - currently the number
> of ZFS snapshots is > 1000, and I'm not sure if that could be causing
> this. Here's a list of the periodic(8) daily scripts that run at 03:00am:
>
> % ls -1 /etc/periodic/daily
> 800.scrub-zfs
>
> % ls -1 /usr/local/etc/periodic/daily
> 402.zfSnap
> 403.zfSnap_delete

On a couple of my ZFS machines, I've found that running a scrub alongside
other heavy file-system users is a problem. I therefore run scrub from cron
and schedule it so it doesn't overlap with periodic (see the crontab sketch
at the end of this message).

I also found that on a machine with an i3 and 4 GB of RAM, overlapping
scrubs and snapshot destroys would grind the machine to the point of being
unresponsive. This was not a problem when the machine was new, but became
one as the pool got larger (dedup is off and the pool is at 45% capacity).

I use my own ZFS management script; it prevents snapshot destroys from
overlapping scrubs, and with a lockfile it prevents a new destroy from
being initiated while an old one is still running (a sketch of that guard
is below as well). zfSnap has its -S switch to prevent actions during a
scrub, which you should use if you haven't already.

Since making these changes, a machine that used to need rebooting several
times a week has now been up 61 days.

John Theus
TheUs Group
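
P.S. Here's roughly how the staggering looks on my end. The pool name
("tank") and the day/time are placeholders; the only real constraint is
keeping the scrub well clear of the 03:00 periodic daily run. If I remember
the knob's name right, daily_scrub_zfs_enable is what turns off the stock
800.scrub-zfs script so cron is the only thing starting scrubs:

    # /etc/periodic.conf -- stop periodic from also starting scrubs
    daily_scrub_zfs_enable="NO"

    # /etc/crontab -- run the scrub once a week, away from periodic at 03:00
    # minute hour mday month wday who   command
    0        13   *    *     6    root  /sbin/zpool scrub tank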
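
The destroy guard in my script is along these lines. The pool name and lock
path are made up for the example; lockf(1) is the stock FreeBSD utility:

    #!/bin/sh
    # destroy_snapshot.sh -- sketch of the guard logic described above.
    # Usage: destroy_snapshot.sh pool/fs@snapname
    POOL=tank                      # placeholder pool name
    SNAP=$1

    # Don't destroy anything while the pool is being scrubbed.
    if zpool status "$POOL" | grep -q "scrub in progress"; then
        exit 0
    fi

    # lockf(1) serializes destroys: -t 0 gives up immediately if a
    # previous destroy still holds the lock, instead of piling up.
    lockf -t 0 /var/run/zfs_destroy.lock /sbin/zfs destroy "$SNAP"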
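
And for zfSnap, the -S switch just goes in whatever flags your periodic
scripts pass to it. From memory, so double-check against the man page; the
TTL and pool are placeholders:

    # take recursive snapshots with a one-week TTL, but skip any
    # pool that currently has a scrub running (-S)
    zfSnap -S -a 1w -r tank

    # same guard on the cleanup pass that deletes expired snapshots
    zfSnap -S -d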