From: Marin Atanasov Nikolov
To: John
Cc: ml-freebsd-stable
Date: Sun, 20 Jan 2013 15:44:40 +0200
Subject: Re: Spontaneous reboots on Intel i5 and FreeBSD 9.0

On Sat, Jan 19, 2013 at 10:19 PM, John wrote:

> >At 03:00am I can see that periodic(8) runs, but I don't see what could have
> >taken so much of the free memory. I'm also running this system on ZFS and
> >have daily rotating ZFS snapshots created - currently the number of ZFS
> >snapshots are > 1000, and not sure if that could be causing this. Here's a
> >list of the periodic(8) daily scripts that run at 03:00am time.
> >
> >% ls -1 /etc/periodic/daily
> >800.scrub-zfs
> >
> >% ls -1 /usr/local/etc/periodic/daily
> >402.zfSnap
> >403.zfSnap_delete
>
> On a couple of my zfs machines, I've found running a scrub along with other
> high file system users to be a problem. I therefore run scrub from cron and
> schedule it so it doesn't overlap with periodic.
>
> I also found on a machine with an i3 and 4G ram that overlapping scrubs and
> snapshot destroy would cause the machine to grind to the point of being
> non-responsive. This was not a problem when the machine was new, but became
> one as the pool got larger (dedup is off and the pool is at 45% capacity).
>
> I use my own zfs management script and it prevents snapshot destroys from
> overlapping scrubs, and with a lockfile it prevents a new destroy from being
> initiated when an old one is still running.
>
> zfSnap has its -S switch to prevent actions during a scrub which you should
> use if you haven't already.

Hi John,

Thanks for the hints.

It has been a long time since I set up zfSnap, so I've just checked the
configuration: I am using the "-s -S" flags, so there should be no
overlapping.

Meanwhile I updated to 9.1-RELEASE, but then I hit an issue when trying to
reboot the system (which appears to be discussed a lot in a separate thread).
I then updated to stable/9, so at least the reboot issue is now solved.

Since updating to stable/9 I've been monitoring the system's memory usage, and
so far it's been pretty stable, so I'll keep an eye on whether the update to
stable/9 has actually fixed this strange issue.

Thanks again,
Marin

> Since making these changes, a machine that would have to be rebooted several
> times a week has now been up 61 days.
>
> John Theus
> TheUs Group

-- 
Marin Atanasov Nikolov
dnaeon AT gmail DOT com
http://www.unix-heaven.org/
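
For readers following the thread, the guard John describes (skip snapshot destroys while a scrub is running, and use a lockfile so a new destroy never starts while an old one is still going) might look roughly like the sketch below. This is not John's script, which was not posted. The names `scrub_in_progress`, `DestroyLock` and `may_destroy` are hypothetical, and the check assumes that `zpool status` output contains the phrase "scrub in progress" while a scrub runs.

```python
import fcntl
import os


def scrub_in_progress(zpool_status_text):
    """Return True if the given `zpool status` output reports a running scrub.

    Assumption: the output contains the phrase "scrub in progress".
    """
    return "scrub in progress" in zpool_status_text


class DestroyLock:
    """Hypothetical lockfile wrapper: one snapshot-destroy run at a time."""

    def __init__(self, path):
        self.path = path
        self.fd = None

    def acquire(self):
        """Try to take the lock without blocking; return True on success."""
        fd = os.open(self.path, os.O_CREAT | os.O_RDWR, 0o600)
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            # Another destroy run already holds the lock.
            os.close(fd)
            return False
        self.fd = fd
        return True

    def release(self):
        """Drop the lock if we hold it."""
        if self.fd is not None:
            fcntl.flock(self.fd, fcntl.LOCK_UN)
            os.close(self.fd)
            self.fd = None


def may_destroy(zpool_status_text, lock):
    """Allow a destroy only if no scrub is running and the lock is free."""
    if scrub_in_progress(zpool_status_text):
        return False
    return lock.acquire()
```

A wrapper script would pass the captured output of `zpool status` to `may_destroy`, run its snapshot destroys only when that returns True, and call `release()` afterwards. zfSnap users get the scrub check from the `-S` switch mentioned above, so a sketch like this only matters for home-grown scripts.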