From: Marin Atanasov Nikolov
To: Bob Bishop
Cc: ml-freebsd-stable, John
Date: Fri, 25 Jan 2013 12:26:47 +0200
Subject: Re: Spontaneous reboots on Intel i5 and FreeBSD 9.0
References: <1358527685.32417.237.camel@revolution.hippie.lan> <20130118173602.GA76438@neutralgood.org> <20130119201914.84B761CB@server.theusgroup.com>
List-Id: Production branch of FreeBSD source code (freebsd-stable@freebsd.org)

On Fri, Jan 25, 2013 at 12:12 PM, Bob Bishop wrote:

> Hi,
>
> On 25 Jan 2013, at 09:29, Marin Atanasov Nikolov wrote:
>
> > Hello again :)
> >
> > Here's my update on these spontaneous reboots, less than a week after
> > I updated to stable/9.
> >
> > For the first two days the system ran fine with no reboots, so I
> > thought that this update had actually fixed it, but I was wrong.
> >
> > The reboots are still happening and there is still no clear evidence
> > of the root cause. What I did so far:
> >
> > * Ran disk tests -- looking good
> > * Ran memtest -- looking good
> > * Replaced power cables
> > * Ran UPS tests -- looking good
> > * Checked for any bad capacitors -- none found
> > * Removed all ZFS snapshots
> >
> > There is also one more machine connected to the same UPS, so if it were
> > a UPS issue I'd expect the other one to reboot too, but that's not the
> > case.
> >
> > Now that I've excluded the hardware part of this problem
>
> Have you done anything to rule out the machine's power supply?

Hi,

Yes, it's a brand new one.

Regards,
Marin

> > I started looking
> > again into the software side, and this time in particular -- ZFS.
> >
> > I'm running FreeBSD 9.1-STABLE #1 r245686 on an Intel i5 with 8 GB of
> > memory.
> >
> > A quick look at top(1) showed lots of memory used by the ARC and my
> > available free memory dropping fast. I've made a screenshot, which you
> > can see at the link below:
> >
> > * http://users.unix-heaven.org/~dnaeon/top-zfs-arc.jpg
> >
> > So I went to the FreeBSD Wiki and started reading the ZFS Tuning Guide
> > [1], but honestly at the end I was not sure which parameters I need to
> > increase/decrease and to what values.
> >
> > Here's some info about my current parameters.
> >
> > % sysctl vm.kmem_size_max
> > vm.kmem_size_max: 329853485875
> >
> > % sysctl vm.kmem_size
> > vm.kmem_size: 8279539712
> >
> > % sysctl vfs.zfs.arc_max
> > vfs.zfs.arc_max: 7205797888
> >
> > % sysctl kern.maxvnodes
> > kern.maxvnodes: 206227
> >
> > There's a script in the ZFSTuningGuide which calculates kernel memory
> > utilization, and for me it reports the values listed below:
> >
> > TEXT=22402749, 21.3649 MB
> > DATA=4896264192, 4669.44 MB
> > TOTAL=4918666941, 4690.81 MB
> >
> > While looking for ZFS tuning advice I've also stumbled upon this thread
> > in the FreeBSD Forums [2], where the OP describes behaviour similar to
> > what I am experiencing, so I'm quite worried now that the reason for
> > these crashes is ZFS.
> >
> > Before making any change to the kernel parameters (vm.kmem_size,
> > vm.kmem_size_max, kern.maxvnodes, vfs.zfs.arc_max) I'd like to hear
> > feedback from people who have already done such tuning on their ZFS
> > systems.
> >
> > Could you please share the optimal values for these parameters on a
> > system with 8 GB of memory? Is there a way to calculate these values,
> > or is it just a "test-and-see-which-fits-better" way of doing this?
> >
> > Thanks and regards,
> > Marin
> >
> > [1]: https://wiki.freebsd.org/ZFSTuningGuide
> > [2]: http://forums.freebsd.org/showthread.php?t=9143
> >
> >
> > On Sun, Jan 20, 2013 at 3:44 PM, Marin Atanasov Nikolov
> > <dnaeon@gmail.com> wrote:
> >
> >> On Sat, Jan 19, 2013 at 10:19 PM, John wrote:
> >>
> >>>> At 03:00am I can see that periodic(8) runs, but I don't see what
> >>>> could have taken so much of the free memory. I'm also running this
> >>>> system on ZFS and have daily rotating ZFS snapshots created -
> >>>> currently the number of ZFS snapshots is > 1000, and I'm not sure if
> >>>> that could be causing this. Here's a list of the periodic(8) daily
> >>>> scripts that run at 03:00am.
> >>>>
> >>>> % ls -1 /etc/periodic/daily
> >>>> 800.scrub-zfs
> >>>>
> >>>> % ls -1 /usr/local/etc/periodic/daily
> >>>> 402.zfSnap
> >>>> 403.zfSnap_delete
> >>>
> >>> On a couple of my zfs machines, I've found running a scrub along with
> >>> other high file system users to be a problem. I therefore run scrub
> >>> from cron and schedule it so it doesn't overlap with periodic.
> >>>
> >>> I also found on a machine with an i3 and 4G ram that overlapping
> >>> scrubs and snapshot destroy would cause the machine to grind to the
> >>> point of being non-responsive. This was not a problem when the
> >>> machine was new, but became one as the pool got larger (dedup is off
> >>> and the pool is at 45% capacity).
> >>>
> >>> I use my own zfs management script and it prevents snapshot destroys
> >>> from overlapping scrubs, and with a lockfile it prevents a new
> >>> destroy from being initiated when an old one is still running.
> >>>
> >>> zfSnap has its -S switch to prevent actions during a scrub, which you
> >>> should use if you haven't already.
> >>>
> >>
> >> Hi John,
> >>
> >> Thanks for the hints. It's been a long time since I set up zfSnap; I've
> >> just checked the configuration and I am using the "-s -S" flags, so
> >> there should be no overlapping.
> >>
> >> Meanwhile I've updated to 9.1-RELEASE, but then I hit an issue when
> >> trying to reboot the system (which appears to be discussed a lot in a
> >> separate thread).
> >>
> >> Then I updated to stable/9, so at least the reboot issue is now
> >> solved. Since updating to stable/9 I've been monitoring the system's
> >> memory usage and so far it's been pretty stable, so I'll keep an eye
> >> on whether the update to stable/9 has actually fixed this strange
> >> issue.
> >>
> >> Thanks again,
> >> Marin
> >>
> >>
> >>> Since making these changes, a machine that would have to be rebooted
> >>> several times a week has now been up 61 days.
> >>>
> >>> John Theus
> >>> TheUs Group
> >>>
>
> --
> Bob Bishop          +44 (0)118 940 1243
> rb@gid.co.uk    fax +44 (0)118 940 1295
>              mobile +44 (0)783 626 4518

--
Marin Atanasov Nikolov

dnaeon AT gmail DOT com
http://www.unix-heaven.org/
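
As a rough aid for reading the sysctl values quoted in this message, the sketch below converts vm.kmem_size and vfs.zfs.arc_max to GiB and computes how much of vm.kmem_size lies above the ARC limit. The byte values are copied from the message; the helper names (to_gib, arc_headroom) are hypothetical, and treating the difference as "headroom" for the rest of kernel memory is an assumption made for illustration, not tuning advice.

```python
# Values copied from the sysctl output quoted in this message (bytes).
SYSCTLS = {
    "vm.kmem_size": 8279539712,
    "vfs.zfs.arc_max": 7205797888,
}

GIB = 1024 ** 3


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB (hypothetical helper)."""
    return n_bytes / GIB


def arc_headroom(sysctls: dict) -> int:
    """Bytes of vm.kmem_size that lie above vfs.zfs.arc_max."""
    return sysctls["vm.kmem_size"] - sysctls["vfs.zfs.arc_max"]


if __name__ == "__main__":
    for name, value in SYSCTLS.items():
        print(f"{name}: {to_gib(value):.2f} GiB")
    print(f"headroom: {to_gib(arc_headroom(SYSCTLS)):.2f} GiB")
```

With these numbers the ARC limit sits exactly 1 GiB below vm.kmem_size (about 6.71 GiB against 7.71 GiB), so the ARC may grow to most of the 8 GB in the machine. That would be consistent with the shrinking free memory seen in top(1), though it does not by itself show that ZFS causes the reboots.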