Date: Wed, 3 Feb 2010 05:38:45 -0500
From: Thomas Burgess <wonslung@gmail.com>
To: Attila Nagy <bra@fsn.hu>
Cc: freebsd-fs@freebsd.org
Subject: Re: Machine stops for some seconds with ZFS
Message-ID: <deb820501002030238m5b71155bk9a6678b7d76a83b@mail.gmail.com>
In-Reply-To: <4B694689.2030704@fsn.hu>
References: <4B694689.2030704@fsn.hu>

Why would you use a USB drive for L2ARC? I would think that would make things
slower. Have you tried setting up without the USB drive?

On Wed, Feb 3, 2010 at 4:48 AM, Attila Nagy <bra@fsn.hu> wrote:

> Hello,
>
> After a long time, I've switched back to ZFS on my desktop. It runs
> 8-STABLE/amd64 with two SATA disks and a USB pendrive.
> One partition from each disk is used for the zpool, which is encrypted
> using GELI, and the pendrive is there for L2ARC:
>         NAME            STATE     READ WRITE CKSUM
>         data            ONLINE       0     0     0
>           mirror        ONLINE       0     0     0
>             ad0s1d.eli  ONLINE       0     0     0
>             ad1s1d.eli  ONLINE       0     0     0
>         cache
>           da0           ONLINE       0     0     0
>
> Today, after 12 days of uptime, the machine froze. I could ping it from
> a different machine, and could even open a telnet connection to its ssh
> port, but I couldn't get the ssh banner.
>
> Now I'm building a 9-CURRENT kernel and world to see whether the same
> problem persists with that, and during the make process I've noticed a
> strange thing.
> I build with -j4 (the machine has one dual-core CPU), so the fans are
> screaming during the process. But every few minutes (I couldn't recognize
> any pattern in it) the machine goes completely silent (even more silent
> than normal), and everything halts.
> During this, the top running on the machine can refresh itself, and I can
> type in pass-through ssh connections (that is, I use the machine in question
> to access other machines with ssh), but I can't open new ssh connections to
> it, and can't start anything new (for example from an open shell).
> ping is running seamlessly during this, and top shows the following:
>
> last pid: 36503; load averages: 1.59, 3.04, 3.01 up 0+00:49:53 10:32:10
> 97 processes: 1 running, 96 sleeping
> CPU: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
> Mem: 218M Active, 24M Inact, 639M Wired, 40M Cache, 6208K Buf, 1022M Free
> Swap: 4096M Total, 4096M Free
>
>   PID USERNAME   THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
>  1342 root         1  44    0  3204K   620K select 0   0:02  0.00% make
>  1424 root         1  44    0  3204K  1036K select 0   0:01  0.00% make
>  1280 root         1  44    0 12540K  1900K select 0   0:01  0.00% hald-addon-storage
>  1234 haldaemon    1  44    0 24116K  4464K select 0   0:01  0.00% hald
> 93600 root         1  44    0  3204K  1028K select 0   0:00  0.00% make
>  1260 root         1  44    0 19704K  2688K select 0   0:00  0.00% hald-addon-mouse-sy
> 15142 bra          1  44    0  9332K  2864K CPU0   0   0:00  0.00% top
>  1263 root         1  44    0 12540K  1896K cgticb 0   0:00  0.00% hald-addon-storage
> 94415 bra          1  44    0 37944K  4992K select 1   0:00  0.00% sshd
> 35837 root         1  44    0  5252K  2424K select 1   0:00  0.00% make
> 95361 bra          1  44    0 37944K  4992K select 1   0:00  0.00% sshd
> 35973 root         1  44    0  3204K  1772K select 0   0:00  0.00% make
>   608 root         1  44    0  6892K  1436K select 1   0:00  0.00% syslogd
> 96928 root         1  44    0  3204K   728K select 0   0:00  0.00% make
> 94369 root         1  51    0 37944K  4584K sbwait 0   0:00  0.00% sshd
> 82631 root         1  50    0 37944K  4584K sbwait 0   0:00  0.00% sshd
> 16304 root         1  44    0 37944K  4576K zio->i 1   0:00  0.00% sshd
>   951 _ntp         1  44    0  6876K  1692K select 0   0:00  0.00% ntpd
>  1238 root         1  76    0 16768K  2372K select 0   0:00  0.00% hald-runner
>  4916 root         1  44    0  3204K   728K select 1   0:00  0.00% make
> 95338 root         1  49    0 37944K  4584K sbwait 1   0:00  0.00% sshd
>  1259 root         1  44    0 10280K  2712K pause  1   0:00  0.00% csh
> 33357 bra          1  44    0 21596K  4004K select 0   0:00  0.00% ssh
> 16405 bra          1  44    0 37944K  5012K zio->i 0   0:00  0.00% sshd
>  1044 root         1  44    0  9104K  1796K kqread 0   0:00  0.00% master
> 34765 root         1  76    0  8260K  1764K wait   1   0:00  0.00% sh
> 82685 bra          1  44    0 37944K  4960K select 1   0:00  0.00% sshd
>  1065 postfix      1  44    0  9100K  1872K kqread 0   0:00  0.00% qmgr
>  1237 root        17  44    0 27460K  4124K waitvt 0   0:00  0.00% console-kit-daemon
> 95362 bra          1  44    0 10216K  2612K ttyin  0   0:00  0.00% bash
> 34764 root         1  44    0  3204K   852K select 0   0:00  0.00% make
>  1222 root         1  49    0 21672K  1896K wait   0   0:00  0.00% login
> 35728 root         1  44    0  3204K   860K select 0   0:00  0.00% make
>  1064 postfix      1  44    0  9104K  1772K zio->i 1   0:00  0.00% pickup
> 82696 bra          1  44    0 10216K  2596K wait   0   0:00  0.00% bash
> 94417 bra          1  44    0 10216K  2596K wait   1   0:00  0.00% bash
> 35455 root         1  44    0  3204K   744K select 0   0:00  0.00% make
> 35774 root         1  44    0  3204K   728K select 1   0:00  0.00% make
> 16409 bra          1  44    0 10216K  2592K ttyin  0   0:00  0.00% bash
>  1155 root         1  44    0  7948K  1604K nanslp 0   0:00  0.00% cron
>  1077 messagebus   1  53    0  8092K  2060K select 0   0:00  0.00% dbus-daemon
>  1149 root         1  44    0 26012K  3960K select 1   0:00  0.00% sshd
> 35729 root         1  76    0  8260K  1760K wait   0   0:00  0.00% sh
>  4921 root         1  57    0  8260K  1748K wait   0   0:00  0.00% sh
>   825 root         1  76    0 39212K  2372K lockf  1   0:00  0.00% saslauthd
> 35460 root         1  76    0  8260K  1748K wait   0   0:00  0.00% sh
> 34761 root         1  48    0  8260K  1740K wait   1   0:00  0.00% sh
> 96923 root         1  50    0  8260K  1740K wait   0   0:00  0.00% sh
>
> As you can see, top reports that the machine is 100% idle while a make -j4
> buildworld runs. This lasts for a few seconds (10-20), then everything goes
> back to normal: the fans start to scream, the build continues, and I can use
> the machine.
> This occasional halt is new to me (but I've just switched to ZFS on my
> desktop; on a server it's harder to notice if you don't use it for
> interactive sessions), but I have seen the final freeze on more than one
> server.
> How could I help to debug this, and the final freeze?
>
> Thanks,
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>
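
When comparing stalls with and without the cache device, it may help to pull out just the processes sitting in one wait state. Several rows in the quoted top output show `zio->i`, which I believe is top's truncated wait channel for a thread waiting on a ZFS I/O, though that reading is my assumption and not something stated in the report. Below is a minimal Python sketch that filters top process rows by the STATE column. The function name `processes_in_state` and the regular expression are made up for illustration, and the sample rows are copied from the quoted output.

```python
import re

# One process row of top(1) output, in the column order shown in the report:
# PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
ROW = re.compile(
    r"^\s*(?P<pid>\d+)\s+(?P<user>\S+)\s+\d+\s+-?\d+\s+-?\d+\s+\S+\s+\S+\s+"
    r"(?P<state>\S+)\s+\d+\s+\S+\s+\S+\s+(?P<command>\S+)"
)


def processes_in_state(top_text, state):
    """Return (pid, user, command) for each row whose STATE equals `state`."""
    found = []
    for line in top_text.splitlines():
        line = line.lstrip("> ")  # tolerate mail quoting in pasted output
        match = ROW.match(line)
        if match and match.group("state") == state:
            found.append(
                (int(match.group("pid")), match.group("user"), match.group("command"))
            )
    return found


# Sample rows copied from the quoted top output.
SAMPLE = """\
16304 root 1 44 0 37944K 4576K zio->i 1 0:00 0.00% sshd
951 _ntp 1 44 0 6876K 1692K select 0 0:00 0.00% ntpd
16405 bra 1 44 0 37944K 5012K zio->i 0 0:00 0.00% sshd
1064 postfix 1 44 0 9104K 1772K zio->i 1 0:00 0.00% pickup
"""

if __name__ == "__main__":
    for pid, user, command in processes_in_state(SAMPLE, "zio->i"):
        print(pid, user, command)
```

Feeding it the output captured during a stall, once with the pendrive attached and once without, would show whether the same processes pile up in that state in both cases.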
