From: Tommi Lätti
To: Attila Nagy
Cc: freebsd-fs@freebsd.org
Date: Wed, 3 Feb 2010 21:09:27 +0900
Subject: Re: Machine stops for some seconds with ZFS
In-Reply-To: <4B694689.2030704@fsn.hu>

> After a long time, I've switched back to ZFS on my desktop. It runs
> 8-STABLE/amd64 with two SATA disks and an USB pendrive.
> One-one partition is used from each disk for the zpool, which is encrypted
> using GELI, and the pendrive is there for L2ARC:
>   NAME            STATE     READ WRITE CKSUM
>   data            ONLINE       0     0     0
>     mirror        ONLINE       0     0     0
>       ad0s1d.eli  ONLINE       0     0     0
>       ad1s1d.eli  ONLINE       0     0     0
>   cache
>     da0           ONLINE       0     0     0
>
> Today, after 12 days of uptime the machine has frozen. I could ping it from
> a different machine, even could open a telnet to its ssh port, but I
> couldn't get the ssh banner.
>
> Now I'm building a 9-CURRENT kernel and world to see whether the same
> problem persists with that, and during the make process I've noticed a
> strange thing.
> I build with -j4 (the machine has one dual core CPU), so the fans are
> screaming during the process. But every few minutes (I couldn't recognize
> any patterns in it) the machine goes completely silent (even more silent
> than normally), and everything halts.
>   PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
> 16304 root          1  44    0 37944K  4576K zio->i  1   0:00  0.00% sshd
> 16405 bra           1  44    0 37944K  5012K zio->i  0   0:00  0.00% sshd
>  1064 postfix       1  44    0  9104K  1772K zio->i  1   0:00  0.00% pickup

This sounds like you're being hit by the same (extensively documented)
performance slowdown that currently seems to affect everybody, except
maybe those with SSDs or 15k rpm drives in big arrays. There's a long
thread about it on -STABLE.

As I see it, it's basically impossible to read from and write to a ZFS
pool at the same time, which might be caused by how the ARC behaves
under FreeBSD (I didn't have these problems when ZFS was still
'unstable'). I couldn't even watch a 720p video while doing small writes
(less than 1k every few seconds) to the same array without the smb
process going into the zio->i state, which seems to indicate a complete
block on all I/O. Combine that with 5400 rpm consumer drives... well...

So I switched to OpenSolaris, and performance is now great.

-- 
br,
Tommi