From owner-freebsd-sparc64@FreeBSD.ORG Mon Mar 7 19:22:42 2011
Return-Path:
Delivered-To: freebsd-sparc64@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
        by hub.freebsd.org (Postfix) with ESMTP id 64EFC106566C
        for ; Mon, 7 Mar 2011 19:22:42 +0000 (UTC)
        (envelope-from marius@alchemy.franken.de)
Received: from alchemy.franken.de (alchemy.franken.de [194.94.249.214])
        by mx1.freebsd.org (Postfix) with ESMTP id CC9EB8FC08
        for ; Mon, 7 Mar 2011 19:22:41 +0000 (UTC)
Received: from alchemy.franken.de (localhost [127.0.0.1])
        by alchemy.franken.de (8.14.4/8.14.4/ALCHEMY.FRANKEN.DE) with ESMTP id p27JMeht033560;
        Mon, 7 Mar 2011 20:22:40 +0100 (CET)
        (envelope-from marius@alchemy.franken.de)
Received: (from marius@localhost)
        by alchemy.franken.de (8.14.4/8.14.4/Submit) id p27JMd6C033559;
        Mon, 7 Mar 2011 20:22:39 +0100 (CET)
        (envelope-from marius)
Date: Mon, 7 Mar 2011 20:22:39 +0100
From: Marius Strobl
To: Roger Hammerstein
Message-ID: <20110307192239.GA31314@alchemy.franken.de>
References: <20110307080626.GK57812@alchemy.franken.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20110307080626.GK57812@alchemy.franken.de>
User-Agent: Mutt/1.4.2.3i
Cc: freebsd-sparc64@freebsd.org
Subject: Re: sparc64 hang with zfs v28
X-BeenThere: freebsd-sparc64@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Porting FreeBSD to the Sparc
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 07 Mar 2011 19:22:42 -0000

On Mon, Mar 07, 2011 at 09:06:26AM +0100, Marius Strobl wrote:
> On Sun, Mar 06, 2011 at 11:27:42PM -0500, Roger Hammerstein wrote:
> >
> >
> > > FYI, kernel modules generally should work again with r219340, I haven't
> > > tested ZFS though.
> >
> >
> > Thanks!
> > I cvsuppedd and rebuilt kernel.
> >
> >
> >
> >
> > falcon# uname -a
> > FreeBSD falcon 9.0-CURRENT FreeBSD 9.0-CURRENT #3: Sun Mar 6 18:55:14 EST 2011 root@falcon:/usr/obj/usr/src/sys/GENERIC sparc64
> > falcon#
> >
> > I did a kldload zfs and it loaded ok.
> >
> > falcon# kldstat
> > Id Refs Address    Size     Name
> >  1    9 0xc0000000 e42878   kernel
> >  2    1 0xc14a2000 32e000   zfs.ko
> >  3    1 0xc17d0000 104000   opensolaris.ko
> > falcon#
> >
> >
> > But a 'zpool status' or 'zfs list' will cause a zfs or zpool process
> > to eat 99% of a cpu and essentially hang the shell i ran zfs/zpool in.
> >
> >
> > > > falcon# zfs list
> > > > ZFS NOTICE: Prefetch is disabled by default if less than 4GB of RAM is present;
> > > > to enable, add "vfs.zfs.prefetch_disable=0" to /boot/loader.conf.
> > > > ZFS filesystem version 5
> > > > ZFS storage pool version 28
> > > > [Hang here]
> > > >
> > > >
> > > > last pid: 1012; load averages: 0.79, 0.30, 0.16 up 0+00:13:58 20:58:43
> > > > 23 processes: 2 running, 21 sleeping
> > > > CPU: 0.0% user, 0.0% nice, 52.5% system, 0.0% interrupt, 47.5% idle
> > > > Mem: 16M Active, 11M Inact, 46M Wired, 64K Cache, 12M Buf, 1915M Free
> > > > Swap: 4055M Total, 4055M Free
> > > >
> > > >   PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
> > > >  1006 root        1  53    0 21672K  2904K CPU1    1   0:05 99.47% zfs
> > > >   998 root        1  40    0 41776K  6376K select  0   0:01  0.00% sshd
> > > >   994 root        1  16    0 11880K  3536K pause   0   0:01  0.00% csh
> > > >   795 root        1  40    0 16720K  3968K select  0   0:00  0.00% ntpd
> > > >  1001 root        1  16    0 11880K  3464K pause   0   0:00  0.00% csh
> > > >   975 root        1   8    0 25168K  2672K wait    1   0:00  0.00% login
> > > >
> > > >
> > > >
> > > > stays at 99%.
> > > > truss -p 1006 doesn't "attach", it just hangs.
> > > >
> > > > ctrl-t on the zfs list shell:
> > > > oad: 0.95 cmd: zfs 1006 [running] 182.26r 0.00u 4.66s 99% 2872k
> > > > load: 0.95 cmd: zfs 1006 [running] 183.30r 0.00u 4.66s 99% 2872k
> > > > load: 0.95 cmd: zfs 1006 [running] 183.76r 0.00u 4.66s 99% 2872k
> > > > load: 0.95 cmd: zfs 1006 [running] 184.08r 0.00u 4.66s 99% 2872k
> > > > load: 0.95 cmd: zfs 1006 [running] 184.36r 0.00u 4.66s 99% 2872k
> > > >
> > > >
> > A second time with zpool status::
> > last pid: 1224; load averages: 0.98, 0.55, 0.24 up 0+02:07:39 23:12:33
> > 26 processes: 2 running, 24 sleeping
> > CPU: 0.0% user, 0.0% nice, 50.2% system, 0.4% interrupt, 49.4% idle
> > Mem: 18M Active, 13M Inact, 46M Wired, 64K Cache, 12M Buf, 1911M Free
> > Swap: 4055M Total, 4055M Free
> >
> >   PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
> >  1200 root        1  62    0 22704K  2920K CPU1    1   0:00 99.02% zpool
> >   793 root        1  40    0 16720K  3968K select  0   0:02  0.00% ntpd
> >  1180 root        1  16    0 11880K  3536K pause   1   0:01  0.00% csh
> >  1184 root        1  40    0 41776K  6376K select  0   0:01  0.00% sshd
> >  1201 root        1  40    0 41776K  6376K select  0   0:01  0.00% sshd
> >
> > falcon# truss -p 1200
> > truss: can not attach to target process: Device busy
> > falcon# truss -p 1200
> > truss: can not attach to target process: Device busy
> > falcon#
> >
> >
> > ctrl-t on the zpool status command:
> > load: 0.62 cmd: zpool 1200 [running] 54.30r 0.00u 0.07s 83% 2888k
> > load: 0.99 cmd: zpool 1200 [running] 271.73r 0.00u 0.07s 99% 2888k
> > load: 0.99 cmd: zpool 1200 [running] 272.37r 0.00u 0.07s 99% 2888k
> > load: 0.99 cmd: zpool 1200 [running] 272.75r 0.00u 0.07s 99% 2888k
> > load: 0.99 cmd: zpool 1200 [running] 273.38r 0.00u 0.07s 99% 2888k
> >
> >
> >
> >
> >
> > truss -f zpool status::
> >
> > 1014: sigprocmask(SIG_SETMASK,0x0,0x0) = 0 (0x0)
> > 1014: sigprocmask(SIG_BLOCK,SIGHUP|SIGINT|SIGQUIT|SIGKILL|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2,0x0) = 0 (0x0)
> > 1014: sigprocmask(SIG_SETMASK,0x0,0x0) = 0 (0x0)
> > 1014: sigprocmask(SIG_BLOCK,SIGHUP|SIGINT|SIGQUIT|SIGKILL|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2,0x0) = 0 (0x0)
> > 1014: sigprocmask(SIG_SETMASK,0x0,0x0) = 0 (0x0)
> > 1014: modfind(0x40d3f140,0x9a0,0xc78,0x10a,0x1027e8,0x7fdffffe8d0) = 303 (0x12f)
> > 1014: open("/dev/zfs",O_RDWR,06170) = 3 (0x3)
> > 1014: open("/dev/zero",O_RDONLY,0666) = 4 (0x4)
> > 1014: open("/etc/zfs/exports",O_RDONLY,0666) ERR#2 'No such file or directory'
> > 1014: __sysctl(0x7fdffff8de8,0x2,0x7fdffff8eb0,0x7fdffff8f18,0x40d3f118,0x13) = 0 (0x0)
> > 1014: __sysctl(0x7fdffff8eb0,0x4,0x40e4d084,0x7fdffff8fe0,0x0,0x0) = 0 (0x0)
> > [hang]
> > ctrl-t
> >
> > load: 0.31 cmd: zpool 1014 [running] 12.47r 0.00u 0.07s 44% 2912k
> >
> >
> > 1014 root 1 54 0 22704K 2944K CPU0 0 0:00 98.47% zpool
> >
> >
> > falcon# truss -p 1014
> > truss: can not attach to target process: Device busy
> >
> > iostat -x 1 shows no reads and no writes to any disks
> >
> >
> > There's a 2-disk zfs mirror attached to this ultra60 from a freebsd-8 install, but I don't know
> > why that would cause a problem with the latest zfs v28.
> >
>
> Me neither :) You'll probably get better help from the ZFS maintainers
> than on this list.
>

Thinking about it, this might be caused by the binutils regression also
affecting userland. If a world built with the following patch in place
still behaves the same, you had better contact the ZFS maintainers,
though:
http://people.freebsd.org/~marius/elfxx-sparc.c.diff

Marius
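
For reference, the test suggested above (rebuild world with the patch in
place, then retry the ZFS command) could be scripted roughly as below. This
is only a sketch under assumptions the message does not state: that the
sources live in /usr/src, that the diff applies from the top of the tree
with -p0, and that installing the new world directly is acceptable on the
test machine. The DRY_RUN switch and the run/rebuild_with_patch helpers are
hypothetical names used for illustration. By default the script only prints
the commands it would run.

```shell
#!/bin/sh
# Sketch: rebuild world with the elfxx-sparc.c patch applied, then retest ZFS.
# Assumptions (not from the message): sources in /usr/src, the diff applies
# from the top of the tree with -p0, and installworld is acceptable here.
SRC_DIR="${SRC_DIR:-/usr/src}"
PATCH_URL="http://people.freebsd.org/~marius/elfxx-sparc.c.diff"
DRY_RUN="${DRY_RUN:-1}"    # 1 = only print the commands, 0 = execute them

# Print a command and, unless in dry-run mode, execute it; stop on failure.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@" || exit 1
    fi
}

rebuild_with_patch() {
    run cd "$SRC_DIR"
    run fetch -o elfxx-sparc.c.diff "$PATCH_URL"
    run patch -p0 -i elfxx-sparc.c.diff   # adjust -p if the paths differ
    run make buildworld
    run make installworld
    run zpool status                      # may hang again if the bug persists
}

rebuild_with_patch
```

Running it with DRY_RUN=0 as root would actually execute the steps. If
zpool status still spins at 99% CPU afterwards, that points away from the
binutils regression and towards ZFS itself.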