Date: Thu, 2 Jun 2011 12:50:26 -0700
From: Jeremy Chadwick <freebsd@jdc.parodius.com>
To: Torfinn Ingolfsen <torfinn.ingolfsen@broadpark.no>
Cc: freebsd-stable@freebsd.org
Subject: Re: Fileserver panic - FreeBSD 8.1-stable and zfs
Message-ID: <20110602195026.GA54023@icarus.home.lan>
In-Reply-To: <20110602213116.425400b6.torfinn.ingolfsen@broadpark.no>
References: <20110602213116.425400b6.torfinn.ingolfsen@broadpark.no>
On Thu, Jun 02, 2011 at 09:31:16PM +0200, Torfinn Ingolfsen wrote:
> FYI, in case it is interesting
> my zfs fileserver[1] just had a panic: (transcribed from screen)
> panic: kmem_malloc(131072): kmem_map too small: 1324613632 total allocated
> cpuid = 1
> KDB: stack backtrace:
> #0 0xffffffff805df92e at kdb_backtrace+0x5e
> #1 0xffffffff805ada77 at panic+0x187
> #2 0xffffffff80800190 at kmem_alloc+0
> #3 0xffffffff807f7e0a at uma_large_malloc+0x4a
> #4 0xffffffff8059aee7 at malloc+0xd7
> #5 0xffffffff80ed6763 at vdev_queue_io_to_issue+0x1c3
> #6 0xffffffff80ed68e9 at vdev_queue_io_done+0x99
> #7 0xffffffff80ee6c9f at zio_vdev_io_done+0x7f
> #8 0xffffffff80ee7237 at zio_execute+0x77
> #9 0xffffffff80e872f3 at taskq_run_safe+0x13
> #10 0xffffffff805ea984 at taskqueue_run+0xa4
> #11 0xffffffff805eabf6 at taskqueue_thread_loop+0x46
> #12 0xffffffff80584278 at fork_exit+0x118
> #13 0xffffffff8087f2fe at fork_trampoline+0xe
> Uptime: 109d19h47m1s
> Cannot dump. Device not defined or unavailable.
> Automatic reboot in 15 seconds - press a key on the console to abort
> And the machine hung here, no response from keyboard.
>
> The machine runs:
> root@kg-f2# uname -a
> FreeBSD kg-f2.kg4.no 8.1-STABLE FreeBSD 8.1-STABLE #4: Fri Oct 29 12:11:48 CEST 2010 root@kg-f2.kg4.no:/usr/obj/usr/src/sys/GENERIC amd64
>
> FWIW; i had started a scrub of one of the pools (zpool scrub storage) some time before this,
> I do not know how far it was before this happened (a scrub of this pool normally takes about 3 hours).
>
> Since the machine was totally unresponsive, I rebooted it.
> after reboot, I found out that the scrub was not finished:
> root@kg-f2# zpool status storage
>   pool: storage
>  state: ONLINE
>  scrub: scrub in progress for 307445734561825858h22m, 2.76% done, 307445734561825792h47m to go
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         storage     ONLINE       0     0     0
>           raidz1    ONLINE       0     0     0
>             ad8     ONLINE       0     0     0
>             ad10    ONLINE       0     0     0
>             ad12    ONLINE       0     0     0
>             ad14    ONLINE       0     0     0
>             ada0    ONLINE       0     0     0
>
> errors: No known data errors
>
> HTH
>
> References:
> 1) http://sites.google.com/site/tingox/ga-ma74gm-s2h_freebsd

This is a well-known thing with ZFS on FreeBSD. Because you're running
8.1-STABLE, this makes figuring out all the tunables and so on a lot
more difficult than if you were running 8.2-STABLE.

Please provide:

1) Contents of /boot/loader.conf
2) Output from: sysctl hw.physmem hw.usermem hw.realmem
   (your hardware page says 4GB, but I can't be bothered to sift through
   multi-pages of wiki documents and links to find the answers)
3) Output from: sysctl vfs.zfs.zio.use_uma

If #3 returns one (1), you should disable this in /boot/loader.conf.
The default value was changed to 1 at some time, then later was reverted
back to 0. So I'm not sure which yours is.

# Disable UMA (uma(9)) for ZFS; amd64 was moved to exclusively use UMA
# on 2010/05/24.
# http://lists.freebsd.org/pipermail/freebsd-stable/2010-June/057162.html
vfs.zfs.zio.use_uma="0"

The scrub itself was not ultimately responsible for this problem
(meaning "the bug is not in scrub"). The problem is that your kernel
effectively wanted more memory for ZFS operations than was available.

The "trick" is to tune /boot/loader.conf until you can gain stability.
Again, because you're running 8.1-STABLE, the tuning parameters here
will behave differently than on 8.2-STABLE. We can go over those in a
follow-up thread.
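
If it helps, the three items requested above could be gathered in one go
with something like the following. This is only a rough sketch, not a
tested tool: the function names (show_loader_conf, show_sysctl,
uma_advice, collect_report) are made up for illustration, and it assumes
a stock sysctl(8) and a readable /boot/loader.conf.

```sh
#!/bin/sh
# Sketch only: collect the information requested above on a FreeBSD host.
# All function names here are hypothetical helpers, not part of FreeBSD.

show_loader_conf() {
    # $1: path to loader.conf (defaults to /boot/loader.conf)
    conf="${1:-/boot/loader.conf}"
    if [ -r "$conf" ]; then
        cat "$conf"
    else
        echo "(no readable $conf)"
    fi
}

show_sysctl() {
    # Print each requested OID, or a note if this kernel does not expose it.
    for oid in "$@"; do
        sysctl "$oid" 2>/dev/null || echo "$oid: not available"
    done
}

uma_advice() {
    # $1: the value read from vfs.zfs.zio.use_uma
    case "$1" in
        1) echo 'use_uma is 1: consider vfs.zfs.zio.use_uma="0" in /boot/loader.conf' ;;
        0) echo 'use_uma is already 0: nothing to change' ;;
        *) echo 'use_uma could not be read' ;;
    esac
}

collect_report() {
    echo "== /boot/loader.conf =="
    show_loader_conf
    echo "== memory =="
    show_sysctl hw.physmem hw.usermem hw.realmem
    echo "== vfs.zfs.zio.use_uma =="
    # -n prints only the value; fall back to empty if the OID is missing.
    uma=$(sysctl -n vfs.zfs.zio.use_uma 2>/dev/null) || uma=""
    uma_advice "$uma"
}
```

Calling collect_report on the fileserver and pasting its output into a
reply would cover items 1 through 3. Settings in /boot/loader.conf are
read at boot, so a change there only takes effect after a reboot.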

I've gotten to the point where I literally cannot remember all of the
different situations/conditions/tunings for each FreeBSD kernel build,
release, date, type, etc., so I tend to focus on the most recent
RELENG_8 build. Then someone comes along with an older build.....
Hehe. :-)

-- 
| Jeremy Chadwick                                 jdc@parodius.com |
| Parodius Networking                     http://www.parodius.com/ |
| UNIX Systems Administrator                 Mountain View, CA, US |
| Making life hard for others since 1977.             PGP 4BD6C0CB |