From: "Niki Denev"
Sender: ndenev@gmail.com
Date: Sat, 2 Feb 2008 11:52:09 +0200
To: freebsd-fs@freebsd.org
Subject: ZFS panics
Message-ID: <2e77fc10802020152k2f5385c5w5938d91b1183f8e0@mail.gmail.com>

Hi,

I'm doing some stress testing on a server using ZFS, and I have
experienced two kernel panics in the last few days.

The machine runs 7.0-PRERELEASE (amd64) on dual quad-core (8 cores
total) Intel Xeon 2.0 GHz CPUs, with 8 GB of RAM. The disk subsystem
consists of eight Hitachi SATA drives on an Areca 1231ML with 1 GB of
cache memory and a battery backup.

I'm using GPT (GUID) partitions only: one 10 GB partition for the
system on UFS2 with geom_journal, one 10 GB swap/dump partition, and
the remaining 2.7 TB as a ZFS pool.

I also have this in loader.conf:

    vm.kmem_size="1G"
    vm.kmem_size_max="1G"

I was running multiple bonnie++ instances in parallel, writing to and
reading from the ZFS pool. The first time I ran 80 bonnie++ instances,
and the machine rebooted after about 3 hours. The second time I ran 16
bonnie++ instances, and the machine survived for a good 11 hours.

I've tried to use the "list" command in kgdb as shown in the
Developers' Handbook, but it keeps saying "No source file for address
XXX".

Here is the first panic that I experienced. The second one looks
identical. (I'm not entirely sure that I am loading the ZFS symbols
properly.)

sm-srv221# kldstat |grep zfs
 2    1 0xffffffff80bfc000 f5a40    zfs.ko
sm-srv221# kgdb -q /boot/kernel/kernel.symbols /var/crash/vmcore.0
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]

Unread portion of the kernel message buffer:

Fatal trap 12: page fault while in kernel mode
cpuid = 4; apic id = 04
fault virtual address   = 0x18
fault code              = supervisor read data, page not present
instruction pointer     = 0x8:0xffffffff80c19d16
stack pointer           = 0x10:0xffffffffd996a8f0
frame pointer           = 0x10:0xffffffffd996a920
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 321 (txg_thread_enter)
trap number             = 12
panic: page fault
cpuid = 4
Uptime: 3h20m21s
Physical memory: 8177 MB
Dumping 522 MB: 507 491 475 459 443 427 411 395 379 363 347 331 315 299 283 267 251 235 219 203 187 171 155 139 123 107 91 75 59 43 27 11

#0  doadump () at pcpu.h:194
194     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) add-debug-symbols /boot/kernel/zfs.ko.symbols 0xffffffff80bfc000
(kgdb) list *0xffffffff80c19d16
0xffffffff80c19d16 is in dmu_objset_sync_dnodes (/usr/src/sys/modules/zfs/../../contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c:707).
702                     ASSERT(dn->dn_dbuf->db_data_pending);
703                     /*
704                      * Initialize dn_zio outside dnode_sync()
705                      * to accomodate meta-dnode
706                      */
707                     dn->dn_zio = dn->dn_dbuf->db_data_pending->dr_zio;
708                     ASSERT(dn->dn_zio);
709
710                     ASSERT3U(dn->dn_nlevels, <=, DN_MAX_LEVELS);
711                     list_remove(list, dn);
(kgdb) bt
#0  doadump () at pcpu.h:194
#1  0x0000000000000004 in avl_balance2child ()
#2  0xffffffff80478619 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
#3  0xffffffff80478a1d in panic (fmt=0x104 ) at /usr/src/sys/kern/kern_shutdown.c:563
#4  0xffffffff8074f174 in trap_fatal (frame=0xffffff0003377000, eva=18446742974251873384) at /usr/src/sys/amd64/amd64/trap.c:724
#5  0xffffffff8074f545 in trap_pfault (frame=0xffffffffd996a840, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:641
#6  0xffffffff8074fe88 in trap (frame=0xffffffffd996a840) at /usr/src/sys/amd64/amd64/trap.c:410
#7  0xffffffff80735aee in calltrap () at /usr/src/sys/amd64/amd64/exception.S:169
#8  0xffffffff80c19d16 in dmu_objset_sync_dnodes (list=0xffffff0003730d20, tx=0xffffff0137f9e800)
    at /usr/src/sys/modules/zfs/../../contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c:707
#9  0xffffffff80c19e7d in dmu_objset_sync (os=0xffffff0003730c00, pio=0xffffff0131a4fac0, tx=0xffffff0137f9e800)
    at /usr/src/sys/modules/zfs/../../contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c:809
#10 0xffffffff80c27372 in dsl_pool_sync (dp=0xffffff00032b2800, txg=15331)
    at /usr/src/sys/modules/zfs/../../contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c:188
#11 0xffffffff80c31da0 in spa_sync (spa=0xffffff00032be000, txg=15331)
    at /usr/src/sys/modules/zfs/../../contrib/opensolaris/uts/common/fs/zfs/spa.c:2989
#12 0xffffffff80c37abf in txg_sync_thread (arg=Variable "arg" is not available.
) at /usr/src/sys/modules/zfs/../../contrib/opensolaris/uts/common/fs/zfs/txg.c:331
#13 0xffffffff80459d33 in fork_exit (callout=0xffffffff80c37990 , arg=0xffffff00032b2800, frame=0xffffffffd996ac80)
    at /usr/src/sys/kern/kern_fork.c:781
#14 0xffffffff80735ebe in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:415
#15 0x0000000000000000 in ?? ()
#16 0x0000000000000000 in ?? ()
#17 0x0000000000000001 in avl_balance2child ()
#18 0x0000000000000000 in ?? ()
#19 0x0000000000000000 in ?? ()
#20 0x0000000000000000 in ?? ()
#21 0x0000000000000000 in ?? ()
#22 0x0000000000000000 in ?? ()
#23 0x0000000000000000 in ?? ()
#24 0x0000000000000000 in ?? ()
#25 0x0000000000000000 in ?? ()
#26 0x0000000000000000 in ?? ()
#27 0x0000000000000000 in ?? ()
#28 0x0000000000000000 in ?? ()
#29 0x0000000000000000 in ?? ()
#30 0x0000000000000000 in ?? ()
#31 0x0000000000000000 in ?? ()
#32 0x0000000000000000 in ?? ()
#33 0x0000000000000000 in ?? ()
#34 0x0000000000000000 in ?? ()
#35 0x0000000000000000 in ?? ()
#36 0x0000000000000000 in ?? ()
#37 0x0000000000000000 in ?? ()
#38 0x0000000000000000 in ?? ()
#39 0x0000000000e06000 in ?? ()
#40 0xffffffff80a7a740 in tdq_cpu ()
#41 0xffffffff80a83f40 in tdq_groups ()
#42 0xffffffff80a83d40 in tdq_cpu ()
#43 0xffffff0003377000 in ?? ()
#44 0xffffffff80a77540 in tdg_maxid ()
#45 0xffffffffd996a4b8 in ?? ()
#46 0xffffff0003377000 in ?? ()
#47 0xffffffff80496bc8 in sched_switch (td=0xffffffff80c37990, newtd=0x0, flags=Variable "flags" is not available.
) at /usr/src/sys/kern/sched_ule.c:1898
#48 0x0000000000000000 in ?? ()
#49 0x0000000000000000 in ?? ()
#50 0x0000000000000000 in ?? ()
#51 0x0000000000000000 in ?? ()
#52 0x0000000000000000 in ?? ()
#53 0x0000000000000000 in ?? ()
#54 0x0000000000000000 in ?? ()