Date:      Tue, 14 Feb 2012 09:38:18 -0500
From:      Paul Mather <paul@gromit.dlib.vt.edu>
To:        stable@freebsd.org
Subject:   ZFS + nullfs + Linuxulator = panic?
Message-ID:  <CB455B5A-0583-4DFB-9712-6FFCC8B67AAB@gromit.dlib.vt.edu>

I have a problem with RELENG_8 (FreeBSD/amd64 running a GENERIC kernel,
last built 2012-02-08).  It will panic during the daily periodic scripts
that run at 3am.  Here is the most recent panic message:

Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer     = 0x20:0xffffffff8069d266
stack pointer           = 0x28:0xffffff8094b90390
frame pointer           = 0x28:0xffffff8094b903a0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 72566 (ps)
trap number             = 9
panic: general protection fault
cpuid = 0
KDB: stack backtrace:
#0 0xffffffff8062cf8e at kdb_backtrace+0x5e
#1 0xffffffff805facd3 at panic+0x183
#2 0xffffffff808e6c20 at trap_fatal+0x290
#3 0xffffffff808e715a at trap+0x10a
#4 0xffffffff808cec64 at calltrap+0x8
#5 0xffffffff805ee034 at fill_kinfo_thread+0x54
#6 0xffffffff805eee76 at fill_kinfo_proc+0x586
#7 0xffffffff805f22b8 at sysctl_out_proc+0x48
#8 0xffffffff805f26c8 at sysctl_kern_proc+0x278
#9 0xffffffff8060473f at sysctl_root+0x14f
#10 0xffffffff80604a2a at userland_sysctl+0x14a
#11 0xffffffff80604f1a at __sysctl+0xaa
#12 0xffffffff808e62d4 at amd64_syscall+0x1f4
#13 0xffffffff808cef5c at Xfast_syscall+0xfc
Uptime: 3d19h6m0s
Dumping 1308 out of 2028 MB:..2%..12%..21%..31%..41%..51%..62%..71%..81%..91%
Dump complete
Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting...


The reason for the subject line is that I have another RELENG_8 system
that uses ZFS + nullfs but doesn't panic, leading me to believe that ZFS
+ nullfs alone is not the problem.  I am wondering if it is the
combination of the three that is deadly here.

Both RELENG_8 systems are root-on-ZFS installs.  Each night there is a
separate backup script that runs and completes before the regular
"periodic daily" run.  This script takes a recursive snapshot of the ZFS
pool and then mounts these snapshots via mount_nullfs to provide a
coherent view of the filesystem under /backup.  The only difference
between the two RELENG_8 systems is that one uses rsync to back up
/backup to another machine and the other uses the Linux Tivoli TSM
client to back up /backup to a TSM server.  After the backup is
completed, a script runs that unmounts the nullfs file systems and then
destroys the ZFS snapshot.
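
To make the sequence concrete, here is a rough sketch of the commands
such a snapshot/nullfs backup wrapper would issue.  It is not the actual
script: the pool name, snapshot name, and list of mount points are
made-up placeholders, and it assumes each dataset's snapshot is reachable
under <mountpoint>/.zfs/snapshot/<name>.  The sketch only builds the
command lists; it does not run anything.

```python
from typing import List


def snapshot_path(mountpoint: str, snap: str) -> str:
    """Directory where ZFS exposes snapshot `snap` of the dataset at `mountpoint`."""
    return f"{mountpoint.rstrip('/')}/.zfs/snapshot/{snap}"


def backup_target(mountpoint: str, backup_root: str = "/backup") -> str:
    """Where the snapshot of `mountpoint` appears under the backup tree."""
    return backup_root.rstrip("/") + mountpoint.rstrip("/")


def setup_commands(pool: str, snap: str, mountpoints: List[str],
                   backup_root: str = "/backup") -> List[List[str]]:
    """Recursive snapshot, then one read-only nullfs mount per dataset."""
    cmds = [["zfs", "snapshot", "-r", f"{pool}@{snap}"]]
    # Sorting puts parents before children, so nested mounts stack correctly.
    for mp in sorted(mountpoints):
        cmds.append(["mount_nullfs", "-o", "ro",
                     snapshot_path(mp, snap), backup_target(mp, backup_root)])
    return cmds


def teardown_commands(pool: str, snap: str, mountpoints: List[str],
                      backup_root: str = "/backup") -> List[List[str]]:
    """Unmount children before parents, then destroy the recursive snapshot."""
    cmds = [["umount", backup_target(mp, backup_root)]
            for mp in sorted(mountpoints, reverse=True)]
    cmds.append(["zfs", "destroy", "-r", f"{pool}@{snap}"])
    return cmds


if __name__ == "__main__":
    # Hypothetical pool, snapshot name, and mount points, for illustration only.
    mps = ["/", "/usr", "/var"]
    for cmd in setup_commands("zroot", "nightly", mps):
        print(" ".join(cmd))
    print("# ... rsync or dsmc backs up /backup here ...")
    for cmd in teardown_commands("zroot", "nightly", mps):
        print(" ".join(cmd))
```

The backup client (rsync on one machine, the Linux TSM client on the
other) runs between the setup and teardown steps; that client is the
only part that differs between the two systems.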

The first (rsync backup) RELENG_8 system does not panic.  It has been
running the ZFS + nullfs rsync backup job without incident for weeks
now.  The second (Tivoli TSM) RELENG_8 system will reliably panic when
the subsequent "periodic daily" job runs.  (It is using the 32-bit TSM
6.2.4 Linux client running "dsmc schedule" via the linux_base-f10-10_4
package.)  The actual ZFS + nullfs Tivoli TSM backup job appears to run
successfully, making me wonder if perhaps it has some memory leak or
other subtle corruption that sets up the ensuing panic when the
"periodic daily" job later gives the system a workout.

If I can provide more information about the panic, please let me know.
Despite the message about dumping in the panic output above, when the
system reboots I get a "No core dumps found" message during boot.  (I
have dumpdev="AUTO" set in /etc/rc.conf.)  My swap device is on
separate partitions but is mirrored using geom_mirror as
/dev/mirror/swap.  Do crash dumps to gmirror devices work on RELENG_8?

Does anyone have any idea what is to blame for the panic, or how I can
fix or work around it?

Cheers,

Paul.

PS: The uptime of three days in the panic message is because I disabled
the Tivoli TSM backup job on Friday so it would not run over the
weekend.



