Date:      Fri, 2 May 2003 14:24:47 -0700
From:      "Andrew Kinney" <andykinney@advantagecom.net>
To:        freebsd-hackers@freebsd.org
Subject:   need opinion on kernel debug output from crash dump
Message-ID:  <3EB27FAF.9300.FD28393@localhost>

Hello,

I finally managed to get an IDE drive into our system that's been 
having nightly panics while running a cron job, and I got a crash 
dump out of it.  I've got the debug output from the crash dump and 
I think I'm reading it correctly, but I'd like a second opinion.

# uname -smr
FreeBSD 4.7-RELEASE-p7 i386

# gdb -k kernel.debug.03-13-03 vmcore.0
SMP 2 cpus
IdlePTD at phsyical address 0x00357000
initial pcb at physical address 0x002c04c0
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
mp_lock = 01000002; cpuid = 1; lapic.id = 00000000
fault virtual address   = 0xbfc00000
fault code              = supervisor write, page not present
instruction pointer     = 0x8:0xc024dafd
stack pointer           = 0x10:0xfac94e04
frame pointer           = 0x10:0xfac94e10
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 30307 (perl5)
interrupt mask          = none <- SMP: XXX
trap number             = 12
panic: page fault
mp_lock = 01000002; cpuid = 1; lapic.id = 00000000
boot() called on cpu#1

syncing disks... 30 8 1 1 1 1 1 1 1
done
Uptime: 3d18h49m47s
mlx0: flushing cache...done
mlxd0: detached
mlxd1: detached
xl0: reset didn't complete

dumping to dev #ad/0x20001, offset 984064
dump ata0: resetting devices .. done
3615 ... 0
---
#0  dumpsys () at /usr/src/sys/kern/kern_shutdown.c:487
487             if (dumping++) {
(kgdb) where
#0  dumpsys () at /usr/src/sys/kern/kern_shutdown.c:487
#1  0xc017335b in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:316
#2  0xc01737b4 in poweroff_wait (junk=0xc02937f9, howto=-1071041841) at /usr/src/sys/kern/kern_shutdown.c:595
#3  0xc0251c78 in trap_fatal (frame=0xfac94dc4, eva=3217031168) at /usr/src/sys/i386/i386/trap.c:974
#4  0xc0251909 in trap_pfault (frame=0xfac94dc4, usermode=0, eva=3217031168) at /usr/src/sys/i386/i386/trap.c:867
#5  0xc02514a7 in trap (frame={tf_fs = 24, tf_es = 16, tf_ds = 16, tf_edi = 16384, tf_esi = -94850808, tf_ebp = -87470576,
      tf_isp = -87470608, tf_ebx = 0, tf_edx = -1052901760, tf_ecx = 0, tf_eax = 556142595, tf_trapno = 12, tf_err = 2,
      tf_eip = -1071326467, tf_cs = 8, tf_eflags = 66054, tf_esp = -1024212940, tf_ss = 136142848})
    at /usr/src/sys/i386/i386/trap.c:466
#6  0xc024dafd in pmap_qenter (va=0, m=0xfa58b108, count=4) at /usr/src/sys/i386/i386/pmap.c:786
#7  0xc0183bae in pipe_build_write_buffer (wpipe=0xfa58b0e0, uio=0xfac94ed0) at /usr/src/sys/kern/sys_pipe.c:593
#8  0xc0183d80 in pipe_direct_write (wpipe=0xfa58b0e0, uio=0xfac94ed0) at /usr/src/sys/kern/sys_pipe.c:708
#9  0xc0184122 in pipe_write (fp=0xcbb2ce00, uio=0xfac94ed0, cred=0xcf0cde80, flags=0, p=0xfa68b820)
    at /usr/src/sys/kern/sys_pipe.c:826
#10 0xc018247d in dofilewrite (p=0xfa68b820, fp=0xcbb2ce00, fd=1, buf=0x81d2000, nbyte=16384, offset=-1, flags=0)
    at /usr/src/sys/sys/file.h:162
#11 0xc0182336 in write (p=0xfa68b820, uap=0xfac94f80) at /usr/src/sys/kern/sys_generic.c:329
#12 0xc0251fa9 in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 20459, tf_esi = 672929112,
      tf_ebp = -1077938456, tf_isp = -87470124, tf_ebx = 672929580, tf_edx = 672929112, tf_ecx = 136126464, tf_eax = 4,
      tf_trapno = 672126832, tf_err = 2, tf_eip = 672882720, tf_cs = 31, tf_eflags = 647, tf_esp = -1077938500,
      tf_ss = 47}) at /usr/src/sys/i386/i386/trap.c:1175
#13 0xc023f93b in Xint0x80_syscall ()
cannot read proc at 0

Each panic has the same instruction pointer, so the problem is 
consistent.
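
To make the comparison easier, here is a small sketch that converts the 
decimal values kgdb prints in the trap frame back to hex.  The helper 
name to_u32 is made up for illustration; the two input values are copied 
from frames #3 through #5 above.

```python
def to_u32(value: int) -> int:
    """Reinterpret a (possibly negative) 32-bit integer as unsigned."""
    return value & 0xFFFFFFFF

# tf_eip as printed in frame #5, eva as printed in frames #3 and #4.
tf_eip = -1071326467
eva = 3217031168

print(hex(to_u32(tf_eip)))  # 0xc024dafd, same as "instruction pointer"
print(hex(to_u32(eva)))     # 0xbfc00000, same as "fault virtual address"
```

So the saved tf_eip is the address shown for pmap_qenter in frame #6, and 
eva is the fault virtual address from the panic message.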

In frame #6 of the backtrace, you can see that the value for 
"va" is 0.  Shouldn't that be some non-zero value?

As used in that function, "va" is supposed to contain a virtual 
address from KVA space, right?  If that's the case, would a zero 
value indicate that the system had run out of KVA space?
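
One possible cross-check on whether "va" really was 0 is sketched below.  
It assumes the stock non-PAE i386 layout with KVA_PAGES=256, in which the 
recursive page-table map occupies the page-directory slot immediately 
below the kernel's first slot.  The constant and function names are 
illustrative and are not copied from pmap.h, so treat this as a guess 
at the arithmetic rather than a statement of what the kernel does.

```python
PAGE_SHIFT = 12   # 4 KB pages on i386
PDRSHIFT = 22     # one page-directory entry maps 4 MB
NPDEPG = 1024     # page-directory entries per page directory
PTE_SIZE = 4      # bytes per page-table entry (non-PAE)

def ptmap_base(kva_pages: int = 256) -> int:
    """Assumed base of the recursive page-table map."""
    kptdi = NPDEPG - kva_pages   # first kernel page-directory slot
    ptdptdi = kptdi - 1          # assumption: recursive slot sits just below
    return ptdptdi << PDRSHIFT

def pte_address(va: int, kva_pages: int = 256) -> int:
    """Address at which the page-table entry for va would be written."""
    return ptmap_base(kva_pages) + (va >> PAGE_SHIFT) * PTE_SIZE

print(hex(pte_address(0)))  # 0xbfc00000
```

If those assumptions hold, the page-table entry for va = 0 would sit at 
0xbfc00000, which is the fault virtual address in the panic message.  That 
would be consistent with pmap_qenter actually receiving a zero address 
rather than kgdb misreporting the argument.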

How would someone tell how much KVA space is in use?  I'm sure 
this information can be found or calculated in some manner, but I'm 
just not sure how to get the information.

Our limit on this system for KVA space is 1GB (the default).  There 
is approximately 3500MB of RAM available for use (after the BIOS 
maps out space for PCI devices within the 4GB address space).  
The system is tuned up to support the kinds of loads that are 
easily handled by a system with dual 2GHz processors and tons of 
RAM.

I know that running out of KVA space is the most likely cause of 
the panic since we've tuned the system with higher settings on 
things that can eat into KVA space, but I would like to be certain 
before I go fiddling with KVA_PAGES in my kernel config.
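
For reference, a rough sketch of how KVA_PAGES relates to the size of 
KVA space.  It assumes a non-PAE i386 kernel where each KVA_PAGES unit is 
one 4MB page-directory entry and the default is 256; the function names 
are made up for illustration.

```python
MB = 1024 * 1024
PDE_SPAN = 4 * MB        # address space covered by one page-directory entry
ADDRESS_SPACE = 1 << 32  # 4 GB of virtual address space on i386

def kva_bytes(kva_pages: int) -> int:
    """Size of the kernel virtual address space for a KVA_PAGES value."""
    return kva_pages * PDE_SPAN

def kernel_base(kva_pages: int) -> int:
    """Lowest kernel virtual address, with KVA at the top of the 4 GB space."""
    return ADDRESS_SPACE - kva_bytes(kva_pages)

for pages in (256, 384, 512):
    print(pages, kva_bytes(pages) // MB, "MB", hex(kernel_base(pages)))
```

Under those assumptions 256 gives 1024MB with the kernel starting at 
0xc0000000, and 512 gives 2048MB starting at 0x80000000, so any increase 
comes directly out of the address space left for user processes.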

Thanks in advance for the second opinion on the cause of this 
panic.

Sincerely,
Andrew Kinney
President and
Chief Technology Officer
Advantagecom Networks, Inc.
http://www.advantagecom.net


