Date:      Tue, 6 Sep 2005 11:11:33 +0300
From:      Tom Alsberg <alsbergt@cs.huji.ac.il>
To:        FreeBSD Hackers List <freebsd-hackers@freebsd.org>
Subject:   FFS pread causes kernel panic on loaded 5.4 server
Message-ID:  <20050906081133.GA22769@cs.huji.ac.il>

Greetings,

We have a FreeBSD 5.4 server which we are trying to run in production,
but we have severe problems with it crashing every few days.

When I ran it with DDB, the crash originally appeared to be filesystem
and NFS related.  However, I enabled dumps to swap and fired up kgdb on
the core, and the trace goes through vn_read and ffs_read, which gives
the impression that it is not NFS related.

I haven't yet figured out how to check which file the pread tried to
access, which process it was, and what the process was doing at that
time (any pointers to documentation appreciated, but I honestly haven't
looked that much yet).  Anyway, it seems to be caused by pread somehow.

Can somebody make more of it using the information I have now at hand
(stack-trace follows)?  This began when I upgraded the server from
FreeBSD 4.10 to FreeBSD 5.4.  The server is filesystem intensive,
mainly NFS (running Samba).  It appears that frames 11 and up are the
relevant ones - everything below that is initiated by the trap and DDB
itself.

P.S. Are there easy ways to inspect individual processes (their data,
open files, running status, IPCs, etc., and perhaps even a stack trace
of them) given a kernel core file?

  -- Tom

The bt output from gdb follows:

#0  doadump () at pcpu.h:160
#1  0xc046657a in db_fncall (dummy1=0, dummy2=0, dummy3=-1065484837, 
    dummy4=0xeb4e1850 "|<...binary crap...>\n")
    at /r+d/5.4/src/sys/ddb/db_command.c:531
#2  0xc0466388 in db_command (last_cmdp=0xc0906664, cmd_table=0x0, 
    aux_cmd_tablep=0xc0885e1c, aux_cmd_tablep_end=0xc0885e38)
    at /r+d/5.4/src/sys/ddb/db_command.c:349
#3  0xc0466450 in db_command_loop () at /r+d/5.4/src/sys/ddb/db_command.c:455
#4  0xc0467fe9 in db_trap (type=12, code=0)
    at /r+d/5.4/src/sys/ddb/db_main.c:221
#5  0xc0646483 in kdb_trap (type=12, code=0, tf=0x1)
    at /r+d/5.4/src/sys/kern/subr_kdb.c:470
#6  0xc07fafc5 in trap_fatal (frame=0xeb4e19e4, eva=28)
    at /r+d/5.4/src/sys/i386/i386/trap.c:812
#7  0xc07fad23 in trap_pfault (frame=0xeb4e19e4, usermode=0, eva=28)
    at /r+d/5.4/src/sys/i386/i386/trap.c:735
#8  0xc07fa939 in trap (frame=
      {tf_fs = -1067319272, tf_es = -699793392, tf_ds = 1048592, tf_edi = -699757236, tf_esi = -699757236, tf_ebp = -347203024, tf_isp = -347203056, tf_ebx = -699757236, tf_edx = 0, tf_ecx = -1024473216, tf_eax = 4, tf_trapno = 12, tf_err = 2, tf_eip = -1066976993, tf_cs = 8, tf_eflags = 66050, tf_esp = -699757236, tf_ss = -699757236}) at /r+d/5.4/src/sys/i386/i386/trap.c:425
#9  0xc07e890a in calltrap () at /r+d/5.4/src/sys/i386/i386/exception.s:140
#10 0xc0620018 in linker_load_file (filename=0xd64a8d4c "\002", result=0x1)
    at /r+d/5.4/src/sys/kern/kern_linker.c:327
#11 0xc0674176 in getnewbuf (slpflag=0, slptimeo=0, size=16384, maxsize=16384)
    at /r+d/5.4/src/sys/kern/vfs_bio.c:1885
#12 0xc06755fd in getblk (vp=0xc3242318, blkno=19, size=16384, slpflag=0, 
    slptimeo=0, flags=0) at /r+d/5.4/src/sys/kern/vfs_bio.c:2585
#13 0xc0679b32 in cluster_read (vp=0xc3242318, filesize=1302528, lblkno=19, 
    size=16384, cred=0x0, totread=32768, seqcount=0, bpp=0x0)
    at /r+d/5.4/src/sys/kern/vfs_cluster.c:117
#14 0xc076ed72 in ffs_read (ap=0x0) at /r+d/5.4/src/sys/ufs/ffs/ffs_vnops.c:462
#15 0xc068ed9c in vn_read (fp=0xc3be1088, uio=0xeb4e1cbc, 
    active_cred=0xc2d55800, flags=1, td=0xc2efc780) at vnode_if.h:398
#16 0xc064f4d5 in dofileread (td=0xc2efc780, fd=61, fp=0xc3be1088, 
    auio=0xeb4e1cbc, offset=Unhandled dwarf expression opcode 0x93
) at file.h:233
#17 0xc064f435 in kern_preadv (td=0xc2efc780, fd=61, auio=0xeb4e1cbc, 
    offset=319488) at /r+d/5.4/src/sys/kern/sys_generic.c:242
#18 0xc064f2e3 in pread (td=0xc2efc780, uap=0x0)
    at /r+d/5.4/src/sys/kern/sys_generic.c:151
#19 0xc07fb333 in syscall (frame=
      {tf_fs = 47, tf_es = 131119, tf_ds = 137691183, tf_edi = 0, tf_esi = 319488, tf_ebp = -1077959384, tf_isp = -347202204, tf_ebx = -2008021396, tf_edx = 137863231, tf_ecx = 61, tf_eax = 198, tf_trapno = 0, tf_err = 2, tf_eip = -2008518601, tf_cs = 31, tf_eflags = 646, tf_esp = -1077959428, tf_ss = 47})
    at /r+d/5.4/src/sys/i386/i386/trap.c:1009
#20 0xc07e895f in Xint0x80_syscall ()
    at /r+d/5.4/src/sys/i386/i386/exception.s:201
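
As a reading aid for the trace above: gdb prints the trap frame fields in
frame 8 as signed 32-bit decimals, which makes them hard to compare with
the hex addresses in the other frames.  A minimal sketch, assuming Python
is at hand, that converts tf_eip to hex, decodes the low bits of tf_err
using the standard x86 page-fault error code layout, and checks which
logical block the pread offset falls into.  The helper names are made up
for illustration; the numbers are copied from the trace.

```python
# Values copied from the backtrace above (frames 7, 8, 13 and 17).
TF_EIP = -1066976993   # tf_eip in the frame 8 trap frame
TF_ERR = 2             # tf_err in the frame 8 trap frame
EVA = 28               # faulting virtual address passed to trap_pfault
OFFSET = 319488        # offset argument of kern_preadv
BLOCK_SIZE = 16384     # size argument of cluster_read / getblk


def to_u32_hex(value):
    """Show a signed 32-bit decimal (as gdb printed it) as an unsigned hex word."""
    return "0x%08x" % (value & 0xFFFFFFFF)


def decode_pf_err(err):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "page_present": bool(err & 0x1),  # 0 means the page was not present
        "write": bool(err & 0x2),         # 1 means the access was a write
        "user_mode": bool(err & 0x4),     # 0 means the fault came from kernel mode
    }


if __name__ == "__main__":
    print("faulting eip :", to_u32_hex(TF_EIP))
    print("fault address:", to_u32_hex(EVA))
    print("error code   :", decode_pf_err(TF_ERR))
    print("logical block:", OFFSET // BLOCK_SIZE)
```

Read this way, the fault looks like a kernel-mode write to a not-present
page at address 0x1c, which would be consistent with a write through a
NULL pointer to a structure member at offset 28, though that is only a
guess.  The offset 319488 falls in logical block 19, matching the lblkno
and blkno arguments in frames 12 and 13.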

-- 
  Tom Alsberg - hacker (being the best description fitting this space)
  Web page:	http://www.cs.huji.ac.il/~alsbergt/
DISCLAIMER:  The above message does not even necessarily represent what
my fingers have typed on the keyboard, save anything further.


