Date:      Wed, 2 Aug 2006 16:06:11 +0100 (BST)
From:      Robert Watson <rwatson@FreeBSD.org>
To:        Goran Gajic <ggajic@afrodita.rcub.bg.ac.yu>
Cc:        freebsd-current@mx1.freebsd.org
Subject:   Re: 7.0-CURRENT panic from today build
Message-ID:  <20060802160339.G56791@fledge.watson.org>
In-Reply-To: <Pine.LNX.4.63.0608021642500.10912@afrodita.rcub.bg.ac.yu>
References:  <Pine.LNX.4.63.0608021642500.10912@afrodita.rcub.bg.ac.yu>

On Wed, 2 Aug 2006, Goran Gajic wrote:

> fbsd# kgdb /usr/src/sys/i386/compile/GENERIC/kernel.debug /var/crash/vmcore.3
> [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "i386-marcel-freebsd".

Thanks for the report.  This is similar to a problem that cropped up in the
UNIX domain socket code last week after changes were made to the way socket
tear-down works.  I'll likely have a fix for this in the next 24 hours or
so; I just need to read some code and decide which of two or three
approaches is most likely the right one.

To be specific: the problem is that we now largely tear down socket state,
such as socket buffers and kqueue, before we enter the pru_detach routine.
This means that the call to soisdisconnected() from tcp_discardcb() no
longer occurs in the right place and needs to move.  There was already a
comment from me there suggesting it was in the wrong place before; now it
is definitely in the wrong place.  Likely the socket should already be
detached from the pcb when we run tcp_discardcb(), and the caller should
have called soisdisconnected() if it was needed.
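
As a rough illustration of the ordering problem, here is a minimal userspace
sketch in C.  It is not the kernel code and not the eventual fix: every
model_* name and structure is hypothetical, and the assumption that
tear-down leaves the knote list's lock pointer NULL is inferred only from
knlist_mtx_locked(arg=0x0) in the quoted backtrace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; these are NOT the real kernel structures. */
struct model_mtx { int owner; };

struct model_sockbuf {
	struct model_mtx *sb_knl_lock;	/* assumed NULL once torn down */
	int		  sb_wakeups;	/* notifications actually delivered */
};

struct model_socket {
	struct model_sockbuf so_rcv;
	bool		     so_disconnected;
};

enum notify_result { NOTIFY_OK, NOTIFY_STALE };

/* Stand-in for sowakeup()/knote(): needs the knote list lock to exist. */
static enum notify_result
model_wakeup(struct model_sockbuf *sb)
{
	assert(sb != NULL);
	if (sb->sb_knl_lock == NULL)
		return (NOTIFY_STALE);	/* the panic faulted at this point */
	sb->sb_wakeups++;
	return (NOTIFY_OK);
}

/* Stand-in for soisdisconnected(): mark the socket, then wake listeners. */
static enum notify_result
model_soisdisconnected(struct model_socket *so)
{
	so->so_disconnected = true;
	return (model_wakeup(&so->so_rcv));
}

/* Stand-in for the generic tear-down that now runs before pru_detach. */
static void
model_teardown(struct model_socket *so)
{
	so->so_rcv.sb_knl_lock = NULL;
}

/* Ordering seen in the panic: tear down first, then detach notifies. */
static enum notify_result
model_close_buggy(struct model_socket *so)
{
	model_teardown(so);
	return (model_soisdisconnected(so));
}

/* One possible ordering: the caller notifies before tear-down. */
static enum notify_result
model_close_fixed(struct model_socket *so)
{
	enum notify_result r = model_soisdisconnected(so);

	model_teardown(so);
	return (r);
}
```

In this model, model_close_buggy() reports NOTIFY_STALE where the real
kernel took a page fault, while model_close_fixed() delivers the wakeup
while the socket buffer state is still valid.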

I'll commit the fix directly to CVS, so you will want to cvsup in about 24
hours to pick it up.

Thanks,

Robert N M Watson
Computer Laboratory
University of Cambridge

>
> Unread portion of the kernel message buffer:
> Kernel page fault with the following non-sleepable locks held:
> exclusive sleep mutex so_rcv r = 0 (0xc30a56f0) locked @ kern/uipc_socket2.c:166
> exclusive sleep mutex inp (tcpinp) r = 0 (0xc30db480) locked @ netinet/tcp_usrreq.c:252
> exclusive sleep mutex tcp r = 0 (0xc0a5a2ec) locked @ netinet/tcp_usrreq.c:251
> KDB: stack backtrace:
> kdb_backtrace(3,c2c91d38,c,c2773360,d5aa8a5c,...) at kdb_backtrace+0x29
> witness_warn(5,0,c0946263) at witness_warn+0x192
> trap(c0a50008,28,c0910028,c30a56d8,c30a567c,...) at trap+0x108
> calltrap() at calltrap+0x5
> --- trap 0xc, eip = 0xc06818d2, esp = 0xd5aa8aa4, ebp = 0xd5aa8aa4 ---
> knlist_mtx_locked(0) at knlist_mtx_locked+0x6
> knote(c30a56d8,0,1,c30a56f0,c30a567c,...) at knote+0x1d
> sowakeup(c30a567c,c30a56cc) at sowakeup+0x61
> soisdisconnected(c30a567c) at soisdisconnected+0x61
> tcp_discardcb(c30b6570) at tcp_discardcb+0x1f5
> tcp_detach(c30a567c,c30db3f0,c30a567c,c09d1568,d5aa8b74,...) at tcp_detach+0x14e
> tcp_usr_detach(c30a567c) at tcp_usr_detach+0x67
> sofree(c30a567c) at sofree+0x1fe
> soclose(c30a567c) at soclose+0x2d9
> soo_close(c30dfbd0,c2773360) at soo_close+0x4b
> fdrop_locked(c30dfbd0,c2773360,c24806d0,0,c091adae,...) at fdrop_locked+0x88
> fdrop(c30dfbd0,c2773360,6b5,c0a11a94,0,...) at fdrop+0x24
> closef(c30dfbd0,c2773360,1,0,0,...) at closef+0x367
> kern_close(c2773360,1d,d5aa8d30,c08990de,c2773360,...) at kern_close+0x1b6
> close(c2773360,d5aa8d04) at close+0x10
> syscall(a8e0003b,bf1f003b,8245003b,e,947fa38,...) at syscall+0x256
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (6, Linux ELF, close), eip = 0x28feecaf, esp = 0xbf1ff698, ebp = 0x91ed218 ---
>
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address   = 0x10
> fault code              = supervisor read, page not present
> instruction pointer     = 0x20:0xc06818d2
> stack pointer           = 0x28:0xd5aa8aa4
> frame pointer           = 0x28:0xd5aa8aa4
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                        = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 598 (skype)
> Dumping 511 MB (2 chunks)
>  chunk 0: 1MB (159 pages) ... ok
>  chunk 1: 511MB (130736 pages) 495 479 463 447 431 415 399 383 367 351 335 319 303 287 271 255 239 223 207 191 175 159 143 127 111 95 79 63 47 31 15
>
> #0  doadump () at pcpu.h:166
> 166     pcpu.h: No such file or directory.
>        in pcpu.h
> (kgdb) ) where
> #0  doadump () at pcpu.h:166
> #1  0xc04758a3 in db_fncall (dummy1=-710244144, dummy2=0, dummy3=1016, dummy4=0xd5aa88a4 "¼\210ªÕø\003") at ../../../ddb/db_command.c:481
> #2  0xc04756af in db_command (last_cmdp=0xc09f0a84, cmd_table=0x0) at ../../../ddb/db_command.c:396
> #3  0xc047576a in db_command_loop () at ../../../ddb/db_command.c:448
> #4  0xc0477369 in db_trap (type=12, code=0) at ../../../ddb/db_main.c:221
> #5  0xc06b71d0 in kdb_trap (type=12, code=0, tf=0xd5aa8a64) at ../../../kern/subr_kdb.c:502
> #6  0xc0898dc1 in trap_fatal (frame=0xd5aa8a64, eva=16) at ../../../i386/i386/trap.c:858
> #7  0xc089843f in trap (frame=
>      {tf_fs = -1062928376, tf_es = 40, tf_ds = -1064239064,
>       tf_edi = -1022732584, tf_esi = -1022732676, tf_ebp = -710243676,
>       tf_isp = -710243696, tf_ebx = -1022732596, tf_edx = -1032388600,
>       tf_ecx = 4, tf_eax = 0, tf_trapno = 12, tf_err = 16,
>       tf_eip = -1066919726, tf_cs = 32, tf_eflags = 66050,
>       tf_esp = -710243652, tf_ss = -1066920643})
>    at ../../../i386/i386/trap.c:277
> #8  0xc0883e2a in calltrap () at ../../../i386/i386/exception.s:138
> #9  0xc06818d2 in knlist_mtx_locked (arg=0x0) at ../../../kern/kern_event.c:1644
> #10 0xc068153d in knote (list=0xc30a56d8, hint=0, islocked=1) at ../../../kern/kern_event.c:1520
> #11 0xc06da6dd in sowakeup (so=0xc30a567c, sb=0xc30a56cc) at ../../../kern/uipc_sockbuf.c:190
> #12 0xc06df7dd in soisdisconnected (so=0xc30a567c) at ../../../kern/uipc_socket2.c:170
> #13 0xc07476d1 in tcp_discardcb (tp=0xc30b6570) at ../../../netinet/tcp_subr.c:786
> #14 0xc074bee2 in tcp_detach (so=0xc30a567c, inp=0xc30db3f0) at ../../../netinet/tcp_usrreq.c:212
> #15 0xc074bf9b in tcp_usr_detach (so=0xc30a567c) at ../../../netinet/tcp_usrreq.c:257
> #16 0xc06dc206 in sofree (so=0xc30a567c) at ../../../kern/uipc_socket.c:614
> #17 0xc06dc4f1 in soclose (so=0xc30a567c) at ../../../kern/uipc_socket.c:684
> #18 0xc06c9c53 in soo_close (fp=0xc30dfbd0, td=0xc2773360) at ../../../kern/sys_socket.c:315
> #19 0xc067dcc8 in fdrop_locked (fp=0xc30dfbd0, td=0xc2773360) at file.h:296
> #20 0xc067dc38 in fdrop (fp=0xc30dfbd0, td=0xc2773360) at ../../../kern/kern_descrip.c:2164
> #21 0xc067c727 in closef (fp=0xc30dfbd0, td=0xc2773360) at ../../../kern/kern_descrip.c:1979
> #22 0xc067a002 in kern_close (td=0xc2773360, fd=29) at ../../../kern/kern_descrip.c:1026
> #23 0xc0679e48 in close (td=0xc2773360, uap=0x0) at ../../../kern/kern_descrip.c:977
> #24 0xc08990de in syscall (frame=
>      {tf_fs = -1461714885, tf_es = -1088487365, tf_ds = -2109407173,
>       tf_edi = 14, tf_esi = 155712056, tf_ebp = 153014808,
>       tf_isp = -710242972, tf_ebx = 29, tf_edx = 156362632, tf_ecx = 1,
>       tf_eax = 6, tf_trapno = 22, tf_err = 2, tf_eip = 687795375,
>       tf_cs = 51, tf_eflags = 582, tf_esp = -1088424296, tf_ss = 59})
>    at ../../../i386/i386/trap.c:1006
> #25 0xc0883e7f in Xint0x80_syscall () at ../../../i386/i386/exception.s:191
> #26 0x00000033 in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> (kgdb)
>
>
> FreeBSD fbsd.interex.net 7.0-CURRENT FreeBSD 7.0-CURRENT #0: Wed Aug  2 13:15:27 CEST 2006 root@fbsd.interex.net:/usr/src/sys/i386/compile/GENERIC i386
>
>
> Regards,
> gg.
>