Date:      Sat, 25 Nov 2006 23:20:04 +0200
From:      Nikolay Pavlov <quetzal@zone3000.net>
To:        LI Xin <delphij@delphij.net>
Cc:        Robert Watson <rwatson@FreeBSD.org>, freebsd-stable@FreeBSD.org
Subject:   Re: deadlock in "zoneli" state on 6.2-PRERELEASE
Message-ID:  <20061125212004.GA22786@zone3000.net>
In-Reply-To: <456887CD.50606@delphij.net>
References:  <20061122195549.GA57018@zone3000.net> <338b359d969e9c68deaf49096aa91995@mail.geekcn.org> <20061123160208.GA62732@zone3000.net> <456662F4.6000306@delphij.net> <20061125103755.GA78288@zone3000.net> <456887CD.50606@delphij.net>

On Sunday, 26 November 2006 at  2:13:33 +0800, LI Xin wrote:
> Nikolay Pavlov wrote:
> > On Friday, 24 November 2006 at 11:11:48 +0800, LI Xin wrote:
> >> Nikolay Pavlov wrote:
> >>> On Thursday, 23 November 2006 at 20:24:15 +0800, delphij@delphij.net wrote:
> >>>> Hi,
> >>>>
> >>>> On Wed, 22 Nov 2006 21:55:49 +0200, Nikolay Pavlov <quetzal@zone3000.net> wrote:
> >>>>> Hi.
> >>>>> It seems I have a deadlock on 6.2-PRERELEASE.
> >>>>> This is a squid server in accelerator mode.
> >>>>> I can easily trigger it with a high rate of requests.
> >>>>> Squid is locked in some "zoneli" state; I am not sure what it is.
> >>>>> Also, I can't kill the process, even with SIGKILL.
> >>>>> In addition, one of the sshd processes is locked too.
> >>>> Would you please update to the latest RELENG_6 and apply this patch:
> >>>>
> >>>> http://people.freebsd.org/~delphij/misc/patch-zonelimit-workaround
> >>>>
> >>>> to see if things improve?
> >>>>
> >>>> Thanks in advance!
> >>>>
> >>>> Cheers,
> >>> Well, this patch gives ambiguous results for me.
> >>> Under heavy load the box becomes unresponsive over the network.
> >>> The system is mostly idle. Squid is locked in zoneli.
> >
> > Another panic. Guys, do I need some additional debug options, or is this
> > info enough? I am asking because this panic is easily reproducible for
> > me.
>
> I think this is enough.  By the way, which scheduler do you use?

4BSD

>
> > root@accel1:/usr/obj/usr/src/sys/ACCEL# kgdb kernel.debug /var/crash/vmcore.4
> > kgdb: kvm_nlist(_stopped_cpus):
> > kgdb: kvm_nlist(_stoppcbs):
> > [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
> > GNU gdb 6.1.1 [FreeBSD]
> > Copyright 2004 Free Software Foundation, Inc.
> > GDB is free software, covered by the GNU General Public License, and you are
> > welcome to change it and/or distribute copies of it under certain conditions.
> > Type "show copying" to see the conditions.
> > There is absolutely no warranty for GDB.  Type "show warranty" for details.
> > This GDB was configured as "i386-marcel-freebsd".
> >
> > Unread portion of the kernel message buffer:
> > lock order reversal: (sleepable after non-sleepable)
> >  1st 0xca21567c so_snd (so_snd) @ /usr/src/sys/netinet/tcp_output.c:253
> >  2nd 0xc070bd84 user map (user map) @ /usr/src/sys/vm/vm_map.c:3074
> > KDB: stack backtrace:
> > kdb_backtrace(ffffffff,c071ccb0,c071c210,c06e5c4c,c0758f18,...) at kdb_backtrace+0x29
> > witness_checkorder(c070bd84,9,c06be56c,c02,c070d2c4,0,c06aab25,9f) at witness_checkorder+0x4cd
> > _sx_xlock(c070bd84,c06be56c,c02) at _sx_xlock+0x2c
> > _vm_map_lock_read(c070bd40,c06be56c,c02,184637d,c92796b0,...) at _vm_map_lock_read+0x37
> > vm_map_lookup(f48a29d0,0,1,f48a29d4,f48a29c4,f48a29c8,f48a29ab,f48a29ac) at vm_map_lookup+0x28
> > vm_fault(c070bd40,0,1,0,c927aa80,...) at vm_fault+0x65
> > trap_pfault(f48a2a98,0,c) at trap_pfault+0xee
> > trap(8,c06b0028,f48a0028,1,0,...) at trap+0x325
> > calltrap() at calltrap+0x5
> > --- trap 0xc, eip = 0xc053ea34, esp = 0xf48a2ad8, ebp = 0xf48a2ae4 ---
> > m_copydata(0,ffffffff,1,d0020d74,c1040468,...) at m_copydata+0x28
> > tcp_output(d21c5570) at tcp_output+0x9af
> > tcp_input(d0020d00,14,e9,93935ce,0,...) at tcp_input+0x24a2
> > ip_input(d0020d00) at ip_input+0x561
> > netisr_processqueue(c075a6d8) at netisr_processqueue+0x6e
> > swi_net(0) at swi_net+0xc2
> > ithread_execute_handlers(c9279648,c92c3400) at ithread_execute_handlers+0xce
> > ithread_loop(c92436a0,f48a2d38,c070db20,0,c06a818a,...) at ithread_loop+0x4e
> > fork_exit(c04f76d4,c92436a0,f48a2d38) at fork_exit+0x61
> > fork_trampoline() at fork_trampoline+0x8
> > --- trap 0x1, eip = 0, esp = 0xf48a2d6c, ebp = 0 ---
> >
> >
> > Fatal trap 12: page fault while in kernel mode
> > fault virtual address   = 0xc
> > fault code              = supervisor read, page not present
> > instruction pointer     = 0x20:0xc053ea34
> > stack pointer           = 0x28:0xf48a2ad8
> > frame pointer           = 0x28:0xf48a2ae4
> > code segment            = base 0x0, limit 0xfffff, type 0x1b
> >                         = DPL 0, pres 1, def32 1, gran 1
> > processor eflags        = interrupt enabled, resume, IOPL = 0
> > current process         = 13 (swi1: net)
> > trap number             = 12
> > panic: page fault
> > KDB: stack backtrace:
> > kdb_backtrace(100,c927aa80,28,f48a2a98,c,...) at kdb_backtrace+0x29
> > panic(c069b8a1,c06c5f2c,0,fffff,c927d69b,...) at panic+0xa8
> > trap_fatal(f48a2a98,c,c927aa80,c070bd40,0,...) at trap_fatal+0x2a6
> > trap_pfault(f48a2a98,0,c) at trap_pfault+0x187
> > trap(8,c06b0028,f48a0028,1,0,...) at trap+0x325
> > calltrap() at calltrap+0x5
> > --- trap 0xc, eip = 0xc053ea34, esp = 0xf48a2ad8, ebp = 0xf48a2ae4 ---
> > m_copydata(0,ffffffff,1,d0020d74,c1040468,...) at m_copydata+0x28
> > tcp_output(d21c5570) at tcp_output+0x9af
> > tcp_input(d0020d00,14,e9,93935ce,0,...) at tcp_input+0x24a2
> > ip_input(d0020d00) at ip_input+0x561
> > netisr_processqueue(c075a6d8) at netisr_processqueue+0x6e
> > swi_net(0) at swi_net+0xc2
> > ithread_execute_handlers(c9279648,c92c3400) at ithread_execute_handlers+0xce
> > ithread_loop(c92436a0,f48a2d38,c070db20,0,c06a818a,...) at ithread_loop+0x4e
> > fork_exit(c04f76d4,c92436a0,f48a2d38) at fork_exit+0x61
> > fork_trampoline() at fork_trampoline+0x8
> > --- trap 0x1, eip = 0, esp = 0xf48a2d6c, ebp = 0 ---
> > Uptime: 25m13s
> > Dumping 3967 MB (3 chunks)
> >   chunk 0: 1MB (159 pages) ... ok
> >   chunk 1: 3966MB (1015280 pages) 3950 3934 3918 3902 3886 3870 3854 3838 3822 3806 3790 3774 3758 3742 3726 3710 3694 3678 3662 3646 3630 3614 3598 3582 3566 3550 3534 3518 3502 3486 3470 3454 3438 3422 3406 3390 3374 3358 3342 3326 3310 3294 3278 3262 3246 3230 3214 3198 3182 3166 3150 3134 3118 3102 3086 3070 3054 3038 3022 3006 2990 2974 2958 2942 2926 2910 2894 2878 2862 2846 2830 2814 2798 2782 2766 2750 2734 2718 2702 2686 2670 2654 2638 2622 2606 2590 2574 2558 2542 2526 2510 2494 2478 2462 2446 2430 2414 2398 2382 2366 2350 2334 2318 2302 2286 2270 2254 2238 2222 2206 2190 2174 2158 2142 2126 2110 2094 2078 2062 2046 2030 2014 1998 1982 1966 1950 1934 1918 1902 1886 1870 1854 1838 1822 1806 1790 1774 1758 1742 1726 1710 1694 1678 1662 1646 1630 1614 1598 1582 1566 1550 1534 1518 1502 1486 1470 1454 1438 1422 1406 1390 1374 1358 1342 1326 1310 1294 1278 1262 1246 1230 1214 1198 1182 1166 1150 1134 1118 1102 1086 1070 1054 1038 1022 1006 990 974 958 942 926 910 894 878 862 846 830 814 798 782 766 750 734 718 702 686 670 654 638 622 606 590 574 558 542 526 510 494 478 462 446 430 414 398 382 366 350 334 318 302 286 270 254 238 222 206 190 174 158 142 126 110 94 78 62 46 30 14 ... ok
> >   chunk 2: 1MB (128 pages)
> >
> > #0  doadump () at pcpu.h:165
> > 165     pcpu.h: No such file or directory.
> >         in pcpu.h
> > (kgdb) bt
> > #0  doadump () at pcpu.h:165
> > #1  0xc050ae04 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
> > #2  0xc050b05f in panic (fmt=0xc069b8a1 "%s") at /usr/src/sys/kern/kern_shutdown.c:565
> > #3  0xc0674fa2 in trap_fatal (frame=0xf48a2a98, eva=12) at /usr/src/sys/i386/i386/trap.c:837
> > #4  0xc0674cd3 in trap_pfault (frame=0xf48a2a98, usermode=0, eva=12) at /usr/src/sys/i386/i386/trap.c:745
> > #5  0xc0674939 in trap (frame=
> >       {tf_fs = 8, tf_es = -1066729432, tf_ds = -192282584, tf_edi = 1, tf_esi = 0, tf_ebp = -192271644, tf_isp = -192271676, tf_ebx = 4380, tf_edx = -1, tf_ecx = 0, tf_eax = -805171852, tf_trapno = 12, tf_err = 0, tf_eip = -1068242380, tf_cs = 32, tf_eflags = 590338, tf_esp = 4380, tf_ss = -769895056})
> >     at /usr/src/sys/i386/i386/trap.c:435
> > #6  0xc0663bba in calltrap () at /usr/src/sys/i386/i386/exception.s:139
> > #7  0xc053ea34 in m_copydata (m=0x0, off=-1, len=1, cp=0xd0020d74 "") at /usr/src/sys/kern/uipc_mbuf.c:543
> > #8  0xc0590aeb in tcp_output (tp=0xd21c5570) at /usr/src/sys/netinet/tcp_output.c:770
> > #9  0xc058f536 in tcp_input (m=0xd0020d00, off0=20) at /usr/src/sys/netinet/tcp_input.c:2471
> > #10 0xc058755d in ip_input (m=0xd0020d00) at /usr/src/sys/netinet/ip_input.c:785
> > #11 0xc0578252 in netisr_processqueue (ni=0xc075a6d8) at /usr/src/sys/net/netisr.c:236
> > #12 0xc057841a in swi_net (dummy=0x0) at /usr/src/sys/net/netisr.c:349
> > #13 0xc04f762e in ithread_execute_handlers (p=0xc9279648, ie=0xc92c3400) at /usr/src/sys/kern/kern_intr.c:682
> > #14 0xc04f7722 in ithread_loop (arg=0xc92436a0) at /usr/src/sys/kern/kern_intr.c:765
> > #15 0xc04f697d in fork_exit (callout=0xc04f76d4 <ithread_loop>, arg=0xc92436a0, frame=0xf48a2d38) at /usr/src/sys/kern/kern_fork.c:821
> > #16 0xc0663c1c in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:208
>
> Cheers,
> -- 
> Xin LI <delphij@delphij.net>	http://www.delphij.net/
> FreeBSD - The Power to Serve!
>
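
A side note on the trace above: frame #7 shows m_copydata() called with m=0x0 and off=-1, and the trap reports fault virtual address 0xc. That would be consistent with a read of the m_len field through a NULL mbuf pointer, assuming the i386 layout in which m_len follows three 4-byte pointers. A minimal sketch of that arithmetic follows. The struct is a hypothetical stand-in, not the real struct mbuf, and pointers are modelled as 32-bit integers so the result does not depend on the host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the leading fields of the mbuf header as laid
 * out on i386 (assumption: three 4-byte pointers, then the int length).
 */
struct i386_m_hdr {
	uint32_t mh_next;     /* struct mbuf * on the real system */
	uint32_t mh_nextpkt;  /* struct mbuf * on the real system */
	uint32_t mh_data;     /* caddr_t on the real system */
	int32_t  mh_len;      /* the field m->m_len refers to */
};

/* Under the assumed layout the length field sits 12 (0xc) bytes in. */
static_assert(offsetof(struct i386_m_hdr, mh_len) == 0xc,
    "mh_len expected at offset 0xc");

/* Address touched by a read of m->m_len for a given mbuf pointer value. */
static uint32_t
m_len_address(uint32_t m)
{
	return (m + (uint32_t)offsetof(struct i386_m_hdr, mh_len));
}
```

With m = 0 this gives 0xc, which matches the reported fault address. It says nothing about why tcp_output() ended up passing a NULL mbuf and a negative offset.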



-- 
======================================================================
- Best regards, Nikolay Pavlov. <<<-----------------------------------
======================================================================



