Date: Fri, 24 Nov 2006 11:11:48 +0800 From: LI Xin <delphij@delphij.net> To: Nikolay Pavlov <quetzal@zone3000.net>, delphij@delphij.net, freebsd-stable@FreeBSD.org, Robert Watson <rwatson@FreeBSD.org> Subject: Re: deadlock in "zoneli" state on 6.2-PRERELEASE Message-ID: <456662F4.6000306@delphij.net> In-Reply-To: <20061123160208.GA62732@zone3000.net> References: <20061122195549.GA57018@zone3000.net> <338b359d969e9c68deaf49096aa91995@mail.geekcn.org> <20061123160208.GA62732@zone3000.net>
next in thread | previous in thread | raw e-mail | index | archive | help
This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enigFCF6F9B0E31ED5C3B4D84A6A Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Nikolay Pavlov wrote: > On Thursday, 23 November 2006 at 20:24:15 +0800, delphij@delphij.net wr= ote: >> Hi, >> >> On Wed, 22 Nov 2006 21:55:49 +0200, Nikolay Pavlov <quetzal@zone3000.n= et> wrote: >>> Hi. >>> It seems i have a deadlock on 6.2-PRERELEASE. >>> This is squid server in accelerator mode. >>> I can easily trigger it with a high rate of requests. >>> Squid is locked on some "zoneli" state, i am not sure what it is. >>> Also i can't KILL proccess even with SIGKILL. >>> In addition one of sshd proccess is locked too. >> Would you please update to the latest RELENG_6 and apply this patch: >> >> http://people.freebsd.org/~delphij/misc/patch-zonelimit-workaround >> >> to see if things gets improved? >> >> Thanks in advance! >> >> Cheers, >=20 > Well. This patch works quite ambiguous for me. > Under heavy load this box become unresponseble via network. > System is mostly idle. Squid is locked in zoneli. Would you please give me the output of "sysctl vm.zone" on a patched system? It's not important whether it is loaded. > ast pid: 840; load averages: 0.26, 0.24, 0.17 up 0+00:11:50 10:1= 9:46 > 34 processes: 1 running, 33 sleeping > CPU states: 0.4% user, 0.0% nice, 0.4% system, 1.5% interrupt, 97.8= % idle > Mem: 225M Active, 144M Inact, 261M Wired, 12K Cache, 112M Buf, 3259M Fr= ee > Swap: 4070M Total, 4070M Free >=20 > PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND= > 682 squid 1 -16 0 207M 207M zoneli 2:18 6.59% squid > 709 root 1 -8 0 7768K 7240K piperd 0:00 0.00% perl5.8= =2E8 > 691 root 1 96 0 6632K 4796K select 0:00 0.00% snmpd > 829 root 1 76 -20 2400K 1648K RUN 0:00 0.00% top > 790 quetzal 1 96 0 6220K 3252K select 0:00 0.00% sshd > 788 root 1 4 0 6232K 3232K sbwait 0:00 0.00% sshd > 837 root 1 20 0 5048K 3024K pause 0:00 0.00% tcsh > 832 root 1 4 0 6232K 3236K sbwait 0:00 0.00% sshd > 820 root 1 20 0 4700K 2856K pause 0:00 0.00% tcsh > 645 root 1 96 0 2984K 1808K select 0:00 0.00% ntpd > 791 quetzal 1 20 0 4708K 2872K pause 0:00 0.00% tcsh > 560 root 1 96 0 1352K 996K select 0:00 0.00% syslogd= > 362 _pflogd 1 -58 0 1600K 1144K bpf 0:00 0.00% pflogd > 835 quetzal 1 20 0 4728K 2960K pause 0:00 0.00% tcsh > 688 squid 1 -8 0 1224K 632K piperd 0:00 0.00% unlinkd= > 834 quetzal 1 96 0 6220K 3252K select 0:00 0.00% sshd > 840 root 1 20 0 1540K 960K pause 0:00 0.00% netstat= > 719 root 1 96 0 3464K 2796K select 0:00 0.00% sendmai= l > 729 root 1 8 0 1364K 1060K nanslp 0:00 0.00% cron >=20 > root@accel1:~# netstat -h 1 > input (Total) output > packets errs bytes packets errs bytes colls > 1.6K 0 1.3M 1.5K 0 1.6M 0 > 1.8K 0 1.6M 1.7K 0 1.6M 0 > 1.3K 0 1.0M 1.4K 0 1.4M 0 > 1.5K 0 1.3M 1.5K 0 1.4M 0 > 1.6K 0 1.4M 1.6K 0 1.5M 0 > 1.7K 0 1.5M 1.6K 0 1.5M 0 > 1.3K 0 830K 1.4K 0 1.5M 0 > 1.1K 0 679K 1.3K 0 1.4M 0 > 812 0 501K 912 0 971K 0 > 1.2K 0 1.1M 1.2K 0 1.1M 0 > 617 0 325K 742 0 806K 0 > 634 0 312K 769 0 818K 0 > 1.8K 0 1.7M 1.5K 0 1.1M 0 > 11K 0 13M 7.5K 0 3.8M 0 > 10K 0 12M 8.0K 0 5.2M 0 > 9.7K 0 9.9M 8.2K 0 6.3M 0 > 513 1.7K 666K 328 0 151K 0 > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > Here goes load... >=20 > 1.0K 543 782K 434 0 247K 0 > 0 2.3K 0 0 0 0 0 > 2 605 1.5K 2 0 132 0 > input (Total) output > packets errs bytes packets errs bytes colls > 0 334 0 0 0 0 0 > 0 286 0 0 0 0 0 > 0 288 0 0 0 0 0 > 819 204 689K 328 0 122K 0 > 0 1.7K 0 0 0 0 0 > 866 1.2K 719K 375 0 141K 0 > 144 1.5K 175K 111 0 55K 0 > 0 1.3K 0 0 0 0 0 > 687 182 426K 304 0 73K 0 > 0 3.2K 0 0 0 0 0 > 1.0K 0 723K 405 0 126K 0 > 17 1.8K 25K 11 0 2.2K 0 > 598 990 409K 163 0 32K 0 > 785 1.9K 635K 313 0 85K 0 > 0 3.1K 0 0 0 0 0 > 0 1.0K 0 0 0 0 0 > 0 82 0 0 0 0 0 > 0 205 0 0 0 0 0 > 0 181 0 0 0 0 0 > 0 197 0 0 0 0 0 > 1.0K 633 915K 442 0 89K 0 > input (Total) output > packets errs bytes packets errs bytes colls > 821 1.8K 712K 297 0 53K 0 > 481 1.8K 437K 326 0 91K 0 > 0 1.5K 0 0 0 0 0 > 2 190 3.0K 2 0 356 0 > 0 210 0 0 0 0 0 > 0 150 0 0 0 0 0 > 0 210 0 0 0 0 0 > 799 631 938K 458 0 99K 0 > 36 2.1K 41K 27 0 5.5K 0 > 678 260 898K 366 0 70K 0 >=20 > BUT after some time, with out any load squid is responding again withou= t > any reboot. >=20 > Some other info: >=20 > root@accel1:~# netstat -i > Name Mtu Network Address Ipkts Ierrs Opkts Oer= rs Coll > em0 1500 <Link#1> 00:30:48:2a:de:5a 984648 11332 939484 = 0 0 > em0 1500 206.53.57/26 accel1 968817 - 939431 = - - > em1* 1500 <Link#2> 00:30:48:2a:de:5b 0 0 0 = 0 0 > lo0 16384 <Link#3> 314 0 314 = 0 0 > lo0 16384 localhost ::1 0 - 0 = - - > lo0 16384 your-net localhost 314 - 314 = - - > pflog 33208 <Link#4> 0 0 0 = 0 0 >=20 > Another intresting thing - i see that mbufs deniend counter is growing:= >=20 > root@accel1:~# netstat -m > 65485/65/65550 mbufs in use (current/cache/total) > 65284/418/65702/131072 mbuf clusters in use (current/cache/total/max) > 65284/0 mbuf+clusters out of packet secondary zone in use (current/cach= e) > 0/0/0/0 4k (page size) jumbo clusters in use (current/cache/total/max) > 0/0/0/0 9k jumbo clusters in use (current/cache/total/max) > 0/0/0/0 16k jumbo clusters in use (current/cache/total/max) > 146939K/852K/147791K bytes allocated to network (current/cache/total) > 21823/0/75016 requests for mbufs denied (mbufs/clusters/mbuf+clusters) > 0/0/0 requests for jumbo clusters denied (4k/9k/16k) > 0/3/6656 sfbufs in use (current/peak/max) > 0 requests for sfbufs denied > 0 requests for sfbufs delayed > 0 requests for I/O initiated by sendfile > 0 calls to protocol drain routines >=20 > root@accel1:~# netstat -m > 65538/12/65550 mbufs in use (current/cache/total) > 65338/364/65702/131072 mbuf clusters in use (current/cache/total/max) > 65338/6 mbuf+clusters out of packet secondary zone in use (current/cach= e) > 0/0/0/0 4k (page size) jumbo clusters in use (current/cache/total/max) > 0/0/0/0 9k jumbo clusters in use (current/cache/total/max) > 0/0/0/0 16k jumbo clusters in use (current/cache/total/max) > 147060K/731K/147791K bytes allocated to network (current/cache/total) > 23262/0/86054 requests for mbufs denied (mbufs/clusters/mbuf+clusters) > 0/0/0 requests for jumbo clusters denied (4k/9k/16k) > 0/3/6656 sfbufs in use (current/peak/max) > 0 requests for sfbufs denied > 0 requests for sfbufs delayed > 0 requests for I/O initiated by sendfile > 0 calls to protocol drain routines >=20 > My configuration files: >=20 > root@accel1:~# cat /boot/loader.conf > # --- Loader settings --- > autoboot_delay=3D"5" # Delay in seconds before autobooting= >=20 > # --- Kernel tunables --- > kern.ipc.nmbclusters=3D"131072" # Set the number of mbuf clusters > kern.cam.scsi_delay=3D"5000" # Delay (in ms) before probing SCSI > kern.maxdsiz=3D"2752M" # Allow more memory allocation for sq= uid > kern.dfldsiz=3D"2752M" # Allow more memory allocation for sq= uid > #kern.maxssiz=3D"256M" # Allow more memory allocation for sq= uid > #kern.maxusers=3D"512" >=20 > # --- Networking modules --- > pf_load=3D"YES" # Packet filter >=20 > # --- Other modules --- > accf_data_load=3D"YES" # Wait for data accept filter > accf_http_load=3D"YES" # Wait for full HTTP request accept = filter >=20 >=20 > root@accel1:~# cat /etc/sysctl.conf > # $FreeBSD: src/etc/sysctl.conf,v 1.8 2003/03/13 18:43:50 mux Exp $ > # > # This file is read when going to multi-user and its contents piped th= ru > # ``sysctl'' to adjust kernel values. ``man 5 sysctl.conf'' for detai= ls. > # >=20 > # Uncomment this to prevent users from seeing information about process= es that > # are being run under another UID. > #security.bsd.see_other_uids=3D0 >=20 > # --- MAC access for squid --- > net.inet.ip.portrange.reservedlow=3D0 > net.inet.ip.portrange.reservedhigh=3D0 > security.mac.portacl.rules=3Duid:100:tcp:80 >=20 >=20 > # --- Kernel tunning --- > kern.ipc.somaxconn=3D8192 >=20 > # --- Network tunning and protection --- > kern.ipc.maxsockbuf=3D1048576 > net.inet.tcp.sendspace=3D262144 > net.inet.tcp.recvspace=3D131072 > net.inet.tcp.msl=3D3000 > net.inet.icmp.icmplim=3D50 > net.inet.icmp.drop_redirect=3D1 > net.inet.icmp.log_redirect=3D1 > net.inet.ip.redirect=3D0 > net.inet6.ip6.redirect=3D0 >=20 > Some kernel options: >=20 > options VM_KMEM_SIZE_MAX=3D536870912" > options MAC > options MAC_PORTACL >=20 >=20 > Ahh... And here is also a panic :) >=20 >=20 > root@accel1:/usr/obj/usr/src/sys/ACCEL# kgdb kernel.debug /var/crash/vm= core.3 > kgdb: kvm_nlist(_stopped_cpus): > kgdb: kvm_nlist(_stoppcbs): > [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db= =2Eso: Undefined symbol "ps_pglobal_lookup"] > GNU gdb 6.1.1 [FreeBSD] > Copyright 2004 Free Software Foundation, Inc. > GDB is free software, covered by the GNU General Public License, and yo= u are > welcome to change it and/or distribute copies of it under certain condi= tions. > Type "show copying" to see the conditions. > There is absolutely no warranty for GDB. Type "show warranty" for deta= ils. > This GDB was configured as "i386-marcel-freebsd". >=20 > Unread portion of the kernel message buffer: >=20 >=20 > Fatal trap 12: page fault while in kernel mode > fault virtual address =3D 0xc > fault code =3D supervisor read, page not present > instruction pointer =3D 0x20:0xc054dfac > stack pointer =3D 0x28:0xf4817ad0 > frame pointer =3D 0x28:0xf4817adc > code segment =3D base 0x0, limit 0xfffff, type 0x1b > =3D DPL 0, pres 1, def32 1, gran 1 > processor eflags =3D interrupt enabled, resume, IOPL =3D 0 > current process =3D 13 (swi1: net) > trap number =3D 12 > panic: page fault > KDB: stack backtrace: > kdb_backtrace(100,c927aa80,28,f4817a90,c,...) at kdb_backtrace+0x29 > panic(c06c9d38,c06f033a,0,fffff,c927d69b,...) at panic+0xa8 > trap_fatal(f4817a90,c,c927aa80,0,c,...) at trap_fatal+0x2a6 > trap_pfault(f4817a90,0,c) at trap_pfault+0x1f3 > trap(c93b0008,f4810028,c0590028,1,0,...) at trap+0x325 > calltrap() at calltrap+0x5 > --- trap 0xc, eip =3D 0xc054dfac, esp =3D 0xf4817ad0, ebp =3D 0xf4817ad= c --- > m_copydata(0,ffffffff,1,cfcbc174,c1040460,...) at m_copydata+0x28 > tcp_output(cc6781d0) at tcp_output+0xa17 > tcp_input(ca16ab00,14,1,c927aa80,c927aa80,...) at tcp_input+0x286c > ip_input(ca16ab00) at ip_input+0x5e1 > netisr_processqueue(c073d7b8) at netisr_processqueue+0x9f > swi_net(0) at swi_net+0xf2 > ithread_execute_handlers(c9279648,c92c3400) at ithread_execute_handlers= +0x121 > ithread_loop(c92436a0,f4817d38) at ithread_loop+0x54 > fork_exit(c04ff5c8,c92436a0,f4817d38) at fork_exit+0x70 > fork_trampoline() at fork_trampoline+0x8 > --- trap 0x1, eip =3D 0, esp =3D 0xf4817d6c, ebp =3D 0 --- > Uptime: 11m24s > Dumping 3967 MB (3 chunks) > chunk 0: 1MB (159 pages) ... ok > chunk 1: 3966MB (1015280 pages) 3950 3934 3918 3902 3886 3870 3854 38= 38 3822 3806 3790 3774 3758 3742 3726 3710 3694 3678 3662 3646 3630 3614 = 3598 3582 3566 3550 3534 3518 3502 3486 3470 3454 3438 3422 3406 3390 337= 4 3358 3342 3326 3310 3294 3278 3262 3246 3230 3214 3198 3182 3166 3150 3= 134 3118 3102 3086 3070 3054 3038 3022 3006 2990 2974 2958 2942 2926 2910= 2894 2878 2862 2846 2830 2814 2798 2782 2766 2750 2734 2718 2702 2686 26= 70 2654 2638 2622 2606 2590 2574 2558 2542 2526 2510 2494 2478 2462 2446 = 2430 2414 2398 2382 2366 2350 2334 2318 2302 2286 2270 2254 2238 2222 220= 6 2190 2174 2158 2142 2126 2110 2094 2078 2062 2046 2030 2014 1998 1982 1= 966 1950 1934 1918 1902 1886 1870 1854 1838 1822 1806 1790 1774 1758 1742= 1726 1710 1694 1678 1662 1646 1630 1614 1598 1582 1566 1550 1534 1518 15= 02 1486 1470 1454 1438 1422 1406 1390 1374 1358 1342 1326 1310 1294 1278 = 1262 1246 1230 1214 1198 1182 1166 1150 1134 1118 1102 1086 1070 1054 103= 8 1022 1006 990 974 958 942 926 910 894 8 78 862 846 830 814 798 782 766 750 734 718 702 686 670 654 638 622 606 59= 0 574 558 542 526 510 494 478 462 446 430 414 398 382 366 350 334 318 302= 286 270 254 238 222 206 190 174 158 142 126 110 94 78 62 46 30 14 ... ok= > chunk 2: 1MB (128 pages) >=20 > #0 doadump () at pcpu.h:165 > 165 pcpu.h: No such file or directory. > in pcpu.h > (kgdb) bt > #0 doadump () at pcpu.h:165 > #1 0xc05167d2 in boot (howto=3D260) at /usr/src/sys/kern/kern_shutdown= =2Ec:409 > #2 0xc0516a98 in panic (fmt=3D0xc06c9d38 "%s") at /usr/src/sys/kern/ke= rn_shutdown.c:565 > #3 0xc06a416a in trap_fatal (frame=3D0xf4817a90, eva=3D12) at /usr/src= /sys/i386/i386/trap.c:837 > #4 0xc06a3e9b in trap_pfault (frame=3D0xf4817a90, usermode=3D0, eva=3D= 12) at /usr/src/sys/i386/i386/trap.c:745 > #5 0xc06a3a95 in trap (frame=3D > {tf_fs =3D -918880248, tf_es =3D -192872408, tf_ds =3D -106790908= 0, tf_edi =3D 1, tf_esi =3D 0, tf_ebp =3D -192840996, tf_isp =3D -1928410= 28, tf_ebx =3D 4380, tf_edx =3D -1, tf_ecx =3D 0, tf_eax =3D -808730252, = tf_trapno =3D 12, tf_err =3D 0, tf_eip =3D -1068179540, tf_cs =3D 32, tf_= eflags =3D 66050, tf_esp =3D 4380, tf_ss =3D -865631792}) at /usr/src/sys= /i386/i386/trap.c:435 > #6 0xc0691cda in calltrap () at /usr/src/sys/i386/i386/exception.s:139= > #7 0xc054dfac in m_copydata (m=3D0x0, off=3D-1, len=3D1, cp=3D0xcfcbc1= 74 "") at /usr/src/sys/kern/uipc_mbuf.c:543 > #8 0xc05ac60f in tcp_output (tp=3D0xcc6781d0) at /usr/src/sys/netinet/= tcp_output.c:770 > #9 0xc05aaea4 in tcp_input (m=3D0xca16ab00, off0=3D20) at /usr/src/sys= /netinet/tcp_input.c:2471 > #10 0xc05a1bb9 in ip_input (m=3D0xca16ab00) at /usr/src/sys/netinet/ip_= input.c:785 > #11 0xc0590a6f in netisr_processqueue (ni=3D0xc073d7b8) at /usr/src/sys= /net/netisr.c:236 > #12 0xc0590c6a in swi_net (dummy=3D0x0) at /usr/src/sys/net/netisr.c:34= 9 > #13 0xc04ff511 in ithread_execute_handlers (p=3D0xc9279648, ie=3D0xc92c= 3400) at /usr/src/sys/kern/kern_intr.c:682 > #14 0xc04ff61c in ithread_loop (arg=3D0xc92436a0) at /usr/src/sys/kern/= kern_intr.c:765 > #15 0xc04fe498 in fork_exit (callout=3D0xc04ff5c8 <ithread_loop>, arg=3D= 0xc92436a0, frame=3D0xf4817d38) at /usr/src/sys/kern/kern_fork.c:821 > #16 0xc0691d3c in fork_trampoline () at /usr/src/sys/i386/i386/exceptio= n.s:208 This looks like that we have missed a lock here, and presumably this can be avoided by Robert's reference counting changes in -HEAD. Robert, do you have some comments? Cheers, --=20 Xin LI <delphij@delphij.net> http://www.delphij.net/ FreeBSD - The Power to Serve! --------------enigFCF6F9B0E31ED5C3B4D84A6A Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFFZmL0OfuToMruuMARA0sJAJ4+bJx2qlaRDBtzPy1YRRF1dUnklQCfRyim DLg/41v1mo9FYFTEmvChZGM= =+g1C -----END PGP SIGNATURE----- --------------enigFCF6F9B0E31ED5C3B4D84A6A--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?456662F4.6000306>