From owner-freebsd-hackers@FreeBSD.ORG Sat Feb 5 18:44:41 2011 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 045961065670 for ; Sat, 5 Feb 2011 18:44:41 +0000 (UTC) (envelope-from cvs-src@yandex.ru) Received: from forward13.mail.yandex.net (forward13.mail.yandex.net [95.108.130.120]) by mx1.freebsd.org (Postfix) with ESMTP id 80B0C8FC0A for ; Sat, 5 Feb 2011 18:44:40 +0000 (UTC) Received: from smtp11.mail.yandex.net (smtp11.mail.yandex.net [95.108.130.67]) by forward13.mail.yandex.net (Yandex) with ESMTP id 92F3C1080305; Sat, 5 Feb 2011 21:44:38 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail; t=1296931478; bh=7CUCDInMNZUAcE15Fn7cPPeD//sYl+oVbxZXR60LqBo=; h=Message-ID:Date:From:MIME-Version:To:CC:Subject:References: In-Reply-To:Content-Type:Content-Transfer-Encoding; b=dqkT25JKDDvgrTmWEl7nSnbb4Dg+ublb+43zEevwjSd2wdk9GMTj4xYycz0kK3nXK Ah6BmZ7koowEdxKBvFeAS/alh54iPK6Y5bW647K/0XoIcAK/DuEQceemfc6ONi7QOi g84RoiWTj9y0sNkEuqszpdobaDlR+1QUvuaSelnw= Received: from smeshariki2.local (unknown [213.138.85.225]) by smtp11.mail.yandex.net (Yandex) with ESMTPSA id 43D144CC00B8; Sat, 5 Feb 2011 21:44:38 +0300 (MSK) Message-ID: <4D4D9A4D.2070508@yandex.ru> Date: Sat, 05 Feb 2011 21:43:25 +0300 From: Ruslan Mahmatkhanov User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; ru-RU; rv:1.9.2.13) Gecko/20110106 Thunderbird/3.1.7 MIME-Version: 1.0 To: Ivan Voras References: <4D3CB2AF.9050003@yandex.ru> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Cc: freebsd-hackers@freebsd.org Subject: Re: Tracking down a problem with php on FreeBSD X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 05 Feb 2011 18:44:41 -0000 Hi, Ivan! Thank you much for response and sorry for late answer. We was able to collect some data about the issue to make discussion more objective. See below. 24.01.2011 16:54, Ivan Voras пишет: > On 23.1.2011 23:58, Ruslan Mahmatkhanov wrote: >> >> Good day! >> >> We are using custom php application on FreeBSD 8.1R amd64. It is started >> with php-fpm 5.3.3 from ports as backend and nginx 0.8.54 as frontend. >> Several times per day this app is making self unavailable. > > I think it would be more appropriate to ask this on the stable@ list. I believe that this issue is not related to stable only and problem resolution will involve some debugging and stuff, so this was the reason why i post this to hackers@. > >> Simple php-fpm restart solves the problem, but i need to track it down >> to the cause of this situation and ask for your assistance and >> instructions on how to debug it. Some facts about this: > > On one hand, FPM is said to be very experimental... > > Personally, I've been using apache22-worker or apache22-event + > mod_fcgid for years without trouble. We prefer to avoid using apache at all, because in this it's just adds yet another unneeded link and complexity. > It looks very application-specific, possibly not really an OS problem > (or maybe a problem of different expectations from the OS when porting > from Linux). > >> - `top -mio` shows very high (80000-90000 for VCSW) VCSW/IVCSW values >> for php-fpm processes and LA is more than 120 > > How many "real" user request are in these 120? Do any users at the time > of problem (this doesn't look like a "crash") receive valid responses? This problem is seen on development server too with < 5 users online. And nobody receives valid response, they all get this message: Maximum execution time of 30 seconds exceeded in Unknown on line 0. > >> - user seeing http 502 error code in browser >> - php-fpm log has many of this lines in time of crash: >> Jan 23 17:56:58.176425 [WARNING] [pool world] server reached >> max_children setting (100), consider raising it > > Did you try raising it? Does the error happen ONLY when this limit is > reached? Yes, we did. Nothing changes. I'm not sure about second question, i believe it's not related. > These are some very varied errors, not especially consistent with each > other. > > Did you try some generic socket & TCP tuning like described in > http://serverfault.com/questions/64356/freebsd-performance-tuning-sysctls-loader-conf-kernel > ? Yes we did some on initial stage when server was built. And recently tried to raise kern.ipc.somaxconn, but without success. There is full current sysctl.conf: security.bsd.see_other_uids = 0 security.bsd.see_other_gids = 0 security.bsd.unprivileged_proc_debug=0 security.bsd.unprivileged_read_msgbuf=0 security.bsd.conservative_signals=1 security.bsd.hardlink_check_uid=1 security.bsd.hardlink_check_gid=1 vfs.usermount=0 net.inet.tcp.drop_synfin=1 net.inet.icmp.drop_redirect=1 net.inet.icmp.log_redirect=1 net.inet.icmp.bmcastecho=0 net.inet.icmp.maskrepl=0 net.inet.icmp.icmplim=100 net.link.ether.inet.max_age=800 net.inet.ip.redirect=0 net.inet6.ip6.redirect=0 net.inet.ip.sourceroute=0 net.inet.ip.accept_sourceroute=0 net.inet.tcp.blackhole=2 net.inet.udp.blackhole=1 kern.ps_showallprocs=0 net.inet.ip.random_id=1 kern.corefile=/tmp/%U.%N.%P.core > Other than that, you will probably have to debug the php-fpm processes. > Start by observing in which state they are (top without "-mio"). If the > processes are blocking, try "procstat -k " on them. Thay are not in blocking state: PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND 39405 www 1 101 0 108M 14832K RUN 1 0:54 25.39% php-fpm 39544 www 1 101 0 108M 14832K CPU1 1 0:42 25.39% php-fpm 39395 www 1 101 0 108M 14828K CPU1 1 0:57 23.39% php-fpm 39401 www 1 101 0 108M 14832K CPU3 3 0:55 23.19% php-fpm 39481 www 1 101 0 108M 14832K RUN 3 0:55 23.10% php-fpm 13024 www 1 101 0 112M 47156K RUN 0 1:30 23.00% php-fpm 39398 www 1 4 0 108M 14832K accept 0 1:00 13.48% php-fpm 39373 www 1 4 0 108M 14832K accept 0 1:00 10.60% php-fpm 39385 www 1 4 0 108M 14832K accept 1 1:00 9.28% php-fpm 22130 www 1 4 0 111M 40200K accept 1 1:32 7.57% php-fpm There is procstat -k/-t for couple of processes from this list: PID TID COMM TDNAME KSTACK 39405 100110 php-fpm - dmapbase PID TID COMM TDNAME CPU PRI STATE WCHAN 39405 100110 php-fpm - 1 185 run - PID TID COMM TDNAME KSTACK 39398 100154 php-fpm - tdq_cpu PID TID COMM TDNAME CPU PRI STATE WCHAN 39398 100154 php-fpm - 0 88 sleep accept When attaching to any hanging php-fpm proccess with truss, than i see a lot of this calls: sched_yield(0x80516c000,0x1,0x4d4828b6,0x8012ef45c,0xffffffff808bfd80,0x7fffffffa078) = 0 (0x0) sched_yield(0x80516c000,0x1,0x4d4828b6,0x8012ef45c,0xffffffff808bfd80,0x7fffffffa078) = 0 (0x0) sched_yield(0x80516c000,0x1,0x4d4828b6,0x8012ef45c,0xffffffff808bfd80,0x7fffffffa078) = 0 (0x0) sched_yield(0x80516c000,0x1,0x4d4828b6,0x8012ef45c,0xffffffff808bfd80,0x7fffffffa078) = 0 (0x0) Thank you in advance for any tips. -- Regards, Ruslan