Date: Thu, 17 Nov 2011 12:57:44 +0200
From: Kostik Belousov <kostikbel@gmail.com>
To: Doug Barton <dougb@freebsd.org>
Cc: Daniil Cherednik <dcherednik@masterhost.ru>, freebsd-apache@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: 8.2 + apache == a LOT of sigprocmask
Message-ID: <20111117105744.GS50300@deviant.kiev.zoral.com.ua>
In-Reply-To: <4EC4D359.4040406@FreeBSD.org>
References: <4EC17AAF.9050807@FreeBSD.org> <4EC17F57.5030008@FreeBSD.org>
 <20111115090745.GO50300@deviant.kiev.zoral.com.ua>
 <20111115100904.GA92795@icarus.home.lan> <4EC4ADC3.2060604@FreeBSD.org>
 <20111117074909.GL50300@deviant.kiev.zoral.com.ua>
 <4EC4BECA.5040705@FreeBSD.org>
 <20111117081210.GN50300@deviant.kiev.zoral.com.ua>
 <4EC4D359.4040406@FreeBSD.org>

On Thu, Nov 17, 2011 at 01:26:49AM -0800, Doug Barton wrote:
> On 11/17/2011 00:12, Kostik Belousov wrote:
> > On Wed, Nov 16, 2011 at 11:59:06PM -0800, Doug Barton wrote:
> >> On 11/16/2011 23:49, Kostik Belousov wrote:
> >>> On Wed, Nov 16, 2011 at 10:46:27PM -0800, Doug Barton wrote:
> >>>> On 11/15/2011 02:09, Jeremy Chadwick wrote:
> >>>>> On Tue, Nov 15, 2011 at 11:07:45AM +0200, Kostik Belousov wrote:
> >>>>>> On Mon, Nov 14, 2011 at 12:51:35PM -0800, Doug Barton wrote:
> >>>>>>> On 11/14/2011 12:31, Doug Barton wrote:
> >>>>>>>> Trying to track down a load problem we're seeing on 8.2-RELEASE-p4 i386
> >>>>>>>> in a busy web hosting environment I came across the following post:
> >>>>>>>>
> >>>>>>>> http://lists.freebsd.org/pipermail/freebsd-questions/2011-October/234520.html
> >>>>>>>>
> >>>>>>>> That basically describes what we're seeing as well, including the
> >>>>>>>> "doesn't happen on Linux" part.
> >>>>>>>>
> >>>>>>>> Does anyone have any ideas about this?
> >>>>>>>>
> >>>>>>>> With incredibly similar stuff running on 7.x we didn't see this problem,
> >>>>>>>> so it seems to be something new in 8.
> >>>>>>>
> >>>>>>> Just took a closer look at our ktrace, and actually our pattern is
> >>>>>>> slightly different than the one in that post. In ours the second argument
> >>>>>>> is null, but the third is set:
> >>>>>>>
> >>>>>>> 74195 httpd 0.000017 RET sigprocmask 0
> >>>>>>> 74195 httpd 0.000013 CALL sigprocmask(SIG_BLOCK,0,0xbfbf89d4)
> >>>>>>> 74195 httpd 0.000009 RET sigprocmask 0
> >>>>>>> 74195 httpd 0.000013 CALL sigprocmask(SIG_BLOCK,0,0xbfbf89d4)
> >>>>>>> 74195 httpd 0.000009 RET sigprocmask 0
> >>>>>>> 74195 httpd 0.000012 CALL sigprocmask(SIG_BLOCK,0,0xbfbf89d4)
> >>>>>>>
> >>>>>>> But repeated hundreds of times in a row.
> >>>>>>
> >>>>>> The calls cannot come from rtld, they are generated by some setjmp()
> >>>>>> invocation. If signal-safety is not needed, sigsetjmp() should be used
> >>>>>> instead.
> >>>>>>
> >>>>>> Quick grep of the apache httpd source shows a single setjmp() in their
> >>>>>> copy of pcre. No idea whether it is safe to change setjmp() into
> >>>>>> sigsetjmp(?, 0).
> >>>>>
> >>>>> I hate cross-posting, but: adding freebsd-apache@ to the list. Some of
> >>>>> the Apache folks (not just port committers) may have some insight to
> >>>>> Kostik's findings.
> >>>>
> >>>> Thanks to everyone for the responses. We tried Kostik's suggestion and
> >>>> unfortunately it didn't reduce the number of sigprocmask() calls to a
> >>>> statistically significant degree.
> >>>>
> >>>> Does anyone have any other ideas on ways to debug this? We're sort of
> >>>> running out of things to test. :-/
> >>>>
> >>>> Given how important (and prevalent) the Apache + FreeBSD combination is,
> >>>> I'm kind of disturbed that we're seeing this performance problem, and if
> >>>> it's something in 8.x that's also in 9.x, it would be better to fix it
> >>>> prior to 9.0-RELEASE.
> >>>
> >>> Since my guess appeared to be not useful,
> >>
> >> Well I wouldn't say that they weren't useful, we eliminated the obvious
> >> candidate. So, "not good news" certainly, but not unhelpful. :)
> >>
> >>> the way forward is to identify
> >>> the location of the call(s) that cause the issue. I suggest compiling
> >>> at least apache itself, libc, rtld and libthr (if used) with debugging
> >>> information. Then, attach to the running apache worker with the gdb and
> > Note this part.
>
> Right, we attached to a worker, that's why it's in accept(). :)
>
> > It seems your libc has no debugging information.
> > accept() is the pure syscall wrapper, it cannot call sigprocmask.
> > If gdb caught the PLT trampoline instead of real accept(), we would
> > see the rtld frames. So install libc, libthr and rtld with debug.
>
> It's not catching there though:
>
> Reading symbols from /libexec/ld-elf.so.1...done.
> Loaded symbols for /libexec/ld-elf.so.1
> 0x28183b2d in accept () at accept.S:3
> 3	RSYSCALL(accept)
> (gdb) c
> Continuing.
> no thread to satisfy query
> 0x28183b2d in accept () at accept.S:3
> 3	RSYSCALL(accept)
> (gdb) info threads
> Cannot get thread info: invalid key
> (gdb)

Err, the other part of my message was that you should set a breakpoint
on sigprocmask. I want to see the backtrace from the breakpoint hit,
taken several times. The backtrace taken at attach time is of no use.