From: Linda Messerschmidt
To: Julian Elischer
Cc: freebsd-hackers@freebsd.org
Date: Sat, 12 Sep 2009 03:52:22 -0400
Subject: Re: Intermittent system hangs on 7.2-RELEASE-p1

OK, first, I figured out the seven second thing. I actually had already found that particular issue earlier in the troubleshooting process, but forgot all about it when I pulled in a second machine to test with. It was simply a case of setting Apache's MaxRequestsPerChild to a very low value (128) in combination with only allowing 1 access at a time. 128 requests * (50ms sleep + 2ms request + overhead) ~= 7s.

So that was just noise masking the real problem, which is less frequent and less predictable. Sorry for the red herring. :(

On Sat, Sep 12, 2009 at 2:52 AM, Linda Messerschmidt wrote:
> If you're asking could the check script be modified to time out after,
> say, 1 second, and if so, would it return during the hang or after it?
> I don't know.  My guess based on the earlier ktrace output is that it
> would time out, but not return until the hang ended.  I'll see if I
> the curl lib exposes a configurable timeout and try it.

This proved to be quite easy to do. I ran the script twice, once with the timeout and once without.
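For readers who want to reproduce the test, a check loop of this kind could be sketched roughly as follows. This is not the original script, which used a curl library. It is a Python illustration built on urllib, and the function names, the 100 ms logging threshold, and the injectable `request`/`sleep`/`now` hooks are all assumptions made for the example.

```python
import time
import urllib.request


def timed_request(url, timeout_s=1.0):
    """Fetch url once; return (elapsed_ms, ok).

    Note: urllib's timeout applies to each blocking socket operation,
    not to the request as a whole, so it only approximates a hard limit.
    """
    start = time.monotonic()
    ok = True
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            resp.read()
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        ok = False
    elapsed_ms = int((time.monotonic() - start) * 1000)
    return elapsed_ms, ok


def check_loop(url, count, timeout_s=1.0, sleep_s=0.05, slow_ms=100,
               request=timed_request, sleep=time.sleep, now=time.time):
    """Issue `count` requests with a pause between them.

    Returns one log line per slow or failed request, in the
    "epoch: request N Xms" format. slow_ms is an assumed threshold.
    """
    lines = []
    for n in range(1, count + 1):
        elapsed_ms, ok = request(url, timeout_s)
        if elapsed_ms >= slow_ms or not ok:
            note = "" if ok else " (<--- Timeout)"
            lines.append(f"{int(now())}: request {n} {elapsed_ms}ms{note}")
        sleep(sleep_s)  # the 50ms sleep between requests
    return lines
```

Calling `check_loop("http://localhost/", 3000)` against a test server (the URL is a placeholder) would return only the lines for requests that were slow or failed.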
Without timeout:
1252741492: request 910 101ms
1252741567: request 2133 1429ms
1252741603: request 2722 146ms

With 1s timeout:
1252741492: request 1078 106ms
1252741567: request 2302 1010ms (<--- Timeout)
1252741567: request 2303 273ms (<--- after 50ms sleep, goes back to end of stall)
1252741603: request 2892 136ms

As you can see, the two scripts experience stalls in pretty much lockstep, but the script itself does not appear affected, so it's just on the Apache side.
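The lockstep claim can be checked mechanically by comparing the timestamps at which each run logged a slow request. The following is a small sketch of that comparison in Python, using the two logs quoted above. The regular expression, the function names, and the 100 ms threshold are assumptions for illustration, not part of the original script.

```python
import re

# Matches lines such as "1252741567: request 2302 1010ms (<--- Timeout)".
LINE_RE = re.compile(r"^(\d+): request (\d+) (\d+)ms")

WITHOUT_TIMEOUT = """\
1252741492: request 910 101ms
1252741567: request 2133 1429ms
1252741603: request 2722 146ms
"""

WITH_TIMEOUT = """\
1252741492: request 1078 106ms
1252741567: request 2302 1010ms (<--- Timeout)
1252741567: request 2303 273ms (<--- after 50ms sleep, goes back to end of stall)
1252741603: request 2892 136ms
"""


def parse(log_text):
    """Return a list of (epoch, request_number, elapsed_ms) tuples."""
    rows = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            rows.append(tuple(int(g) for g in m.groups()))
    return rows


def slow_times(rows, slow_ms=100):
    """Set of epoch seconds in which at least one request took >= slow_ms."""
    return {ts for ts, _n, ms in rows if ms >= slow_ms}


a = slow_times(parse(WITHOUT_TIMEOUT))
b = slow_times(parse(WITH_TIMEOUT))
print(sorted(a & b))  # seconds in which both runs saw a slow request
print(a == b)         # True when the two runs stalled in the same seconds
```

For the logs above, both runs report slow requests in the same three seconds, so the intersection contains all three timestamps and the two sets are equal.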