From: Linda Messerschmidt
To: Julian Elischer
Cc: freebsd-hackers@freebsd.org
Date: Thu, 10 Sep 2009 15:07:35 -0400
Subject: Re: Intermittent system hangs on 7.2-RELEASE-p1

On Thu, Sep 10, 2009 at 2:46 PM, Julian Elischer wrote:
> I've noticed that schedgraph tends to show the idle threads slightly
> skewed one way or the other.  I think there is a cumulative rounding
> error in the way they are drawn due to the fact that they are run so
> often.  Check the raw data and I think you will find that you just
> need to imagine the idle threads slightly to the left or right a bit.

No, there's no period anywhere in the trace where either idle thread didn't run for an entire second. I'm pretty sure schedgraph is throwing in some nonsense results.

I did capture a second, larger, dataset after a 2.1s stall, and schedgraph includes an httpd process that supposedly spent 58 seconds on the run queue. I don't know if it's a dropped record or a parsing error or what.

I do think on this second graph I can kind of see the *end* of the stall, because all of a sudden a ton of processes... everything from sshd to httpd to gmond to sh to vnlru to bufdaemon to fdc0... comes off of whatever it's waiting on and hits the run queue. The combined run queues for both processors spike up to 32 tasks at one point and then rapidly tail off as things return to normal.

That pretty much matches the behavior shown by ktrace in my initial post, where everything goes to sleep on something-or-other in the kernel, and then at the end of the stall, everything wakes up at the same time.
I think this means the problem is somehow related to locking, rather than scheduling.
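
For anyone who wants to repeat the raw-data check on the idle threads, it can be sketched roughly as below. This is only a minimal sketch: it assumes the trace has already been reduced to (timestamp in seconds, thread name) pairs, one per time a thread starts running. That record format, the function names, and the sample values are all made up for illustration; none of it is what ktrdump or schedgraph actually emit.

```python
from collections import defaultdict


def max_gaps(events):
    """Return {thread name: largest gap in seconds between consecutive runs}.

    `events` is an iterable of (timestamp_seconds, thread_name) pairs.
    This input format is an assumption, not a real trace format.
    """
    by_thread = defaultdict(list)
    for ts, name in events:
        by_thread[name].append(ts)

    gaps = {}
    for name, stamps in by_thread.items():
        stamps.sort()
        # Largest distance between neighbouring timestamps; 0.0 if the
        # thread only appears once.
        gaps[name] = max((b - a for a, b in zip(stamps, stamps[1:])), default=0.0)
    return gaps


def stalled(events, threshold=1.0):
    """Names of threads that went at least `threshold` seconds without running."""
    return sorted(name for name, gap in max_gaps(events).items() if gap >= threshold)


# Made-up sample data, purely illustrative.
sample = [
    (0.00, "idle: cpu0"), (0.25, "idle: cpu0"), (0.50, "idle: cpu0"), (0.75, "idle: cpu0"),
    (0.00, "idle: cpu1"), (0.50, "idle: cpu1"),
    (0.10, "httpd"), (2.20, "httpd"),
]

if __name__ == "__main__":
    for name in stalled(sample):
        print(name, "did not run for at least 1.0s")
```

If neither idle thread shows up in the `stalled()` result for the real data, the one-second holes that schedgraph draws for them are an artifact of the graphing and not something in the trace.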