From: Brandon Gooch
To: Alexander Motin
Cc: freebsd-hackers@freebsd.org, FreeBSD-Current
Date: Wed, 1 Sep 2010 12:53:04 -0500
Subject: Re: One-shot-oriented event timers management

On Wed, Sep 1, 2010 at 5:44 AM, Alexander Motin wrote:
> Alexander Motin wrote:
>> Gary Jennejohn wrote:
>>> On Mon, 30 Aug 2010 12:11:48 +0200
>>> OK, this is purely anecdotal, but I'll report it anyway.
>>>
>>> I was running pretty much all day with the patched kernel and things
>>> seemed to be working quite well.
>>>
>>> Then, after about 7 hours, everything just stopped.
>>>
>>> I had gkrellm running and noticed that it updated only when I moved the
>>> mouse.
>>>
>>> This behavior leads me to suspect that the timer interrupts had stopped
>>> working and the mouse interrupts were causing processes to get scheduled.
>>>
>>> Unfortunately, I wasn't able to get a dump and had to hit reset to
>>> recover.
>>>
>>> As I wrote above, this is only anecdotal, but I've never seen anything
>>> like this before applying the patches.
>>
>> One-shot timers have one weak side: if for some reason timer interrupt
>> getting lost -- there will be nobody to reload the timer. Such cases
>> probably will require special attention. Same funny situation with
>> mouse-driven scheduler happens also if LAPIC timer dies when pre-Core-iX
>> CPU goes to C3 state.
>
> I have reproduced the problem locally. It happens more often when ticks
> are not stopped on idle, like in your original case (or if explicitly
> enabled by kern.eventtimer.idletick sysctl).
>
> I've made some changes to HPET driver, which, I hope, should fix
> interrupt losses there.
>
> Updated patch: http://people.freebsd.org/~mav/timers_oneshot6.patch
>
> Patch also includes some optimizations to reduce lock contention.
>
> Thanks for testing.

This latest patch causes an interrupt storm with the HPET timer on my
system. The machine took about 8 minutes to boot and bring me to a
login prompt.

System interactivity (i.e. input from keyboard, output on console) was
fine, but after checking the output of `systat vmstat -1`, I saw the
interrupt rate on each HPET entry was over 120k!

Can I provide any useful detail? Of course, test patches are always
welcome :)

-Brandon