From owner-cvs-all Mon Sep 16 3:12:10 2002
Delivered-To: cvs-all@freebsd.org
Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 7B48F37B400;
	Mon, 16 Sep 2002 03:12:07 -0700 (PDT)
Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 73DEF43E4A;
	Mon, 16 Sep 2002 03:12:06 -0700 (PDT)
	(envelope-from phk@critter.freebsd.dk)
Received: from critter.freebsd.dk (localhost [127.0.0.1])
	by critter.freebsd.dk (8.12.6/8.12.6) with ESMTP id g8GAC10B013646;
	Mon, 16 Sep 2002 12:12:02 +0200 (CEST)
	(envelope-from phk@critter.freebsd.dk)
To: Bruce Evans
Cc: Josef Karthauser, "David O'Brien",
	cvs-committers@FreeBSD.org, cvs-all@FreeBSD.org
Subject: Re: cvs commit: src/sys/kern kern_timeout.c
In-Reply-To: Your message of "Mon, 16 Sep 2002 20:07:40 +1000."
	<20020916195344.S6593-100000@gamplex.bde.org>
Date: Mon, 16 Sep 2002 12:12:01 +0200
Message-ID: <13645.1032171121@critter.freebsd.dk>
From: Poul-Henning Kamp
Sender: owner-cvs-all@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

In message <20020916195344.S6593-100000@gamplex.bde.org>, Bruce Evans writes:

>> > Sep 15 19:40:26 kernel: Expensive timeout(9) function: 0xc65dfc80(0xc659f000) 0.008711721
>> > Sep 15 19:40:26 kernel: Expensive timeout(9) function: 0xc65dfc80(0xc659f000) 0.001068850
>
>This is hard to interpret without the function names (or a full stack
>trace).

Yes, but as far as I'm aware, we still don't have a print_backtrace(9)?

>I get these on an old dual Celeron system mainly for fxp_tick()
>and uma_timeout().

uma_timeout() seems to trigger on practically all systems.  I've
already talked with Jeff about it.

>fxp_tick() almost always causes one under load,
>with an average of about 2 msec and a maximum of about 8 msec.

I have seen other ${netcard}_tick() functions do the same.
I wonder if the 1 second timers in network drivers should be moved
from timeout(9) to a dedicated kernel thread doing just that: polling
the network drivers once per second.

>8 msec is a huge bite out of the 10 msec between softclock for the
>default HZ, and huger for larger HZ.  softclock() expects to run
>multiple timeout routines and shouldn't take more than 1% of the
>CPU(s) altogether, but 8 msec per second is already 0.8% of 1 CPU
>for just 1 device.

But couldn't some of this be due to evil DELAY() calls or locking in
the timeout routine?  In that case it's the duration, not the CPU
usage, which is 0.8%.

>I think most of the time is spent in mii_tick(), and the overhead is
>not much larger than in RELENG_4.

Yes, I've identified mii_tick() as well.  Wild Bill Ethernet pointed
out the parallel issue some time ago: MIIs should be run
interrupt-driven if at all possible, so as not to steal bandwidth from
the actual data.

>In theory, running with Giant held is no worse than running at
>splsoftclock() in RELENG_4, but in practice most interrupt handlers
>wait for Giant, so the low priority fxp_tick() timeout routine blocks
>many high priority interrupts.

Which is why I added this diagnostic in the first place: my interrupt
latency measurements pointed to timeout(9) users as the worst
offender(s).

I agree that the prints are annoying, but right now I think they
provide primary-quality data, so I'd rather not hide them behind an
option yet.  If somebody wants to write some rate-limiting thing,
they're more than welcome.

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe cvs-all" in the body of the message