From owner-freebsd-hackers@FreeBSD.ORG Mon Aug 15 13:31:08 2011
Date: Mon, 15 Aug 2011 09:31:07 -0400
From: Joe Schaefer
To: Andriy Gapon
Cc: freebsd-hackers
Subject: Re: Clock stalls on Sabertooth 990FX
In-Reply-To: <4E4911F1.9030808@FreeBSD.org>
References: <4E4911F1.9030808@FreeBSD.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
List-Id: Technical Discussions relating to FreeBSD

On Mon, Aug 15, 2011 at 8:32 AM, Andriy Gapon wrote:
> on 13/08/2011 20:16 Joe Schaefer said the following:
>> Brand new machine with a Phenom II X6 1100T and under chronic load
>> the clock will stop running periodically until the machine eventually
>> completely freezes.  Note: during these stalls the kernel is still
>> running, the machine is still mostly responsive, it's just that the
>> clock is frozen in time.
>>
>> I've disabled Turbo mode in the bios and toyed with just about every
>> other setting but nothing seems to resolve this problem.  Based on the
>> behavior of the machine (just making buildworld will eventually kill it,
>> upping the -j flag just kills it faster), I'm guessing it has something
>> to do with the Digi+ VRM features but again nothing I've tried modifying
>> in the bios seems to help.
>>
>> I've tried both 8.2-RELEASE and FreeBSD 9 (head).  Running head now with
>> a dtrace enabled kernel.
>>
>> Suggestions?
>
> On head, start with checking what source is used for driving clocks:
> sysctl kern.eventtimer

% sysctl kern.eventtimer                [master]
kern.eventtimer.choice: HPET(450) HPET1(450) HPET2(450) LAPIC(400) i8254(100) RTC(0)
kern.eventtimer.et.LAPIC.flags: 15
kern.eventtimer.et.LAPIC.frequency: 0
kern.eventtimer.et.LAPIC.quality: 400
kern.eventtimer.et.HPET.flags: 3
kern.eventtimer.et.HPET.frequency: 14318180
kern.eventtimer.et.HPET.quality: 450
kern.eventtimer.et.HPET1.flags: 3
kern.eventtimer.et.HPET1.frequency: 14318180
kern.eventtimer.et.HPET1.quality: 450
kern.eventtimer.et.HPET2.flags: 3
kern.eventtimer.et.HPET2.frequency: 14318180
kern.eventtimer.et.HPET2.quality: 450
kern.eventtimer.et.i8254.flags: 1
kern.eventtimer.et.i8254.frequency: 1193182
kern.eventtimer.et.i8254.quality: 100
kern.eventtimer.et.RTC.flags: 17
kern.eventtimer.et.RTC.frequency: 32768
kern.eventtimer.et.RTC.quality: 0
kern.eventtimer.periodic: 0
kern.eventtimer.timer: HPET
kern.eventtimer.idletick: 0
kern.eventtimer.singlemul: 2

>
> When the problem starts using vmstat -i to check interrupt rates and see if any
> relevant counter gets stuck.

(during a buildworld run):

joe@sextant:~% vmstat -i                [master]
interrupt                  total    rate
irq16: hdac2                  39       0
irq17: ehci0 ehci1+            2       0
irq18: ohci0 ohci1*        56943       1
irq19: ahci0             1004414      24
irq22: fwohci0            653499      16
irq46: atapci1             60047       1
irq256: hpet0:t0         8309347     205
irq259: hdac0                  1       0
irq260: hdac1                  1       0
irq261: re0                93596       2
Total                   10177889     251
joe@sextant:~% vmstat -i                [master]
interrupt                  total    rate
irq16: hdac2                  39       0
irq17: ehci0 ehci1+            2       0
irq18: ohci0 ohci1*        57019       1
irq19: ahci0             1009467      24
irq22: fwohci0            653921      16
irq46: atapci1             60146       1
irq256: hpet0:t0         8381321     207
irq259: hdac0                  1       0
irq260: hdac1                  1       0
irq261: re0                93694       2
Total                   10255611     253
joe@sextant:~% date                     [master]
Mon Aug 15 09:18:25 EDT 2011
joe@sextant:~% date                     [master]
Mon Aug 15 09:18:27 EDT 2011
joe@sextant:~% vmstat -i                [master]
interrupt                  total    rate
irq16: hdac2                  39       0
irq17: ehci0 ehci1+            2       0
irq18: ohci0 ohci1*        57410       1
irq19: ahci0             1019054      25
irq22: fwohci0            654275      16
irq46: atapci1             60230       1
irq256: hpet0:t0         8438249     208
irq259: hdac0                  1       0
irq260: hdac1                  1       0
irq261: re0                93835       2
Total                   10323096     254
joe@sextant:~% date                     [master]
Mon Aug 15 09:19:41 EDT 2011
joe@sextant:~% date                     [master]
Mon Aug 15 09:19:41 EDT 2011
joe@sextant:~% vmstat -i                [master]
interrupt                  total    rate
irq16: hdac2                  39       0
irq17: ehci0 ehci1+            2       0
irq18: ohci0 ohci1*        57432       1
irq19: ahci0             1019054      25
irq22: fwohci0            654275      16
irq46: atapci1             60230       1
irq256: hpet0:t0         8438249     208
irq259: hdac0                  1       0
irq260: hdac1                  1       0
irq261: re0                93852       2
Total                   10323135     254
joe@sextant:~% vmstat -i                [master]
interrupt                  total    rate
irq16: hdac2                  39       0
irq17: ehci0 ehci1+            2       0
irq18: ohci0 ohci1*        57436       1
irq19: ahci0             1019054      25
irq22: fwohci0            654275      16
irq46: atapci1             60230       1
irq256: hpet0:t0         8438249     208
irq259: hdac0                  1       0
irq260: hdac1                  1       0
irq261: re0                93866       2
Total                   10323153     254
joe@sextant:~% date                     [master]
Mon Aug 15 09:19:41 EDT 2011
joe@sextant:~% date                     [master]
Mon Aug 15 09:24:16 EDT 2011
joe@sextant:~% date                     [master]
Mon Aug 15 09:24:16 EDT 2011
joe@sextant:~% vmstat -i                [master]
interrupt                  total    rate
irq16: hdac2                  39       0
irq17: ehci0 ehci1+            2       0
irq18: ohci0 ohci1*        59317       1
irq19: ahci0             1020250      24
irq22: fwohci0            654352      16
irq46: atapci1             60248       1
irq256: hpet0:t0         8440763     206
irq259: hdac0                  1       0
irq260: hdac1                  1       0
irq261: re0                94258       2
Total                   10329231     252
joe@sextant:~% vmstat -i                [master]
interrupt                  total    rate
irq16: hdac2                  39       0
irq17: ehci0 ehci1+            2       0
irq18: ohci0 ohci1*        59330       1
irq19: ahci0             1020471      24
irq22: fwohci0            654411      16
irq46: atapci1             60263       1
irq256: hpet0:t0         8442455     206
irq259: hdac0                  1       0
irq260: hdac1                  1       0
irq261: re0                94325       2
Total                   10331298     252
joe@sextant:~% date                     [master]
Mon Aug 15 09:24:33 EDT 2011
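
Comparing the snapshots by eye is error-prone, so here is a rough Python
sketch (not something posted in the thread) that diffs two "vmstat -i"
outputs and lists the counters that did not advance. The function names
parse_vmstat_i and stuck_counters, and the min_total cutoff that hides
rarely firing sources such as hdac0, are my own assumptions for
illustration. The sample data is the pair of snapshots taken while date
was reporting 09:19:41.

```python
"""Diff two `vmstat -i` snapshots and report counters that did not advance."""

# Two consecutive snapshots copied from the session above, both taken
# while `date` was stuck at 09:19:41.
BEFORE = """\
interrupt total rate
irq16: hdac2 39 0
irq17: ehci0 ehci1+ 2 0
irq18: ohci0 ohci1* 57432 1
irq19: ahci0 1019054 25
irq22: fwohci0 654275 16
irq46: atapci1 60230 1
irq256: hpet0:t0 8438249 208
irq259: hdac0 1 0
irq260: hdac1 1 0
irq261: re0 93852 2
Total 10323135 254
"""

AFTER = """\
interrupt total rate
irq16: hdac2 39 0
irq17: ehci0 ehci1+ 2 0
irq18: ohci0 ohci1* 57436 1
irq19: ahci0 1019054 25
irq22: fwohci0 654275 16
irq46: atapci1 60230 1
irq256: hpet0:t0 8438249 208
irq259: hdac0 1 0
irq260: hdac1 1 0
irq261: re0 93866 2
Total 10323153 254
"""


def parse_vmstat_i(text):
    """Return {interrupt name: total count} for one `vmstat -i` output."""
    counts = {}
    for line in text.splitlines():
        parts = line.split()
        # A data row ends with two integers (total, rate); this skips the
        # header row and blank lines.
        if len(parts) < 3 or not (parts[-2].isdigit() and parts[-1].isdigit()):
            continue
        name = " ".join(parts[:-2])
        counts[name] = int(parts[-2])
    return counts


def stuck_counters(before, after, min_total=1000):
    """Names whose total is identical in both snapshots.

    min_total is an arbitrary (assumed) cutoff so that sources which
    almost never fire are not reported as stuck.
    """
    return sorted(
        name
        for name, total in after.items()
        if total >= min_total and before.get(name) == total
    )


if __name__ == "__main__":
    for name in stuck_counters(parse_vmstat_i(BEFORE), parse_vmstat_i(AFTER)):
        print("no new interrupts:", name)
```

For this pair it reports irq19 (ahci0), irq22 (fwohci0), irq256
(hpet0:t0) and irq46 (atapci1) as not advancing, while irq18 and irq261
still count up. A counter that is merely idle looks the same as a stuck
one here, so the result is only a hint about where to look.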