Date: Thu, 6 Mar 2008 23:43:31 +0700 (KRAT)
From: Eugene Grosbein
To: FreeBSD-gnats-submit@FreeBSD.org
X-Send-Pr-Version: 3.113
Subject: kern/121433: [cpufreq] kern_cpu.c's logic error leads to spontaneous disabling of passive cooling

>Number:         121433
>Category:       kern
>Synopsis:       [cpufreq] kern_cpu.c's logic error leads to spontaneous disabling of passive cooling
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Mar 06 17:00:03 UTC 2008
>Closed-Date:
>Last-Modified:
>Originator:     Eugene Grosbein
>Release:        FreeBSD 6.3-PRERELEASE i386
>Organization:
Svyaz-Service JSC
>Environment:
System: FreeBSD 6.3-PRERELEASE, Pentium-4 2.0GHz

>Description:
I have a 1U uniprocessor FreeBSD 6.3-PRERELEASE server with inadequate
active cooling, which leads to CPU overheating. The server is remote, and
while better cooling is being prepared, I decided to use the passive
cooling feature of acpi_thermal(4). It uses p4tcc here and really helps
to keep the CPU temperature in bounds, but there is an annoying bug:
very often (many times per hour) acpi_thermal(4) disables passive cooling
with the message:

	failed to set new freq, disabling passive cooling

So I need to use cron to (re)enable passive cooling once a minute
to keep it running.

I've tracked this down to src/sys/kern/kern_cpu.c, function cf_get_method():

1) src/sys/dev/acpica/acpi_thermal.c, function acpi_tz_cooling_thread()
calls acpi_tz_cpufreq_update() from the same file;

2) acpi_tz_cpufreq_update() calls CPUFREQ_GET(), which takes us
to src/sys/kern/kern_cpu.c, cf_get_method();

3) cf_get_method() has the following code:

	/*
	 * Reacquire the lock and search for the given level.
	 *
	 * XXX Note: this is not quite right since we really need to go
	 * through each level and compare both absolute and relative
	 * settings for each driver in the system before making a match.
	 * The estimation code below catches this case though.
	 */
	CF_MTX_LOCK(&sc->lock);
	for (n = 0; n < numdevs && curr_set->freq == CPUFREQ_VAL_UNKNOWN; n++) {
		if (!device_is_attached(devs[n]))
			continue;
		error = CPUFREQ_DRV_GET(devs[n], &set);
		if (error)
			continue;
		for (i = 0; i < count; i++) {
			if (CPUFREQ_CMP(set.freq, levels[i].total_set.freq)) {
				sc->curr_level = levels[i];
				break;
			}
		}
	}

Note that the error value is not cleared after this loop: if the last
driver queried fails, its error code survives the loop. In my case it
happens to be ENXIO. The later estimation code then succeeds and reports:

	CF_DEBUG("get estimated freq %d\n", curr_set->freq);

(curr_set->freq always happens to be the maximum CPU frequency here.)
Then the function does 'return (error);' with the stale ENXIO value
propagated from the loop shown above, even though a frequency was found.

4) acpi_tz_cpufreq_update() propagates ENXIO to acpi_tz_cooling_thread(),
which disables passive cooling.

>How-To-Repeat:
Just use a uniprocessor Pentium-4 system with heavy constant CPU load
and acpi_thermal/cpufreq/p4tcc, and tune acpi_thermal so that passive
cooling gets used. Here is my /etc/sysctl.conf:

	debug.cpufreq.lowest=1246
	#debug.cpufreq.verbose=1
	hw.acpi.thermal.user_override=1
	hw.acpi.thermal.tz0.passive_cooling=1
	hw.acpi.thermal.tz0._PSV=70C
	hw.acpi.thermal.tz0._CRT=75C

>Fix:
Unknown. Perhaps just clear the error variable after the loop cited above?

As a workaround, I've patched acpi_thermal(4) to not disable passive
cooling when acpi_tz_cpufreq_update() returns ENXIO; that works for me.

Eugene Grosbein
>Release-Note:
>Audit-Trail:
>Unformatted:
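
The stale-error pattern described under >Description can be shown outside the kernel. The following is only a user-space sketch under stated assumptions, not the kern_cpu.c code and not a tested patch: drv_get_fn, FREQ_UNKNOWN, drv_unknown(), drv_unsupported(), get_freq_buggy() and get_freq_fixed() are all hypothetical names invented for illustration. The "buggy" variant reuses one error variable for every driver query and returns it after the estimation fallback; the "fixed" variant keeps the per-driver error local and reports success whenever a frequency was obtained.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for CPUFREQ_VAL_UNKNOWN. */
#define FREQ_UNKNOWN (-1)

/* Hypothetical stand-in for a driver's "get current setting" method. */
typedef int (*drv_get_fn)(int *freq);

/* A driver that answers, but does not know the current frequency. */
static int drv_unknown(int *freq) { *freq = FREQ_UNKNOWN; return 0; }

/* A driver that cannot answer at all (as in the report: ENXIO). */
static int drv_unsupported(int *freq) { (void)freq; return ENXIO; }

/*
 * Mirrors the reported logic: the last driver's error code survives
 * the loop and is returned even though the estimate succeeded.
 */
static int
get_freq_buggy(const drv_get_fn *drvs, size_t n, int estimate, int *out)
{
	int error = 0, freq = FREQ_UNKNOWN, f;
	size_t i;

	for (i = 0; i < n && freq == FREQ_UNKNOWN; i++) {
		error = drvs[i](&f);
		if (error)
			continue;
		freq = f;
	}
	if (freq == FREQ_UNKNOWN)
		freq = estimate;	/* estimation fallback */
	*out = freq;
	return (error);			/* stale value from the loop */
}

/*
 * One possible correction: a failed driver query only means "try the
 * next driver"; the return value depends on whether a frequency was found.
 */
static int
get_freq_fixed(const drv_get_fn *drvs, size_t n, int estimate, int *out)
{
	int freq = FREQ_UNKNOWN, f;
	size_t i;

	for (i = 0; i < n && freq == FREQ_UNKNOWN; i++) {
		if (drvs[i](&f) != 0)
			continue;	/* per-driver error stays local */
		freq = f;
	}
	if (freq == FREQ_UNKNOWN)
		freq = estimate;	/* estimation fallback */
	if (freq == FREQ_UNKNOWN)
		return (ENXIO);		/* nothing worked at all */
	*out = freq;
	return (0);
}
```

With the driver list { drv_unknown, drv_unsupported } and an estimate of 2000, get_freq_buggy() stores 2000 in *out yet returns ENXIO, which is the combination that makes the caller give up; get_freq_fixed() stores the same value and returns 0.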