Date:      Wed, 29 Aug 2012 08:25:44 -0400
From:      John Baldwin <jhb@freebsd.org>
To:        freebsd-stable@freebsd.org
Cc:        Anton Yuzhaninov <citrin@citrin.ru>
Subject:   Re: Problem with IPMI KCS driver
Message-ID:  <201208290825.44198.jhb@freebsd.org>
In-Reply-To: <503DE2AB.6030702@citrin.ru>
References:  <503DE2AB.6030702@citrin.ru>

On Wednesday, August 29, 2012 5:36:43 am Anton Yuzhaninov wrote:
> We use servers with Supermicro X8DTT-H motherboards and have run into the following problem:
> when watchdogd is running, the server is rebooted by the IPMI watchdog several times per week.
> 
> After some debugging I found that the IPMI driver sometimes enters an endless
> loop, so watchdogd has no chance to reset the watchdog timer.
> In this situation top shows:
> 
> PID USERNAME      PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
> ...
> 113 root          -16    -     0K    16K CPU4    4  17:18 99.17% ipmi0: kcs
> 
> The endless loop is in the file /sys/dev/ipmi/ipmi_kcs.c, in the function
> kcs_wait_for_obf():
> 
>          int status, start = ticks;
> 
>          status = INB(sc, KCS_CTL_STS);
>          if (state == 0) {
>                  /* WAIT FOR OBF = 0 */
>                  while (ticks - start < MAX_TIMEOUT && status & KCS_STATUS_OBF) {
>                          DELAY(100);
>                          status = INB(sc, KCS_CTL_STS);
>                  }
>          } else {
>                  /* WAIT FOR OBF = 1 */
>                  while (ticks - start < MAX_TIMEOUT &&
>                      !(status & KCS_STATUS_OBF)) {
>                          DELAY(100);
>                          status = INB(sc, KCS_CTL_STS);
>                  }
>          }
> 
> It seems that this loop is intended to run for no more than MAX_TIMEOUT ticks,
> but for some reason the timeout does not work and the loop runs until reboot.
> 
> Questions:
> 1. Is it correct to check ticks to implement a timeout here?
> 2. How can this timeout be fixed?

Hmm.  Can you try this:

Index: kern/kern_clock.c
===================================================================
--- kern/kern_clock.c	(revision 239819)
+++ kern/kern_clock.c	(working copy)
@@ -382,7 +382,7 @@
 int	stathz;
 int	profhz;
 int	profprocs;
-int	ticks;
+volatile int	ticks;
 int	psratio;
 
 static DPCPU_DEFINE(int, pcputicks);	/* Per-CPU version of ticks. */
@@ -469,7 +469,7 @@
 hardclock(int usermode, uintfptr_t pc)
 {
 
-	atomic_add_int((volatile int *)&ticks, 1);
+	atomic_add_int(&ticks, 1);
 	hardclock_cpu(usermode);
 	tc_ticktock(1);
 	cpu_tick_calibration();
Index: sys/kernel.h
===================================================================
--- sys/kernel.h	(revision 239819)
+++ sys/kernel.h	(working copy)
@@ -63,7 +63,7 @@
 extern int stathz;			/* statistics clock's frequency */
 extern int profhz;			/* profiling clock's frequency */
 extern int profprocs;			/* number of process's profiling */
-extern int ticks;
+extern volatile int ticks;
 
 #endif /* _KERNEL */
 

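The patch presumably targets the compiler rather than the driver. With a
plain "int ticks", an optimizer is allowed to load the variable once and
treat "ticks - start < MAX_TIMEOUT" as constant inside the loop, so the
timeout never fires. Declaring ticks volatile forces a fresh load on every
iteration. Below is a minimal userland sketch of the same timeout pattern,
not the driver code: fake_read_status(), reads_until_clear,
wait_for_obf_clear() and the value of MAX_TIMEOUT are hypothetical stand-ins,
and the fake register read advances ticks itself so the example runs
deterministically (in the kernel, hardclock() advances ticks asynchronously).

```c
#include <assert.h>

#define KCS_STATUS_OBF	0x01	/* output-buffer-full bit, as in the quoted loop */
#define MAX_TIMEOUT	5	/* hypothetical limit, in simulated ticks */

/*
 * Stand-in for the kernel's tick counter.  volatile forces the compiler to
 * reload it on every loop iteration instead of reusing a single read.
 */
static volatile unsigned int ticks;

/* Hypothetical fake hardware: OBF stays set for this many more reads. */
static unsigned int reads_until_clear;

/*
 * Stand-in for INB(sc, KCS_CTL_STS).  Each read also advances the fake
 * clock by one tick; this is an assumption made only for the simulation.
 */
static int
fake_read_status(void)
{
	ticks++;
	if (reads_until_clear > 0) {
		reads_until_clear--;
		return (KCS_STATUS_OBF);
	}
	return (0);
}

/* Poll until OBF clears; return 0 on success, -1 on timeout. */
static int
wait_for_obf_clear(unsigned int max_ticks)
{
	unsigned int start = ticks;
	int status = fake_read_status();

	assert(max_ticks > 0);
	/* Unsigned subtraction stays correct when ticks wraps around. */
	while (ticks - start < max_ticks && (status & KCS_STATUS_OBF))
		status = fake_read_status();
	return ((status & KCS_STATUS_OBF) ? -1 : 0);
}
```

With reads_until_clear set to a small value the function returns 0; with
OBF stuck on it returns -1 after max_ticks simulated ticks instead of
spinning forever. The sketch uses unsigned arithmetic so that wraparound of
the counter is well defined in standard C.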
-- 
John Baldwin


