Date:      Thu, 6 May 2004 10:47:25 +0400 (MSD)
From:      Andrew Belashov <bel@orel.ru>
To:        FreeBSD-gnats-submit@FreeBSD.org
Subject:   sparc64/66314: SMP kernel panic: ipi_send: couldn't send ipi
Message-ID:  <200405060647.i466lPNZ082751@white.orel.ru>
Resent-Message-ID: <200405060650.i466oLZU063984@freefall.freebsd.org>


>Number:         66314
>Category:       sparc64
>Synopsis:       SMP kernel panic: ipi_send: couldn't send ipi
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-sparc64
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed May 05 23:50:20 PDT 2004
>Closed-Date:
>Last-Modified:
>Originator:     Andrew Belashov
>Release:        FreeBSD 5.2-CURRENT sparc64
>Organization:
JSC "CenterTelecom"
>Environment:
System: FreeBSD bel.localdomain 5.2-CURRENT FreeBSD 5.2-CURRENT #2: Thu May 6 08:25:18 MSD 2004 bel@bel.localdomain:/usr/obj/usr/src/sys/SUNC3D sparc64

	Machine: Sun Ultra 60 UPA/PCI (2 X UltraSPARC-II 450MHz)
	with Creator3D video card, XFree86 4.3.0
>Description:
	I have two Ultra 60 machines. Periodically they lock up hard or panic.
	Example:
======================================================================
panic: ipi_send: couldn't send ipi
at line 455 in file /usr/src/sys/sparc64/sparc64/mp_machdep.c
cpuid = 0;
Debugger("panic")
======================================================================

There is no "db>" prompt afterwards. Sometimes the kernel does enter DDB, but
it is still not possible to save a core dump.

>How-To-Repeat:
	I added some debug code to the kernel (see the attached patch) and
increased IPI_RETRIES from 100 to 1000000.
	The debug code keeps a cyclic array holding the iteration counters of
the last 32 calls, plus a max_ipi_retries variable holding the maximum value
the iteration counter has reached.
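
	The bookkeeping described above can be sketched in userland C as
follows. This is only an illustration, not the kernel code: the struct and
function names (retry_stats, retry_stats_init, retry_stats_record) are
hypothetical, and unlike the attached patch, which updates the slot on every
loop iteration, this version records one final retry count per send.

```c
#include <stddef.h>

#define RETRY_HISTORY	32	/* same depth as last_ipi_retries[] in the patch */

/* Hypothetical userland stand-in for the three statics added by the patch. */
struct retry_stats {
	int	max_retries;		/* highest retry count seen so far */
	int	idx;			/* slot of the most recent send, -1 if none */
	int	last[RETRY_HISTORY];	/* retry counts of the last 32 sends */
};

static void
retry_stats_init(struct retry_stats *rs)
{
	size_t i;

	rs->max_retries = 0;
	rs->idx = -1;
	for (i = 0; i < RETRY_HISTORY; i++)
		rs->last[i] = 0;
}

/* Record that one send finished after `retries` loop iterations. */
static void
retry_stats_record(struct retry_stats *rs, int retries)
{
	/* Advance to the next slot, wrapping around after RETRY_HISTORY. */
	rs->idx = (rs->idx + 1) % RETRY_HISTORY;
	rs->last[rs->idx] = retries;
	if (retries > rs->max_retries)
		rs->max_retries = retries;
}
```

Once the array is full, each new entry overwrites the oldest one, so the
array always holds the most recent 32 counts while max_retries keeps the
all-time maximum.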

	Boot the machine and start kgdb:

root@bel# gdb -k /usr/obj/usr/src/sys/SUNC3D/kernel.debug /dev/mem
GNU gdb 5.3 (FreeBSD)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "sparc64-portbld-freebsd5.2"...
panic messages:
---
---
#0  sched_switch (td=0xc03c1288) at /usr/src/sys/kern/sched_ule.c:1186
1186                    cpu_switch(td, newtd);
(kgdb) p max_ipi_retries
$1 = 1
(kgdb) quit

root@bel# ls -laR /
[10 second output skipped]
^C

root@bel# gdb -k /usr/obj/usr/src/sys/SUNC3D/kernel.debug /dev/mem
[...]
(kgdb) p max_ipi_retries
$1 = 2022

Wow! max_ipi_retries is now about 20 times the default limit (100).

>Fix:

	Increase IPI_RETRIES to a large value?

--- ipi_send.patch begins here ---
--- sparc64/include/smp.h.orig	Tue Apr  8 10:35:08 2003
+++ sparc64/include/smp.h	Thu May  6 08:24:22 2004
@@ -45,7 +45,7 @@
 #define	IPI_RENDEZVOUS	PIL_RENDEZVOUS
 #define	IPI_STOP	PIL_STOP
 
-#define	IPI_RETRIES	100
+#define	IPI_RETRIES	1000000
 
 struct cpu_start_args {
 	u_int	csa_count;
--- sparc64/sparc64/mp_machdep.c.orig	Wed Dec  3 17:57:25 2003
+++ sparc64/sparc64/mp_machdep.c	Thu May  6 08:23:25 2004
@@ -424,6 +424,10 @@
 	}
 }
 
+static int max_ipi_retries = 0;
+static int curr_ipi_retries_idx = -1;
+static int last_ipi_retries[32];
+
 void
 cpu_ipi_send(u_int mid, u_long d0, u_long d1, u_long d2)
 {
@@ -432,8 +436,11 @@
 
 	KASSERT((ldxa(0, ASI_INTR_DISPATCH_STATUS) & IDR_BUSY) == 0,
 	    ("ipi_send: outstanding dispatch"));
+	curr_ipi_retries_idx++; curr_ipi_retries_idx %= 32;
 	for (i = 0; i < IPI_RETRIES; i++) {
 		s = intr_disable();
+		max_ipi_retries = i > max_ipi_retries ? i : max_ipi_retries;
+		last_ipi_retries[curr_ipi_retries_idx] = i;
 		stxa(AA_SDB_INTR_D0, ASI_SDB_INTR_W, d0);
 		stxa(AA_SDB_INTR_D1, ASI_SDB_INTR_W, d1);
 		stxa(AA_SDB_INTR_D2, ASI_SDB_INTR_W, d2);
--- ipi_send.patch ends here ---


>Release-Note:
>Audit-Trail:
>Unformatted:


