Date:      Sat, 22 Sep 2012 08:37:24 GMT
From:      Fabian Keil <fk@fabiankeil.de>
To:        freebsd-gnats-submit@FreeBSD.org
Subject:   kern/171865: [geom] g_wither_washer() keeping a core busy
Message-ID:  <201209220837.q8M8bO6P064925@red.freebsd.org>
Resent-Message-ID: <201209220840.q8M8e7mL074815@freefall.freebsd.org>


>Number:         171865
>Category:       kern
>Synopsis:       [geom] g_wither_washer() keeping a core busy
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Sep 22 08:40:07 UTC 2012
>Closed-Date:
>Last-Modified:
>Originator:     Fabian Keil
>Release:        HEAD
>Organization:
>Environment:
FreeBSD r500.local 10.0-CURRENT FreeBSD 10.0-CURRENT #484 r+345840c: Fri Sep 21 20:20:56 CEST 2012     fk@r500.local:/usr/obj/usr/src/sys/ZOEY  amd64
>Description:
In http://lists.freebsd.org/pipermail/freebsd-fs/2011-June/011855.html
I reported that g_wither_washer() is called more than 400000 times
per second after a device is lost, keeping a CPU busy:

fk@r500 ~ $sudo dtrace -n 'fbt:kernel:g_*:entry { @[probefunc, stack()] = count(); } tick-1sec { trunc(@, 3); printa(@); trunc(@)}'
dtrace: description 'fbt:kernel:g_*:entry ' matched 359 probes
CPU     ID                    FUNCTION:NAME
  0  32988                       :tick-1sec 
  g_wither_washer                                   
              kernel`g_run_events+0x3b5
              kernel`0xffffffff8084967e
           446626

  0  32988                       :tick-1sec 
  g_trace                                           
              kernel`g_io_request+0x4d
              kernel`g_io_schedule_down+0x25f
              kernel`g_down_procbody+0x6d
              kernel`fork_exit+0x9a
              kernel`0xffffffff8084967e
              230
  g_trace                                           
              kernel`g_io_deliver+0x7a
              kernel`g_up_procbody+0x6d
              kernel`fork_exit+0x9a
              kernel`0xffffffff8084967e
              230
[...]

I recently found a way to reproduce the problem without using
ZFS or writing to the device.
>How-To-Repeat:
geli onetime /dev/md0
geom sched insert -a rr /dev/md0.eli
geli detach /dev/md0.eli.sched.

>Fix:
I don't have a fix, but the attached patch can be used as a workaround.

With the patch applied, setting kern.geom.debugflags to 256 prevents
g_run_events() from calling g_wither_washer(). The flag can be set back
to 0 afterwards, but the problem returns after the next geom "event".
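
As a rough sketch of how the flag could be toggled without disturbing
other debug flags, assuming a kernel built with the patch below: the
helper names add_stop_withering and clear_stop_withering are
hypothetical, and 256 is the G_F_STOP_WITHERING value the patch defines.

```shell
#!/bin/sh
# Hypothetical helpers, not part of FreeBSD or of the patch.
# 256 is G_F_STOP_WITHERING as defined by the patch.
G_F_STOP_WITHERING=256

# Print the given flags value with the stop-withering bit set.
add_stop_withering() {
    echo $(( $1 | G_F_STOP_WITHERING ))
}

# Print the given flags value with the stop-withering bit cleared.
clear_stop_withering() {
    echo $(( $1 & ~G_F_STOP_WITHERING ))
}

# Usage on a patched kernel (as root), left commented out here:
#   cur=$(sysctl -n kern.geom.debugflags)
#   sysctl kern.geom.debugflags="$(add_stop_withering "$cur")"
```

Clearing the bit re-enables the g_wither_washer() calls, so the busy
loop is expected to return after the next geom "event".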

Patch attached with submission follows:

From 8680caf9ab5322377736f62cd4eb674a938bb445 Mon Sep 17 00:00:00 2001
From: Fabian Keil <fk@fabiankeil.de>
Date: Thu, 12 Jul 2012 12:38:00 +0200
Subject: [PATCH] Allow to use kern.geom.debugflags to prevent g_run_events()
 from calling g_wither_washer()

Workaround for geom keeping a whole core busy failing
to remove a lost device.
---
 sys/geom/geom_event.c | 3 +++
 sys/geom/geom_int.h   | 1 +
 2 files changed, 4 insertions(+)

diff --git a/sys/geom/geom_event.c b/sys/geom/geom_event.c
index 3805dcd..b9bfc25 100644
--- a/sys/geom/geom_event.c
+++ b/sys/geom/geom_event.c
@@ -47,6 +47,7 @@ __FBSDID("$FreeBSD: src/sys/geom/geom_event.c,v 1.62 2012/07/29 11:51:48 mav Exp
 #include <sys/kernel.h>
 #include <sys/lock.h>
 #include <sys/mutex.h>
+#include <sys/sysctl.h>
 #include <sys/proc.h>
 #include <sys/errno.h>
 #include <sys/time.h>
@@ -286,6 +287,8 @@ g_run_events()
 			;
 		mtx_assert(&g_eventlock, MA_OWNED);
 		*i = g_wither_work;
+		if (g_debugflags & G_F_STOP_WITHERING)
+			*i = 0;
 		if (*i) {
 			mtx_unlock(&g_eventlock);
 			while (*i) {
diff --git a/sys/geom/geom_int.h b/sys/geom/geom_int.h
index 50f3a2a..0c11be8 100644
--- a/sys/geom/geom_int.h
+++ b/sys/geom/geom_int.h
@@ -50,6 +50,7 @@ extern int g_debugflags;
  */
 #define G_F_DISKIOCTL	64
 #define G_F_CTLDUMP	128
+#define G_F_STOP_WITHERING 256
 
 /* geom_dump.c */
 void g_confxml(void *, int flag);
-- 
1.7.11.5



>Release-Note:
>Audit-Trail:
>Unformatted:


