From: Eric Anderson
Date: Fri, 01 Feb 2008 06:20:55 -0600
To: Steven Hartland
Cc: Dieter, freebsd-performance@freebsd.org
Subject: Re: newfs locks entire machine for 20seconds
Message-ID: <47A30EA7.7050506@freebsd.org>

Steven Hartland wrote:
>
> ----- Original Message ----- From: "Eric Anderson"
>
>> I saw this once before, a long time back, and every time I went
>> through a debugging session, it came to some kind of lock on the
>> sysctl tree with regards to the geom info (maybe the XML kind of tree
>> dump or something). I don't recall all the details, but it was
>> something like that.
>
> Yep, that's where I've traced it to. It's requesting: kern.geom.confxml
>
> Which does:-
> static int
> sysctl_kern_geom_confxml(SYSCTL_HANDLER_ARGS)
> {
>     int error;
>     struct sbuf *sb;
>
>     sb = sbuf_new(NULL, NULL, 0, SBUF_AUTOEXTEND);
>     g_waitfor_event(g_confxml, sb, M_WAITOK, NULL);
>     error = SYSCTL_OUT(req, sbuf_data(sb), sbuf_len(sb) + 1);
>     sbuf_delete(sb);
>     return error;
> }
>
> What I don't understand is why this would lock the entire machine.
>
> I've enabled LOCK_PROFILING and reran and I get the following which
> seems to indicate the culprit is: SYSCTL_LOCK()
>
> From what I can tell g_waitfor_event is returning EAGAIN for a large
> amount of time which means we get stuck in:-
> userland_sysctl
> ...
>     SYSCTL_LOCK();
>
>     do {
>         req.oldidx = 0;
>         req.newidx = 0;
>         error = sysctl_root(0, name, namelen, &req);
>     } while (error == EAGAIN);
>
>     if (req.lock == REQ_WIRED && req.validlen > 0)
>         vsunlock(req.oldptr, req.validlen);
>
>     SYSCTL_UNLOCK();
> ...
>
> The only reason I can see for returning EAGAIN is g_destroy_geom
> calling g_cancel_event

Wait - if it returns EAGAIN for a while, then look at that code above.
The lock is taken once, outside the retry loop, so it will hold the
sysctl lock for as long as the retries go on, which could be an
indefinite amount of time. Maybe it should take and drop the lock on
each attempt instead:

    do {
        SYSCTL_LOCK();
        req.oldidx = 0;
        req.newidx = 0;
        error = sysctl_root(0, name, namelen, &req);
        SYSCTL_UNLOCK();
    } while (error == EAGAIN);

    if (req.lock == REQ_WIRED && req.validlen > 0)
        vsunlock(req.oldptr, req.validlen);

Can you try that?

Eric
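
For anyone who wants to see the lock-per-attempt idea outside the kernel, here is a rough userland sketch. It is not the kernel code: demo_lock, demo_handler, demo_retry and the counters are all made-up names, a pthread mutex stands in for the sysctl lock, and the handler simply returns EAGAIN a fixed number of times. It only illustrates that the lock is free between attempts, so other threads would get a chance to run.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical userland stand-ins; none of these names exist in the kernel. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static int demo_lock_held;      /* 1 while demo_lock is held by demo_retry */
static int demo_unlocked_gaps;  /* times the lock was dropped before a retry */

/* Pretend handler: returns EAGAIN until *remaining reaches zero. */
static int
demo_handler(int *remaining)
{
    assert(demo_lock_held);     /* the handler must run under the lock */
    if (*remaining > 0) {
        (*remaining)--;
        return (EAGAIN);
    }
    return (0);
}

/* Retry loop that takes and drops the lock on every attempt. */
static int
demo_retry(int eagain_count)
{
    int error;

    do {
        pthread_mutex_lock(&demo_lock);
        demo_lock_held = 1;
        error = demo_handler(&eagain_count);
        demo_lock_held = 0;
        pthread_mutex_unlock(&demo_lock);
        if (error == EAGAIN)
            demo_unlocked_gaps++;   /* lock is free here; others could run */
    } while (error == EAGAIN);

    return (error);
}
```

Calling demo_retry(3) makes four attempts and releases the lock three times before a retry. With the lock taken outside the loop, that count would stay at zero and every other caller would wait for the whole retry sequence.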