From owner-freebsd-performance@FreeBSD.ORG Fri Feb 1 11:29:25 2008 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E24D316A46C for ; Fri, 1 Feb 2008 11:29:25 +0000 (UTC) (envelope-from gofp-freebsd-performance@m.gmane.org) Received: from ciao.gmane.org (main.gmane.org [80.91.229.2]) by mx1.freebsd.org (Postfix) with ESMTP id 6AC8C13C478 for ; Fri, 1 Feb 2008 11:29:25 +0000 (UTC) (envelope-from gofp-freebsd-performance@m.gmane.org) Received: from list by ciao.gmane.org with local (Exim 4.43) id 1JKu5E-0001Qf-6W for freebsd-performance@freebsd.org; Fri, 01 Feb 2008 11:29:24 +0000 Received: from lara.cc.fer.hr ([161.53.72.113]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Fri, 01 Feb 2008 11:29:24 +0000 Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Fri, 01 Feb 2008 11:29:24 +0000 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-performance@freebsd.org From: Ivan Voras Date: Fri, 01 Feb 2008 12:30:02 +0100 Lines: 128 Message-ID: References: <200801310147.BAA04522@sopwith.solgatos.com> <47A2A606.9080702@freebsd.org> <002201c86499$7861ac20$b6db87d4@multiplay.co.uk> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enig40EA324C8EC7A4591F4DDCAD" X-Complaints-To: usenet@ger.gmane.org X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr User-Agent: Thunderbird 2.0.0.6 (X11/20071022) In-Reply-To: <002201c86499$7861ac20$b6db87d4@multiplay.co.uk> X-Enigmail-Version: 0.95.0 Sender: news Subject: Re: newfs locks entire machine for 20seconds X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Feb 2008 11:29:26 -0000 This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enig40EA324C8EC7A4591F4DDCAD Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Steven Hartland wrote: > Yep thats where I've traced it to its requesting: kern.geom.confxml >=20 > Which does:- > static int > sysctl_kern_geom_confxml(SYSCTL_HANDLER_ARGS) > { > int error; > struct sbuf *sb; >=20 > sb =3D sbuf_new(NULL, NULL, 0, SBUF_AUTOEXTEND); > g_waitfor_event(g_confxml, sb, M_WAITOK, NULL); > error =3D SYSCTL_OUT(req, sbuf_data(sb), sbuf_len(sb) + 1); > sbuf_delete(sb); > return error; > } >=20 > What I dont understand is why this would lock the entire machine. >=20 > I've enabled LOCK_PROFILING and reran and I get the following which > seems to indicate the culpret is: SYSCTL_LOCK() >=20 >> From what I can tell g_waitfor_event is returning EAGAIN for a large > amount of time which means we get stuck in:- > userland_sysctl > ... > SYSCTL_LOCK(); >=20 > do { > req.oldidx =3D 0; > req.newidx =3D 0; > error =3D sysctl_root(0, name, namelen, &req); > } while (error =3D=3D EAGAIN); >=20 > if (req.lock =3D=3D REQ_WIRED && req.validlen > 0) > vsunlock(req.oldptr, req.validlen); >=20 > SYSCTL_UNLOCK(); > ... >=20 > The only reason I can see for returning EAGAIN is g_destroy_geom > calling g_cancel_event >=20 > To confirm this wasnt related to anything that top is doing and > we are seeing a total lockup I used: > sh -c 'while [ 1 ]; do sleep 1; date; done' > Fri Feb 1 06:04:47 GMT 2008 > Fri Feb 1 06:04:48 GMT 2008 > Fri Feb 1 06:04:49 GMT 2008 <=3D Lockup > Fri Feb 1 06:05:16 GMT 2008 <=3D Recover > Fri Feb 1 06:05:17 GMT 2008 > Fri Feb 1 06:05:18 GMT 2008 >=20 >=20 > Enabling geom debugging I get:- > Feb 1 06:04:45 geomtest kernel: g_post_event_x(0xffffffff802394c0, > 0xffffff00010e6100, 2, 0) > Feb 1 06:04:45 geomtest kernel: ref 0xffffff00010e6100 > Feb 1 06:04:45 geomtest kernel: g_post_event_x(0xffffffff802394c0, > 0xffffff00014e6700, 2, 0) > Feb 1 06:04:45 geomtest kernel: ref 0xffffff00014e6700 > Feb 1 06:04:49 geomtest kernel: g_post_event_x(0xffffffff80239270, > 0xffffff00010e6100, 2, 0) > Feb 1 06:04:49 geomtest kernel: ref 0xffffff00010e6100 > Feb 1 06:04:49 geomtest kernel: g_post_event_x(0xffffffff80239270, > 0xffffff00014e6700, 2, 0) > Feb 1 06:04:49 geomtest kernel: ref 0xffffff00014e6700 > Feb 1 06:04:49 geomtest kernel: mbr_taste(MBR,da0s3) > Feb 1 06:04:49 geomtest kernel: g_mbrext_taste(MBREXT,da0s3) > Feb 1 06:04:49 geomtest kernel: g_slice_spoiled(0xffffff0001b27180/da0= s3) > Feb 1 06:04:49 geomtest kernel: g_wither_geom(0xffffff0001a33c00(da0s3= )) > Feb 1 06:04:49 geomtest kernel: g_part_taste(PART,da0s3) > Feb 1 06:04:56 geomtest kernel: g_post_event_x(0xffffffff80235b10, > 0xffffff000144a9c0, 2, 262144) > Feb 1 06:05:00 geomtest kernel: g_wither_geom(0xffffff000158b300(da0s3= )) > Feb 1 06:05:00 geomtest kernel: > Feb 1 06:05:00 geomtest kernel: g_label_taste(LABEL, da0s3) > Feb 1 06:05:00 geomtest kernel: > Feb 1 06:05:16 geomtest kernel: GEOM_LABEL[1]: MSDOSFS: da0s3: FAT32 > volume not valid. > Feb 1 06:05:16 geomtest kernel: g_detach(0xffffff0001b23980) > Feb 1 06:05:16 geomtest kernel: g_destroy_consumer(0xffffff0001b23980)= > Feb 1 06:05:16 geomtest kernel: > So after all that I can see why the sysctl call is taking > so long to complete but the burning question is why does Can you explain - I don't see it :) Do you mean to say there's a contention for sysctl lock between geom_confxml and g_waitfor_event or that geom_label tasting has something to do with it? > the entire system lock because of this? What else is > waiting on the sysctl lock which is so critical? What I do know is that sysctl is GIANT-locked, which is also used by some parts of device handling infrastructure (dead_cdevsw), the USB stack, and can creep itself in the timer via swi_sched in subr_taskqueue.c:303. I cannot say for sure that's what happening here, but they are possibilities. If you can provoke this reliably, I think there is a (old!) patch for removing sysctl from under the Giant lock that you could try. --------------enig40EA324C8EC7A4591F4DDCAD Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFHowLAldnAQVacBcgRAkutAJ0Y3W1r/pUDkumTH7lwOQGEPJAVwgCfUQ+j GrktnsKT+fVbSWCNowj0Y0E= =uomb -----END PGP SIGNATURE----- --------------enig40EA324C8EC7A4591F4DDCAD--