From: Konstantin Belousov
To: Markus Gebert
Cc: freebsd-stable
Date: Wed, 14 Nov 2012 09:21:12 +0200
Subject: Re: thread taskq / unp_gc() using 100% cpu and stalling unix socket IPC
Message-ID: <20121114072112.GX73505@kib.kiev.ua>
In-Reply-To: <007F7A73-75F6-48A6-9C01-E7C179CDA48A@hostpoint.ch>
References: <6908B498-6978-4995-B081-8D504ECB5C0A@hostpoint.ch> <007F7A73-75F6-48A6-9C01-E7C179CDA48A@hostpoint.ch>
List-Id: Production branch of FreeBSD source code

On Wed, Nov 14, 2012 at 01:41:04AM +0100, Markus Gebert wrote:
>
> On 13.11.2012, at 19:30, Markus Gebert wrote:
>
> > To me it looks like the unix socket GC is triggered way too often and/or running too long, which uses cpu and worse, causes a lot of contention around the unp_list_lock which in turn causes delays for all processes relying on unix sockets for IPC.
> >
> > I don't know why the unp_gc() is called so often and what's triggering this.
>
> I have a guess now. Dovecot and relayd both use unix sockets heavily. According to dtrace uipc_detach() gets called quite often by dovecot closing unix sockets. Each time uipc_detach() is called unp_gc_task is taskqueue_enqueue()d if fds are inflight.
>
> in uipc_detach():
> 682		if (local_unp_rights)
> 683			taskqueue_enqueue(taskqueue_thread, &unp_gc_task);
>
> We use relayd in a way that keeps the source address of the client when connecting to the backend server (transparent load balancing). This requires IP_BINDANY on the socket which cannot be set by unprivileged processes, so relayd sends the socket fd to the parent process just to set the socket option and send it back. This means an fd gets transferred twice for every new backend connection.
>
> So we have dovecot calling uipc_detach() often and relayd making it likely that fds are inflight (unp_rights > 0). With a certain amount of load this could cause unp_gc_task to be added to the thread taskq too often, slowing everything unix socket related down by holding global locks in unp_gc().
>
> I don't know if the slowdown can even cause a negative feedback loop at some point by increasing the chance of fds being inflight. This would explain why sometimes the condition goes away by itself and sometimes requires intervention (taking load away for a moment).
>
> I'll look into a way to (dis)prove all this tomorrow. Ideas still welcome :-).
>

If the only issue is indeed too aggressive scheduling of the taskqueue,
then postponing the task until the next tick could do it. The patch below
tries to schedule the gc task for the next tick if it is not yet
scheduled. Could you try it?

diff --git a/sys/kern/subr_taskqueue.c b/sys/kern/subr_taskqueue.c
index 90c6ffc..3bf62f9 100644
--- a/sys/kern/subr_taskqueue.c
+++ b/sys/kern/subr_taskqueue.c
@@ -252,9 +252,13 @@ taskqueue_enqueue_timeout(struct taskqueue *queue,
 		} else {
 			queue->tq_callouts++;
 			timeout_task->f |= DT_CALLOUT_ARMED;
+			if (ticks < 0)
+				ticks = -ticks; /* Ignore overflow. */
+		}
+		if (ticks > 0) {
+			callout_reset(&timeout_task->c, ticks,
+			    taskqueue_timeout_func, timeout_task);
 		}
-		callout_reset(&timeout_task->c, ticks, taskqueue_timeout_func,
-		    timeout_task);
 	}
 	TQ_UNLOCK(queue);
 	return (res);
diff --git a/sys/kern/uipc_usrreq.c b/sys/kern/uipc_usrreq.c
index cc5360f..ed92e90 100644
--- a/sys/kern/uipc_usrreq.c
+++ b/sys/kern/uipc_usrreq.c
@@ -131,7 +131,7 @@ static const struct sockaddr sun_noname = { sizeof(sun_noname), AF_LOCAL };
  * reentrance in the UNIX domain socket, file descriptor, and socket layer
  * code. See unp_gc() for a full description.
  */
-static struct task unp_gc_task;
+static struct timeout_task unp_gc_task;
 
 /*
  * The close of unix domain sockets attached as SCM_RIGHTS is
@@ -672,7 +672,7 @@ uipc_detach(struct socket *so)
 	if (vp)
 		vrele(vp);
 	if (local_unp_rights)
-		taskqueue_enqueue(taskqueue_thread, &unp_gc_task);
+		taskqueue_enqueue_timeout(taskqueue_thread, &unp_gc_task, -1);
 }
 
 static int
@@ -1783,7 +1783,7 @@ unp_init(void)
 	LIST_INIT(&unp_shead);
 	LIST_INIT(&unp_sphead);
 	SLIST_INIT(&unp_defers);
-	TASK_INIT(&unp_gc_task, 0, unp_gc, NULL);
+	TIMEOUT_TASK_INIT(taskqueue_thread, &unp_gc_task, 0, unp_gc, NULL);
 	TASK_INIT(&unp_defer_task, 0, unp_process_defers, NULL);
 	UNP_LINK_LOCK_INIT();
 	UNP_LIST_LOCK_INIT();
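
For illustration only, the effect the patch aims for can be modelled in userland. This is a minimal sketch under the assumption that a negative tick count means "arm the task for the next tick unless it is already armed". The names coalesced_task, task_request(), task_tick() and simulate_burst() are hypothetical and are not part of the FreeBSD taskqueue API; the kernel change itself relies on taskqueue_enqueue_timeout() and the callout machinery shown in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userland model of a coalesced task; not kernel code. */
struct coalesced_task {
	bool armed;		/* a run is already pending for the next tick */
	unsigned requests;	/* how often a run was requested */
	unsigned runs;		/* how often the handler actually ran */
};

/*
 * Request a run. The task is armed only if it is not armed yet, so
 * repeated requests within one tick collapse into a single pending run.
 * Returns true if this call armed the task.
 */
static bool
task_request(struct coalesced_task *t)
{
	assert(t != NULL);
	t->requests++;
	if (t->armed)
		return (false);
	t->armed = true;
	return (true);
}

/* Simulated clock tick: run the handler once if the task is armed. */
static void
task_tick(struct coalesced_task *t)
{
	assert(t != NULL);
	if (!t->armed)
		return;
	t->armed = false;
	t->runs++;		/* stands in for one unp_gc() pass */
}

/*
 * Issue `burst` requests in each of `nticks` ticks and return how many
 * times the handler ran.
 */
static unsigned
simulate_burst(unsigned burst, unsigned nticks)
{
	struct coalesced_task t = { false, 0, 0 };
	unsigned i, tick;

	for (tick = 0; tick < nticks; tick++) {
		for (i = 0; i < burst; i++)
			task_request(&t);
		task_tick(&t);
	}
	return (t.runs);
}
```

In this model the number of handler runs is bounded by the number of ticks in which at least one request arrived, however many uipc_detach()-style requests occur within a tick. Whether that bound is enough to relieve the contention on the real system is exactly what the test of the patch would show.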