From owner-freebsd-stable@FreeBSD.ORG Wed Nov 14 10:51:16 2012
From: Markus Gebert <markus.gebert@hostpoint.ch>
To: Konstantin Belousov
Cc: freebsd-stable
Subject: Re: thread taskq / unp_gc() using 100% cpu and stalling unix socket IPC
Date: Wed, 14 Nov 2012 11:50:39 +0100
Message-Id: <6F156867-5334-489E-AC92-BD385B0DBC2F@hostpoint.ch>
In-Reply-To: <20121114072112.GX73505@kib.kiev.ua>
References: <6908B498-6978-4995-B081-8D504ECB5C0A@hostpoint.ch> <007F7A73-75F6-48A6-9C01-E7C179CDA48A@hostpoint.ch> <20121114072112.GX73505@kib.kiev.ua>
List-Id: Production branch of FreeBSD source code

On 14.11.2012, at 08:21, Konstantin Belousov wrote:

> On Wed, Nov 14, 2012 at 01:41:04AM +0100, Markus Gebert wrote:
>> 
>> On 13.11.2012, at 19:30, Markus Gebert wrote:
>> 
>>> To me it looks like the unix socket GC is triggered way too often and/or runs too long, which burns CPU and, worse, causes a lot of contention around the unp_list_lock,
which in turn causes delays for all processes relying on unix sockets for IPC.
>>> 
>>> I don't know why unp_gc() is called so often or what triggers it.
>> 
>> I have a guess now. Dovecot and relayd both use unix sockets heavily. According to dtrace, uipc_detach() gets called quite often by dovecot closing unix sockets. Each time uipc_detach() is called, unp_gc_task is taskqueue_enqueue()d if any fds are in flight.
>> 
>> in uipc_detach():
>> 682		if (local_unp_rights)
>> 683			taskqueue_enqueue(taskqueue_thread, &unp_gc_task);
>> 
>> We use relayd in a way that keeps the client's source address when connecting to the backend server (transparent load balancing). This requires IP_BINDANY on the socket, which cannot be set by unprivileged processes, so relayd sends the socket fd to the parent process just to set the socket option and send it back. This means an fd gets transferred twice for every new backend connection.
>> 
>> So we have dovecot calling uipc_detach() often, and relayd making it likely that fds are in flight (unp_rights > 0). Under a certain amount of load this could cause unp_gc_task to be enqueued on the thread taskq too often, slowing down everything unix-socket-related by holding global locks in unp_gc().
>> 
>> I don't know if the slowdown can even cause a negative feedback loop at some point by increasing the chance of fds being in flight. That would explain why the condition sometimes goes away by itself and sometimes requires intervention (taking load away for a moment).
>> 
>> I'll look into a way to (dis)prove all this tomorrow. Ideas still welcome :-).
>> 
> 
> If the only issue is indeed too aggressive scheduling of the taskqueue,
> then postponing it up to the next tick could do it. The patch below
> tries to schedule the taskqueue for gc at the next tick if it is not yet
> scheduled. Could you try it?

Sounds like a good idea, thanks! I'm testing the patch right now.
It could take a few days to know for sure that it works. I'll get back to you soon.


Markus