From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 192889] accept4 socket hangs in CLOSED (memcached)
Date: Sun, 15 Feb 2015 06:43:25 +0000
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 10.0-STABLE
X-Bugzilla-Severity: Affects Some People
X-Bugzilla-Status: New

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=192889

mp39590@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mp39590@gmail.com

--- Comment #15 from mp39590@gmail.com ---
The cause of this bug lies not in the network stack but in the capabilities
subsystem.

Memcached consists of a dispatcher thread and several worker threads, which
communicate through pipes. For example, when a new connection is accepted, the
dispatcher writes 'c' to the pipe of a selected worker thread (workers are
chosen in round-robin order); the worker thread then pops the connection from
its queue and serves it.

Due to a slight race condition in capabilities, the kevent() mechanism may
sometimes return spurious ENOTCAPABLE errors for descriptors. This makes
libevent abort the loop that handles the connections and return. Memcached does
not expect this to happen, so the worker thread silently returns[1] and dies.
You can see this with the procstat command by comparing the thread count in the
normal and the failing situation: in the failing case there is one thread
fewer.

The dispatcher is not aware of this catastrophic event and therefore continues
to write 'c' notifications about new connections to the pipe of the already
dead thread. No one serves those connections, so they are left hanging.

The reason you see this as a large number of CLOSED/CLOSE_WAIT connections is
simply that the client, after a timeout or for any other reason, decided to
close() its connection. The network stack receives the FIN packet and expects
our application to issue close() on the descriptor, but since the thread is
already dead, that never happens.

This bug was addressed by Mateusz in r273137[2].

[1] - https://github.com/memcached/memcached/blob/master/thread.c#L369
[2] - https://svnweb.freebsd.org/base?view=revision&revision=273137
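
The reason the dispatcher never notices the dead worker can be shown in
miniature. The sketch below is not memcached code: struct worker,
dead_worker_main() and count_orphaned_notifications() are hypothetical names
made up for illustration, and the worker's early return merely stands in for
the event loop aborting. It assumes a POSIX system with pthreads. Because
descriptors belong to the process and not to a thread, the read end of the pipe
stays open after the thread exits, so every write of 'c' still succeeds and the
notifications simply pile up unread.

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical stand-in for one worker: a thread plus the notification
 * pipe that the dispatcher writes to. */
struct worker {
    pthread_t tid;
    int notify_recv_fd; /* would be read by the worker */
    int notify_send_fd; /* written by the dispatcher */
};

/* Models a worker whose event loop aborted at once: the thread returns
 * without ever reading from its pipe. */
static void *dead_worker_main(void *arg)
{
    (void)arg;
    return NULL;
}

/* Sends n_connections notifications to a worker that has already exited.
 * Returns the number of bytes left unread in the pipe, or -1 on error. */
int count_orphaned_notifications(int n_connections)
{
    struct worker w;
    int fds[2];
    int pending = 0;
    char buf[64];
    ssize_t n;

    /* Stay far below typical pipe capacity so write() cannot block. */
    assert(n_connections >= 0 && n_connections <= 512);

    if (pipe(fds) != 0)
        return -1;
    w.notify_recv_fd = fds[0];
    w.notify_send_fd = fds[1];

    if (pthread_create(&w.tid, NULL, dead_worker_main, NULL) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    pthread_join(w.tid, NULL); /* the worker is gone from here on */

    /* Dispatcher side: each write succeeds, since the read end of the
     * pipe is still open in the process. No EPIPE, no error at all. */
    for (int i = 0; i < n_connections; i++) {
        if (write(w.notify_send_fd, "c", 1) != 1) {
            close(fds[0]);
            close(fds[1]);
            return -1;
        }
    }

    /* Count what nobody consumed, using a non-blocking drain. */
    if (fcntl(w.notify_recv_fd, F_SETFL, O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    while ((n = read(w.notify_recv_fd, buf, sizeof(buf))) > 0)
        pending += (int)n;

    close(fds[0]);
    close(fds[1]);
    return pending;
}
```

In this model, count_orphaned_notifications(5) returns 5: all five
notifications were accepted by the pipe and none was served. On older
toolchains the file may need to be built with -pthread.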