Date: Fri, 21 Jul 2017 23:21:12 +0200
From: Daniel Bilik
To: freebsd-net@freebsd.org
Subject: mbuf clusters leak in netinet6
Message-Id: <20170721232112.82f6e78b76057312183be937@neosystem.cz>

Hi.

(Please keep me in cc, I'm not subscribed to the list.)

After deploying ndproxy[1] on a few 10-stable hosts, some of them have
experienced mbuf cluster exhaustion.
Initial analysis showed that after loading ndproxy.ko, the "current"
values of "mbuf clusters" and "mbuf+clusters out of packet secondary zone"
(from netstat -m output) keep increasing and never decrease.

A more thorough inspection of the ndproxy source code pointed me at the
function packet() in ndpacket.c[2], specifically at the very last
"return 1". With this line changed to "return 0", mbuf clusters no longer
increase, i.e. the change fixes the issue.

As the leak does not come from the "return" itself, I suspect "the proper
solution" is to modify the code in the upper layer so that it does not leak
anything, whatever value is returned. If I read it right, the upper layer in
this case is the function ip6_input() in sys/netinet6/ip6_input.c[3],
specifically the pfil_run_hooks() call at line 765. I guess it should be
changed like this to avoid the leak:

--- ip6_input.c.orig	2017-07-21 22:42:17.780594000 +0200
+++ ip6_input.c	2017-07-21 22:45:28.981497000 +0200
@@ -620,8 +620,11 @@
 		goto passin;
 
 	if (pfil_run_hooks(&V_inet6_pfil_hook, &m,
-	    m->m_pkthdr.rcvif, PFIL_IN, NULL))
+	    m->m_pkthdr.rcvif, PFIL_IN, NULL)) {
+		if (m)
+			m_free(m);
 		return;
+	}
 	if (m == NULL)			/* consumed by filter */
 		return;
 	ip6 = mtod(m, struct ip6_hdr *);

I haven't actually tested this modification. I prefer to know your opinions
first before trying to panic production hosts running hundreds of miles
from me. ;-)

Thanks.

--
Dan

[1] https://github.com/AlexandreFenyo/ndproxy
[2] https://github.com/AlexandreFenyo/ndproxy/blob/master/ndpacket.c#L455
[3] https://github.com/freebsd/freebsd/blob/master/sys/netinet6/ip6_input.c#L765
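
The ownership question above can be illustrated with a minimal userland sketch. It is not kernel code: toy_mbuf, toy_alloc_chain(), toy_freem(), the two hooks, and input_original()/input_patched() are all hypothetical stand-ins invented for this example, and live_mbufs plays the role of the "current" counter in netstat -m. The sketch assumes a hook that rejects a packet by returning non-zero while leaving the chain allocated, which is the behaviour described for packet(). It frees the whole chain, for which the kernel analogue would be m_freem(); m_free() releases only a single mbuf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a kernel mbuf: only the chain link is modelled. */
struct toy_mbuf {
	struct toy_mbuf *next;
};

/* Number of toy mbufs currently allocated (hypothetical leak counter). */
int live_mbufs = 0;

/* Allocate a chain of n toy mbufs and return its head. */
struct toy_mbuf *toy_alloc_chain(int n)
{
	struct toy_mbuf *head = NULL;

	while (n-- > 0) {
		struct toy_mbuf *m = malloc(sizeof(*m));
		assert(m != NULL);
		m->next = head;
		head = m;
		live_mbufs++;
	}
	return head;
}

/* Free an entire chain, in the spirit of m_freem(). */
void toy_freem(struct toy_mbuf *m)
{
	while (m != NULL) {
		struct toy_mbuf *next = m->next;
		free(m);
		live_mbufs--;
		m = next;
	}
}

typedef int (*toy_hook_t)(struct toy_mbuf **);

/* Hook that rejects the packet but leaves the chain allocated. */
int hook_reject_no_free(struct toy_mbuf **mp)
{
	(void)mp;
	return 1;
}

/* Hook that rejects the packet, frees it, and clears the caller's pointer. */
int hook_reject_and_free(struct toy_mbuf **mp)
{
	toy_freem(*mp);
	*mp = NULL;
	return 1;
}

/* Caller shaped like the unpatched code: returns on non-zero without freeing. */
void input_original(toy_hook_t hook, int chain_len)
{
	struct toy_mbuf *m = toy_alloc_chain(chain_len);

	if (hook(&m))
		return;
	if (m == NULL)		/* consumed by filter */
		return;
	toy_freem(m);		/* stands in for normal packet processing */
}

/* Caller shaped like the proposed patch: frees whatever the hook left behind. */
void input_patched(toy_hook_t hook, int chain_len)
{
	struct toy_mbuf *m = toy_alloc_chain(chain_len);

	if (hook(&m)) {
		if (m != NULL)
			toy_freem(m);
		return;
	}
	if (m == NULL)		/* consumed by filter */
		return;
	toy_freem(m);
}
```

In this toy model, input_original() with hook_reject_no_free() leaves the chain allocated on every call, while input_patched() releases it. Note that the patched caller is only safe if a hook that frees the packet also sets the caller's pointer to NULL; a hook that frees without clearing the pointer would turn the extra free into a double free.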