From: Nikos Vassiliadis
To: freebsd-net@freebsd.org
Subject: carp on if_bridge deadlock
Message-ID: <560F00E6.5010503@gmx.com>
Date: Sat, 3 Oct 2015 01:10:46 +0300

Hi,

I am trying to use carp over an if_bridge and am getting this LOR:

> login: lock order reversal:
> 1st 0xfffff8000848a018 if_bridge (if_bridge) @ /usr/src/sys/modules/if_bridge/../../net/if_bridge.c:2315
> 2nd 0xfffff80003a58778 carp_if (carp_if) @ /usr/src/sys/modules/carp/../../netinet/ip_carp.c:1118
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0092c3b6a0
> witness_checkorder() at witness_checkorder+0xe7a/frame 0xfffffe0092c3b720
> __mtx_lock_flags() at __mtx_lock_flags+0xa8/frame 0xfffffe0092c3b770
> carp_forus() at carp_forus+0x7a/frame 0xfffffe0092c3b7b0
> bridge_input() at bridge_input+0x338/frame 0xfffffe0092c3b820
> ether_nh_input() at ether_nh_input+0x2ab/frame 0xfffffe0092c3b860
> netisr_dispatch_src() at netisr_dispatch_src+0x86/frame 0xfffffe0092c3b8d0
> ether_input() at ether_input+0x4f/frame 0xfffffe0092c3b900
> vtnet_rxq_eof() at vtnet_rxq_eof+0x845/frame 0xfffffe0092c3b9b0
> vtnet_rx_vq_intr() at vtnet_rx_vq_intr+0x4e/frame 0xfffffe0092c3b9e0
> intr_event_execute_handlers() at intr_event_execute_handlers+0xe4/frame 0xfffffe0092c3ba20
> ithread_loop() at ithread_loop+0xa6/frame 0xfffffe0092c3ba70
> fork_exit() at fork_exit+0x84/frame 0xfffffe0092c3bab0
> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0092c3bab0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---

Eventually all
network activity will stop and this will happen:

> load: 0.33 cmd: ifconfig 665 [*carp_if] 228.29r 0.00u 0.01s 0% 2716k

Or:

> load: 2.86 cmd: ifconfig 720 [running] 104.92r 0.00u 36.53s 100% 2716k

A single ping to the carp address is enough to trigger the problem.
The debugger says:

> db> show all locks
> Process 669 (sysctl) thread 0xfffff800084269a0 (100082)
> exclusive sleep mutex Giant (Giant) r = 0 (0xffffffff81cc4e90) locked @ /usr/src/sys/kern/kern_sysctl.c:164
> Process 668 (ifconfig) thread 0xfffff8000842e9a0 (100079)
> exclusive sx carp_sx (carp_sx) r = 0 (0xffffffff820b0ac0) locked @ /usr/src/sys/modules/carp/../../netinet/ip_carp.c:1644
> Process 12 (intr) thread 0xfffff800038a89a0 (100010)
> exclusive sleep mutex carp_softc (carp_softc) r = 0 (0xfffff80008266508) locked @ /usr/src/sys/kern/kern_mutex.c:158
> Process 12 (intr) thread 0xfffff80003a679a0 (100031)
> exclusive sleep mutex carp_if (carp_if) r = 0 (0xfffff8000825b078) locked @ /usr/src/sys/modules/carp/../../netinet/ip_carp.c:1118
> exclusive sleep mutex if_bridge (if_bridge) r = 0 (0xfffff80008467018) locked @ /usr/src/sys/modules/if_bridge/../../net/if_bridge.c:2315
> db>

There is a related PR:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=200319
but I couldn't apply the patch cleanly to HEAD or 10-STABLE.

Thanks for any insights,
Nikos
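
P.S. For anyone unfamiliar with what the "lock order reversal" report means, here is a minimal userland sketch of the kind of check involved. It is not the kernel's witness code and not the actual if_bridge or carp code path. The names LOCK_IF_BRIDGE, LOCK_CARP_IF, order_seen, thread_ctx, check_and_acquire and release_all are all hypothetical and exist only for this illustration. The idea is simply to remember every "held A while taking B" pair and to flag a later attempt to take A while holding B.

```c
#include <assert.h>

/* Hypothetical lock IDs standing in for the two mutexes in the report. */
enum { LOCK_IF_BRIDGE, LOCK_CARP_IF, NLOCKS };
#define MAX_HELD 4

/* order_seen[a][b] != 0 means some thread held lock a while acquiring b. */
static int order_seen[NLOCKS][NLOCKS];

/* Per-thread list of locks currently held (illustration only). */
struct thread_ctx {
    int held[MAX_HELD];
    int nheld;
};

/*
 * Record that the thread acquires `lock`. Returns 1 if this acquisition
 * reverses an order that was recorded earlier, 0 otherwise.
 */
static int check_and_acquire(struct thread_ctx *t, int lock)
{
    int reversal = 0;

    assert(lock >= 0 && lock < NLOCKS && t->nheld < MAX_HELD);
    for (int i = 0; i < t->nheld; i++) {
        int h = t->held[i];
        if (order_seen[lock][h])
            reversal = 1;          /* lock -> h was seen before; now h -> lock */
        order_seen[h][lock] = 1;   /* remember the order h -> lock */
    }
    t->held[t->nheld++] = lock;
    return reversal;
}

/* Drop every lock the thread holds. */
static void release_all(struct thread_ctx *t)
{
    t->nheld = 0;
}
```

With this sketch, a thread that takes LOCK_IF_BRIDGE and then LOCK_CARP_IF records that order without complaint. A second thread that takes LOCK_CARP_IF and then LOCK_IF_BRIDGE gets a return value of 1 on its second acquisition. If both orders can run concurrently, each thread can end up waiting for the lock the other one holds, which would be consistent with the hang described above, although the trace alone does not show where the opposite order is taken.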