From: Warner Losh
Date: Mon, 5 Sep 2016 10:14:59 -0600
Subject: Re: 11.0 stuck on high network load
To: Slawa Olhovchenkov
Cc: hiren panchasara, FreeBSD-STABLE Mailing List
In-Reply-To: <20160905074348.GE34394@zxy.spb.ru>
References: <20160904215739.GC22212@zxy.spb.ru> <20160905014612.GA42393@strugglingcoder.info> <20160905074348.GE34394@zxy.spb.ru>
List-Id: Production branch of FreeBSD source code

On Mon, Sep 5, 2016 at 1:43 AM, Slawa Olhovchenkov wrote:
> On Sun, Sep 04, 2016 at 06:46:12PM -0700, hiren panchasara wrote:
>
>> On 09/05/16 at 12:57P, Slawa Olhovchenkov wrote:
>> > I am try using 11.0 on Dual E5-2620 (no X2APIC).
>> > Under high network load and may be addtional conditional system go to
>> > unresponsible state -- no reaction to network and console (USB IPMI
>> > emulation). INVARIANTS give to high overhad. Is this exist some way to
>> > debug this?
>>
>> Can you panic it from console to get to db> to get backtrace and other
>> info when it goes unresponsive?
>
> no
> no reaction

So the canonical 'ipmitool chassis power diag' doesn't send an NMI to get
you to the debugger?

I've seen this at Netflix on one variant of our flash offload box with an
Intel e5-2697v2 running with the Chelsio driver.
We're working around it by having fewer receive threads than CPUs in the
system. The only way the boxes would come back was with the watchdog. The
load was streaming video at > ~36Gbps out of 4 lagged 10G ports. The console
is totally unresponsive as well. This is on our FreeBSD-10 stable based fork.

From my debugging, we go from totally fine, as far as I can tell from ps,
etc., in the moments leading up to the hang, to being totally wedged. It
seems to be a very sudden-onset condition.

Sound at all familiar?

Warner