Date:      Mon, 5 Sep 2016 19:46:56 +0300
From:      Slawa Olhovchenkov <slw@zxy.spb.ru>
To:        Warner Losh <imp@bsdimp.com>
Cc:        hiren panchasara <hiren@strugglingcoder.info>, FreeBSD-STABLE Mailing List <freebsd-stable@freebsd.org>
Subject:   Re: 11.0 stuck on high network load
Message-ID:  <20160905164656.GG34394@zxy.spb.ru>
In-Reply-To: <CANCZdfoOqVGxiytKRhpFgT6N1EbKyP3qqyj2QQzTbUSPzAQW0A@mail.gmail.com>
References:  <20160904215739.GC22212@zxy.spb.ru> <20160905014612.GA42393@strugglingcoder.info> <20160905074348.GE34394@zxy.spb.ru> <CANCZdfoOqVGxiytKRhpFgT6N1EbKyP3qqyj2QQzTbUSPzAQW0A@mail.gmail.com>

On Mon, Sep 05, 2016 at 10:14:59AM -0600, Warner Losh wrote:

> On Mon, Sep 5, 2016 at 1:43 AM, Slawa Olhovchenkov <slw@zxy.spb.ru> wrote:
> > On Sun, Sep 04, 2016 at 06:46:12PM -0700, hiren panchasara wrote:
> >
> >> On 09/05/16 at 12:57P, Slawa Olhovchenkov wrote:
> >> > I am trying to use 11.0 on a dual E5-2620 (no X2APIC).
> >> > Under high network load, and maybe some additional condition, the
> >> > system goes into an unresponsive state -- no reaction to network or
> >> > console (USB IPMI emulation). INVARIANTS gives too much overhead.
> >> > Is there some way to debug this?
> >>
> >> Can you panic it from console to get to db> to get backtrace and other
> >> info when it goes unresponsive?
> >
> > no
> > no reaction
> 
> So the canonical 'ipmitool chassis power diag' doesn't send an NMI to
> get you to the debugger?

I haven't tried that (and didn't know about it).
Can you explain a bit more?
Does FreeBSD catch the NMI and enter the debugger by default?
How does this interact with the USB stack (I am wary that the USB keyboard may be locked up)?

> I've seen this at Netflix on one variant of our flash offload box with
> an Intel e5-2697v2 running with the Chelsio driver. We're working
> around it by having fewer receive threads than CPUs in the system. The
> only way the boxes would come back was with watchdog. The load was
> streaming video > ~36Gbps out 4 lagged 10G ports. Console is totally
> unresponsive as well. This is on our FreeBSD-10 stable based fork.
> From my debugging, we go from totally fine as far as I can tell from
> ps, etc in the moments leading to the hang to being totally wedged. It
> seems a very sudden-onset condition. Sound at all familiar?
> 
> Warner

Not sure.
This is a less powerful box and can serve only 20 Gbit, using an Intel
card (lagg 2x10G). A day ago I was running 10-STABLE on this box without
such issues. (I don't remember clearly, but maybe some months ago this box
crashed from this issue -- at that time I had no idea what caused the crash.)

The hang may be caused by some bad (too large) memory request from nginx
(an attempt to parse some malformed files), or by frequent nginx core dumps
(caused by these malformed files).

11.0 on two different, more powerful boxes serves from 40 to 55 Gbit without
hanging, but also without the malformed files (i.e. without the bogus memory
requests and without nginx crashes). I am not sure about the correlation.



