Date: Thu, 9 Oct 2014 15:29:26 -0700
From: John-Mark Gurney
To: elof2@sentor.se
Subject: Re: Unable to kill a non-zombie process with -9
Message-ID: <20141009222926.GC1852@funkthat.com>
Cc: freebsd-net, snort-devel mailinglist

elof2@sentor.se wrote this message on Wed, Oct 08, 2014 at 13:30 +0200:
> I guess this is a bug report for FreeBSD 10.0.
>
> Sometimes I can't kill my snort process on FreeBSD 10.0.
> It won't die, even with kill -9.
>
> I'm not talking about a zombie process. Snort is a process that should
> die normally.
> I've run snort on over 100 nodes since FreeBSD v6.x and I've never seen
> this behavior until now in FreeBSD 10.0.
>
> Example:
>
> #ps faxuw
> USER   PID  %CPU %MEM    VSZ    RSS TT  STAT STARTED     TIME COMMAND
> root 49222  53.4  2.2 492648 183012  -  Rs   11:46AM  7:05.59 /usr/local/bin/snort -q -D -c snort.conf
> root 47937   0.0  2.2 488552 182864  -  Ts   10:56AM 29:35.98 /usr/local/bin/snort -q -D -c snort.conf

What is the MWCHAN?  Add l to the ps command...

> The pid 47937 has been killed (repeatedly) with -9.
> Its status is "Ts" meaning it is Stopped.

Have you tried kill -CONT to resume it?

> But it won't actually die and disappear. The only way to get rid of it
> seems to be to reboot the machine. :-(
>
> (pid 49222 is the new process that was started after 47937 was killed)
>
> The problem doesn't happen all the time and I haven't found any patterns
> as to when it does. :-(
> If I restart snort once every day, it fails to die approximately 2-4 times
> per month.
> Even though the problem doesn't happen on every kill, it is definitely a
> recurring event.

Can you run kgdb on the machine?
(yes, it works on a live machine), use info threads to find the thread id,
and then use thread to switch to it, and run bt to get a back trace...

> I began to see it on a heavily loaded 10GE sensor, so I thought it could
> have something to do with the ix driver, or the heavy load.
> But now another FreeBSD 10.0 sensor had the exact same problem, and this
> sensor doesn't have any 10GE NICs. In fact, this sensor has been running
> just fine with both FreeBSD 9.1 and 9.3 for the past years. Snort has
> always terminated correctly! After I reinstalled this machine with FreeBSD
> 10.0 last Friday, snort terminated correctly every day until
> today, when it failed with the above pid 47937. (this sensor uses the 'em'
> driver, not 'ixgbe')
>
> I'm running snort with the same configuration, settings, version, daq,
> libs, etc on 10.0 as I do on 9.3.
> None of the 9.3 sensors have this problem, so it has to be something new
> in FreeBSD 10.0.

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
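
The three checks suggested above (MWCHAN via ps, kill -CONT, and a kgdb back trace) could be collected into a small dry-run helper along these lines. This is only a sketch: it prints the commands for a given PID rather than running them. The `diag_steps` name, the `-p` restriction on ps, the `/boot/kernel/kernel` and `/dev/mem` arguments to kgdb, and the `<id>` placeholder are assumptions for illustration and are not taken from the original message.

```shell
#!/bin/sh
# Dry-run helper: prints the diagnostic commands suggested in the reply
# for one PID instead of executing them (kgdb needs root on the live box).
# Assumptions, not from the original mail: the kernel image path
# /boot/kernel/kernel, /dev/mem as the live-memory target, and "ps -p".

diag_steps() {
    pid="$1"
    # Reject anything that is not a plain decimal PID.
    case "$pid" in
        ''|*[!0-9]*) echo "usage: diag_steps PID" >&2; return 1 ;;
    esac
    # 1. Long ps listing; on FreeBSD the -l format includes MWCHAN.
    echo "ps -l -p $pid"
    # 2. Try to resume the stopped ("T") process so it can act on signals.
    echo "kill -CONT $pid"
    # 3. Inspect the kernel side of the process on the live machine.
    echo "kgdb /boot/kernel/kernel /dev/mem"
    echo "  (kgdb) info threads"
    echo "  (kgdb) thread <id>"   # <id> is whatever info threads reports
    echo "  (kgdb) bt"
}

diag_steps 47937
```

Running it with the stuck PID from the example prints the checklist in order; the kgdb lines are meant to be typed at the kgdb prompt.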