Date: Thu, 26 May 2011 09:51:23 +0200
From: Damien Fleuriot <ml@my.gd>
To: freebsd-hackers@freebsd.org
Subject: Re: DEBUG - analysing core dumps
Message-ID: <4DDE067B.7080605@my.gd>
In-Reply-To: <BANLkTin8uLk3ZkjaNhurH3%2BVAE1HvhSPag@mail.gmail.com>
References: <4DDD3021.1000109@my.gd> <BANLkTin8uLk3ZkjaNhurH3%2BVAE1HvhSPag@mail.gmail.com>
On 5/25/11 7:10 PM, Garrett Cooper wrote:
> On Wed, May 25, 2011 at 9:36 AM, Damien Fleuriot <ml@my.gd> wrote:
>> Hello list,
>>
>> We've got these boxes at work running FreeBSD 8.1-STABLE amd64 and
>> serving as firewalls and openvpn gateways.
>>
>> We use CARP interfaces to provide an active-passive fault tolerant system.
>>
>> Today, we received a nagios alert from the master box saying its
>> rsyslogd process had crashed.
>>
>> I logged on to it and tried to relaunch it, to no avail:
>> pid 2303 (rsyslogd), uid 0: exited on signal 11 (core dumped)
>>
>> I would like advice on how to debug the output from the core dump.
>>
>> This is what I get from gdb:
>>
>> # gdb
>> GNU gdb 6.1.1 [FreeBSD]
>> Copyright 2004 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and you are
>> welcome to change it and/or distribute copies of it under certain
>> conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB. Type "show warranty" for details.
>> This GDB was configured as "amd64-marcel-freebsd".
>> (gdb) core rsyslogd.core
>> Core was generated by `rsyslogd'.
>> Program terminated with signal 11, Segmentation fault.
>> #0 0x00000000004258ec in ?? ()
>>
>> Sadly, getting a backtrace with "bt" gives me more lines with "??",
>> which is totally not helpful:
>> [SNIP]
>> #13 0x00007fffff1f9d70 in ?? ()
>> #14 0x0000000000000000 in ?? ()
>> #15 0x6f70732f7261762f in ?? ()
>> #16 0x6c737973722f6c6f in ?? ()
>> #17 0x5f6e70766f2f676f in ?? ()
>> #18 0x746174732e676f6c in ?? ()
>> #19 0x0000000000000065 in ?? ()
>> #20 0x0000000000000000 in ?? ()
>> [SNIP]
>>
>> I am not sure what steps I should follow to get more information?
>>
>> Also, I believe that often, core dumps with signal 11 = RAM problems and
>> I would like a confirmation here.
>>
>> I am concerned because rsyslogd is the only process that crashes in this
>> way, even after I rebooted the firewall.
>
> Rebuild and reinstall rsyslogd with debug symbols and see if you
> can get a reasonable stack trace. Something else to try before that to
> narrow down the problem section of code is ktrace/kdump it, or truss
> it, and see if it's trying to open/read from a file and failing.
> Thanks,
> -Garrett

Thanks everyone for your answers, I'll recompile with DEBUG and obtain a
new core dump.

I'll also investigate the possibility of corrupted spool files and post
the resolution here :)

--
dfl
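A side note on the quoted backtrace: frames #15 to #19 do not look like code addresses. Every byte in them falls in the printable ASCII range, so they may be string data that gdb walked over because it had no symbols to unwind with. The sketch below is only an illustration of that reading, not a diagnosis. It assumes the values are 64-bit little-endian words (the byte order on amd64), and the helper name decode_frames is made up for this example.

```python
import struct

# Frame "addresses" #15 to #19, copied from the backtrace quoted above.
frames = [
    0x6f70732f7261762f,
    0x6c737973722f6c6f,
    0x5f6e70766f2f676f,
    0x746174732e676f6c,
    0x0000000000000065,
]

def decode_frames(words):
    """Hypothetical helper: pack each 64-bit word little-endian
    (assumed amd64 byte order) and read the bytes as ASCII text."""
    raw = b"".join(struct.pack("<Q", w) for w in words)
    # Keep only the text before the first NUL byte, as for a C string.
    return raw.split(b"\x00", 1)[0].decode("ascii", errors="replace")

print(decode_frames(frames))
```

If the result reads as a file path, that file is a reasonable thing to look at first when checking the spool files, and a ktrace or truss run should show whether rsyslogd opens it just before the crash.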