Date:      Thu, 26 May 2011 09:51:23 +0200
From:      Damien Fleuriot <ml@my.gd>
To:        freebsd-hackers@freebsd.org
Subject:   Re: DEBUG - analysing core dumps
Message-ID:  <4DDE067B.7080605@my.gd>
In-Reply-To: <BANLkTin8uLk3ZkjaNhurH3+VAE1HvhSPag@mail.gmail.com>
References:  <4DDD3021.1000109@my.gd> <BANLkTin8uLk3ZkjaNhurH3+VAE1HvhSPag@mail.gmail.com>



On 5/25/11 7:10 PM, Garrett Cooper wrote:
> On Wed, May 25, 2011 at 9:36 AM, Damien Fleuriot <ml@my.gd> wrote:
>> Hello list,
>>
>>
>>
>> We've got these boxes at work running FreeBSD 8.1-STABLE amd64 and
>> serving as firewalls and openvpn gateways.
>>
>> We use CARP interfaces to provide an active-passive fault tolerant system.
>>
>>
>> Today, we received a nagios alert from the master box saying its
>> rsyslogd process had crashed.
>>
>> I logged on to it and tried to relaunch it, to no avail:
>> pid 2303 (rsyslogd), uid 0: exited on signal 11 (core dumped)
>>
>>
>>
>>
>> I would like advice on how to debug the output from the core dump.
>>
>> This is what I get from gdb:
>>
>> # gdb
>> GNU gdb 6.1.1 [FreeBSD]
>> Copyright 2004 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and you are
>> welcome to change it and/or distribute copies of it under certain
>> conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB.  Type "show warranty" for details.
>> This GDB was configured as "amd64-marcel-freebsd".
>> (gdb) core rsyslogd.core
>> Core was generated by `rsyslogd'.
>> Program terminated with signal 11, Segmentation fault.
>> #0  0x00000000004258ec in ?? ()
>>
>>
>>
>>
>> Sadly, getting a backtrace with "bt" gives me more lines with "??",
>> which is totally not helpful:
>> [SNIP]
>> #13 0x00007fffff1f9d70 in ?? ()
>> #14 0x0000000000000000 in ?? ()
>> #15 0x6f70732f7261762f in ?? ()
>> #16 0x6c737973722f6c6f in ?? ()
>> #17 0x5f6e70766f2f676f in ?? ()
>> #18 0x746174732e676f6c in ?? ()
>> #19 0x0000000000000065 in ?? ()
>> #20 0x0000000000000000 in ?? ()
>> [SNIP]
>>
>> What steps should I follow to get more information?
>>
>>
>>
>> Also, I believe that core dumps with signal 11 often indicate RAM
>> problems, and I would like confirmation of that.
>>
>> I am concerned because rsyslogd is the only process that crashes in this
>> way, even after I rebooted the firewall.
> 
>     Rebuild and reinstall rsyslogd with debug symbols and see if you
> can get a reasonable stack trace. Something else to try before that to
> narrow down the problem section of code is ktrace/kdump it, or truss
> it, and see if it's trying to open/read from a file and failing.
> Thanks,
> -Garrett




Thanks everyone for your answers. I'll recompile with DEBUG and obtain a
new core dump.

I'll also investigate the possibility of corrupted spool files and post
the resolution here :)
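
As an aside on the spool file idea: frames #15 to #19 in the backtrace
quoted above do not look like code addresses. Every byte in them is
printable ASCII, which suggests gdb walked off the real call stack and
into string data. The sketch below is only a rough check, assuming the
words are stored little-endian (as on amd64) and were copied correctly
from the trace; decode_stack_words is a made-up helper name, not part of
gdb or rsyslog.

```python
import struct

# Frame "addresses" #15-#19, copied from the backtrace quoted above.
FRAMES = [
    0x6f70732f7261762f,
    0x6c737973722f6c6f,
    0x5f6e70766f2f676f,
    0x746174732e676f6c,
    0x0000000000000065,
]


def decode_stack_words(words):
    """Pack each 64-bit word little-endian and read the bytes as text."""
    raw = b"".join(struct.pack("<Q", w) for w in words)
    # Stop at the first NUL byte, the way a C string would end.
    raw = raw.split(b"\x00", 1)[0]
    return raw.decode("ascii", errors="replace")


if __name__ == "__main__":
    # Prints: /var/spool/rsyslog/ovpn_log.state
    print(decode_stack_words(FRAMES))
```

The decoded text reads as a path under the rsyslog spool directory. That
does not prove the file is the cause of the crash, but it hints that the
process was handling that state file when it died, so it seems a
reasonable place to start looking.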


--
dfl


