Date:      Sun, 22 Feb 2015 15:43:03 -0600 (CST)
From:      "Valeri Galtsev" <galtsev@kicp.uchicago.edu>
To:        "David Benfell" <benfell@parts-unknown.org>
Cc:        cpet <cpet@sdf.org>, Polytropon <freebsd@edvax.de>, freebsd-questions@freebsd.org, galtsev@kicp.uchicago.edu
Subject:   Re: why would I get a segmentation fault on one system but not the  other?
Message-ID:  <11537.76.193.19.10.1424641383.squirrel@cosmo.uchicago.edu>
In-Reply-To: <20150222205918.GA68253@home.parts-unknown.org>
References:  <20150221224006.GA5501@home.parts-unknown.org> <09da5ec0816e098badc49432c802dc18@sdf.org> <390c4c0547fc27e91d28872d29aa2e04@sdf.org> <20150222091956.fd1ec914.freebsd@edvax.de> <20150222104425.GA44573@home.parts-unknown.org> <9134.76.193.19.10.1424620110.squirrel@cosmo.uchicago.edu> <590FB195-C4E9-4D22-8900-ABE784CE9896@parts-unknown.org> <20150222205918.GA68253@home.parts-unknown.org>


On Sun, February 22, 2015 2:59 pm, David Benfell wrote:
> On Sun, Feb 22, 2015 at 12:22:59PM -0800, David Benfell wrote:
>>
>> Sorry for the top post; I'm on my phone now. A photo of the memtest from
>> just before I shut it down is here:
>> https://parts-unknown.org/wp/wp-content/uploads/2015/02/0222150941.jpg
>> Hopefully it will answer some of the questions you pose.
>
> Oh, so I *did* manage to get K-9 configured not to top-post. ;-)
>
>>
>> The segfaults occur at start-up and consistently thereafter but only, so
>> far as I know, with apache and php-fpm. I have not seen segfaults
>> anywhere else on this system. It is plausible that apache is simply
>> reporting segfaults from php. This is why I think something nefarious is
>> happening within the ports.
>
> I am back on site now and, with some help from grep, established that
> the *only* Segmentation faults logged are associated with httpd or
> apache24. Is it really plausible that this can be hardware?
>

No, I too am now inclined to think it is not hardware. Swapping the drives
between the two boxes would confirm that definitively. At this point I
would suspect a memory leak somewhere. One thing to try is to run the
failing program in a debugger: you will see the exact place where it
fails, and whether the failure reproduces. Before that I would take a
quick look around. Is it possible that something calls itself
recursively? That would quickly create new processes, which you should be
able to spot, right? I realize now that the other machine is a production
machine. Can you restore from a backup of the production machine to
re-create it exactly on the failing box? That would cover what swapping
the drives would do, and if it succeeds you will have a working box.
Still, on production machines I would use RELEASE (as was already
mentioned). Sorry, I didn't follow this thread from the beginning; I may
be suggesting something irrelevant...
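
As part of that quick look around, the grep result quoted above could be
tallied per process name, to see at a glance which programs are dying on
signal 11. This is only a rough sketch in Python. It assumes the kernel
logs lines of the form "pid N (name), uid N: exited on signal 11"; the
function name and the sample lines are made up for illustration and are
not taken from the failing box.

```python
import re
from collections import Counter

# Assumed log line shape (not taken from the original thread), e.g.:
#   ... kernel: pid 1234 (httpd), uid 80: exited on signal 11 (core dumped)
SIGNAL_LINE = re.compile(
    r"pid \d+ \((?P<name>[^)]+)\).*exited on signal (?P<sig>\d+)"
)


def count_signal_exits(lines, signal_number=11):
    """Count, per process name, the log lines reporting an exit on signal_number."""
    counts = Counter()
    for line in lines:
        match = SIGNAL_LINE.search(line)
        if match and int(match.group("sig")) == signal_number:
            counts[match.group("name")] += 1
    return counts


# Hypothetical sample lines, invented for this example.
sample = [
    "Feb 22 09:41:07 host kernel: pid 1234 (httpd), uid 80: exited on signal 11",
    "Feb 22 09:41:09 host kernel: pid 1250 (php-fpm), uid 80: exited on signal 11 (core dumped)",
    "Feb 22 09:42:00 host kernel: pid 1300 (httpd), uid 80: exited on signal 11",
    "Feb 22 09:43:00 host kernel: pid 1400 (sh), uid 0: exited on signal 6",
]

for name, n in count_signal_exits(sample).most_common():
    print(f"{name}: {n}")
```

To run it against a real log, pass an open file instead of the sample
list, for example open("/var/log/messages", errors="replace"), assuming
that is where the kernel messages end up on the box in question.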

Good luck!

Valeri

++++++++++++++++++++++++++++++++++++++++
Valeri Galtsev
Sr System Administrator
Department of Astronomy and Astrophysics
Kavli Institute for Cosmological Physics
University of Chicago
Phone: 773-702-4247
++++++++++++++++++++++++++++++++++++++++


