Date:      Thu, 22 Oct 2015 12:08:19 -0500 (CDT)
From:      "Valeri Galtsev" <galtsev@kicp.uchicago.edu>
To:        "Mehmet Erol Sanliturk" <m.e.sanliturk@gmail.com>
Cc:        "Andrea Venturoli" <ml@netfence.it>, "questions@freebsd.org" <questions@freebsd.org>, "Ernie Luzar" <luzar722@gmail.com>
Subject:   Re: Spontaneous reboots with splash
Message-ID:  <16867.128.135.52.6.1445533699.squirrel@cosmo.uchicago.edu>
In-Reply-To: <CAOgwaMvG0VoafNjme_c6dEhQ%2BZsKAO0_Q0i97=ta9=TPF=ZhBw@mail.gmail.com>
References:  <5627D8B8.7030901@netfence.it> <5628CD2B.2000902@gmail.com> <5628CFA7.6040704@netfence.it> <CAOgwaMvH5RbAghKCrhWQ7B=8TUVBxoeAXtrQHGK8qWkwCyXUsg@mail.gmail.com> <5628FD40.1030701@netfence.it> <CAOgwaMvG0VoafNjme_c6dEhQ%2BZsKAO0_Q0i97=ta9=TPF=ZhBw@mail.gmail.com>


On Thu, October 22, 2015 11:18 am, Mehmet Erol Sanliturk wrote:
> On Thu, Oct 22, 2015 at 8:14 AM, Andrea Venturoli <ml@netfence.it> wrote:
>
>> On 10/22/15 14:18, Mehmet Erol Sanliturk wrote:
>>
>> If you have two identical computers with the same programs running:
>>> One is working correctly, but the other one is rebooting arbitrarily:
>>>
>>
>> I've got another identical box; I'll restore a dump on this and see if
>> the
>> behaviour is the same.
>>
>>
>>
>>
>> Therefore , there is a necessity to check that
>>>
>>> - processor is working correctly
>>>
>>
>> CPU Burn-in says yes.
>>
>>
>>
>> - memories are working correctly
>>>
>>
>> Memtest 86+ says so.
>>
>>
>>
>> - memory management chips are working correctly .
>>>
>>
>> I have no idea how to check. How do I do this?
>>
>>
>>
>
> If the memory tests show that the memory is working correctly, it is
> possible to say that the memory management chips are also working
> correctly. Otherwise, it would not be possible to write to and read from
> the chips correctly.
>
> If memory chips fail, then by testing with chips known to work correctly,
> the problem may be attributed to the memory management chips.
>
> Another possibility is the wattage of the power supply: if the required
> wattage exceeds what the power supply can deliver, it may cause
> reboots when power use increases beyond its capacity.
>
>
> Another possibility is that the power supply is cutting power
> spontaneously or causing fluctuations.
>

Yes, I've seen this even when the PS is only marginally pushed toward its
capacity but is old, so its filtering capacitors have lost some of their
capacitance. Excessive ripple on the bus power leads (resulting from the
above), and possibly aged capacitors on the system board (I still call it
that, even though the jargon "motherboard" became standard long ago), are
partly to blame. I've also seen machines start to consume more power some
5 years down the road merely because hard drives age and start drawing
more power.
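
A back-of-the-envelope way to look at the capacity argument above is
sketched below. It is only an illustration: the wattages are made-up
placeholders, not measurements from the machine in this thread, and
modelling an aged supply as "rated watts times a derating factor" is an
assumption of the sketch (the real effect described above is ripple, which
a single number does not capture). The names effective_capacity and
headroom are hypothetical.

```python
# Rough power-budget check. All wattages below are made-up placeholders,
# not measurements from the machine discussed in this thread.

def effective_capacity(rated_watts, derating):
    """Usable output of a supply: rated watts scaled by an ASSUMED
    derating factor in (0, 1] that stands in for ageing."""
    if not 0.0 < derating <= 1.0:
        raise ValueError("derating must be in (0, 1]")
    return rated_watts * derating


def headroom(rated_watts, derating, loads):
    """Fraction of the effective capacity left after summing the peak
    loads (a dict of component name -> watts). Negative means overloaded."""
    capacity = effective_capacity(rated_watts, derating)
    return (capacity - sum(loads.values())) / capacity


if __name__ == "__main__":
    # Hypothetical peak draws in watts.
    loads = {"cpu": 95, "board_and_ram": 50, "disks": 4 * 12, "fans": 10}
    for label, derating in (("new supply", 1.0), ("aged, assumed 80%", 0.8)):
        print(f"{label}: headroom {headroom(300, derating, loads):.0%}")
```

The point of the comparison is that the same load leaves a comfortable
margin on a new supply and a thin one on an aged supply, which is when
reboots under peak load become plausible.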

Incidentally, memtest86 may pass successfully in the above case, because it
runs with essentially zero load on the rest of the system, hence much lower
power consumption.

I also wouldn't rule out the possibility that the BIOS temperature
sensor(s) is (are) being tripped. Investigate that (simply raising the
threshold levels would be a way to test whether this is the case). If you
have AMD CPUs, you should be safe: I heard someone say you can boil water
on them and they still keep running. I once had to live with 96F in the
server room for 2 hours (to let some maintenance be completed) and none of
the Opteron boxes got sick. A few of the Intel ones did...
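
One way to watch temperatures from the running system, rather than from
the BIOS setup screen, is sketched below. It assumes FreeBSD with the
coretemp(4) or amdtemp(4) driver loaded, in which case
"sysctl dev.cpu.0.temperature" reports a value such as "45.0C"; whether
that sysctl exists on a given box is an assumption to verify. The function
names are hypothetical, and the sketch only reads the sensor; it does not
change any BIOS threshold.

```python
import re
import subprocess


def parse_temperature(text):
    """Parse a sysctl temperature value such as '45.0C' into degrees C."""
    match = re.fullmatch(r"\s*(-?\d+(?:\.\d+)?)C\s*", text)
    if match is None:
        raise ValueError(f"unrecognised temperature: {text!r}")
    return float(match.group(1))


def fahrenheit_to_celsius(deg_f):
    """Convert Fahrenheit to Celsius (e.g. for the 96F room quoted above)."""
    return (deg_f - 32.0) * 5.0 / 9.0


def read_cpu_temperature(cpu=0):
    """Read one core's temperature via sysctl.

    ASSUMPTION: dev.cpu.<n>.temperature is present, which requires the
    coretemp or amdtemp kernel module to be loaded.
    """
    result = subprocess.run(
        ["sysctl", "-n", f"dev.cpu.{cpu}.temperature"],
        check=True, capture_output=True, text=True,
    )
    return parse_temperature(result.stdout)
```

Calling read_cpu_temperature(0) in a loop while the machine is under load,
and logging the values to disk, would show whether a reboot is preceded by
a temperature climb.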

Valeri

>
>
>>
>> Another problem may be a program that generates an
>>> invalid address pointing at the boot start code and jumps to it. This is
>>> very easy for an i386 real-mode program.
>>>
>>
>> In that case this program would be FreeBSD! That's why I'm asking here.
>>
>>
>>
>>
>
> If you can isolate the program causing the reboots, it will be possible to
> check its sources and binary file.
>
>
>
>
>>
>> Another possibility is that a program is broken (contains an invalid
>>> address)
>>>
>> > on the HDD. When it starts running, it jumps to that broken address
>> and this
>> > may start the boot.
>>
>> Would a userland program be allowed to do this???
>>
>>
>>
>
> Let's assume that the CPU is not overheated and is not rebooting the
> computer as if the motherboard had just been powered on.
>
> Let's assume that there is no malicious program part causing the
> rebooting.
>
> A broken network card may corrupt data and may cause serious problems.
>
> The remaining possibility is that the instruction counter value is
> corrupted in a program and points into the BIOS boot code area. To reboot
> the computer, it is necessary to start the BIOS boot code.
>
> This may also occur during BIOS-related calls: instead of the proper
> interrupt code, the boot part is invoked.
>
> Otherwise we would have to say that somewhere within the FreeBSD OS there
> is a point that, instead of a proper shutdown, directly reboots the
> computer by calling the BIOS boot code. Checking the panic points and
> searching the OS sources for such reboot code (acting without any error
> message or request for approval from the user) may help.
>
> Here the most important part is to find the program part which is causing
> the reboots. Studying this program part will reveal the reason and,
> therefore, the cure.
>
>
> I cannot say anything definite here about FreeBSD internals, because my
> knowledge of them is not sufficient.
>
>
> Since that computer is not working properly, you can do the following:
> reinstall the OS onto a spare disk and check with it.
>
> This will identify whether the problem is caused by the presently
> installed OS or not.
> If the machine can run a 64-bit OS, testing with such an OS will help
> separate the effect of the OS from that of the hardware.
>
>
>
>
>>
>>  bye & Thanks
>>         av.
>>
>
>
> Mehmet Erol Sanliturk
> _______________________________________________
> freebsd-questions@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-questions
> To unsubscribe, send any mail to
> "freebsd-questions-unsubscribe@freebsd.org"
>


++++++++++++++++++++++++++++++++++++++++
Valeri Galtsev
Sr System Administrator
Department of Astronomy and Astrophysics
Kavli Institute for Cosmological Physics
University of Chicago
Phone: 773-702-4247
++++++++++++++++++++++++++++++++++++++++


