From: Mehmet Erol Sanliturk
To: galtsev@kicp.uchicago.edu
Cc: Andrea Venturoli, "questions@freebsd.org", Ernie Luzar
Date: Thu, 22 Oct 2015 14:36:02 -0700
Subject: Re: Spontaneous reboots with splash
In-Reply-To: <16867.128.135.52.6.1445533699.squirrel@cosmo.uchicago.edu>
References: <5627D8B8.7030901@netfence.it> <5628CD2B.2000902@gmail.com> <5628CFA7.6040704@netfence.it> <5628FD40.1030701@netfence.it> <16867.128.135.52.6.1445533699.squirrel@cosmo.uchicago.edu>

On Thu, Oct 22, 2015 at 10:08 AM, Valeri Galtsev wrote:

>
> On Thu, October 22, 2015 11:18 am, Mehmet Erol Sanliturk wrote:
> > On Thu, Oct 22, 2015 at 8:14 AM, Andrea Venturoli wrote:
> >
> >> On 10/22/15 14:18, Mehmet Erol Sanliturk wrote:
> >>
> >>> If you have two identical computers with the same programs running :
> >>> One is working correctly , but other one is booting arbitrarily :
> >>
> >> I've got another identical box; I'll restore a dump on this and see if
> >> the behaviour is the same.
> >>
> >>> Therefore , there is a necessity to check that
> >>>
> >>> - processor is working correctly
> >>
> >> CPU Burn-in says yes.
> >>
> >>> - memories are working correctly
> >>
> >> Memtest 86+ says so.
> >>
> >>> - memory management chips are working correctly .
> >>
> >> I have no idea how to check. How do I do this?
> >>
> >
> > If memory tests are showing memories are working correctly , it is
> > possible to say that memory management chips are also working correctly .
> > Otherwise , it is not possible to write into and read from chips correctly .
> >
> > If memory chips fail , by testing with correctly working chips known , the
> > problem may be attributed to memory management chips .
> >
> > Another possibility is the Watt level of Power Supply : If the required
> > watts is exceeding the existent power supply watts level , it may cause
> > reboots when power use increases beyond its capacity .
> >
> > Another possibility is power supply is cutting power spontaneously or
> > causing fluctuations .
> >
>
> Yes, I've seen this even if PS is marginally pushed to its capacity, and
> it is old, therefore filtering capacitors lost some of their capacitance.
> Excessive ripple on bus power leads (resulting from the above) and
> possibly aged capacitors of the system board (I still call it that way
> even though long ago the jargon "motherboard" became a standard) partly to
> blame. I've seen the machines starting to consume more power some 5 years
> down the road merely because hard drives age, and start consuming more
> power.
>
> Incidentally, memtest86 may pass successfully in the above case, as it
> runs with zero load, hence much less power consumption.
>
> I also wouldn't discard the possibility that BIOS temperature sensor(s) is
> (are) tripped - investigate that (simply increasing threshold levels would
> be the way to test if this is the case). If you have AMD CPUs, you should
> be safe. I heard someone said you can boil water on them and they still
> keep running. I had once to live with 96F in the server room for 2 hours
> (to let some maintenance be completed) and none of Opteron boxes got sick.
> A few of Intel ones did...
>
> Valeri
>
> >>> Another problem may be a program which is causing generation of an
> >>> invalid address showing boot start code and jumping into it . This is
> >>> very easy for a i386 real mode program .
> >>
> >> In that case this program would be FreeBSD! That's why I'm asking here.
> >>
> >
> > If you can isolate the program causing boots , it will be possible to
> > check its sources and binary file .
> >
> >>> Another possibility is that a program is broken ( contains an invalid
> >>> address ) in HDD . When it starts to working , it jumps to that broken
> >>> address and this may start the boot .
> >>
> >> Would a userland program be allowed to do this???
> >>
> >
> > Let's assume that CPU is not over-heated and is not rebooting the computer
> > like motherboard is powered .
> >
> > Let's assume that there is no any malicious program part to cause
> > rebooting .
> >
> > A broken network card may corrupt data and may cause serious problems .
> >
> > The remaining possibility is that instruction counter value is destroyed
> > in a program and showing the BIOS boot code area . To reboot the computer ,
> > it is necessary to start BIOS boot code
> >
> > This may occur also during BIOS related calls . Instead of a proper
> > interrupt code , boot part is invoked .
> >
> > Otherwise we will say that within FreeBSD OS parts , there is a point that
> > , instead of a proper shut down , it is directly rebooting the computer by
> > calling BIOS boot code . Checking panic points and searching OS sources
> > for such a reboot code ( without any error message and request approval
> > from the user ) existence may help .
> >
> > Here the most important part is to find the program part which is causing
> > the reboots . Studying this program part will reveal the reason and ,
> > therefore the cure .
> >
> > I can not say any correct sentence here about FreeBSD internals due to (
> > not sufficient knowledge ) .
> >
> > Since that computer is not working properly , you can do the following :
> > Reinstall OS into a spare disk and check with it .
> >
> > This will identify whether problem is caused by the presently installed OS
> > or not .
> > If it can execute 64-bits OS , testing with such an OS will identify
> > effect of OS or hardware .
> >
> >> bye & Thanks
> >> av.
> >
> > Mehmet Erol Sanliturk
>
> ++++++++++++++++++++++++++++++++++++++++
> Valeri Galtsev
> Sr System Administrator
> Department of Astronomy and Astrophysics
> Kavli Institute for Cosmological Physics
> University of Chicago
> Phone: 773-702-4247
> ++++++++++++++++++++++++++++++++++++++++
>

Another important trouble point is the HDD cables: a faulty cable can
corrupt programs as they are loaded from disk, and a corrupted program
can then misbehave in the ways described above. Checking the HDD cables
(or simply replacing them) may be useful.

Mehmet Erol Sanliturk
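
P.S. One way to look for evidence of a bad cable before swapping it is
the drive's own SMART counters. Many ATA/SATA drives report interface
CRC errors as attribute 199 (often named UDMA_CRC_Error_Count), and a
raw value that keeps growing usually points at the cable or connector
rather than the platters. The sketch below is only an illustration: it
assumes smartmontools is installed and that the attribute table has the
usual ten-column layout printed by "smartctl -A". The function name and
the sample table are made up for the example, and some drives do not
report this attribute at all.

```python
def udma_crc_error_count(smartctl_output):
    """Return the raw value of SMART attribute 199, or None if it is absent."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have ten columns: ID, name, flag, value, worst,
        # thresh, type, updated, when_failed, raw value.
        if len(fields) >= 10 and fields[0] == "199":
            try:
                return int(fields[9])
            except ValueError:
                # Raw value is not a plain integer on this drive.
                return None
    return None


# Hypothetical sample in the layout of "smartctl -A"; not taken from
# the machine discussed in this thread.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       37
"""

if __name__ == "__main__":
    print(udma_crc_error_count(SAMPLE))
```

To use it on a real disk, save the output of "smartctl -A /dev/ada0"
(the device name is an assumption; substitute your own) and pass the
text to the function. Compare the count before and after a period of
heavy disk activity: an increasing value suggests the cable, while a
constant value only records errors from the past.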