From: Matej Šerc
To: krad
Cc: freebsd-questions@freebsd.org
Date: Fri, 17 Dec 2010 22:51:54 +0100
Subject: Re: FreeBSD 7.2-RELEASE amd64 hangs

Hi,

thank you very much for all the answers and ideas.

We have found out that after the server was moved to a different switch
in the co-location centre, the network interface and the switch
auto-negotiated 10 Mbit full duplex. After setting it to Gbit manually,
everything seems to be working normally, but I am going to keep checking
it for some more time.

SNMP connects to the local, isolated network from the public one, and we
have control over all the devices in the network.

I will post if anything new happens, but for now it seems this throughput
limitation was causing those issues (although I am still wondering why
there is nothing in the log files; due to the network "overload", every
service we were trying to connect to over the network stopped working).

Thank you for your time.

BR, Matej

On Fri, Dec 17, 2010 at 2:48 PM, krad wrote:

> On 17 December 2010 13:47, krad wrote:
>
>> On 16 December 2010 17:42, Matej Šerc wrote:
>>
>>> Hi,
>>>
>>> I am experiencing a strange issue that has never occurred to me in
>>> all the years of using different versions of FreeBSD.
>>>
>>> One of our servers, which was running without any issues until
>>> yesterday, has stopped responding twice now: yesterday and today.
>>> About three days ago another process pulling SNMP data from devices
>>> was added, but I was watching the system load and the system was
>>> working normally, and the processes were completing successfully
>>> within the 5-minute timeframe (much faster, in fact; they completed
>>> in about 2 minutes). I also want to mention that those SNMP pulling
>>> processes had already been working for about a month on the same
>>> server (no hardware was changed in the meantime) and I am pretty
>>> sure that it should work normally as it did.
>>>
>>> My main problem is that there is absolutely nothing in the log
>>> files: no errors, no warnings, nothing. No strange messages; every
>>> process just stops logging at one point and then continues after the
>>> reboot. Another interesting point is that both hangs occurred at
>>> approximately the same time of day, but there was nobody in the
>>> server room and no one was logged into the server at that time
>>> except me. About 10 minutes before the hang I was investigating
>>> processes and everything was normal: no processes eating large
>>> amounts of CPU or memory. This might be interesting: even after
>>> every process stopped responding, I was still able to ping the
>>> network interfaces and receive ICMP replies back.
>>>
>>> Of course my idea is that it must be connected to some hardware
>>> problem; my suggestion was to run some memory tests. But I would
>>> like to hear some of your opinions about the entire situation. Could
>>> power supply issues be causing it? The server is about a year old
>>> and has, as I already mentioned, worked like a charm until now. How
>>> come there is no kernel panic, since no daemon seems to be working?
>>> Why is the network interface still up and working?
>>>
>>> I was unable to go to the co-location facility, so I can't say what
>>> was on the screen either time, but I suppose there was nothing other
>>> than the messages I can read from the log files.
>>>
>>> I know that 7.2 is a pretty old version, but it was working until
>>> now on the same hardware and we had no reason to change that. After
>>> the reboot the system is again running smoothly and without any
>>> issues at all.
>>>
>>> Thank you very much for any information regarding the issue.
>>>
>>> BR, Matej
>>
>> I'm not a huge fan of letting SNMP spawn heavyweight scripts and
>> processes, as it is too easy for a remote machine to effectively DoS
>> the machine. I realise you are fairly sure the scripts aren't an
>> issue, but try running them from cron every 5 minutes and writing the
>> results to a file. SNMP can then simply retrieve the results from the
>> file. This safeguards you to a certain extent, in that it stops many
>> processes being spawned. All you have to watch after that is the job
>> run time.
>
> It also stops resources being tied up on the monitoring machine, as it
> doesn't have to hang around for x minutes for the results of its query.
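
The cron-and-cache idea suggested above could be sketched roughly as
follows. This is only a minimal Python sketch under stated assumptions,
not the setup used on the server in this thread: the collect() function,
the file paths and the JSON layout are hypothetical placeholders. The
script takes a non-blocking lock so that a slow run is skipped rather
than stacked on top of the previous one, and it replaces the cache file
atomically so that a reader never sees a half-written result.

```python
import fcntl
import json
import os
import tempfile
import time


def collect():
    """Hypothetical stand-in for the real (slow) data collection."""
    return {"example_metric": 42}


def write_atomically(path, text):
    """Write text to path so that readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(text)
        # mkstemp creates the file as 0600; open it up for other readers.
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, path)  # atomic rename on POSIX systems
    except BaseException:
        os.unlink(tmp_path)
        raise


def run_once(cache_path, lock_path, collector=collect):
    """Refresh the cache unless a previous run still holds the lock.

    Returns True if the cache was refreshed, False if the run was skipped.
    """
    with open(lock_path, "w") as lock:
        try:
            fcntl.flock(lock, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return False  # the previous cron run has not finished yet
        started = time.time()
        data = collector()
        record = {
            "collected_at": started,
            "runtime_s": time.time() - started,  # the "job run time"
            "data": data,
        }
        write_atomically(cache_path, json.dumps(record) + "\n")
        return True


if __name__ == "__main__":
    # Hypothetical paths; cron would invoke this script every 5 minutes.
    base = tempfile.gettempdir()
    run_once(os.path.join(base, "snmp_cache.json"),
             os.path.join(base, "snmp_cache.lock"))
```

With something like this run from cron, the SNMP side only has to read
the cache file, so a query costs one small file read instead of a
multi-minute job, and the recorded runtime_s value gives a simple way to
watch the job run time mentioned above.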