From: Derek Ragona
Date: Thu, 03 Dec 2009 07:29:28 -0600
To: "Igor V. Ruzanov"
Cc: freebsd-questions@freebsd.org
Subject: Re: FreeBSD 8.0 retires into itself
Message-Id: <6.0.0.22.2.20091203072542.0265ade0@mail.computinginnovations.com>
References: <6.0.0.22.2.20091203054150.02631b68@mail.computinginnovations.com>

At 06:48 AM 12/3/2009, Igor V. Ruzanov wrote:
>On Thu, 3 Dec 2009, Derek Ragona wrote:
>
>|At 04:28 AM 12/3/2009, Igor V. Ruzanov wrote:
>|> Hello!
>|>
>|> I have updated FreeBSD 8.0 sources via cvsup and compiled system. uname -a
>|> shows:
>|>
>|> FreeBSD localhost 8.0-RELEASE FreeBSD 8.0-RELEASE #2: Mon Nov 30 20:15:12 MSD
>|> 2009 root@localhost:/usr/src/sys/i386/compile/HOME-PAE i386
>|>
>|> Machine has 3 physical interfaces:
>|> - em0 (PCI/Intel PWLA 8390 MT)
>|> - em1 (PCI/Intel PWLA 8390 MT)
>|> - fxp0 (PCI/Intel EtherExpress PRO/100)
>|>
>|> and 2 VLANs: vlan317 and vlan320.
>|>
>|> Also there is one interface built in motherboard:
>|> - ale0 (PCI-E/Atheros AR8121)
>|>
>|> One physical interface (em0) is in trunk mode (802.1Q) to configure these two
>|> VLANs (vlan317 and vlan320) interfaces. Machine acts as BGP router. It has 3
>|> uplinks:
>|> - vlan317
>|> - vlan320
>|> - fxp0
>|>
>|> and one backbone interface:
>|> - em1.
>|>
>|> Next, i recompiled all userland and made all necessary configurations after
>|> which the machine became as production BGP router installed in server room.
>|> So issue looks like the following:
>|>
>|> After 20-30 minutes of stable work, the system starts to "retire into
>|> itself": any user processes (bgpd, zebra, named) don't respond. For example, I
>|> can't telnet to bgpd control terminal, telnet just dies showing:
>|> Trying 127.0.0.1...
>|> Connected to localhost.
>|> Escape character is '^]'
>|>
>|> I even tried to login into system from local console. But when i pressed
>|> Enter after username was typed, the console just hang. Power button also
>|> doesn't respond (in usual case pressing on Power button gives the machine is
>|> going to power off). One interesting thing: after system was booted, top
>|> command shows:
>|>
>|> system eats about 28-30% of CPU time
>|> interrupts eat about only 6-7% of CPU time
>|> all user processes eat less than 0-1% of CPU time
>|>
>|> On another working machine (same BGP router, but system is FreeBSD 7.0-STABLE
>|> p4) the picture seems to be different:
>|>
>|> system eats 9-10% of CPU time
>|> interrupts eat 15-16% of CPU time
>|>
>|> So my question is the REASONS that cause such system behavior. I read
>|> UPDATING, so kernel in FreeBSD 8.0 RELEASE was largely reworked, in
>|> particular - SMPng in order to remove all non-MPSAFE driver's locks (netperf
>|> project). Are there new specific kernel config options to get better
>|> performance of network subsystem? Or should i set some sysctl variables?
>|>
>|> My hardware:
>|> - Motherboard: ASUS P5P43TD (with built in Gigabit LAN Atheros AR8121)
>|> - Core 2 Quad CPU
>|> - 4G RAM (2x2048)
>|>
>|> kernel compiled with PAE support, ULE-scheduler, with PREEMPTION option.
>|> If you need whole kernel config, please let me know, i will post it ASAP.
>|>
>|>
>
>|You need to check your network setups:
>|ifconfig -a
>|
>|You can really only have one NIC on a single network. With multiple NICs if
>|they are on the same network, you will have arp issues causing routing issues.
>|You can easily check the arp table before and after you see this behavior
>|doing:
>|arp -a
>|after a reboot, then after the system becomes unresponsive after 30-40 minutes.
>|
>|Multiple NICs are necessary if you are using this system as a firewall or
>|packet filter.
>|
>|To narrow down your problem you may want to disable any NICs that are not
>|necessary and see if the problem persists.
>|
>
>Thank you for reply, Derek!
>
>I have different non-overlapped subnets on used network interfaces.
>Actually, my machine acts as a border rather than just a router. And it
>needs several network interface cards (NICs) - one of them looks in my
>network (my Autonomous System with my internal routing), and another ones
>look to different ISPs with their own ASs. It gives possibility to make a
>choice of more cheap route to any Internet resource.
>
>By the way, when i tested just installed system under traffic load
>generated with iperf tool, the system worked fine during several days.
>Configuration was the same except only one NIC was under traffic load. And
>similar tests with each NIC installed in my machine yielded the same good
>results.

Since it seems tied to load, which NIC is causing the trouble? I'd suspect
the motherboard NIC. I have used many Intel NICs without problems. In
multi-NIC servers I set up, I usually add a quad-port Intel card and don't
use the motherboard NICs.

You may want to try using a different NIC in place of the onboard one and
see if the problem persists.

-Derek
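
The before-and-after ARP comparison suggested earlier in the thread can be
scripted. Below is a minimal Python sketch, not something posted by either
participant. It assumes `arp -a` prints entries in the usual FreeBSD form
"? (IP) at MAC on IFNAME ...". The function names parse_arp and diff_arp
and all sample addresses are hypothetical and chosen only for illustration.

```python
import re

# Assumed line format of `arp -a`, for example:
#   ? (10.0.0.1) at 00:11:22:33:44:55 on em1 expires in 1200 seconds [ethernet]
ARP_LINE = re.compile(
    r"\((?P<ip>[0-9.]+)\) at (?P<mac>[0-9a-fA-F:]+|\(incomplete\)) on (?P<ifname>\S+)"
)


def parse_arp(text):
    """Return {ip: (mac, interface)} for every entry recognised in text."""
    table = {}
    for line in text.splitlines():
        match = ARP_LINE.search(line)
        if match:
            table[match.group("ip")] = (
                match.group("mac").lower(),
                match.group("ifname"),
            )
    return table


def diff_arp(before, after):
    """Return (ip, old, new) tuples for entries that changed.

    old or new is None when the entry exists in only one snapshot.
    """
    changes = []
    for ip in sorted(set(before) | set(after)):
        old, new = before.get(ip), after.get(ip)
        if old != new:
            changes.append((ip, old, new))
    return changes


if __name__ == "__main__":
    # Hypothetical snapshots; real ones would come from saved `arp -a` output.
    snapshot_before = (
        "? (10.0.0.1) at 00:11:22:33:44:55 on em1 expires in 1200 seconds [ethernet]\n"
        "? (10.0.0.2) at 00:11:22:33:44:66 on em1 permanent [ethernet]\n"
    )
    snapshot_after = (
        "? (10.0.0.1) at 00:aa:bb:cc:dd:ee on fxp0 expires in 30 seconds [ethernet]\n"
        "? (10.0.0.2) at 00:11:22:33:44:66 on em1 permanent [ethernet]\n"
    )
    for ip, old, new in diff_arp(parse_arp(snapshot_before), parse_arp(snapshot_after)):
        print(ip, old, "->", new)
```

To use it on the router, one would save the output of arp -a to a file
right after a reboot and again once the system starts misbehaving, then
feed both files to parse_arp. An address whose MAC or interface changes
between the two snapshots is a candidate for the ARP conflict described
above.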