From: Attila Nagy
Date: Mon, 29 Mar 2010 21:21:42 +0200
To: pyunyh@gmail.com
Cc: Mailing List FreeBSD Stable , Michael Loftis
Subject: Re: 8-STABLE freezes on UDP traffic (DNS), 7.x doesn't
Message-ID: <4BB0FDC6.7050105@fsn.hu>
In-Reply-To: <20100329183848.GE1473@michelle.cdnetworks.com>
References: <4BAB718C.3090001@fsn.hu> <886B21E1787F0003B89E34B6@[192.168.1.44]> <4BB087B7.3030602@fsn.hu> <20100329183848.GE1473@michelle.cdnetworks.com>
List-Id: Production branch of FreeBSD source code

Pyun YongHyeon wrote:
> On Mon, Mar 29, 2010 at 12:57:59PM +0200, Attila Nagy wrote:
>
>> Hi,
>>
>> Michael Loftis wrote:
>>
>>> --On Thursday, March 25, 2010 3:22 PM +0100 Attila Nagy
>>> wrote:
>>>
>>> <...>
>>>
>>>> Both unbound and python accept DNS requests, and it seems that when
>>>> interrupt load is at 25%, only unbound is in the *udp state; when it
>>>> is at 50%, both programs are in that state.
>>>>
>>> Try turning off hardware TSO/checksum offload if it's available on your
>>> chipset? ifconfig -rxcsum -txcsum -tso -- I'm only using
>>> nfe chips right now, but w/ the TSO/CSUM on they lock up constantly
>>> under high load. We're pretty sure it's mostly the nfe driver, or the
>>> chips themselves, but have never ruled out some generic 8.x hardware
>>> offload issues.
>>>
>> Bingo, this solved the problem. The current uptime nears four days.
>> Previously I couldn't go further than a day.
>>
>> The machine gets very light TCP load (and other machines which get work
>> well), so I guess it's UDP RX or TX checksum related.
>>
>>
>
> Hmm, this is an unexpected result. Since you're using UDP, TSO is not
> involved in this issue. Because you disabled RX/TX checksum
> offloading, could you check how many 'bad checksum' and
> 'no checksum' packets netstat(1) reports?
> To narrow down which side of checksum offloading causes the issue,
> would you disable just one side at a time? For instance, disable TX
> checksum offloading with RX checksum offloading enabled and see how
> bce(4) works.
> #ifconfig bce0 -txcsum rxcsum
> If that shows the same issue, try disabling RX checksum offloading
> but enabling TX checksum offloading.
> #ifconfig bce0 txcsum -rxcsum
>

Interesting. During the day I disabled only hardware checksum offloading
and left TSO enabled. The machine could not run for more than a few hours.
I have disabled TSO again to see what happens.

BTW, there is of course TCP traffic on that interface as well (DNS is also
available over TCP), so maybe that is what causes the problem.
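
For reference, the one-side-at-a-time test suggested in the quoted message could be sketched roughly as below. This is only an illustration: the helper names offload_cmd and udp_csum_counters are made up for this sketch, bce0 is the interface named in the quoted commands, and the script only prints the ifconfig commands rather than running them.

```shell
#!/bin/sh
# Sketch only: prints the ifconfig commands from the quoted message;
# it does not change any interface settings by itself.

# offload_cmd IFACE SIDE   (hypothetical helper)
# Prints the command that disables checksum offloading on one side
# (tx or rx) while leaving the other side enabled.
offload_cmd() {
    case "$2" in
        tx) echo "ifconfig $1 -txcsum rxcsum" ;;
        rx) echo "ifconfig $1 txcsum -rxcsum" ;;
        *)  echo "usage: offload_cmd IFACE tx|rx" >&2; return 1 ;;
    esac
}

# udp_csum_counters   (hypothetical helper)
# Shows the UDP checksum-related counters from netstat(1), i.e. the
# 'bad checksum' and 'no checksum' lines asked about above.
udp_csum_counters() {
    netstat -s -p udp | grep -i checksum
}

# Step 1: TX offload off, RX offload on.
offload_cmd bce0 tx
# Step 2 (only if step 1 still shows the freeze): RX off, TX on.
offload_cmd bce0 rx
```

Running the printed command as root and comparing udp_csum_counters output before and after each step should show whether the counters grow with one side disabled.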