From: Anton Yuzhaninov <citrin@citrin.ru>
Date: Wed, 25 Jan 2012 16:14:06 +0400
To: freebsd-net@freebsd.org
Message-ID: <4F1FF20E.7080108@citrin.ru>
Subject: livelock with fully loaded em(4)

Hello.

I have test boxes with an em(4) network card (Intel 82563EB), running
FreeBSD 8.2-STABLE from 2012-01-15, amd64.

When this NIC is fully loaded, a livelock occurs: the system becomes
unresponsive even from the local console. To generate load I use
netsend from /usr/src/tools/tools/netrate/, but other traffic sources
(e.g. TCP instead of UDP) cause the same problem.

Two conditions are needed for this livelock:

1. Under full NIC load, the kernel thread "em1 taskq" hogs a CPU.

top -zISHP for interface load a bit less than full. Traffic is
generated by

# netsend 172.16.0.2 9001 8500 14300 3600

where the arguments are destination IP, port, payload size in bytes,
packets per second and duration in seconds (8500-byte UDP payloads at
14300 pps is about 980 Mbit/s on the wire, i.e. nearly gigabit line
rate):

112 processes: 10 running, 82 sleeping, 20 waiting
CPU 0:  0.0% user,  0.0% nice, 27.1% system,  0.0% interrupt, 72.9% idle
CPU 1:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 2:  2.3% user,  0.0% nice, 97.7% system,  0.0% interrupt,  0.0% idle
CPU 3:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 4:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 5:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 6:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 7:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 26M Active, 378M Inact, 450M Wired, 132K Cache, 63M Buf, 15G Free
Swap: 8192M Total, 8192M Free

  PID USERNAME    PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
 7737 ayuzhaninov 119    0  5832K  1116K CPU2    2   0:04 100.00% netsend
    0 root        -68    0     0K   144K -       0   2:17  22.27% {em1 taskq}

top -zISHP for full interface load (some drops occur); load is
generated by

# netsend 172.16.0.2 9001 8500 14400 3600

112 processes: 11 running, 81 sleeping, 20 waiting
CPU 0:  0.0% user,  0.0% nice,  100% system,  0.0% interrupt,  0.0% idle
CPU 1:  4.1% user,  0.0% nice, 95.9% system,  0.0% interrupt,  0.0% idle
CPU 2:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 3:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 4:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 5:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 6:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 7:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 26M Active, 378M Inact, 450M Wired, 132K Cache, 63M Buf, 15G Free
Swap: 8192M Total, 8192M Free

  PID USERNAME    PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
    0 root        -68    0     0K   144K CPU0    0   2:17 100.00% {em1 taskq}
 7759 ayuzhaninov 119    0  5832K  1116K CPU1    1   0:01 100.00% netsend

So pps increased only from 14300 to 14400 (0.7%), while the CPU load
of the "em1 taskq" thread jumped from 27.1% to 100.00%. That alone is
strange, but the system still works fine until I run

# sysctl dev.cpu.0.temperature

2. The sysctl handler for coretemp must execute on the target CPU,
e.g. for dev.cpu.0.temperature the code runs on CPU0. If CPU0 is fully
loaded by "em1 taskq", the sysctl handler for dev.cpu.0.temperature
acquires the Giant mutex and then tries to run on CPU0, but it can't:
CPU0 is busy. When Giant is held that long, the system becomes
unresponsive. In my case Giant is acquired as soon as
"sysctl dev.cpu.0.temperature" starts and is held the whole time
netsend keeps running.
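As far as I can tell from the driver source, the read path looks
roughly like this (a from-memory sketch, not verbatim 8.x code, and
coretemp_read_temp is my placeholder name): the handler binds the
current thread to the target CPU with sched_bind() in order to read
that CPU's thermal MSR, while the sysctl framework above it already
holds Giant because the handler is not marked MPSAFE:

#include <sys/param.h>
#include <sys/bus.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <sys/proc.h>
#include <sys/sched.h>
#include <machine/cpufunc.h>    /* rdmsr() */
#include <machine/specialreg.h> /* MSR_THERM_STATUS */

static int
coretemp_read_temp(device_t dev)
{
        int cpu = device_get_unit(dev); /* target CPU, e.g. 0 */
        uint64_t msr;

        /*
         * Migrate this thread to the target CPU.  If that CPU is
         * monopolized by a kernel thread that never yields ("em1
         * taskq" at 100%), we sit here runnable forever - and the
         * sysctl code that called us already holds Giant.
         */
        thread_lock(curthread);
        sched_bind(curthread, cpu);
        thread_unlock(curthread);

        msr = rdmsr(MSR_THERM_STATUS);  /* per-CPU thermal status */

        thread_lock(curthread);
        sched_unbind(curthread);
        thread_unlock(curthread);

        /*
         * Bits 22:16 hold the offset below Tj(max); the real driver
         * converts this to degrees Celsius.
         */
        return ((msr >> 16) & 0x7f);
}

So the sysctl thread sits on CPU0's run queue waiting for a CPU it
never gets, and everything else that needs Giant piles up behind it,
which matches the "unresponsive even on the console" symptom.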
This seems to be a scheduler problem:

1. Why does "em1 taskq" run only on CPU0? There is no affinity set for
this thread:

# procstat -k 0 | egrep '(PID|em1)'
  PID    TID COMM             TDNAME           KSTACK
    0 100038 kernel           em1 taskq
# cpuset -g -t 100038
tid 100038 mask: 0, 1, 2, 3, 4, 5, 6, 7

2. Why is "em1 taskq" not preempted to let the sysctl handler code
run? This is not a short-lived condition: if netsend runs for an
hour, "em1 taskq" is not preempted for an hour - the sysctl stays
runnable all that time but never gets a chance to execute.

-- 
Anton Yuzhaninov

P. S. I tried EM_MULTIQUEUE, but it doesn't help in my case.
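P. P. S. An untested idea for a stopgap (not a fix): move the taskq
thread off CPU0 with cpuset(1), using the TID from the procstat output
above, e.g.

# cpuset -l 1-7 -t 100038

Even if that helps, it wouldn't explain why the scheduler never
migrates or preempts the thread on its own.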