From owner-freebsd-smp@FreeBSD.ORG Mon Nov 10 09:06:07 2008
From: "Archimedes Gaviola" <archimedes.gaviola@gmail.com>
To: smp@freebsd.org
Date: Mon, 10 Nov 2008 16:33:23 +0800
Subject: CPU affinity with ULE scheduler

To Whom It May Concern:

Can someone explain or share information about the ULE scheduler
(version 2, if I'm not mistaken) and how it deals with CPU affinity?
Are there any existing benchmarks for this on FreeBSD? I am currently
using the 4BSD scheduler, and I have observed that when processing high
network traffic load across multiple CPU cores, only one CPU is
stressed with the network interrupt while the rest are mostly idle.
This is an AMD64 IBM system with four dual-core CPUs and Gigabit
Broadcom network interface cards (bce0 and bce1). Below is a snapshot
of the case.

  PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
   17 root        1 171   52     0K    16K RUN    0  96:04 97.71% idle: cpu0
   15 root        1 171   52     0K    16K RUN    2  98:41 97.07% idle: cpu2
   14 root        1 171   52     0K    16K RUN    3 103:56 95.90% idle: cpu3
   13 root        1 171   52     0K    16K RUN    4 104:17 88.23% idle: cpu4
   12 root        1 171   52     0K    16K RUN    5  97:59 86.57% idle: cpu5
   10 root        1 171   52     0K    16K RUN    7  81:51 82.08% idle: cpu7
   11 root        1 171   52     0K    16K RUN    6  95:28 81.35% idle: cpu6
   16 root        1 171   52     0K    16K RUN    1 102:15 77.78% idle: cpu1
   36 root        1 -68 -187     0K    16K WAIT   7  19:37  4.59% irq23: bce0 bce1
   18 root        1 -32 -151     0K    16K CPU0   0   2:13  0.00% swi4: clock sio
 4488 root        1  96    0 30728K  4292K select 3   1:51  0.00% sshd
   43 root        1 171   52     0K    16K pgzero 3   1:08  0.00% pagezero
  218 root        1  96    0  3852K  1380K select 3   0:38  0.00% syslogd
   20 root        1 -44 -163     0K    16K WAIT   7   0:32  0.00% swi1: net

Thanks,
Archimedes
From owner-freebsd-smp@FreeBSD.ORG Mon Nov 10 14:05:20 2008
From: Ivan Voras
To: freebsd-smp@freebsd.org
Date: Mon, 10 Nov 2008 13:56:40 +0100
Subject: Re: CPU affinity with ULE scheduler

Archimedes Gaviola wrote:
> To Whom It May Concern:
>
> Can someone explain or share information about the ULE scheduler
> (version 2, if I'm not mistaken) and how it deals with CPU affinity?
> Are there any existing benchmarks for this on FreeBSD?

Yes, but not for network loads. See for example the benchmarks in
http://people.freebsd.org/~kris/scaling/7.0%20and%20beyond.pdf

> I am currently using the 4BSD scheduler, and I have observed that when
> processing high network traffic load across multiple CPU cores, only
> one CPU is stressed with the network interrupt while the rest are
> mostly idle. This is an AMD64 IBM system with four dual-core CPUs and
> Gigabit Broadcom network interface cards (bce0 and bce1). Below is a
> snapshot of the case.

This is unfortunately so and cannot be changed for now - you are not the
first with this particular performance problem. BUT, looking at the data
in the snapshot you gave, it's not clear that there is a performance
problem in your case - bce is not taking nearly enough CPU time to be a
bottleneck. What exactly do you think is wrong in your case?

From owner-freebsd-smp@FreeBSD.ORG Mon Nov 10 22:34:28 2008
From: John Baldwin <jhb@freebsd.org>
To: freebsd-smp@freebsd.org
Cc: smp@freebsd.org, Archimedes Gaviola <archimedes.gaviola@gmail.com>
Date: Mon, 10 Nov 2008 17:33:04 -0500
Subject: Re: CPU affinity with ULE scheduler

On Monday 10 November 2008 03:33:23 am Archimedes Gaviola wrote:
> Can someone explain or share information about the ULE scheduler
> (version 2, if I'm not mistaken) and how it deals with CPU affinity?
> Are there any existing benchmarks for this on FreeBSD? I am currently
> using the 4BSD scheduler, and I have observed that when processing high
> network traffic load across multiple CPU cores, only one CPU is
> stressed with the network interrupt while the rest are mostly idle.
> This is an AMD64 IBM system with four dual-core CPUs and Gigabit
> Broadcom network interface cards (bce0 and bce1). Below is a snapshot
> of the case.

Interrupts are routed to a single CPU. Since bce0 and bce1 are both on the
same interrupt (irq 23), the CPU that interrupt is routed to is going to
end up handling all the interrupts for bce0 and bce1. This is not
something ULE or 4BSD has any control over.
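[Editor's note: the routing John describes can be inspected, and on later
releases changed, from the command line. A minimal sketch, assuming the
irq 23 and PID numbers from the snapshot above; the `-x` interrupt option
of cpuset(1) postdates the 6.2/7.0 releases discussed in this thread, so
treat these as illustrative rather than commands for the poster's system.]

```shell
# Per-interrupt counters: bce0 and bce1 show up on a single line
# (irq23) because they share one interrupt line.
vmstat -i

# On releases where cpuset(1) accepts an IRQ number (-x), move irq 23
# onto a chosen CPU, e.g. CPU 2:
cpuset -x 23 -l 2

# Keep a userland daemon (PID 4488, the sshd from the snapshot) off the
# interrupt CPU so it stays responsive under load:
cpuset -l 0-6 -p 4488
```

If the two NICs were on separate interrupt vectors, each could be pinned
to its own core instead of sharing one.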
-- 
John Baldwin

From owner-freebsd-smp@FreeBSD.ORG Tue Nov 11 04:32:57 2008
From: "Archimedes Gaviola" <archimedes.gaviola@gmail.com>
To: "John Baldwin" <jhb@freebsd.org>
Cc: smp@freebsd.org, freebsd-smp@freebsd.org
Date: Tue, 11 Nov 2008 12:32:55 +0800
Subject: Re: CPU affinity with ULE scheduler

On Tue, Nov 11, 2008 at 6:33 AM, John Baldwin wrote:
> Interrupts are routed to a single CPU. Since bce0 and bce1 are both on
> the same interrupt (irq 23), the CPU that interrupt is routed to is
> going to end up handling all the interrupts for bce0 and bce1. This is
> not something ULE or 4BSD has any control over.

Hi John,

I'm sorry for the wrong snapshot. Here's the right one with my concern.

  PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
   17 root        1 171   52     0K    16K CPU0   0  54:28 95.17% idle: cpu0
   15 root        1 171   52     0K    16K CPU2   2  55:55 93.65% idle: cpu2
   14 root        1 171   52     0K    16K CPU3   3  58:53 93.55% idle: cpu3
   13 root        1 171   52     0K    16K RUN    4  59:14 82.47% idle: cpu4
   12 root        1 171   52     0K    16K RUN    5  55:42 82.23% idle: cpu5
   16 root        1 171   52     0K    16K CPU1   1  58:13 77.78% idle: cpu1
   11 root        1 171   52     0K    16K CPU6   6  54:08 76.17% idle: cpu6
   36 root        1 -68 -187     0K    16K WAIT   7   8:50 65.53% irq23: bce0 bce1
   10 root        1 171   52     0K    16K CPU7   7  48:19 29.79% idle: cpu7
   43 root        1 171   52     0K    16K pgzero 2   0:35  1.51% pagezero
 1372 root       10  20    0 16716K  5764K kserel 6  58:42  0.00% kmd
 4488 root        1  96    0 30676K  4236K select 2   1:51  0.00% sshd
   18 root        1 -32 -151     0K    16K WAIT   0   1:14  0.00% swi4: clock s
   20 root        1 -44 -163     0K    16K WAIT   0   0:30  0.00% swi1: net
  218 root        1  96    0  3852K  1376K select 0   0:23  0.00% syslogd
 2171 root        1  96    0 30676K  4224K select 6   0:19  0.00% sshd

I was actually doing network performance testing on this system with
FreeBSD 6.2-RELEASE and its default 4BSD scheduler. I used a tool to
generate a large amount of traffic, around 600-700 Mbps, traversing the
FreeBSD system in both directions, meaning both network interfaces were
receiving traffic. What happened was that the CPU handling irq 23 for
both interfaces (cpu7) reached around 65.53% utilization, which affected
other running applications and services such as sshd and httpd; the
system was no longer accessible while traffic was being generated. Since
only one CPU was being stressed, I was thinking of moving to FreeBSD
7.0-RELEASE with the ULE scheduler, because I thought the problem had to
do with how the scheduler distributes load across multiple CPU cores,
especially network load. So, if this is a matter of interrupt handling
rather than the scheduler, is there a way to optimize it? If interrupts
are still routed to only one CPU, that is still inefficient. What
handles interrupt-to-CPU binding, and can a shared IRQ be avoided? Are
there any improvements in FreeBSD 7.0 with regard to interrupt handling?

Thanks,
Archimedes
From owner-freebsd-smp@FreeBSD.ORG Tue Nov 11 07:02:20 2008
From: "Archimedes Gaviola" <archimedes.gaviola@gmail.com>
To: ivoras@freebsd.org
Cc: smp@freebsd.org, freebsd-smp@freebsd.org
Date: Tue, 11 Nov 2008 15:02:19 +0800
Subject: Re: CPU affinity with ULE scheduler
b=XyV5Ul5xeFVNuI8+O8OLTQRbfU5g6Lx35iNzmctocYv1YOepbF5MbWCSK/BI0qumYz a4m8vs6KcSdqrEymm3fmTK9MBqaRZ1w96Cti8LxNan0hBi5VrqFyoAefSLp6feLlWBXc +9k7M38JkI721LVwKPlQpLQfbHBFuPXXRwRFg= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:to:subject:cc:in-reply-to:mime-version :content-type:content-transfer-encoding:content-disposition :references; b=GUwZeAGZVWIRx2yLMJnNE6ecR9FZ4Co49xa+1HLxAa/cvJq1FJwkuJpGT/0IHyczWi kfxUy8L/3wNqNCNogTCtHrH71CaWcwJBrd9K0d6ByEKf7JdjyeReExLPNp4BOjd34C8P FeSWYBMW27NXZha+HWLxn3ApRmAHVWsGft33U= Received: by 10.115.110.6 with SMTP id n6mr4994440wam.72.1226386939656; Mon, 10 Nov 2008 23:02:19 -0800 (PST) Received: by 10.115.76.12 with HTTP; Mon, 10 Nov 2008 23:02:19 -0800 (PST) Message-ID: <42e3d810811102302h3a0e38bcuf1195cf0a89c29a7@mail.gmail.com> Date: Tue, 11 Nov 2008 15:02:19 +0800 From: "Archimedes Gaviola" To: ivoras@freebsd.org In-Reply-To: <42e3d810811102032w7850a1c0t386d80ce747f37d3@mail.gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline References: <42e3d810811100033w172e90dbl209ecbab640cc24f@mail.gmail.com> <200811101733.04547.jhb@freebsd.org> <42e3d810811102032w7850a1c0t386d80ce747f37d3@mail.gmail.com> Cc: smp@freebsd.org, freebsd-smp@freebsd.org Subject: Re: CPU affinity with ULE scheduler X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 11 Nov 2008 07:02:20 -0000 On Tue, Nov 11, 2008 at 12:32 PM, Archimedes Gaviola wrote: > On Tue, Nov 11, 2008 at 6:33 AM, John Baldwin wrote: >> On Monday 10 November 2008 03:33:23 am Archimedes Gaviola wrote: >>> To Whom It May Concerned: >>> >>> Can someone explain or share about ULE scheduler (latest version 2 if >>> I'm not mistaken) dealing with CPU affinity? 
Is there any existing >>> benchmarks on this with FreeBSD? Because I am currently using 4BSD >>> scheduler and as what I have observed especially on processing high >>> network load traffic on multiple CPU cores, only one CPU were being >>> stressed with network interrupt while the rests are mostly in idle >>> state. This is an AMD-64 (4x) dual-core IBM system with GigE Broadcom >>> network interface cards (bce0 and bce1). Below is the snapshot of the >>> case. >> >> Interrupts are routed to a single CPU. Since bce0 and bce1 are both on the >> same interrupt (irq 23), the CPU that interrupt is routed to is going to end >> up handling all the interrupts for bce0 and bce1. This not something ULE or >> 4BSD have any control over. >> >> -- >> John Baldwin >> > > Hi John, > > I'm sorry for the wrong snapshot. Here's the right one with my concern. > > PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND > 17 root 1 171 52 0K 16K CPU0 0 54:28 95.17% idle: cpu0 > 15 root 1 171 52 0K 16K CPU2 2 55:55 93.65% idle: cpu2 > 14 root 1 171 52 0K 16K CPU3 3 58:53 93.55% idle: cpu3 > 13 root 1 171 52 0K 16K RUN 4 59:14 82.47% idle: cpu4 > 12 root 1 171 52 0K 16K RUN 5 55:42 82.23% idle: cpu5 > 16 root 1 171 52 0K 16K CPU1 1 58:13 77.78% idle: cpu1 > 11 root 1 171 52 0K 16K CPU6 6 54:08 76.17% idle: cpu6 > 36 root 1 -68 -187 0K 16K WAIT 7 8:50 65.53% > irq23: bce0 bce1 > 10 root 1 171 52 0K 16K CPU7 7 48:19 29.79% idle: cpu7 > 43 root 1 171 52 0K 16K pgzero 2 0:35 1.51% pagezero > 1372 root 10 20 0 16716K 5764K kserel 6 58:42 0.00% kmd > 4488 root 1 96 0 30676K 4236K select 2 1:51 0.00% sshd > 18 root 1 -32 -151 0K 16K WAIT 0 1:14 0.00% swi4: clock s > 20 root 1 -44 -163 0K 16K WAIT 0 0:30 0.00% swi1: net > 218 root 1 96 0 3852K 1376K select 0 0:23 0.00% syslogd > 2171 root 1 96 0 30676K 4224K select 6 0:19 0.00% sshd > > Actually I was doing a network performance testing on this system with > FreeBSD-6.2 RELEASE using its default scheduler 4BSD and then I used a > tool to 
generate big amount of traffic around 600Mbps-700Mbps > traversing the FreeBSD system in bi-direction, meaning both network > interfaces are receiving traffic. What happened was, the CPU (cpu7) > that handles the (irq 23) on both interfaces consumed big amount of > CPU utilization around 65.53% in which it affects other running > applications and services like sshd and httpd. It's no longer > accessible when traffic is bombarded. With the current situation of my > FreeBSD system with only one CPU being stressed, I was thinking of > moving to FreeBSD-7.0 RELEASE with the ULE scheduler because I thought > my concern has something to do with the distributions of load on > multiple CPU cores handled by the scheduler especially at the network > level, processing network load. So, if it is more of interrupt > handling and not on the scheduler, is there a way we can optimize it? > Because if it still routed only to one CPU then for me it's still > inefficient. Who handles interrupt scheduling for bounding CPU in > order to prevent shared IRQ? Is there any improvements with > FreeBSD-7.0 with regards to interrupt handling? > > Thanks, > Archimedes > Hi Ivan, Archimedes Gaviola wrote: > To Whom It May Concerned: > > Can someone explain or share about ULE scheduler (latest version 2 if > I'm not mistaken) dealing with CPU affinity? Is there any existing > benchmarks on this with FreeBSD? Because I am currently using 4BSD Yes but not for network loads. See for example benchmarks in http://people.freebsd.org/~kris/scaling/7.0%20and%20beyond.pdf [Archimedes] Ah okay, so based on my understanding, the ULE scheduler in FreeBSD-7.0 only scales well for scheduling userland applications such as databases and DNS? > scheduler and as what I have observed especially on processing high > network load traffic on multiple CPU cores, only one CPU were being > stressed with network interrupt while the rests are mostly in idle > state.
This is an AMD-64 (4x) dual-core IBM system with GigE Broadcom > network interface cards (bce0 and bce1). Below is the snapshot of the > case. This is unfortunately so and cannot be changed for now - you are not the first with this particular performance problem. [Archimedes] Meaning, the ULE scheduler still needs improvement for processing network load? I have read papers and articles saying that FreeBSD is implementing a parallelized network stack; what is the status of this development? Can it address processing of high network loads? BUT, looking at the data in the snapshot you gave, it's not clear that there is a performance problem in your case - bce is not taking nearly enough CPU time to be the bottleneck. What exactly do you think is wrong in your case? [Archimedes] Oh, I'm sorry, that was not the right one. Here it is below: PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND 17 root 1 171 52 0K 16K CPU0 0 54:28 95.17% idle: cpu0 15 root 1 171 52 0K 16K CPU2 2 55:55 93.65% idle: cpu2 14 root 1 171 52 0K 16K CPU3 3 58:53 93.55% idle: cpu3 13 root 1 171 52 0K 16K RUN 4 59:14 82.47% idle: cpu4 12 root 1 171 52 0K 16K RUN 5 55:42 82.23% idle: cpu5 16 root 1 171 52 0K 16K CPU1 1 58:13 77.78% idle: cpu1 11 root 1 171 52 0K 16K CPU6 6 54:08 76.17% idle: cpu6 36 root 1 -68 -187 0K 16K WAIT 7 8:50 65.53% irq23: bce0 bce1 10 root 1 171 52 0K 16K CPU7 7 48:19 29.79% idle: cpu7 43 root 1 171 52 0K 16K pgzero 2 0:35 1.51% pagezero 1372 root 10 20 0 16716K 5764K kserel 6 58:42 0.00% kmd 4488 root 1 96 0 30676K 4236K select 2 1:51 0.00% sshd 18 root 1 -32 -151 0K 16K WAIT 0 1:14 0.00% swi4: clock s 20 root 1 -44 -163 0K 16K WAIT 0 0:30 0.00% swi1: net 218 root 1 96 0 3852K 1376K select 0 0:23 0.00% syslogd 2171 root 1 96 0 30676K 4224K select 6 0:19 0.00% sshd I was doing network performance testing with a traffic generator tool pushing 600Mbps-700Mbps through my FreeBSD system in both directions.
As you can see, cpu7 is bound to irq23, which is shared by both network interfaces bce0 and bce1. cpu7 takes up 65.53% CPU utilization, which affects some of the applications running on the system like sshd and httpd. These services are no longer accessible under that amount of traffic. Since there are still several idle CPUs, I'm concerned about CPU load distribution, so that not only one CPU is stressed. Thanks, Archimedes From owner-freebsd-smp@FreeBSD.ORG Tue Nov 11 17:06:49 2008 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E49ED1065670 for ; Tue, 11 Nov 2008 17:06:49 +0000 (UTC) (envelope-from freebsd-smp@m.gmane.org) Received: from ciao.gmane.org (main.gmane.org [80.91.229.2]) by mx1.freebsd.org (Postfix) with ESMTP id AAD318FC1E for ; Tue, 11 Nov 2008 17:06:49 +0000 (UTC) (envelope-from freebsd-smp@m.gmane.org) Received: from list by ciao.gmane.org with local (Exim 4.43) id 1KzwhQ-0003vJ-22 for freebsd-smp@freebsd.org; Tue, 11 Nov 2008 17:06:44 +0000 Received: from 88.79.237.12 ([88.79.237.12]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Tue, 11 Nov 2008 17:06:44 +0000 Received: from ivoras by 88.79.237.12 with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Tue, 11 Nov 2008 17:06:44 +0000 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-smp@freebsd.org From: Ivan Voras Date: Tue, 11 Nov 2008 18:06:47 +0100 Lines: 23 Message-ID: References: <42e3d810811100033w172e90dbl209ecbab640cc24f@mail.gmail.com> <200811101733.04547.jhb@freebsd.org> <42e3d810811102032w7850a1c0t386d80ce747f37d3@mail.gmail.com> <42e3d810811102302h3a0e38bcuf1195cf0a89c29a7@mail.gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@ger.gmane.org X-Gmane-NNTP-Posting-Host: 88.79.237.12 User-Agent: Thunderbird 2.0.0.17 (Windows/20080914) In-Reply-To: <42e3d810811102302h3a0e38bcuf1195cf0a89c29a7@mail.gmail.com> Sender: news Subject: Re: CPU affinity with ULE scheduler X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 11 Nov 2008 17:06:50 -0000 Archimedes Gaviola wrote: > Hi
Ivan, > > Archimedes Gaviola wrote: >> To Whom It May Concerned: >> =20 >> Can someone explain or share about ULE scheduler (latest version 2 if >> I'm not mistaken) dealing with CPU affinity? Is there any existing >> benchmarks on this with FreeBSD? Because I am currently using 4BSD > > Yes but not for network loads. See for example benchmarks in > http://people.freebsd.org/~kris/scaling/7.0%20and%20beyond.pdf > > [Archimedes] Ah okay, so based on my understanding with ULE scheduler > in FreeBSD-7.0, it only scale well with userland applications > scheduling such as database and DNS? The problem you are seeing is probably not solvable by a better scheduler. There are other parts of the system that cause performance bottlenecks. I'd recommend you try 7-STABLE, it might help you, but it probably won't. From owner-freebsd-smp@FreeBSD.ORG Wed Nov 12 13:12:14 2008 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1A07410656D3 for ; Wed, 12 Nov 2008 13:12:14 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from server.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id B7A288FC22 for ; Wed, 12 Nov 2008 13:12:13 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from localhost.corp.yahoo.com (john@localhost [IPv6:::1]) (authenticated bits=0) by server.baldwin.cx (8.14.3/8.14.3) with ESMTP id mACDBNKB084446; Wed, 12 Nov 2008 08:12:07 -0500 (EST) (envelope-from jhb@freebsd.org) From: John Baldwin To: "Archimedes Gaviola" Date: Tue, 11 Nov 2008 12:16:37 -0500 User-Agent: KMail/1.9.7 References: <42e3d810811100033w172e90dbl209ecbab640cc24f@mail.gmail.com> <200811101733.04547.jhb@freebsd.org> <42e3d810811102032w7850a1c0t386d80ce747f37d3@mail.gmail.com> In-Reply-To: <42e3d810811102032w7850a1c0t386d80ce747f37d3@mail.gmail.com> MIME-Version: 1.0 Content-Type: text/plain; 
charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200811111216.37462.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (server.baldwin.cx [IPv6:::1]); Wed, 12 Nov 2008 08:12:07 -0500 (EST) X-Virus-Scanned: ClamAV 0.93.1/8620/Wed Nov 12 04:05:38 2008 on server.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-1.9 required=4.2 tests=AWL,BAYES_00, DATE_IN_PAST_12_24,NO_RELAYS autolearn=no version=3.1.3 X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on server.baldwin.cx Cc: freebsd-smp@freebsd.org Subject: Re: CPU affinity with ULE scheduler X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Nov 2008 13:12:14 -0000 On Monday 10 November 2008 11:32:55 pm Archimedes Gaviola wrote: > On Tue, Nov 11, 2008 at 6:33 AM, John Baldwin wrote: > > On Monday 10 November 2008 03:33:23 am Archimedes Gaviola wrote: > >> To Whom It May Concerned: > >> > >> Can someone explain or share about ULE scheduler (latest version 2 if > >> I'm not mistaken) dealing with CPU affinity? Is there any existing > >> benchmarks on this with FreeBSD? Because I am currently using 4BSD > >> scheduler and as what I have observed especially on processing high > >> network load traffic on multiple CPU cores, only one CPU were being > >> stressed with network interrupt while the rests are mostly in idle > >> state. This is an AMD-64 (4x) dual-core IBM system with GigE Broadcom > >> network interface cards (bce0 and bce1). Below is the snapshot of the > >> case. > > > > Interrupts are routed to a single CPU. Since bce0 and bce1 are both on the > > same interrupt (irq 23), the CPU that interrupt is routed to is going to end > > up handling all the interrupts for bce0 and bce1. 
This not something ULE or > > 4BSD have any control over. > > > > -- > > John Baldwin > > > > Hi John, > > I'm sorry for the wrong snapshot. Here's the right one with my concern. > > PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND > 17 root 1 171 52 0K 16K CPU0 0 54:28 95.17% idle: cpu0 > 15 root 1 171 52 0K 16K CPU2 2 55:55 93.65% idle: cpu2 > 14 root 1 171 52 0K 16K CPU3 3 58:53 93.55% idle: cpu3 > 13 root 1 171 52 0K 16K RUN 4 59:14 82.47% idle: cpu4 > 12 root 1 171 52 0K 16K RUN 5 55:42 82.23% idle: cpu5 > 16 root 1 171 52 0K 16K CPU1 1 58:13 77.78% idle: cpu1 > 11 root 1 171 52 0K 16K CPU6 6 54:08 76.17% idle: cpu6 > 36 root 1 -68 -187 0K 16K WAIT 7 8:50 65.53% > irq23: bce0 bce1 > 10 root 1 171 52 0K 16K CPU7 7 48:19 29.79% idle: cpu7 > 43 root 1 171 52 0K 16K pgzero 2 0:35 1.51% pagezero > 1372 root 10 20 0 16716K 5764K kserel 6 58:42 0.00% kmd > 4488 root 1 96 0 30676K 4236K select 2 1:51 0.00% sshd > 18 root 1 -32 -151 0K 16K WAIT 0 1:14 0.00% swi4: clock s > 20 root 1 -44 -163 0K 16K WAIT 0 0:30 0.00% swi1: net > 218 root 1 96 0 3852K 1376K select 0 0:23 0.00% syslogd > 2171 root 1 96 0 30676K 4224K select 6 0:19 0.00% sshd > > Actually I was doing a network performance testing on this system with > FreeBSD-6.2 RELEASE using its default scheduler 4BSD and then I used a > tool to generate big amount of traffic around 600Mbps-700Mbps > traversing the FreeBSD system in bi-direction, meaning both network > interfaces are receiving traffic. What happened was, the CPU (cpu7) > that handles the (irq 23) on both interfaces consumed big amount of > CPU utilization around 65.53% in which it affects other running > applications and services like sshd and httpd. It's no longer > accessible when traffic is bombarded. 
With the current situation of my > FreeBSD system with only one CPU being stressed, I was thinking of > moving to FreeBSD-7.0 RELEASE with the ULE scheduler because I thought > my concern has something to do with the distributions of load on > multiple CPU cores handled by the scheduler especially at the network > level, processing network load. So, if it is more of interrupt > handling and not on the scheduler, is there a way we can optimize it? > Because if it still routed only to one CPU then for me it's still > inefficient. Who handles interrupt scheduling for bounding CPU in > order to prevent shared IRQ? Is there any improvements with > FreeBSD-7.0 with regards to interrupt handling? It depends. In all likelihood, the interrupts from bce0 and bce1 are both hardwired to the same interrupt pin and so they will always share the same ithread when using the legacy INTx interrupts. However, bce(4) parts do support MSI, and if you try a newer OS snap (6.3 or later) these devices should use MSI in which case each NIC would be assigned to a separate CPU. I would suggest trying 7.0 or a 7.1 release candidate and see if it does better. 
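[Editorial note: as a rough sketch of how one could verify and work around the shared INTx line John describes, assuming a release new enough to have cpuset(1) (FreeBSD 7.1 and later); the IRQ and CPU numbers below come from this thread's top output and would differ per machine.]

```shell
# Show per-source interrupt counts; confirms bce0/bce1 share irq23
vmstat -i

# Show which CPUs the irq23 ithread is currently allowed to run on
cpuset -g -x 23

# Pin the irq23 ithread to an otherwise idle CPU, e.g. cpu2,
# instead of wherever the interrupt was routed by default
cpuset -l 2 -x 23
```

This only moves the single shared ithread to a chosen CPU; it does not split the load across CPUs. Getting bce0 and bce1 onto separate CPUs still requires separate interrupt vectors, i.e. the MSI support John mentions.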
-- John Baldwin From owner-freebsd-smp@FreeBSD.ORG Thu Nov 13 00:35:45 2008 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B3C301065678 for ; Thu, 13 Nov 2008 00:35:45 +0000 (UTC) (envelope-from freebsd-smp@m.gmane.org) Received: from ciao.gmane.org (main.gmane.org [80.91.229.2]) by mx1.freebsd.org (Postfix) with ESMTP id 6CA4A8FC18 for ; Thu, 13 Nov 2008 00:35:45 +0000 (UTC) (envelope-from freebsd-smp@m.gmane.org) Received: from list by ciao.gmane.org with local (Exim 4.43) id 1L0QBP-0005ga-38 for freebsd-smp@freebsd.org; Thu, 13 Nov 2008 00:35:39 +0000 Received: from 88.79.237.12 ([88.79.237.12]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Thu, 13 Nov 2008 00:35:39 +0000 Received: from ivoras by 88.79.237.12 with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Thu, 13 Nov 2008 00:35:39 +0000 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-smp@freebsd.org From: Ivan Voras Date: Thu, 13 Nov 2008 01:35:28 +0100 Lines: 24 Message-ID: Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@ger.gmane.org X-Gmane-NNTP-Posting-Host: 88.79.237.12 User-Agent: Thunderbird 2.0.0.17 (Windows/20080914) Sender: news Subject: NUMA? X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Nov 2008 00:35:45 -0000 Hi, As even Intel's new CPUs have integrated memory controllers and thus become NUMA, I'm interested in what is, in theory (I'm not proposing to do it, I'm just curious), necessary to change in an OS to support NUMA. 
My guess is: 1) node topology detection - something similar to what ULE does but also recording which memory ranges are "close" to which CPU and the "distance" between nodes/CPUs 2) on new image load (exec), pick a node for it, among "least used" nodes and record the choice per-proc; on fork, keep the new process on the same node 3) schedule threads on a CPU from the proc's node if at all possible (e.g, when a 6-core CPU is still 1 node), then on a "near" node from a list of distances sorted in order of cost 4) allocate new pages for a proc from its node's memory range(s) if at all possible. Is this all? On the other hand, did someone do a study of performance increase for todays "consumer" NUMA systems (e.g. 2-4 sockets/nodes x86/x64 systems) - is it worth it? From owner-freebsd-smp@FreeBSD.ORG Thu Nov 13 01:19:03 2008 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 326141065672; Thu, 13 Nov 2008 01:19:03 +0000 (UTC) (envelope-from prvs=julian=19649c2e1@elischer.org) Received: from smtp-outbound.ironport.com (smtp-outbound.ironport.com [63.251.108.112]) by mx1.freebsd.org (Postfix) with ESMTP id D087A8FC19; Thu, 13 Nov 2008 01:19:02 +0000 (UTC) (envelope-from prvs=julian=19649c2e1@elischer.org) Received: from jelischer-laptop.sfo.ironport.com (HELO julian-mac.elischer.org) ([10.251.22.38]) by smtp-outbound.ironport.com with ESMTP; 12 Nov 2008 16:50:08 -0800 Message-ID: <491B79BE.50800@elischer.org> Date: Wed, 12 Nov 2008 16:50:06 -0800 From: Julian Elischer User-Agent: Thunderbird 2.0.0.17 (Macintosh/20080914) MIME-Version: 1.0 To: Ivan Voras References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-smp@freebsd.org Subject: Re: NUMA? 
X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Nov 2008 01:19:03 -0000 Ivan Voras wrote: > Hi, I did the AMD course a few weeks ago so I'm also very interested in this.. > > As even Intel's new CPUs have integrated memory controllers and thus > become NUMA, I'm interested in what is, in theory (I'm not proposing to > do it, I'm just curious), necessary to change in an OS to support NUMA. > My guess is: > > 1) node topology detection - something similar to what ULE does but also > recording which memory ranges are "close" to which CPU and the > "distance" between nodes/CPUs at a minimum, this is needed before anything else can really work. > 2) on new image load (exec), pick a node for it, among "least used" > nodes and record the choice per-proc; on fork, keep the new process on > the same node In some cases it may be worth having multiple copies of the read-only text segments. For example, it may eventually be worth having a /bin/sh text segment in each CPU's memory space. > 3) schedule threads on a CPU from the proc's node if at all possible > (e.g, when a 6-core CPU is still 1 node), then on a "near" node from a > list of distances sorted in order of cost this is where it really starts getting hairy.. when do you migrate a process? and what if there are as many threads runnable as processors? > 4) allocate new pages for a proc from its node's memory range(s) if at > all possible. > > Is this all? There are other interesting effects too.. assigning network interrupts to processors that have good access to the hardware AND the destination if you can.. > > On the other hand, did someone do a study of performance increase for > todays "consumer" NUMA systems (e.g. 2-4 sockets/nodes x86/x64 systems) > - is it worth it? caches hide a multitude of sins..
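[Editorial note: steps 2) and 3) of Ivan's list can be sketched in miniature. This is a toy illustration, not FreeBSD code; the distance table is made up in the style of an ACPI SLIT matrix, where a lower value means "closer".]

```python
# Toy model of NUMA-aware placement (hypothetical numbers, not kernel code).
# DISTANCE[i][j] is a SLIT-style cost for node i accessing node j's memory;
# DISTANCE[i][i] is the (minimal) local cost.
DISTANCE = [
    [10, 20, 20, 30],
    [20, 10, 30, 20],
    [20, 30, 10, 20],
    [30, 20, 20, 10],
]

def pick_node(load_per_node):
    """Step 2: at exec time, choose the least-loaded node for a new process."""
    return min(range(len(load_per_node)), key=lambda n: load_per_node[n])

def candidate_nodes(home):
    """Step 3: prefer the proc's home node, then nodes by increasing distance."""
    # The home node sorts first because its self-distance is minimal.
    return sorted(range(len(DISTANCE)), key=lambda n: DISTANCE[home][n])

load = [3, 1, 2, 5]            # runnable threads per node
home = pick_node(load)          # node 1 is least loaded
print(home, candidate_nodes(home))   # prints: 1 [1, 0, 3, 2]
```

The hairy parts Julian points out (when to migrate, oversubscription) are exactly what this toy ignores: it computes a static preference order and says nothing about moving a process once its pages are already allocated on a node.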
> > _______________________________________________ > freebsd-smp@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-smp > To unsubscribe, send any mail to "freebsd-smp-unsubscribe@freebsd.org" From owner-freebsd-smp@FreeBSD.ORG Thu Nov 13 01:32:20 2008 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 367A1106564A for ; Thu, 13 Nov 2008 01:32:20 +0000 (UTC) (envelope-from marc@freshaire.wiz.com) Received: from freshaire.wiz.com (freshaire.wiz.com [66.143.183.129]) by mx1.freebsd.org (Postfix) with ESMTP id EE6E68FC14 for ; Thu, 13 Nov 2008 01:32:19 +0000 (UTC) (envelope-from marc@freshaire.wiz.com) Received: from freshaire.wiz.com (localhost.wiz.com [127.0.0.1]) by freshaire.wiz.com (8.14.2/8.14.2) with ESMTP id mAD12LDt020333 for ; Wed, 12 Nov 2008 19:02:21 -0600 (CST) (envelope-from marc@freshaire.wiz.com) Received: (from marc@localhost) by freshaire.wiz.com (8.14.2/8.14.2/Submit) id mAD12LwF020332 for freebsd-smp@freebsd.org; Wed, 12 Nov 2008 19:02:21 -0600 (CST) (envelope-from marc) Date: Wed, 12 Nov 2008 19:02:21 -0600 From: Marc Wiz To: freebsd-smp@freebsd.org Message-ID: <20081113010221.GB20056@wiz.com> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.2i Subject: Re: NUMA? X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Nov 2008 01:32:20 -0000 On Thu, Nov 13, 2008 at 01:35:28AM +0100, Ivan Voras wrote: > Hi, > > As even Intel's new CPUs have integrated memory controllers and thus > become NUMA, I'm interested in what is, in theory (I'm not proposing to > do it, I'm just curious), necessary to change in an OS to support NUMA. 
> My guess is: > > 1) node topology detection - something similar to what ULE does but also > recording which memory ranges are "close" to which CPU and the > "distance" between nodes/CPUs > 2) on new image load (exec), pick a node for it, among "least used" > nodes and record the choice per-proc; on fork, keep the new process on > the same node > 3) schedule threads on a CPU from the proc's node if at all possible > (e.g, when a 6-core CPU is still 1 node), then on a "near" node from a > list of distances sorted in order of cost > 4) allocate new pages for a proc from its node's memory range(s) if at > all possible. One good source of information on this topic is IBM's AIX on the Power 4 - 6 processors. There is the concept of distant vs. close memory and processors as well as what is referred to as memory affinity. Marc -- Marc Wiz marc@wiz.com Yes, that really is my last name. From owner-freebsd-smp@FreeBSD.ORG Thu Nov 13 11:55:02 2008 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A00051065674 for ; Thu, 13 Nov 2008 11:55:02 +0000 (UTC) (envelope-from archimedes.gaviola@gmail.com) Received: from wa-out-1112.google.com (wa-out-1112.google.com [209.85.146.182]) by mx1.freebsd.org (Postfix) with ESMTP id 6C32F8FC0C for ; Thu, 13 Nov 2008 11:55:02 +0000 (UTC) (envelope-from archimedes.gaviola@gmail.com) Received: by wa-out-1112.google.com with SMTP id m34so449748wag.27 for ; Thu, 13 Nov 2008 03:55:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:to :subject:cc:in-reply-to:mime-version:content-type :content-transfer-encoding:content-disposition:references; bh=fi+ChQSbCw/rFAAVJRb/qwFHbduJDiGQ1nO0cDY4qaw=; b=i2Idkj9Ya7STjLHtWnXqpYhWzDASHk1TITgiTCyFg1KyZa3tT9WBBE5zkVjoHWC1Ni evtpTvY035HPLTnfKagWLVMiyuKCGnzyK42qaTVnIjflfI5yzxKJyLJOGCpCvmlckJ7i 
IRluOAdcWMmbRX1FPdaLI9C67KfCtifvBu5dE= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:to:subject:cc:in-reply-to:mime-version :content-type:content-transfer-encoding:content-disposition :references; b=UUYZn9g2H3+cfIM8zOe7urtVncvnCmFu/cDkJOH5yKUrqt886+Dm15rjFSMYnue7ya LFiTv7WRpTYtxOSlTssCiiW9+mO3PdJ+1y8a9YV+Yggn/RzZpaNTS0SocBMXpPVmBFa1 ZeGcmglXaUxoQrXIFMzme4630RiArG4h1d838= Received: by 10.114.200.2 with SMTP id x2mr6790095waf.83.1226577301822; Thu, 13 Nov 2008 03:55:01 -0800 (PST) Received: by 10.115.76.12 with HTTP; Thu, 13 Nov 2008 03:55:01 -0800 (PST) Message-ID: <42e3d810811130355x3857bceap447e134b18eee04b@mail.gmail.com> Date: Thu, 13 Nov 2008 19:55:01 +0800 From: "Archimedes Gaviola" To: "John Baldwin" In-Reply-To: <200811111216.37462.jhb@freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline References: <42e3d810811100033w172e90dbl209ecbab640cc24f@mail.gmail.com> <200811101733.04547.jhb@freebsd.org> <42e3d810811102032w7850a1c0t386d80ce747f37d3@mail.gmail.com> <200811111216.37462.jhb@freebsd.org> Cc: freebsd-smp@freebsd.org Subject: Re: CPU affinity with ULE scheduler X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Nov 2008 11:55:02 -0000 On Wed, Nov 12, 2008 at 1:16 AM, John Baldwin wrote: > On Monday 10 November 2008 11:32:55 pm Archimedes Gaviola wrote: >> On Tue, Nov 11, 2008 at 6:33 AM, John Baldwin wrote: >> > On Monday 10 November 2008 03:33:23 am Archimedes Gaviola wrote: >> >> To Whom It May Concerned: >> >> >> >> Can someone explain or share about ULE scheduler (latest version 2 if >> >> I'm not mistaken) dealing with CPU affinity? Is there any existing >> >> benchmarks on this with FreeBSD? 
Because I am currently using 4BSD >> >> scheduler and as what I have observed especially on processing high >> >> network load traffic on multiple CPU cores, only one CPU were being >> >> stressed with network interrupt while the rests are mostly in idle >> >> state. This is an AMD-64 (4x) dual-core IBM system with GigE Broadcom >> >> network interface cards (bce0 and bce1). Below is the snapshot of the >> >> case. >> > >> > Interrupts are routed to a single CPU. Since bce0 and bce1 are both on > the >> > same interrupt (irq 23), the CPU that interrupt is routed to is going to > end >> > up handling all the interrupts for bce0 and bce1. This not something ULE > or >> > 4BSD have any control over. >> > >> > -- >> > John Baldwin >> > >> >> Hi John, >> >> I'm sorry for the wrong snapshot. Here's the right one with my concern. >> >> PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND >> 17 root 1 171 52 0K 16K CPU0 0 54:28 95.17% idle: cpu0 >> 15 root 1 171 52 0K 16K CPU2 2 55:55 93.65% idle: cpu2 >> 14 root 1 171 52 0K 16K CPU3 3 58:53 93.55% idle: cpu3 >> 13 root 1 171 52 0K 16K RUN 4 59:14 82.47% idle: cpu4 >> 12 root 1 171 52 0K 16K RUN 5 55:42 82.23% idle: cpu5 >> 16 root 1 171 52 0K 16K CPU1 1 58:13 77.78% idle: cpu1 >> 11 root 1 171 52 0K 16K CPU6 6 54:08 76.17% idle: cpu6 >> 36 root 1 -68 -187 0K 16K WAIT 7 8:50 65.53% >> irq23: bce0 bce1 >> 10 root 1 171 52 0K 16K CPU7 7 48:19 29.79% idle: cpu7 >> 43 root 1 171 52 0K 16K pgzero 2 0:35 1.51% pagezero >> 1372 root 10 20 0 16716K 5764K kserel 6 58:42 0.00% kmd >> 4488 root 1 96 0 30676K 4236K select 2 1:51 0.00% sshd >> 18 root 1 -32 -151 0K 16K WAIT 0 1:14 0.00% swi4: > clock s >> 20 root 1 -44 -163 0K 16K WAIT 0 0:30 0.00% swi1: net >> 218 root 1 96 0 3852K 1376K select 0 0:23 0.00% syslogd >> 2171 root 1 96 0 30676K 4224K select 6 0:19 0.00% sshd >> >> Actually I was doing a network performance testing on this system with >> FreeBSD-6.2 RELEASE using its default scheduler 4BSD and then I used a >> 
tool to generate big amount of traffic around 600Mbps-700Mbps
>> traversing the FreeBSD system in bi-direction, meaning both network
>> interfaces are receiving traffic. What happened was, the CPU (cpu7)
>> that handles the (irq 23) on both interfaces consumed big amount of
>> CPU utilization around 65.53% in which it affects other running
>> applications and services like sshd and httpd. It's no longer
>> accessible when traffic is bombarded. With the current situation of my
>> FreeBSD system with only one CPU being stressed, I was thinking of
>> moving to FreeBSD-7.0 RELEASE with the ULE scheduler because I thought
>> my concern has something to do with the distributions of load on
>> multiple CPU cores handled by the scheduler especially at the network
>> level, processing network load. So, if it is more of interrupt
>> handling and not on the scheduler, is there a way we can optimize it?
>> Because if it still routed only to one CPU then for me it's still
>> inefficient. Who handles interrupt scheduling for bounding CPU in
>> order to prevent shared IRQ? Is there any improvements with
>> FreeBSD-7.0 with regards to interrupt handling?
>
> It depends. In all likelihood, the interrupts from bce0 and bce1 are both
> hardwired to the same interrupt pin and so they will always share the same
> ithread when using the legacy INTx interrupts. However, bce(4) parts do
> support MSI, and if you try a newer OS snap (6.3 or later) these devices
> should use MSI in which case each NIC would be assigned to a separate CPU. I
> would suggest trying 7.0 or a 7.1 release candidate and see if it does
> better.
>
> --
> John Baldwin

Hi John,

I tried the 7.0 release, and each network interface is now allocated separately to a different CPU. Here, MSI is already working.
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
12 root 1 171 ki31 0K 16K CPU6 6 123:55 100.00% idle: cpu6
15 root 1 171 ki31 0K 16K CPU3 3 123:54 100.00% idle: cpu3
14 root 1 171 ki31 0K 16K CPU4 4 123:26 100.00% idle: cpu4
16 root 1 171 ki31 0K 16K CPU2 2 123:15 100.00% idle: cpu2
17 root 1 171 ki31 0K 16K CPU1 1 123:15 100.00% idle: cpu1
37 root 1 -68 - 0K 16K CPU7 7 9:09 100.00% irq256: bce0
13 root 1 171 ki31 0K 16K CPU5 5 123:49 99.07% idle: cpu5
40 root 1 -68 - 0K 16K WAIT 0 4:40 51.17% irq257: bce1
18 root 1 171 ki31 0K 16K RUN 0 117:48 49.37% idle: cpu0
11 root 1 171 ki31 0K 16K RUN 7 115:25 0.00% idle: cpu7
19 root 1 -32 - 0K 16K WAIT 0 0:39 0.00% swi4: clock s
14367 root 1 44 0 5176K 3104K select 2 0:01 0.00% dhcpd
22 root 1 -16 - 0K 16K - 3 0:01 0.00% yarrow
25 root 1 -24 - 0K 16K WAIT 0 0:00 0.00% swi6: Giant t
11658 root 1 44 0 32936K 4540K select 1 0:00 0.00% sshd
14224 root 1 44 0 32936K 4540K select 5 0:00 0.00% sshd
41 root 1 -60 - 0K 16K WAIT 0 0:00 0.00% irq1: atkbd0
4 root 1 -8 - 0K 16K - 2 0:00 0.00% g_down

The bce0 interface interrupt (irq256) is the one getting stressed: it already takes 100% of CPU7, while irq257 (bce1) takes around 51.17% of CPU0. Any more recommendations? Is there anything we can do to optimize with MSI?
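To see how the two MSI vectors are actually being serviced, vmstat(8) helps. Re-binding an interrupt to a particular CPU with cpuset(1) is also sketched below, but cpuset only appeared after 7.0 (7.1 and later), so treat that part as an assumption about newer releases rather than something available on this system:

```
# Per-interrupt counters and rates; irq256 (bce0) and irq257 (bce1)
# should show up as separate MSI vectors:
vmstat -i

# On releases that ship cpuset(1), an interrupt can be pinned to a
# chosen CPU, e.g. moving irq257 off CPU 0 (the CPU number here is a
# hypothetical choice):
cpuset -l 2 -x 257
```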
Thanks, Archimedes From owner-freebsd-smp@FreeBSD.ORG Thu Nov 13 19:46:33 2008 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3F9031065678 for ; Thu, 13 Nov 2008 19:46:33 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from server.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id 7C3498FC12 for ; Thu, 13 Nov 2008 19:46:32 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from localhost.corp.yahoo.com (john@localhost [IPv6:::1]) (authenticated bits=0) by server.baldwin.cx (8.14.3/8.14.3) with ESMTP id mADJkO2m096236; Thu, 13 Nov 2008 14:46:25 -0500 (EST) (envelope-from jhb@freebsd.org) From: John Baldwin To: "Archimedes Gaviola" Date: Thu, 13 Nov 2008 11:28:54 -0500 User-Agent: KMail/1.9.7 References: <42e3d810811100033w172e90dbl209ecbab640cc24f@mail.gmail.com> <200811111216.37462.jhb@freebsd.org> <42e3d810811130355x3857bceap447e134b18eee04b@mail.gmail.com> In-Reply-To: <42e3d810811130355x3857bceap447e134b18eee04b@mail.gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200811131128.55220.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (server.baldwin.cx [IPv6:::1]); Thu, 13 Nov 2008 14:46:25 -0500 (EST) X-Virus-Scanned: ClamAV 0.93.1/8628/Thu Nov 13 10:57:02 2008 on server.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-2.3 required=4.2 tests=AWL,BAYES_00, DATE_IN_PAST_03_06,NO_RELAYS autolearn=ham version=3.1.3 X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on server.baldwin.cx Cc: freebsd-smp@freebsd.org Subject: Re: CPU affinity with ULE scheduler X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: 
, List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Nov 2008 19:46:33 -0000 On Thursday 13 November 2008 06:55:01 am Archimedes Gaviola wrote: > On Wed, Nov 12, 2008 at 1:16 AM, John Baldwin wrote: > > On Monday 10 November 2008 11:32:55 pm Archimedes Gaviola wrote: > >> On Tue, Nov 11, 2008 at 6:33 AM, John Baldwin wrote: > >> > On Monday 10 November 2008 03:33:23 am Archimedes Gaviola wrote: > >> >> To Whom It May Concerned: > >> >> > >> >> Can someone explain or share about ULE scheduler (latest version 2 if > >> >> I'm not mistaken) dealing with CPU affinity? Is there any existing > >> >> benchmarks on this with FreeBSD? Because I am currently using 4BSD > >> >> scheduler and as what I have observed especially on processing high > >> >> network load traffic on multiple CPU cores, only one CPU were being > >> >> stressed with network interrupt while the rests are mostly in idle > >> >> state. This is an AMD-64 (4x) dual-core IBM system with GigE Broadcom > >> >> network interface cards (bce0 and bce1). Below is the snapshot of the > >> >> case. > >> > > >> > Interrupts are routed to a single CPU. Since bce0 and bce1 are both on > > the > >> > same interrupt (irq 23), the CPU that interrupt is routed to is going to > > end > >> > up handling all the interrupts for bce0 and bce1. This not something ULE > > or > >> > 4BSD have any control over. > >> > > >> > -- > >> > John Baldwin > >> > > >> > >> Hi John, > >> > >> I'm sorry for the wrong snapshot. Here's the right one with my concern. 
> >> > >> PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND > >> 17 root 1 171 52 0K 16K CPU0 0 54:28 95.17% idle: cpu0 > >> 15 root 1 171 52 0K 16K CPU2 2 55:55 93.65% idle: cpu2 > >> 14 root 1 171 52 0K 16K CPU3 3 58:53 93.55% idle: cpu3 > >> 13 root 1 171 52 0K 16K RUN 4 59:14 82.47% idle: cpu4 > >> 12 root 1 171 52 0K 16K RUN 5 55:42 82.23% idle: cpu5 > >> 16 root 1 171 52 0K 16K CPU1 1 58:13 77.78% idle: cpu1 > >> 11 root 1 171 52 0K 16K CPU6 6 54:08 76.17% idle: cpu6 > >> 36 root 1 -68 -187 0K 16K WAIT 7 8:50 65.53% > >> irq23: bce0 bce1 > >> 10 root 1 171 52 0K 16K CPU7 7 48:19 29.79% idle: cpu7 > >> 43 root 1 171 52 0K 16K pgzero 2 0:35 1.51% pagezero > >> 1372 root 10 20 0 16716K 5764K kserel 6 58:42 0.00% kmd > >> 4488 root 1 96 0 30676K 4236K select 2 1:51 0.00% sshd > >> 18 root 1 -32 -151 0K 16K WAIT 0 1:14 0.00% swi4: > > clock s > >> 20 root 1 -44 -163 0K 16K WAIT 0 0:30 0.00% swi1: net > >> 218 root 1 96 0 3852K 1376K select 0 0:23 0.00% syslogd > >> 2171 root 1 96 0 30676K 4224K select 6 0:19 0.00% sshd > >> > >> Actually I was doing a network performance testing on this system with > >> FreeBSD-6.2 RELEASE using its default scheduler 4BSD and then I used a > >> tool to generate big amount of traffic around 600Mbps-700Mbps > >> traversing the FreeBSD system in bi-direction, meaning both network > >> interfaces are receiving traffic. What happened was, the CPU (cpu7) > >> that handles the (irq 23) on both interfaces consumed big amount of > >> CPU utilization around 65.53% in which it affects other running > >> applications and services like sshd and httpd. It's no longer > >> accessible when traffic is bombarded. 
With the current situation of my > >> FreeBSD system with only one CPU being stressed, I was thinking of > >> moving to FreeBSD-7.0 RELEASE with the ULE scheduler because I thought > >> my concern has something to do with the distributions of load on > >> multiple CPU cores handled by the scheduler especially at the network > >> level, processing network load. So, if it is more of interrupt > >> handling and not on the scheduler, is there a way we can optimize it? > >> Because if it still routed only to one CPU then for me it's still > >> inefficient. Who handles interrupt scheduling for bounding CPU in > >> order to prevent shared IRQ? Is there any improvements with > >> FreeBSD-7.0 with regards to interrupt handling? > > > > It depends. In all likelihood, the interrupts from bce0 and bce1 are both > > hardwired to the same interrupt pin and so they will always share the same > > ithread when using the legacy INTx interrupts. However, bce(4) parts do > > support MSI, and if you try a newer OS snap (6.3 or later) these devices > > should use MSI in which case each NIC would be assigned to a separate CPU. I > > would suggest trying 7.0 or a 7.1 release candidate and see if it does > > better. > > > > -- > > John Baldwin > > > > Hi John, > > I try 7.0 release and each network interface were already allocated > separately on different CPU. Here, MSI is already working. 
> > PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND > 12 root 1 171 ki31 0K 16K CPU6 6 123:55 100.00% idle: cpu6 > 15 root 1 171 ki31 0K 16K CPU3 3 123:54 100.00% idle: cpu3 > 14 root 1 171 ki31 0K 16K CPU4 4 123:26 100.00% idle: cpu4 > 16 root 1 171 ki31 0K 16K CPU2 2 123:15 100.00% idle: cpu2 > 17 root 1 171 ki31 0K 16K CPU1 1 123:15 100.00% idle: cpu1 > 37 root 1 -68 - 0K 16K CPU7 7 9:09 100.00% irq256: bce0 > 13 root 1 171 ki31 0K 16K CPU5 5 123:49 99.07% idle: cpu5 > 40 root 1 -68 - 0K 16K WAIT 0 4:40 51.17% irq257: bce1 > 18 root 1 171 ki31 0K 16K RUN 0 117:48 49.37% idle: cpu0 > 11 root 1 171 ki31 0K 16K RUN 7 115:25 0.00% idle: cpu7 > 19 root 1 -32 - 0K 16K WAIT 0 0:39 0.00% swi4: clock s > 14367 root 1 44 0 5176K 3104K select 2 0:01 0.00% dhcpd > 22 root 1 -16 - 0K 16K - 3 0:01 0.00% yarrow > 25 root 1 -24 - 0K 16K WAIT 0 0:00 0.00% swi6: Giant t > 11658 root 1 44 0 32936K 4540K select 1 0:00 0.00% sshd > 14224 root 1 44 0 32936K 4540K select 5 0:00 0.00% sshd > 41 root 1 -60 - 0K 16K WAIT 0 0:00 0.00% irq1: atkbd0 > 4 root 1 -8 - 0K 16K - 2 0:00 0.00% g_down > > The bce0 interface interrupt (irq256) gets stressed out which already > have 100% of CPU7 while CPU0 is around 51.17%. Any more > recommendations? Is there anything we can do about optimization with > MSI? Well, on 7.x you can try turning net.isr.direct off (sysctl). However, it seems you are hammering your bce0 interface. You might want to try using polling on bce0 and seeing if it keeps up with the traffic better. 
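The two knobs John mentions can be tried roughly as follows; this is a sketch under the assumption that this 7.x build and bce(4) support DEVICE_POLLING (check polling(4) and the driver documentation before relying on it):

```
# Kernel config (requires a rebuild) -- polling support:
#   options DEVICE_POLLING
#   options HZ=1000        # polling generally wants a faster clock

# At runtime, as root: process packets in netisr threads rather than
# directly in the interrupt thread.
sysctl net.isr.direct=0

# Switch bce0 from per-interrupt servicing to polling.
ifconfig bce0 polling
```

Reverting is the mirror image: `sysctl net.isr.direct=1` and `ifconfig bce0 -polling`.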
-- John Baldwin From owner-freebsd-smp@FreeBSD.ORG Thu Nov 13 21:31:11 2008 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EECAA1065680 for ; Thu, 13 Nov 2008 21:31:11 +0000 (UTC) (envelope-from freebsd-smp@m.gmane.org) Received: from ciao.gmane.org (main.gmane.org [80.91.229.2]) by mx1.freebsd.org (Postfix) with ESMTP id 9E6BE8FC14 for ; Thu, 13 Nov 2008 21:31:11 +0000 (UTC) (envelope-from freebsd-smp@m.gmane.org) Received: from list by ciao.gmane.org with local (Exim 4.43) id 1L0jmM-0008PY-2c for freebsd-smp@freebsd.org; Thu, 13 Nov 2008 21:31:06 +0000 Received: from 93-138-121-139.adsl.net.t-com.hr ([93.138.121.139]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Thu, 13 Nov 2008 21:31:06 +0000 Received: from ivoras by 93-138-121-139.adsl.net.t-com.hr with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Thu, 13 Nov 2008 21:31:06 +0000 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-smp@freebsd.org From: Ivan Voras Date: Thu, 13 Nov 2008 22:30:51 +0100 Lines: 36 Message-ID: References: <491B79BE.50800@elischer.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enig04B1F0CE10A7FC26DA178441" X-Complaints-To: usenet@ger.gmane.org X-Gmane-NNTP-Posting-Host: 93-138-121-139.adsl.net.t-com.hr User-Agent: Thunderbird 2.0.0.17 (Windows/20080914) In-Reply-To: <491B79BE.50800@elischer.org> X-Enigmail-Version: 0.95.7 Sender: news Subject: Re: NUMA? 
X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Nov 2008 21:31:12 -0000

Julian Elischer wrote:
> There are other interesting effects too..
>
> assigning network interrupts to processors that have good access to the
> hardware AND the destination if you can..

UMA also seems to be sensitive to topology. While at that, how do you (if at all) deal with kernel memory allocations with respect to topology? Things that have their own thread or process are easy, but AFAIK there is a lot of "thread-agnostic" code?

From owner-freebsd-smp@FreeBSD.ORG Fri Nov 14 09:10:44 2008 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 202091065670 for ; Fri, 14 Nov 2008 09:10:44 +0000 (UTC) (envelope-from xiazhongqi@huawei.com) Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [119.145.14.65]) by mx1.freebsd.org (Postfix) with ESMTP id CF1EC8FC08 for ; Fri, 14 Nov 2008 09:10:43 +0000 (UTC) (envelope-from xiazhongqi@huawei.com) Received: from huawei.com (szxga02-in [172.24.2.6]) by szxga02-in.huawei.com
(iPlanet Messaging Server 5.2 HotFix 2.14 (built Aug 8 2006)) with ESMTP id <0KAB00KJ3FGLG6@szxga02-in.huawei.com> for freebsd-smp@freebsd.org; Fri, 14 Nov 2008 16:55:33 +0800 (CST) Received: from huawei.com ([172.24.1.18]) by szxga02-in.huawei.com (iPlanet Messaging Server 5.2 HotFix 2.14 (built Aug 8 2006)) with ESMTP id <0KAB00CFKFGLZM@szxga02-in.huawei.com> for freebsd-smp@freebsd.org; Fri, 14 Nov 2008 16:55:33 +0800 (CST) Received: from x49105 ([10.111.9.47]) by szxml03-in.huawei.com (iPlanet Messaging Server 5.2 HotFix 2.14 (built Aug 8 2006)) with ESMTPA id <0KAB003MJFGKH8@szxml03-in.huawei.com> for freebsd-smp@freebsd.org; Fri, 14 Nov 2008 16:55:33 +0800 (CST) Date: Fri, 14 Nov 2008 16:55:32 +0800 From: Sam Xia In-reply-to: <20081113120028.C8E0810656F3@hub.freebsd.org> To: freebsd-smp@freebsd.org Message-id: <000001c94636$c0d4ce40$2f096f0a@china.huawei.com> MIME-version: 1.0 X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2900.3350 X-Mailer: Microsoft Office Outlook 11 Content-type: text/plain; charset=us-ascii Content-transfer-encoding: 7BIT Thread-index: AclFh79ZjsUFuHaAR0qdLsy4Dk/gpAArooAg Subject: inquiry X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Nov 2008 09:10:44 -0000

Dear all,

I am a newcomer to the FreeBSD kernel and am reading its code. Who can help me explain the purpose/usage/action of the routine "thread_single()" in kern_thread.c of FreeBSD 7.0? Thank you all for reading my email.
Best Regards, Sam Xia From owner-freebsd-smp@FreeBSD.ORG Fri Nov 14 10:10:29 2008 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AC0841065687 for ; Fri, 14 Nov 2008 10:10:29 +0000 (UTC) (envelope-from freebsd-smp@m.gmane.org) Received: from ciao.gmane.org (main.gmane.org [80.91.229.2]) by mx1.freebsd.org (Postfix) with ESMTP id 633258FC14 for ; Fri, 14 Nov 2008 10:10:29 +0000 (UTC) (envelope-from freebsd-smp@m.gmane.org) Received: from list by ciao.gmane.org with local (Exim 4.43) id 1L0vdE-0007do-HZ for freebsd-smp@freebsd.org; Fri, 14 Nov 2008 10:10:28 +0000 Received: from lara.cc.fer.hr ([161.53.72.113]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Fri, 14 Nov 2008 10:10:28 +0000 Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Fri, 14 Nov 2008 10:10:28 +0000 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-smp@freebsd.org From: Ivan Voras Date: Fri, 14 Nov 2008 11:10:52 +0100 Lines: 32 Message-ID: References: <20081113120028.C8E0810656F3@hub.freebsd.org> <000001c94636$c0d4ce40$2f096f0a@china.huawei.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enig01C29AF07197E75281908D76" X-Complaints-To: usenet@ger.gmane.org X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr User-Agent: Thunderbird 2.0.0.17 (X11/20080925) In-Reply-To: <000001c94636$c0d4ce40$2f096f0a@china.huawei.com> X-Enigmail-Version: 0.95.0 Sender: news Subject: Re: inquiry X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Nov 2008 10:10:29 -0000 This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enig01C29AF07197E75281908D76 
Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable

Sam Xia wrote:
> Dear all,
>
> I am a newcomer to the FreeBSD kernel and am reading its code.
> Who can help me explain the purpose/usage/action of the routine
> "thread_single()" in kern_thread.c of FreeBSD 7.0?

Have you read the comment describing the function (it's there immediately before the function)?

From owner-freebsd-smp@FreeBSD.ORG Fri Nov 14 18:41:46 2008 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 870591065679; Fri, 14 Nov 2008 18:41:46 +0000 (UTC) (envelope-from prvs=julian=1973cfe30@elischer.org) Received: from smtp-outbound.ironport.com (smtp-outbound.ironport.com [63.251.108.112]) by mx1.freebsd.org (Postfix) with ESMTP id 7699F8FC12; Fri, 14 Nov 2008 18:41:46 +0000 (UTC) (envelope-from prvs=julian=1973cfe30@elischer.org) Received: from jelischer-laptop.sfo.ironport.com (HELO julian-mac.elischer.org) ([10.251.22.38]) by smtp-outbound.ironport.com with ESMTP; 14 Nov 2008 10:12:55 -0800 Message-ID: <491DBFA6.70705@elischer.org> Date: Fri, 14 Nov 2008 10:12:54 -0800 From: Julian Elischer User-Agent: Thunderbird 2.0.0.17 (Macintosh/20080914) MIME-Version: 1.0 To: Ivan Voras References: <20081113120028.C8E0810656F3@hub.freebsd.org> <000001c94636$c0d4ce40$2f096f0a@china.huawei.com> In-Reply-To: Content-Type: text/plain; charset=UTF-8;
format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-smp@freebsd.org Subject: Re: inquiry X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Nov 2008 18:41:46 -0000

Ivan Voras wrote:
> Sam Xia wrote:
>> Dear all,
>>
>> I am a newcomer to the FreeBSD kernel and am reading its code.
>> Who can help me explain the purpose/usage/action of the routine
>> "thread_single()" in kern_thread.c of FreeBSD 7.0?
>
> Have you read the comment describing the function (it's there
> immediately before the function)?

I wrote that a long time ago, and things have changed a lot since then, but..

There are times, in a threaded process, when a thread making some change to the state of the process must ensure that no other threads are running. There are several variants of this.

An example: your thread is calling exit (or exec) and all the other threads must stop. Now, they can't just be killed (at least those in the kernel can't), as they may hold resources in the kernel that need to be released, so they are asked to commit suicide after releasing their resources. Your thread is allowed to proceed when there are no other threads alive (in your process).

Another: your thread is going to do some other action that requires that no memory changes happen in user space, or that resources stay stable. In this case it will allow you to continue when all other threads have suspended at the user boundary (where they are guaranteed to not hold resources).
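The "suspend everyone at a boundary" variant Julian describes can be mimicked with a user-space toy. The sketch below is not the kernel's thread_single() API — the class name and structure are invented for illustration — but it shows the same dance: workers only honor a suspension request at points where they hold no resources, and the requester proceeds once everyone has parked.

```python
import threading

class SingleThreader:
    """Toy analogue of the kernel's single-threading dance: one thread
    asks all others to park at a safe boundary, then runs alone."""

    def __init__(self, nworkers):
        self.lock = threading.Lock()
        self.parked_cv = threading.Condition(self.lock)
        self.resume_cv = threading.Condition(self.lock)
        self.suspend_requested = False
        self.parked = 0
        self.nworkers = nworkers

    def boundary(self):
        # Workers call this only at points where they hold no resources,
        # like the "user boundary" in the kernel case.
        with self.lock:
            if self.suspend_requested:
                self.parked += 1
                self.parked_cv.notify()
                while self.suspend_requested:
                    self.resume_cv.wait()
                self.parked -= 1

    def single(self):
        # Request suspension and wait until every worker has parked.
        with self.lock:
            self.suspend_requested = True
            while self.parked < self.nworkers:
                self.parked_cv.wait()
            return self.parked  # caller is now the only active thread

    def end_single(self):
        # Let the parked threads run again.
        with self.lock:
            self.suspend_requested = False
            self.resume_cv.notify_all()

if __name__ == "__main__":
    st = SingleThreader(4)
    stop = threading.Event()

    def worker():
        while not stop.is_set():
            st.boundary()  # repeatedly reach the safe boundary

    threads = [threading.Thread(target=worker) for _ in range(4)]
    for t in threads:
        t.start()
    print("parked:", st.single())  # -> parked: 4
    stop.set()
    st.end_single()
    for t in threads:
        t.join()
```

The exit variant would differ in that the parked threads are asked to release their resources and terminate rather than merely wait, as in Julian's first case.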
From owner-freebsd-smp@FreeBSD.ORG Sat Nov 15 06:32:54 2008 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9DED9106567C for ; Sat, 15 Nov 2008 06:32:54 +0000 (UTC) (envelope-from xiazhongqi@huawei.com) Received: from szxga04-in.huawei.com (szxga04-in.huawei.com [119.145.14.67]) by mx1.freebsd.org (Postfix) with ESMTP id 825B98FC12 for ; Sat, 15 Nov 2008 06:32:33 +0000 (UTC) (envelope-from xiazhongqi@huawei.com) Received: from huawei.com (szxga04-in [172.24.2.12]) by szxga04-in.huawei.com (iPlanet Messaging Server 5.2 HotFix 2.14 (built Aug 8 2006)) with ESMTP id <0KAD004YZ3HZCC@szxga04-in.huawei.com>; Sat, 15 Nov 2008 14:32:23 +0800 (CST) Received: from huawei.com ([172.24.1.24]) by szxga04-in.huawei.com (iPlanet Messaging Server 5.2 HotFix 2.14 (built Aug 8 2006)) with ESMTP id <0KAD000613HZZ7@szxga04-in.huawei.com>; Sat, 15 Nov 2008 14:32:23 +0800 (CST) Received: from x49105 ([10.111.9.47]) by szxml04-in.huawei.com (iPlanet Messaging Server 5.2 HotFix 2.14 (built Aug 8 2006)) with ESMTPA id <0KAD00CYC3HZQ6@szxml04-in.huawei.com>; Sat, 15 Nov 2008 14:32:23 +0800 (CST) Date: Sat, 15 Nov 2008 14:32:23 +0800 From: Sam Xia In-reply-to: <20081114120026.294C21065801@hub.freebsd.org> To: freebsd-smp@freebsd.org Message-id: <001501c946eb$eb4c8810$2f096f0a@china.huawei.com> MIME-version: 1.0 X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2900.3350 X-Mailer: Microsoft Office Outlook 11 Content-type: text/plain; charset=us-ascii Content-transfer-encoding: 7BIT Thread-index: AclGUNMdpFZ9WkrzSieGz+98P7LHLAAmdHjQ Cc: ivoras@freebsd.org Subject: RE: freebsd-smp Digest, Vol 223, Issue 4 X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 15 Nov 2008 06:32:54 -0000 Hi Ivan, Thank you for your 
response. Yes, I have read the comments, but I am not very clear on the difference between "SINGLE_EXIT" and "SINGLE_BOUNDARY".

From the comments, I guess that this routine should suspend the other threads so that only one thread can run. But in the internal implementation of "thread_single", all other threads are woken up. I am very confused.

BR,
S.Xia

> Message: 4
> Date: Fri, 14 Nov 2008 11:10:52 +0100
> From: Ivan Voras
> Subject: Re: inquiry
> To: freebsd-smp@freebsd.org
> Message-ID:
> Content-Type: text/plain; charset="utf-8"
>
> Sam Xia wrote:
> > Dear all,
> >
> > I am a newcomer to the FreeBSD kernel and am reading its code.
> > Who can help me explain the purpose/usage/action of the routine
> > "thread_single()" in kern_thread.c of FreeBSD 7.0?
>
> Have you read the comment describing the function (it's there
> immediately before the function)?
>
> -------------- next part --------------
> A non-text attachment was scrubbed...
> Name: signature.asc
> Type: application/pgp-signature
> Size: 252 bytes
> Desc: OpenPGP digital signature
> Url : http://lists.freebsd.org/pipermail/freebsd-smp/attachments/20081114/67b95a7c/signature-0001.pgp
>
> ------------------------------
>
> _______________________________________________
> freebsd-smp@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-smp
> To unsubscribe, send any mail to "freebsd-smp-unsubscribe@freebsd.org"
>
> End of freebsd-smp Digest, Vol 223, Issue 4
> *******************************************
>