From owner-freebsd-threads@FreeBSD.ORG Wed Oct 8 19:32:40 2008 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7321B106569B; Wed, 8 Oct 2008 19:32:40 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from fallbackmx10.syd.optusnet.com.au (fallbackmx10.syd.optusnet.com.au [211.29.132.251]) by mx1.freebsd.org (Postfix) with ESMTP id 044E18FC13; Wed, 8 Oct 2008 19:32:39 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail05.syd.optusnet.com.au (mail05.syd.optusnet.com.au [211.29.132.186]) by fallbackmx10.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id m98C9SS1004217; Wed, 8 Oct 2008 23:09:28 +1100 Received: from c211-30-84-14.carlnfd3.nsw.optusnet.com.au (c211-30-84-14.carlnfd3.nsw.optusnet.com.au [211.30.84.14]) by mail05.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id m98C9OVj014049 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 8 Oct 2008 23:09:26 +1100 Date: Wed, 8 Oct 2008 22:09:24 +1000 (EST) From: Bruce Evans X-X-Sender: bde@delplex.bde.org To: John Baldwin In-Reply-To: <200810070938.04673.jhb@freebsd.org> Message-ID: <20081008210104.S20625@delplex.bde.org> References: <004001c92871$fdec0a10$01000001@china.huawei.com> <200810070938.04673.jhb@freebsd.org> MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="0-242184313-1223467764=:20625" Cc: freebsd-net@freebsd.org, freebsd-questions@freebsd.org, freebsd-threads@freebsd.org Subject: Re: kernel profiling: spinlock_exit consumes 36% CPU time. X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 08 Oct 2008 19:32:40 -0000 This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. --0-242184313-1223467764=:20625 Content-Type: TEXT/PLAIN; charset=X-UNKNOWN; format=flowed Content-Transfer-Encoding: QUOTED-PRINTABLE On Tue, 7 Oct 2008, John Baldwin wrote: > On Tuesday 07 October 2008 07:44:00 am =C7=F1=BD=A3 wrote: >> Hi, folks, >> >> I did kernel profiling when a single thread client sends UDP packets to = a >> single thread server on the same machine. >> >> In the output kernel profile, the first few kernel functions that consum= es >> the most CPU time are listed below: >> >> granularity: each sample hit covers 16 byte(s) for 0.01% of 25.68 second= s >> >> % cumulative self self total >> time seconds seconds calls ms/call ms/call name >> 42.4 10.88 10.88 0 100.00% __mcount [1] >> 36.1 20.14 9.26 17937541 0.00 0.00 spinlock_exit [4] >> 4.2 21.22 1.08 3145728 0.00 0.00 in_cksum_skip [40] >> 1.8 21.68 0.45 7351987 0.00 0.00 generic_copyin [43= ] >> 1.1 21.96 0.29 3146028 0.00 0.00 generic_copyout [4= 8] >> 1.0 22.21 0.24 2108904 0.00 0.00 Xint0x80_syscall [= 3] >> 0.8 22.42 0.21 6292131 0.00 0.00 uma_zalloc_arg [46= ] >> 0.8 22.62 0.20 1048576 0.00 0.00 soreceive_generic = [9] >> >> It is very strange that spinlock_exit consumes over 36% CPU time while i= t >> seems a very simple function. > > It's because the intr_restore() re-enables interrupts and the resulting t= ime > spent executing the handlers for any pending interrupts are attributed to > spinlock_exit(). This is one of many defects that are not present in high resolution kernel profiling (kgmon -B instead of kgmon -b; availaible on amd64 and i386). However, high resolution kernel profiling doesn't work right with SMP, and was completely broken by gcc-4. Ordinary profiling was less completely broken by gcc-4, and you can recover the old behaviour by turning off new optimizations (mainly -funit-at-a-time and/or -finline-functions-called-once and or all of -O2). Bruce --0-242184313-1223467764=:20625--