From owner-freebsd-hackers@FreeBSD.ORG Sun Apr 1 00:19:48 2012 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 5B47E106566C for ; Sun, 1 Apr 2012 00:19:48 +0000 (UTC) (envelope-from yuri@rawbw.com) Received: from shell0.rawbw.com (shell0.rawbw.com [198.144.192.45]) by mx1.freebsd.org (Postfix) with ESMTP id 2A0EB8FC14 for ; Sun, 1 Apr 2012 00:19:48 +0000 (UTC) Received: from eagle.yuri.org (stunnel@localhost [127.0.0.1]) (authenticated bits=0) by shell0.rawbw.com (8.14.4/8.14.4) with ESMTP id q310Jfng039358; Sat, 31 Mar 2012 17:19:44 -0700 (PDT) (envelope-from yuri@rawbw.com) Message-ID: <4F779F18.2000909@rawbw.com> Date: Sat, 31 Mar 2012 17:19:36 -0700 From: Yuri User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:10.0.3) Gecko/20120316 Thunderbird/10.0.3 MIME-Version: 1.0 To: Jason Hellenthal References: <4F775DF5.1020704@rawbw.com> <20120331212220.GA16306@DataIX.net> In-Reply-To: <20120331212220.GA16306@DataIX.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: hackers@freebsd.org Subject: Re: Is there any modern alternative to pstack? X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 01 Apr 2012 00:19:48 -0000 On 03/31/2012 14:22, Jason Hellenthal wrote: > procstat(1) I don't see which key of procstat(1) displays this information. The closest key is: -k "Display the stacks of kernel threads in the process" It shows kernel threads, but no user space stacks. How can I get user space stacks? 
Yuri From owner-freebsd-hackers@FreeBSD.ORG Sun Apr 1 18:19:28 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8343B106566B; Sun, 1 Apr 2012 18:19:28 +0000 (UTC) (envelope-from adutkowski@gmail.com) Received: from mail-ee0-f54.google.com (mail-ee0-f54.google.com [74.125.83.54]) by mx1.freebsd.org (Postfix) with ESMTP id E7B158FC14; Sun, 1 Apr 2012 18:19:27 +0000 (UTC) Received: by eekd17 with SMTP id d17so614662eek.13 for ; Sun, 01 Apr 2012 11:19:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=sqOExnJQYiSMRk1zQj385ak9qtJOTz0ewibwOMylOgs=; b=jmv4sfro2ilgsz0+Poqx4n5jzsXo8WTdQr0iim12IFBK/lI8t4XyQ5OCiyXrhb29ir CMwFM/azzcyth47CvPc71WmlXuj3qzaZfSKyeAe7yo8/aUq8Nxd/wFSmer0nuaOhV2p5 GGtbPGBhEEaugOFMV3lfgtcaLIenCRV1AKtj8NywUht2ct4ZxcyPLYvGFQ5+WNci82xg zuuO5MbAoYnUyz8lAkd4nQPvEQ9625k24OFDGi1NWHbkw+7L5ixJf22Fzj+/r7GmJ3WL PV++OOr7oOFk9TsxDmTGXmRmvdY5CeqMg46ouJ8bAXERoF79+RdiPQS2qoCmj5enDV+V 4hAg== MIME-Version: 1.0 Received: by 10.213.17.205 with SMTP id t13mr371287eba.4.1333304366692; Sun, 01 Apr 2012 11:19:26 -0700 (PDT) Received: by 10.213.16.133 with HTTP; Sun, 1 Apr 2012 11:19:26 -0700 (PDT) Date: Sun, 1 Apr 2012 20:19:26 +0200 Message-ID: From: Aleksander Dutkowski To: freebsd-arm@freebsd.org, freebsd-hackers@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Cc: Subject: [GSoC] [ARM] arm cleanup - my own proposal X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 01 Apr 2012 18:19:28 -0000 hello! after few weeks searching for interesting idea for me, I've decided to propose my own one. 
It is already mentioned on IdeasPage:
- ARM cleanup

Why have I chosen this one? I am very interested in the embedded world. I am currently porting FBSD to at91sam9g45, so I will be much more motivated working on an arm fbsd project than on any other.

Why should you let me do this project? While working on freebsd/arm I've noticed places that could be optimized or separated, e.g. at91_samsize() should be defined for each board separately; right now this function uses if-else checks to find out which board it is running on. I would like to identify and fix such problems, so the code will be more efficient and clearer. Moreover, I think there should be a tutorial/framework for adding new boards or SoCs, so that it will be simpler.

I am currently reading the code in sys/arm/at91 and searching for improvements, but I would be very pleased if you sent me your insights.

The first question is: should I clean up only the at91 branch, or more? I am quite familiar with at91 right now.
The second: how to test the code? Some boards could be tested in qemu; I could buy a board with at91rm9200, for example, if I'm in. But maybe I will find people here with their own boards who could help me with testing? I have an sbc6045 board with the at91sam9g45 SoC, but it doesn't have fbsd support yet (I'm working on it now :) )

I also thought about reducing kernel size for embedded, if arm cleanup won't fit.
-- regards Aleksander Dutkowski http://ping.kti.gda.pl/~aleek From owner-freebsd-hackers@FreeBSD.ORG Sun Apr 1 20:34:36 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 1B704106566C for ; Sun, 1 Apr 2012 20:34:36 +0000 (UTC) (envelope-from greglmiller@gmail.com) Received: from mail-wi0-f178.google.com (mail-wi0-f178.google.com [209.85.212.178]) by mx1.freebsd.org (Postfix) with ESMTP id A09838FC14 for ; Sun, 1 Apr 2012 20:34:35 +0000 (UTC) Received: by wibhq7 with SMTP id hq7so1695503wib.13 for ; Sun, 01 Apr 2012 13:34:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=3Imx7H3EtoEevd9j7zE2AfJpdARfD6O7dka8qbOALno=; b=VGLprVmzR0bSZYdTNUczbgifodSbFDV41TAEsw9VRs3Z/XcqzutOAygWUWUXvR2WJw 7heri9+hJXUmxFYc3D9M2KbT0rCu/4+9Gq/l4hdIBPguLFpCui6VeD8PI6tAVIXJPvlB gfwlLVFbS8PDrlqu89j/q5vChk/ytRUTieQiptBzQqcIZ75RdbKuVrt+3RscGIcJxSSk WIrg7ZMz3xYboruojU96zJg240pgR634RV7Tm4KvxtrI9rC1sbvF0QjAGzcqsVUt3qPc vuFzh+jLUItuoKjXrHNq7jcc9rWpbhkuVjDH8vULdixr/a/gBn2dEhU/Et21lX0ESiKo ngDw== MIME-Version: 1.0 Received: by 10.180.107.104 with SMTP id hb8mr17837291wib.8.1333312474301; Sun, 01 Apr 2012 13:34:34 -0700 (PDT) Received: by 10.216.65.80 with HTTP; Sun, 1 Apr 2012 13:34:34 -0700 (PDT) Date: Sun, 1 Apr 2012 15:34:34 -0500 Message-ID: From: Greg Miller To: freebsd-hackers@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Subject: GSoC mutex contention profiling and lock order verification X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 01 Apr 2012 20:34:36 -0000 The pthread mutex contention profiling and lock order verification entry on the ideas list caught my eye. 
I'm looking for a potential mentor, and any ideas or suggestions about what's desired in such a tool. The ideas page lists jeff@ as a contact, but I've not gotten a response as yet, so does anyone have an interest in this, and maybe some suggestions? From owner-freebsd-hackers@FreeBSD.ORG Mon Apr 2 14:31:48 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 5C7221065677; Mon, 2 Apr 2012 14:31:48 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id 336498FC15; Mon, 2 Apr 2012 14:31:48 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 93D4AB963; Mon, 2 Apr 2012 10:31:47 -0400 (EDT) From: John Baldwin To: freebsd-hackers@freebsd.org Date: Mon, 2 Apr 2012 08:31:09 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p10; KDE/4.5.5; amd64; ; ) References: <4F775DF5.1020704@rawbw.com> In-Reply-To: <4F775DF5.1020704@rawbw.com> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201204020831.09253.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Mon, 02 Apr 2012 10:31:47 -0400 (EDT) Cc: Yuri , hackers@freebsd.org Subject: Re: Is there any modern alternative to pstack? X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Apr 2012 14:31:48 -0000 On Saturday, March 31, 2012 3:41:41 pm Yuri wrote: > I look at seemingly abandoned sysutils/pstack, last modified upstream > 2002-11-27. > It doesn't really work on 9.0 i386, prints some errors. 
>
> It's functions, though, is quite desirable if one wants to understand
> why some multithreaded program hangs or is not responsive.
> Since there were no updates, I wonder, is this because there is some
> alternative in FreeBSD that I don't know about, or it is primarily due
> to the lack of interest/resources?
>
> I don't take gdb as alternative since it is not single line, and also it
> has some threading issues of its own.

Hmm, I don't know if the port has it, but I did some work on pstack a while
ago to make it work with libthread_db so it at least handles i386 ok. It
needs to be modified to use something like libunwind though or some other
unwinder. And possibly it should use libelf instead of its own ELF-parsing
code.

-- 
John Baldwin

From owner-freebsd-hackers@FreeBSD.ORG Mon Apr 2 15:23:54 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 64FE21065672 for ; Mon, 2 Apr 2012 15:23:54 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id 38FC28FC20 for ; Mon, 2 Apr 2012 15:23:54 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id B1F25B911; Mon, 2 Apr 2012 11:23:53 -0400 (EDT) From: John Baldwin To: freebsd-hackers@freebsd.org Date: Mon, 2 Apr 2012 11:23:53 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p10; KDE/4.5.5; amd64; ; ) References: <201203220803.57000.jhb@freebsd.org> In-Reply-To: MIME-Version: 1.0 Content-Type: Text/Plain; charset="utf-8" Content-Transfer-Encoding: 7bit Message-Id: <201204021123.53055.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Mon, 02 Apr 2012 11:23:53 -0400 (EDT) Cc: Mark Saad Subject: Re: Approaching the limit on PV entries X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Apr 2012 15:23:54 -0000

On Thursday, March 22, 2012 1:48:29 pm Mark Saad wrote:
> On Thu, Mar 22, 2012 at 8:03 AM, John Baldwin wrote:
> > On Wednesday, March 21, 2012 4:20:17 pm Mark Saad wrote:
> >> On Wed, Mar 21, 2012 at 12:39 PM, Sergey Kandaurov wrote:
> >> > On 21 March 2012 19:19, John Baldwin wrote:
> >> >> On Tuesday, March 20, 2012 11:37:57 am Sergey Kandaurov wrote:
> >> >>> On 22 November 2011 19:29, Mark Saad wrote:
> >> >>> > Hello All
> >> >>>
> >> >>> [found this mail in my drafts, not sure if my answer is still useful]
> >> >>>
> >> >>> > I want to get to the bottom of a warning in dmesg. On 7.2-RELEASE and
> >> >>> > 7.3-RELEASE I have seen the following warning in dmesg.
> >> >>> >
> >> >>> > Approaching the limit on PV entries, consider increasing either the
> >> >>> > vm.pmap.shpgperproc or the vm.pmap.pv_entry_max sysctl.
> >> >>> >
> >> >>> > So looking around I see a few posts here and there about how to tune
> >> >>> > the sysctls to address the warning however I am not 100% sure what
> >> >>> > each value does.
> >> >>> > It appears changing vm.pmap.shpgperproc affects the value of
> >> >>> > vm.pmap.pv_entry_max . Can someone explain the relationship of the two
> >> >>> > sysctls. Also
> >> >>>
> >> >>> This is how they are calculated.
> >> >>>
> >> >>> pv_entry_max = shpgperproc * maxproc + cnt.v_page_count;
> >> >>>
> >> >>> and, respectively,
> >> >>>
> >> >>> shpgperproc = (pv_entry_max - cnt.v_page_count) / maxproc;
> >> >>>
> >> >>> So, changing one sysctl will change another and vice versa.
> >> >>>
> >> >>> > what pitfalls of changing them are.
> >> >>>
> >> >>> Not known to me (on amd64 platform).
> >> >>> I have had vm.pmap.shpgperproc=15000 on 8.1 amd64 with 4G RAM
> >> >>> to make some badly written commercial software to work until it
> >> >>> was decommissioned to the scrap.
> >> >>
> >> >> FYI, Alan just removed this warning and the associated sysctls from HEAD
> >> >> yesterday because they were made obsolete several years ago. I think they are
> >> >> obsolete even on 7. Certainly on 8.
> >> >
> >> > Yep, and since switching to direct map (somewhere around 7.x on amd64?)
> >> > made PV entry limit factually obsolete, this is really cool.
> >> >
> >> > --
> >> > wbr,
> >> > pluknet
> >>
> >> Interesting so this warning is relevant in 7.x ?
> >
> > No, looks like it was obsolete starting with 7.0.
> >
> > --
> > John Baldwin
>
> Any chance it could be mfc'ed to 7-STABLE ?

I just merged it to stable/7.

-- John Baldwin From owner-freebsd-hackers@FreeBSD.ORG Mon Apr 2 16:39:32 2012 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9C04E1065673; Mon, 2 Apr 2012 16:39:32 +0000 (UTC) (envelope-from yuri@rawbw.com) Received: from shell0.rawbw.com (shell0.rawbw.com [198.144.192.45]) by mx1.freebsd.org (Postfix) with ESMTP id 83D778FC12; Mon, 2 Apr 2012 16:39:32 +0000 (UTC) Received: from eagle.yuri.org (stunnel@localhost [127.0.0.1]) (authenticated bits=0) by shell0.rawbw.com (8.14.4/8.14.4) with ESMTP id q32GdQxs007496; Mon, 2 Apr 2012 09:39:26 -0700 (PDT) (envelope-from yuri@rawbw.com) Message-ID: <4F79D63E.7010200@rawbw.com> Date: Mon, 02 Apr 2012 09:39:26 -0700 From: Yuri User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:10.0.3) Gecko/20120316 Thunderbird/10.0.3 MIME-Version: 1.0 To: John Baldwin References: <4F775DF5.1020704@rawbw.com> <201204020831.09253.jhb@freebsd.org> In-Reply-To: <201204020831.09253.jhb@freebsd.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-hackers@freebsd.org, hackers@freebsd.org Subject: Re: Is there any modern alternative to pstack? X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Apr 2012 16:39:32 -0000 On 04/02/2012 05:31, John Baldwin wrote: > Hmm, I don't know if the port has it, but I did some work on pstack a while > ago to make it work with libthread_db so it at least handles i386 ok. It > needs to be modified to use something like libunwind though or some other > unwinder. And possibly it should use libelf instead of its own ELF-parsing > code. 

I see pstack -1.2_1 failing even on i386:

pstack: cannot read context for thread 0x1879f
pstack: failed to read more threads
1947: /usr/local/share/chromium/chrome
----------------- thread 100255 -----------------
 0x1879f ???????? ()
----------------- thread -1 (running) -----------------
 0x389f1df9 __sys_recvmsg (3, bfbfcd44, 0, bfbfcd68, 0, c) + 5
 0x97850b4 _init (3, bfbfcdc8, 800, bfbfdc20, bfbfdc4c, bfbfdc40) + 15c7c1c
 0xa8089d0 _init (bfbfe074, 3, 0, bfbfe0c4, 20, bfbfdca0) + 264b538
 0xa8094d7 _init (bfbfe44c, 0, bfbfe108, 37a85517, 37aa7680, 38fbf400) + 264c03f
 0x8e7ec02 _init (bfbfe44c, bfbfe4a0, 3c, 0, 0, 0) + cc176a
 0x8e7f102 _init (bfbfe468, bfbfe44c, bfbfe4a0, 37a9f4b4, 37aa5d40, 1) + cc1c6a
 0x8e7f471 _init (2, bfbfe540, bfbfe4a0, 88f9c28, bd4dce8, bd4de88) + cc1fd9
 0x81c64ab _init (2, bfbfe540, bfbfe4e8, af61795, bfbfe500, bfbfe540) + 9013
 0x81c6452 _init (0, 0, bfbfe518, 81c63a7, 2, bfbfe540) + 8fba
 0x81c63a7 _init (3791afd0, 2, bfbfe540, 0, 0, 0) + 8f0f
 0x81c6318 _init (bfbfe6d0, bfbfe6f1, 0, bfbfe6ff, bfbfe762, bfbfe7b8) + 8e80

Yuri

From owner-freebsd-hackers@FreeBSD.ORG Mon Apr 2 16:59:18 2012 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2F7A2106564A for ; Mon, 2 Apr 2012 16:59:18 +0000 (UTC) (envelope-from rank1seeker@gmail.com) Received: from mail-bk0-f54.google.com (mail-bk0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id AE1898FC08 for ; Mon, 2 Apr 2012 16:59:17 +0000 (UTC) Received: by bkcjc3 with SMTP id jc3so3152973bkc.13 for ; Mon, 02 Apr 2012 09:59:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=message-id:from:to:subject:date:content-type :content-transfer-encoding:x-mailer; bh=5O4KvHSCWPPOGw5rVHWNioDISLNB1BlwRfIdpDCRNq8=; b=mdBRmREOIxHTU0xU6LiTMymGijE5MU9GwkSMKBHJ8huMOfDy3wmohfQrNgD9TKDyNA gZ1NawJLed1Q4+KumytRwujLfDg2yIpDFVpuIvvLSAO0V9f2aRz4ieRw1hrAr8WJkMED StJJdXWUL76zw3otfanbGE2umaXaQJextmXbfILTDGUZgQW3W1pawht/yjCbS9BZRQBw 5Bzq7dBlwfG5HZgApVBUYgFe1aOVcvTv3JrxRoE1A/g+Cy2gkcepOSlrr0D8PD41+EoS Peds/1xBqdkjhhZCd+kcttrCwWoRzSBwJ6G31qB3DqrnIsDPNUjY7XtoMthYfZ8kRBY3 E0gg==
Received: by 10.204.151.198 with SMTP id d6mr966612bkw.122.1333385956329; Mon, 02 Apr 2012 09:59:16 -0700 (PDT) Received: from DOMYPC ([82.193.208.173]) by mx.google.com with ESMTPS id r14sm39903714bkv.11.2012.04.02.09.59.12 (version=SSLv3 cipher=OTHER); Mon, 02 Apr 2012 09:59:15 -0700 (PDT) Message-ID: <20120402.165914.431.2@DOMY-PC> From: rank1seeker@gmail.com To: hackers@freebsd.org Date: Mon, 02 Apr 2012 18:59:14 +0200 Content-Type: text/plain; charset="Windows-1250" Content-Transfer-Encoding: quoted-printable X-Mailer: POP Peeper (3.8.1.0) Cc: Subject: Upgrading FreeBSD X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Apr 2012 16:59:18 -0000

Basically it consists of (re)compilation and installation of world + kernel
However, 2 parts are always omitted:
Stage 1 - mbr|boot0
Stage 2 - boot
I won't also mention GPT part ...

Stage 3 - loader (is "covered" by world install)

Should it be expanded to world + kernel + bootcodes?
The old way, one could pull its old bootcodes, from 6.0 to 10.0 world + kernel

And bootcodes are changing. I've got bitten by bug in stage 2 boot, during 8.2 -> 9.0.


Domagoj Smolčić

From owner-freebsd-hackers@FreeBSD.ORG Mon Apr 2 17:13:12 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 50FC210657B0; Mon, 2 Apr 2012 17:13:12 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id 1BBD18FC08; Mon, 2 Apr 2012 17:13:12 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 6839EB95E; Mon, 2 Apr 2012 13:13:11 -0400 (EDT) From: John Baldwin To: Yuri Date: Mon, 2 Apr 2012 13:12:36 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p10; KDE/4.5.5; amd64; ; ) References: <4F775DF5.1020704@rawbw.com> <201204020831.09253.jhb@freebsd.org> <4F79D63E.7010200@rawbw.com> In-Reply-To: <4F79D63E.7010200@rawbw.com> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201204021312.36568.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Mon, 02 Apr 2012 13:13:11 -0400 (EDT) Cc: freebsd-hackers@freebsd.org, hackers@freebsd.org Subject: Re: Is there any modern alternative to pstack? X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Apr 2012 17:13:12 -0000

On Monday, April 02, 2012 12:39:26 pm Yuri wrote:
> On 04/02/2012 05:31, John Baldwin wrote:
> > Hmm, I don't know if the port has it, but I did some work on pstack a while
> > ago to make it work with libthread_db so it at least handles i386 ok. It
> > needs to be modified to use something like libunwind though or some other
> > unwinder. And possibly it should use libelf instead of its own ELF-parsing
> > code.
>
> I see pstack -1.2_1 failing even on i386:
>
> pstack: cannot read context for thread 0x1879f
> pstack: failed to read more threads

Yes, threads don't work for modern binaries (newer than 4.x) without my
changes to make it use libthread_db. You can find the patch I used for this at
http://www.freebsd.org/~jhb/patches/pstack_threads.patch

-- 
John Baldwin

From owner-freebsd-hackers@FreeBSD.ORG Mon Apr 2 17:55:39 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 28EE1106564A for ; Mon, 2 Apr 2012 17:55:39 +0000 (UTC) (envelope-from jrytoung@gmail.com) Received: from mail-wg0-f42.google.com (mail-wg0-f42.google.com [74.125.82.42]) by mx1.freebsd.org (Postfix) with ESMTP id B369C8FC1B for ; Mon, 2 Apr 2012 17:55:38 +0000 (UTC) Received: by wgbds11 with SMTP id ds11so2663902wgb.1 for ; Mon, 02 Apr 2012 10:55:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=1jFQVM8F9/HE/wN+59aK9HO7O1m2+WbfUSOsG7lT6ok=; b=Abdelqjb8QecAKTVSZre+BaRUFeBQTT3IDUPyQJZw9fIc5Ib9IpMytd0mqazZkWhv0 DqvnvUzQSmJXVP4x/I/B37bmtyLIuh40+Rv2B03rhfxa/ATVU6K1msKxzaUpduCc4El9
6/qz9fdbXUvYw3rWvvA3RYSkW9MmBcdukK6drVi6zLZXJ6PWHmAiU1h83tBfMp6wICZK O4LZ6xziWpDMgk8+QMy3eW349cpF/nobw1svuWWkym++j/jT2EQcuNcLQUVJed9WTGTF 3/s9D3inyNd3G4MJVUJBS3wUK+aVHOrSfbq0DhIPa2nqNlaB+MskKFQIkfhYyaiIWOVb 1nZg== MIME-Version: 1.0 Received: by 10.180.107.101 with SMTP id hb5mr27204285wib.7.1333389331929; Mon, 02 Apr 2012 10:55:31 -0700 (PDT) Received: by 10.216.27.148 with HTTP; Mon, 2 Apr 2012 10:55:31 -0700 (PDT) Date: Mon, 2 Apr 2012 10:55:31 -0700 Message-ID: From: Jerry Toung To: freebsd-hackers Content-Type: text/plain; charset=ISO-8859-1 Subject: CAM disk I/O starvation X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Apr 2012 17:55:39 -0000 Hello list, I am convinced that there is a bug in the CAM code that leads to I/O starvation. I have already discussed this privately with some. I am now bringing this up to the general audience to get more feedback. My setup is that I have 1 RAID controller with 2 arrays connected to it, da0 and da1. The controller supports 252 tags. After boot up, camcontrol tags on da0 and da1 shows that both devices have 252 openings each. A process P0 writing on da0 is dormant most of the time, but would wake up with burst of I/Os, 5000-6000 ops as reported by gstat. A process P1 writing on da1 has a fixed data rate to da1 as reported by gstat. The issue: When P0 generates that burst of 5000-6000 ops, the write rate of P1 on da1 goes to 0 MB/sec for up to 8-9sec, vfs.hirunningspace starts climbing and we get into waithirunning() or getblk() sleep channel. BTW, raising hirunningspace has no effect on the 0 MB/s behavior. The first problem that I see here, is that if the sim's devq has 252 alloc_queue and send_queue, the struct cam_ed representing da0 and da1 should each have 126 openings and not 252. 
The second problem is that, clearly, there is no I/O fairness in CAM as seen in gstat output and da0 exclusively takes hold of the sim/controller until it has processed all its I/Os (8-9 seconds). The code that does this is at cam/cam_xpt.c:3030 3030 && (devq->alloc_openings > 0) and cam/cam_xpt.c:3091 3091 && (devq->send_openings > 0) After you've split the openings to 126 each, the tests above will always be true. I have a patch and it fixes those problems. I can share it with the list if requested. da0 and da1 now both automatically get 126 openings and based on that, extra logic implements fairness in cam/cam_xpt.c. No more 0 MB/s on da1. This is on 8.1-RELEASE FreeBSD. Any comments welcome. Thanks, Jerry From owner-freebsd-hackers@FreeBSD.ORG Mon Apr 2 18:06:30 2012 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx2.freebsd.org (mx2.freebsd.org [69.147.83.53]) by hub.freebsd.org (Postfix) with ESMTP id E10B3106564A; Mon, 2 Apr 2012 18:06:30 +0000 (UTC) (envelope-from dougb@FreeBSD.org) Received: from opti.dougb.net (hub.freebsd.org [IPv6:2001:4f8:fff6::36]) by mx2.freebsd.org (Postfix) with ESMTP id 6A98A14DEBA; Mon, 2 Apr 2012 18:06:30 +0000 (UTC) Message-ID: <4F79EAA6.5050004@FreeBSD.org> Date: Mon, 02 Apr 2012 11:06:30 -0700 From: Doug Barton Organization: http://SupersetSolutions.com/ User-Agent: Mozilla/5.0 (X11; FreeBSD i386; rv:10.0.2) Gecko/20120218 Thunderbird/10.0.2 MIME-Version: 1.0 To: Joe Greco References: <201203301441.q2UEfqIE097518@aurora.sol.net> In-Reply-To: <201203301441.q2UEfqIE097518@aurora.sol.net> X-Enigmail-Version: 1.3.5 OpenPGP: id=1A1ABC84 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-hackers@FreeBSD.org, Mark Felder , freebsd-questions@FreeBSD.org Subject: Re: Please help me diagnose this crazy VMWare/FreeBSD 8.x crash X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Apr 2012 18:06:31 -0000 On 03/30/2012 07:41, Joe Greco wrote: >> On 3/29/2012 7:01 AM, Joe Greco wrote: >>>> On 3/28/2012 1:59 PM, Mark Felder wrote: >>>>> FreeBSD 8-STABLE, 8.3, and 9.0 are untested >>>> >>>> As much as I'm sensitive to your production requirements, realistically >>>> it's not likely that you'll get a helpful result without testing a newer >>>> version. 8.2 came out over a year ago, many many things have changed >>>> since then. >>>> >>>> Doug >>> >>> So you're saying that he should have been using 8.3-RELEASE, then. >> >> That isn't what I said at all, sorry if I wasn't clear. The OP mentioned >> 9.0-RELEASE, and in the context of his message (which I snipped) he >> mentioned 8-stable. That's what I was referring to. > > And since both the poster and I made it clear that this doesn't seem > to be a case of "it fails reliably on a machine of your choosing", > just installing random other versions and hoping that it's going to > cause a fail ... well, let's just say that doesn't make a whole lot > of sense. Or at least it's a recipe for a hell of a lot of busywork, > busywork not guaranteed to return any sort of useful result. And since you can't reliably reproduce the problem, how do you expect us to? I understand that these sorts of bugs are difficult/annoying, etc. Been there, done that. > In the meantime, it's unrealistic to tell people to use supported > releases, to wait fifteen months between releases, and then to criticize > people complaining about problems with a supported release for "using > old code". Just to be clear, I didn't criticize anyone. And I share your frustration with the length of the 8.3 release cycle. I really wish I had a better answer, but as much as you and I may wish that things were different, "Try a newer version" is the best answer we have atm. 
Doug -- This .signature sanitized for your protection From owner-freebsd-hackers@FreeBSD.ORG Mon Apr 2 18:43:37 2012 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 928601065672; Mon, 2 Apr 2012 18:43:37 +0000 (UTC) (envelope-from jgreco@aurora.sol.net) Received: from mail2.sol.net (mail2.sol.net [206.55.64.73]) by mx1.freebsd.org (Postfix) with ESMTP id 643428FC1F; Mon, 2 Apr 2012 18:43:37 +0000 (UTC) Received: from aurora.sol.net (IDENT:jgreco@aurora.sol.net [206.55.70.98]) by mail2.sol.net (8.14.4/8.14.4/SNNS-1.04) with ESMTP id q32IhPTU035713; Mon, 2 Apr 2012 13:43:26 -0500 (CDT) Received: (from jgreco@localhost) by aurora.sol.net (8.14.3/8.14.3/Submit) id q32IhPGZ053424; Mon, 2 Apr 2012 13:43:25 -0500 (CDT) From: Joe Greco Message-Id: <201204021843.q32IhPGZ053424@aurora.sol.net> To: dougb@FreeBSD.org (Doug Barton) Date: Mon, 2 Apr 2012 13:43:25 -0500 (CDT) In-Reply-To: <4F79EAA6.5050004@FreeBSD.org> X-Mailer: ELM [version 2.5 PL8] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Cc: freebsd-hackers@FreeBSD.org, Mark Felder , freebsd-questions@FreeBSD.org Subject: Re: Please help me diagnose this crazy VMWare/FreeBSD 8.x crash X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Apr 2012 18:43:37 -0000 > On 03/30/2012 07:41, Joe Greco wrote: > >> On 3/29/2012 7:01 AM, Joe Greco wrote: > >>>> On 3/28/2012 1:59 PM, Mark Felder wrote: > >>>>> FreeBSD 8-STABLE, 8.3, and 9.0 are untested > >>>> > >>>> As much as I'm sensitive to your production requirements, realistically > >>>> it's not likely that you'll get a helpful result without testing a newer > >>>> version. 
8.2 came out over a year ago, many many things have changed > >>>> since then. > >>>> > >>>> Doug > >>> > >>> So you're saying that he should have been using 8.3-RELEASE, then. > >> > >> That isn't what I said at all, sorry if I wasn't clear. The OP mentioned > >> 9.0-RELEASE, and in the context of his message (which I snipped) he > >> mentioned 8-stable. That's what I was referring to. > > > > And since both the poster and I made it clear that this doesn't seem > > to be a case of "it fails reliably on a machine of your choosing", > > just installing random other versions and hoping that it's going to > > cause a fail ... well, let's just say that doesn't make a whole lot > > of sense. Or at least it's a recipe for a hell of a lot of busywork, > > busywork not guaranteed to return any sort of useful result. > > And since you can't reliably reproduce the problem, how do you expect us > to? I understand that these sorts of bugs are difficult/annoying, etc. > Been there, done that. Nobody expected you to. We're trying to figure out any commonalities that might exist; these may serve to help shed light on where the problem lies. The interesting thing is that I took it and looked at it and came to a conclusion that might have been wrong, though I think the trail of reasoning I used was itself reasonable, given my exceedingly small (one example of problem) sample size. Mark's able to actually *reproduce* the problem on separate installs and with circumstances that are at least somewhat different than what my theory involved, though it is not quite possible to rule out some sort of corruption. Since I have to *assume* that many sites run some sort of FreeBSD on their VMware gear, given that VMware actually lists it as a supported version and VMware generally does things "for profit", I am still kind of of the opinion that this is some sort of corruption bug, one that I triggered inadvertently, but one that Mark's environment reproduces rather more frequently. 
That just seems so unlikely, but more unlikely things have come to pass, so I'm holding onto it as my working theory ;-) I still plan to try to recover my broken VM from backups at some point if time permits. But in short, to answer your question: I don't *care* if you can reproduce the problem. As a user, you can't win. If you don't report a problem, you get criticized. If you report a problem but can't figure out how to reproduce it, you get criticized. If you can reproduce it but you don't submit a workaround, you get criticized. If you submit a workaround but you don't submit a patch, you get criticized. If you submit a patch but it's not in the preferred format, you get criticized. Hm. ... JG -- Joe Greco - sol.net Network Services - Milwaukee, WI - http://www.sol.net "We call it the 'one bite at the apple' rule. Give me one chance [and] then I won't contact you again." - Direct Marketing Ass'n position on e-mail spam(CNN) With 24 million small businesses in the US alone, that's way too many apples. 
From owner-freebsd-hackers@FreeBSD.ORG Mon Apr 2 20:53:45 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A33C91065678; Mon, 2 Apr 2012 20:53:45 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id 52D008FC21; Mon, 2 Apr 2012 20:53:45 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id AADE5B95A; Mon, 2 Apr 2012 16:53:44 -0400 (EDT) From: John Baldwin To: Maninya M Date: Mon, 2 Apr 2012 16:42:29 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p10; KDE/4.5.5; amd64; ; ) References: <201203290944.11446.jhb@freebsd.org> In-Reply-To: MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-15" Content-Transfer-Encoding: 7bit Message-Id: <201204021642.29578.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Mon, 02 Apr 2012 16:53:44 -0400 (EDT) Cc: freebsd-hackers@freebsd.org Subject: Re: __NR_mmap2 in FreeBSD X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Apr 2012 20:53:45 -0000 On Saturday, March 31, 2012 5:40:50 pm Maninya M wrote: > Thanks. > > I've tried this. Still getting some allocation problems. > > if (temp_regs.r_eax != addr) > warn("Wanted space at address 0x%.8x, mmap2 system call returned > 0x%.8x. This could be a problem.",addr,temp_regs.r_eax); > > What can I do? Please help. Hmm, can you capture a ktrace of the target process during this so you can see if the kernel sees the mmap request properly? 
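For comparison with whatever the ktrace shows, here is a minimal sketch of the argument block such an injected call would need. It assumes the FreeBSD i386 convention that the kernel skips one word at %esp (the slot where a libc stub's return address would sit) and reads the arguments upward from %esp + 4, and it assumes the mmap(2) argument order addr, len, prot, flags, fd, then a 64-bit offset. build_mmap_frame and MMAP_FRAME_WORDS are hypothetical names, and nothing here is taken from the program under discussion; the layout should be checked against a disassembled mmap() call before relying on it.

```c
#include <assert.h>
#include <stdint.h>

#define MMAP_FRAME_WORDS 8	/* 1 dummy slot + 5 words + 2 words of offset */

/*
 * Hypothetical helper: fill frame[] with the 32-bit words a tracer would
 * write at new_esp + 4*i before single-stepping an INT 0x80.
 * Assumption: arguments start one word above %esp, in declaration order.
 */
static void
build_mmap_frame(uint32_t frame[MMAP_FRAME_WORDS], uint32_t addr,
    uint32_t len, uint32_t prot, uint32_t flags, int32_t fd, uint64_t pos)
{
	frame[0] = 0;					/* dummy return-address slot */
	frame[1] = addr;				/* requested address */
	frame[2] = len;					/* length in bytes */
	frame[3] = prot;				/* PROT_* bits */
	frame[4] = flags;				/* MAP_* bits */
	frame[5] = (uint32_t)fd;			/* -1 for anonymous memory */
	frame[6] = (uint32_t)(pos & 0xffffffffu);	/* offset, low word */
	frame[7] = (uint32_t)(pos >> 32);		/* offset, high word */
}
```

Word i of the frame would then be written at new_esp + 4*i, for example with PT_WRITE_D, and the syscall number is probably better taken from SYS_mmap in <sys/syscall.h> than hard-coded.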
> > void map_memory(unsigned long addr, unsigned long size, int flags) > { > int status; > struct reg regs,temp_regs; > unsigned long int_instr = 0x000080cd; /* INT 0x80 */ > printf("%x\n",addr); > //addr=addr&0xffff0000; > if (ptrace(PT_GETREGS,exec_pid,(caddr_t)&regs,0) < 0) > die_perror("ptrace(PTRACE_GETREGS,%d,(caddr_t)&regs,0)",exec_pid); > > /* mmap2 system call seems to take arguments as follows: > * eax = __NR_mmap2 > * ebx = (unsigned long) page aligned address > * ecx = (unsigned long) page aligned file size > * edx = protection > * esi = flags > * Other arguments (fd and pgoff) are not required for anonymous mapping > */ > temp_regs = regs; > > //printf("temp=%u, \teip=%u\tregs=%u\teip=%u\n",&temp_regs,temp_regs.r_eip,&regs,regs.r_eip); > // temp_regs.r_eax = __NR_mmap2; > temp_regs.r_eax=71; > /*temp_regs.r_ebx = addr; > temp_regs.r_ecx = size; > temp_regs.r_edx = flags; > temp_regs.r_esi = MAP_PRIVATE | MAP_ANONYMOUS;*/ > //push size > > //temp_regs.r_eip = temp_regs.r_esp - 4; > > //printf("temp=%u, \teip=%u\tregs=%u\teip=%u\n",&temp_regs,temp_regs.r_eip,&regs,regs.r_eip); > > if (ptrace(PT_WRITE_D,exec_pid,(void *)(temp_regs.r_esp-4),addr) < 0) > die_perror("ptrace(PT_WRITE,%d,0x%.8x,0x%.8x) failed > ADDER",exec_pid,temp_regs.r_esp,addr); > > if (ptrace(PT_WRITE_D,exec_pid,(void *)(temp_regs.r_esp-8),size) < 0) > die_perror("ptrace(PT_WRITE,%d,0x%.8x,INT 0x80) failed > size",exec_pid,temp_regs.r_esp); > > if (ptrace(PT_WRITE_D,exec_pid,(void *)(temp_regs.r_esp-12),flags) < 0) > die_perror("ptrace(PT_WRITE,%d,0x%.8x,INT 0x80) failed > protections",exec_pid,temp_regs.r_esp); > > if (ptrace(PT_WRITE_D,exec_pid,(void > *)(temp_regs.r_esp-16),MAP_PRIVATE|MAP_ANON|MAP_FIXED) < 0) > die_perror("ptrace(PT_WRITE,%d,0x%.8x,INT 0x80) failed > flags",exec_pid,temp_regs.r_esp); > > if (ptrace(PT_WRITE_D,exec_pid,(void *)(temp_regs.r_esp-20),-1) < 0) > die_perror("ptrace(PT_WRITE,%d,0x%.8x,0x%.8x) failed > ADDER",exec_pid,temp_regs.r_esp,addr); > > if 
(ptrace(PT_WRITE_D,exec_pid,(void *)(temp_regs.r_esp-24),0) < 0) > die_perror("ptrace(PT_WRITE,%d,0x%.8x,0x%.8x) failed > offset1",exec_pid,temp_regs.r_esp,addr); > if (ptrace(PT_WRITE_D,exec_pid,(void *)(temp_regs.r_esp-28),0) < 0) > die_perror("ptrace(PT_WRITE,%d,0x%.8x,0x%.8x) failed > offset1",exec_pid,temp_regs.r_esp,addr); > > > /* > if (ptrace(PT_WRITE_I,exec_pid,(void *)(temp_regs.r_eip),0x000080cd) < 0) > die_perror("ptrace(PT_WRITE,%d,0x%.8x,INT 0x80) failed while allocating > memory",exec_pid,temp_regs.r_eip); > */ > if (ptrace(PT_WRITE_I,exec_pid,(void *)(temp_regs.r_eip),0x000080cd) < 0) > die_perror("ptrace(PT_WRITE,%d,0x%.8x,INT 0x80) failed while allocating > memory",exec_pid,temp_regs.r_eip); > > //temp_regs.r_eip = temp_regs.r_esp - 32; > temp_regs.r_esp = temp_regs.r_esp - 28; > > if (ptrace(PT_SETREGS,exec_pid,(caddr_t)&temp_regs,0) < 0) { > die_perror("ptrace(PT_SETREGS,%d,...) failed while allocating > memory",exec_pid); > } > if (ptrace(PT_STEP,exec_pid,NULL,0) < 0) > die_perror("ptrace(PT_STEP,...) failed while executing mmap2"); > > wait(&status); > if (WIFEXITED(status)) > die("Restarted process abrubtly (exited with value %d). Aborting > Restart.",WEXITSTATUS(status)); > else if (WIFSIGNALED(status)) > die("Restarted process abrubtly exited because of uncaught signal (%d). > Aborting Restart.",WTERMSIG(status)); > > if (ptrace(PT_GETREGS,exec_pid,(caddr_t)&temp_regs,0) < 0) { > die_perror("ptrace(PT_GETREGS,...) failed after executing mmap2 system > call"); > } > //fprintf(stdout,"hello iam here \n"); > if (temp_regs.r_eax != addr) > warn("Wanted space at address 0x%.8x, mmap2 system call returned > 0x%.8x. This could be a problem.",addr,temp_regs.r_eax); > else if (cr_options.verbose) > > fprintf(stdout,"Successfully allocated [0x%.8lx - > 0x%.8lx]\n",addr,addr+size); > > /* Restore original registers */ > if (ptrace(PT_SETREGS,exec_pid,(caddr_t)&temp_regs,0) < 0) { > die_perror("ptrace(PT_SETREGS,...) 
when restoring registering after > allocating memory (mmap2)"); > > } > } > > > > > On 29 March 2012 19:14, John Baldwin wrote: > > > On Thursday, March 29, 2012 9:15:43 am Maninya M wrote: > > > Thanks a lot for replying! > > > Ok I've tried this to push arguments onto stack. > > > Is it right? > > > I get an error at this line: > > > > > > die_perror("ptrace(PT_WRITE,%d,0x%.8x,INT 0x80) failed while > > > dasfallocating memory",exec_pid,temp_regs.r_eip); > > > > > > > > > Please tell me what to do. > > > > > > > > > > > > > > > > > > void map_memory(unsigned long addr, unsigned long size, int flags) > > > { > > > int status; > > > struct reg regs,temp_regs; > > > unsigned long int_instr = 0x000080cd; /* INT 0x80 */ > > > > > > if (ptrace(PT_GETREGS,exec_pid,(caddr_t)&regs,0) < 0) > > > die_perror("ptrace(PTRACE_GETREGS,%d,(caddr_t)&regs,0)",exec_pid); > > > > > > /* mmap2 system call seems to take arguments as follows: > > > * eax = __NR_mmap2 > > > * ebx = (unsigned long) page aligned address > > > * ecx = (unsigned long) page aligned file size > > > * edx = protection > > > * esi = flags > > > * Other arguments (fd and pgoff) are not required for anonymous > > mapping > > > */ > > > temp_regs = regs; > > > > > > //printf("temp=%u, > > \teip=%u\tregs=%u\teip=%u\n",&temp_regs,temp_regs.r_eip,&regs,regs.r_eip); > > > // temp_regs.r_eax = __NR_mmap2; > > > temp_regs.r_eax=71; > > > /*temp_regs.r_ebx = addr; > > > temp_regs.r_ecx = size; > > > temp_regs.r_edx = flags; > > > temp_regs.r_esi = MAP_PRIVATE | MAP_ANONYMOUS;*/ > > > //push size > > > > > > //temp_regs.r_eip = temp_regs.r_esp - 4; > > > > You still want this, it is putting the instruction on the stack. However, > > your stack layout is wrong I think. You actually want it to be something > > like > > this: > > > > r_esp - 4: > > r_esp - 8: > > r_esp - 12: > > r_esp - 16: (MAP_FIXED?) 
> > r_esp - 20: > > r_esp - 24: > > r_esp - 28: > > r_esp - 32: > > > > Then you want to set: > > > > r_eip = r_esp - 32; > > r_esp -= 28; > > > > I think you want MAP_FIXED since it complains if the returned address > > doesn't > > match 'addr' at the end of your routine. However, it might be best if you > > just compiled a program that called mmap() and then looked at the > > disassembly > > to make sure the stack layout is correct. > > > > > //printf("temp=%u, > > \teip=%u\tregs=%u\teip=%u\n",&temp_regs,temp_regs.r_eip,&regs,regs.r_eip); > > > if (ptrace(PT_WRITE_D,exec_pid,(void *)(temp_regs.r_esp-4),MAP_PRIVATE | > > > MAP_ANONYMOUS) < 0) > > > die_perror("ptrace(PT_WRITE,%d,0x%.8x,INT 0x80) failed while > > allocating > > > memory",exec_pid,temp_regs.r_eip); > > > > > > if (ptrace(PT_WRITE_D,exec_pid,(void *)(temp_regs.r_esp-8),flags) < 0) > > > die_perror("ptrace(PT_WRITE,%d,0x%.8x,INT 0x80) failed while > > allocating > > > memory",exec_pid,temp_regs.r_eip); > > > > > > if (ptrace(PT_WRITE_D,exec_pid,(void *)(temp_regs.r_esp-12),size) < 0) > > > die_perror("ptrace(PT_WRITE,%d,0x%.8x,INT 0x80) failed while > > allocating > > > memory",exec_pid,temp_regs.r_eip); > > > > > > if (ptrace(PT_WRITE_D,exec_pid,(void *)(temp_regs.r_esp-16), addr) < 0); > > > die_perror("ptrace(PT_WRITE,%d,0x%.8x,INT 0x80) failed while > > > dasfallocating memory",exec_pid,temp_regs.r_eip); > > > /* > > > if (ptrace(PT_WRITE_I,exec_pid,(void *)(temp_regs.r_eip),0x000080cd) < 0) > > > die_perror("ptrace(PT_WRITE,%d,0x%.8x,INT 0x80) failed while > > allocating > > > memory",exec_pid,temp_regs.r_eip); > > > */ > > > if (ptrace(PT_WRITE_I,exec_pid,(void *)(temp_regs.r_eip),0x000080cd) < > > 0) > > > die_perror("ptrace(PT_WRITE,%d,0x%.8x,INT 0x80) failed while > > allocating > > > memory",exec_pid,temp_regs.r_eip); > > > if (ptrace(PT_SETREGS,exec_pid,(caddr_t)&temp_regs,0) < 0) { > > > die_perror("ptrace(PT_SETREGS,%d,...) 
failed while allocating > > > memory",exec_pid); > > > } > > > if (ptrace(PT_STEP,exec_pid,NULL,0) < 0) > > > die_perror("ptrace(PT_STEP,...) failed while executing mmap2"); > > > > > > wait(&status); > > > if (WIFEXITED(status)) > > > die("Restarted process abrubtly (exited with value %d). Aborting > > > Restart.",WEXITSTATUS(status)); > > > else if (WIFSIGNALED(status)) > > > die("Restarted process abrubtly exited because of uncaught signal > > (%d). > > > Aborting Restart.",WTERMSIG(status)); > > > > > > if (ptrace(PT_GETREGS,exec_pid,(caddr_t)&temp_regs,0) < 0) { > > > die_perror("ptrace(PT_GETREGS,...) failed after executing mmap2 > > system > > > call"); > > > } > > > //fprintf(stdout,"hello iam here \n"); > > > if (temp_regs.r_eax != addr) > > > warn("Wanted space at address 0x%.8x, mmap2 system call returned > > > 0x%.8x. This could be a problem.",addr,temp_regs.r_eax); > > > else if (cr_options.verbose) > > > > > > fprintf(stdout,"Successfully allocated [0x%.8lx - > > > 0x%.8lx]\n",addr,addr+size); > > > > > > /* Restore original registers */ > > > if (ptrace(PT_SETREGS,exec_pid,(caddr_t)&temp_regs,0) < 0) { > > > die_perror("ptrace(PT_SETREGS,...) when restoring registering after > > > allocating memory (mmap2)"); > > > > > > } > > > } > > > > > > > > > > > > > > > > > > > > > On 27 March 2012 17:23, John Baldwin wrote: > > > > > > > On Monday, March 26, 2012 1:56:08 pm Maninya M wrote: > > > > > I am trying to convert a function written for Linux to FreeBSD. > > > > > What is the equivalent of the __NR_mmap2 system call in FreeBSD? > > > > > > > > > > I keep getting the error because of this exception: > > > > > warn("Wanted space at address 0x%.8x, mmap2 system call returned > > 0x%.8x. > > > > > This could be a problem.",addr,temp_regs.eax); > > > > > > > > I think you could just use plain mmap() for this? > > > > > > > > However, it seems that this is injecting a call into an existing > > binary, > > > > not calling mmap() directly. 
A few things will need to change. First, > > > > FreeBSD system calls on i386 put their arguments on the stack, not in > > > > registers, so you will need to do a bit more work to push the arguments > > > > onto > > > > the stack rather than just setting registers. > > > > > > > > > I changed > > > > > temp_regs.eax = __NR_mmap2; > > > > > to > > > > > temp_regs.eax = 192; > > > > > > > > > > but it didn't work. I suppose I couldn't understand this function. > > Please > > > > > help. > > > > > > > > > > This is the function: > > > > > > > > > > void map_memory(unsigned long addr, unsigned long size, int flags) > > > > > { > > > > > int status; > > > > > struct user_regs_struct regs,temp_regs; > > > > > unsigned long int_instr = 0x000080cd; /* INT 0x80 */ > > > > > > > > > > if (ptrace(PTRACE_GETREGS,exec_pid,NULL,&regs) < 0) > > > > > die_perror("ptrace(PTRACE_GETREGS,%d,NULL,&regs)",exec_pid); > > > > > > > > > > /* mmap2 system call seems to take arguments as follows: > > > > > * eax = __NR_mmap2 > > > > > * ebx = (unsigned long) page aligned address > > > > > * ecx = (unsigned long) page aligned file size > > > > > * edx = protection > > > > > * esi = flags > > > > > * Other arguments (fd and pgoff) are not required for anonymous > > > > mapping > > > > > */ > > > > > temp_regs = regs; > > > > > temp_regs.eax = __NR_mmap2; > > > > > temp_regs.ebx = addr; > > > > > temp_regs.ecx = size; > > > > > temp_regs.edx = flags; > > > > > temp_regs.esi = MAP_PRIVATE | MAP_ANONYMOUS; > > > > > temp_regs.eip = temp_regs.esp - 4; > > > > > > > > > > if (ptrace(PTRACE_POKETEXT,exec_pid,(void > > > > > *)(temp_regs.eip),(void*)int_instr) < 0) > > > > > die_perror("ptrace(PTRACE_POKETEXT,%d,0x%.8x,INT 0x80) failed > > while > > > > > allocating memory",exec_pid,temp_regs.eip); > > > > > if (ptrace(PTRACE_SETREGS,exec_pid,NULL,&temp_regs) < 0) { > > > > > die_perror("ptrace(PTRACE_SETREGS,%d,...) 
failed while allocating > > > > > memory",exec_pid); > > > > > } > > > > > if (ptrace(PTRACE_SINGLESTEP,exec_pid,NULL,NULL) < 0) > > > > > die_perror("ptrace(PTRACE_SINGLESTEP,...) failed while executing > > > > > mmap2"); > > > > > > > > > > wait(&status); > > > > > if (WIFEXITED(status)) > > > > > die("Restarted process abrubtly (exited with value %d). Aborting > > > > > Restart.",WEXITSTATUS(status)); > > > > > else if (WIFSIGNALED(status)) > > > > > die("Restarted process abrubtly exited because of uncaught signal > > > > (%d). > > > > > Aborting Restart.",WTERMSIG(status)); > > > > > > > > > > if (ptrace(PTRACE_GETREGS,exec_pid,NULL,&temp_regs) < 0) { > > > > > die_perror("ptrace(PTRACE_GETREGS,...) failed after executing > > mmap2 > > > > > system call"); > > > > > } > > > > > > > > > > if (temp_regs.eax != addr) > > > > > warn("Wanted space at address 0x%.8x, mmap2 system call returned > > > > > 0x%.8x. This could be a problem.",addr,temp_regs.eax); > > > > > else if (cr_options.verbose) > > > > > fprintf(stdout,"Successfully allocated [0x%.8lx - > > > > > 0x%.8lx]\n",addr,addr+size); > > > > > > > > > > /* Restore original registers */ > > > > > if (ptrace(PTRACE_SETREGS,exec_pid,NULL,&regs) < 0) { > > > > > die_perror("ptrace(PTRACE_SETREGS,...) 
when restoring registering > > > > after > > > > > allocating memory (mmap2)"); > > > > > } > > > > > } > > > > > > > > > > -- > > > > > Maninya > > > > > _______________________________________________ > > > > > freebsd-hackers@freebsd.org mailing list > > > > > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > > > > > To unsubscribe, send any mail to " > > > > freebsd-hackers-unsubscribe@freebsd.org" > > > > > > > > > > > > > -- > > > > John Baldwin > > > > > > > > > > > > > > > > -- > > > Maninya > > > > > > > -- > > John Baldwin > > > > > > -- > Maninya > -- John Baldwin From owner-freebsd-hackers@FreeBSD.ORG Mon Apr 2 21:05:31 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id E30F6106566C for ; Mon, 2 Apr 2012 21:05:31 +0000 (UTC) (envelope-from eric@shadowsun.net) Received: from mail.atlantawebhost.com (dns1.atlantawebhost.com [66.223.40.39]) by mx1.freebsd.org (Postfix) with ESMTP id 7A4B58FC14 for ; Mon, 2 Apr 2012 21:05:31 +0000 (UTC) Received: (qmail 32272 invoked from network); 2 Apr 2012 16:58:50 -0400 Received: from c-24-62-202-164.hsd1.ma.comcast.net (HELO ?192.168.1.26?) 
(24.62.202.164) by mail.atlantawebhost.com with SMTP; 2 Apr 2012 16:58:50 -0400 Message-ID: <4F7A1301.4060900@shadowsun.net> Date: Mon, 02 Apr 2012 16:58:41 -0400 From: Eric McCorkle User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:10.0.3) Gecko/20120318 Thunderbird/10.0.3 MIME-Version: 1.0 To: freebsd-hackers@freebsd.org X-Enigmail-Version: 1.4 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enig45211B0DB8FDE12A9CAB1747" Subject: GSoC: EFI on intel X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Apr 2012 21:05:32 -0000 This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enig45211B0DB8FDE12A9CAB1747 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable I'm assessing possible summer of code projects, and the EFI work caught my attention. I've been running FreeBSD on a macbook for a little under a year now, and booting on EFI is definitely an interest to me. Does anyone know if this is still a viable project proposal? I certainly have the skills to undertake it, I just want to make sure that it stands a chance of actually being selected. 
--------------enig45211B0DB8FDE12A9CAB1747 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iQIcBAEBAgAGBQJPehMJAAoJENSCzbQ+koZ7cQsP/0r8xtv9z1MZTXufqoKynlXG ShvbeYrSJL59GHkraEXjum07/1LcRUHlv+Fiewh/4qcwUP07rG8KZLZSFa+XLkHg B0nvHED7AGj4PRD0ykEr5w3rqBoSiEku1Q2nrQIpcLEnxTrYK/+nwbR10BKmB71k kHfSKC53lLYBuZaLkf1aAIRmFp7tpBWU2LwYMS2ZjhZpBH7VdCbu3mirZUub19DI OlcBin8s2TyYVGh+5I1KUsbihR85tXPRyUODYqAtbs1/sFlc7xMAeLKjL5inweOK G0R3tnGAXCM3j3Bu3V4Ztat8CDYQZ3PHoJH2xb5Ty5dNKVy1O6z138sj6oit9tn0 aJmWi3Olh3/p5JbmcqCTLDq0ggLW2DPHXuPX52Pe2ZG5cxIIulGKkTxPPS3PKYYb N/obmPxKe6zf7IN8IlzC5kO1DYys1zSmdsKS0q/0UaAYIG9eCDjHfLcMxjq7Msdj v4H5oCMLYo7Y+mJD/X1cCOfKJMXqAtegmcvuAgu0moo7XNT7qcarpSF+dKBj9NF1 XG8nwAfszMsSu39DKLdyFHfjkG0sGartngjAwmCjFCUzcFTDMsi5xep7Xcd2o6qe obVXpq/VeGWYOEYr3dmDcwzS9CeirbyTG57b3qL66T4wsEgorksdpFFwpxapz/KW 7HS7jniHp0K3UnaCEw9D =ogXF -----END PGP SIGNATURE----- --------------enig45211B0DB8FDE12A9CAB1747-- From owner-freebsd-hackers@FreeBSD.ORG Mon Apr 2 21:19:23 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 408A71065676 for ; Mon, 2 Apr 2012 21:19:23 +0000 (UTC) (envelope-from nonesuch@longcount.org) Received: from mail-lpp01m010-f54.google.com (mail-lpp01m010-f54.google.com [209.85.215.54]) by mx1.freebsd.org (Postfix) with ESMTP id AEC488FC1B for ; Mon, 2 Apr 2012 21:19:22 +0000 (UTC) Received: by lagv3 with SMTP id v3so5273776lag.13 for ; Mon, 02 Apr 2012 14:19:21 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:x-originating-ip:in-reply-to:references:date :message-id:subject:from:to:content-type:content-transfer-encoding :x-gm-message-state; bh=i977cmIYi0VjPziBONqp26cVdExJxIpmWTsh9EhIEhA=; 
b=gIIAvWOAd9QZ56FGJhD6IrlrIwSs4TdnQ8H3O9lcOYzA8bu9aZRePA7msag1ZwEjqg /0EbwRsAMLKLjZ4+BOiqJ96JN7IpHBxWVyVTjZvwuf9D9Xa9mjvI5Omi+z8SF6VyxFak XR3/MW6K7vvrN5W+kGU4tyLPEwvHl2YqUahx5QYx+0tlzVp92igY0+qKyvjnGgywBnfV Nq259AR1uMVKI7em/NmcXPge22PDN9K9+0K/n/8+ASFY72of7BwGV5HNVjGqqPs9E/z5 uXN3fIFGDkEuquFIj19/VB29CLuAvjspox11xXjrQGTrzxjHHW5gV5kxIdBGwApC87Xk WPxg== MIME-Version: 1.0 Received: by 10.152.132.132 with SMTP id ou4mr11331139lab.26.1333401561115; Mon, 02 Apr 2012 14:19:21 -0700 (PDT) Received: by 10.112.145.138 with HTTP; Mon, 2 Apr 2012 14:19:21 -0700 (PDT) X-Originating-IP: [216.223.13.111] In-Reply-To: <201204021123.53055.jhb@freebsd.org> References: <201203220803.57000.jhb@freebsd.org> <201204021123.53055.jhb@freebsd.org> Date: Mon, 2 Apr 2012 17:19:21 -0400 Message-ID: From: Mark Saad To: freebsd-hackers@freebsd.org Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Gm-Message-State: ALoCoQlYO0sy507ApGZplVKh9PscEHCAaof9hJWLvfsp3ukj4oCey7e3wfLGDHDCXOHWPB18Iyjs Subject: Re: Approaching the limit on PV entries X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Apr 2012 21:19:23 -0000 On Mon, Apr 2, 2012 at 11:23 AM, John Baldwin wrote: > On Thursday, March 22, 2012 1:48:29 pm Mark Saad wrote: >> On Thu, Mar 22, 2012 at 8:03 AM, John Baldwin wrote: >> > On Wednesday, March 21, 2012 4:20:17 pm Mark Saad wrote: >> >> On Wed, Mar 21, 2012 at 12:39 PM, Sergey Kandaurov wrote: >> >> > On 21 March 2012 19:19, John Baldwin wrote: >> >> >> On Tuesday, March 20, 2012 11:37:57 am Sergey Kandaurov wrote: >> >> >>> On 22 November 2011 19:29, Mark Saad wrote: >> >> >>> > Hello All >> >> >>> >> >> >>> [found this mail in my drafts, not sure if my answer is still useful] >> >> >>> >> >> >>> > I want to get to the bottom of a warning in dmesg. 
On 7.2-RELEASE and >> >> >>> > 7.3-RELEASE I have seen the following warning in dmesg. >> >> >>> > >> >> >>> > Approaching the limit on PV entries, consider increasing either the >> >> >>> > vm.pmap.shpgperproc or the vm.pmap.pv_entry_max sysctl. >> >> >>> > >> >> >>> > So looking around I see a few posts here and there about how to tune >> >> >>> > the sysctls to address the warning however I am not 100% sure what >> >> >>> > each value does. >> >> >>> > It appears changing vm.pmap.shpgperproc affects the value of >> >> >>> > vm.pmap.pv_entry_max . Can someone explain the relationship of the two >> >> >>> > sysctls. Also >> >> >>> >> >> >>> This is how they are calculated. >> >> >>> >> >> >>> pv_entry_max = shpgperproc * maxproc + cnt.v_page_count; >> >> >>> >> >> >>> and, respectively, >> >> >>> >> >> >>> shpgperproc = (pv_entry_max - cnt.v_page_count) / maxproc; >> >> >>> >> >> >>> So, changing one sysctl will change another and vice versa. >> >> >>> >> >> >>> > what pitfalls of changing them are. >> >> >>> >> >> >>> Not known to me (on amd64 platform). >> >> >>> I have had vm.pmap.shpgperproc=15000 on 8.1 amd64 with 4G RAM >> >> >>> to make some badly written commercial software to work until it >> >> >>> was decommissioned to the scrap. >> >> >> >> >> >> FYI, Alan just removed this warning and the associated sysctls from HEAD >> >> >> yesterday because they were made obsolete several years ago. I think they are >> >> >> obsolete even on 7. Certainly on 8. >> >> > >> >> > Yep, and since switching to direct map (somewhere around 7.x on amd64?) >> >> > made PV entry limit factually obsolete, this is really cool. >> >> > >> >> > -- >> >> > wbr, >> >> > pluknet >> >> >> >> Interesting so this warning is relevant in 7.x ? >> > >> > No, looks like it was obsolete starting with 7.0. >> > >> > -- >> > John Baldwin >> >> Any chance it could be mfc'ed to 7-STABLE ? > > I just merged it to stable/7. > > -- > John Baldwin Thanks again john . 
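The two formulas quoted earlier in this message can be checked with a small sketch. The function names below are hypothetical and only restate the quoted arithmetic; they are not kernel code, and the numbers in the note after the block are made up for illustration.

```c
#include <assert.h>

/* pv_entry_max as quoted: shpgperproc * maxproc + cnt.v_page_count */
static long long
pv_entry_max_from(long long shpgperproc, long long maxproc,
    long long v_page_count)
{
	return (shpgperproc * maxproc + v_page_count);
}

/* The inverse as quoted: (pv_entry_max - cnt.v_page_count) / maxproc */
static long long
shpgperproc_from(long long pv_entry_max, long long maxproc,
    long long v_page_count)
{
	assert(maxproc > 0);
	/* Integer division, so a value not produced by the formula above may round down. */
	return ((pv_entry_max - v_page_count) / maxproc);
}
```

With a made-up system of maxproc = 1000 and 50000 pages, shpgperproc = 200 corresponds to pv_entry_max = 250000, and feeding 250000 back through the second function returns 200, which is why changing either sysctl changes the other.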
-- mark saad | nonesuch@longcount.org From owner-freebsd-hackers@FreeBSD.ORG Mon Apr 2 22:02:07 2012 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx2.freebsd.org (mx2.freebsd.org [69.147.83.53]) by hub.freebsd.org (Postfix) with ESMTP id 1BBA110656B8; Mon, 2 Apr 2012 22:02:07 +0000 (UTC) (envelope-from dougb@FreeBSD.org) Received: from opti.dougb.net (hub.freebsd.org [IPv6:2001:4f8:fff6::36]) by mx2.freebsd.org (Postfix) with ESMTP id B09AB155B46; Mon, 2 Apr 2012 22:02:00 +0000 (UTC) Message-ID: <4F7A21D8.1040604@FreeBSD.org> Date: Mon, 02 Apr 2012 15:02:00 -0700 From: Doug Barton Organization: http://SupersetSolutions.com/ User-Agent: Mozilla/5.0 (X11; FreeBSD i386; rv:10.0.2) Gecko/20120218 Thunderbird/10.0.2 MIME-Version: 1.0 To: Joe Greco References: <201204021843.q32IhPGZ053424@aurora.sol.net> In-Reply-To: <201204021843.q32IhPGZ053424@aurora.sol.net> X-Enigmail-Version: 1.3.5 OpenPGP: id=1A1ABC84 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-hackers@FreeBSD.org, Mark Felder , freebsd-questions@FreeBSD.org Subject: Re: Please help me diagnose this crazy VMWare/FreeBSD 8.x crash X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Apr 2012 22:02:07 -0000 On 4/2/2012 11:43 AM, Joe Greco wrote: > As a user, you can't win. If you don't report > a problem, you get criticized. If you report a problem but can't figure > out how to reproduce it, you get criticized. If you can reproduce it > but you don't submit a workaround, you get criticized. If you submit a > workaround but you don't submit a patch, you get criticized. If you > submit a patch but it's not in the preferred format, you get criticized. I'm still not sure what you're taking as criticism.
Nothing I've said was intended that way, nor should it be read that way. If you feel that you've been criticized by others in the manner you describe, you should probably take it up with them on an individual basis. My experience of FreeBSD as a community is that we tend to be both less critical of users, and less tolerant of it. Especially when compared to other communities that I've interacted with. Doug -- This .signature sanitized for your protection From owner-freebsd-hackers@FreeBSD.ORG Mon Apr 2 22:59:59 2012 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 0DAC5106566B; Mon, 2 Apr 2012 22:59:59 +0000 (UTC) (envelope-from jgreco@aurora.sol.net) Received: from mail2.sol.net (mail2.sol.net [206.55.64.73]) by mx1.freebsd.org (Postfix) with ESMTP id AE45A8FC14; Mon, 2 Apr 2012 22:59:58 +0000 (UTC) Received: from aurora.sol.net (IDENT:jgreco@aurora.sol.net [206.55.70.98]) by mail2.sol.net (8.14.4/8.14.4/SNNS-1.04) with ESMTP id q32MxtSF038423; Mon, 2 Apr 2012 17:59:55 -0500 (CDT) Received: (from jgreco@localhost) by aurora.sol.net (8.14.3/8.14.3/Submit) id q32MxtIx055561; Mon, 2 Apr 2012 17:59:55 -0500 (CDT) From: Joe Greco Message-Id: <201204022259.q32MxtIx055561@aurora.sol.net> To: dougb@FreeBSD.org (Doug Barton) Date: Mon, 2 Apr 2012 17:59:55 -0500 (CDT) In-Reply-To: <4F7A21D8.1040604@FreeBSD.org> X-Mailer: ELM [version 2.5 PL8] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Cc: freebsd-hackers@FreeBSD.org, Mark Felder , freebsd-questions@FreeBSD.org Subject: Re: Please help me diagnose this crazy VMWare/FreeBSD 8.x crash X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: freebsd-hackers@FreeBSD.org, freebsd-questions@FreeBSD.org List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Mon, 02 Apr 2012 22:59:59 -0000 > On 4/2/2012 11:43 AM, Joe Greco wrote: > > As a user, you can't win. If you don't report > > a problem, you get criticized. If you report a problem but can't figure > > out how to reproduce it, you get criticized. If you can reproduce it > > but you don't submit a workaround, you get criticized. If you submit a > > workaround but you don't submit a patch, you get criticized. If you > > submit a patch but it's not in the preferred format, you get criticized. > > I'm still not sure what you're taking as criticism. Nothing I've said > was intended that way, nor should it be read that way. If you feel that > you've been criticized by others in the manner you describe, you should > probably take it up with them on an individual basis. It certainly seemed to me that > As much as I'm sensitive to your production requirements, realistically > it's not likely that you'll get a helpful result without testing a newer > version. 8.2 came out over a year ago, many many things have changed > since then. was an unwarranted criticism for reasons that I've already explained. Or perhaps this: > And since you can't reliably reproduce the problem, how do you expect us > to? I understand that these sorts of bugs are difficult/annoying, etc. > Been there, done that. Which would appear to be suggesting that either (or possibly both): 1) The reporter has a duty to be able to "reliably reproduce the problem" prior to reporting, and/or 2) That there was some unreasonable expectation on the reporter's part that you were expected to reproduce it. I consider 1) to be ridiculous, as long as the reporter is reasonably willing to work to resolve the issue, that should certainly be good enough, and he's certainly been interactive enough to _my_ comments, and 2) seems to be nowhere in sight in the reporter's comments, but is nonetheless present in your response. Please respect Reply-to. Thanks. ... 
JG -- Joe Greco - sol.net Network Services - Milwaukee, WI - http://www.sol.net "We call it the 'one bite at the apple' rule. Give me one chance [and] then I won't contact you again." - Direct Marketing Ass'n position on e-mail spam(CNN) With 24 million small businesses in the US alone, that's way too many apples. From owner-freebsd-hackers@FreeBSD.ORG Tue Apr 3 01:48:13 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 84DA9106566B for ; Tue, 3 Apr 2012 01:48:13 +0000 (UTC) (envelope-from ycdu.vmcore@gmail.com) Received: from mail-ob0-f182.google.com (mail-ob0-f182.google.com [209.85.214.182]) by mx1.freebsd.org (Postfix) with ESMTP id 48E578FC18 for ; Tue, 3 Apr 2012 01:48:13 +0000 (UTC) Received: by obbwc18 with SMTP id wc18so5945157obb.13 for ; Mon, 02 Apr 2012 18:48:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=ebzgf1j5Pj40Z79SIPAgsy9Ef3kCDkf+2kaji4oGG08=; b=Kcnp0qY6v2OCmTzdHrdPBiOXfqCUggtRz8WiiaeWPc1LVMHTVrH1ui3EYn4+TXSUd7 vmfM9VnTwkct0vUvPnSS5qpdLrJ+4KOKNy2skQvbnsqaLCOdS15DtrDhSr6rWJlst/Ww Rnp48VHKg1NeuXSTSW9oVj5OX7roao5cQSLM6Fo1DlNyrA5y9ud9nzwyGXPEA19UMJ7a fEvBaDs49ka33uYRPHZPc663eGV+pBLKbAb28/f3EvxiXji88n2PI9iaOhuIytdaecmc HVaplZatEel4UDIl6rJm91X7wz5tDsP4f6/l3g+Pp3b9yCKZdr/sEtrX6Srjf3SJTM1v AGag== MIME-Version: 1.0 Received: by 10.60.9.102 with SMTP id y6mr15993518oea.46.1333417692887; Mon, 02 Apr 2012 18:48:12 -0700 (PDT) Received: by 10.60.144.38 with HTTP; Mon, 2 Apr 2012 18:48:12 -0700 (PDT) In-Reply-To: References: Date: Tue, 3 Apr 2012 09:48:12 +0800 Message-ID: From: Yongcong Du To: freebsd-hackers@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Subject: Fwd: [gsoc2012] Port NetBSD's UDF implementation X-BeenThere: freebsd-hackers@freebsd.org 
X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Apr 2012 01:48:13 -0000 cc hackers@ ---------- Forwarded message ---------- From: Yongcong Du Date: Tue, Apr 3, 2012 at 1:23 AM Subject: [gsoc2012] Port NetBSD's UDF implementation To: avg@freebsd.org, netchild@freebsd.org Hi Andriy and Alexander, Firstly, let me introduce myself ;) I'm a graduate student from Shanghai, timezone: GMT+8. I began using FreeBSD and programming under it in 2009. I have 5 years of experience in *nix POSIX C programming and 2 years of experience in Linux kernel programming. I gained some experience with the Linux UDF filesystem during an internship: I added read_12 support, similar to http://msdn.microsoft.com/en-us/library/windows/hardware/gg441227%28v=vs.85%29.aspx, but in an ugly, hacky way ;) and made some hacks to improve read performance. I want to take "Port NetBSD's UDF implementation (GSoC)" as my gsoc2012 project. Has anyone taken it? It seems not, according to david_chi on freebsd-soc. I did some homework on this project: 1. UDF was first implemented in FreeBSD and then ported to NetBSD, and the supported revisions are old. Reinoud Zandijk then implemented most UDF features and write support with the latest UDF revision in mind. 2. The work includes the kernel source code (sys/fs/udf) and the userland mount_udf. It can be completed either by syncing FreeBSD's implementation with NetBSD's or by porting NetBSD's from scratch. I prefer the second method. Any suggestions? 3. The difficulty of this project is resolving the differences in the locking, VM and VFS subsystems between NetBSD and FreeBSD. Would you please give some suggestions about it?
Thanks in advance, Yongcong From owner-freebsd-hackers@FreeBSD.ORG Tue Apr 3 13:33:58 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3DC721065673 for ; Tue, 3 Apr 2012 13:33:58 +0000 (UTC) (envelope-from vasanth.raonaik@gmail.com) Received: from mail-wg0-f50.google.com (mail-wg0-f50.google.com [74.125.82.50]) by mx1.freebsd.org (Postfix) with ESMTP id C262C8FC18 for ; Tue, 3 Apr 2012 13:33:57 +0000 (UTC) Received: by wgbds12 with SMTP id ds12so3594991wgb.31 for ; Tue, 03 Apr 2012 06:33:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=FpCp6XMmMAP2LiiQ9z7c2XyqplU2N/bGDWRoV9AV4ew=; b=S6UHECPg1ARK8PgZWr/s0utVakTUwdj0IMiysz6EK2ZFZCXLYWDZXoZkBc5JPqtfWs SoSjE/tAh/pRUOf77LJy/ckPX31ErHmNYFrWC8689crhrlDQ6oKnASeT5+JBr/otrDAQ vb6jJBokZJ5v9K0Q5RoQoVei8/HilA7R3Ri+xLz4ZTj9WpaEGDX0OCvN0e70EJc5iPO8 fqM5GcydcQSecECDZofRqmMMENt5igNKEuo1TJZh+OMT1lBo/YRqliQiuopcdq9ut7in 2BHSy/dg5za1PEiLp/XWQlmy/nnvMjJpL1yhO3uvmbPJA0tqDmy1PmUy4s96Y44dSw4+ 1x/g== MIME-Version: 1.0 Received: by 10.180.102.102 with SMTP id fn6mr6435390wib.10.1333460036973; Tue, 03 Apr 2012 06:33:56 -0700 (PDT) Received: by 10.180.98.161 with HTTP; Tue, 3 Apr 2012 06:33:56 -0700 (PDT) Date: Tue, 3 Apr 2012 09:33:56 -0400 Message-ID: From: vasanth rao naik sabavat To: freebsd-hackers@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Subject: question about amd64 pagetable page allocation. X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Apr 2012 13:33:58 -0000 Hi, I am trying to understand the page table page allocation for a process in FBSD6.1. 
I see that the page table pages are allocated by vm_page_alloc(). I believe the virtual address for this allocated page can be derived by PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m)); however, when I compare this address with the virtual address accessed in pmap_remove_pages(), they are not the same. The virtual address of a *PTE in pmap_remove_pages() is something like 0xffff800000643200. However, the virtual address of the page allocated by vm_page_alloc() is 0xffffff033c6a0000. I would also like to understand the importance of loc_PTmap; I believe pmap_remove_pages() is deriving the page table page addresses from loc_PTmap? (kgdb) p loc_PTmap Cannot access memory at address 0xffff800000000000 How do we relate loc_PTmap to the page table pages allocated by vm_page_alloc() in _pmap_allocpte()? Thanks, Vasanth From owner-freebsd-hackers@FreeBSD.ORG Tue Apr 3 14:35:14 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C391F1065672; Tue, 3 Apr 2012 14:35:14 +0000 (UTC) (envelope-from feld@feld.me) Received: from feld.me (unknown [IPv6:2607:f4e0:100:300::2]) by mx1.freebsd.org (Postfix) with ESMTP id 965D98FC1A; Tue, 3 Apr 2012 14:35:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=feld.me; s=blargle; h=In-Reply-To:Message-Id:From:Mime-Version:Date:References:Subject:To:Content-Type; bh=EZ1D5hTTeJGbe3KfZ0jKKljX/1u8qcNxTqKJA5Itk+0=; b=PkG6J8gIcas3WLWKLc304BjnfKek2NJ6jpvsBjadEecEdgqiOt4yXpKUzIM0EeH0chXn5K084kMyGklzcMUoohLn2CAcMhcaEfQaoDWMRCqJcjyJgbcp/FYYxXZElDvq; Received: from localhost ([127.0.0.1] helo=mwi1.coffeenet.org) by feld.me with esmtp (Exim 4.77 (FreeBSD)) (envelope-from ) id 1SF4pA-0000zv-IM; Tue, 03 Apr 2012 09:35:13 -0500 Received: from feld@feld.me by mwi1.coffeenet.org (Archiveopteryx 3.1.4) with esmtpa id 1333463702-20726-20725/5/43; Tue, 3 Apr 2012 14:35:02 +0000 Content-Type: text/plain;
charset=utf-8; format=flowed; delsp=yes To: freebsd-hackers@freebsd.org, freebsd-questions@freebsd.org References: Date: Tue, 3 Apr 2012 09:35:01 -0500 Mime-Version: 1.0 From: Mark Felder Message-Id: In-Reply-To: User-Agent: Opera Mail/11.62 (FreeBSD) X-SA-Score: -1.5 Cc: Subject: Re: Please help me diagnose this crazy VMWare/FreeBSD 8.x crash X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Apr 2012 14:35:14 -0000 Guys, The crash on my machine with debugging has evaded me for a few days. I'm still looking for further suggestions of things I should grab from the DDB when it happens again. Thanks for the help everyone! From owner-freebsd-hackers@FreeBSD.ORG Tue Apr 3 14:56:07 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 02D0F1065672 for ; Tue, 3 Apr 2012 14:56:07 +0000 (UTC) (envelope-from marktinguely@gmail.com) Received: from mail-pb0-f54.google.com (mail-pb0-f54.google.com [209.85.160.54]) by mx1.freebsd.org (Postfix) with ESMTP id CA2128FC15 for ; Tue, 3 Apr 2012 14:56:06 +0000 (UTC) Received: by pbcwz17 with SMTP id wz17so6042059pbc.13 for ; Tue, 03 Apr 2012 07:56:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=3THaVWHQlHCNquLc3FLcQ1O7SltIPIextJTamr0IGEk=; b=qnOh6kqOfJ1+ZD378yBPkzj855k+EOCcpINdCgkDUOhwu078at4OwlT7YPKCYFUI++ FIepGp3m0wRIHsKh5Q2PgwRp2x58YV1pimXrpzEGlzJcNJyb8aMy8wbgSm1JJlKUt/IT tB0p/KuMum1u0tuzr8RRSsQEekAVQe2RnFbnl06Dn4Z7PdxQsuqD13p1m2G20RSV3Nyv kxnbdIKcE9ZBAxrGDejXgiJDMoA1TtJ5SF9RI5KmKm7I2VZKIvE/8CpDvCl7MfID0tM/ 
St2Ya8fJPZ58bcqXpxBPIgaJdXURWGq/O1Vp3BDrLo54QXMJeO8Gyx747HdDF167dTqj b9sg== MIME-Version: 1.0 Received: by 10.68.200.137 with SMTP id js9mr8261439pbc.110.1333464966636; Tue, 03 Apr 2012 07:56:06 -0700 (PDT) Received: by 10.68.189.69 with HTTP; Tue, 3 Apr 2012 07:56:06 -0700 (PDT) In-Reply-To: References: Date: Tue, 3 Apr 2012 09:56:06 -0500 Message-ID: From: Mark Tinguely To: vasanth rao naik sabavat Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-hackers@freebsd.org Subject: Re: question about amd64 pagetable page allocation. X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Apr 2012 14:56:07 -0000 On Tue, Apr 3, 2012 at 8:33 AM, vasanth rao naik sabavat wrote: > Hi, > > I am trying to understand the page table page allocation for a process in > FBSD6.1. I see that the page table pages are allocated by vm_page_alloc(). > I believe the virtual address for this allocated page can be derived by > PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m)), however when I compare this address with > the virtual address accessed in pmap_remove_pages() they are not the same. > > The virtual address of a *PTE in pmap_remove_pages() is something like > *0xffff800000643200 > * > However, the virtual address of the page allocated by vm_page_alloc() is * > 0xffffff033c6a0000 > > *I would also like to understand the importance of loc_PTmap, I believe the > pmap_remove_pages() is derving the page table page addresses from loc_PTmap? > (kgdb) p loc_PTmap > Cannot access memory at address 0xffff800000000000 > > How do we relate the loc_PTmap to the page table pages allocated by > vm_page_alloc() in _pmap_allocpte(). > > Thanks, > Vasanth The answer to your questions will be more obvious when you get an understanding of the Recursive Page Tables.
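To make the recursive-mapping idea concrete, the arithmetic that relates a virtual address to the address of its PTE inside the recursive window can be sketched in a few lines of C. This is an illustration only, not the kernel's code: the function names are invented, and the constants (4 KB pages, 8-byte PTEs, four levels of 9 index bits, and a window base of 0xffff800000000000, chosen because it matches the loc_PTmap address kgdb printed above) are assumptions modelled on the amd64 layout.

```c
#include <stdint.h>

/* Assumed amd64 constants; PTMAP_BASE matches the loc_PTmap address above. */
#define SK_PAGE_SHIFT	12				/* 4 KB pages */
#define SK_PTE_SIZE	8ULL				/* bytes per page table entry */
#define SK_PTMAP_BASE	0xffff800000000000ULL		/* start of recursive window */
#define SK_VPN_MASK	((1ULL << 36) - 1)		/* 4 levels x 9 index bits */

/* Hypothetical helper: recursive-window address of the PTE that maps va. */
static uint64_t
sketch_vtopte(uint64_t va)
{
	return (SK_PTMAP_BASE +
	    ((va >> SK_PAGE_SHIFT) & SK_VPN_MASK) * SK_PTE_SIZE);
}

/*
 * Hypothetical inverse: page address (low 48 bits, not sign-extended)
 * described by the PTE found at pte_va inside the recursive window.
 */
static uint64_t
sketch_ptetova(uint64_t pte_va)
{
	return (((pte_va - SK_PTMAP_BASE) / SK_PTE_SIZE) << SK_PAGE_SHIFT);
}
```

Under these assumptions, the PTE address 0xffff800000643200 from the question is the entry for virtual address 0xc8640000. A page table page is therefore visible through two different virtual addresses, one inside the recursive window and one obtained with PHYS_TO_DMAP(), so the two values need not match.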
--Mark Tinguely. From owner-freebsd-hackers@FreeBSD.ORG Tue Apr 3 15:11:07 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx2.freebsd.org (mx2.freebsd.org [IPv6:2001:4f8:fff6::35]) by hub.freebsd.org (Postfix) with ESMTP id 51C271065672 for ; Tue, 3 Apr 2012 15:11:07 +0000 (UTC) (envelope-from ae@FreeBSD.org) Received: from [127.0.0.1] (hub.freebsd.org [IPv6:2001:4f8:fff6::36]) by mx2.freebsd.org (Postfix) with ESMTP id 6F4D314E02F; Tue, 3 Apr 2012 15:11:06 +0000 (UTC) Message-ID: <4F7B1306.7070403@FreeBSD.org> Date: Tue, 03 Apr 2012 19:11:02 +0400 From: "Andrey V. Elsukov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:9.0) Gecko/20120105 Thunderbird/9.0 MIME-Version: 1.0 To: Eric McCorkle References: <4F7A1301.4060900@shadowsun.net> In-Reply-To: <4F7A1301.4060900@shadowsun.net> X-Enigmail-Version: undefined OpenPGP: id=10C8A17A Content-Type: text/plain; charset=KOI8-R Content-Transfer-Encoding: 8bit Cc: freebsd-hackers@freebsd.org Subject: Re: GSoC: EFI on intel X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Apr 2012 15:11:07 -0000 On 03.04.2012 00:58, Eric McCorkle wrote: > I'm assessing possible summer of code projects, and the EFI work caught > my attention. I've been running FreeBSD on a macbook for a little under > a year now, and booting on EFI is definitely an interest to me. Does > anyone know if this is still a viable project proposal? I certainly > have the skills to undertake it, I just want to make sure that it stands > a chance of actually being selected. Hi, Eric Yes, this project is still needed for the FreeBSD. -- WBR, Andrey V. 
Elsukov From owner-freebsd-hackers@FreeBSD.ORG Tue Apr 3 15:18:12 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A52941065673 for ; Tue, 3 Apr 2012 15:18:12 +0000 (UTC) (envelope-from vasanth.raonaik@gmail.com) Received: from mail-wg0-f42.google.com (mail-wg0-f42.google.com [74.125.82.42]) by mx1.freebsd.org (Postfix) with ESMTP id 30B6F8FC1D for ; Tue, 3 Apr 2012 15:18:12 +0000 (UTC) Received: by wgbds11 with SMTP id ds11so3429749wgb.1 for ; Tue, 03 Apr 2012 08:18:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=RlN0GLUORubIpIqDTySV5dSP7eoPIWPQXtQpnDKZ5q8=; b=Yy/jFlznyQFg8Rue4kD0mReYLo9OANGCBkl5n9mAZeyB1Cr+5LiJyqPbJuiT02QsJM iyKMmIvgbmUQmR6zWKRan3R8Xxqyvv9cs4N9bKgsvzCe5bXVx11j6GE6FjWRAKEhohSR pTzXYyKH0buPPm8xGUQPWzdTl/O4iDiPjIWYfAZySEOa4tL6dh11NLAw/lyc4fpv0fBs 8QFOj4dkZJuOfLhD/cJlUXt1sFXyfjhcCQ8sDcD3I8ZY8SB8yceTK4NTxAZBGpiWqq8U +mQ541ro0vW7ZZJ1KukZlLWNnvvedzx4U0QwSxg+7w2cEZUI+XRQAHD7n15p7WmqdKD2 TQlQ== MIME-Version: 1.0 Received: by 10.180.102.102 with SMTP id fn6mr7267184wib.10.1333466290958; Tue, 03 Apr 2012 08:18:10 -0700 (PDT) Received: by 10.180.98.161 with HTTP; Tue, 3 Apr 2012 08:18:10 -0700 (PDT) In-Reply-To: References: Date: Tue, 3 Apr 2012 11:18:10 -0400 Message-ID: From: vasanth rao naik sabavat To: Mark Tinguely Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-hackers@freebsd.org Subject: Re: question about amd64 pagetable page allocation. 
X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Apr 2012 15:18:12 -0000 Hello Mark, Thank you for replying, Could you please point me to any document which illustrates the implementation of recursive page tables in FreeBSD for amd64. Also, I just found that with the following patch from Alon, the usage of vtopte() is removed in pmap_remove_pages(). Why was this removed? http://lists.freebsd.org/pipermail/svn-src-all/2009-March/006006.html Thanks, Vasanth On Tue, Apr 3, 2012 at 10:56 AM, Mark Tinguely wrote: > On Tue, Apr 3, 2012 at 8:33 AM, vasanth rao naik sabavat > wrote: > > Hi, > > > > I am trying to understand the page table page allocation for a process in > > FBSD6.1. I see that the page table pages are allocated by > vm_page_alloc(). > > I believe the virtual address for this allocated page can be derived by > > PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m)), however when I compare this address > with > > the virtual address accessed in pmap_remove_pages() they are not the > same. > > > > The virtual address of a *PTE in pmap_remove_pages() is something like > > *0xffff800000643200 > > * > > However, the virtual address of the page allocated by vm_page_alloc() is > * > > 0xffffff033c6a0000 > > > > *I would also like to understand the importance of loc_PTmap, I believe > the > > pmap_remove_pages() is derving the page table page addresses from > loc_PTmap? > > (kgdb) p loc_PTmap > > Cannot access memory at address 0xffff800000000000 > > > > How do we relate the loc_PTmap to the page table pages allocated by > > vm_page_alloc() in _pmap_allocpte(). > > > > Thanks, > > Vasanth > > > > The answer to your questions will be more obvious when you get an > understanding of the Recursive Page Tables. > > --Mark Tinguely. 
> From owner-freebsd-hackers@FreeBSD.ORG Tue Apr 3 16:42:05 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 88F45106564A for ; Tue, 3 Apr 2012 16:42:05 +0000 (UTC) (envelope-from marktinguely@gmail.com) Received: from mail-pz0-f44.google.com (mail-pz0-f44.google.com [209.85.210.44]) by mx1.freebsd.org (Postfix) with ESMTP id 5A5D68FC12 for ; Tue, 3 Apr 2012 16:42:05 +0000 (UTC) Received: by dadz14 with SMTP id z14so15840409dad.17 for ; Tue, 03 Apr 2012 09:42:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=slT3W1PpSEE8h0H6MVGZ7FRpy1oQn/Q/fov/zXrDv1g=; b=ex19cSNGmpsPqQVtU64u/Mtxh/C7KiEM87MX9ub7vZzV/xoN5GlbovTf2Dc5UX9cNi zCeHHmaYYi58HGACYUUmKW1hhal4FwyfW3TZitQgsMJ/RQXdJE3AydhlJWIDoY2OUHFB BqqcWf3G80AGJcMjPWWf7rp8EAiTqfYIZ9iSTYa/gWDYA4G7rBunTZmICRDvxqZ7xmYY CVhfYXgpOlzOP+efO9Ftx12CAYT5UCCMUlivRIkAHejdxzZxx/jmDPfZIs0tALtWmK+4 LXsBSzcmr4bUKtiAIlhEW5BpuRDKAO6oDolTnS/WjyVRNHcPdwfqGrz5JrU2BXoSKnbN dwEA== MIME-Version: 1.0 Received: by 10.68.233.135 with SMTP id tw7mr29554696pbc.152.1333471324759; Tue, 03 Apr 2012 09:42:04 -0700 (PDT) Received: by 10.68.189.69 with HTTP; Tue, 3 Apr 2012 09:42:04 -0700 (PDT) In-Reply-To: References: Date: Tue, 3 Apr 2012 11:42:04 -0500 Message-ID: From: Mark Tinguely To: vasanth rao naik sabavat Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-hackers@freebsd.org Subject: Re: question about amd64 pagetable page allocation. 
X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Apr 2012 16:42:05 -0000 On Tue, Apr 3, 2012 at 10:18 AM, vasanth rao naik sabavat wrote: > Hello Mark, > > Thank you for replying, > > Could you please point me to any document which illustrates the > implementation of recursive page tables in FreeBSD for amd64. > > Also, I just found that with the following patch from Alon, the usage of > vtopte() is removed in pmap_remove_pages(). Why was this removed? > > http://lists.freebsd.org/pipermail/svn-src-all/2009-March/006006.html > > Thanks, > Vasanth > > On Tue, Apr 3, 2012 at 10:56 AM, Mark Tinguely > wrote: >> >> On Tue, Apr 3, 2012 at 8:33 AM, vasanth rao naik sabavat >> wrote: >> > Hi, >> > >> > I am trying to understand the page table page allocation for a process >> > in >> > FBSD6.1. I see that the page table pages are allocated by >> > vm_page_alloc(). >> > I believe the virtual address for this allocated page can be derived by >> > PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m)), however when I compare this address >> > with >> > the virtual address accessed in pmap_remove_pages() they are not the >> > same. >> > >> > The virtual address of a *PTE in pmap_remove_pages() is something like >> > *0xffff800000643200 >> > * >> > However, the virtual address of the page allocated by vm_page_alloc() is >> > * >> > 0xffffff033c6a0000 >> > >> > *I would also like to understand the importance of loc_PTmap, I believe >> > the >> > pmap_remove_pages() is derving the page table page addresses from >> > loc_PTmap? >> > (kgdb) p loc_PTmap >> > Cannot access memory at address 0xffff800000000000 >> > >> > How do we relate the loc_PTmap to the page table pages allocated by >> > vm_page_alloc() in _pmap_allocpte().
>> > >> > Thanks, >> > Vasanth >> >> >> >> The answer to your questions will be more obvious when you get an >> understanding of the Recursive Page Tables. >> >> --Mark Tinguely. > > Search for "recursive page tables" start with 32 bits first. I did not read it, but the below page looks promising: http://www.rohitab.com/discuss/topic/31139-tutorial-paging-memory-mapping-with-a-recursive-page-directory/ The nice thing about recursive page table is the MMU does the work. But: 1) User recursive page tables work only for the current map. 2) Kernel mappings are placed at the top of every map - so the kernel recursive addresses will always be valid. As the comment in the link states, that change converted from using user recursive page table virtual address (which is only valid when the map is current) to using the physical to virtual direct map (which is always valid). --Mark Tinguely. From owner-freebsd-hackers@FreeBSD.ORG Tue Apr 3 17:22:13 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id BDA21106566B for ; Tue, 3 Apr 2012 17:22:13 +0000 (UTC) (envelope-from ambrisko@ambrisko.com) Received: from mail.ambrisko.com (mail.ambrisko.com [70.91.206.90]) by mx1.freebsd.org (Postfix) with ESMTP id 9CCF28FC0C for ; Tue, 3 Apr 2012 17:22:13 +0000 (UTC) X-Ambrisko-Me: Yes Received: from server2.ambrisko.com (HELO internal.ambrisko.com) ([192.168.1.2]) by ironport.ambrisko.com with ESMTP; 03 Apr 2012 10:22:20 -0700 Received: from ambrisko.com (localhost [127.0.0.1]) by internal.ambrisko.com (8.14.4/8.14.4) with ESMTP id q33HMDf4051414; Tue, 3 Apr 2012 10:22:13 -0700 (PDT) (envelope-from ambrisko@ambrisko.com) Received: (from ambrisko@localhost) by ambrisko.com (8.14.4/8.14.4/Submit) id q33HMD6u051412; Tue, 3 Apr 2012 10:22:13 -0700 (PDT) (envelope-from ambrisko) From: Doug Ambrisko Message-Id: <201204031722.q33HMD6u051412@ambrisko.com> In-Reply-To:
<4F7A1301.4060900@shadowsun.net> To: Eric McCorkle Date: Tue, 3 Apr 2012 10:22:13 -0700 (PDT) X-Mailer: ELM [version 2.4ME+ PL124d (25)] MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset="US-ASCII" Cc: freebsd-hackers@freebsd.org Subject: Re: GSoC: EFI on intel X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Apr 2012 17:22:13 -0000 Eric McCorkle writes: | I'm assessing possible summer of code projects, and the EFI work caught | my attention. I've been running FreeBSD on a macbook for a little under | a year now, and booting on EFI is definitely an interest to me. Does | anyone know if this is still a viable project proposal? I certainly | have the skills to undertake it, I just want to make sure that it stands | a chance of actually being selected. EFI is a good task. For generic PC's we need an X64 format. The current version in FreeBSD is IA32 format. The X64 can boot i386/amd64. Qemu can be used to test both IA32 and X64 formats. I added some notes about this on the wiki at: http://wiki.freebsd.org/IdeasPage#EFI_support_for_FreeBSD.2BAC8-i386_and_FreeBSD.2BAC8-amd64_.28GSoC.29 Qemu is nice since it can runs an UEFI BIOS via the OVMF project and emulate a DOS file system by pointing qemu to a directory. So then it is easy to build something, toss it into a directory, start qemu and test. Thanks, Doug A. 
From owner-freebsd-hackers@FreeBSD.ORG Tue Apr 3 17:31:29 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 940C51065675 for ; Tue, 3 Apr 2012 17:31:29 +0000 (UTC) (envelope-from gljennjohn@googlemail.com) Received: from mail-wg0-f42.google.com (mail-wg0-f42.google.com [74.125.82.42]) by mx1.freebsd.org (Postfix) with ESMTP id 1D6958FC22 for ; Tue, 3 Apr 2012 17:31:28 +0000 (UTC) Received: by wgbds11 with SMTP id ds11so3536876wgb.1 for ; Tue, 03 Apr 2012 10:31:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=20120113; h=date:from:to:cc:subject:message-id:in-reply-to:references:reply-to :x-mailer:mime-version:content-type:content-transfer-encoding; bh=j43Zd52MT1AQKfqvm5QnVWDf/eVGyOYZKcMVfslLKUw=; b=uObHRT05Ul0K8eWl+HECsRIibwvyURHndEB6voHJwx0gSaxDPvZ7LOssATw/9hdsRk 5+iRL2L1BVlwcirH2gIZ8RAEptbThy+Tlab9UUm33T3VBqG98ul6OP3MPjdagk8symPA mA/rmiWy+SQzrG6fEWQJ2ANqKvi4xCp4RwlEBZcBCnqTday+CegOR88g1l3ake3C1Qma yDO/GzL/5wTwD9QXgzIOh8ZOSbFj2m880wcl6nupnmhl5mlY1s+YFO640xdohVfbpY7E ULWUUyXkr20kfk2Z+W7VcORh+fyTpvt5GAexrb61xBzqZ1SGl66zDmbWdhnlJ4QiHv8F Ur1g== Received: by 10.180.88.199 with SMTP id bi7mr38417707wib.12.1333474288253; Tue, 03 Apr 2012 10:31:28 -0700 (PDT) Received: from ernst.jennejohn.org (p578E2A9A.dip.t-dialin.net. 
[87.142.42.154]) by mx.google.com with ESMTPS id 6sm44402034wiz.1.2012.04.03.10.31.26 (version=TLSv1/SSLv3 cipher=OTHER); Tue, 03 Apr 2012 10:31:27 -0700 (PDT) Date: Tue, 3 Apr 2012 19:31:24 +0200 From: Gary Jennejohn To: Jerry Toung Message-ID: <20120403193124.46ad9de9@ernst.jennejohn.org> In-Reply-To: References: X-Mailer: Claws Mail 3.8.0 (GTK+ 2.24.6; amd64-portbld-freebsd10.0) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Cc: freebsd-hackers Subject: Re: CAM disk I/O starvation X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: gljennjohn@googlemail.com List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Apr 2012 17:31:29 -0000 On Mon, 2 Apr 2012 10:55:31 -0700 Jerry Toung wrote: > I am convinced that there is a bug in the CAM code that leads to I/O starvation. > I have already discussed this privately with some. I am now bringing this up to > the general audience to get more feedback. > I've observed this with my onboard ATI IXP700 AHCI SATA controller and 2 or 3 SATA disks. When one disk gets busy all others are pretty much blocked until it finishes. Seems to me that this behavior is (fairly) recent. [snip] > I have a patch and it fixes those problems. I can share it to the list > if requested to. > da0 and da1 now both automatically get 126 openings and based on that, > extra logic implements fairness in cam/cam_xpt.c. No more 0 MB/s on > da1. This is on 8.1-RELEASE FreeBSD. > It would be interesting to see your patch. I always run HEAD but maybe I could use it as a base for my own mods/tests. 
-- Gary Jennejohn From owner-freebsd-hackers@FreeBSD.ORG Tue Apr 3 17:33:56 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B5C971065672 for ; Tue, 3 Apr 2012 17:33:56 +0000 (UTC) (envelope-from eric@shadowsun.net) Received: from mail.atlantawebhost.com (dns1.atlantawebhost.com [66.223.40.39]) by mx1.freebsd.org (Postfix) with ESMTP id 5EDF38FC16 for ; Tue, 3 Apr 2012 17:33:56 +0000 (UTC) Received: (qmail 9575 invoked from network); 3 Apr 2012 13:33:55 -0400 Received: from c-76-119-101-151.hsd1.ma.comcast.net (HELO ?192.168.1.2?) (76.119.101.151) by mail.atlantawebhost.com with SMTP; 3 Apr 2012 13:33:55 -0400 Message-ID: <4F7B347B.6000400@shadowsun.net> Date: Tue, 03 Apr 2012 13:33:47 -0400 From: Eric McCorkle User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:10.0.3) Gecko/20120318 Thunderbird/10.0.3 MIME-Version: 1.0 To: Doug Ambrisko References: <201204031722.q33HMD6u051412@ambrisko.com> In-Reply-To: <201204031722.q33HMD6u051412@ambrisko.com> X-Enigmail-Version: 1.4 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enigD7372AA69EB025FDEEDF8B30" X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-hackers@freebsd.org Subject: Re: GSoC: EFI on intel X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Apr 2012 17:33:56 -0000 This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enigD7372AA69EB025FDEEDF8B30 Content-Type: multipart/mixed; boundary="------------060308050004040005090806" This is a multi-part message in MIME format. 
--------------060308050004040005090806 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable

On 04/03/12 13:22, Doug Ambrisko wrote:
> EFI is a good task. For generic PC's we need an X64 format. The current
> version in FreeBSD is IA32 format. The X64 can boot i386/amd64.
> Qemu can be used to test both IA32 and X64 formats. I added some
> notes about this on the wiki at:
> http://wiki.freebsd.org/IdeasPage#EFI_support_for_FreeBSD.2BAC8-i386_and_FreeBSD.2BAC8-amd64_.28GSoC.29
>
> Qemu is nice since it can runs an UEFI BIOS via the OVMF project
> and emulate a DOS file system by pointing qemu to a directory. So
> then it is easy to build something, toss it into a directory, start
> qemu and test.
>
> Thanks,
>
> Doug A.

I'm drafting an application right now. I emailed the listed contacts
(Rui Paulo and Andrey Elsukov) a moment ago.

I've done background research on this already, as part of getting
FreeBSD to work on Mac hardware. QEMU caught my attention as a
testbed. Also, I found out Apple EFI implementations are non-standard.
They specifically look for an HFS+ partition and load a specific file.
The workaround is pretty simple, of course: just wrap the boot loader in
an HFS+ image and write it to a partition reserved for that purpose.

Anyway, if I'm going to propose this, I need to list possible mentors.
Skill-wise, I'm well equipped to take it on. I anticipate needing
someone who's a committer, preferably with good knowledge of the kernel
sources.
--=20 Eric McCorkle Computer Science Ph.D Student eric@shadowsun.net --------------060308050004040005090806-- --------------enigD7372AA69EB025FDEEDF8B30 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iQIcBAEBAgAGBQJPezSCAAoJENSCzbQ+koZ7pocQAIwuy6EPX08VZ2xPLzn7rsMt /2g7MLPqVnSk39lUUBt5eC2iMVO7aM8sK7GHxoZd8nSFLz/loUIw+9wXH2QP+Arx gx96UCDgYrYJkGgRBhJ1y/YJkHm9lSYJw0PxjtlveegcfAhsvRPRb+FGt5a393oe CSjFtK9laupZaAdqFwpnaWtd3aaOucSQ5ahJiiHOg74dUIyz+NB0jc4DgZ2y7Jn1 dYh6GpJUD2hbTqm24vnaoLn+g/kn1LvEOLUI0qYE/rZnlFg3HaeinJf9YmBd1x2p 8U1C9OAbwgIDM25n6spqfyzjRBUSKrhfakatOmIPMfidv43vUdPBvrRUJ5NX2Juj Xg76vgJXvDjThPKadwHiiNQrJ8llqlEZSSE9iP/JsL6xsNeP06RZSwZcruzX+AXm DaF0Xy16QH2MPJC9t1L2yEVyXZEveVgryN1On0no5uzREUOLeFFD2KpNKx/zZwvH thlAAFu07s5DxRrS0smtYl5lnIk02eCdVccUH1w8AxTIZGqEzYf2MQu+7p5DXfWJ Cu2dv+M4nG9NwWvXRHsl+TyXi0LWGN1ffpEUHdDmi4PO0E20EGOiz7lFW5Qeda82 4D5WD8+vRpIj7GyrHF4e+OZdLZkHROk9mbs20411ib0hmba52tE3q+IveeEX75vJ BIj2dF+QPk0P0k4k4ghv =d6Vv -----END PGP SIGNATURE----- --------------enigD7372AA69EB025FDEEDF8B30-- From owner-freebsd-hackers@FreeBSD.ORG Tue Apr 3 18:51:45 2012 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx2.freebsd.org (mx2.freebsd.org [IPv6:2001:4f8:fff6::35]) by hub.freebsd.org (Postfix) with ESMTP id 486EE1065672; Tue, 3 Apr 2012 18:51:45 +0000 (UTC) (envelope-from dougb@FreeBSD.org) Received: from [127.0.0.1] (hub.freebsd.org [IPv6:2001:4f8:fff6::36]) by mx2.freebsd.org (Postfix) with ESMTP id C23A114ED01; Tue, 3 Apr 2012 18:51:44 +0000 (UTC) Message-ID: <4F7B46C0.5050503@FreeBSD.org> Date: Tue, 03 Apr 2012 11:51:44 -0700 From: Doug Barton Organization: http://www.FreeBSD.org/ User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:11.0) Gecko/20120327 Thunderbird/11.0.1 MIME-Version: 1.0 To: freebsd-hackers@FreeBSD.org, freebsd-questions@FreeBSD.org 
References: <201204022259.q32MxtIx055561@aurora.sol.net> In-Reply-To: <201204022259.q32MxtIx055561@aurora.sol.net> X-Enigmail-Version: 1.4 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: Mark Felder , Joe Greco Subject: Re: Please help me diagnose this crazy VMWare/FreeBSD 8.x crash X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Apr 2012 18:51:45 -0000 On 4/2/2012 3:59 PM, Joe Greco wrote: >> On 4/2/2012 11:43 AM, Joe Greco wrote: >>> As a user, you can't win. If you don't report >>> a problem, you get criticized. If you report a problem but can't figure >>> out how to reproduce it, you get criticized. If you can reproduce it >>> but you don't submit a workaround, you get criticized. If you submit a >>> workaround but you don't submit a patch, you get criticized. If you >>> submit a patch but it's not in the preferred format, you get criticized. >> >> I'm still not sure what you're taking as criticism. Nothing I've said >> was intended that way, nor should it be read that way. If you feel that >> you've been criticized by others in the manner you describe, you should >> probably take it up with them on an individual basis. > > It certainly seemed to me that > >> As much as I'm sensitive to your production requirements, realistically >> it's not likely that you'll get a helpful result without testing a newer >> version. 8.2 came out over a year ago, many many things have changed >> since then. > > was an unwarranted criticism for reasons that I've already explained. Everything in that paragraph is a fact. If you feel criticized when people state facts, I'm not sure how much I can help you. Please note, I didn't say, "You're an idiot for running old stuff." I even explicitly stated that I understood *why* the OP is running an old version. 
Nevertheless, the facts are what they are. The only way we can deal rationally with the world is to acknowledge reality and deal with it. Wishing it were otherwise isn't really useful. :) > Or perhaps this: > >> And since you can't reliably reproduce the problem, how do you expect us >> to? I understand that these sorts of bugs are difficult/annoying, etc. >> Been there, done that. > > Which would appear to be suggesting that either (or possibly both): > > 1) The reporter has a duty to be able to "reliably reproduce the problem" > prior to reporting, and/or > > 2) That there was some unreasonable expectation on the reporter's part > that you were expected to reproduce it. Quite the contrary, I was responding to your implication that there is some other answer that we should be able to give the OP, other than "Try a newer version." Various people have chimed in on the thread, all have offered suggestions, none of which (AFAICS) have helped. I continue to maintain that the best course of action for the OP would be to try the latest 8-stable. And BTW, there are (at least) 2 reasons for that. First, the bug may actually be fixed. But second, we're in the middle of a release cycle for 8.3 right now. If the bug persists in the latest code it will be easier to get the right eyes onto the problem. That benefits both the OP and the community at large. 
Doug From owner-freebsd-hackers@FreeBSD.ORG Tue Apr 3 18:52:21 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9FB941065672 for ; Tue, 3 Apr 2012 18:52:21 +0000 (UTC) (envelope-from vasanth.raonaik@gmail.com) Received: from mail-wi0-f178.google.com (mail-wi0-f178.google.com [209.85.212.178]) by mx1.freebsd.org (Postfix) with ESMTP id 272EA8FC19 for ; Tue, 3 Apr 2012 18:52:21 +0000 (UTC) Received: by wibhq7 with SMTP id hq7so25542wib.13 for ; Tue, 03 Apr 2012 11:52:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=rLwyTeo3bjmPNj0Vt67v42VDa/ZFvNoIq/L9zijnk7Y=; b=eEXChBCjvPI6kBkCT8MqJ5f9zjYq9hVmNLhOHOGkN+4EaY5wqKW25rZgGUWfehX5AN 1NiMLW5AcM3wsVVhEuXqqsGpysOXH/zRsyLgaHH45VwRI0Z2zV1WmYuioQRZ/M4l/thk yZCRz1XcIWe05Hc399+RBL+4um4COMr1JYE7MjB12CLj+bLN7WFahwtio+3HsGzubOiU QRqYors7kpu8kuudZNdk1dZ9fwXiu/ux5BGfCuT37wZ7+E33TZ6VvdvYKj6b7bLGHYX+ YZ4+9FxwlKSYY6/M4xfCTyatu9UEuBFJh/vqfvLI0tZW2pAC+D5ZwTbMBGqX4+YatSab 4/Dw== MIME-Version: 1.0 Received: by 10.180.76.240 with SMTP id n16mr39101751wiw.10.1333479139600; Tue, 03 Apr 2012 11:52:19 -0700 (PDT) Received: by 10.180.98.161 with HTTP; Tue, 3 Apr 2012 11:52:19 -0700 (PDT) In-Reply-To: References: Date: Tue, 3 Apr 2012 14:52:19 -0400 Message-ID: From: vasanth rao naik sabavat To: Mark Tinguely Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-hackers@freebsd.org Subject: Re: question about amd64 pagetable page allocation. 
X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Apr 2012 18:52:21 -0000 Hello Mark, I think pmap_remove_pages() is executed only for the current process. 2549 #ifdef PMAP_REMOVE_PAGES_CURPROC_ONLY 2550 if (pmap != vmspace_pmap(curthread->td_proc->p_vmspace)) { 2551 printf("warning: pmap_remove_pages called with non-current pmap\n"); 2552 return; 2553 } 2554 #endif I dont still get it why this was removed? Thanks, Vasanth On Tue, Apr 3, 2012 at 12:42 PM, Mark Tinguely wrote: > On Tue, Apr 3, 2012 at 10:18 AM, vasanth rao naik sabavat > wrote: > > Hello Mark, > > > > Thank you for replying, > > > > Could you please point me to any document which illustrates the > > implementation of recursive page tables in FreeBSD for amd64. > > > > Also, I just found that with the following patch from Alon, the usage of > > vtopte() is removed in pmap_remove_pages(). Why was this removed? > > > > http://lists.freebsd.org/pipermail/svn-src-all/2009-March/006006.html > > > > Thanks, > > Vasanth > > > > On Tue, Apr 3, 2012 at 10:56 AM, Mark Tinguely > > wrote: > >> > >> On Tue, Apr 3, 2012 at 8:33 AM, vasanth rao naik sabavat > >> wrote: > >> > Hi, > >> > > >> > I am trying to understand the page table page allocation for a process > >> > in > >> > FBSD6.1. I see that the page table pages are allocated by > >> > vm_page_alloc(). > >> > I believe the virtual address for this allocated page can be derived > by > >> > PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m)), however when I compare this address > >> > with > >> > the virtual address accessed in pmap_remove_pages() they are not the > >> > same. 
> >> > > >> > The virtual address of a *PTE in pmap_remove_pages() is something like > >> > *0xffff800000643200 > >> > * > >> > However, the virtual address of the page allocated by vm_page_alloc() > is > >> > * > >> > 0xffffff033c6a0000 > >> > > >> > *I would also like to understand the importance of loc_PTmap, I > believe > >> > the > >> > pmap_remove_pages() is derving the page table page addresses from > >> > loc_PTmap? > >> > (kgdb) p loc_PTmap > >> > Cannot access memory at address 0xffff800000000000 > >> > > >> > How do we relate the loc_PTmap to the page table pages allocated by > >> > vm_page_alloc() in _pmap_allocpte(). > >> > > >> > Thanks, > >> > Vasanth > >> > >> > >> > >> The answer to your questions will be more obvious when you get an > >> understanding of the Recursive Page Tables. > >> > >> --Mark Tinguely. > > > > > > Search for "recursive page tables" start with 32 bits first. I did not > read it, but the below page looks promising: > > > http://www.rohitab.com/discuss/topic/31139-tutorial-paging-memory-mapping-with-a-recursive-page-directory/ > > The nice thing about recursive page table is the MMU does the work. But: > > 1) User recursive page tables work only for the current map. > 2) Kernel mappings are placed at the top of every map - so the kernel > recursive addresses will always be valid. > > As the comment in the link states, that change converted from using > user recursive page table virtual address (which is only valid when > the map is current) to using the physical to virtual direct map (which > is always valid). > > --Mark Tinguely. 
> From owner-freebsd-hackers@FreeBSD.ORG Tue Apr 3 19:02:56 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id ED3741065670 for ; Tue, 3 Apr 2012 19:02:56 +0000 (UTC) (envelope-from andrey@zonov.org) Received: from mail-bk0-f54.google.com (mail-bk0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id 72CA58FC08 for ; Tue, 3 Apr 2012 19:02:56 +0000 (UTC) Received: by bkcjc3 with SMTP id jc3so50170bkc.13 for ; Tue, 03 Apr 2012 12:02:55 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=message-id:date:from:user-agent:mime-version:to:subject :content-type:x-gm-message-state; bh=AlVasNTAZb+97HKzZR3v+G4SqJJjNHg7e6LexxrIzGg=; b=M48s+6KIBwwU1ZdgjrIIbsdVy6QNaIbICTyOdXy39qMkZkWSjgg4A3RWzyokaXvCGU Eqgw9F2FiiG4Pj3Wq1M7g1C16YfGDzmD1UrsksoDJjnx3K+431Y5BkvYvmlpuRMSi7aM 0EUzrQQiKQalP1LG9MXWvM+rEC9v9nS96Xlx9yB6ezvuKExnCWzcau6hW1Og4f6Qe8Fy aJ19yQ4D57yrrsQOWoamIKx63oYoWY/ZYdUu+4ANUkl5Q3hOIHFekROa8KlUzgn1XnoX 0HTf5ArK3oQs9hSRsNpfI/uS2F2s5sTmXKM5b6+IuTOeqNmgQgTzY2wWymtuRF72V6wQ PefA== Received: by 10.205.127.130 with SMTP id ha2mr6157479bkc.28.1333479775384; Tue, 03 Apr 2012 12:02:55 -0700 (PDT) Received: from [10.254.254.77] (ppp95-165-133-149.pppoe.spdop.ru. 
[95.165.133.149]) by mx.google.com with ESMTPS id jd17sm47960905bkb.4.2012.04.03.12.02.54 (version=SSLv3 cipher=OTHER); Tue, 03 Apr 2012 12:02:54 -0700 (PDT) Message-ID: <4F7B495D.3010402@zonov.org> Date: Tue, 03 Apr 2012 23:02:53 +0400 From: Andrey Zonov User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; ru; rv:1.8.1.24) Gecko/20100228 Thunderbird/2.0.0.24 Mnenhy/0.7.6.0 MIME-Version: 1.0 To: freebsd-hackers@freebsd.org Content-Type: multipart/mixed; boundary="------------020308060409060606070001" X-Gm-Message-State: ALoCoQn7VKuqtMnwC6m2B+o/bekQK+DdFHwM36BHbbec9btnc6QDkpT/p7iqMjj/GmZQMt1kkJVL Subject: problems with mmap() and disk caching X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Apr 2012 19:02:57 -0000 This is a multi-part message in MIME format. --------------020308060409060606070001 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Hi, I open the file, then call mmap() on the whole file and get pointer, then I work with this pointer. I expect that page should be only once touched to get it into the memory (disk cache?), but this doesn't work! 
I wrote a test (attached) and ran it on a 1 GB file generated from /dev/random. The result is the following:

Prepare file:
# swapoff -a
# newfs /dev/ada0b
# mount /dev/ada0b /mnt
# dd if=/dev/random of=/mnt/random-1024 bs=1m count=1024

Purge cache:
# umount /mnt
# mount /dev/ada0b /mnt

Run test:
$ ./mmap /mnt/random-1024 30
mmap: 1 pass took: 7.431046 (none: 262112; res: 32; super: 0; other: 0)
mmap: 2 pass took: 7.356670 (none: 261648; res: 496; super: 0; other: 0)
mmap: 3 pass took: 7.307094 (none: 260521; res: 1623; super: 0; other: 0)
mmap: 4 pass took: 7.350239 (none: 258904; res: 3240; super: 0; other: 0)
mmap: 5 pass took: 7.392480 (none: 257286; res: 4858; super: 0; other: 0)
mmap: 6 pass took: 7.292069 (none: 255584; res: 6560; super: 0; other: 0)
mmap: 7 pass took: 7.048980 (none: 251142; res: 11002; super: 0; other: 0)
mmap: 8 pass took: 6.899387 (none: 247584; res: 14560; super: 0; other: 0)
mmap: 9 pass took: 7.190579 (none: 242992; res: 19152; super: 0; other: 0)
mmap: 10 pass took: 6.915482 (none: 239308; res: 22836; super: 0; other: 0)
mmap: 11 pass took: 6.565909 (none: 232835; res: 29309; super: 0; other: 0)
mmap: 12 pass took: 6.423945 (none: 226160; res: 35984; super: 0; other: 0)
mmap: 13 pass took: 6.315385 (none: 208555; res: 53589; super: 0; other: 0)
mmap: 14 pass took: 6.760780 (none: 192805; res: 69339; super: 0; other: 0)
mmap: 15 pass took: 5.721513 (none: 174497; res: 87647; super: 0; other: 0)
mmap: 16 pass took: 5.004424 (none: 155938; res: 106206; super: 0; other: 0)
mmap: 17 pass took: 4.224926 (none: 135639; res: 126505; super: 0; other: 0)
mmap: 18 pass took: 3.749608 (none: 117952; res: 144192; super: 0; other: 0)
mmap: 19 pass took: 3.398084 (none: 99066; res: 163078; super: 0; other: 0)
mmap: 20 pass took: 3.029557 (none: 74994; res: 187150; super: 0; other: 0)
mmap: 21 pass took: 2.379430 (none: 55231; res: 206913; super: 0; other: 0)
mmap: 22 pass took: 2.046521 (none: 40786; res: 221358; super: 0; other: 0)
mmap: 23 pass took: 1.152797 (none: 30311; res: 231833; super: 0; other: 0)
mmap: 24 pass took: 0.972617 (none: 16196; res: 245948; super: 0; other: 0)
mmap: 25 pass took: 0.577515 (none: 8286; res: 253858; super: 0; other: 0)
mmap: 26 pass took: 0.380738 (none: 3712; res: 258432; super: 0; other: 0)
mmap: 27 pass took: 0.253583 (none: 1193; res: 260951; super: 0; other: 0)
mmap: 28 pass took: 0.157508 (none: 0; res: 262144; super: 0; other: 0)
mmap: 29 pass took: 0.156169 (none: 0; res: 262144; super: 0; other: 0)
mmap: 30 pass took: 0.156550 (none: 0; res: 262144; super: 0; other: 0)

If I run this before the test:
$ cat /mnt/random-1024 > /dev/null
then the result is the following:

$ ./mmap /mnt/random-1024 5
mmap: 1 pass took: 0.337657 (none: 0; res: 262144; super: 0; other: 0)
mmap: 2 pass took: 0.186137 (none: 0; res: 262144; super: 0; other: 0)
mmap: 3 pass took: 0.186132 (none: 0; res: 262144; super: 0; other: 0)
mmap: 4 pass took: 0.186535 (none: 0; res: 262144; super: 0; other: 0)
mmap: 5 pass took: 0.190353 (none: 0; res: 262144; super: 0; other: 0)

This is what I expect. But why doesn't this work without reading the file manually?

I've also never seen super pages; how can I make them work?

I've been playing with madvise and posix_fadvise but had no luck. BTW, posix_fadvise(POSIX_FADV_WILLNEED) does nothing, as the comment says; shouldn't this be documented in the manual page?

All tests were run under 9.0-STABLE (r233744).
-- Andrey Zonov --------------020308060409060606070001 Content-Type: text/plain; charset=windows-1251; name="mmap.c" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="mmap.c"

/*_
 * Andrey Zonov (c) 2011
 */

#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/time.h>

#include <err.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
	int i;
	int fd;
	int num;
	int block;
	int pagesize;
	size_t n;
	size_t size;
	size_t npages;
	size_t none, incore, super, other;
	char *p;
	char *tmp;
	char *vec;
	char *vecp;
	struct stat sb;
	struct timeval tp, tp1, tp2;

	if (argc < 2 || argc > 4)
		errx(1, "usage: mmap <file> [num] [block]");

	fd = open(argv[1], O_RDONLY);
	if (fd == -1)
		err(1, "open()");

	/* Number of passes over the mapping (default 1). */
	num = 1;
	if (argc >= 3)
		num = atoi(argv[2]);

	/* Copy granularity in bytes (default: one page). */
	pagesize = getpagesize();
	block = pagesize;
	if (argc == 4)
		block = atoi(argv[3]);

	if (fstat(fd, &sb) == -1)
		err(1, "fstat()");
	size = sb.st_size;

#if 0
	if (posix_fadvise(fd, (off_t)0, (off_t)0, POSIX_FADV_WILLNEED) == -1)
		err(1, "posix_fadvise()");
#endif

	p = mmap(NULL, sb.st_size, PROT_READ, /*MAP_PREFAULT_READ |*/ MAP_PRIVATE, fd, (off_t)0);
	if (p == MAP_FAILED)
		err(1, "mmap()");

#if 0
	if (madvise(p, (size_t)size, MADV_WILLNEED) == -1)
		err(1, "madvise()");
#endif

	tmp = calloc(1, block);
	if (tmp == NULL)
		err(1, "calloc()");

	/* mincore() writes one byte per page, so round the page count up. */
	npages = (size + pagesize - 1) / pagesize;
	vec = calloc(1, npages);
	if (vec == NULL)
		err(1, "calloc()");

	for (i = 0; i < num; i++) {
		/* Touch the whole mapping, one block at a time, and time it. */
		gettimeofday(&tp1, NULL);
		for (n = 0; n < size / block; n++)
			memcpy(tmp, p + (n * block), block);
		gettimeofday(&tp2, NULL);

		timersub(&tp2, &tp1, &tp);

		if (mincore(p, size, vec) == -1)
			err(1, "mincore()");

		/* Classify every page of the mapping by its mincore() status. */
		none = incore = super = other = 0;
		for (vecp = vec; (size_t)(vecp - vec) < npages; vecp++) {
			if (*vecp == 0)
				none++;
			/*
			 * Test MINCORE_SUPER before MINCORE_INCORE: a resident
			 * superpage mapping normally has both bits set, so the
			 * opposite order would never count it as "super".
			 */
			else if (*vecp & MINCORE_SUPER)
				super++;
			else if (*vecp & MINCORE_INCORE)
				incore++;
			else
				other++;
		}

		warnx("%2d pass took: %3ld.%06ld (none: %6zu; res: %6zu; super: %6zu; other: %6zu)",
		    i + 1, (long)tp.tv_sec, (long)tp.tv_usec, none, incore, super, other);
	}

	free(vec);
	free(tmp);

	if (munmap(p, sb.st_size) == -1)
		err(1, "munmap()");

	close(fd);

	exit(0);
}

--------------020308060409060606070001-- From owner-freebsd-hackers@FreeBSD.ORG Tue Apr 3 19:18:40 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 03DBD1065858 for ; Tue, 3 Apr 2012 19:18:40 +0000 (UTC) (envelope-from marktinguely@gmail.com) Received: from mail-pb0-f54.google.com (mail-pb0-f54.google.com [209.85.160.54]) by mx1.freebsd.org (Postfix) with ESMTP id C77A48FC18 for ; Tue, 3 Apr 2012 19:18:39 +0000 (UTC) Received: by pbcwz17 with SMTP id wz17so224311pbc.13 for ; Tue, 03 Apr 2012 12:18:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=4JXOk3Ng9m6BIaI5nnGEalpghZC0tFZCHxXxZu1spEI=; b=wkmNdqmJjE45CHpOewBWT+D+c1AtG80lm81R640bz6FTLZxoNxpK2XA1zShlCK0VMI EAAFqDCjyfPPTEr9zsLMSaOGkC22jt4PJzeCzvSInzb1G2AS+GOoitxzUS2Ai+e+LgUg K0u5utk1w04bPo5lr8ajyTWCQTVnrqYo6NBkr1rztABtPeGgwF40JTisNsLhIM+PEBH/ nq1+YIvfT7WeIZWT9Tb/bF9HBNsHYMZWnlFj3gMQza6XobbfngNPOYkCvQCm7V6Dsf01 +BD01Rem39i56IgiN6bjgykD7r2kaCtDw05eMacduV00ZIncxK1Lez+9ei/p8fXQ+izW xgKw== MIME-Version: 1.0 Received: by 10.68.240.65 with SMTP id vy1mr8235831pbc.131.1333480719177; Tue, 03 Apr 2012 12:18:39 -0700 (PDT) Received: by 10.68.189.69 with HTTP; Tue, 3 Apr 2012 12:18:39 -0700 (PDT) In-Reply-To: References: Date: Tue, 3 Apr 2012 14:18:39 -0500 Message-ID: From: Mark Tinguely To: vasanth rao naik sabavat Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-hackers@freebsd.org Subject: Re: question about amd64 pagetable page allocation.
X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Apr 2012 19:18:40 -0000

On Tue, Apr 3, 2012 at 1:52 PM, vasanth rao naik sabavat wrote:
> Hello Mark,
>
> I think pmap_remove_pages() is executed only for the current process.
>
> #ifdef PMAP_REMOVE_PAGES_CURPROC_ONLY
>     if (pmap != vmspace_pmap(curthread->td_proc->p_vmspace)) {
>         printf("warning: pmap_remove_pages called with non-current pmap\n");
>         return;
>     }
> #endif
>
> I dont still get it why this was removed?
>
> Thanks,
> Vasanth

That is pretty old code. Newer code does not make that assumption.

Without the assumption that the pages are from the current map, you
have to use the direct physical -> virtual mapping:

#ifdef PMAP_REMOVE_PAGES_CURPROC_ONLY
                pte = vtopte(pv->pv_va);
#else
                pte = pmap_pte(pmap, pv->pv_va);
#endif

--Mark Tinguely.
From owner-freebsd-hackers@FreeBSD.ORG Tue Apr 3 19:46:46 2012 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 3EC431065670 for ; Tue, 3 Apr 2012 19:46:46 +0000 (UTC) (envelope-from rank1seeker@gmail.com) Received: from mail-bk0-f54.google.com (mail-bk0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id B99148FC1A for ; Tue, 3 Apr 2012 19:46:45 +0000 (UTC) Received: by bkcjc3 with SMTP id jc3so108495bkc.13 for ; Tue, 03 Apr 2012 12:46:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=message-id:from:to:subject:date:content-type :content-transfer-encoding:x-mailer; bh=/RjZS93Cfph/vqcbkKrnwaqE5mD5mjr88ciLg9Y2WOg=; b=D/0ooydOAqMEQYrWpCKs1P4JPrs4ccwywiW/JefbeLp2De8lcMyoC6dFmh/AA5lHEx 0By80EyF8+bTBMFuRBinovsTxeqhq7JS44CCxTEU3GklvVLJWRnQMgdGQZV3Zsuchzkv hlnT6VeEB/frV1XDNykF5IpB+djhTAQRtQ8OA0ohFFEAHh+MZAG0QZlmiYaPz/sFDk1z IXZspSusg5pZNiwSB9u14RnHwXHmqZCW5i+uVKNNHTh4tgDWWzO+FGYXhpbrHwUVHafc 2R1bLNMeL6dyio10A3ROuH75qhZ9A7nCdgceItlEN+Fh3KHMc0WNwqZEcVT3VoQtCVZF Q7Vg== Received: by 10.204.156.65 with SMTP id v1mr5988426bkw.109.1333482404572; Tue, 03 Apr 2012 12:46:44 -0700 (PDT) Received: from DOMYPC ([82.193.208.173]) by mx.google.com with ESMTPS id u5sm48167713bka.5.2012.04.03.12.46.42 (version=SSLv3 cipher=OTHER); Tue, 03 Apr 2012 12:46:44 -0700 (PDT) Message-ID: <20120403.194643.926.1@DOMY-PC> From: rank1seeker@gmail.com To: hackers@freebsd.org Date: Tue, 03 Apr 2012 21:46:43 +0200 Content-Type: text/plain; charset="Windows-1250" Content-Transfer-Encoding: quoted-printable X-Mailer: POP Peeper (3.8.1.0) Cc: Subject: gpart and it's EBR confusion X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Apr 2012 19:46:46 
-0000

9.0 R i386

EBR scheme never installed

md0s3 has BSD labels scheme
----
# gpart destroy -F md0s3
md0s3 destroyed

# gpart create -s BSD md0s3
gpart: geom 'md0s3': File exists

# gpart show -p md0s3
=>      0  1023120  md0s3  EBR  (499M) [CORRUPT]
        0  1023120         - free -  (499M)
----


During one of above tasks, on main console kernel outputs:
--
GEOM: md0s3: invalid entries in the EBR ignored.
--

Not 100% reproducible.
Running again SAME 'gpart destroy' then 'gpart create', worked.


Domagoj Smolčić

From owner-freebsd-hackers@FreeBSD.ORG Tue Apr 3 20:32:19 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 419F3106564A for ; Tue, 3 Apr 2012 20:32:19 +0000 (UTC) (envelope-from ambrisko@ambrisko.com) Received: from mail.ambrisko.com (mail.ambrisko.com [70.91.206.90]) by mx1.freebsd.org (Postfix) with ESMTP id 1F0C38FC0A for ; Tue, 3 Apr 2012 20:32:18 +0000 (UTC) X-Ambrisko-Me: Yes Received: from server2.ambrisko.com (HELO internal.ambrisko.com) ([192.168.1.2]) by ironport.ambrisko.com with ESMTP; 03 Apr 2012 13:32:24 -0700 Received: from ambrisko.com (localhost [127.0.0.1]) by internal.ambrisko.com (8.14.4/8.14.4) with ESMTP id q33KWIwN081140; Tue, 3 Apr 2012 13:32:18 -0700 (PDT) (envelope-from ambrisko@ambrisko.com) Received: (from ambrisko@localhost) by ambrisko.com (8.14.4/8.14.4/Submit) id q33KWHv9081139; Tue, 3 Apr 2012 13:32:17 -0700 (PDT) (envelope-from ambrisko) From: Doug Ambrisko Message-Id: <201204032032.q33KWHv9081139@ambrisko.com> In-Reply-To: <4F7B347B.6000400@shadowsun.net> To: Eric McCorkle Date: Tue, 3 Apr 2012 13:32:17 -0700 (PDT) X-Mailer: ELM [version 2.4ME+ PL124d (25)] MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset="US-ASCII" Cc:
freebsd-hackers@freebsd.org Subject: Re: GSoC: EFI on intel X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Apr 2012 20:32:19 -0000

Eric McCorkle writes:
| On 04/03/12 13:22, Doug Ambrisko wrote:
| > EFI is a good task. For generic PC's we need an X64 format. The current
| > version in FreeBSD is IA32 format. The X64 can boot i386/amd64.
| > Qemu can be used to test both IA32 and X64 formats. I added some
| > notes about this on the wiki at:
| > http://wiki.freebsd.org/IdeasPage#EFI_support_for_FreeBSD.2BAC8-i386_and_FreeBSD.2BAC8-amd64_.28GSoC.29
| >
| > Qemu is nice since it can runs an UEFI BIOS via the OVMF project
| > and emulate a DOS file system by pointing qemu to a directory. So
| > then it is easy to build something, toss it into a directory, start
| > qemu and test.
|
| I'm drafting an application right now. I emailed the listed contacts
| (Rui Paulo and Andrey Elsukov) a moment ago.
|
| I've done background research on this already, as part of getting
| FreeBSD to work on Mac hardware. QEMU caught my attention as a
| testbed. Also, I found out Apple EFI implementations are non-standard.
| They specifically look for an HFS+ partition and load a specific file.
| The workaround is pretty simple, of course: just wrap the boot loader in
| an HFS+ image and write it to a partition reserved for that purpose.
|
| Anyway, if I'm going to propose this, I need to list possible mentors.
| Skill-wise, I'm well equipped to take it on. I anticipate needing
| someone who's a committer, preferably with good knowledge of the kernel
| sources.

Both Rui and Andrey should be able to fit your need for mentors.
I can help with some stuff. It seems you've looked at the Mac side
a fair amount. It might be good to look at X64 and IA64. I don't know
how much can be shared.
There is an efi loader in the tree for IA64. I've only played around with generic PC's (Dell, AMI based systems and qemu). I built grub and had grub boot via efi. Doug A. From owner-freebsd-hackers@FreeBSD.ORG Tue Apr 3 21:12:30 2012 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id A9172106566B for ; Tue, 3 Apr 2012 21:12:30 +0000 (UTC) (envelope-from rsimmons0@gmail.com) Received: from mail-vb0-f54.google.com (mail-vb0-f54.google.com [209.85.212.54]) by mx1.freebsd.org (Postfix) with ESMTP id 633BF8FC14 for ; Tue, 3 Apr 2012 21:12:30 +0000 (UTC) Received: by vbmv11 with SMTP id v11so153899vbm.13 for ; Tue, 03 Apr 2012 14:12:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type:content-transfer-encoding; bh=mJINJaBZAuRhI41JYEPHIO6YTD1ft6aW94lSafyWITE=; b=M/8whOdpWOhZ5W7f1ENwFQ55izs05tBAMxaRQICmWcAljUk0e9Ef5eRqPkcf8egx0o 4G+ey6ItteTq1+S+oi/eU2Sy+Y9tD50xJStA3IhXuYEVCdQtguMgGwP6/SesaJaRwPFS KzFwmW/G40LzgwbVMfjADYnZiKqqzwznLZncNEBCzal7ryppaD21/q1FT2I6w/y/7znO MnqZHa5vuaETfo5W/5sg/5pFBCw8jLles9gsxccEp0HWeI7llfAdCgvHOYFEnItLgjcf JZEwM9jUOQ7YEZ0aDEgJfVbH1YLPwdVFlQOAlFCwsHsgQ1x+1IVSyf88lNAhypA+3g3U R4Gw== MIME-Version: 1.0 Received: by 10.52.92.140 with SMTP id cm12mr5527251vdb.115.1333487549940; Tue, 03 Apr 2012 14:12:29 -0700 (PDT) Received: by 10.52.117.76 with HTTP; Tue, 3 Apr 2012 14:12:29 -0700 (PDT) In-Reply-To: <20120403.194643.926.1@DOMY-PC> References: <20120403.194643.926.1@DOMY-PC> Date: Tue, 3 Apr 2012 17:12:29 -0400 Message-ID: From: Robert Simmons To: hackers@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: Subject: Re: gpart and it's EBR confusion X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to 
FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Apr 2012 21:12:30 -0000

On Tue, Apr 3, 2012 at 3:46 PM, wrote:
> 9.0 R i386
>
> EBR scheme never installed
>
> md0s3 has BSD labels scheme
> ----
> # gpart destroy -F md0s3
> md0s3 destroyed
>
> # gpart create -s BSD md0s3
> gpart: geom 'md0s3': File exists
>
> # gpart show -p md0s3
> =>      0  1023120  md0s3  EBR  (499M) [CORRUPT]
>         0  1023120         - free -  (499M)
> ----
>
>
> During one of above tasks, on main console kernel outputs:
> --
> GEOM: md0s3: invalid entries in the EBR ignored.

I've had a similar problem, but with "gpart destroy" and a GPT partition. If I create a GPT partition on a disk then for some reason begin an install process again by rebooting and starting from the beginning of bsdinstall I am unable to destroy the old partition scheme. I encountered this with sysinstall and 8.0-8.2 as well.

The workaround that I found is to use "dd" rather than "gpart destroy". Just dd the drive or part of drive in question from /dev/zero or /dev/random and everything is a clean slate.
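
The dd workaround above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a tested recovery procedure: `IMG` is a hypothetical scratch file standing in for the real device node, the 8 MB size and the 1 MB wipe length are arbitrary choices, and the block size is spelled out in bytes because the `bs=1m` / `bs=1M` suffix differs between dd implementations. Both ends are zeroed because GPT keeps a backup header at the end of the disk.

```shell
#!/bin/sh
# Sketch only: zero the partition metadata at both ends of a "disk".
# IMG is a hypothetical scratch file, NOT a real device; pointing this
# at a real disk would destroy the data on it.
set -eu

IMG="${IMG:-/tmp/wipe-demo.img}"
MB=1048576      # block size in bytes
SIZE_MB=8       # size of the fake disk, in blocks of MB bytes

# Build a fake disk filled with non-zero bytes so the wipe is visible.
dd if=/dev/zero bs="$MB" count="$SIZE_MB" 2>/dev/null | tr '\0' 'x' > "$IMG"

# Zero the first block: MBR/EBR, BSD label, primary GPT header and table.
dd if=/dev/zero of="$IMG" bs="$MB" count=1 conv=notrunc 2>/dev/null

# Zero the last block: the backup GPT header sits at the end of the disk.
dd if=/dev/zero of="$IMG" bs="$MB" count=1 seek=$((SIZE_MB - 1)) \
    conv=notrunc 2>/dev/null
```

On a real system the target would be the device node itself and its size would have to be queried from the system rather than hard-coded; `conv=notrunc` only matters for the regular-file stand-in used here.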
From owner-freebsd-hackers@FreeBSD.ORG Tue Apr 3 21:27:44 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EEA641065670 for ; Tue, 3 Apr 2012 21:27:44 +0000 (UTC) (envelope-from jrytoung@gmail.com) Received: from mail-wi0-f172.google.com (mail-wi0-f172.google.com [209.85.212.172]) by mx1.freebsd.org (Postfix) with ESMTP id 79A248FC08 for ; Tue, 3 Apr 2012 21:27:44 +0000 (UTC) Received: by wibhj6 with SMTP id hj6so3313615wib.13 for ; Tue, 03 Apr 2012 14:27:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=SIK+bZ3SUelstjS/Pqr6V3fNYBxMlx4RKxdwrT36OpA=; b=LdjkOk1mXkz2WMLyMVb5Rtp3Hj7+jqJcABH8/6hlFCirbj7jJJvnNOlM1X2v/MNb15 Gv+BWAlEx2I4Jr2Mn6CGRcifg0uX8475a4PyfByMr0Qcz3mmXsYVe9lY1y2A6QldAc5R oirMHTvO4hmxcThIf5M0CVAQO/HPyHNU2UIjx/YYqNYtYoL24U+CZiqz5bmzi9IvO+Px PxoCCWBkZnHB8zQlipsofpVZ9CPy6Ey8c8uMTr3IpodRzAnNCk1xLtZkdYCn4N+wZnNr 7pJssh50YeDl6qDcLgKpJLPYz+2Y/9nT4V6SeA5EVrzkQoutLlpEscXhEgr3BApQdsPW +3uA== MIME-Version: 1.0 Received: by 10.180.24.66 with SMTP id s2mr40243820wif.7.1333488463331; Tue, 03 Apr 2012 14:27:43 -0700 (PDT) Received: by 10.216.27.148 with HTTP; Tue, 3 Apr 2012 14:27:43 -0700 (PDT) In-Reply-To: <20120403193124.46ad9de9@ernst.jennejohn.org> References: <20120403193124.46ad9de9@ernst.jennejohn.org> Date: Tue, 3 Apr 2012 14:27:43 -0700 Message-ID: From: Jerry Toung To: gljennjohn@googlemail.com Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-hackers Subject: Re: CAM disk I/O starvation X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Apr 2012 21:27:45 -0000 On 4/3/12, Gary Jennejohn wrote: > It would be 
interesting to see your patch. I always run HEAD but maybe
> I could use it as a base for my own mods/tests.
>

Here is the patch

diff -rup cam/cam_sim.c cam/cam_sim.c
--- cam/cam_sim.c 2010-06-13 19:09:06.000000000 -0700
+++ cam/cam_sim.c 2012-03-19 13:05:10.000000000 -0700
@@ -87,6 +87,7 @@ cam_sim_alloc(sim_action_func sim_action
 	sim->refcount = 1;
 	sim->devq = queue;
 	sim->max_ccbs = 8; /* Reserve for management purposes. */
+	sim->dev_count = 0;
 	sim->mtx = mtx;
 	if (mtx == &Giant) {
 		sim->flags |= 0;
diff -rup cam/cam_sim.h cam/cam_sim.h
--- cam/cam_sim.h 2010-06-13 19:09:06.000000000 -0700
+++ cam/cam_sim.h 2012-03-19 15:34:17.000000000 -0700
@@ -118,6 +118,8 @@ struct cam_sim {
 	u_int max_ccbs;
 	/* Current count of allocated ccbs */
 	u_int ccb_count;
+	/* Number of peripheral drivers mapped to this sim */
+	u_int dev_count;
 };
diff -rup cam/cam_xpt.c cam/cam_xpt.c
--- cam/cam_xpt.c 2010-06-13 19:09:06.000000000 -0700
+++ cam/cam_xpt.c 2012-03-29 11:41:51.000000000 -0700
@@ -303,7 +303,7 @@ xpt_schedule_dev_allocq(struct cam_eb *b
 	int retval;

 	if ((dev->drvq.entries > 0) &&
-	    (dev->ccbq.devq_openings > 0) &&
+	    (dev->runs_token < dev->ccbq.queue.array_size) &&
 	    (cam_ccbq_frozen(&dev->ccbq, CAM_PRIORITY_TO_RL(
 		CAMQ_GET_PRIO(&dev->drvq))) == 0)) {
 		/*
@@ -327,7 +327,7 @@ xpt_schedule_dev_sendq(struct cam_eb *bu
 	int retval;

 	if ((dev->ccbq.queue.entries > 0) &&
-	    (dev->ccbq.dev_openings > 0) &&
+	    (dev->runs_token < dev->ccbq.queue.array_size) &&
 	    (cam_ccbq_frozen_top(&dev->ccbq) == 0)) {
 		/*
 		 * The priority of a device waiting for controller
@@ -973,6 +973,9 @@ xpt_add_periph(struct cam_periph *periph
 	struct cam_ed *device;
 	int32_t status;
 	struct periph_list *periph_head;
+	struct cam_eb *bus;
+	struct cam_et *target;
+	struct cam_ed *devptr;

 	mtx_assert(periph->sim->mtx, MA_OWNED);

@@ -991,6 +994,8 @@ xpt_add_periph(struct cam_periph *periph
 		status = camq_resize(&device->drvq, device->drvq.array_size + 1);
+		if (periph->periph_name != NULL && strncmp(periph->periph_name,
"da",2) ==0 )
+			device->sim->dev_count++;
 		device->generation++;
 		SLIST_INSERT_HEAD(periph_head, periph, periph_links);
@@ -998,6 +1003,24 @@ xpt_add_periph(struct cam_periph *periph
 	mtx_lock(&xsoftc.xpt_topo_lock);
 	xsoftc.xpt_generation++;
+
+	if (device != NULL && device->sim->dev_count > 1 &&
+	    (device->sim->max_dev_openings > device->sim->dev_count)) {
+		TAILQ_FOREACH(bus, &xsoftc.xpt_busses, links) {
+			if (bus->sim != device->sim)
+				continue;
+			TAILQ_FOREACH(target, &bus->et_entries, links) {
+				TAILQ_FOREACH(devptr, &target->ed_entries, links) {
+					/*
+					 * The number of openings/tags supported by the sim (i.e controller)
+					 * is evenly distributed between all devices that share this sim.
+					 */
+					cam_ccbq_resize(&devptr->ccbq,
+					    (devptr->sim->max_dev_openings/devptr->sim->dev_count));
+				}
+			}
+		}
+	}
 	mtx_unlock(&xsoftc.xpt_topo_lock);

 	return (status);
@@ -3072,6 +3095,11 @@ xpt_run_dev_allocq(struct cam_eb *bus)
 		}

 		/* We may have more work. Attempt to reschedule. */
+		device->runs_token++;
+		if (device->runs_token >= device->ccbq.queue.array_size) {
+			device->runs_token = 0;
+			break;
+		}
 		xpt_schedule_dev_allocq(bus, device);
 	}
 	devq->alloc_queue.qfrozen_cnt[0]--;
@@ -3139,7 +3167,6 @@ xpt_run_dev_sendq(struct cam_eb *bus)
 		devq->send_openings--;
 		devq->send_active++;

-		xpt_schedule_dev_sendq(bus, device);

 		if (work_ccb && (work_ccb->ccb_h.flags & CAM_DEV_QFREEZE) != 0){
 			/*
@@ -3170,6 +3197,13 @@ xpt_run_dev_sendq(struct cam_eb *bus)
 		 */
 		sim = work_ccb->ccb_h.path->bus->sim;
 		(*(sim->sim_action))(sim, work_ccb);
+
+		device->runs_token++;
+		if (device->runs_token >= device->ccbq.queue.array_size) {
+			device->runs_token = 0;
+			break;
+		}
+		xpt_schedule_dev_sendq(bus, device);
 	}
 	devq->send_queue.qfrozen_cnt[0]--;
 }
@@ -4285,6 +4319,7 @@ xpt_alloc_device(struct cam_eb *bus, str
 		device->tag_delay_count = 0;
 		device->tag_saved_openings = 0;
 		device->refcount = 1;
+		device->runs_token = 0;
 		callout_init_mtx(&device->callout, bus->sim->mtx, 0);

 		/*
diff -rup cam/cam_xpt_internal.h
cam/cam_xpt_internal.h
--- cam/cam_xpt_internal.h 2010-06-13 19:09:06.000000000 -0700
+++ cam/cam_xpt_internal.h 2012-03-21 13:57:45.000000000 -0700
@@ -118,6 +118,7 @@ struct cam_ed {
 #define CAM_TAG_DELAY_COUNT 5
 	u_int32_t tag_saved_openings;
 	u_int32_t refcount;
+	u_int32_t runs_token;
 	struct callout callout;
 };
diff -rup cam/scsi/scsi_da.c cam/scsi/scsi_da.c
--- cam/scsi/scsi_da.c 2010-06-13 19:09:06.000000000 -0700
+++ cam/scsi/scsi_da.c 2012-03-21 14:16:00.000000000 -0700
@@ -56,7 +56,13 @@ __FBSDID("$FreeBSD: src/sys/cam/scsi/scs
 #include
 #include
 #include
+#include
 #include
+#include
+#include
+#include
+#include
+#include

 #include

@@ -1102,6 +1108,26 @@ dasysctlinit(void *context, int pending)
 		&softc->minimum_cmd_size, 0, dacmdsizesysctl, "I",
 		"Minimum CDB size");

+	SYSCTL_ADD_INT(&softc->sysctl_ctx,SYSCTL_CHILDREN(softc->sysctl_tree),
+		OID_AUTO, "outstanding_cmds", CTLTYPE_INT | CTLFLAG_RD,
+		&softc->outstanding_cmds, 0, "Outstanding CCB Cmds");
+
+	SYSCTL_ADD_INT(&softc->sysctl_ctx,SYSCTL_CHILDREN(softc->sysctl_tree),
+		OID_AUTO, "ccbq_devq_openings", CTLTYPE_INT | CTLFLAG_RD,
+		&periph->path->device->ccbq.devq_openings, 0, "CCBQ Dev Openings");
+
+	SYSCTL_ADD_INT(&softc->sysctl_ctx,SYSCTL_CHILDREN(softc->sysctl_tree),
+		OID_AUTO, "ccbq_array_size", CTLTYPE_INT | CTLFLAG_RW,
+		&periph->path->device->ccbq.queue.array_size, 0, "CCBQ Array Size");
+
+	SYSCTL_ADD_INT(&softc->sysctl_ctx,SYSCTL_CHILDREN(softc->sysctl_tree),
+		OID_AUTO, "sim_ccb_count", CTLTYPE_INT | CTLFLAG_RD,
+		&periph->sim->ccb_count, 0, "SIM CCB COUNT");
+
+	SYSCTL_ADD_INT(&softc->sysctl_ctx,SYSCTL_CHILDREN(softc->sysctl_tree),
+		OID_AUTO, "sim_devq_alloc_openings", CTLTYPE_INT | CTLFLAG_RD,
+		&periph->sim->devq->alloc_openings, 0, "SIM Devq Alloc Openings");
+
 	cam_periph_release(periph);
 }

From owner-freebsd-hackers@FreeBSD.ORG Tue Apr 3 22:20:17 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org
(Postfix) with ESMTP id BF8EF106566C for ; Tue, 3 Apr 2012 22:20:17 +0000 (UTC) (envelope-from eric@shadowsun.net) Received: from mail.atlantawebhost.com (dns1.atlantawebhost.com [66.223.40.39]) by mx1.freebsd.org (Postfix) with ESMTP id 66A828FC08 for ; Tue, 3 Apr 2012 22:20:17 +0000 (UTC) Received: (qmail 28963 invoked from network); 3 Apr 2012 18:13:30 -0400 Received: from c-76-119-101-151.hsd1.ma.comcast.net (HELO ?192.168.1.2?) (76.119.101.151) by mail.atlantawebhost.com with SMTP; 3 Apr 2012 18:13:30 -0400 Message-ID: <4F7B7604.6090703@shadowsun.net> Date: Tue, 03 Apr 2012 18:13:24 -0400 From: Eric McCorkle User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:10.0.3) Gecko/20120318 Thunderbird/10.0.3 MIME-Version: 1.0 To: freebsd-hackers@freebsd.org References: <201204031722.q33HMD6u051412@ambrisko.com> In-Reply-To: <201204031722.q33HMD6u051412@ambrisko.com> X-Enigmail-Version: 1.4 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enig28B364A43523696C6C9E4DF6" X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Subject: Re: GSoC: EFI on intel X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Apr 2012 22:20:17 -0000
This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--------------enig28B364A43523696C6C9E4DF6 Content-Type: multipart/mixed; boundary="------------090402030200090909000405"
This is a multi-part message in MIME format.
--------------090402030200090909000405 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable

On 04/03/12 13:22, Doug Ambrisko wrote:
> Eric McCorkle writes:
> | I'm assessing possible summer of code projects, and the EFI work caught
> | my attention.
I've been running FreeBSD on a macbook for a little under
> | a year now, and booting on EFI is definitely an interest to me. Does
> | anyone know if this is still a viable project proposal? I certainly
> | have the skills to undertake it, I just want to make sure that it stands
> | a chance of actually being selected.
>
> EFI is a good task. For generic PC's we need an X64 format. The current
> version in FreeBSD is IA32 format. The X64 can boot i386/amd64.
> Qemu can be used to test both IA32 and X64 formats. I added some
> notes about this on the wiki at:
> http://wiki.freebsd.org/IdeasPage#EFI_support_for_FreeBSD.2BAC8-i386_and_FreeBSD.2BAC8-amd64_.28GSoC.29

Based on the feedback I've gotten, I've gone ahead and submitted the proposal. It should be available, and I can edit it up to the 6th.

--
Eric McCorkle
Computer Science Ph.D Student
eric@shadowsun.net

--------------090402030200090909000405--
--------------enig28B364A43523696C6C9E4DF6 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc"
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (FreeBSD)

iQIcBAEBAgAGBQJPe3YJAAoJENSCzbQ+koZ7zEEQAI/XufcX5j3gqlYxO05df6nP
nvexIxeaeXW+sr3zUduUevLsn6BARWMxkemjkVDKUX+3UkN/Wk7roCrwxhqEQ4nj
udwUpkvQQIOc1/n70d8mpF4+LiEtxlWsgBWGy8q280T+nrqqSbmdwQU/8LMypvk1
Z21eT2wxM5JnvadFyxgSc24zXwAlR4M8Ku9MvQwM741mlgoyl0POC5/4byH1lsHi
Nqe+yhSXPyLk0bfwkPO80ofFwZbSuKvkU1xoYeGquxcsV6V/0QyN0WXq2yxIXFJy
mRGr7nXXscoZ1mYDOle6dchUgktlAlVppVbwqNsGunqHZkO4KEcHeYeUelqF5B+Q
qcEZqpiymbKmNtvyGVU7u1BRZ6ebinRjGRu+Tks/UHsynVOChoFgw0KImdpeBvcc
oKaala5yUD5zkzAIpnbB2a8aO+TZQXVelRP4Z7YZlDON7tpzAgdMlRTe/C8DqKew
/lZkTTUCSEJ8ySicsp+oZBnqLjUTORBqZ8o9S0GZg5vZKQkNtbYYMqnWV/CWhu9K
nYDkuWjXcbcrm1hc2pdKBDLv06oQM76tNHiKQmYdtifR7NI0EZPBRb48uxLatDYf
V4d05nz+YjBcREc5wxSHbEAyVQ0ZRbNfT9C8vwL5/CTSTK3sc+awY4pAfLKeM398
qw9vAG/enwDDQcukxLLW
=oPcr
-----END PGP SIGNATURE-----
--------------enig28B364A43523696C6C9E4DF6-- From owner-freebsd-hackers@FreeBSD.ORG Tue Apr 3 23:43:27 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B9F231065732 for ; Tue, 3 Apr 2012 23:43:27 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Received: from mail-lpp01m010-f54.google.com (mail-lpp01m010-f54.google.com [209.85.215.54]) by mx1.freebsd.org (Postfix) with ESMTP id 3B2158FC08 for ; Tue, 3 Apr 2012 23:43:27 +0000 (UTC) Received: by lagv3 with SMTP id v3so423334lag.13 for ; Tue, 03 Apr 2012 16:43:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:date:x-google-sender-auth:message-id:subject :from:to:content-type; bh=OlEtw6/bNKYrJCgVe1SqoJmJm7vAS3FQACDV7jBHkIQ=; b=iSKfGsDXvvZj9DatVeQHls8Xmwgb32T/EfImWg3qwFmYtBNmpPJv1SAKusWINJpwMk g+/0OX3FkLGgLhGnJZQaEGgP3rBkMmDDmakaTQ8pV9NPq5b147FJB9/jMfmBxDOQdzMd sWS4k1z4j13Sl/iAx7WjdMzQ6ngQpjkNJCjqRl3Vq2bqpXVEoR8HfGhRmevxKk3eBSu3 jBj1yaFw8Jck3J6vVX//Yjmphd4MgxxvCDPJLpHXhgi0uIIGhVkrXuDZSnPjpJEZ3RqU sgyqkAT4eUEqCes437hVl7bAMBTbfbvTpinepDfFwTXRr01JzI3hyNqWFuwXeLd6/tJC QDYw== MIME-Version: 1.0 Received: by 10.152.111.161 with SMTP id ij1mr16261738lab.19.1333496605933; Tue, 03 Apr 2012 16:43:25 -0700 (PDT) Sender: adrian.chadd@gmail.com Received: by 10.112.91.141 with HTTP; Tue, 3 Apr 2012 16:43:25 -0700 (PDT) Date: Tue, 3 Apr 2012 16:43:25 -0700 X-Google-Sender-Auth: rz_PNlH1t0KL1VaMt-abmLw-67I Message-ID: From: Adrian Chadd To: freebsd-hackers@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Subject: Request: expresscard (PCIe hotplug) support X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Apr 2012 23:43:27 -0000 Hi all, FreeBSD still 
doesn't have the hotplug side of expresscard support. (I don't think we have APSM support either, but that's a later problem.) Would anyone be interested (and have the hardware/skills) to make it work? If you can make something work, you'll make my 802.11n development happen so much faster. :-) I'd be eternally grateful. Thanks! Adrian From owner-freebsd-hackers@FreeBSD.ORG Wed Apr 4 04:29:23 2012 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 7D19A106566B for ; Wed, 4 Apr 2012 04:29:23 +0000 (UTC) (envelope-from ae@FreeBSD.org) Received: from mail.kirov.so-ups.ru (mail.kirov.so-ups.ru [178.74.170.1]) by mx1.freebsd.org (Postfix) with ESMTP id 23AF38FC08 for ; Wed, 4 Apr 2012 04:29:23 +0000 (UTC) Received: from kas30pipe.localhost (localhost.kirov.so-ups.ru [127.0.0.1]) by mail.kirov.so-ups.ru (Postfix) with SMTP id CBFF0B801B; Wed, 4 Apr 2012 08:29:15 +0400 (MSK) Received: from kirov.so-ups.ru (unknown [172.21.81.1]) by mail.kirov.so-ups.ru (Postfix) with ESMTP id C2695B8008; Wed, 4 Apr 2012 08:29:15 +0400 (MSK) Received: by ns.kirov.so-ups.ru (Postfix, from userid 1010) id BB6A5BA008; Wed, 4 Apr 2012 08:29:15 +0400 (MSK) Received: from [127.0.0.1] (elsukov.kirov.oduur.so [10.118.3.52]) by ns.kirov.so-ups.ru (Postfix) with ESMTP id 871A9BA003; Wed, 4 Apr 2012 08:29:15 +0400 (MSK) Message-ID: <4F7BCE1B.6070708@FreeBSD.org> Date: Wed, 04 Apr 2012 08:29:15 +0400 From: "Andrey V. 
Elsukov" User-Agent: Mozilla Thunderbird 1.5 (FreeBSD/20051231) MIME-Version: 1.0 To: rank1seeker@gmail.com References: <20120403.194643.926.1@DOMY-PC> In-Reply-To: <20120403.194643.926.1@DOMY-PC> X-Enigmail-Version: 1.4 Content-Type: text/plain; charset=KOI8-R Content-Transfer-Encoding: 7bit X-SpamTest-Version: SMTP-Filter Version 3.0.0 [0284], KAS30/Release X-SpamTest-Info: Not protected Cc: hackers@freebsd.org Subject: Re: gpart and it's EBR confusion X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Apr 2012 04:29:23 -0000 On 03.04.2012 23:46, rank1seeker@gmail.com wrote: > GEOM: md0s3: invalid entries in the EBR ignored. > -- > > Not 100% reproducible. > Running again SAME 'gpart destroy' then 'gpart create', worked. This should be fixed in stable/9 with r232535. -- WBR, Andrey V. Elsukov From owner-freebsd-hackers@FreeBSD.ORG Wed Apr 4 05:40:50 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 6612E1065670; Wed, 4 Apr 2012 05:40:50 +0000 (UTC) (envelope-from julian@freebsd.org) Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16]) by mx1.freebsd.org (Postfix) with ESMTP id 3337E8FC14; Wed, 4 Apr 2012 05:40:50 +0000 (UTC) Received: from julian-mac.elischer.org (c-67-180-24-15.hsd1.ca.comcast.net [67.180.24.15]) (authenticated bits=0) by vps1.elischer.org (8.14.5/8.14.5) with ESMTP id q345el1O001443 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Tue, 3 Apr 2012 22:40:48 -0700 (PDT) (envelope-from julian@freebsd.org) Message-ID: <4F7BDF06.8000104@freebsd.org> Date: Tue, 03 Apr 2012 22:41:26 -0700 From: Julian Elischer User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X 10.4; en-US; rv:1.9.2.28) Gecko/20120306 
Thunderbird/3.1.20 MIME-Version: 1.0 To: John Baldwin References: <4F775DF5.1020704@rawbw.com> <201204020831.09253.jhb@freebsd.org> <4F79D63E.7010200@rawbw.com> <201204021312.36568.jhb@freebsd.org> In-Reply-To: <201204021312.36568.jhb@freebsd.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Yuri , hackers@freebsd.org, freebsd-hackers@freebsd.org Subject: Re: Is there any modern alternative to pstack? X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Apr 2012 05:40:50 -0000 On 4/2/12 10:12 AM, John Baldwin wrote: > On Monday, April 02, 2012 12:39:26 pm Yuri wrote: >> On 04/02/2012 05:31, John Baldwin wrote: >>> Hmm, I don't know if the port has it, but I did some work on pstack a while >>> ago to make it work with libthread_db so it at least handles i386 ok. It >>> needs to be modified to use something like libunwind though or some other >>> unwinder. And possibly it should use libelf instead of its own ELF-parsing >>> code. >> I see pstack -1.2_1 failing even on i386: >> >> pstack: cannot read context for thread 0x1879f >> pstack: failed to read more threads > Yes, threads don't work for modern binaries (newer than 4.x) without my changes > to make it use libthread_db. You can find the patch I used for this at > http://www.freebsd.org/~jhb/patches/pstack_threads.patch should be in ports? 
From owner-freebsd-hackers@FreeBSD.ORG Wed Apr 4 07:18:03 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7D365106566B; Wed, 4 Apr 2012 07:18:03 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id C68078FC0C; Wed, 4 Apr 2012 07:18:01 +0000 (UTC) Received: from skuns.kiev.zoral.com.ua (localhost [127.0.0.1]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id q347HkZV059958; Wed, 4 Apr 2012 10:17:46 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.5/8.14.5) with ESMTP id q347Hkk0081689; Wed, 4 Apr 2012 10:17:46 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.5/8.14.5/Submit) id q347HkcN081688; Wed, 4 Apr 2012 10:17:46 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Wed, 4 Apr 2012 10:17:46 +0300 From: Konstantin Belousov To: Andrey Zonov Message-ID: <20120404071746.GJ2358@deviant.kiev.zoral.com.ua> References: <4F7B495D.3010402@zonov.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1;
protocol="application/pgp-signature"; boundary="MLgImouMc6M0nTYk" Content-Disposition: inline In-Reply-To: <4F7B495D.3010402@zonov.org> User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-4.0 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: alc@freebsd.org, freebsd-hackers@freebsd.org Subject: Re: problems with mmap() and disk caching X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Apr 2012 07:18:03 -0000
--MLgImouMc6M0nTYk Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote:
> Hi,
>
> I open the file, then call mmap() on the whole file and get pointer,
> then I work with this pointer. I expect that page should be only once
> touched to get it into the memory (disk cache?), but this doesn't work!
>
> I wrote the test (attached) and ran it for the 1G file generated from
> /dev/random, the result is the following:
>
> Prepare file:
> # swapoff -a
> # newfs /dev/ada0b
> # mount /dev/ada0b /mnt
> # dd if=/dev/random of=/mnt/random-1024 bs=1m count=1024
>
> Purge cache:
> # umount /mnt
> # mount /dev/ada0b /mnt
>
> Run test:
> $ ./mmap /mnt/random-1024 30
> mmap:  1 pass took:   7.431046 (none: 262112; res:     32; super:      0; other:      0)
> mmap:  2 pass took:   7.356670 (none: 261648; res:    496; super:      0; other:      0)
> mmap:  3 pass took:   7.307094 (none: 260521; res:   1623; super:      0; other:      0)
> mmap:  4 pass took:   7.350239 (none: 258904; res:   3240; super:      0; other:      0)
> mmap:  5 pass took:   7.392480 (none: 257286; res:   4858; super:      0; other:      0)
> mmap:  6 pass took:   7.292069 (none: 255584; res:   6560; super:      0; other:      0)
> mmap:  7 pass took:   7.048980 (none: 251142; res:  11002; super:      0; other:      0)
> mmap:  8 pass took:   6.899387 (none: 247584; res:  14560; super:      0; other:      0)
> mmap:  9 pass took:   7.190579 (none: 242992; res:  19152; super:      0; other:      0)
> mmap: 10 pass took:   6.915482 (none: 239308; res:  22836; super:      0; other:      0)
> mmap: 11 pass took:   6.565909 (none: 232835; res:  29309; super:      0; other:      0)
> mmap: 12 pass took:   6.423945 (none: 226160; res:  35984; super:      0; other:      0)
> mmap: 13 pass took:   6.315385 (none: 208555; res:  53589; super:      0; other:      0)
> mmap: 14 pass took:   6.760780 (none: 192805; res:  69339; super:      0; other:      0)
> mmap: 15 pass took:   5.721513 (none: 174497; res:  87647; super:      0; other:      0)
> mmap: 16 pass took:   5.004424 (none: 155938; res: 106206; super:      0; other:      0)
> mmap: 17 pass took:   4.224926 (none: 135639; res: 126505; super:      0; other:      0)
> mmap: 18 pass took:   3.749608 (none: 117952; res: 144192; super:      0; other:      0)
> mmap: 19 pass took:   3.398084 (none:  99066; res: 163078; super:      0; other:      0)
> mmap: 20 pass took:   3.029557 (none:  74994; res:
187150; super:      0; other:      0)
> mmap: 21 pass took:   2.379430 (none:  55231; res: 206913; super:      0; other:      0)
> mmap: 22 pass took:   2.046521 (none:  40786; res: 221358; super:      0; other:      0)
> mmap: 23 pass took:   1.152797 (none:  30311; res: 231833; super:      0; other:      0)
> mmap: 24 pass took:   0.972617 (none:  16196; res: 245948; super:      0; other:      0)
> mmap: 25 pass took:   0.577515 (none:   8286; res: 253858; super:      0; other:      0)
> mmap: 26 pass took:   0.380738 (none:   3712; res: 258432; super:      0; other:      0)
> mmap: 27 pass took:   0.253583 (none:   1193; res: 260951; super:      0; other:      0)
> mmap: 28 pass took:   0.157508 (none:      0; res: 262144; super:      0; other:      0)
> mmap: 29 pass took:   0.156169 (none:      0; res: 262144; super:      0; other:      0)
> mmap: 30 pass took:   0.156550 (none:      0; res: 262144; super:      0; other:      0)
>
> If I ran this:
> $ cat /mnt/random-1024 > /dev/null
> before test, when result is the following:
>
> $ ./mmap /mnt/random-1024 5
> mmap:  1 pass took:   0.337657 (none:      0; res: 262144; super:      0; other:      0)
> mmap:  2 pass took:   0.186137 (none:      0; res: 262144; super:      0; other:      0)
> mmap:  3 pass took:   0.186132 (none:      0; res: 262144; super:      0; other:      0)
> mmap:  4 pass took:   0.186535 (none:      0; res: 262144; super:      0; other:      0)
> mmap:  5 pass took:   0.190353 (none:      0; res: 262144; super:      0; other:      0)
>
> This is what I expect. But why this doesn't work without reading file
> manually?

The issue seems to be in some change of the behaviour of the reserv or phys allocator. I Cc:ed Alan.

What happens is that the fault handler deactivates or caches the pages previous to the one which would satisfy the fault. See the if() statement starting at line 463 of vm/vm_fault.c. Since all pages of the object in your test are clean, the pages are cached.

The next fault would need to allocate some more pages for a different index of the same object.
What I see is that vm_reserv_alloc_page() returns a page that is from the cache for the same object, but a different pindex. As an obvious result, the page is invalidated and repurposed. When the next loop starts, the page is not resident anymore, so it has to be re-read from disk.

The behaviour of the allocator is not consistent, so some pages are not reused, allowing the test to converge and to collect all pages of the object eventually.

Calling madvise(MADV_RANDOM) fixes the issue, because the code to deactivate/cache the pages is turned off. On the other hand, it also turns off read-ahead for faulting, and the first loop becomes eternally long.

Doing MADV_WILLNEED does not fix the problem indeed, since willneed reactivates the pages of the object at the time of the call. To use MADV_WILLNEED, you would need to call it between faults/memcpy.

>
> I've also never seen super pages, how to make them work?

They just work, at least for me. Look at the output of procstat -v after enough loops have finished to not cause disk activity.

>
> I've been playing with madvise and posix_fadvise but no luck. BTW,
> posix_fadvise(POSIX_FADV_WILLNEED) does nothing as the commentary says,
> shouldn't this be documented in the manual page?
>
> All tests were run under 9.0-STABLE (r233744).
>
> --
> Andrey Zonov
> /*_
>  * Andrey Zonov (c) 2011
>  */
>
> #include
> #include
> #include
> #include
> #include
> #include
> #include
> #include
> #include
>
> int
> main(int argc, char **argv)
> {
> 	int i;
> 	int fd;
> 	int num;
> 	int block;
> 	int pagesize;
> 	size_t n;
> 	size_t size;
> 	size_t none, incore, super, other;
> 	char *p;
> 	char *tmp;
> 	char *vec;
> 	char *vecp;
> 	struct stat sb;
> 	struct timeval tp, tp1, tp2;
>
> 	if (argc < 2 || argc > 4)
> 		errx(1, "usage: mmap [num] [block]");
>
> 	fd = open(argv[1], O_RDONLY);
> 	if (fd == -1)
> 		err(1, "open()");
>
> 	num = 1;
> 	if (argc >= 3)
> 		num = atoi(argv[2]);
>
> 	pagesize = getpagesize();
> 	block = pagesize;
> 	if (argc == 4)
> 		block = atoi(argv[3]);
>
> 	if (fstat(fd, &sb) == -1)
> 		err(1, "fstat()");
> 	size = sb.st_size;
>
> #if 0
> 	if (posix_fadvise(fd, (off_t)0, (off_t)0, POSIX_FADV_WILLNEED) == -1)
> 		err(1, "posix_fadvise()");
> #endif
>
> 	p = mmap(NULL, sb.st_size, PROT_READ, /*MAP_PREFAULT_READ |*/ MAP_PRIVATE, fd, (off_t)0);
> 	if (p == MAP_FAILED)
> 		err(1, "mmap()");
>
> #if 0
> 	if (madvise(p, (size_t)size, MADV_WILLNEED) == -1)
> 		err(1, "madvise()");
> #endif
>
> 	tmp = calloc(1, block);
> 	if (tmp == NULL)
> 		err(1, "calloc()");
> 	vec = calloc(1, size / pagesize);
> 	if (vec == NULL)
> 		err(1, "calloc()");
> 	for (i = 0; i < num; i++) {
> 		gettimeofday(&tp1, NULL);
> 		for (n = 0; n < size / block; n++)
> 			memcpy(tmp, p + (n * block), block);
> 		gettimeofday(&tp2, NULL);
> 		timersub(&tp2, &tp1, &tp);
>
> 		if (mincore(p, size, vec) == -1)
> 			err(1, "mincore()");
>
> 		none = incore = super = other = 0;
> 		for (vecp = vec; (size_t)(vecp - vec) < size / pagesize; vecp++) {
> 			if (*vecp == 0)
> 				none++;
> 			else if (*vecp & MINCORE_INCORE)
> 				incore++;
> 			else if (*vecp & MINCORE_SUPER)
> 				super++;
> 			else
> 				other++;
> 		}
> 		warnx("%2d pass took: %3ld.%06ld (none: %6ld; res: %6ld; super: %6ld; other: %6ld)",
> 		    i +
1, tp.tv_sec, tp.tv_usec, none, incore, super, other); > } > free(vec); > free(tmp); >=20 > if (munmap(p, sb.st_size) =3D=3D -1) > err(1, "munmap()"); >=20 > close(fd); >=20 > exit(0); > } > _______________________________________________ > freebsd-hackers@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" --MLgImouMc6M0nTYk Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (FreeBSD) iEYEARECAAYFAk979ZkACgkQC3+MBN1Mb4h70QCfWy5SBFMhoOSu4lsImFUH07ee 5XUAoLqpvJ9l29O1foymHmTDVNSEY4wU =j1j8 -----END PGP SIGNATURE----- --MLgImouMc6M0nTYk-- From owner-freebsd-hackers@FreeBSD.ORG Wed Apr 4 07:58:38 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 1A786106566C; Wed, 4 Apr 2012 07:58:38 +0000 (UTC) (envelope-from utisoft@gmail.com) Received: from mail-bk0-f54.google.com (mail-bk0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id 034CC8FC0C; Wed, 4 Apr 2012 07:58:36 +0000 (UTC) Received: by bkcjc3 with SMTP id jc3so575983bkc.13 for ; Wed, 04 Apr 2012 00:58:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=vjF8YXrYzIlaBR4MEshQrTPB4cCRiUu00Yn8Q6Y8Xmo=; b=PmZFL2tLmhZjf7Heu7NZtDUep7Do27Y6WAxH9Nk0FmUKVrYvbMzbTtGzHrO7LHJbL1 4cjk5M0pPgJlBrefZj72NYI5gyolVVZKxyyFB/td4fb0iA5KVlmZ2t8nGcZCnfkXCaCP /R/ZxsXVquyyvPA7tYGWutplEWTT69M3zeyH7G1wtvLLBwTnja54orVGYOy99ha2V0fq 11Ra9v8/4o7fr3VTtCXSjJLx1j6R9l1t4N9S+8dDfVrHpYDdCgJ6bl4uaGWicr6DqIR7 Ahg2W+5dOzwMja/ZNycMgbahMGV8x9FzqhWyAmCnN1T2r30RhZ+QtmIkoelnka/lRNBw JQxA== MIME-Version: 1.0 Received: by 10.204.154.28 with SMTP id m28mr6620541bkw.102.1333526315763; Wed, 04 Apr 2012 00:58:35 -0700 (PDT) 
Received: by 10.204.202.142 with HTTP; Wed, 4 Apr 2012 00:58:35 -0700 (PDT) Received: by 10.204.202.142 with HTTP; Wed, 4 Apr 2012 00:58:35 -0700 (PDT) In-Reply-To: <4F7BDF06.8000104@freebsd.org> References: <4F775DF5.1020704@rawbw.com> <201204020831.09253.jhb@freebsd.org> <4F79D63E.7010200@rawbw.com> <201204021312.36568.jhb@freebsd.org> <4F7BDF06.8000104@freebsd.org> Date: Wed, 4 Apr 2012 07:58:35 +0000 Message-ID: From: Chris Rees To: Julian Elischer Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-hackers@freebsd.org, hackers@freebsd.org, Yuri Subject: Re: Is there any modern alternative to pstack? X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Apr 2012 07:58:38 -0000 On 4 Apr 2012 06:41, "Julian Elischer" wrote: > > On 4/2/12 10:12 AM, John Baldwin wrote: >> >> On Monday, April 02, 2012 12:39:26 pm Yuri wrote: >>> >>> On 04/02/2012 05:31, John Baldwin wrote: >>>> >>>> Hmm, I don't know if the port has it, but I did some work on pstack a while >>>> ago to make it work with libthread_db so it at least handles i386 ok. It >>>> needs to be modified to use something like libunwind though or some other >>>> unwinder. And possibly it should use libelf instead of its own ELF-parsing >>>> code. >>> >>> I see pstack -1.2_1 failing even on i386: >>> >>> pstack: cannot read context for thread 0x1879f >>> pstack: failed to read more threads >> >> Yes, threads don't work for modern binaries (newer than 4.x) without my changes >> to make it use libthread_db. You can find the patch I used for this at >> http://www.freebsd.org/~jhb/patches/pstack_threads.patch > > > should be in ports? > > I'm on it. 
Chris From owner-freebsd-hackers@FreeBSD.ORG Wed Apr 4 09:36:36 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 2D9C01065678 for ; Wed, 4 Apr 2012 09:36:36 +0000 (UTC) (envelope-from andrey@zonov.org) Received: from mail-bk0-f54.google.com (mail-bk0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id A15348FC1D for ; Wed, 4 Apr 2012 09:36:35 +0000 (UTC) Received: by bkcjc3 with SMTP id jc3so87917bkc.13 for ; Wed, 04 Apr 2012 02:36:34 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:content-type:content-transfer-encoding :x-gm-message-state; bh=xyefyInvH+zOqwd0cIEs9AvRPyOh45IEw5eg6ueLN18=; b=TudNV/y/JQoTRO0Uzuqzv1H92zcYlskHwZS0mICy1OertRChSPVzFEmXwMJWpsNqpT c3j8U3H0KBntk1yyme0BXnKhNokEQTbNVTtCtSO7kH5dgFjAkWeouxCrREpZXUKY25eT NiewupJVKzCv0QFu+e/8wk5VtRiovW8Rx8+yruCZJ8nfOfAFTa0NEBtyKBh7zntLRZv6 3n3hGL/pzpltos2fn3GPb/5x8cf0LyLciQPVV3q3B5+O9IWr9/HD0x/+lhlF5O81N9v4 l2n750bT06567t0ztcC20xu2r9KRsOyFBvJ9ou1AifAqNRCizjsEVaA230MyNnUb6u0I EE7g== Received: by 10.205.132.73 with SMTP id ht9mr7266688bkc.46.1333532194398; Wed, 04 Apr 2012 02:36:34 -0700 (PDT) Received: from [10.254.254.77] (ppp95-165-133-149.pppoe.spdop.ru. 
[95.165.133.149]) by mx.google.com with ESMTPS id u16sm332189bkf.10.2012.04.04.02.36.33 (version=SSLv3 cipher=OTHER); Wed, 04 Apr 2012 02:36:34 -0700 (PDT) Message-ID: <4F7C1620.6040703@zonov.org> Date: Wed, 04 Apr 2012 13:36:32 +0400 From: Andrey Zonov User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; ru; rv:1.8.1.24) Gecko/20100228 Thunderbird/2.0.0.24 Mnenhy/0.7.6.0 MIME-Version: 1.0 To: Konstantin Belousov References: <4F7B495D.3010402@zonov.org> <20120404071746.GJ2358@deviant.kiev.zoral.com.ua> In-Reply-To: <20120404071746.GJ2358@deviant.kiev.zoral.com.ua> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Gm-Message-State: ALoCoQnBpWyu5V1CH6gP3FT3PI+AuvplEfq1FHXQM9NKAJrQaQ+2ZRPP81gkCgrn5EtfHN6ZhunN Cc: alc@freebsd.org, freebsd-hackers@freebsd.org Subject: Re: problems with mmap() and disk caching X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Apr 2012 09:36:36 -0000 On 04.04.2012 11:17, Konstantin Belousov wrote: > > Calling madvise(MADV_RANDOM) fixes the issue, because the code to > deactivate/cache the pages is turned off. On the other hand, it also > turns of read-ahead for faulting, and the first loop becomes eternally > long. Now it takes 5 times longer. Anyway, thanks for explanation. > > Doing MADV_WILLNEED does not fix the problem indeed, since willneed > reactivates the pages of the object at the time of call. To use > MADV_WILLNEED, you would need to call it between faults/memcpy. > I played with it, but no luck so far. >> >> I've also never seen super pages, how to make them work? > They just work, at least for me. Look at the output of procstat -v > after enough loops finished to not cause disk activity. > The problem was in my test program. I fixed it, now I see super pages but I'm still not satisfied. 
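The suggestion quoted above, calling MADV_WILLNEED between faults/memcpy rather than once up front, can be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not the test program from this thread: the helper names copy_with_willneed() and count_resident() are hypothetical, "block" is assumed to be a non-zero multiple of the page size, and bit 0 of each mincore() vector entry is taken to mean "resident" (that is MINCORE_INCORE on FreeBSD).

```c
#define _DEFAULT_SOURCE	/* glibc: expose madvise()/mincore(); harmless on FreeBSD */
#include <sys/types.h>
#include <sys/mman.h>
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Copy a mapping block by block, hinting each block with MADV_WILLNEED
 * right before it is touched. Returns 0 on success, -1 if madvise() fails.
 * Hypothetical helper; block must be a non-zero multiple of the page size
 * so that every madvise() start address stays page-aligned.
 */
static int
copy_with_willneed(const char *map, size_t size, char *tmp, size_t block)
{
	size_t off, len;

	assert(block > 0);
	for (off = 0; off < size; off += block) {
		/* The last block may be shorter than the others. */
		len = size - off < block ? size - off : block;
		if (madvise((void *)(map + off), len, MADV_WILLNEED) == -1)
			return (-1);
		memcpy(tmp, map + off, len);
	}
	return (0);
}

/*
 * Count the resident pages of a mapping with mincore().
 * Returns the number of resident pages, or -1 on failure.
 */
static long
count_resident(const char *map, size_t size)
{
	size_t i, npages, pagesize;
	long resident = 0;
	char *vec;

	pagesize = (size_t)getpagesize();
	npages = (size + pagesize - 1) / pagesize;
	vec = calloc(npages, 1);
	if (vec == NULL)
		return (-1);
	/* The vector type differs between systems, hence the void * cast. */
	if (mincore((void *)map, size, (void *)vec) == -1) {
		free(vec);
		return (-1);
	}
	for (i = 0; i < npages; i++)
		if (vec[i] & 1)		/* bit 0: page is in core */
			resident++;
	free(vec);
	return (resident);
}
```

A caller would mmap() the file read-only, run copy_with_willneed() once per pass, and then compare count_resident() against size / pagesize. Whether the hint changes super page promotion is exactly what the tests in this thread try to find out, so the sketch makes no claim about that.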
There are several tests below:

1. With madvise(MADV_RANDOM) I see almost all super pages:
$ ./mmap /mnt/random-1024 5
mmap: 1 pass took: 26.438535 (none: 0; res: 262144; super: 511; other: 0)
mmap: 2 pass took: 0.187311 (none: 0; res: 262144; super: 511; other: 0)
mmap: 3 pass took: 0.184953 (none: 0; res: 262144; super: 511; other: 0)
mmap: 4 pass took: 0.186007 (none: 0; res: 262144; super: 511; other: 0)
mmap: 5 pass took: 0.185790 (none: 0; res: 262144; super: 511; other: 0)

Should it be 512?

2. Without madvise(MADV_RANDOM):
$ ./mmap /mnt/random-1024 50
mmap: 1 pass took: 7.629745 (none: 262112; res: 32; super: 0; other: 0)
mmap: 2 pass took: 7.301720 (none: 261202; res: 942; super: 0; other: 0)
mmap: 3 pass took: 7.261416 (none: 260226; res: 1918; super: 1; other: 0)
[skip]
mmap: 49 pass took: 0.155368 (none: 0; res: 262144; super: 323; other: 0)
mmap: 50 pass took: 0.155438 (none: 0; res: 262144; super: 323; other: 0)

Only 323 pages.

3. If I just re-run test I don't see super pages with any size of "block".

$ ./mmap /mnt/random-1024 5 $((1<<30))
mmap: 1 pass took: 1.013939 (none: 0; res: 262144; super: 0; other: 0)
mmap: 2 pass took: 0.267082 (none: 0; res: 262144; super: 0; other: 0)
mmap: 3 pass took: 0.270711 (none: 0; res: 262144; super: 0; other: 0)
mmap: 4 pass took: 0.268940 (none: 0; res: 262144; super: 0; other: 0)
mmap: 5 pass took: 0.269634 (none: 0; res: 262144; super: 0; other: 0)

4. If I activate madvise(MADV_WILLNEED) in the copy loop and re-run test
then I see super pages only if I use "block" greater than 2Mb.
$ ./mmap /mnt/random-1024 1 $((1<<21))
mmap: 1 pass took: 0.299722 (none: 0; res: 262144; super: 0; other: 0)
$ ./mmap /mnt/random-1024 1 $((1<<22))
mmap: 1 pass took: 0.271828 (none: 0; res: 262144; super: 170; other: 0)
$ ./mmap /mnt/random-1024 1 $((1<<23))
mmap: 1 pass took: 0.333188 (none: 0; res: 262144; super: 258; other: 0)
$ ./mmap /mnt/random-1024 1 $((1<<24))
mmap: 1 pass took: 0.339250 (none: 0; res: 262144; super: 303; other: 0)
$ ./mmap /mnt/random-1024 1 $((1<<25))
mmap: 1 pass took: 0.418812 (none: 0; res: 262144; super: 324; other: 0)
$ ./mmap /mnt/random-1024 1 $((1<<26))
mmap: 1 pass took: 0.360892 (none: 0; res: 262144; super: 335; other: 0)
$ ./mmap /mnt/random-1024 1 $((1<<27))
mmap: 1 pass took: 0.401122 (none: 0; res: 262144; super: 342; other: 0)
$ ./mmap /mnt/random-1024 1 $((1<<28))
mmap: 1 pass took: 0.478764 (none: 0; res: 262144; super: 345; other: 0)
$ ./mmap /mnt/random-1024 1 $((1<<29))
mmap: 1 pass took: 0.607266 (none: 0; res: 262144; super: 346; other: 0)
$ ./mmap /mnt/random-1024 1 $((1<<30))
mmap: 1 pass took: 0.901269 (none: 0; res: 262144; super: 347; other: 0)

5. If I activate madvise(MADV_WILLNEED) immediately after mmap() then I
see some number of super pages (the number from test #2).

$ ./mmap /mnt/random-1024 5
mmap: 1 pass took: 0.178666 (none: 0; res: 262144; super: 323; other: 0)
mmap: 2 pass took: 0.158889 (none: 0; res: 262144; super: 323; other: 0)
mmap: 3 pass took: 0.157229 (none: 0; res: 262144; super: 323; other: 0)
mmap: 4 pass took: 0.156895 (none: 0; res: 262144; super: 323; other: 0)
mmap: 5 pass took: 0.162938 (none: 0; res: 262144; super: 323; other: 0)

6. If I read file manually before test then I don't see super pages with
any size of "block" and madvise(MADV_WILLNEED) doesn't help.
$ ./mmap /mnt/random-1024 5 $((1<<30)) mmap: 1 pass took: 0.996767 (none: 0; res: 262144; super: 0; other: 0) mmap: 2 pass took: 0.311129 (none: 0; res: 262144; super: 0; other: 0) mmap: 3 pass took: 0.317430 (none: 0; res: 262144; super: 0; other: 0) mmap: 4 pass took: 0.314437 (none: 0; res: 262144; super: 0; other: 0) mmap: 5 pass took: 0.310757 (none: 0; res: 262144; super: 0; other: 0) -- Andrey Zonov From owner-freebsd-hackers@FreeBSD.ORG Wed Apr 4 09:39:12 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7A999106566C for ; Wed, 4 Apr 2012 09:39:12 +0000 (UTC) (envelope-from andrey@zonov.org) Received: from mail-bk0-f54.google.com (mail-bk0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id F134E8FC14 for ; Wed, 4 Apr 2012 09:39:11 +0000 (UTC) Received: by bkcjc3 with SMTP id jc3so90324bkc.13 for ; Wed, 04 Apr 2012 02:39:11 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:content-type:x-gm-message-state; bh=zapY+D4KfQQO6wDhcZ+6NnjhjHevdLqOp+q3/Ktxnvw=; b=mK3SKkArtA9bs2S2XcHW8k8kFx8gd60L0ZPVbuvfscGeYPjbcnJk5F+lH+969DrC8/ 0GQw0gXf4BvXlgG0AUbQOYQHxplVAN+CdfT8DT5DWck8MoKf79Ik0ErHtHX0+ni+944A X70YTNaKDnqpm4NcU3nethwjAobGsx7W6h10QYFM7gkZL8SiKsIb8pjtCKmitDS3LaVS tETzVwj20gmTYpoPUV9Xbz2f6JaX0MSVeicX/JqaRjy/zKLcy3gqLKRAbaJEQBUUIHzC FT4NiIb9Det3ne7ll5FC3QDlhmQndvxvqnMejK9Zd/Tt5nsGB9K3fgi6/UO8Dz1tC7L+ oxgg== Received: by 10.204.154.2 with SMTP id m2mr6844419bkw.110.1333532350975; Wed, 04 Apr 2012 02:39:10 -0700 (PDT) Received: from [10.254.254.77] (ppp95-165-133-149.pppoe.spdop.ru. 
[95.165.133.149]) by mx.google.com with ESMTPS id f5sm354836bke.9.2012.04.04.02.39.09 (version=SSLv3 cipher=OTHER); Wed, 04 Apr 2012 02:39:10 -0700 (PDT) Message-ID: <4F7C16BD.3010703@zonov.org> Date: Wed, 04 Apr 2012 13:39:09 +0400 From: Andrey Zonov User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; ru; rv:1.8.1.24) Gecko/20100228 Thunderbird/2.0.0.24 Mnenhy/0.7.6.0 MIME-Version: 1.0 To: Konstantin Belousov References: <4F7B495D.3010402@zonov.org> <20120404071746.GJ2358@deviant.kiev.zoral.com.ua> <4F7C1620.6040703@zonov.org> In-Reply-To: <4F7C1620.6040703@zonov.org> Content-Type: multipart/mixed; boundary="------------010408020308040305090701" X-Gm-Message-State: ALoCoQljr0wRIl+eTWz3kkmNrqB9zPdkI57zwP+zqjNsyXykv6yyQku56j1UtSnhaMYHympsUe3N Cc: alc@freebsd.org, freebsd-hackers@freebsd.org Subject: Re: problems with mmap() and disk caching X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Apr 2012 09:39:12 -0000 This is a multi-part message in MIME format. --------------010408020308040305090701 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit I forgot to attach my test program. On 04.04.2012 13:36, Andrey Zonov wrote: > On 04.04.2012 11:17, Konstantin Belousov wrote: >> >> Calling madvise(MADV_RANDOM) fixes the issue, because the code to >> deactivate/cache the pages is turned off. On the other hand, it also >> turns of read-ahead for faulting, and the first loop becomes eternally >> long. > > Now it takes 5 times longer. Anyway, thanks for explanation. > >> >> Doing MADV_WILLNEED does not fix the problem indeed, since willneed >> reactivates the pages of the object at the time of call. To use >> MADV_WILLNEED, you would need to call it between faults/memcpy. >> > > I played with it, but no luck so far. 
>
>>>
>>> I've also never seen super pages, how to make them work?
>> They just work, at least for me. Look at the output of procstat -v
>> after enough loops finished to not cause disk activity.
>>
>
> The problem was in my test program. I fixed it, now I see super pages
> but I'm still not satisfied. There are several tests below:
>
> 1. With madvise(MADV_RANDOM) I see almost all super pages:
> $ ./mmap /mnt/random-1024 5
> mmap: 1 pass took: 26.438535 (none: 0; res: 262144; super: 511; other: 0)
> mmap: 2 pass took: 0.187311 (none: 0; res: 262144; super: 511; other: 0)
> mmap: 3 pass took: 0.184953 (none: 0; res: 262144; super: 511; other: 0)
> mmap: 4 pass took: 0.186007 (none: 0; res: 262144; super: 511; other: 0)
> mmap: 5 pass took: 0.185790 (none: 0; res: 262144; super: 511; other: 0)
>
> Should it be 512?
>
> 2. Without madvise(MADV_RANDOM):
> $ ./mmap /mnt/random-1024 50
> mmap: 1 pass took: 7.629745 (none: 262112; res: 32; super: 0; other: 0)
> mmap: 2 pass took: 7.301720 (none: 261202; res: 942; super: 0; other: 0)
> mmap: 3 pass took: 7.261416 (none: 260226; res: 1918; super: 1; other: 0)
> [skip]
> mmap: 49 pass took: 0.155368 (none: 0; res: 262144; super: 323; other: 0)
> mmap: 50 pass took: 0.155438 (none: 0; res: 262144; super: 323; other: 0)
>
> Only 323 pages.
>
> 3. If I just re-run test I don't see super pages with any size of "block".
>
> $ ./mmap /mnt/random-1024 5 $((1<<30))
> mmap: 1 pass took: 1.013939 (none: 0; res: 262144; super: 0; other: 0)
> mmap: 2 pass took: 0.267082 (none: 0; res: 262144; super: 0; other: 0)
> mmap: 3 pass took: 0.270711 (none: 0; res: 262144; super: 0; other: 0)
> mmap: 4 pass took: 0.268940 (none: 0; res: 262144; super: 0; other: 0)
> mmap: 5 pass took: 0.269634 (none: 0; res: 262144; super: 0; other: 0)
>
> 4. If I activate madvise(MADV_WILLNEED) in the copy loop and re-run test
> then I see super pages only if I use "block" greater than 2Mb.
>
> $ ./mmap /mnt/random-1024 1 $((1<<21))
> mmap: 1 pass took: 0.299722 (none: 0; res: 262144; super: 0; other: 0)
> $ ./mmap /mnt/random-1024 1 $((1<<22))
> mmap: 1 pass took: 0.271828 (none: 0; res: 262144; super: 170; other: 0)
> $ ./mmap /mnt/random-1024 1 $((1<<23))
> mmap: 1 pass took: 0.333188 (none: 0; res: 262144; super: 258; other: 0)
> $ ./mmap /mnt/random-1024 1 $((1<<24))
> mmap: 1 pass took: 0.339250 (none: 0; res: 262144; super: 303; other: 0)
> $ ./mmap /mnt/random-1024 1 $((1<<25))
> mmap: 1 pass took: 0.418812 (none: 0; res: 262144; super: 324; other: 0)
> $ ./mmap /mnt/random-1024 1 $((1<<26))
> mmap: 1 pass took: 0.360892 (none: 0; res: 262144; super: 335; other: 0)
> $ ./mmap /mnt/random-1024 1 $((1<<27))
> mmap: 1 pass took: 0.401122 (none: 0; res: 262144; super: 342; other: 0)
> $ ./mmap /mnt/random-1024 1 $((1<<28))
> mmap: 1 pass took: 0.478764 (none: 0; res: 262144; super: 345; other: 0)
> $ ./mmap /mnt/random-1024 1 $((1<<29))
> mmap: 1 pass took: 0.607266 (none: 0; res: 262144; super: 346; other: 0)
> $ ./mmap /mnt/random-1024 1 $((1<<30))
> mmap: 1 pass took: 0.901269 (none: 0; res: 262144; super: 347; other: 0)
>
> 5. If I activate madvise(MADV_WILLNEED) immediately after mmap() then I
> see some number of super pages (the number from test #2).
>
> $ ./mmap /mnt/random-1024 5
> mmap: 1 pass took: 0.178666 (none: 0; res: 262144; super: 323; other: 0)
> mmap: 2 pass took: 0.158889 (none: 0; res: 262144; super: 323; other: 0)
> mmap: 3 pass took: 0.157229 (none: 0; res: 262144; super: 323; other: 0)
> mmap: 4 pass took: 0.156895 (none: 0; res: 262144; super: 323; other: 0)
> mmap: 5 pass took: 0.162938 (none: 0; res: 262144; super: 323; other: 0)
>
> 6. If I read file manually before test then I don't see super pages with
> any size of "block" and madvise(MADV_WILLNEED) doesn't help.
>
> $ ./mmap /mnt/random-1024 5 $((1<<30))
> mmap: 1 pass took: 0.996767 (none: 0; res: 262144; super: 0; other: 0)
> mmap: 2 pass took: 0.311129 (none: 0; res: 262144; super: 0; other: 0)
> mmap: 3 pass took: 0.317430 (none: 0; res: 262144; super: 0; other: 0)
> mmap: 4 pass took: 0.314437 (none: 0; res: 262144; super: 0; other: 0)
> mmap: 5 pass took: 0.310757 (none: 0; res: 262144; super: 0; other: 0)
>
>

--
Andrey Zonov

--------------010408020308040305090701
Content-Type: text/plain; charset=windows-1251; name="mmap.c"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="mmap.c"

/*_
 * Andrey Zonov (c) 2011
 */

#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <err.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
	int i;
	int fd;
	int num;
	int block;
	int pagesize;
	size_t size;
	size_t none, incore, super, other;
	char *ptr;
	char *ptrp;
	char *tmp;
	char *vec;
	char *vecp;
	struct stat sb;
	struct timeval tp, tp1, tp2;

	if (argc < 2 || argc > 4)
		errx(1, "usage: mmap [num] [block]");

	fd = open(argv[1], O_RDONLY);
	if (fd == -1)
		err(1, "open()");

	num = 1;
	if (argc >= 3)
		num = atoi(argv[2]);

	pagesize = getpagesize();
	block = pagesize;
	if (argc == 4)
		block = atoi(argv[3]);

	if (fstat(fd, &sb) == -1)
		err(1, "fstat()");
	size = sb.st_size;

#if 0
	if (posix_fadvise(fd, (off_t)0, (off_t)0, POSIX_FADV_WILLNEED) == -1)
		err(1, "posix_fadvise()");
#endif

	ptr = mmap(NULL, size, PROT_READ, /*MAP_PREFAULT_READ |*/ MAP_PRIVATE, fd, (off_t)0);
	if (ptr == MAP_FAILED)
		err(1, "mmap()");

#if 0
	if (madvise(ptr, size, MADV_RANDOM) == -1)
		err(1, "madvise()");
#endif

#if 0
	/* Turn on super pages */
	if (madvise(ptr, size, MADV_WILLNEED) == -1)
		err(1, "madvise()");
#endif

	tmp = calloc(1, block);
	if (tmp == NULL)
		err(1, "calloc()");
	vec = calloc(1, size / pagesize);
	if (vec == NULL)
		err(1, "calloc()");
	for (i = 0; i < num; i++) {
		gettimeofday(&tp1, NULL);
		for (ptrp = ptr; (size_t)(ptrp - ptr) < size; ptrp += block) {
#if 0
			if (madvise(ptrp, block, MADV_WILLNEED) == -1)
err(1, "madvise()"); #endif memcpy(tmp, ptrp, block); } gettimeofday(&tp2, NULL); timersub(&tp2, &tp1, &tp); if (mincore(ptr, size, vec) == -1) err(1, "mincore()"); none = incore = super = other = 0; for (vecp = vec; (size_t)(vecp - vec) < size / pagesize; vecp++) { if (*vecp == 0) none++; else if (*vecp & MINCORE_INCORE) incore++; else other++; if (*vecp & MINCORE_SUPER) super++; } warnx("%2d pass took: %3ld.%06ld (none: %6ld; res: %6ld; super: %6ld; other: %6ld)", i + 1, tp.tv_sec, tp.tv_usec, none, incore, super / (2048/4) /* 2Mb / 4Kb */, other); } free(vec); free(tmp); if (munmap(ptr, size) == -1) err(1, "munmap()"); close(fd); exit(0); } --------------010408020308040305090701-- From owner-freebsd-hackers@FreeBSD.ORG Wed Apr 4 12:45:03 2012 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 01D091065672 for ; Wed, 4 Apr 2012 12:45:02 +0000 (UTC) (envelope-from lists@eitanadler.com) Received: from mail-wg0-f50.google.com (mail-wg0-f50.google.com [74.125.82.50]) by mx1.freebsd.org (Postfix) with ESMTP id 776938FC17 for ; Wed, 4 Apr 2012 12:45:02 +0000 (UTC) Received: by wgbds12 with SMTP id ds12so224150wgb.31 for ; Wed, 04 Apr 2012 05:45:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=eitanadler.com; s=0xdeadbeef; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type; bh=Tl/Igfty+mpAMsgHWtijCCYlp8XjXwDrb74aEB9/Dc4=; b=sIgAmYBJvQgNkjMsK9K+fsuOKFAk9z2Kwc+YHMv5gRqPP0lil5uwwObHaVnh+T76mQ Wlqx/Zwm1aQbUhapRrr2z9hJ5Dm9ycOV5eZHgmBMET94uiDcxo9pCP6q0RSfxsapqnhr xEnDIKCpdGgXW7LFHXlUpTB2rAvrGGpfW0e6k= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type:x-gm-message-state; bh=Tl/Igfty+mpAMsgHWtijCCYlp8XjXwDrb74aEB9/Dc4=; b=WVjHZM1Jd6kflLEYHPQuDX4aaAwpdlYyqp9j4N8n34h4SOKKAzGMAAv/JBHr5Trei8 
oMCjCHjBlo9UQdpBXf/SU0Q7X6lCkJn/Tk8R+twaquOkJzoDGsaKD+xdsCzCcpJCM/Z6 32ja5MC8s7CF0NZPu3HWLXHLdwbZD7xVlckkNch+Z+ylBtLB21xe+y7bBdfmg8bdRarx 7K+o6AQCW/vTdYtZAxfvDpnWPh9om/MfYFjFkIeZWRUJysL9KBtn46yaLYVoeuFdtPTW QSkKNFYqkXxIVLUfRuvLTgEbo+6nUPIpKvfLVDf+BGdJipagVdyCaFZNSk5KDlUMGaor rzPw== Received: by 10.180.83.72 with SMTP id o8mr5063729wiy.5.1333543501608; Wed, 04 Apr 2012 05:45:01 -0700 (PDT) MIME-Version: 1.0 Received: by 10.223.63.4 with HTTP; Wed, 4 Apr 2012 05:44:31 -0700 (PDT) In-Reply-To: <4F7BDF06.8000104@freebsd.org> References: <4F775DF5.1020704@rawbw.com> <201204020831.09253.jhb@freebsd.org> <4F79D63E.7010200@rawbw.com> <201204021312.36568.jhb@freebsd.org> <4F7BDF06.8000104@freebsd.org> From: Eitan Adler Date: Wed, 4 Apr 2012 08:44:31 -0400 Message-ID: To: Julian Elischer Content-Type: text/plain; charset=UTF-8 X-Gm-Message-State: ALoCoQnjORtbaKaHrVzdmDuI45DMfxX1EM7+blLWD4p0GJ0EKn5iMgv6yrAoi/Hf/pH+fAcZOxHs Cc: Yuri , hackers@freebsd.org, freebsd-hackers@freebsd.org Subject: Re: Is there any modern alternative to pstack? X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Apr 2012 12:45:03 -0000 On 4 April 2012 01:41, Julian Elischer wrote: > should be in ports? Not unless someone decides to become the new upstream and make a release. We do not maintain software in ports. 
-- Eitan Adler From owner-freebsd-hackers@FreeBSD.ORG Wed Apr 4 17:06:46 2012 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 15A0F106564A; Wed, 4 Apr 2012 17:06:45 +0000 (UTC) (envelope-from yuri@rawbw.com) Received: from shell0.rawbw.com (shell0.rawbw.com [198.144.192.45]) by mx1.freebsd.org (Postfix) with ESMTP id 9BBBE8FC14; Wed, 4 Apr 2012 17:06:45 +0000 (UTC) Received: from eagle.yuri.org (stunnel@localhost [127.0.0.1]) (authenticated bits=0) by shell0.rawbw.com (8.14.4/8.14.4) with ESMTP id q34H6hdg092523; Wed, 4 Apr 2012 10:06:43 -0700 (PDT) (envelope-from yuri@rawbw.com) Message-ID: <4F7C7FA2.5020904@rawbw.com> Date: Wed, 04 Apr 2012 10:06:42 -0700 From: Yuri User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:10.0.3) Gecko/20120316 Thunderbird/10.0.3 MIME-Version: 1.0 To: Eitan Adler References: <4F775DF5.1020704@rawbw.com> <201204020831.09253.jhb@freebsd.org> <4F79D63E.7010200@rawbw.com> <201204021312.36568.jhb@freebsd.org> <4F7BDF06.8000104@freebsd.org> In-Reply-To: Content-Type: text/plain; charset=UTF-8;
format=flowed Content-Transfer-Encoding: 7bit Cc: hackers@freebsd.org Subject: Re: Is there any modern alternative to pstack? X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Apr 2012 17:06:46 -0000 On 04/04/2012 05:44, Eitan Adler wrote: > Not unless someone decides to become the new upstream and make a > release. We do not maintain software in ports. > > -- Eitan Adler But upstream is the sourceforge. Even though there is no activity there for a long while, it is easy to join that project, commit the change and make a release. It's better than to keep private patches. Yuri From owner-freebsd-hackers@FreeBSD.ORG Wed Apr 4 17:15:26 2012 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 1EF3D10656B4 for ; Wed, 4 Apr 2012 17:15:26 +0000 (UTC) (envelope-from jhellenthal@dataix.net) Received: from mail-iy0-f182.google.com (mail-iy0-f182.google.com [209.85.210.182]) by mx1.freebsd.org (Postfix) with ESMTP id C06368FC14 for ; Wed, 4 Apr 2012 17:15:25 +0000 (UTC) Received: by iahk25 with SMTP id k25so798690iah.13 for ; Wed, 04 Apr 2012 10:15:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=dataix.net; s=rsa; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to; bh=njHqaVxEWhhhAirrAb3yy2BcyCsmefOvdzFjS69veKg=; b=fhsctID2niAj72EG7krshy5MpDZkvNoRjpkF8s1E2pui0UVNNKngl+aOUHKoatRBMM IFhYjbMVn7xhsRdbnC9EaTNNJAmYNrshEDQpq+ZrAq84N/3S90oGV3kox8JaZuocpYQg d2i8cqhY8lrElssZwnNTQb+wXjRXZsUQJyMoo= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:x-gm-message-state; 
bh=njHqaVxEWhhhAirrAb3yy2BcyCsmefOvdzFjS69veKg=; b=ARzFWsnVdbfdcv6+ab+xFbJZjD6M6QP6zyzwS/YBCbXjJ4VSVrK/wRZWo2mVNkrJnS qloZvtfxNedMhuj1Ct612GPzKq14k/mO2N2oiElZkWX2zG81/HxmXlmFL9xsrAAOCLrJ jl7RZbx97rfso4TigBJVndTI9h1u3oLE92j1vfjTKv7arEPghSEANRaiPatd/vZnF4Yw JxbEbG9jZ5fuZHSwP4BUJjhZMG09MbW1r27q0Xkb6nrsR1uvYpsMgCYfVQv22ajKRpbs NoTWArNlshLbMh3VD7rK7a9xC9HxBOw8EErMUARfWiIWHGimWWAxfG7PdAmIFD4Bgamm vPxg== Received: by 10.50.140.101 with SMTP id rf5mr2444227igb.27.1333559725112; Wed, 04 Apr 2012 10:15:25 -0700 (PDT) Received: from DataIX.net (adsl-99-109-124-46.dsl.klmzmi.sbcglobal.net. [99.109.124.46]) by mx.google.com with ESMTPS id gf4sm3425039igb.14.2012.04.04.10.15.22 (version=TLSv1/SSLv3 cipher=OTHER); Wed, 04 Apr 2012 10:15:24 -0700 (PDT) Received: from DataIX.net (localhost [127.0.0.1]) by DataIX.net (8.14.5/8.14.5) with ESMTP id q34HFJfU097846 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 4 Apr 2012 13:15:19 -0400 (EDT) (envelope-from jhellenthal@DataIX.net) Received: (from jhellenthal@localhost) by DataIX.net (8.14.5/8.14.5/Submit) id q34HFIjC097726; Wed, 4 Apr 2012 13:15:18 -0400 (EDT) (envelope-from jhellenthal@DataIX.net) Date: Wed, 4 Apr 2012 13:15:17 -0400 From: Jason Hellenthal To: Eitan Adler Message-ID: <20120404171517.GA1886@DataIX.net> References: <4F775DF5.1020704@rawbw.com> <201204020831.09253.jhb@freebsd.org> <4F79D63E.7010200@rawbw.com> <201204021312.36568.jhb@freebsd.org> <4F7BDF06.8000104@freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Gm-Message-State: ALoCoQlHbrrl52AK9Fw4Syn3ITeTE5eLNO7vqhy2vo51rf2a4PpEEXomqgA3awYJE3glJ5kAWqxj Cc: Yuri , hackers@freebsd.org, freebsd-hackers@freebsd.org Subject: Re: Is there any modern alternative to pstack? 
X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Apr 2012 17:15:26 -0000 There are plenty of patches in the ports tree. At which point do you call it maintaining within the ports tree ? 8 files changed, 796 insertions(+), 233 deletions(-) Is hardly what someone should call maintaining considering the size of some of the other patches. And besides someone was willing to contribute the patch... no sense in degrading their work if they were willing to put it up for consumption. On Wed, Apr 04, 2012 at 08:44:31AM -0400, Eitan Adler wrote: > On 4 April 2012 01:41, Julian Elischer wrote: > > should be in ports? > > Not unless someone decides to become the new upstream and make a > release. We do not maintain software in ports. > > -- > Eitan Adler > _______________________________________________ > freebsd-hackers@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" -- ;s =;
From owner-freebsd-hackers@FreeBSD.ORG Wed Apr 4 18:28:54 2012 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 88B79106564A; Wed, 4 Apr 2012 18:28:54 +0000 (UTC) (envelope-from onwahe@gmail.com) Received: from mail-gy0-f182.google.com (mail-gy0-f182.google.com [209.85.160.182]) by mx1.freebsd.org (Postfix) with ESMTP id 29DA98FC12; Wed, 4 Apr 2012 18:28:54 +0000 (UTC) Received: by ghrr20 with SMTP id r20so445007ghr.13 for ; Wed, 04 Apr 2012 11:28:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=7I1kzmLGO0BX7bZR0im3BxRkomiWI+i7kFtjZYY//sM=; b=uUGEk6eRMhQFSWrOwUgHToqWHktN/dQcVOODmx5oiF+Pq3QXiYJ2IVKVXI8hIvD+SG EWKv3ug+u6pWaBtY+tLV9mmRC/rL3CXmRKVPqfbxyySThaAxcrhLeal+LqY2YI9y8Yst o2YQ1xySQSLKhlTgPVn/hhMesPdn/L367uOo6dRHkmfXjFS/KPyv7YfjWze5+m/bY25+ 2yDfgsTosJjK9HaW1hKy1AL3qcpYCrfcVH87FD8Jb5pDkRcs8yH/joTw4B8j9669Qn9f k8A1z9/Vt/6WoA2FYFHP2PszmN9yQZ8MHn/obj2jHICpxOvojlP9C8N/2YZFKYKDzTSf AKuA== MIME-Version: 1.0 Received: by 10.236.161.73 with SMTP id v49mr15595150yhk.89.1333564133697; Wed, 04 Apr 2012 11:28:53 -0700 (PDT) Received: by 10.236.37.195 with HTTP; Wed, 4 Apr 2012 11:28:53 -0700 (PDT) In-Reply-To: References: <20120312181921.GF75778@deviant.kiev.zoral.com.ua> <20120315112959.GP75778@deviant.kiev.zoral.com.ua> Date:
Wed, 4 Apr 2012 20:28:53 +0200 Message-ID: From: Svatopluk Kraus To: Adrian Chadd Content-Type: text/plain; charset=ISO-8859-1 Cc: Konstantin Belousov , hackers@freebsd.org Subject: Re: [vfs] buf_daemon() slows down write() severely on low-speed CPU X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Apr 2012 18:28:54 -0000

On Wed, Mar 21, 2012 at 5:55 AM, Adrian Chadd wrote:
> Hi,
>
> I'm interested in this, primarily because I'm tinkering with file
> storage stuff on my little (most wifi targetted) embedded MIPS
> platforms.
>
> So what's the story here? How can I reproduce your issue and do some
> of my own profiling/investigation?
>
>
> Adrian

Hi,

your interest has prompted me to do a more solid, comparable investigation on my embedded ELAN486 platform. With more test results in hand, I fully traced the related VFS, filesystem, and disk function calls. It took some time to understand what the issue really is about.

My test case: a single file copy (no O_FSYNC), so no other filesystem operation is served. The file must be big enough with respect to the hidirtybuffers value. Other processes on the test machine were almost inactive. The real copy time was profiled. In every test, the machine was booted, the file was copied, the file was removed, and the machine was rebooted. Thus, the file was always copied onto the same disk layout.

The motivation is that my embedded machines mostly do not write to disk at all. Only during a software update does a single process write to the disk (file by file). That need not be a problem at all, but an update must succeed even under full CPU load. So, writing should be tuned so that it does not affect other processes too much and finishes in finite time.

On my embedded ELAN486 machines, a flash memory is used as a disk.
It means that reading is very fast, but writing is slow. Further, a flash memory is divided into sectors, and only a complete sector can be erased at once. A sector erasure is a very time-expensive action.

When I tried to tune VFS by changing various parameters, I found out that the real copy time depends on two things. Both of them concern bufdaemon, namely its feature of trying to work harder when its buffer flushing mission is failing. It's no surprise that the best copy times were achieved when bufdaemon was excluded from buffer flushing altogether (by setting VFS parameters). This bufdaemon feature brings along (with respect to the real copy time):

1. bufdaemon runtime itself,
2. very frequent filesystem buffer flushing.

What really happens in the test case on my machine:

A copy program uses a buffer for copying. The default buffer size is 128 KiB in my case. The simplified sys_write() implementation for the DTYPE_VNODE and VREG type is the following:

sys_write() {
    dofilewrite() {
        bwillwrite()
        fo_write() => vn_write() {
            bwillwrite()
            vn_lock()
            VOP_WRITE()
            VOP_UNLOCK()
        }
    }
}

So, all 128 KiB is written under the VNODE lock. When I take the machine defaults:

hidirtybuffers: 134
lodirtybuffers: 67
dirtybufthresh: 120
buffer size (filesystem block size): 512 bytes

and do some simple calculations:

134 * 512 = 68608 -> high water byte count
120 * 512 = 61440
 67 * 512 = 34304 -> low water byte count

then it's obvious that bufdaemon has something to do during each sys_write(). However, almost all dirty buffers belong to the new file's VNODE, and that VNODE is locked. What remains are filesystem buffers only, i.e., the superblock buffer and the free block bitmap buffers. So, bufdaemon iterates over the whole dirty buffers queue, which takes a SIGNIFICANT time on my machine, and almost never finds a buffer it is able to flush. If bufdaemon flushes one or two buffers, kern_yield() is called, and a new iteration is started, until no buffer is flushed.
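
A minimal sketch of the watermark arithmetic above, in C. The helper names are hypothetical and are not kernel functions; only the threshold values, the 512-byte buffer size, and the 128 KiB copy buffer come from the message.

```c
#include <assert.h>
#include <stddef.h>

/* Convert a dirty-buffer count threshold into bytes of dirty data. */
size_t
threshold_bytes(size_t nbufs, size_t bufsize)
{
	assert(bufsize > 0);
	return (nbufs * bufsize);
}

/*
 * Return 1 if a single write() of write_len bytes dirties more data
 * than the given threshold allows, 0 otherwise. (Hypothetical helper,
 * used only to illustrate the comparison made in the text.)
 */
int
write_crosses_threshold(size_t write_len, size_t nbufs, size_t bufsize)
{
	return (write_len > threshold_bytes(nbufs, bufsize));
}
```

With the defaults quoted above, threshold_bytes(134, 512) gives 68608, and a single 128 KiB write (131072 bytes) crosses the low water mark, dirtybufthresh, and the high water mark in one call, all while the vnode stays locked.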
So, very often TWO full iterations over the dirty buffers queue are done to flush only one or two filesystem buffers, and the lodirtybuffers threshold is still not reached. The bufdaemon runtime keeps growing. Moreover, the frequent filesystem buffer flushing brings along higher CPU load (geom down thread, geom up thread, disk thread scheduling) and a re-ordering of disk block writes. The correct disk block write order is important for the flash disk. Further, while the file data buffers age without being flushed, filesystem buffers are written repeatedly and do get flushed. Of course, I use a sector cache in the flash disk, but I can't cache too many sectors because of the total memory size. So, filesystem disk blocks are written often, and that provokes more disk sector flushes. A sector flush really takes a long time, so the real copy time grows beyond control. Last but not least, the flash memory is aged needlessly.

Well, this is my old story. To be honest, I had quite forgotten that my kernel was compiled with the FULL_PREEMPTION option. Things are very much worse in this case. However, the option just makes the issue worse; the issue doesn't disappear without it.

In this old story, I played with and focused on the dirtybufthresh value. However, dirtybufthresh changes too much about how, and by whom, buffers are flushed.

I recorded the disk sector flush count and the total count of disk_strategy() calls with the BIO_WRITE command (and the total byte count to write). I used a file of 2235517 bytes. When I was caching 4 disk sectors, without the FULL_PREEMPTION option, the results were the following:

dirtybufthresh:    120, 85, 40
real copy time:    1:16.909, 38.031, 23.117
cp runtime:        8.876, 9.419, 9.922
bufdaemon runtime: 4.95, 0.26, 0.01
fldisk runtime:    39.64, 18.23, 8.50
idle runtime:      18.99, 7.79, 2.76
qsleep:            358, 32, 0
flush_count:       49, 39, 25
write_count:       452, 529, 931
write_size:        2428928, 2461184, 2649600

Idle runtime is, in fact, the disk sector erasure runtime.
qsleep counts how often bufdaemon slept with the hz/10 timeout, so in some sense it shows how often bufdaemon failed to reach the lodirtybuffers threshold. flush_count is the number of disk sector flushes, and in fact many of the runtimes depend on it. write_count is the number of disk_strategy() calls with the BIO_WRITE command. write_size is the total number of bytes to write in BIO_WRITE commands.

It can be seen for dirtybufthresh=40 that even though write_count is the biggest, flush_count is the lowest. Thus, the disk block write order was the best with respect to the disk sector cache. However, it showed that there are other factors in the issue that I hadn't figured out before: VFS clusters and disk fragmentation.

As a matter of interest, the same tests with the FULL_PREEMPTION option:

dirtybufthresh:    120, 85, 40
real copy time:    2:00.389, 27.385, 21.975
cp runtime:        10.873, 9.538, 9.988
bufdaemon runtime: 48.36, 0.11, 0.00
fldisk runtime:    35.96, 11.67, 7.75
idle runtime:      16.59, 4.23, 2.20
qsleep:            4180, 10, 0
flush_count:       44, 30, 24
write_count:       1112, 532, 931
write_size:        2767360, 2462720, 264960

It's interesting that the real copy times are better for dirtybufthresh=40 and 85. However, the times depend on flush_count again, i.e., it is the disk block write order that is better.

As I said, the FULL_PREEMPTION option just makes things worse. The issue starts when bufdaemon is scheduled in place of the cp process (and numdirtybuffers is greater than lodirtybuffers) during VOP_WRITE(), which is called with the VNODE locked. Initially (no FULL_PREEMPTION option), it happens if:

1. bufdaemon times out,
2. a thread with realtime priority is scheduled.

Threads with realtime priorities always preempt fully (with or without the FULL_PREEMPTION option). When a realtime thread finishes its job, the thread with the highest priority is scheduled. I.e., bufdaemon interrupts the cp process too.
After that, filesystem buffers can be written with bawrite() by bufdaemon, which sometimes causes the cp process to sleep in getblk() (td_wmesg = "getblk"), and bufdaemon is scheduled again. Moreover, once it starts, bufdaemon sleeps with the hz/10 timeout, so it's more likely that bufdaemon times out. I'm not sure, but once it starts, it looks like the cp process can sometimes block on a contested lock too (and bufdaemon can be scheduled).

One more word about the FULL_PREEMPTION option. I know that the option is not supported (unfortunately). The option also makes things worse because of bd_wakeup() and bufdaemon synchronization via the bdlock mutex. With the FULL_PREEMPTION option, bufdaemon is woken up, scheduled, and immediately blocks on the contested bdlock, so the process which called bd_wakeup() must be scheduled again to release bdlock, and only then is bufdaemon scheduled again. The following would help:

    critical_enter();
    wakeup(&bd_request);
    mtx_unlock(&bdlock);
    critical_exit();

In the UP case, it's always OK. In the SMP case, it's better than nothing.

Now, a new story. I compiled the kernel without the FULL_PREEMPTION option and made a ramdisk from the flash disk, so no disk sector flush happened during the copy. Only VFS and the filesystem remained for investigation.

I already mentioned VFS clusters and disk fragmentation. My flash disk is almost not fragmented, so even 64 KiB clusters are written at once. The new finding is that clusters are written with bawrite(). I did not investigate it fully, but mnt_iosize_max (in struct mount) limits the maximal cluster size. This cluster bawrite() keeps numdirtybuffers low and makes things better. It is still about how to exclude bufdaemon from dirty buffer flushing.

I also mentioned the buffer size the copy program is using. If the buffer size is lower, the probability of bufdaemon influence is lower. However, once the described issue starts ...

I left dirtybufthresh tuning and started tuning mnt_iosize_max and the copy program buffer size. On my machine, the default mnt_iosize_max is MAXPHYS = 128 KiB.
As a reminder, 68608 bytes is my default high water dirty buffer byte size.

Results for no FULL_PREEMPTION option, dirtybufthresh=120, file size 2235517 bytes, and ramdisk:

mnt_iosize_max = 128 KiB
------------------------
cp buffer size:     128, 64, 32, 16 KiB
real copy time:     19.234, 15.708, 15.700, 16.054
cp runtime:         10.978, 9.837, 9.895, 9.995
bufdaemon runtime:  2.93, 1.14, 0.97, 1.12
ramdisk runtime:    3.08, 2.98, 2.99, 3.00
qsleep:             172, 46, 37, 47
write_count:        473, 284, 283, 292
write_size:         2439680, 2336768, 2337280, 2341888
dirtybufferflushes: 134, 5, 25, 25
altbufferflushes:   1, 0, 0, 0

dirtybufferflushes and altbufferflushes are kernel globals. As I said, the cp buffer size just lowers the probability. The bufdaemon runtime is significant with respect to the ramdisk runtime. mnt_iosize_max is bigger than the mentioned high water mark.

mnt_iosize_max = 64 KiB
------------------------
cp buffer size:     128, 64, 32, 16 KiB
real copy time:     17.566, 15.669, 15.708, 15.543
cp runtime:         9.934, 9.632, 9.730, 9.797
bufdaemon runtime:  2.40, 1.18, 1.10, 0.90
ramdisk runtime:    3.09, 2.99, 3.00, 2.98
qsleep:             137, 53, 43, 30
write_count:        451, 291, 280, 269
write_size:         2421760, 2339840, 2334720, 2329088
dirtybufferflushes: 134, 12, 5, 6
altbufferflushes:   1, 0, 0, 0

bufdaemon is excluded a little, but dirtybufthresh=120 is not. It means that 120 * 512 = 61440 bytes is less than mnt_iosize_max and is still in the game (dirtybufferflushes).

mnt_iosize_max = 32 KiB
------------------------
cp buffer size:     128, 64, 32, 16 KiB
real copy time:     14.580, 14.327, 14.491, 14.609
cp runtime:         9.864, 9.680, 9.732, 9.841
bufdaemon runtime:  0.04, 0.01, 0.01, 0.03
ramdisk runtime:    2.98, 2.98, 2.97, 2.99
qsleep:             2, 0, 0, 2
write_count:        213, 206, 194, 191
write_size:         2266112, 2262528, 2256384, 2254848
dirtybufferflushes: 0, 0, 0, 0
altbufferflushes:   0, 0, 0, 0

bufdaemon is excluded, and dirtybufthresh too. Dirty buffers are written with bawrite() by the VFS cluster layer.

All times and results are typical (not statistical) ones.
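
To compare the three tables, two figures can be derived from the reported counters. They are not part of the measurements, and the helper names below are hypothetical: the average size of one BIO_WRITE request, and the number of bytes written beyond the 2235517-byte file. A minimal sketch in C:

```c
#include <assert.h>
#include <stdint.h>

/* Average payload of one BIO_WRITE request, in bytes (integer division). */
uint64_t
avg_write_bytes(uint64_t write_size, uint64_t write_count)
{
	assert(write_count > 0);
	return (write_size / write_count);
}

/* Bytes written to the disk beyond the size of the copied file. */
uint64_t
extra_bytes_written(uint64_t write_size, uint64_t file_size)
{
	assert(write_size >= file_size);
	return (write_size - file_size);
}
```

For the 128 KiB cp buffer column, this gives an average request of 5157 bytes with 204163 extra bytes at mnt_iosize_max = 128 KiB, against 10639 bytes with 30595 extra bytes at mnt_iosize_max = 32 KiB.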
The issue is not so visible when a ramdisk is used. If the cp process is sleeping because of a buffer bawrite(), which can be slow on my flash disk, the bufdaemon runtime can grow very much. On a high-speed CPU, the issue can be hidden totally in the system workload noise. However, on my machine, the issue exists without doubt.

I think it's clear how to try to reproduce my issue. In the FreeBSD kernel, MAXPHYS = 128 KiB is the maximal mnt_iosize_max value. The hidirtybuffers, dirtybufthresh, and lodirtybuffers values, together with the VFS buffer size, must be set with respect to mnt_iosize_max. They (in bytes) must be lower than mnt_iosize_max.

I know that my machine configuration and utilization is a special one. Thus, I don't push any change. And once again, I'm just sharing my experience. However, for embedded platforms, it can be more common than one might think.

Svata

From owner-freebsd-hackers@FreeBSD.ORG Wed Apr 4 19:29:07 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 7A916106566B; Wed, 4 Apr 2012 19:29:07 +0000 (UTC) (envelope-from julian@freebsd.org) Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16]) by mx1.freebsd.org (Postfix) with ESMTP id 46AFA8FC15; Wed, 4 Apr 2012 19:29:07 +0000 (UTC) Received: from julian-mac.elischer.org (c-67-180-24-15.hsd1.ca.comcast.net [67.180.24.15]) (authenticated bits=0) by vps1.elischer.org (8.14.5/8.14.5) with ESMTP id q34JT2tY004623 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Wed, 4 Apr 2012 12:29:04 -0700 (PDT) (envelope-from julian@freebsd.org) Message-ID: <4F7CA124.8080401@freebsd.org> Date: Wed, 04 Apr 2012 12:29:40 -0700 From: Julian Elischer User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X 10.4; en-US; rv:1.9.2.28) Gecko/20120306 Thunderbird/3.1.20 MIME-Version: 1.0 To: Eitan Adler References: <4F775DF5.1020704@rawbw.com> <201204020831.09253.jhb@freebsd.org>
<4F79D63E.7010200@rawbw.com> <201204021312.36568.jhb@freebsd.org> <4F7BDF06.8000104@freebsd.org> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: Yuri , hackers@freebsd.org, freebsd-hackers@freebsd.org Subject: Re: Is there any modern alternative to pstack? X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Apr 2012 19:29:07 -0000 On 4/4/12 5:44 AM, Eitan Adler wrote: > On 4 April 2012 01:41, Julian Elischer wrote: >> should be in ports? > Not unless someone decides to become the new upstream and make a > release. We do not maintain software in ports. > but we do add patches to make things work on FreeBSD.
From owner-freebsd-hackers@FreeBSD.ORG Wed Apr 4 20:08:33 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id AA25E1065677 for ; Wed, 4 Apr 2012 20:08:33 +0000 (UTC) (envelope-from vasanth.raonaik@gmail.com) Received: from mail-wg0-f50.google.com (mail-wg0-f50.google.com [74.125.82.50]) by mx1.freebsd.org (Postfix) with ESMTP id 37DC98FC23 for ; Wed, 4 Apr 2012 20:08:33 +0000 (UTC) Received: by wgbds12 with SMTP id ds12so641434wgb.31 for ; Wed, 04 Apr 2012 13:08:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=pU6Kpfv5eb9JFCwmqc5+5ccnHPmyceYC8Dwsi59HrKg=; b=wUqsdyY9UYNOxJOZ++aqnVgEfzzXA77mXajiwHTP8VE1HCQ3+TFfpjMX0o5bfJ4SUH N9fGUVoduelhBLI87Lhf4el5oaa8PLpOWh9lFDM+WzG9s/804bw8RmWuEF7YSsVfRDO+ cFxWVKhd7N+/S8WWxQHKhcr1dYBuWnAsqgc9Ikk3qtHVXI2mnbimNcMA1lL6bhRZLjAe QwoOS7Cdfue8nUvM1XK9u3roLEFL9GPJzonSss+J4+VdvTd/14GdIQaAiAsd0FrJ23J4
uzvDMRTfCSb7Z6icyurA59p4b6Cds1BK5s04/wVJNplUBMRe4xGJUJdH86zMqFuA82ix 4zbg== MIME-Version: 1.0 Received: by 10.216.132.40 with SMTP id n40mr2205844wei.68.1333570112066; Wed, 04 Apr 2012 13:08:32 -0700 (PDT) Received: by 10.180.98.161 with HTTP; Wed, 4 Apr 2012 13:08:32 -0700 (PDT) In-Reply-To: References: Date: Wed, 4 Apr 2012 16:08:32 -0400 Message-ID: From: vasanth rao naik sabavat To: Mark Tinguely Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-hackers@freebsd.org Subject: Re: question about amd64 pagetable page allocation. X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Apr 2012 20:08:33 -0000

Hello Mark,

From what I understand, the virtual address of a given page table entry should not differ between vtopte() and pmap_pte(). However, with a small code change in pmap_remove_pages(), I was able to print the values returned by these two functions.

Output for vtopte() and pmap_pte():

pte1 0xffff8000006432e0 pa1 346941425 m1 0xffffff04291cf600, pte2 0xffffff03463032e0 pa2 346941425 m2 0xffffff04291cf600
pte1 0xffff8000006432e8 pa1 346842425 m1 0xffffff04291c7e78, pte2 0xffffff03463032e8 pa2 346842425 m2 0xffffff04291c7e78
pte1 0xffff8000006432f0 pa1 346863425 m1 0xffffff04291c8df0, pte2 0xffffff03463032f0 pa2 346863425 m2 0xffffff04291c8df0

In the above result, pte1 is the result of vtopte() and pte2 is the result of pmap_pte(). Interestingly, these two virtual addresses, pte1 and pte2, resolve to the same physical address (pa1 == pa2). If I am not wrong, the page tables are now mapped at two different virtual addresses? Could you please clarify this?
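
One possible reading of the output above, offered as a sketch rather than a definitive answer. It assumes the FreeBSD 9 amd64 layout in which vtopte() returns an address inside the recursive page-table window starting at 0xffff800000000000, and pmap_pte() returns an address inside the direct map starting at 0xffffff0000000000. Under these assumptions both pointers name the same PTE slot, reached through two different kernel mappings. The constants and helper names below are illustrative assumptions and do not come from the message.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMED layout constants (FreeBSD 9 amd64); verify against your headers. */
#define ASSUMED_PTMAP_BASE 0xffff800000000000ULL /* recursive page-table window */
#define ASSUMED_DMAP_BASE  0xffffff0000000000ULL /* direct map of physical memory */
#define PAGE_SHIFT_4K      12
#define PTE_SIZE           8

/*
 * Invert the recursive-window lookup: which virtual address does this
 * PTE slot describe? Valid for lower-half (user) addresses only, since
 * no sign extension is applied.
 */
uint64_t
va_of_recursive_pte(uint64_t pte_va)
{
	assert(pte_va >= ASSUMED_PTMAP_BASE);
	return (((pte_va - ASSUMED_PTMAP_BASE) / PTE_SIZE) << PAGE_SHIFT_4K);
}

/* Invert the direct map: physical address behind a direct-map pointer. */
uint64_t
pa_of_dmap_va(uint64_t va)
{
	assert(va >= ASSUMED_DMAP_BASE);
	return (va - ASSUMED_DMAP_BASE);
}

/* Index of the PTE inside its 4 KiB page-table page. */
unsigned
pte_slot(uint64_t pte_va)
{
	return ((unsigned)((pte_va & 0xfff) / PTE_SIZE));
}
```

Applied to the first output row, pte1 would describe user address 0xc865c000, pte2 would sit at physical address 0x3463032e0 (page-table page 0x346303000), and both pointers select slot 92 of that page.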

Thanks,
Vasanth

On Tue, Apr 3, 2012 at 3:18 PM, Mark Tinguely wrote:
> On Tue, Apr 3, 2012 at 1:52 PM, vasanth rao naik sabavat
> wrote:
> > Hello Mark,
> >
> > I think pmap_remove_pages() is executed only for the current process.
> >
> > 2549 #ifdef PMAP_REMOVE_PAGES_CURPROC_ONLY
> > 2550         if (pmap != vmspace_pmap(curthread->td_proc->p_vmspace)) {
> > 2551                 printf("warning: pmap_remove_pages called with non-current pmap\n");
> > 2552                 return;
> > 2553         }
> > 2554 #endif
> >
> > I dont still get it why this was removed?
> >
> > Thanks,
> > Vasanth
> >
>
> That is pretty old code. Newer code does not make that assumption.
>
> Without the assumption that the pages are from the current map, then you
> have to use the direct physical -> virtual mapping:
>
> 2547 #ifdef PMAP_REMOVE_PAGES_CURPROC_ONLY
> 2548                 pte = vtopte(pv->pv_va);
> 2549 #else
> 2550                 pte = pmap_pte(pmap, pv->pv_va);
> 2551 #endif
>
> --Mark Tinguely.
>

From owner-freebsd-hackers@FreeBSD.ORG Wed Apr 4 21:18:26 2012 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 48DDE1065670 for ; Wed, 4 Apr 2012 21:18:26 +0000 (UTC) (envelope-from onwahe@gmail.com) Received: from mail-yx0-f182.google.com (mail-yx0-f182.google.com [209.85.213.182]) by mx1.freebsd.org (Postfix) with ESMTP id F20568FC0C for ; Wed, 4 Apr 2012 21:18:25 +0000 (UTC) Received: by yenl9 with SMTP id l9so569409yen.13 for ; Wed, 04 Apr 2012 14:18:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=sqWcjei++gLHmyUkgUUuIGqLEbXPKGz14T3xhKdxF1s=; b=EWNcnuJSrA7ByYE+EQHh07p5QXlREVYna0bGb6j44TbbyErUMrvRqicyzI8y3fRkMM AR8I/fuhOqcB0sp9TOb+SZfyXzrDY9hcDSYmbtILJ1UufbGnH0jL1CePAGCZHomtjlSJ 9n4rIko3uKz7OYgpH+TyeuG47n2en1bhMkaI/jMvuiAPwlUNiFwzKvJDnAAun2FP0SbT
hFJMrywtc2T+1aMLwrqsyp7AKTuFzDnRAN8279fIIozfBZgfWx9x5gB9z14Di3deviPX RgqDw/ZVjsiCg4veijHk2MbCty9duaQsBfOhEjwHvOBdsb8qo6bvxKv1xgHfwMmOno9r iWHQ== MIME-Version: 1.0 Received: by 10.236.77.106 with SMTP id c70mr7934728yhe.85.1333574305112; Wed, 04 Apr 2012 14:18:25 -0700 (PDT) Received: by 10.236.37.195 with HTTP; Wed, 4 Apr 2012 14:18:25 -0700 (PDT) In-Reply-To: <20120321203828.GW2358@deviant.kiev.zoral.com.ua> References: <20120312181921.GF75778@deviant.kiev.zoral.com.ua> <20120315112959.GP75778@deviant.kiev.zoral.com.ua> <20120321203828.GW2358@deviant.kiev.zoral.com.ua> Date: Wed, 4 Apr 2012 23:18:25 +0200 Message-ID: From: Svatopluk Kraus To: Konstantin Belousov Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: hackers@freebsd.org Subject: Re: [vfs] buf_daemon() slows down write() severely on low-speed CPU X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Apr 2012 21:18:26 -0000 2012/3/21 Konstantin Belousov : > On Thu, Mar 15, 2012 at 08:00:41PM +0100, Svatopluk Kraus wrote: >> 2012/3/15 Konstantin Belousov : >> > On Tue, Mar 13, 2012 at 01:54:38PM +0100, Svatopluk Kraus wrote: >> >> On Mon, Mar 12, 2012 at 7:19 PM, Konstantin Belousov >> >> wrote: >> >> > On Mon, Mar 12, 2012 at 04:00:58PM +0100, Svatopluk Kraus wrote: >> >> >> Hi, >> >> >> >> >> >> =A0 =A0I have solved a following problem. If a big file (according= to >> >> >> 'hidirtybuffers') is being written, the write speed is very poor. >> >> >> >> >> >> =A0 =A0It's observed on system with elan 486 and 32MB RAM (i.e., l= ow speed >> >> >> CPU and not too much memory) running FreeBSD-9. >> >> >> >> >> >> =A0 =A0Analysis: A file is being written. All or almost all dirty = buffers >> >> >> belong to the file. The file vnode is almost all time locked by >> >> >> writing process. 
>> >> >> The buf_daemon() can not flush any dirty buffer as a
>> >> >> chance to acquire the file vnode lock is very low. A number of dirty
>> >> >> buffers grows up very slow and with each new dirty buffer slower,
>> >> >> because buf_daemon() eats more and more CPU time by looping on dirty
>> >> >> buffers queue (with very low or no effect).
>> >> >>
>> >> >>    This slowing down effect is started by buf_daemon() itself, when
>> >> >> 'numdirtybuffers' reaches 'lodirtybuffers' threshold and buf_daemon()
>> >> >> is waked up by own timeout. The timeout fires at 'hz' period, but
>> >> >> starts to fire at 'hz/10' immediately as buf_daemon() fails to reach
>> >> >> 'lodirtybuffers' threshold. When 'numdirtybuffers' (now slowly)
>> >> >> reaches ((lodirtybuffers + hidirtybuffers) / 2) threshold, the
>> >> >> buf_daemon() can be waked up within bdwrite() too and it's much worse.
>> >> >> Finally and with very slow speed, the 'hidirtybuffers' or
>> >> >> 'dirtybufthresh' is reached, the dirty buffers are flushed, and
>> >> >> everything starts from beginning...
>> >> > Note that for some time, bufdaemon work is distributed among bufdaemon
>> >> > thread itself and any thread that fails to allocate a buffer, esp.
>> >> > a thread that owns vnode lock and covers long queue of dirty buffers.
>> >>
>> >> However, the problem starts when numdirtybuffers reaches
>> >> lodirtybuffers count and ends around hidirtybuffers count. There are
>> >> still plenty of free buffers in system.
>> >>
>> >> >>
>> >> >>    On the system, a buffer size is 512 bytes and the default
>> >> >> thresholds are following:
>> >> >>
>> >> >>    vfs.hidirtybuffers = 134
>> >> >>    vfs.lodirtybuffers = 67
>> >> >>    vfs.dirtybufthresh = 120
>> >> >>
>> >> >>    For example, a 2MB file is copied into flash disk in about 3
>> >> >> minutes and 15 seconds. If dirtybufthresh is set to 40, the copy time
>> >> >> is about 20 seconds.
>> >> >>
>> >> >>    My solution is a mix of three things:
>> >> >>    1. Suppression of buf_daemon() wakeup by setting bd_request to 1 in
>> >> >> the main buf_daemon() loop.
>> >> > I cannot understand this. Please provide a patch that shows what do
>> >> > you mean there.
>> >> >
>> >>       curthread->td_pflags |= TDP_NORUNNINGBUF | TDP_BUFNEED;
>> >>       mtx_lock(&bdlock);
>> >>       for (;;) {
>> >> -             bd_request = 0;
>> >> +             bd_request = 1;
>> >>               mtx_unlock(&bdlock);
>> > Is this a complete patch ? The change just causes lost wakeups for bufdaemon,
>> > nothing more.
>> Yes, it's a complete patch. And exactly, it causes lost wakeups which are:
>> 1. !! UNREASONABLE !!, because bufdaemon is not sleeping,
>> 2. not wanted, because it looks that it's correct behaviour for the
>> sleep with hz/10 period. However, if the sleep with hz/10 period is
>> expected to be waked up by bd_wakeup(), then bd_request should be set
>> to 0 just before sleep() call, and then bufdaemon behaviour will be
>> clear.
> No, your description is wrong.
>
> If bufdaemon is unable to flush enough buffers and numdirtybuffers is still
> greater than lodirtybuffers, then bufdaemon enters qsleep state
> without resetting bd_request, with timeouts of one tenth of a second.
> Your patch will cause all wakeups for this case to be lost. This is
> exactly the situation when we want bufdaemon to run harder to avoid
> possible deadlocks, not to slow down.

OK. Let's focus on the bufdaemon implementation. Currently, the qsleep state is entered with an arbitrary bd_request value. If someone calls bd_wakeup() while bufdaemon iterates over the dirty buffer queues, then bd_request is set to 1. Otherwise, bd_request remains 0. I.e., sometimes the qsleep state can only end by timeout, and sometimes it can be woken up by bd_wakeup(). So, is this random behaviour what is wanted?
>> All stuff around bd_request and bufdaemon sleep is under bd_lock, so
>> if bd_request is 0 and bufdaemon is not sleeping, then all wakeups are
>> unreasonable! The patch is about that mainly.
> Wakeups itself are very cheap for the running process. Mostly, it comes
> down to locking sleepq and waking all threads that are present in the
> sleepq blocked queue. If there is no threads in queue, nothing is done.

Are you serious? Is a spin mutex really cheap? Many calls are cheap, but not regardless of where they are made.

Svata
From owner-freebsd-hackers@FreeBSD.ORG Thu Apr 5 01:54:13 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 5AEE1106564A for ; Thu, 5 Apr 2012 01:54:13 +0000 (UTC) (envelope-from sushanth_rai@yahoo.com) Received: from nm24.bullet.mail.sp2.yahoo.com (nm24.bullet.mail.sp2.yahoo.com [98.139.91.94]) by mx1.freebsd.org (Postfix) with SMTP id 196A28FC08 for ; Thu, 5 Apr 2012 01:54:13 +0000 (UTC) Received: from [98.139.91.66] by nm24.bullet.mail.sp2.yahoo.com with NNFMP; 05 Apr 2012 01:54:07 -0000 Received: from [98.139.44.86] by tm6.bullet.mail.sp2.yahoo.com with NNFMP; 05 Apr 2012 01:54:07 -0000 Received: from [127.0.0.1] by omp1023.access.mail.sp2.yahoo.com with NNFMP; 05 Apr 2012 01:54:07 -0000 X-Yahoo-Newman-Property: ymail-3 X-Yahoo-Newman-Id: 335107.92701.bm@omp1023.access.mail.sp2.yahoo.com Received: (qmail 63198 invoked by uid 60001); 5 Apr 2012 01:54:07 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1333590847; bh=LNg9qmn432+R6Ufk2u7Wc/MpS6a3zEleE5cn5I4g/bI=; h=X-YMail-OSG:Received:X-Mailer:Message-ID:Date:From:Subject:To:MIME-Version:Content-Type; b=FTIy0/KlqagiVS7KY6aZ0z99YPHGOrVtnsNdHgv6UxJGkzqkSz3fwRLQF+RMdpoI0zPoJI2jfHKYYptnPSRnjEkLJAO7h5KFV0gfKKqICcNvOp32ayqICgeZR+7avWikqYQTOrX/F+UNWcLb+PWaLx6uIlfxSvyVif76IdHpH2w= DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com;
h=X-YMail-OSG:Received:X-Mailer:Message-ID:Date:From:Subject:To:MIME-Version:Content-Type; b=WCvdYyCrvkJIXV9WP3dJJYX1O55UBcUIMsJ0aiXSngKdMPcMgJZF0QCyslXtvS8J5dQ30DVPnDsDoqWwBaQ1Fc5iUApo1KQjn6NMmeAbL27Smyuy2Ibhzcm9pCjbDg9m3YL5iOEdZULQeiXNBsKx8xi3ku+9D7Wrsj3ftbUvAws=; X-YMail-OSG: nwEIMXMVM1kmgWH9wvp7Oggn8tcyuOSAjeP6YUE6lplwrlp hafuRuMTCoy4N2_IZXcGGLwcsd_zzjS9aOEcBxLi2zNMM.MTD24f2kVDRKZg iq6eSBfx7jt.EfDlP8IME4LhMhxOB_IKP0esPnF.Lem0yKZVS9r68D.zoEqO Xs_RT81D4gNaRuzhd7J3o496EsU4YOo_d2QD5bA_f6rSkX0H1fnwfNj7b5tC hiLmz14rdhfJiGpgSIZ1j3UR11__YOmwB_19pKApAvcWEvxFmUGkBEig7IJk YESYRttlqgEmzkyyMx61L6bpQ127XN2amgJDlUzp9yJQE_sxTR9ruvOXdDWN 0VMMRqg4kYdgcuTvb7j2GtIzBhc.xds91PZSSRGw1xIbz1tIAIsJfIHa9Otw xrGmgbq9CueIcK7ajY4Dgyxe64wsm.NGq6upBNyuZ4istRPhMl7exEinvlY6 xxvfTVEw- Received: from [209.119.38.67] by web180011.mail.gq1.yahoo.com via HTTP; Wed, 04 Apr 2012 18:54:06 PDT X-Mailer: YahooMailClassic/15.0.5 YahooMailWebService/0.8.117.340979 Message-ID: <1333590846.58474.YahooMailClassic@web180011.mail.gq1.yahoo.com> Date: Wed, 4 Apr 2012 18:54:06 -0700 (PDT) From: Sushanth Rai To: freebsd-hackers@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Subject: Startvation of realtime piority threads X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 Apr 2012 01:54:13 -0000

I have a multithreaded user space program that basically runs at realtime priority. Synchronization between threads is done using a spinlock. When running this program on an SMP system under heavy memory pressure, I see that the thread holding the spinlock is starved of CPU. The cpus are effectively consumed by other threads that are spinning for the lock to become available.

After instrumenting the kernel a little, I found that under memory pressure, when the user thread holding the spinlock traps into the kernel due to a page fault, that thread sleeps until free pages are available. The thread sleeps at PUSER priority (within vm_waitpfault()). When it is ready to run, it is queued at PUSER priority even though its base priority is realtime. The other sibling threads, which are spinning at realtime priority to acquire the spinlock, starve the owner of the spinlock.

I was wondering if the sleep in vm_waitpfault() should use MAX(td_user_pri, PUSER) instead of just PUSER. I'm running 7.2, and it looks like this logic is the same in trunk.

Thanks,
Sushanth
From owner-freebsd-hackers@FreeBSD.ORG Thu Apr 5 02:14:10 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8B848106566B for ; Thu, 5 Apr 2012 02:14:10 +0000 (UTC) (envelope-from robert.lorentz@gmail.com) Received: from mail-we0-f182.google.com (mail-we0-f182.google.com [74.125.82.182]) by mx1.freebsd.org (Postfix) with ESMTP id DFDEC8FC0A for ; Thu, 5 Apr 2012 02:14:09 +0000 (UTC) Received: by wern13 with SMTP id n13so715839wer.13 for ; Wed, 04 Apr 2012 19:14:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=4qpuw+xu9OOXXuHPyM95DdG6LHFUX9e1csTJho6hw94=; b=zLi8iU/j5gMWTQTTPli5R7JLy0bwGhiFvMga7xt0Z/jZd+icBVKjjvjYs5fEtApWkJ RGtOuUTBQqRxO+GX6x8xlaRfSJy+sB67O6cvkLDQYaUfbMTpelFi4orLjY/LZnGNkgpC yW7S3pWPmqOkkAQcXeH2FCCVRkEUTUuPN03PcyYWtAeSIv98lFJ3sd7RHGTR6hwmTZLo MgKGR/4Or2fDzJyLvOoLZrKG8QxPAcmJGl3pMqrYCoksUoui+QI2j+t4GZSPkzZcD4Q0 AIdX8/pm+HUaq1gxiLQH9SNb7hjRP053YcD+2Fopdq3ajpHyJfPNT7QVVHpaWf3kD4Bf N3fQ== MIME-Version: 1.0 Received: by 10.180.102.101 with SMTP id fn5mr463655wib.6.1333592042770; Wed, 04 Apr 2012 19:14:02 -0700 (PDT) Received: by
10.180.104.162 with HTTP; Wed, 4 Apr 2012 19:14:02 -0700 (PDT) Date: Wed, 4 Apr 2012 22:14:02 -0400 Message-ID: From: Robert Lorentz To: freebsd-hackers@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Subject: GSoC Call for Mentor X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 Apr 2012 02:14:10 -0000 Hi, I've been communicating with the FreeBSD GSoC admins list for a few months now, not realizing only 4 people are on there. I have spoken with Ben Laurie (affiliated with OpenSSL) and Robert Watson regarding my GSoC idea for software implementations of SHA-3 hash algorithms for the purpose of inclusion within FreeBSD or OpenSSL. The timeline for applications is now almost upon us, so I would like to finalize my plan as soon as possible to allow me time to create a good proposal in time to submit it. It seems clear that the implementation and performance analysis of the SHA-3 candidate algorithm(s) is the interesting part of what I discussed in that earlier correspondence. Whether the code is written for FreeBSD or OpenSSL's specific framework is not interesting and more a strategic/political decision than a technical one. After pondering the previous suggestions, I think that my project proposal should be roughly as follows: - C Implementations of all 5 SHA-3 hash algorithm candidates. These implementations will operate in a standalone manner, with a reasonable interface such that the NIST SHA-3 selected algorithm's implementation could be easily adapted to work within OpenSSL or FreeBSD. - Expect that alternate implementations will be explored to determine possible performance tradeoffs and optimal implementations. 
- Formalized analysis and discussion, formatted in a conference-quality paper My motivation for this work is that I am currently working on PhD research in the field of cryptographic engineering, recently completed my MS CpE research on hardware FPGA implementations of SHA-3 candidates, and my undergraduate degree and personal experience is in computer science (C, C++, UNIX) so this project is very interesting to me and I feel I have the skills and experience to obtain meaningful results. I desire a mentor at this point in time because I understand that it would give me a better chance of my project proposal being accepted and successfully executed. My hope is that one of you will agree to be my mentor, at which point I will create a detailed project proposal to submit to the GSoC. If I do not have a willing mentor I do not intend to submit a proposal. Ben and Robert seemed enthusiastic regarding my idea (Ben commented that the current AES implementation began in this way) but are too busy or lack the interest to become my mentor. I see that FreeBSD has been accepted as a program to GSoC 2012; OpenSSL is not listed. Therefore it is my assumption that my proposed project would be done under the FreeBSD program - even if eventually this code ends up in OpenSSL and flows downstream to FreeBSD. If this assumption does not satisfy you, can you please suggest a modification to my proposal that would make it become eligible for sponsorship under the FreeBSD GSoC program? If anyone is willing to take me on for this, please send me a response. 
I am very easy to work with :) Thanks, Robert Lorentz From owner-freebsd-hackers@FreeBSD.ORG Thu Apr 5 03:56:51 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 78F4E106566B for ; Thu, 5 Apr 2012 03:56:51 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id DD4138FC14 for ; Thu, 5 Apr 2012 03:56:50 +0000 (UTC) Received: from skuns.kiev.zoral.com.ua (localhost [127.0.0.1]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id q353ukVq061738; Thu, 5 Apr 2012 06:56:46 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.5/8.14.5) with ESMTP id q353ukUJ087869; Thu, 5 Apr 2012 06:56:46 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.5/8.14.5/Submit) id q353ujga087868; Thu, 5 Apr 2012 06:56:45
+0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 5 Apr 2012 06:56:45 +0300 From: Konstantin Belousov To: Sushanth Rai Message-ID: <20120405035645.GO2358@deviant.kiev.zoral.com.ua> References: <1333590846.58474.YahooMailClassic@web180011.mail.gq1.yahoo.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="KKfHGHXZBgaZT5Zm" Content-Disposition: inline In-Reply-To: <1333590846.58474.YahooMailClassic@web180011.mail.gq1.yahoo.com> User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-4.0 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: freebsd-hackers@freebsd.org Subject: Re: Startvation of realtime piority threads X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 Apr 2012 03:56:51 -0000 --KKfHGHXZBgaZT5Zm Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Wed, Apr 04, 2012 at 06:54:06PM -0700, Sushanth Rai wrote:
> I have a multithreaded user space program that basically runs at realtime priority. Synchronization between threads are done using spinlock. When running this program on a SMP system under heavy memory pressure I see that thread holding the spinlock is starved out of cpu.
> The cpus are effectively consumed by other threads that are spinning for lock to become available.
>
> After instrumenting the kernel a little bit what I found was that under memory pressure, when the user thread holding the spinlock traps into the kernel due to page fault, that thread sleeps until the free pages are available. The thread sleeps at PUSER priority (within vm_waitpfault()). When it is ready to run, it is queued at PUSER priority even though its base priority is realtime. The other sibling threads that are spinning at realtime priority to acquire the spinlock starve the owner of spinlock.
>
> I was wondering if the sleep in vm_waitpfault() should be a MAX(td_user_pri, PUSER) instead of just PUSER. I'm running on 7.2 and it looks like this logic is the same in the trunk.

It just so happens that your program stumbles upon a single sleep point in the kernel. If, for whatever reason, the thread in the kernel is put off CPU due to failure to acquire any resource without priority propagation, you would get the same effect. Only blockable primitives do priority propagation, that is, mutexes and rwlocks, AFAIR. In other words, any sx/lockmgr/sleep points are vulnerable to the same issue.

Speaking of exactly your problem, did you consider wiring the memory of your realtime process? This is a common practice, used e.g. by ntpd.
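
A minimal sketch of the wiring suggestion, assuming the process is allowed to lock its memory (sufficient privilege or memory-lock resource limit). mlockall(2) with MCL_CURRENT | MCL_FUTURE is the standard call; the wrapper name wire_process_memory() is made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <sys/mman.h>

/*
 * Wire all pages currently mapped by the calling process, and all pages
 * mapped in the future, so they cannot be paged out.
 * Returns 0 on success, or the errno value set by mlockall(2).
 * Hypothetical helper name, for illustration only.
 */
int
wire_process_memory(void)
{
	if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
		return (errno);
	return (0);
}
```

A realtime program would typically call this once at startup, before creating its threads, and treat a non-zero return as a configuration problem. It avoids faults caused by paging, but it does not by itself address the sleep priority question raised above.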
--KKfHGHXZBgaZT5Zm Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (FreeBSD) iEYEARECAAYFAk99F/0ACgkQC3+MBN1Mb4gWIACghu8Uy8q/hfORgXAb3d/gkJyJ xyMAmwYeXIBi4xxiod19/r/LXuvpqz/2 =8MEs -----END PGP SIGNATURE----- --KKfHGHXZBgaZT5Zm-- From owner-freebsd-hackers@FreeBSD.ORG Thu Apr 5 03:23:15 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id E9935106566B for ; Thu, 5 Apr 2012 03:23:15 +0000 (UTC) (envelope-from alexander@leidinger.net) Received: from mail.ebusiness-leidinger.de (mail.ebusiness-leidinger.de [217.11.53.44]) by mx1.freebsd.org (Postfix) with ESMTP id 9F1758FC19 for ; Thu, 5 Apr 2012 03:23:15 +0000 (UTC) Received: from outgoing.leidinger.net (p4FC42835.dip.t-dialin.net [79.196.40.53]) by mail.ebusiness-leidinger.de (Postfix) with ESMTPSA id 293578446D3; Thu, 5 Apr 2012 05:22:52 +0200 (CEST) Received: from unknown (IO.Leidinger.net [192.168.1.12]) by outgoing.leidinger.net (Postfix) with ESMTPS id A3D0C23A1; Thu, 5 Apr 2012 05:22:48 +0200 (CEST) Date: Thu, 5 Apr 2012 05:22:46 +0200 From: Alexander Leidinger To: Jerry Toung Message-ID: <20120405052246.00002c53@unknown> In-Reply-To: References: <20120403193124.46ad9de9@ernst.jennejohn.org> X-Mailer: Claws Mail 3.7.10cvs42 (GTK+ 2.16.6; i586-pc-mingw32msvc) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-EBL-MailScanner-Information: Please contact the ISP for more information X-EBL-MailScanner-ID: 293578446D3.AED8F X-EBL-MailScanner: Found to be clean X-EBL-MailScanner-SpamCheck: not spam, spamhaus-ZEN, SpamAssassin (not cached, score=0.39, required 6, autolearn=disabled, ALL_TRUSTED -1.00, AWL -0.60, BR_SPAMMER_URI 2.00, T_RP_MATCHES_RCVD -0.01) X-EBL-MailScanner-From: alexander@leidinger.net X-EBL-MailScanner-Watermark: 1334200972.84234@wFBhdtfxuK6bbreajlGgMw X-EBL-Spam-Status: 
No X-Mailman-Approved-At: Thu, 05 Apr 2012 04:25:10 +0000 Cc: freebsd-hackers Subject: Re: CAM disk I/O starvation X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 Apr 2012 03:23:16 -0000 On Tue, 3 Apr 2012 14:27:43 -0700 Jerry Toung wrote: > On 4/3/12, Gary Jennejohn wrote: > > > It would be interesting to see your patch. I always run HEAD but > > maybe I could use it as a base for my own mods/tests. > > > > Here is the patch This looks fair if all your disks are working at the same time (e.g. RAID only setup), but if you have a setup where you have multiple disks and only one is doing something, you limit the amount of tags which can be used. No idea what kind of performance impact this would have. What about the case where you have more disks than tags? I also noticed that you do a strncmp for "da". What about "ada" (available in 9 and 10), I would assume it suffers from the same problem. Bye, Alexander. 
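
To illustrate the "da" versus "ada" point: the patch itself is not reproduced in this message, so the following only assumes that its check has the form strncmp(name, "da", 2). Both helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* The assumed check: true only when the name starts with "da". */
bool
periph_has_da_prefix(const char *periph_name)
{
	assert(periph_name != NULL);
	return (strncmp(periph_name, "da", 2) == 0);
}

/* A variant that accepts both the "da" and "ada" disk drivers. */
bool
periph_is_disk(const char *periph_name)
{
	assert(periph_name != NULL);
	return (strcmp(periph_name, "da") == 0 ||
	    strcmp(periph_name, "ada") == 0);
}
```

Because "ada" starts with the letter 'a', a two-character prefix comparison against "da" does not match it, so disks attached through ada would be skipped unless they are tested for explicitly.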
-- http://www.Leidinger.net Alexander @ Leidinger.net: PGP ID = B0063FE7 http://www.FreeBSD.org netchild @ FreeBSD.org : PGP ID = 72077137 From owner-freebsd-hackers@FreeBSD.ORG Thu Apr 5 04:56:13 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 93B12106566C for ; Thu, 5 Apr 2012 04:56:13 +0000 (UTC) (envelope-from listlog2011@gmail.com) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 7D0BF8FC08 for ; Thu, 5 Apr 2012 04:56:13 +0000 (UTC) Received: from [127.0.0.1] (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q354uAFm079316 for ; Thu, 5 Apr 2012 04:56:12 GMT (envelope-from listlog2011@gmail.com) Message-ID: <4F7D25E9.8030703@gmail.com> Date: Thu, 05 Apr 2012 12:56:09 +0800 From: David Xu User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:11.0) Gecko/20120327 Thunderbird/11.0.1 MIME-Version: 1.0 To: freebsd-hackers@freebsd.org References: <1333590846.58474.YahooMailClassic@web180011.mail.gq1.yahoo.com> In-Reply-To: <1333590846.58474.YahooMailClassic@web180011.mail.gq1.yahoo.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: Startvation of realtime piority threads X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: davidxu@freebsd.org List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 Apr 2012 04:56:13 -0000 On 2012/4/5 9:54, Sushanth Rai wrote: > I have a multithreaded user space program that basically runs at realtime priority. Synchronization between threads are done using spinlock. When running this program on a SMP system under heavy memory pressure I see that thread holding the spinlock is starved out of cpu. 
> The cpus are effectively consumed by other threads that are spinning for lock to become available.
>
> After instrumenting the kernel a little bit what I found was that under memory pressure, when the user thread holding the spinlock traps into the kernel due to page fault, that thread sleeps until the free pages are available. The thread sleeps at PUSER priority (within vm_waitpfault()). When it is ready to run, it is queued at PUSER priority even though its base priority is realtime. The other sibling threads that are spinning at realtime priority to acquire the spinlock starve the owner of spinlock.
>
> I was wondering if the sleep in vm_waitpfault() should be a MAX(td_user_pri, PUSER) instead of just PUSER. I'm running on 7.2 and it looks like this logic is the same in the trunk.
>
> Thanks,
> Sushanth

I think 7.2 still has libkse, which supports static priority scheduling. If correctness matters more than performance, you may try libkse with process-scope threads and use priority-inherit mutexes for locking. The kernel is known to be vulnerable when it comes to supporting user realtime threads. I think not every locking primitive can support priority propagation; this is an issue. In userland, internal library mutexes are not priority-inherit, so starvation may happen there too. If you know what you are doing, avoid calling functions that use internal mutexes, but this is rather difficult.
Regards, David Xu From owner-freebsd-hackers@FreeBSD.ORG Thu Apr 5 05:08:02 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 27FC11065672 for ; Thu, 5 Apr 2012 05:08:02 +0000 (UTC) (envelope-from listlog2011@gmail.com) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id EB5F68FC12 for ; Thu, 5 Apr 2012 05:08:01 +0000 (UTC) Received: from [127.0.0.1] (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q3557u9u088159 for ; Thu, 5 Apr 2012 05:07:58 GMT (envelope-from listlog2011@gmail.com) Message-ID: <4F7D28AB.605@gmail.com> Date: Thu, 05 Apr 2012 13:07:55 +0800 From: David Xu User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:11.0) Gecko/20120327 Thunderbird/11.0.1 MIME-Version: 1.0 To: freebsd-hackers@freebsd.org References: <1333590846.58474.YahooMailClassic@web180011.mail.gq1.yahoo.com> <20120405035645.GO2358@deviant.kiev.zoral.com.ua> In-Reply-To: <20120405035645.GO2358@deviant.kiev.zoral.com.ua> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: Startvation of realtime piority threads X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: davidxu@freebsd.org List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 Apr 2012 05:08:02 -0000 On 2012/4/5 11:56, Konstantin Belousov wrote: > On Wed, Apr 04, 2012 at 06:54:06PM -0700, Sushanth Rai wrote: >> I have a multithreaded user space program that basically runs at realtime priority. Synchronization between threads are done using spinlock. When running this program on a SMP system under heavy memory pressure I see that thread holding the spinlock is starved out of cpu. 
>> The cpus are effectively consumed by other threads that are spinning for lock to become available.
>>
>> After instrumenting the kernel a little bit what I found was that under memory pressure, when the user thread holding the spinlock traps into the kernel due to page fault, that thread sleeps until the free pages are available. The thread sleeps at PUSER priority (within vm_waitpfault()). When it is ready to run, it is queued at PUSER priority even though its base priority is realtime. The other sibling threads that are spinning at realtime priority to acquire the spinlock starve the owner of spinlock.
>>
>> I was wondering if the sleep in vm_waitpfault() should be a MAX(td_user_pri, PUSER) instead of just PUSER. I'm running on 7.2 and it looks like this logic is the same in the trunk.
> It just so happens that your program stumbles upon a single sleep point in
> the kernel. If for whatever reason the thread in kernel is put off CPU
> due to failure to acquire any resource without priority propagation,
> you would get the same effect. Only blockable primitives do priority
> propagation, that are mutexes and rwlocks, AFAIR. In other words, any
> sx/lockmgr/sleep points are vulnerable to the same issue.

This is why I suggested that POSIX realtime priority should not be boosted: it should only be higher than PRI_MIN_TIMESHARE but lower than any priority that msleep() callers provide. The problem is that a userland realtime thread's busy-looping code can starve a thread in the kernel that is holding a critical resource. In the kernel we can avoid writing dead-loop code, but userland code cannot be trusted. If you search for "Realtime thread priorities" in the December 2010 archive of the @arch list, you may find the argument.

> Speaking of exactly your problem, did you consider wiring the memory
> of your realtime process ? This is a common practice, used e.g. by ntpd.
From owner-freebsd-hackers@FreeBSD.ORG Thu Apr 5 06:07:19 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id D0B53106564A for ; Thu, 5 Apr 2012 06:07:19 +0000 (UTC) (envelope-from pfg@FreeBSD.org) Received: from nm30-vm0.bullet.mail.sp2.yahoo.com (nm30-vm0.bullet.mail.sp2.yahoo.com [98.139.91.238]) by mx1.freebsd.org (Postfix) with SMTP id 633158FC0A for ; Thu, 5 Apr 2012 06:07:19 +0000 (UTC) Received: from [98.139.91.64] by nm30.bullet.mail.sp2.yahoo.com with NNFMP; 05 Apr 2012 06:07:19 -0000 Received: from [208.71.42.212] by tm4.bullet.mail.sp2.yahoo.com with NNFMP; 05 Apr 2012 06:07:19 -0000 Received: from [127.0.0.1] by smtp223.mail.gq1.yahoo.com with NNFMP; 05 Apr 2012 06:07:19 -0000 X-Yahoo-Newman-Id: 997972.1709.bm@smtp223.mail.gq1.yahoo.com X-Yahoo-Newman-Property: ymail-3 X-YMail-OSG: _HnKKvgVM1ml9u_a.fDXcoh.FAfORLU7Y0gv8GJhaCbDfWH JqlE.m9q1SmP1dumqEDUfoJgdL7gHYUgfMVT0LWzEfiOJ8kVkGzWR3yska83 _QHvwO1Mw1ii4_j7GjAzAEehLavETsKYDZwrBo8d9aEGXph4WZrZa4gLtu_V .Vc5YjT3mGpAVTsogcrsklb1XJ5sGd1.OuJr1Bpeazyq1loS3vhXqqC4iZN3 oToTlpDXBz.W80cEmf8BJTSo5x9Dhe2SyWG4n_UNnUGGvVKStex9HrxGqH2q tsH5sAsS2mWlIFgcpDFGAenJ7ZeYrPia9S49pTjM1VMBnFVw6TCTdsmGkK6_ N30lEH7s8fsC_H07JzAfpVc4qKILbUQijz_xj1vc6qrGuhOvOhsypAT7q7Ag IZeSAtp_R7E10Fk6cY7T5sv4WPHfgV1WS_7XxRCxsBJ3Ndg0uqpJsNxjW.2A vORlqtlC5D.ia8WiETgRON1YF27Shp9mcLS.jeB8H4mFAwsVD17sVT4hDe_b henj3XhI8Pvmll4nPPSyDIe_ymkB0OdrJb5AC_moV0B1F0wGYqg-- X-Yahoo-SMTP: xcjD0guswBAZaPPIbxpWwLcp9Unf Received: from [192.168.10.106] (pfg@200.118.157.7 with plain) by smtp223.mail.gq1.yahoo.com with SMTP; 04 Apr 2012 23:07:18 -0700 PDT Message-ID: <4F7D3694.7020704@FreeBSD.org> Date: Thu, 05 Apr 2012 01:07:16 -0500 From: Pedro Giffuni User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:10.0.2) Gecko/20120226 Thunderbird/10.0.2 MIME-Version: 1.0 To: Yongcong Du , freebsd-hackers@freebsd.org Content-Type: text/plain; charset=ISO-8859-1; 
format=flowed Content-Transfer-Encoding: 7bit Cc: netchild@freebsd.org, avg@freebsd.org Subject: Re: [gsoc2012] Port NetBSD's UDF implementation X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 Apr 2012 06:07:19 -0000 Hi YongCon; The project would be very interesting for us. I am pretty sure you will not have problems finding a mentor. That said, let me point out an old thread: http://lists.freebsd.org/pipermail/freebsd-stable/2008-May/042565.html I think the biggest problem is that you will have to get acquainted with FreeBSD's Virtual Memory which is different from NetBSD's. In that same thread you will find some comments by Matt Dillon (no idea how up to date those are). It will not be an easy task but people find such challenges very rewarding. Pedro. From owner-freebsd-hackers@FreeBSD.ORG Thu Apr 5 07:34:34 2012 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 7B6471065670; Thu, 5 Apr 2012 07:34:34 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 5A1688FC15; Thu, 5 Apr 2012 07:34:33 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id KAA08402; Thu, 05 Apr 2012 10:34:30 +0300 (EEST) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1SFhDB-0000KW-Mm; Thu, 05 Apr 2012 10:34:29 +0300 Message-ID: <4F7D4B03.50900@FreeBSD.org> Date: Thu, 05 Apr 2012 10:34:27 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:10.0.3) Gecko/20120317 Thunderbird/10.0.3 
MIME-Version: 1.0 To: Yongcong Du References: <4F7D3694.7020704@FreeBSD.org> In-Reply-To: <4F7D3694.7020704@FreeBSD.org> X-Enigmail-Version: 1.4 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-hackers@FreeBSD.org, Pedro Giffuni , netchild@FreeBSD.org Subject: Re: [gsoc2012] Port NetBSD's UDF implementation X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 Apr 2012 07:34:34 -0000 on 05/04/2012 09:07 Pedro Giffuni said the following: > Hi YongCon; > > The project would be very interesting for us. I am pretty sure you will > not have problems finding a mentor. > > That said, let me point out an old thread: > > http://lists.freebsd.org/pipermail/freebsd-stable/2008-May/042565.html > > I think the biggest problem is that you will have to get acquainted > with FreeBSD's Virtual Memory which is different from NetBSD's. > In that same thread you will find some comments by Matt Dillon > (no idea how up to date those are). > > It will not be an easy task but people find such challenges very > rewarding. Yongcong, please note that we have already got proposals from two other students for this project. I haven't expected this project to get so popular after sitting unnoticed for a few years. To increase your chances of getting accepted, it might be a good idea to consider another project. 
-- Andriy Gapon From owner-freebsd-hackers@FreeBSD.ORG Thu Apr 5 08:49:52 2012 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id E1817106564A for ; Thu, 5 Apr 2012 08:49:52 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 3A6638FC12 for ; Thu, 5 Apr 2012 08:49:52 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id LAA09013 for ; Thu, 05 Apr 2012 11:49:44 +0300 (EEST) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1SFiO0-0000NH-Ad for freebsd-hackers@FreeBSD.org; Thu, 05 Apr 2012 11:49:44 +0300 Message-ID: <4F7D5CA5.5030306@FreeBSD.org> Date: Thu, 05 Apr 2012 11:49:41 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:10.0.3) Gecko/20120317 Thunderbird/10.0.3 MIME-Version: 1.0 To: freebsd-hackers@FreeBSD.org X-Enigmail-Version: 1.4 Content-Type: text/plain; charset=X-VIET-VPS Content-Transfer-Encoding: 7bit Cc: Subject: opensslv.h SHLIB_VERSION_NUMBER X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 Apr 2012 08:49:53 -0000 I wonder who can review the following change and what good or bad can come from it? Index: crypto/openssl/crypto/opensslv.h =================================================================== --- crypto/openssl/crypto/opensslv.h (revision 233888) +++ crypto/openssl/crypto/opensslv.h (working copy) @@ -83,7 +83,7 @@ * should only keep the versions that are binary compatible with the current. 
*/ #define SHLIB_VERSION_HISTORY "" -#define SHLIB_VERSION_NUMBER "0.9.8" +#define SHLIB_VERSION_NUMBER "6" #endif /* HEADER_OPENSSLV_H */ Rationale for the change can be seen here: http://article.gmane.org/gmane.comp.kde.freebsd/20645 TLDR: some software may depend on libssl.so.${SHLIB_VERSION_NUMBER } being correct. -- Andriy Gapon From owner-freebsd-hackers@FreeBSD.ORG Thu Apr 5 11:43:48 2012 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 1F43F1065670 for ; Thu, 5 Apr 2012 11:43:48 +0000 (UTC) (envelope-from lists@eitanadler.com) Received: from mail-we0-f182.google.com (mail-we0-f182.google.com [74.125.82.182]) by mx1.freebsd.org (Postfix) with ESMTP id 953ED8FC14 for ; Thu, 5 Apr 2012 11:43:47 +0000 (UTC) Received: by wern13 with SMTP id n13so1032350wer.13 for ; Thu, 05 Apr 2012 04:43:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=eitanadler.com; s=0xdeadbeef; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type; bh=GgQMASxxqkN2oSEWHhXsjGtv532i85G/htN/+IXuGRw=; b=QMsy2z7NofCfeCtu9Ipw/lLj5Im5dncuyz0z2q8SZWJunmPoahkdEjGH8jw+cbdabb THne4bbJ+Vpa1Lj60bIwwNXJaxnwDgEF70qDTiiC4tWTLXMPk3u+0Gn8u3zk3CGxRxHd oPQqJpmz9pIxe9NKbIIHGkDv5O3kF2vLoYmYY= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type:x-gm-message-state; bh=GgQMASxxqkN2oSEWHhXsjGtv532i85G/htN/+IXuGRw=; b=cWDoIHou/xS/ZT9hwctCBnAnuNYqtoWuqs9AuSUMGLevVfaZSoV1xOuDCPmq+qLyi+ Vhw+85Toje/Kjve63VTEKVvkcmf2/NhTsclM3WFjuqxek8Yq6E0faoUZviL58osxKogn pFLJvu78eD646i8bWnrTksBAC4+NJt30q7cwPhxoIDVQ/Z2NqwEiIUMLL1/+VNFLVWKQ 07QR4E5lxe0nJdpDCGTd4j88cMmsfbbLFyuN/FFr/DiZ24/HFNq9B1H5wqEa5B4KTQOz aJfafzLbgK0kJDh5hCq2rr9hfjAstjD7gd6AYrEMktc7fO5dHTEoQzDZlxhzemqpRL/m 5sDw== Received: by 10.180.91.165 with SMTP id cf5mr4092623wib.2.1333626225780; Thu, 05 
Apr 2012 04:43:45 -0700 (PDT) MIME-Version: 1.0 Received: by 10.223.63.4 with HTTP; Thu, 5 Apr 2012 04:43:15 -0700 (PDT) In-Reply-To: <4F7CA124.8080401@freebsd.org> References: <4F775DF5.1020704@rawbw.com> <201204020831.09253.jhb@freebsd.org> <4F79D63E.7010200@rawbw.com> <201204021312.36568.jhb@freebsd.org> <4F7BDF06.8000104@freebsd.org> <4F7CA124.8080401@freebsd.org> From: Eitan Adler Date: Thu, 5 Apr 2012 07:43:15 -0400 Message-ID: To: Julian Elischer Content-Type: text/plain; charset=UTF-8 X-Gm-Message-State: ALoCoQkAzrJgIwbGohtS5hJ/q1KNYihbb28fDlIPahJXmi/xhhAFqBRdbLl7Rn1j18bSW1UOuE+B Cc: Yuri , hackers@freebsd.org, freebsd-hackers@freebsd.org Subject: Re: Is there any modern alternative to pstack? X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 Apr 2012 11:43:48 -0000 On 4 April 2012 15:29, Julian Elischer wrote: > but we do add patches to make things work on FreeBSD. We add patches to make ports... ... work on FreeBSD ... conform to FreeBSD hier (to an extent) ... work with alternate compilers, PREFIX, etc. We shouldn't add patches which "continue development". In all cases the goal should be to upstream the patch ASAP. If there is no active upstream and the patch does more than the above that is a sign that someone needs to be willing to become the upstream maintainer first. 
-- Eitan Adler From owner-freebsd-hackers@FreeBSD.ORG Thu Apr 5 12:19:56 2012 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 6F9EC1065670 for ; Thu, 5 Apr 2012 12:19:56 +0000 (UTC) (envelope-from jeremie@le-hen.org) Received: from smtp5-g21.free.fr (smtp5-g21.free.fr [IPv6:2a01:e0c:1:1599::14]) by mx1.freebsd.org (Postfix) with ESMTP id C0E1A8FC08 for ; Thu, 5 Apr 2012 12:19:53 +0000 (UTC) Received: from endor.tataz.chchile.org (unknown [82.233.239.98]) by smtp5-g21.free.fr (Postfix) with ESMTP id 2EF99D480C9 for ; Thu, 5 Apr 2012 14:19:48 +0200 (CEST) Received: from felucia.tataz.chchile.org (felucia.tataz.chchile.org [192.168.1.9]) by endor.tataz.chchile.org (Postfix) with ESMTP id 0E3EA2A71 for ; Thu, 5 Apr 2012 12:19:48 +0000 (UTC) Received: by felucia.tataz.chchile.org (Postfix, from userid 1000) id E260BDA23; Thu, 5 Apr 2012 12:19:47 +0000 (UTC) Date: Thu, 5 Apr 2012 14:19:47 +0200 From: Jeremie Le Hen To: freebsd-hackers@FreeBSD.org Message-ID: <20120405121947.GB46534@felucia.tataz.chchile.org> Mail-Followup-To: freebsd-hackers@FreeBSD.org MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Subject: bin/166660: new stdbuf utility X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 Apr 2012 12:19:56 -0000 Hi hackers, I've posted a PR bin/166660 this morning: [patch] New util/shlib to change per-fd default stdio buffering mode For some unknown reason, "[libc]" has been prepended to the subject, but this patch __DOES NOT touch libc__ (except a one-line addition in a manpage). 
This is absolutely non-intrusive and can be easily MFC'd to RELENG_9, RELENG_8 and even RELENG_7 if it hasn't reached its end-of-life. In brief, this is a new tool that allows control of the default per-fd stdio buffering mode. The feature exists in Linux and its command-line interface is BSD-compatible, so I used the same name and the same interface for obvious compatibility reasons. As you can guess, I'm looking for someone willing to test (though I've already tested it and the code is pretty straightforward) and commit it. You will find additional information in the PR. Thanks. -- Jeremie Le Hen Men are born free and equal. Later on, they're on their own. Jean Yanne From owner-freebsd-hackers@FreeBSD.ORG Thu Apr 5 14:38:14 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id EF59E1065675; Thu, 5 Apr 2012 14:38:14 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id C352C8FC1C; Thu, 5 Apr 2012 14:38:14 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 1FA1EB970; Thu, 5 Apr 2012 10:38:14 -0400 (EDT) From: John Baldwin To: Eitan Adler Date: Thu, 5 Apr 2012 10:06:11 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p10; KDE/4.5.5; amd64; ; ) References: <4F775DF5.1020704@rawbw.com> <4F7CA124.8080401@freebsd.org> In-Reply-To: MIME-Version: 1.0 Content-Type: Text/Plain; charset="utf-8" Content-Transfer-Encoding: 7bit Message-Id: <201204051006.11598.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Thu, 05 Apr 2012 10:38:14 -0400 (EDT) Cc: Yuri , hackers@freebsd.org, freebsd-hackers@freebsd.org Subject: Re: Is there any modern alternative to pstack?
X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 Apr 2012 14:38:15 -0000 On Thursday, April 05, 2012 7:43:15 am Eitan Adler wrote: > On 4 April 2012 15:29, Julian Elischer wrote: > > but we do add patches to make things work on FreeBSD. > > We add patches to make ports... > ... work on FreeBSD > ... conform to FreeBSD hier (to an extent) > ... work with alternate compilers, PREFIX, etc. > > We shouldn't add patches which "continue development". In all cases > the goal should be to upstream the patch ASAP. If there is no active > upstream and the patch does more than the above that is a sign that > someone needs to be willing to become the upstream maintainer first. In this case we probably should become the upstream maintainer. My patch actually bumps the version to 1.3 as it is sort of intended to do that. 
-- John Baldwin From owner-freebsd-hackers@FreeBSD.ORG Thu Apr 5 14:44:51 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0CD351065676 for ; Thu, 5 Apr 2012 14:44:51 +0000 (UTC) (envelope-from lists@eitanadler.com) Received: from mail-wi0-f178.google.com (mail-wi0-f178.google.com [209.85.212.178]) by mx1.freebsd.org (Postfix) with ESMTP id 895AE8FC1A for ; Thu, 5 Apr 2012 14:44:50 +0000 (UTC) Received: by wibhq7 with SMTP id hq7so1147775wib.13 for ; Thu, 05 Apr 2012 07:44:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=eitanadler.com; s=0xdeadbeef; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type:content-transfer-encoding; bh=jEEUGKFyEuUliWsgOIVgjdB6pV4r/YfyvNQD3QPb6tU=; b=FAqCGT0KgMRrHPGk1ALiFJnxKYfeK8YD6SemXW+GfqrIsYB7tsZANlSj/YDo/kyO85 Lmc7LQcqGyUeG3KZj5W3ar7KkSBn8k39Qi6NNFff2QZMGXpsoqUuwRnEOFHsFU1EjiWG BpZ3YPW1piXpDCgwYzyHwDpDEOjIKhzsrF/ec= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type:content-transfer-encoding:x-gm-message-state; bh=jEEUGKFyEuUliWsgOIVgjdB6pV4r/YfyvNQD3QPb6tU=; b=TSfouLuGEzmZTFGIPCSdRqT9P74idhoNbj7Da6aQd17Z09Jm10ibLIkwwphJysVWH1 OUCJtPLRQTSM5Lrgo7PWDQL+Vgxfqp0uMEKwhHrsg+m0MnWeOMeWh8orPmh+tfpjzUsu XpJQS8IguKsGxgrXIl73zVBIhq5xlY/QXJtyFAVQQl9NaHMdPNmBlAfakkxkQQylB6ko GBf/caFO7Up+XlJ6i+BIvfyubNJhqWxJBr5V6AaTYPmhsVdpCp6Ki1Z5QAEuGPhg1MPS mFu3FBTqn4zYG9rpOq3CDXauwg2O/YPBwUhMkR1X+AClUpB1JipxfsZ2iFzFQU7EoznB +DBA== Received: by
10.216.136.131 with SMTP id w3mr1921243wei.15.1333637088638; Thu, 05 Apr 2012 07:44:48 -0700 (PDT) MIME-Version: 1.0 Received: by 10.223.63.4 with HTTP; Thu, 5 Apr 2012 07:44:18 -0700 (PDT) In-Reply-To: <201204051006.11598.jhb@freebsd.org> References: <4F775DF5.1020704@rawbw.com> <4F7CA124.8080401@freebsd.org> <201204051006.11598.jhb@freebsd.org> From: Eitan Adler Date: Thu, 5 Apr 2012 10:44:18 -0400 Message-ID: To: John Baldwin Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Gm-Message-State: ALoCoQmx7mkLU0CTbW32lbHiztKUkCyNcWHlyJJcg6FEVwYkBcwixAL11oodlg1pdVOvEfulgtbb Cc: Yuri , hackers@freebsd.org, freebsd-hackers@freebsd.org Subject: Re: Is there any modern alternative to pstack? X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 Apr 2012 14:44:51 -0000 On 5 April 2012 10:06, John Baldwin wrote: > In this case we probably should become the upstream maintainer. My patch > actually bumps the version to 1.3 as it is sort of intended to do that. Yay! Can you please roll a new tarball and host it in ~/public_distfiles or something of a similar nature? That way we could just point the port at the distfile and we don't have to maintain a separate patchfile in the ports tree.
-- Eitan Adler From owner-freebsd-hackers@FreeBSD.ORG Thu Apr 5 14:58:30 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 055031065670 for ; Thu, 5 Apr 2012 14:58:30 +0000 (UTC) (envelope-from gljennjohn@googlemail.com) Received: from mail-ee0-f54.google.com (mail-ee0-f54.google.com [74.125.83.54]) by mx1.freebsd.org (Postfix) with ESMTP id 86EEE8FC0C for ; Thu, 5 Apr 2012 14:58:29 +0000 (UTC) Received: by eekd17 with SMTP id d17so531983eek.13 for ; Thu, 05 Apr 2012 07:58:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=20120113; h=date:from:to:cc:subject:message-id:in-reply-to:references:reply-to :x-mailer:mime-version:content-type:content-transfer-encoding; bh=5AoWYqKTaNi48qXbc/VEqy+/YAnma3XydeJ6uOGqWN8=; b=gqhB8gheNqO44Rgh0UukpYWhGyhqxpBgyHwqV9+bXrBQtGmaQJKo/d30xCAdBoSdUR 3vGU5ykaTYI3vgiL5p0PWcY714n0eykCkcDSnQysd/SHXSQR9GEcvJKUtEnt/Qn+sX/O FR0SYIzYesgCuUHSQ2GS1X8gx2K+zbn9DrbnEybEnaUBQYU1ENBWO0/pxHjjghYVHeoS XTMvAgDLaIPzHAzAhUU8iL945nMBIIbs9d+r/BwSl2Pf9nfwZr2ntiFhG+VS7UnNkizt ochiyNxJxIxmlI+zsjYz8DnJBp+GzG46KCrGbdwMg8pFDaZQFBtzsnVTyCTWvnIi/LG0 4WSA== Received: by 10.213.112.197 with SMTP id x5mr447727ebp.57.1333637903500; Thu, 05 Apr 2012 07:58:23 -0700 (PDT) Received: from ernst.jennejohn.org (p54895F57.dip.t-dialin.net.
[84.137.95.87]) by mx.google.com with ESMTPS id y11sm14054627eem.3.2012.04.05.07.58.21 (version=TLSv1/SSLv3 cipher=OTHER); Thu, 05 Apr 2012 07:58:22 -0700 (PDT) Date: Thu, 5 Apr 2012 16:58:20 +0200 From: Gary Jennejohn To: Alexander Leidinger Message-ID: <20120405165820.38698f1f@ernst.jennejohn.org> In-Reply-To: <20120405052246.00002c53@unknown> References: <20120403193124.46ad9de9@ernst.jennejohn.org> <20120405052246.00002c53@unknown> X-Mailer: Claws Mail 3.8.0 (GTK+ 2.24.6; amd64-portbld-freebsd10.0) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Cc: freebsd-hackers , Jerry Toung Subject: Re: CAM disk I/O starvation X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: gljennjohn@googlemail.com List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 Apr 2012 14:58:30 -0000 On Thu, 5 Apr 2012 05:22:46 +0200 Alexander Leidinger wrote: > On Tue, 3 Apr 2012 14:27:43 -0700 Jerry Toung > wrote: > > > On 4/3/12, Gary Jennejohn wrote: > > > > > It would be interesting to see your patch. I always run HEAD but > > > maybe I could use it as a base for my own mods/tests. > > > > > > > Here is the patch > > This looks fair if all your disks are working at the same time (e.g. > RAID only setup), but if you have a setup where you have multiple > disks and only one is doing something, you limit the amount of tags > which can be used. No idea what kind of performance impact this would > have. > > What about the case where you have more disks than tags? > > I also noticed that you do a strncmp for "da". What about > "ada" (available in 9 and 10), I would assume it suffers from the same > problem. > It seems to. All my disks are ada. 
-- Gary Jennejohn From owner-freebsd-hackers@FreeBSD.ORG Thu Apr 5 15:55:39 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 07AA6106564A for ; Thu, 5 Apr 2012 15:55:39 +0000 (UTC) (envelope-from jrytoung@gmail.com) Received: from mail-wi0-f170.google.com (mail-wi0-f170.google.com [209.85.212.170]) by mx1.freebsd.org (Postfix) with ESMTP id 85B7F8FC14 for ; Thu, 5 Apr 2012 15:55:38 +0000 (UTC) Received: by wibhr17 with SMTP id hr17so1517729wib.1 for ; Thu, 05 Apr 2012 08:55:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=Svxlyd0z7lztzNEddLK0ieZjEwxaT0mIbXM/0YZTOfg=; b=kmjyTm/MwEsoQsKf/rGLufG2WGJIPDaU2Pd7+gUPJZyYJnr3O0uAm2WlRCF65e20Ce EjwxJGV5fVGhHy9ngQrTLvk6YsJhEo5ztg39RI9HmozDg5/u12WeFuhQyl7p6CHrTAPS V0CjLSRt6Ry5Oer0YobnuydAo2dxWGCUm6Z1ORfsdmvOPxvR0erBengeqIEjDxNxaIHK HsUWBbbdubaxavlM4iLIz85pZmpzHU5gzqjrfWrLROeWXXGhi6VpyhNnnTgoe0ZmiQ0M dCEF7Iv9T9vCq7shfZhj5bxty6t3JvlEn8aFuB90uhseC5PLAHvAUbob5UpErGMSpYfE Zc+A== MIME-Version: 1.0 Received: by 10.180.24.66 with SMTP id s2mr6064965wif.7.1333641337569; Thu, 05 Apr 2012 08:55:37 -0700 (PDT) Received: by 10.216.27.148 with HTTP; Thu, 5 Apr 2012 08:55:37 -0700 (PDT) In-Reply-To: <20120405052246.00002c53@unknown> References: <20120403193124.46ad9de9@ernst.jennejohn.org> <20120405052246.00002c53@unknown> Date: Thu, 5 Apr 2012 08:55:37 -0700 Message-ID: From: Jerry Toung To: Alexander Leidinger Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-hackers Subject: Re: CAM disk I/O starvation X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: 
Thu, 05 Apr 2012 15:55:39 -0000

On Wed, Apr 4, 2012 at 8:22 PM, Alexander Leidinger wrote:
>
> This looks fair if all your disks are working at the same time (e.g.
> RAID only setup), but if you have a setup where you have multiple
> disks and only one is doing something, you limit the amount of tags
> which can be used. No idea what kind of performance impact this would
> have.
>

I haven't seen any performance impact. da1, the one that used to stall, consistently gets over 600MB/s.

> What about the case where you have more disks than tags?
>

This part of the patch takes care of that scenario:

@@ -998,6 +1003,24 @@ xpt_add_periph(struct cam_periph *periph
        mtx_lock(&xsoftc.xpt_topo_lock);
        xsoftc.xpt_generation++;
+
+       if (device != NULL && device->sim->dev_count > 1 &&
+           (device->sim->max_dev_openings > device->sim->dev_count)) {

Otherwise, we don't split the tags and the original behavior remains.

>
> I also noticed that you do a strncmp for "da". What about
> "ada" (available in 9 and 10), I would assume it suffers from the same
> problem.
>

I am running FreeBSD 8.1, so there is no "ada". Presenting a patch is just my way of drawing attention to a problem, and it improves things on my setup. There is certainly a way to make it more general/inclusive.
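
The guard quoted above from the patch can be sketched as a small standalone function. This is only an illustration: openings_per_device() is a hypothetical name, and the even division is an assumption, since the excerpt shows the condition but not how the tags are actually divided.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of the posted patch: returns how many
 * tagged openings each device on a SIM would be allowed to use.
 * ASSUMPTION: the tags are split evenly; the excerpt only shows the guard.
 */
int
openings_per_device(int max_dev_openings, int dev_count)
{
	assert(max_dev_openings >= 0 && dev_count >= 0);

	/* Same guard as in the patch: several devices, more tags than devices. */
	if (dev_count > 1 && max_dev_openings > dev_count)
		return (max_dev_openings / dev_count);

	/* Otherwise the tags are not split and the original behavior remains. */
	return (max_dev_openings);
}
```

Under that assumption, a SIM with 255 openings and 4 devices would give each device 63, while a SIM with 4 openings and 8 devices is left untouched.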
Jerry From owner-freebsd-hackers@FreeBSD.ORG Thu Apr 5 16:03:15 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 73991106566B; Thu, 5 Apr 2012 16:03:15 +0000 (UTC) (envelope-from alc@rice.edu) Received: from mh8.mail.rice.edu (mh8.mail.rice.edu [128.42.201.24]) by mx1.freebsd.org (Postfix) with ESMTP id 35E9B8FC18; Thu, 5 Apr 2012 16:03:15 +0000 (UTC) Received: from mh8.mail.rice.edu (localhost.localdomain [127.0.0.1]) by mh8.mail.rice.edu (Postfix) with ESMTP id 68379291D61; Thu, 5 Apr 2012 10:54:35 -0500 (CDT) Received: from mh8.mail.rice.edu (localhost.localdomain [127.0.0.1]) by mh8.mail.rice.edu (Postfix) with ESMTP id 5D4FE29761F; Thu, 5 Apr 2012 10:54:35 -0500 (CDT) X-Virus-Scanned: by amavis-2.6.4 at mh8.mail.rice.edu, auth channel Received: from mh8.mail.rice.edu ([127.0.0.1]) by mh8.mail.rice.edu (mh8.mail.rice.edu [127.0.0.1]) (amavis, port 10026) with ESMTP id UU0oaVYrni33; Thu, 5 Apr 2012 10:54:35 -0500 (CDT) Received: from adsl-216-63-78-18.dsl.hstntx.swbell.net (adsl-216-63-78-18.dsl.hstntx.swbell.net [216.63.78.18]) (using TLSv1 with cipher RC4-MD5 (128/128 bits)) (No client certificate requested) (Authenticated sender: alc) by mh8.mail.rice.edu (Postfix) with ESMTPSA id B2C2E291D19; Thu, 5 Apr 2012 10:54:34 -0500 (CDT) Message-ID: <4F7DC037.9060803@rice.edu> Date: Thu, 05 Apr 2012 10:54:31 -0500 From: Alan Cox User-Agent: Mozilla/5.0 (X11; FreeBSD i386; rv:8.0) Gecko/20111113 Thunderbird/8.0 MIME-Version: 1.0 To: Konstantin Belousov References: <4F7B495D.3010402@zonov.org> <20120404071746.GJ2358@deviant.kiev.zoral.com.ua> In-Reply-To: <20120404071746.GJ2358@deviant.kiev.zoral.com.ua> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Mailman-Approved-At: Thu, 05 Apr 2012 16:10:10 +0000 Cc: alc@freebsd.org, freebsd-hackers@freebsd.org, Andrey Zonov Subject: Re: problems with mmap() 
and disk caching X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 Apr 2012 16:03:15 -0000 On 04/04/2012 02:17, Konstantin Belousov wrote: > On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote: >> Hi, >> >> I open the file, then call mmap() on the whole file and get pointer, >> then I work with this pointer. I expect that page should be only once >> touched to get it into the memory (disk cache?), but this doesn't work! >> >> I wrote the test (attached) and ran it for the 1G file generated from >> /dev/random, the result is the following: >> >> Prepare file: >> # swapoff -a >> # newfs /dev/ada0b >> # mount /dev/ada0b /mnt >> # dd if=/dev/random of=/mnt/random-1024 bs=1m count=1024 >> >> Purge cache: >> # umount /mnt >> # mount /dev/ada0b /mnt >> >> Run test: >> $ ./mmap /mnt/random-1024 30 >> mmap: 1 pass took: 7.431046 (none: 262112; res: 32; super: >> 0; other: 0) >> mmap: 2 pass took: 7.356670 (none: 261648; res: 496; super: >> 0; other: 0) >> mmap: 3 pass took: 7.307094 (none: 260521; res: 1623; super: >> 0; other: 0) >> mmap: 4 pass took: 7.350239 (none: 258904; res: 3240; super: >> 0; other: 0) >> mmap: 5 pass took: 7.392480 (none: 257286; res: 4858; super: >> 0; other: 0) >> mmap: 6 pass took: 7.292069 (none: 255584; res: 6560; super: >> 0; other: 0) >> mmap: 7 pass took: 7.048980 (none: 251142; res: 11002; super: >> 0; other: 0) >> mmap: 8 pass took: 6.899387 (none: 247584; res: 14560; super: >> 0; other: 0) >> mmap: 9 pass took: 7.190579 (none: 242992; res: 19152; super: >> 0; other: 0) >> mmap: 10 pass took: 6.915482 (none: 239308; res: 22836; super: >> 0; other: 0) >> mmap: 11 pass took: 6.565909 (none: 232835; res: 29309; super: >> 0; other: 0) >> mmap: 12 pass took: 6.423945 (none: 226160; res: 35984; super: >> 0; other: 0) >> mmap: 13 pass took: 6.315385 
(none: 208555; res: 53589; super: >> 0; other: 0) >> mmap: 14 pass took: 6.760780 (none: 192805; res: 69339; super: >> 0; other: 0) >> mmap: 15 pass took: 5.721513 (none: 174497; res: 87647; super: >> 0; other: 0) >> mmap: 16 pass took: 5.004424 (none: 155938; res: 106206; super: >> 0; other: 0) >> mmap: 17 pass took: 4.224926 (none: 135639; res: 126505; super: >> 0; other: 0) >> mmap: 18 pass took: 3.749608 (none: 117952; res: 144192; super: >> 0; other: 0) >> mmap: 19 pass took: 3.398084 (none: 99066; res: 163078; super: >> 0; other: 0) >> mmap: 20 pass took: 3.029557 (none: 74994; res: 187150; super: >> 0; other: 0) >> mmap: 21 pass took: 2.379430 (none: 55231; res: 206913; super: >> 0; other: 0) >> mmap: 22 pass took: 2.046521 (none: 40786; res: 221358; super: >> 0; other: 0) >> mmap: 23 pass took: 1.152797 (none: 30311; res: 231833; super: >> 0; other: 0) >> mmap: 24 pass took: 0.972617 (none: 16196; res: 245948; super: >> 0; other: 0) >> mmap: 25 pass took: 0.577515 (none: 8286; res: 253858; super: >> 0; other: 0) >> mmap: 26 pass took: 0.380738 (none: 3712; res: 258432; super: >> 0; other: 0) >> mmap: 27 pass took: 0.253583 (none: 1193; res: 260951; super: >> 0; other: 0) >> mmap: 28 pass took: 0.157508 (none: 0; res: 262144; super: >> 0; other: 0) >> mmap: 29 pass took: 0.156169 (none: 0; res: 262144; super: >> 0; other: 0) >> mmap: 30 pass took: 0.156550 (none: 0; res: 262144; super: >> 0; other: 0) >> >> If I ran this: >> $ cat /mnt/random-1024> /dev/null >> before test, when result is the following: >> >> $ ./mmap /mnt/random-1024 5 >> mmap: 1 pass took: 0.337657 (none: 0; res: 262144; super: >> 0; other: 0) >> mmap: 2 pass took: 0.186137 (none: 0; res: 262144; super: >> 0; other: 0) >> mmap: 3 pass took: 0.186132 (none: 0; res: 262144; super: >> 0; other: 0) >> mmap: 4 pass took: 0.186535 (none: 0; res: 262144; super: >> 0; other: 0) >> mmap: 5 pass took: 0.190353 (none: 0; res: 262144; super: >> 0; other: 0) >> >> This is what I expect. 
But why this doesn't work without reading file >> manually? > Issue seems to be in some change of the behaviour of the reserv or > phys allocator. I Cc:ed Alan. I'm pretty sure that the behavior here hasn't significantly changed in about twelve years. Otherwise, I agree with your analysis. On more than one occasion, I've been tempted to change: pmap_remove_all(mt); if (mt->dirty != 0) vm_page_deactivate(mt); else vm_page_cache(mt); to: vm_page_dontneed(mt); because I suspect that the current code does more harm than good. In theory, it saves activations of the page daemon. However, more often than not, I suspect that we are spending more on page reactivations than we are saving on page daemon activations. The sequential access detection heuristic is just too easily triggered. For example, I've seen it triggered by demand paging of the gcc text segment. Also, I think that pmap_remove_all() and especially vm_page_cache() are too severe for a detection heuristic that is so easily triggered. > What happen is that fault handler deactivates or caches the pages > previous to the one which would satisfy the fault. See the if() > statement starting at line 463 of vm/vm_fault.c. Since all pages > of the object in your test are clean, the pages are cached. > > Next fault would need to allocate some more pages for different index > of the same object. What I see is that vm_reserv_alloc_page() returns a > page that is from the cache for the same object, but different pindex. > As an obvious result, the page is invalidated and repurposed. When next > loop started, the page is not resident anymore, so it has to be re-read > from disk. > > The behaviour of the allocator is not consistent, so some pages are not > reused, allowing the test to converge and to collect all pages of the > object eventually. > > Calling madvise(MADV_RANDOM) fixes the issue, because the code to > deactivate/cache the pages is turned off. 
On the other hand, it also > turns of read-ahead for faulting, and the first loop becomes eternally > long. > > Doing MADV_WILLNEED does not fix the problem indeed, since willneed > reactivates the pages of the object at the time of call. To use > MADV_WILLNEED, you would need to call it between faults/memcpy. > >> I've also never seen super pages, how to make them work? > They just work, at least for me. Look at the output of procstat -v > after enough loops finished to not cause disk activity. > >> I've been playing with madvise and posix_fadvise but no luck. BTW, >> posix_fadvise(POSIX_FADV_WILLNEED) does nothing as the commentary says, >> shouldn't this be documented in the manual page? >> >> All tests were run under 9.0-STABLE (r233744). >> >> -- >> Andrey Zonov >> /*_ >> * Andrey Zonov (c) 2011 >> */ >> >> #include >> #include >> #include >> #include >> #include >> #include >> #include >> #include >> #include >> >> int >> main(int argc, char **argv) >> { >> int i; >> int fd; >> int num; >> int block; >> int pagesize; >> size_t n; >> size_t size; >> size_t none, incore, super, other; >> char *p; >> char *tmp; >> char *vec; >> char *vecp; >> struct stat sb; >> struct timeval tp, tp1, tp2; >> >> if (argc< 2 || argc> 4) >> errx(1, "usage: mmap [num] [block]"); >> >> fd = open(argv[1], O_RDONLY); >> if (fd == -1) >> err(1, "open()"); >> >> num = 1; >> if (argc>= 3) >> num = atoi(argv[2]); >> >> pagesize = getpagesize(); >> block = pagesize; >> if (argc == 4) >> block = atoi(argv[3]); >> >> if (fstat(fd,&sb) == -1) >> err(1, "fstat()"); >> size = sb.st_size; >> >> #if 0 >> if (posix_fadvise(fd, (off_t)0, (off_t)0, POSIX_FADV_WILLNEED) == -1) >> err(1, "posix_fadvise()"); >> #endif >> >> p = mmap(NULL, sb.st_size, PROT_READ, /*MAP_PREFAULT_READ |*/ MAP_PRIVATE, fd, (off_t)0); >> if (p == MAP_FAILED) >> err(1, "mmap()"); >> >> #if 0 >> if (madvise(p, (size_t)size, MADV_WILLNEED) == -1) >> err(1, "madvise()"); >> #endif >> >> tmp = calloc(1, block); >> if (tmp == 
NULL) >> err(1, "calloc()"); >> vec = calloc(1, size / pagesize); >> if (vec == NULL) >> err(1, "calloc()"); >> for (i = 0; i< num; i++) { >> gettimeofday(&tp1, NULL); >> for (n = 0; n< size / block; n++) >> memcpy(tmp, p + (n * block), block); >> gettimeofday(&tp2, NULL); >> timersub(&tp2,&tp1,&tp); >> >> if (mincore(p, size, vec) == -1) >> err(1, "mincore()"); >> >> none = incore = super = other = 0; >> for (vecp = vec; (size_t)(vecp - vec)< size / pagesize; vecp++) { >> if (*vecp == 0) >> none++; >> else if (*vecp& MINCORE_INCORE) >> incore++; >> else if (*vecp& MINCORE_SUPER) >> super++; >> else >> other++; >> } >> warnx("%2d pass took: %3ld.%06ld (none: %6ld; res: %6ld; super: %6ld; other: %6ld)", >> i + 1, tp.tv_sec, tp.tv_usec, none, incore, super, other); >> } >> free(vec); >> free(tmp); >> >> if (munmap(p, sb.st_size) == -1) >> err(1, "munmap()"); >> >> close(fd); >> >> exit(0); >> } >> _______________________________________________ >> freebsd-hackers@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers >> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" From owner-freebsd-hackers@FreeBSD.ORG Thu Apr 5 16:41:18 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3D050106564A; Thu, 5 Apr 2012 16:41:18 +0000 (UTC) (envelope-from alc@rice.edu) Received: from mh8.mail.rice.edu (mh8.mail.rice.edu [128.42.201.24]) by mx1.freebsd.org (Postfix) with ESMTP id 006158FC08; Thu, 5 Apr 2012 16:41:17 +0000 (UTC) Received: from mh8.mail.rice.edu (localhost.localdomain [127.0.0.1]) by mh8.mail.rice.edu (Postfix) with ESMTP id B1EF0291D5A; Thu, 5 Apr 2012 11:41:17 -0500 (CDT) Received: from mh8.mail.rice.edu (localhost.localdomain [127.0.0.1]) by mh8.mail.rice.edu (Postfix) with ESMTP id A4C3329761F; Thu, 5 Apr 2012 11:41:17 -0500 (CDT) X-Virus-Scanned: by amavis-2.6.4 at mh8.mail.rice.edu, 
auth channel Received: from mh8.mail.rice.edu ([127.0.0.1]) by mh8.mail.rice.edu (mh8.mail.rice.edu [127.0.0.1]) (amavis, port 10026) with ESMTP id Fy1thYzieUxi; Thu, 5 Apr 2012 11:41:17 -0500 (CDT) Received: from adsl-216-63-78-18.dsl.hstntx.swbell.net (adsl-216-63-78-18.dsl.hstntx.swbell.net [216.63.78.18]) (using TLSv1 with cipher RC4-MD5 (128/128 bits)) (No client certificate requested) (Authenticated sender: alc) by mh8.mail.rice.edu (Postfix) with ESMTPSA id C6411291CFC; Thu, 5 Apr 2012 11:41:16 -0500 (CDT) Message-ID: <4F7DCB2C.7070709@rice.edu> Date: Thu, 05 Apr 2012 11:41:16 -0500 From: Alan Cox User-Agent: Mozilla/5.0 (X11; FreeBSD i386; rv:8.0) Gecko/20111113 Thunderbird/8.0 MIME-Version: 1.0 To: Andrey Zonov References: <4F7B495D.3010402@zonov.org> <20120404071746.GJ2358@deviant.kiev.zoral.com.ua> <4F7C1620.6040703@zonov.org> In-Reply-To: <4F7C1620.6040703@zonov.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Mailman-Approved-At: Thu, 05 Apr 2012 16:54:06 +0000 Cc: Konstantin Belousov , freebsd-hackers@freebsd.org, alc@freebsd.org Subject: Re: problems with mmap() and disk caching X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 Apr 2012 16:41:18 -0000 On 04/04/2012 04:36, Andrey Zonov wrote: > On 04.04.2012 11:17, Konstantin Belousov wrote: >> >> Calling madvise(MADV_RANDOM) fixes the issue, because the code to >> deactivate/cache the pages is turned off. On the other hand, it also >> turns of read-ahead for faulting, and the first loop becomes eternally >> long. > > Now it takes 5 times longer. Anyway, thanks for explanation. > >> >> Doing MADV_WILLNEED does not fix the problem indeed, since willneed >> reactivates the pages of the object at the time of call. 
To use >> MADV_WILLNEED, you would need to call it between faults/memcpy. >> > > I played with it, but no luck so far. > >>> >>> I've also never seen super pages, how to make them work? >> They just work, at least for me. Look at the output of procstat -v >> after enough loops finished to not cause disk activity. >> > > The problem was in my test program. I fixed it, now I see super pages > but I'm still not satisfied. There are several tests below: > > 1. With madvise(MADV_RANDOM) I see almost all super pages: > $ ./mmap /mnt/random-1024 5 > mmap: 1 pass took: 26.438535 (none: 0; res: 262144; super: 511; > other: 0) > mmap: 2 pass took: 0.187311 (none: 0; res: 262144; super: 511; > other: 0) > mmap: 3 pass took: 0.184953 (none: 0; res: 262144; super: 511; > other: 0) > mmap: 4 pass took: 0.186007 (none: 0; res: 262144; super: 511; > other: 0) > mmap: 5 pass took: 0.185790 (none: 0; res: 262144; super: 511; > other: 0) > > Should it be 512? > Check the starting virtual address. It is probably not aligned on a superpage boundary. Hence, a few pages at the start and end of your mapped region are not in a superpage. > 2. Without madvise(MADV_RANDOM): > $ ./mmap /mnt/random-1024 50 > mmap: 1 pass took: 7.629745 (none: 262112; res: 32; super: 0; > other: 0) > mmap: 2 pass took: 7.301720 (none: 261202; res: 942; super: 0; > other: 0) > mmap: 3 pass took: 7.261416 (none: 260226; res: 1918; super: 1; > other: 0) > [skip] > mmap: 49 pass took: 0.155368 (none: 0; res: 262144; super: 323; > other: 0) > mmap: 50 pass took: 0.155438 (none: 0; res: 262144; super: 323; > other: 0) > > Only 323 pages. > > 3. If I just re-run test I don't see super pages with any size of > "block". 
> > $ ./mmap /mnt/random-1024 5 $((1<<30)) > mmap: 1 pass took: 1.013939 (none: 0; res: 262144; super: 0; > other: 0) > mmap: 2 pass took: 0.267082 (none: 0; res: 262144; super: 0; > other: 0) > mmap: 3 pass took: 0.270711 (none: 0; res: 262144; super: 0; > other: 0) > mmap: 4 pass took: 0.268940 (none: 0; res: 262144; super: 0; > other: 0) > mmap: 5 pass took: 0.269634 (none: 0; res: 262144; super: 0; > other: 0) > > 4. If I activate madvise(MADV_WILLNEDD) in the copy loop and re-run > test then I see super pages only if I use "block" greater than 2Mb. > > $ ./mmap /mnt/random-1024 1 $((1<<21)) > mmap: 1 pass took: 0.299722 (none: 0; res: 262144; super: 0; > other: 0) > $ ./mmap /mnt/random-1024 1 $((1<<22)) > mmap: 1 pass took: 0.271828 (none: 0; res: 262144; super: 170; > other: 0) > $ ./mmap /mnt/random-1024 1 $((1<<23)) > mmap: 1 pass took: 0.333188 (none: 0; res: 262144; super: 258; > other: 0) > $ ./mmap /mnt/random-1024 1 $((1<<24)) > mmap: 1 pass took: 0.339250 (none: 0; res: 262144; super: 303; > other: 0) > $ ./mmap /mnt/random-1024 1 $((1<<25)) > mmap: 1 pass took: 0.418812 (none: 0; res: 262144; super: 324; > other: 0) > $ ./mmap /mnt/random-1024 1 $((1<<26)) > mmap: 1 pass took: 0.360892 (none: 0; res: 262144; super: 335; > other: 0) > $ ./mmap /mnt/random-1024 1 $((1<<27)) > mmap: 1 pass took: 0.401122 (none: 0; res: 262144; super: 342; > other: 0) > $ ./mmap /mnt/random-1024 1 $((1<<28)) > mmap: 1 pass took: 0.478764 (none: 0; res: 262144; super: 345; > other: 0) > $ ./mmap /mnt/random-1024 1 $((1<<29)) > mmap: 1 pass took: 0.607266 (none: 0; res: 262144; super: 346; > other: 0) > $ ./mmap /mnt/random-1024 1 $((1<<30)) > mmap: 1 pass took: 0.901269 (none: 0; res: 262144; super: 347; > other: 0) > > 5. If I activate madvise(MADV_WILLNEED) immediately after mmap() then > I see some number of super pages (the number from test #2). 
> > $ ./mmap /mnt/random-1024 5 > mmap: 1 pass took: 0.178666 (none: 0; res: 262144; super: 323; > other: 0) > mmap: 2 pass took: 0.158889 (none: 0; res: 262144; super: 323; > other: 0) > mmap: 3 pass took: 0.157229 (none: 0; res: 262144; super: 323; > other: 0) > mmap: 4 pass took: 0.156895 (none: 0; res: 262144; super: 323; > other: 0) > mmap: 5 pass took: 0.162938 (none: 0; res: 262144; super: 323; > other: 0) > > 6. If I read file manually before test then I don't see super pages > with any size of "block" and madvise(MADV_WILLNEED) doesn't help. > > $ ./mmap /mnt/random-1024 5 $((1<<30)) > mmap: 1 pass took: 0.996767 (none: 0; res: 262144; super: 0; > other: 0) > mmap: 2 pass took: 0.311129 (none: 0; res: 262144; super: 0; > other: 0) > mmap: 3 pass took: 0.317430 (none: 0; res: 262144; super: 0; > other: 0) > mmap: 4 pass took: 0.314437 (none: 0; res: 262144; super: 0; > other: 0) > mmap: 5 pass took: 0.310757 (none: 0; res: 262144; super: 0; > other: 0) > > When you read manually, i.e., perform a dd in advance of running your test program, the VM subsystem doesn't know that you intend to later mmap() the data. Moreover, it doesn't know what the alignment of that mapping will be. So, when it allocates physical memory for the file during the running of dd, it only allocates ordinary pages. I suspect that the rest of your results are explained by the overzealous behavior of the sequential access / cache-behind heuristic that Kostik described. 
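
The alignment point made earlier in this message (511 superpages instead of 512) can be illustrated with a little arithmetic. This is a minimal sketch, not kernel code: full_superpages() is a hypothetical helper, and a 2 MB superpage size is assumed in the example values.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: count the superpage-sized, superpage-aligned
 * chunks that fit entirely inside the mapping [start, start + len).
 * super_size must be a power of two (2 MB is ASSUMED in the note below).
 */
uint64_t
full_superpages(uint64_t start, uint64_t len, uint64_t super_size)
{
	uint64_t first, last;

	assert(super_size != 0 && (super_size & (super_size - 1)) == 0);

	/* First superpage boundary at or after the start of the mapping. */
	first = (start + super_size - 1) & ~(super_size - 1);
	/* Last superpage boundary at or before the end of the mapping. */
	last = (start + len) & ~(super_size - 1);

	if (last <= first)
		return (0);
	return ((last - first) / super_size);
}
```

For a 1 GB mapping that starts one 4 KB page past a 2 MB boundary, this returns 511; with a 2 MB-aligned start it returns 512.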
Alan From owner-freebsd-hackers@FreeBSD.ORG Thu Apr 5 17:15:56 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 15F80106564A; Thu, 5 Apr 2012 17:15:56 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id DF4B18FC12; Thu, 5 Apr 2012 17:15:55 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 3E3A7B945; Thu, 5 Apr 2012 13:15:55 -0400 (EDT) From: John Baldwin To: freebsd-hackers@freebsd.org, davidxu@freebsd.org Date: Thu, 5 Apr 2012 12:01:58 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p10; KDE/4.5.5; amd64; ; ) References: <1333590846.58474.YahooMailClassic@web180011.mail.gq1.yahoo.com> <20120405035645.GO2358@deviant.kiev.zoral.com.ua> <4F7D28AB.605@gmail.com> In-Reply-To: <4F7D28AB.605@gmail.com> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201204051201.58651.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Thu, 05 Apr 2012 13:15:55 -0400 (EDT) Cc: Subject: Re: Startvation of realtime piority threads X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 Apr 2012 17:15:56 -0000 On Thursday, April 05, 2012 1:07:55 am David Xu wrote: > On 2012/4/5 11:56, Konstantin Belousov wrote: > > On Wed, Apr 04, 2012 at 06:54:06PM -0700, Sushanth Rai wrote: > >> I have a multithreaded user space program that basically runs at realtime priority. Synchronization between threads are done using spinlock. 
When running this program on an SMP system under heavy memory pressure I see that the thread holding the spinlock is starved out of cpu. The cpus are effectively consumed by other threads that are spinning for the lock to become available.
> >>
> >> After instrumenting the kernel a little bit, what I found was that under memory pressure, when the user thread holding the spinlock traps into the kernel due to a page fault, that thread sleeps until free pages are available. The thread sleeps at PUSER priority (within vm_waitpfault()). When it is ready to run, it is queued at PUSER priority even though its base priority is realtime. The other sibling threads that are spinning at realtime priority to acquire the spinlock starve the owner of the spinlock.
> >>
> >> I was wondering if the sleep in vm_waitpfault() should be a MAX(td_user_pri, PUSER) instead of just PUSER. I'm running on 7.2 and it looks like this logic is the same in the trunk.
> > It just so happens that your program stumbles upon a single sleep point in
> > the kernel. If for whatever reason the thread in kernel is put off CPU
> > due to failure to acquire any resource without priority propagation,
> > you would get the same effect. Only blockable primitives do priority
> > propagation, namely mutexes and rwlocks, AFAIR. In other words, any
> > sx/lockmgr/sleep points are vulnerable to the same issue.
> This is why I suggested that POSIX realtime priority should not be
> boosted; it should be only higher than PRI_MIN_TIMESHARE but lower
> than any priority all msleep() callers provided. The problem is that
> a userland realtime thread's busy looping code can starve a thread in
> the kernel which is holding a critical resource.
> In kernel we can avoid writing dead-loop code, but userland code is not
> trustable.

Note that you have to be root to be rtprio, and that there is trustable userland code (just because you haven't used any doesn't mean it doesn't exist).
> If you search "Realtime thread priorities" in 2010-december within @arch > list. > you may find the argument. I think the bug here is that sched_sleep() should not lower the priority of an rtprio process. It should arguably not raise the priority of an idprio process either, but sched_sleep() should probably only apply to timesharing threads. All that said, userland rtprio code is going to have to be careful. It should be using things like wired memory as Kostik suggested, and probably avoiding most system calls. You can definitely blow your foot off quite easily in lots of ways with rtprio. -- John Baldwin From owner-freebsd-hackers@FreeBSD.ORG Thu Apr 5 17:35:31 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 043C3106564A; Thu, 5 Apr 2012 17:35:31 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id AA7618FC19; Thu, 5 Apr 2012 17:35:29 +0000 (UTC) Received: from skuns.kiev.zoral.com.ua (localhost [127.0.0.1]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id q35HVcrV033090; Thu, 5 Apr 2012 20:31:38 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.5/8.14.5) with ESMTP id q35HVcHi027847; Thu, 5 Apr 2012 20:31:38 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.5/8.14.5/Submit) id q35HVcaj027846; Thu, 5 Apr 2012 20:31:38 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 5 Apr 2012 20:31:38 +0300 From: Konstantin Belousov To: Alan Cox Message-ID: <20120405173138.GX2358@deviant.kiev.zoral.com.ua> References: <4F7B495D.3010402@zonov.org> 
<20120404071746.GJ2358@deviant.kiev.zoral.com.ua> <4F7DC037.9060803@rice.edu> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="zr9wCmgsEgDWsI0K" Content-Disposition: inline In-Reply-To: <4F7DC037.9060803@rice.edu> User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean Cc: alc@freebsd.org, freebsd-hackers@freebsd.org, Andrey Zonov Subject: Re: problems with mmap() and disk caching X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 Apr 2012 17:35:31 -0000 --zr9wCmgsEgDWsI0K Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Apr 05, 2012 at 10:54:31AM -0500, Alan Cox wrote: > On 04/04/2012 02:17, Konstantin Belousov wrote: > >On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote: > >>Hi, > >> > >>I open the file, then call mmap() on the whole file and get pointer, > >>then I work with this pointer. I expect that page should be only once > >>touched to get it into the memory (disk cache?), but this doesn't work! 
> >>
> >>I wrote the test (attached) and ran it for the 1G file generated from
> >>/dev/random, the result is the following:
> >>
> >>Prepare file:
> >># swapoff -a
> >># newfs /dev/ada0b
> >># mount /dev/ada0b /mnt
> >># dd if=/dev/random of=/mnt/random-1024 bs=1m count=1024
> >>
> >>Purge cache:
> >># umount /mnt
> >># mount /dev/ada0b /mnt
> >>
> >>Run test:
> >>$ ./mmap /mnt/random-1024 30
> >>mmap: 1 pass took: 7.431046 (none: 262112; res: 32; super: 0; other: 0)
> >>mmap: 2 pass took: 7.356670 (none: 261648; res: 496; super: 0; other: 0)
> >>mmap: 3 pass took: 7.307094 (none: 260521; res: 1623; super: 0; other: 0)
> >>mmap: 4 pass took: 7.350239 (none: 258904; res: 3240; super: 0; other: 0)
> >>mmap: 5 pass took: 7.392480 (none: 257286; res: 4858; super: 0; other: 0)
> >>mmap: 6 pass took: 7.292069 (none: 255584; res: 6560; super: 0; other: 0)
> >>mmap: 7 pass took: 7.048980 (none: 251142; res: 11002; super: 0; other: 0)
> >>mmap: 8 pass took: 6.899387 (none: 247584; res: 14560; super: 0; other: 0)
> >>mmap: 9 pass took: 7.190579 (none: 242992; res: 19152; super: 0; other: 0)
> >>mmap: 10 pass took: 6.915482 (none: 239308; res: 22836; super: 0; other: 0)
> >>mmap: 11 pass took: 6.565909 (none: 232835; res: 29309; super: 0; other: 0)
> >>mmap: 12 pass took: 6.423945 (none: 226160; res: 35984; super: 0; other: 0)
> >>mmap: 13 pass took: 6.315385 (none: 208555; res: 53589; super: 0; other: 0)
> >>mmap: 14 pass took: 6.760780 (none: 192805; res: 69339; super: 0; other: 0)
> >>mmap: 15 pass took: 5.721513 (none: 174497; res: 87647; super: 0; other: 0)
> >>mmap: 16 pass took: 5.004424 (none: 155938; res: 106206; super: 0; other: 0)
> >>mmap: 17 pass took: 4.224926 (none: 135639; res: 126505; super: 0; other: 0)
> >>mmap: 18 pass took: 3.749608 (none: 117952; res: 144192; super: 0; other: 0)
> >>mmap: 19 pass took: 3.398084 (none: 99066; res: 163078; super: 0; other: 0)
> >>mmap: 20
pass took: 3.029557 (none: 74994; res: 187150; super: 0; other: 0)
> >>mmap: 21 pass took: 2.379430 (none: 55231; res: 206913; super: 0; other: 0)
> >>mmap: 22 pass took: 2.046521 (none: 40786; res: 221358; super: 0; other: 0)
> >>mmap: 23 pass took: 1.152797 (none: 30311; res: 231833; super: 0; other: 0)
> >>mmap: 24 pass took: 0.972617 (none: 16196; res: 245948; super: 0; other: 0)
> >>mmap: 25 pass took: 0.577515 (none: 8286; res: 253858; super: 0; other: 0)
> >>mmap: 26 pass took: 0.380738 (none: 3712; res: 258432; super: 0; other: 0)
> >>mmap: 27 pass took: 0.253583 (none: 1193; res: 260951; super: 0; other: 0)
> >>mmap: 28 pass took: 0.157508 (none: 0; res: 262144; super: 0; other: 0)
> >>mmap: 29 pass took: 0.156169 (none: 0; res: 262144; super: 0; other: 0)
> >>mmap: 30 pass took: 0.156550 (none: 0; res: 262144; super: 0; other: 0)
> >>
> >>If I ran this:
> >>$ cat /mnt/random-1024 > /dev/null
> >>before test, when result is the following:
> >>
> >>$ ./mmap /mnt/random-1024 5
> >>mmap: 1 pass took: 0.337657 (none: 0; res: 262144; super: 0; other: 0)
> >>mmap: 2 pass took: 0.186137 (none: 0; res: 262144; super: 0; other: 0)
> >>mmap: 3 pass took: 0.186132 (none: 0; res: 262144; super: 0; other: 0)
> >>mmap: 4 pass took: 0.186535 (none: 0; res: 262144; super: 0; other: 0)
> >>mmap: 5 pass took: 0.190353 (none: 0; res: 262144; super: 0; other: 0)
> >>
> >>This is what I expect. But why this doesn't work without reading file
> >>manually?
> >Issue seems to be in some change of the behaviour of the reserv or
> >phys allocator. I Cc:ed Alan.
>
> I'm pretty sure that the behavior here hasn't significantly changed in
> about twelve years. Otherwise, I agree with your analysis.
>
> On more than one occasion, I've been tempted to change:
>
>         pmap_remove_all(mt);
>         if (mt->dirty != 0)
>                 vm_page_deactivate(mt);
>         else
>                 vm_page_cache(mt);
>
> to:
>
>         vm_page_dontneed(mt);
>
> because I suspect that the current code does more harm than good. In
> theory, it saves activations of the page daemon. However, more often
> than not, I suspect that we are spending more on page reactivations than
> we are saving on page daemon activations. The sequential access
> detection heuristic is just too easily triggered. For example, I've
> seen it triggered by demand paging of the gcc text segment. Also, I
> think that pmap_remove_all() and especially vm_page_cache() are too
> severe for a detection heuristic that is so easily triggered.

Yes, I agree that such a change would be an improvement, and I expect that
Andrey will test it. On the other hand, I do think that the allocator should
prefer unnamed pages to pages which still have valid content. On my
12G desktop, I never saw more than 100MB of cached pages, and similar
numbers are observed on the 32-48GB servers. I suppose that this is
related.
--zr9wCmgsEgDWsI0K Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (FreeBSD) iEYEARECAAYFAk991voACgkQC3+MBN1Mb4hlcgCfR9YVkv2Oj7ybQhmro4m7Ewgs FxEAn1urOu+uu1tcLh4u7H56v/oNAsJJ =HI7f -----END PGP SIGNATURE----- --zr9wCmgsEgDWsI0K-- From owner-freebsd-hackers@FreeBSD.ORG Thu Apr 5 18:12:21 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 21A10106564A; Thu, 5 Apr 2012 18:12:21 +0000 (UTC) (envelope-from lacombar@gmail.com) Received: from mail-gy0-f182.google.com (mail-gy0-f182.google.com [209.85.160.182]) by mx1.freebsd.org (Postfix) with ESMTP id 7AE258FC0C; Thu, 5 Apr 2012 18:12:20 +0000 (UTC) Received: by ghrr20 with SMTP id r20so1087338ghr.13 for ; Thu, 05 Apr 2012 11:12:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=Tb0bjDRLad2LfXHu2UGx7fyYltMUsQ3lDS+HJQzMfTc=; b=OPSNgxsrhN9+GS2NfzUy/wi99JH6iyLpLOsLogCxNpFm8ELkh14C/gvM9OqbEEvutg 2Rfv0INAZmu6hqAFeiW/09p9KoEq/d20oH/YYvFb5xki+uyGhFzfusvYGG8VI4MA+x07 DdzTLjzSviQiIBo2w8c5Kx/MYZMHC91O8JebQSgy59+ghaeIl0FImKPSuvBOGh07Cy+l xZu+VMAF2n15+M3Fwiq6W+lXp3NjSD6bzurE8z4/J0PwNAc6MMBhww9zODVEdLsJRo0o xq0GybtMLSYO+l2Eq7VkiINgp/LC0Zm/rFe+AUr5or/fwNWAb4N1XAuk//2Vxrv55k1G qO5A== MIME-Version: 1.0 Received: by 10.236.197.66 with SMTP id s42mr3270180yhn.69.1333649533960; Thu, 05 Apr 2012 11:12:13 -0700 (PDT) Received: by 10.220.185.138 with HTTP; Thu, 5 Apr 2012 11:12:13 -0700 (PDT) In-Reply-To: <4F3E8858.4000001@FreeBSD.org> References: <4F2F7B7F.40508@FreeBSD.org> <4F366E8F.9060207@FreeBSD.org> <4F367965.6000602@FreeBSD.org> <4F396B24.5090602@FreeBSD.org> <4F3978BC.6090608@FreeBSD.org> <4F3990EA.1080002@FreeBSD.org> <4F3C0BB9.6050101@FreeBSD.org> <4F3E807A.60103@FreeBSD.org> 
<4F3E8858.4000001@FreeBSD.org> Date: Thu, 5 Apr 2012 14:12:13 -0400 Message-ID: From: Arnaud Lacombe To: Alexander Motin Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-hackers@freebsd.org, Florian Smeets , Jeff Roberson , Andriy Gapon , FreeBSD current Subject: Re: [RFT][patch] Scheduling for HTT and not only X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 Apr 2012 18:12:21 -0000

Hi,

[Sorry for the delay, I got a bit sidetrack'ed...]

2012/2/17 Alexander Motin:
> On 17.02.2012 18:53, Arnaud Lacombe wrote:
>>
>> On Fri, Feb 17, 2012 at 11:29 AM, Alexander Motin wrote:
>>>
>>> On 02/15/12 21:54, Jeff Roberson wrote:
>>>>
>>>> On Wed, 15 Feb 2012, Alexander Motin wrote:
>>>>>
>>>>> I've decided to stop those cache black magic practices and focus on
>>>>> things that really exist in this world -- SMT and CPU load. I've
>>>>> dropped most of cache related things from the patch and made the rest
>>>>> of things more strict and predictable:
>>>>> http://people.freebsd.org/~mav/sched.htt34.patch
>>>>
>>>>
>>>> This looks great. I think there is value in considering the other
>>>> approach further but I would like to do this part first. It would be
>>>> nice to also add priority as a greater influence in the load balancing
>>>> as well.
>>>
>>>
>>> I haven't got good idea yet about balancing priorities, but I've
>>> rewritten
>>> balancer itself. As soon as sched_lowest() / sched_highest() are more
>>> intelligent now, they allowed to remove topology traversing from the
>>> balancer itself. That should fix double-swapping problem, allow to keep
>>> some
>>> affinity while moving threads and make balancing more fair. I did number
>>> of
>>> tests running 4, 8, 9 and 16 CPU-bound threads on 8 CPUs.
With 4, 8 and
>>> 16
>>> threads everything is stationary as it should. With 9 threads I see
>>> regular
>>> and random load move between all 8 CPUs. Measurements on 5 minutes run
>>> show
>>> deviation of only about 5 seconds. It is the same deviation as I see
>>> caused
>>> by only scheduling of 16 threads on 8 cores without any balancing needed
>>> at
>>> all. So I believe this code works as it should.
>>>
>>> Here is the patch: http://people.freebsd.org/~mav/sched.htt40.patch
>>>
>>> I plan this to be a final patch of this series (more to come :)) and if
>>> there will be no problems or objections, I am going to commit it (except
>>> some debugging KTRs) in about ten days. So now it's a good time for
>>> reviews
>>> and testing. :)
>>>
>> is there a place where all the patches are available ?
>
>
> All my scheduler patches are cumulative, so all you need is only the last
> mentioned here sched.htt40.patch.
>
You may want to have a look at the results I collected in the
`runs/freebsd-experiments' branch of:

https://github.com/lacombar/hackbench/

and compare them with vanilla FreeBSD 9.0 and -CURRENT results
available in `runs/freebsd'. On the dual package platform, your patch
is not a definite win.

> But in some cases, especially for multi-socket systems, to let it show its
> best, you may want to apply additional patch from avg@ to better detect CPU
> topology:
> https://gitorious.org/~avg/freebsd/avgbsd/commit/6bca4a2e4854ea3fc275946a023db65c483cb9dd
>
The test I conducted specifically for this patch did not show much improvement...
- Arnaud From owner-freebsd-hackers@FreeBSD.ORG Thu Apr 5 18:25:52 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4B29D1065678; Thu, 5 Apr 2012 18:25:52 +0000 (UTC) (envelope-from alc@rice.edu) Received: from mh8.mail.rice.edu (mh8.mail.rice.edu [128.42.201.24]) by mx1.freebsd.org (Postfix) with ESMTP id C939F8FC0A; Thu, 5 Apr 2012 18:25:51 +0000 (UTC) Received: from mh8.mail.rice.edu (localhost.localdomain [127.0.0.1]) by mh8.mail.rice.edu (Postfix) with ESMTP id 1E9CA291D6C; Thu, 5 Apr 2012 13:25:51 -0500 (CDT) Received: from mh8.mail.rice.edu (localhost.localdomain [127.0.0.1]) by mh8.mail.rice.edu (Postfix) with ESMTP id 0634229761F; Thu, 5 Apr 2012 13:25:51 -0500 (CDT) X-Virus-Scanned: by amavis-2.6.4 at mh8.mail.rice.edu, auth channel Received: from mh8.mail.rice.edu ([127.0.0.1]) by mh8.mail.rice.edu (mh8.mail.rice.edu [127.0.0.1]) (amavis, port 10026) with ESMTP id sxOt7pzi1VbQ; Thu, 5 Apr 2012 13:25:50 -0500 (CDT) Received: from adsl-216-63-78-18.dsl.hstntx.swbell.net (adsl-216-63-78-18.dsl.hstntx.swbell.net [216.63.78.18]) (using TLSv1 with cipher RC4-MD5 (128/128 bits)) (No client certificate requested) (Authenticated sender: alc) by mh8.mail.rice.edu (Postfix) with ESMTPSA id 5A1D4291D6C; Thu, 5 Apr 2012 13:25:50 -0500 (CDT) Message-ID: <4F7DE3AD.5080401@rice.edu> Date: Thu, 05 Apr 2012 13:25:49 -0500 From: Alan Cox User-Agent: Mozilla/5.0 (X11; FreeBSD i386; rv:8.0) Gecko/20111113 Thunderbird/8.0 MIME-Version: 1.0 To: Konstantin Belousov References: <4F7B495D.3010402@zonov.org> <20120404071746.GJ2358@deviant.kiev.zoral.com.ua> <4F7DC037.9060803@rice.edu> <20120405173138.GX2358@deviant.kiev.zoral.com.ua> In-Reply-To: <20120405173138.GX2358@deviant.kiev.zoral.com.ua> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Mailman-Approved-At: Thu, 05 Apr 2012 18:45:30 +0000 Cc: 
alc@freebsd.org, freebsd-hackers@freebsd.org, Andrey Zonov Subject: Re: problems with mmap() and disk caching X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 Apr 2012 18:25:52 -0000 On 04/05/2012 12:31, Konstantin Belousov wrote: > On Thu, Apr 05, 2012 at 10:54:31AM -0500, Alan Cox wrote: >> On 04/04/2012 02:17, Konstantin Belousov wrote: >>> On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote: >>>> Hi, >>>> >>>> I open the file, then call mmap() on the whole file and get pointer, >>>> then I work with this pointer. I expect that page should be only once >>>> touched to get it into the memory (disk cache?), but this doesn't work! >>>> >>>> I wrote the test (attached) and ran it for the 1G file generated from >>>> /dev/random, the result is the following: >>>> >>>> Prepare file: >>>> # swapoff -a >>>> # newfs /dev/ada0b >>>> # mount /dev/ada0b /mnt >>>> # dd if=/dev/random of=/mnt/random-1024 bs=1m count=1024 >>>> >>>> Purge cache: >>>> # umount /mnt >>>> # mount /dev/ada0b /mnt >>>> >>>> Run test: >>>> $ ./mmap /mnt/random-1024 30 >>>> mmap: 1 pass took: 7.431046 (none: 262112; res: 32; super: >>>> 0; other: 0) >>>> mmap: 2 pass took: 7.356670 (none: 261648; res: 496; super: >>>> 0; other: 0) >>>> mmap: 3 pass took: 7.307094 (none: 260521; res: 1623; super: >>>> 0; other: 0) >>>> mmap: 4 pass took: 7.350239 (none: 258904; res: 3240; super: >>>> 0; other: 0) >>>> mmap: 5 pass took: 7.392480 (none: 257286; res: 4858; super: >>>> 0; other: 0) >>>> mmap: 6 pass took: 7.292069 (none: 255584; res: 6560; super: >>>> 0; other: 0) >>>> mmap: 7 pass took: 7.048980 (none: 251142; res: 11002; super: >>>> 0; other: 0) >>>> mmap: 8 pass took: 6.899387 (none: 247584; res: 14560; super: >>>> 0; other: 0) >>>> mmap: 9 pass took: 7.190579 (none: 242992; res: 19152; super: >>>> 0; other: 
0) >>>> mmap: 10 pass took: 6.915482 (none: 239308; res: 22836; super: >>>> 0; other: 0) >>>> mmap: 11 pass took: 6.565909 (none: 232835; res: 29309; super: >>>> 0; other: 0) >>>> mmap: 12 pass took: 6.423945 (none: 226160; res: 35984; super: >>>> 0; other: 0) >>>> mmap: 13 pass took: 6.315385 (none: 208555; res: 53589; super: >>>> 0; other: 0) >>>> mmap: 14 pass took: 6.760780 (none: 192805; res: 69339; super: >>>> 0; other: 0) >>>> mmap: 15 pass took: 5.721513 (none: 174497; res: 87647; super: >>>> 0; other: 0) >>>> mmap: 16 pass took: 5.004424 (none: 155938; res: 106206; super: >>>> 0; other: 0) >>>> mmap: 17 pass took: 4.224926 (none: 135639; res: 126505; super: >>>> 0; other: 0) >>>> mmap: 18 pass took: 3.749608 (none: 117952; res: 144192; super: >>>> 0; other: 0) >>>> mmap: 19 pass took: 3.398084 (none: 99066; res: 163078; super: >>>> 0; other: 0) >>>> mmap: 20 pass took: 3.029557 (none: 74994; res: 187150; super: >>>> 0; other: 0) >>>> mmap: 21 pass took: 2.379430 (none: 55231; res: 206913; super: >>>> 0; other: 0) >>>> mmap: 22 pass took: 2.046521 (none: 40786; res: 221358; super: >>>> 0; other: 0) >>>> mmap: 23 pass took: 1.152797 (none: 30311; res: 231833; super: >>>> 0; other: 0) >>>> mmap: 24 pass took: 0.972617 (none: 16196; res: 245948; super: >>>> 0; other: 0) >>>> mmap: 25 pass took: 0.577515 (none: 8286; res: 253858; super: >>>> 0; other: 0) >>>> mmap: 26 pass took: 0.380738 (none: 3712; res: 258432; super: >>>> 0; other: 0) >>>> mmap: 27 pass took: 0.253583 (none: 1193; res: 260951; super: >>>> 0; other: 0) >>>> mmap: 28 pass took: 0.157508 (none: 0; res: 262144; super: >>>> 0; other: 0) >>>> mmap: 29 pass took: 0.156169 (none: 0; res: 262144; super: >>>> 0; other: 0) >>>> mmap: 30 pass took: 0.156550 (none: 0; res: 262144; super: >>>> 0; other: 0) >>>> >>>> If I ran this: >>>> $ cat /mnt/random-1024> /dev/null >>>> before test, when result is the following: >>>> >>>> $ ./mmap /mnt/random-1024 5 >>>> mmap: 1 pass took: 0.337657 (none: 0; res: 
262144; super: >>>> 0; other: 0) >>>> mmap: 2 pass took: 0.186137 (none: 0; res: 262144; super: >>>> 0; other: 0) >>>> mmap: 3 pass took: 0.186132 (none: 0; res: 262144; super: >>>> 0; other: 0) >>>> mmap: 4 pass took: 0.186535 (none: 0; res: 262144; super: >>>> 0; other: 0) >>>> mmap: 5 pass took: 0.190353 (none: 0; res: 262144; super: >>>> 0; other: 0) >>>> >>>> This is what I expect. But why this doesn't work without reading file >>>> manually? >>> Issue seems to be in some change of the behaviour of the reserv or >>> phys allocator. I Cc:ed Alan. >> I'm pretty sure that the behavior here hasn't significantly changed in >> about twelve years. Otherwise, I agree with your analysis. >> >> On more than one occasion, I've been tempted to change: >> >> pmap_remove_all(mt); >> if (mt->dirty != 0) >> vm_page_deactivate(mt); >> else >> vm_page_cache(mt); >> >> to: >> >> vm_page_dontneed(mt); >> >> because I suspect that the current code does more harm than good. In >> theory, it saves activations of the page daemon. However, more often >> than not, I suspect that we are spending more on page reactivations than >> we are saving on page daemon activations. The sequential access >> detection heuristic is just too easily triggered. For example, I've >> seen it triggered by demand paging of the gcc text segment. Also, I >> think that pmap_remove_all() and especially vm_page_cache() are too >> severe for a detection heuristic that is so easily triggered. > Yes, I agree that such change shall be an improvement, and I expect > that Andrey will test it. > > On the other hand, I do think that allocator should prefer unnamed > pages to pages which still have valid content. On my 12G desktop, > I never saw more then 100MB of cached pages, and similar numbers > are observed on the 32-48GB servers. I suppose that this is related. On allocation, the system does prefer free pages over cached pages. 
When cached pages are added to the physical memory allocator, they are added to VM_FREEPOOL_CACHE. When pages are allocated, they are taken from VM_FREEPOOL_DEFAULT. Generally, pages only move from the CACHE pool to the DEFAULT pool when the DEFAULT pool is depleted. (However, occasionally, they do move because of coalescing.) When I redid the physical memory allocator, I looked at the rate of cached page reactivation under the old and the new allocators. At least for the tests that I did the rates weren't that different. It was low, single-digit percentages. I think the highest likelihood of reactivation comes from the pages that are cached by the sequential access heuristic because it is so overzealous. I don't think it's related. You see modest numbers of cached pages simply because the page daemon met its target for the sum of free and cached pages. So, it just stopped moving pages from the inactive queue into the physical memory allocator's cache/free queues. Alan From owner-freebsd-hackers@FreeBSD.ORG Thu Apr 5 18:46:01 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AA18F1065680; Thu, 5 Apr 2012 18:46:01 +0000 (UTC) (envelope-from mavbsd@gmail.com) Received: from mail-we0-f182.google.com (mail-we0-f182.google.com [74.125.82.182]) by mx1.freebsd.org (Postfix) with ESMTP id 9F8508FC17; Thu, 5 Apr 2012 18:46:00 +0000 (UTC) Received: by wern13 with SMTP id n13so1363198wer.13 for ; Thu, 05 Apr 2012 11:45:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:content-type:content-transfer-encoding; bh=d+b5ocGMfgLi5iHgcjBuONGM0pHcjjFqK6YOdy+fwXk=; b=rfliI3aIpQ9MN2qx/j+3oPf7mGyR5nCVTpqODciiEMk2SPQktZBLhl2ZqcanTqvJKF h82FzLn6NJqVWItexC+bgqEmWSGBLAQ9PrPFxllBHwU+oYS10M7lxBl88qSXIKgOQwwk 
xVo1Bd4Mx97TCTEzLWy87R1F1VDqhZnnO4Yy6PbDI8lJuzU08Lz3ouVRwfZR9mEozJ8E o2AVxTazkIvOIUj4mbWM52n+E3oQNN1HlnXvL9HLR++6l59cThG576unsPJGrqyLArxD mVZQN0+Nhz5dfUsXbOZb9EhFnwSUb6f8FFVLFR+WskchCvCYQznTjGv04mdpPc+d9aFN 1P4w== Received: by 10.216.134.27 with SMTP id r27mr2307338wei.107.1333651559469; Thu, 05 Apr 2012 11:45:59 -0700 (PDT) Received: from mavbook.mavhome.dp.ua (pc.mavhome.dp.ua. [212.86.226.226]) by mx.google.com with ESMTPS id 9sm20370826wid.2.2012.04.05.11.45.56 (version=SSLv3 cipher=OTHER); Thu, 05 Apr 2012 11:45:58 -0700 (PDT) Sender: Alexander Motin Message-ID: <4F7DE863.6080607@FreeBSD.org> Date: Thu, 05 Apr 2012 21:45:55 +0300 From: Alexander Motin User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:9.0) Gecko/20120116 Thunderbird/9.0 MIME-Version: 1.0 To: Arnaud Lacombe References: <4F2F7B7F.40508@FreeBSD.org> <4F366E8F.9060207@FreeBSD.org> <4F367965.6000602@FreeBSD.org> <4F396B24.5090602@FreeBSD.org> <4F3978BC.6090608@FreeBSD.org> <4F3990EA.1080002@FreeBSD.org> <4F3C0BB9.6050101@FreeBSD.org> <4F3E807A.60103@FreeBSD.org> <4F3E8858.4000001@FreeBSD.org> In-Reply-To: Content-Type: text/plain; charset=KOI8-R; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-hackers@freebsd.org, Florian Smeets , Jeff Roberson , Andriy Gapon , FreeBSD current Subject: Re: [RFT][patch] Scheduling for HTT and not only X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 Apr 2012 18:46:01 -0000 On 05.04.2012 21:12, Arnaud Lacombe wrote: > Hi, > > [Sorry for the delay, I got a bit sidetrack'ed...] 
> > 2012/2/17 Alexander Motin: >> On 17.02.2012 18:53, Arnaud Lacombe wrote: >>> >>> On Fri, Feb 17, 2012 at 11:29 AM, Alexander Motin wrote: >>>> >>>> On 02/15/12 21:54, Jeff Roberson wrote: >>>>> >>>>> On Wed, 15 Feb 2012, Alexander Motin wrote: >>>>>> >>>>>> I've decided to stop those cache black magic practices and focus on >>>>>> things that really exist in this world -- SMT and CPU load. I've >>>>>> dropped most of cache related things from the patch and made the rest >>>>>> of things more strict and predictable: >>>>>> http://people.freebsd.org/~mav/sched.htt34.patch >>>>> >>>>> >>>>> This looks great. I think there is value in considering the other >>>>> approach further but I would like to do this part first. It would be >>>>> nice to also add priority as a greater influence in the load balancing >>>>> as well. >>>> >>>> >>>> I haven't got good idea yet about balancing priorities, but I've >>>> rewritten >>>> balancer itself. As soon as sched_lowest() / sched_highest() are more >>>> intelligent now, they allowed to remove topology traversing from the >>>> balancer itself. That should fix double-swapping problem, allow to keep >>>> some >>>> affinity while moving threads and make balancing more fair. I did number >>>> of >>>> tests running 4, 8, 9 and 16 CPU-bound threads on 8 CPUs. With 4, 8 and >>>> 16 >>>> threads everything is stationary as it should. With 9 threads I see >>>> regular >>>> and random load move between all 8 CPUs. Measurements on 5 minutes run >>>> show >>>> deviation of only about 5 seconds. It is the same deviation as I see >>>> caused >>>> by only scheduling of 16 threads on 8 cores without any balancing needed >>>> at >>>> all. So I believe this code works as it should. 
>>>>
>>>> Here is the patch: http://people.freebsd.org/~mav/sched.htt40.patch
>>>>
>>>> I plan this to be a final patch of this series (more to come :)) and if
>>>> there will be no problems or objections, I am going to commit it (except
>>>> some debugging KTRs) in about ten days. So now it's a good time for
>>>> reviews
>>>> and testing. :)
>>>>
>>> is there a place where all the patches are available ?
>>
>>
>> All my scheduler patches are cumulative, so all you need is only the last
>> mentioned here sched.htt40.patch.
>>
> You may want to have a look to the result I collected in the
> `runs/freebsd-experiments' branch of:
>
> https://github.com/lacombar/hackbench/
>
> and compare them with vanilla FreeBSD 9.0 and -CURRENT results
> available in `runs/freebsd'. On the dual package platform, your patch
> is not a definite win.
>
>> But in some cases, especially for multi-socket systems, to let it show its
>> best, you may want to apply additional patch from avg@ to better detect CPU
>> topology:
>> https://gitorious.org/~avg/freebsd/avgbsd/commit/6bca4a2e4854ea3fc275946a023db65c483cb9dd
>>
> test I conducted specifically for this patch did not showed much improvement...

If I understand correctly, this test runs thousands of threads sending and
receiving data over pipes. It is quite likely that all CPUs will always be
busy, so load balancing is not really important in this test. What looks
good is that the more complicated new code is not slower than the old one.

While this test seems very scheduler-intensive, it may depend on many
other factors, such as syscall performance, context switching, etc. I'll
try to play more with it.
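To make the workload pattern concrete, here is a much-reduced sketch of the kind of pipe traffic described above. It assumes a single parent/child process pair rather than the many senders and receivers the benchmark runs, and it is not hackbench code; `pingpong` is a hypothetical helper written for this illustration. Each round trip needs two write/read pairs and makes each side block until the other has run, which is why the figures depend on syscall and wakeup cost as much as on load balancing.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: bounce one byte between a parent and a child
 * over a pair of pipes. Returns the number of completed round trips,
 * or -1 if the pipes or the child could not be set up.
 */
int
pingpong(int rounds)
{
    int to_child[2], to_parent[2], done, status;
    char c = 'x';
    pid_t pid;

    assert(rounds >= 0);
    if (pipe(to_child) == -1)
        return (-1);
    if (pipe(to_parent) == -1) {
        close(to_child[0]);
        close(to_child[1]);
        return (-1);
    }
    if ((pid = fork()) == -1) {
        close(to_child[0]);
        close(to_child[1]);
        close(to_parent[0]);
        close(to_parent[1]);
        return (-1);
    }
    if (pid == 0) {
        /* Child: echo every byte back until the parent closes its end. */
        close(to_child[1]);
        close(to_parent[0]);
        while (read(to_child[0], &c, 1) == 1) {
            if (write(to_parent[1], &c, 1) != 1)
                break;
        }
        _exit(0);
    }
    /* Parent: each iteration blocks until the child has replied. */
    close(to_child[0]);
    close(to_parent[1]);
    for (done = 0; done < rounds; done++) {
        if (write(to_child[1], &c, 1) != 1 ||
            read(to_parent[0], &c, 1) != 1)
            break;
    }
    close(to_child[1]); /* EOF makes the child's read() return 0 */
    close(to_parent[0]);
    if (waitpid(pid, &status, 0) == -1)
        return (-1);
    return (done);
}
```

Timing a call such as `pingpong(100000)` on an otherwise idle machine gives a rough feel for the per-round-trip cost; with all CPUs saturated by many such pairs, the balancer has little room to change the outcome.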
-- Alexander Motin From owner-freebsd-hackers@FreeBSD.ORG Thu Apr 5 19:33:51 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2017D106566B for ; Thu, 5 Apr 2012 19:33:51 +0000 (UTC) (envelope-from andrey@zonov.org) Received: from mail-bk0-f54.google.com (mail-bk0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id 91FB48FC12 for ; Thu, 5 Apr 2012 19:33:50 +0000 (UTC) Received: by bkcjc3 with SMTP id jc3so2016738bkc.13 for ; Thu, 05 Apr 2012 12:33:49 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:content-type:content-transfer-encoding :x-gm-message-state; bh=dCDYOvcYYl6FVmZwZAQ/b/9O/6HAqokN4FlSFvB2UqA=; b=hwOK6FS2k0/JKJdo8UKPK1g4oPMV+ln/Z1EDGEuI4FgGUoNUOEadsSu3mM4tmfRauw dUUXuJUK8jYewrLO0z6QyuJO/eKJHfMz6G2g8nHUu3Z8KpW1+Tg7qdxfPvCHbLtTEOja /kuGnB/bJ3wl/NCY5NqSYo+a7IhbqAUXmTPwsbseXe8upRSBZ8bSC0ReYwP8g5yADfi1 +R2AW1bheA8uQsOZbD0bOVJ1SvW4eSPRJREhaGHte53DT7LRueXO4wXZvFoMsMs6Pxrq M2NkRkKQ+cpnxq7DXegYjf/n54Mdz5ruTH6r247OvVdN5+rhJ8ITxryaQFgyr7RIFpl2 48Fw== Received: by 10.204.148.80 with SMTP id o16mr1868400bkv.3.1333654429247; Thu, 05 Apr 2012 12:33:49 -0700 (PDT) Received: from [10.254.254.77] (ppp95-165-133-149.pppoe.spdop.ru. 
[95.165.133.149]) by mx.google.com with ESMTPS id f5sm10539041bke.9.2012.04.05.12.33.48 (version=SSLv3 cipher=OTHER); Thu, 05 Apr 2012 12:33:48 -0700 (PDT) Message-ID: <4F7DF39A.3000500@zonov.org> Date: Thu, 05 Apr 2012 23:33:46 +0400 From: Andrey Zonov User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; ru; rv:1.8.1.24) Gecko/20100228 Thunderbird/2.0.0.24 Mnenhy/0.7.6.0 MIME-Version: 1.0 To: Alan Cox References: <4F7B495D.3010402@zonov.org> <20120404071746.GJ2358@deviant.kiev.zoral.com.ua> <4F7DC037.9060803@rice.edu> In-Reply-To: <4F7DC037.9060803@rice.edu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Gm-Message-State: ALoCoQktlePz3WbQ65VBhrYLSxdJu5lCfUyoiJdeaW+ZDGBUaIGuIZUpgplsUVWCIpAa1Ws0LT/f Cc: Konstantin Belousov , freebsd-hackers@freebsd.org, alc@freebsd.org Subject: Re: problems with mmap() and disk caching X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 Apr 2012 19:33:51 -0000 On 05.04.2012 19:54, Alan Cox wrote: > On 04/04/2012 02:17, Konstantin Belousov wrote: >> On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote: [snip] >>> This is what I expect. But why this doesn't work without reading file >>> manually? >> Issue seems to be in some change of the behaviour of the reserv or >> phys allocator. I Cc:ed Alan. > > I'm pretty sure that the behavior here hasn't significantly changed in > about twelve years. Otherwise, I agree with your analysis. > > On more than one occasion, I've been tempted to change: > > pmap_remove_all(mt); > if (mt->dirty != 0) > vm_page_deactivate(mt); > else > vm_page_cache(mt); > > to: > > vm_page_dontneed(mt); > Thanks Alan! Now it works as I expect! But I have more questions to you and kib@. They are in my test below. 
So, I prepare the file as before and take memory usage information
from top(1). After preparation, but before the test:
Mem: 80M Active, 55M Inact, 721M Wired, 215M Buf, 46G Free

First run:
$ ./mmap /mnt/random
mmap: 1 pass took: 7.462865 (none: 0; res: 262144; super: 0; other: 0)

No super pages after the first run. Why?

Mem: 79M Active, 1079M Inact, 722M Wired, 216M Buf, 45G Free

Now the file is in inactive memory, that's good.

Second run:
$ ./mmap /mnt/random
mmap: 1 pass took: 0.004191 (none: 0; res: 262144; super: 511; other: 0)

All super pages are here, nice.

Mem: 1103M Active, 55M Inact, 722M Wired, 216M Buf, 45G Free

Wow, all inactive pages moved to active and stay there even after the
process has terminated. That's not good; what do you think?

Read the file:
$ cat /mnt/random > /dev/null

Mem: 79M Active, 55M Inact, 1746M Wired, 1240M Buf, 45G Free

Now the file is in wired memory. I do not understand why.

Could you please explain active/inactive/wired memory to me?

> because I suspect that the current code does more harm than good. In
> theory, it saves activations of the page daemon. However, more often
> than not, I suspect that we are spending more on page reactivations than
> we are saving on page daemon activations. The sequential access
> detection heuristic is just too easily triggered. For example, I've seen
> it triggered by demand paging of the gcc text segment. Also, I think
> that pmap_remove_all() and especially vm_page_cache() are too severe for
> a detection heuristic that is so easily triggered.
> [snip] -- Andrey Zonov From owner-freebsd-hackers@FreeBSD.ORG Thu Apr 5 19:41:28 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id D14E1106566C; Thu, 5 Apr 2012 19:41:28 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id 604898FC0A; Thu, 5 Apr 2012 19:41:28 +0000 (UTC) Received: from skuns.kiev.zoral.com.ua (localhost [127.0.0.1]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id q35JfNZ7050764; Thu, 5 Apr 2012 22:41:23 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.5/8.14.5) with ESMTP id q35JfNaw028760; Thu, 5 Apr 2012 22:41:23 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.5/8.14.5/Submit) id q35JfMIX028759; Thu, 5 Apr 2012 22:41:22 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 5 Apr 2012 22:41:22 +0300 From: Konstantin Belousov To: Andrey Zonov Message-ID: <20120405194122.GC2358@deviant.kiev.zoral.com.ua> References: <4F7B495D.3010402@zonov.org> <20120404071746.GJ2358@deviant.kiev.zoral.com.ua> <4F7DC037.9060803@rice.edu> <4F7DF39A.3000500@zonov.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="apIra6NYt7mqlw41" Content-Disposition: inline In-Reply-To: <4F7DF39A.3000500@zonov.org> User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-4.0 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: 
alc@freebsd.org, freebsd-hackers@freebsd.org, Alan Cox Subject: Re: problems with mmap() and disk caching X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 Apr 2012 19:41:28 -0000 --apIra6NYt7mqlw41 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Thu, Apr 05, 2012 at 11:33:46PM +0400, Andrey Zonov wrote:
> On 05.04.2012 19:54, Alan Cox wrote:
> >On 04/04/2012 02:17, Konstantin Belousov wrote:
> >>On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote:
> [snip]
> >>>This is what I expect. But why this doesn't work without reading file
> >>>manually?
> >>Issue seems to be in some change of the behaviour of the reserv or
> >>phys allocator. I Cc:ed Alan.
> >
> >I'm pretty sure that the behavior here hasn't significantly changed in
> >about twelve years. Otherwise, I agree with your analysis.
> >
> >On more than one occasion, I've been tempted to change:
> >
> >    pmap_remove_all(mt);
> >    if (mt->dirty != 0)
> >        vm_page_deactivate(mt);
> >    else
> >        vm_page_cache(mt);
> >
> >to:
> >
> >    vm_page_dontneed(mt);
> >
>
> Thanks Alan! Now it works as I expect!
>
> But I have more questions to you and kib@. They are in my test below.
>
> So, prepare file as earlier, and take information about memory usage
> from top(1). After preparation, but before test:
> Mem: 80M Active, 55M Inact, 721M Wired, 215M Buf, 46G Free
>
> First run:
> $ ./mmap /mnt/random
> mmap: 1 pass took: 7.462865 (none: 0; res: 262144; super:
> 0; other: 0)
>
> No super pages after first run, why?..
>
> Mem: 79M Active, 1079M Inact, 722M Wired, 216M Buf, 45G Free
>
> Now the file is in inactive memory, that's good.
>
> Second run:
> $ ./mmap /mnt/random
> mmap: 1 pass took: 0.004191 (none: 0; res: 262144; super:
> 511; other: 0)
>
> All super pages are here, nice.
>
> Mem: 1103M Active, 55M Inact, 722M Wired, 216M Buf, 45G Free
>
> Wow, all inactive pages moved to active and sit there even after process
> was terminated, that's not good, what do you think?
Why do you think this is 'not good'? You have plenty of free memory,
there is no memory pressure, and all pages were referenced recently.
There is no reason for them to be deactivated.

>
> Read the file:
> $ cat /mnt/random > /dev/null
>
> Mem: 79M Active, 55M Inact, 1746M Wired, 1240M Buf, 45G Free
>
> Now the file is in wired memory. I do not understand why so.
You do use UFS, right? There are enough buffer headers and enough buffer
KVA to have buffers allocated for the whole file content. Since buffers
wire the corresponding pages, the pages migrate to wired.

When buffer pressure appears (i.e., any other I/O is started), the
buffers will be repurposed and the pages moved to inactive.

>
> Could you please give me explanation about active/inactive/wired memory?
>
>
> >because I suspect that the current code does more harm than good. In
> >theory, it saves activations of the page daemon. However, more often
> >than not, I suspect that we are spending more on page reactivations than
> >we are saving on page daemon activations. The sequential access
> >detection heuristic is just too easily triggered. For example, I've seen
> >it triggered by demand paging of the gcc text segment. Also, I think
> >that pmap_remove_all() and especially vm_page_cache() are too severe for
> >a detection heuristic that is so easily triggered.
> > > [snip] >=20 > --=20 > Andrey Zonov --apIra6NYt7mqlw41 Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (FreeBSD) iEYEARECAAYFAk999WIACgkQC3+MBN1Mb4juYQCgwyIjYQ6mXO30O0w9ktiSiq5p sY0An03Qp2+CcUYdcQ1lHxIbzZ5VCyNG =ehce -----END PGP SIGNATURE----- --apIra6NYt7mqlw41-- From owner-freebsd-hackers@FreeBSD.ORG Thu Apr 5 19:54:58 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 3B5F31065673 for ; Thu, 5 Apr 2012 19:54:58 +0000 (UTC) (envelope-from andrey@zonov.org) Received: from mail-bk0-f54.google.com (mail-bk0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id 6236B8FC15 for ; Thu, 5 Apr 2012 19:54:57 +0000 (UTC) Received: by bkcjc3 with SMTP id jc3so2033303bkc.13 for ; Thu, 05 Apr 2012 12:54:55 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:content-type:content-transfer-encoding :x-gm-message-state; bh=TEXOU1FZ34wyB0KMqZrFn+cbWhYE56qvzE1GnQlcYwI=; b=JAJ9c20dRBoz2fIuYa/gWdEwlHM+UlHUp/eXFUMwaVC9RoNpS4DhuSUYe4ZMCpxUjS qvBn9nG5C2XFc6B4xIurtL6CWHa68BHG5GU280MDnr/HfU1Aq2QD+nSpF2H4DE5BIF7T ntyB7gmo8tlHZ7LO5CZR9KlwW2xERPrqRq7GYokhRQzZQjoRpMGSVvluDLXulHYYZQq+ MJYZMmH347mqYhDxDJ/jbed6ABaViS6aAgRdj9YrWLo4kEHoocOYQfrrJsWLhjbmMS5x s1rKhz/i0LAF1dToy86SsP2bPjzP2sI7k/+5bZeL4WZ4kDhC7kTagWahY0hJHIYTMVp4 Qj7A== Received: by 10.204.157.134 with SMTP id b6mr1936147bkx.88.1333655695656; Thu, 05 Apr 2012 12:54:55 -0700 (PDT) Received: from [10.254.254.77] (ppp95-165-133-149.pppoe.spdop.ru. 
[95.165.133.149]) by mx.google.com with ESMTPS id c4sm10638247bkh.0.2012.04.05.12.54.54 (version=SSLv3 cipher=OTHER); Thu, 05 Apr 2012 12:54:55 -0700 (PDT) Message-ID: <4F7DF88D.2050907@zonov.org> Date: Thu, 05 Apr 2012 23:54:53 +0400 From: Andrey Zonov User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; ru; rv:1.8.1.24) Gecko/20100228 Thunderbird/2.0.0.24 Mnenhy/0.7.6.0 MIME-Version: 1.0 To: Konstantin Belousov References: <4F7B495D.3010402@zonov.org> <20120404071746.GJ2358@deviant.kiev.zoral.com.ua> <4F7DC037.9060803@rice.edu> <4F7DF39A.3000500@zonov.org> <20120405194122.GC2358@deviant.kiev.zoral.com.ua> In-Reply-To: <20120405194122.GC2358@deviant.kiev.zoral.com.ua> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Gm-Message-State: ALoCoQk+8ws0RelVLrSJpCgqEjox2mq3+6jVJwfUMqOk+FfXzm0WotT5mMzl3ZwEgAwwmeFWQRn5 Cc: alc@freebsd.org, freebsd-hackers@freebsd.org, Alan Cox Subject: Re: problems with mmap() and disk caching X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 Apr 2012 19:54:58 -0000 On 05.04.2012 23:41, Konstantin Belousov wrote: > On Thu, Apr 05, 2012 at 11:33:46PM +0400, Andrey Zonov wrote: >> On 05.04.2012 19:54, Alan Cox wrote: >>> On 04/04/2012 02:17, Konstantin Belousov wrote: >>>> On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote: >> [snip] >>>>> This is what I expect. But why this doesn't work without reading file >>>>> manually? >>>> Issue seems to be in some change of the behaviour of the reserv or >>>> phys allocator. I Cc:ed Alan. >>> >>> I'm pretty sure that the behavior here hasn't significantly changed in >>> about twelve years. Otherwise, I agree with your analysis. 
>>> >>> On more than one occasion, I've been tempted to change: >>> >>> pmap_remove_all(mt); >>> if (mt->dirty != 0) >>> vm_page_deactivate(mt); >>> else >>> vm_page_cache(mt); >>> >>> to: >>> >>> vm_page_dontneed(mt); >>> >> >> Thanks Alan! Now it works as I expect! >> >> But I have more questions to you and kib@. They are in my test below. >> >> So, prepare file as earlier, and take information about memory usage >> from top(1). After preparation, but before test: >> Mem: 80M Active, 55M Inact, 721M Wired, 215M Buf, 46G Free >> >> First run: >> $ ./mmap /mnt/random >> mmap: 1 pass took: 7.462865 (none: 0; res: 262144; super: >> 0; other: 0) >> >> No super pages after first run, why?.. >> >> Mem: 79M Active, 1079M Inact, 722M Wired, 216M Buf, 45G Free >> >> Now the file is in inactive memory, that's good. >> >> Second run: >> $ ./mmap /mnt/random >> mmap: 1 pass took: 0.004191 (none: 0; res: 262144; super: >> 511; other: 0) >> >> All super pages are here, nice. >> >> Mem: 1103M Active, 55M Inact, 722M Wired, 216M Buf, 45G Free >> >> Wow, all inactive pages moved to active and sit there even after process >> was terminated, that's not good, what do you think? > Why do you think this is 'not good' ? You have plenty of free memory, > there is no memory pressure, and all pages were referenced recently. > THere is no reason for them to be deactivated. > I always thought that active memory this is a sum of resident memory of all processes, inactive shows disk cache and wired shows kernel itself. >> >> Read the file: >> $ cat /mnt/random> /dev/null >> >> Mem: 79M Active, 55M Inact, 1746M Wired, 1240M Buf, 45G Free >> >> Now the file is in wired memory. I do not understand why so. > You do use UFS, right ? Yes. > There is enough buffer headers and buffer KVA > to have buffers allocated for the whole file content. Since buffers wire > corresponding pages, you get pages migrated to wired. 
> > When there appears a buffer pressure (i.e., any other i/o started), > the buffers will be repurposed and pages moved to inactive. > OK, how can I get amount of disk cache? >> >> Could you please give me explanation about active/inactive/wired memory? >> >> >>> because I suspect that the current code does more harm than good. In >>> theory, it saves activations of the page daemon. However, more often >>> than not, I suspect that we are spending more on page reactivations than >>> we are saving on page daemon activations. The sequential access >>> detection heuristic is just too easily triggered. For example, I've seen >>> it triggered by demand paging of the gcc text segment. Also, I think >>> that pmap_remove_all() and especially vm_page_cache() are too severe for >>> a detection heuristic that is so easily triggered. >>> >> [snip] >> >> -- >> Andrey Zonov -- Andrey Zonov From owner-freebsd-hackers@FreeBSD.ORG Thu Apr 5 22:13:53 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 26D3C106564A for ; Thu, 5 Apr 2012 22:13:53 +0000 (UTC) (envelope-from kotasaikrishna28@gmail.com) Received: from mail-iy0-f182.google.com (mail-iy0-f182.google.com [209.85.210.182]) by mx1.freebsd.org (Postfix) with ESMTP id D75E28FC0C for ; Thu, 5 Apr 2012 22:13:52 +0000 (UTC) Received: by iahk25 with SMTP id k25so3144788iah.13 for ; Thu, 05 Apr 2012 15:13:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=fLg68Cbx6Df1lhsck3ovtmOYKhJVoYEE2enDD7Zavgw=; b=vnvHnqAxo2BEPoaWqruGyHE77qtvlFbOp10DWdyuJfldHP3EFH91iRrAJOgsru56OP YaiMBm6rwI2AmAl2YlBFLdrlOQ/m5zj02qr6aUvA82U3ueGG7XhmTQSu9sfY7S6D9Ozn xApW9Kg9FVkcwOK9IaBGrVaYpMq2t8QxQPshoEL54JCizTyyT6bLorxY71etVIKZ6VkP fzrPoeUFI8YPu1QzHAjbhKQgecWdyPI7xhhjf4YuDedE92yjP5In7OEdw5/XfdpuX1TZ 
SE9+EZUJs6hvbAazOAxSsD5SZQC+pXiBHGDavWZNxqFeu0yNvaVwy0VZwQFKuq9+V965 x/VQ== MIME-Version: 1.0 Received: by 10.50.153.198 with SMTP id vi6mr3597534igb.0.1333664031723; Thu, 05 Apr 2012 15:13:51 -0700 (PDT) Received: by 10.42.132.197 with HTTP; Thu, 5 Apr 2012 15:13:51 -0700 (PDT) Date: Fri, 6 Apr 2012 03:43:51 +0530 Message-ID: From: kota saikrishna To: freebsd-hackers@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Subject: Making addresses from the address specified and given size in the child's address space and make it valid. X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 Apr 2012 22:13:53 -0000

Hello,

I am trying to inject code into a child process using ptrace. The injected
code should take a specified start address and a size in the child's address
space and make that range valid (i.e., readable and writable). As far as I
know, mmap is the system call to use for this, but I have not been able to
allocate the memory with it. Can anyone help me with how to achieve this?
From owner-freebsd-hackers@FreeBSD.ORG Fri Apr 6 01:08:31 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4BBC2106566C for ; Fri, 6 Apr 2012 01:08:31 +0000 (UTC) (envelope-from sushanth_rai@yahoo.com) Received: from nm14.bullet.mail.sp2.yahoo.com (nm14.bullet.mail.sp2.yahoo.com [98.139.91.84]) by mx1.freebsd.org (Postfix) with SMTP id 201428FC15 for ; Fri, 6 Apr 2012 01:08:31 +0000 (UTC) Received: from [98.139.91.69] by nm14.bullet.mail.sp2.yahoo.com with NNFMP; 06 Apr 2012 01:08:25 -0000 Received: from [98.139.44.66] by tm9.bullet.mail.sp2.yahoo.com with NNFMP; 06 Apr 2012 01:08:25 -0000 Received: from [127.0.0.1] by omp1003.access.mail.sp2.yahoo.com with NNFMP; 06 Apr 2012 01:08:25 -0000 X-Yahoo-Newman-Property: ymail-3 X-Yahoo-Newman-Id: 333431.3713.bm@omp1003.access.mail.sp2.yahoo.com Received: (qmail 6175 invoked by uid 60001); 6 Apr 2012 01:08:24 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1333674504; bh=ZY/jupau+wmGIoHnSpQZnDVpCcfM2clnbdlwYmmJJis=; h=X-YMail-OSG:Received:X-Mailer:Message-ID:Date:From:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type:Content-Transfer-Encoding; b=It6eb1GMG5wfvvBFsRPO3Ta9d0uHXu1cLu/1O1lCYcynlo+lDPo2Yn2CUJMJ+LF8EBeskot4ufpBy4WWES3fv+Z3PFuqeK/1qWDogKhiKDbWldKIfvzErm7eKLgd07aclzyJbI5oIjNBVaDBziH80/FyTRg6eBTajOMg1w3jgzY= DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=X-YMail-OSG:Received:X-Mailer:Message-ID:Date:From:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type:Content-Transfer-Encoding; b=vnAbMRTBN3PRiO7Q/oW+05CHH3T2h5R0Qamem+9Emz5aXF80cdPj2XJxZ+srVE9/ggkbmiapDuGPvsU1LoP7P2xcGX+b6280a3hEyr2U0qLGKI5+7FkYCdWHI5y5wHRmMW4lQFlPyrlt66jIZSjlbPljTUM8n3EgAatv2aSv0Pw=; X-YMail-OSG: NxGL5jIVM1lJjoAHdiXfgj0g8jpbBhDeZrfFYD42EBCO63r sOFstUBAnQPoTIpxf7cPerBx_dVLxi8oZG6gL1YugXudAGxTakrazJiTCgKf 
kgrX93Y_.ARSSMsAqVcj_mv0BSVp4.uvx16UV3lBdixDGgK.Cvj.GLkpZgi. IbCTpyftQeLMjdNn9G2xuMmU1YKcZ7DXGDtrQg6bpHr.mRtInrq.reHAxnb6 pv3ZxtfSnHiFKetK.7ouy7El2GON5lQOEwthzWAjqmWZfSTd9cQgq.tlXNuM 9Q8CHeSViQATaIViE0lRbluZELOknG9oL1tIL3jrpe47oWyaIL9L_hTlK6W. ffCk_LMrNDiXZf7VufsLwXFBRtZ6JaTy9.NpH0azOy0jhoZoxT68uHv6Hdis MgQlEBxGrbXkwHqMtijHiIVBlHeLxefYl3FOtWqDb6E0NcrOp75nYG.SWD.f DTqW3 Received: from [75.62.238.5] by web180016.mail.gq1.yahoo.com via HTTP; Thu, 05 Apr 2012 18:08:24 PDT X-Mailer: YahooMailClassic/15.0.5 YahooMailWebService/0.8.117.340979 Message-ID: <1333674504.97862.YahooMailClassic@web180016.mail.gq1.yahoo.com> Date: Thu, 5 Apr 2012 18:08:24 -0700 (PDT) From: Sushanth Rai To: John Baldwin In-Reply-To: <201204051201.58651.jhb@freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-hackers@freebsd.org Subject: Re: Startvation of realtime piority threads X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Apr 2012 01:08:31 -0000

I understand the downside of badly written realtime app. In my case
application runs in userspace without making much syscalls and by all means
it is a well behaved application. Yes, I can wire memory, change the
application to use mutex instead of spinlock and those changes should help
but they are still working around the problem. I still believe kernel should
not lower the realtime priority when blocking on resources. This can lead to
priority inversion, especially since these threads run at fixed priorities
and kernel doesn't muck with them.

As you suggested _sleep() should not adjust the priorities for realtime
threads.

Thanks,
Sushanth

--- On Thu, 4/5/12, John Baldwin wrote:

> From: John Baldwin
> Subject: Re: Startvation of realtime piority threads
> To: freebsd-hackers@freebsd.org, davidxu@freebsd.org
> Date: Thursday, April 5, 2012, 9:01 AM
> On Thursday, April 05, 2012 1:07:55 am David Xu wrote:
> > On 2012/4/5 11:56, Konstantin Belousov wrote:
> > > On Wed, Apr 04, 2012 at 06:54:06PM -0700, Sushanth Rai wrote:
> > >> I have a multithreaded user space program that basically runs at
> > >> realtime priority. Synchronization between threads are done using
> > >> spinlock. When running this program on a SMP system under heavy
> > >> memory pressure I see that thread holding the spinlock is starved
> > >> out of cpu. The cpus are effectively consumed by other threads that
> > >> are spinning for lock to become available.
> > >>
> > >> After instrumenting the kernel a little bit what I found was that
> > >> under memory pressure, when the user thread holding the spinlock
> > >> traps into the kernel due to page fault, that thread sleeps until
> > >> the free pages are available. The thread sleeps PUSER priority
> > >> (within vm_waitpfault()). When it is ready to run, it is queued at
> > >> PUSER priority even thought it's base priority is realtime. The
> > >> other siblings threads that are spinning at realtime priority to
> > >> acquire the spinlock starves the owner of spinlock.
> > >>
> > >> I was wondering if the sleep in vm_waitpfault() should be a
> > >> MAX(td_user_pri, PUSER) instead of just PUSER. I'm running on 7.2
> > >> and it looks like this logic is the same in the trunk.
> > > It just so happen that your program stumbles upon a single sleep
> > > point in the kernel. If for whatever reason the thread in kernel is
> > > put off CPU due to failure to acquire any resource without priority
> > > propagation, you would get the same effect. Only blockable
> > > primitives do priority propagation, that are mutexes and rwlocks,
> > > AFAIR. In other words, any sx/lockmgr/sleep points are vulnerable to
> > > the same issue.
> > This is why I suggested that POSIX realtime priority should not be
> > boosted, it should be only higher than PRI_MIN_TIMESHARE but lower
> > than any priority all msleep() callers provided. The problem is
> > userland realtime thread's busy looping code can cause starvation a
> > thread in kernel which holding a critical resource. In kernel we can
> > avoid to write dead-loop code, but userland code is not trustable.
>
> Note that you have to be root to be rtprio, and that there is trustable
> userland code (just because you haven't used any doesn't mean it doesn't
> exist).
>
> > If you search "Realtime thread priorities" in 2010-december within
> > @arch list. you may find the argument.
>
> I think the bug here is that sched_sleep() should not lower the priority
> of an rtprio process. It should arguably not raise the priority of an
> idprio process either, but sched_sleep() should probably only apply to
> timesharing threads.
>
> All that said, userland rtprio code is going to have to be careful. It
> should be using things like wired memory as Kostik suggested, and
> probably avoiding most system calls. You can definitely blow your foot
> off quite easily in lots of ways with rtprio.
>
> --
> John Baldwin
> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
>

From owner-freebsd-hackers@FreeBSD.ORG Fri Apr 6 06:45:12 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id C43F81065673 for ; Fri, 6 Apr 2012 06:45:12 +0000 (UTC) (envelope-from andrey@zonov.org) Received: from mail-bk0-f54.google.com (mail-bk0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id 429A18FC12 for ; Fri, 6 Apr 2012 06:45:12 +0000 (UTC) Received: by bkcjc3 with SMTP id jc3so2359005bkc.13 for ; Thu, 05 Apr 2012 23:45:11 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:content-type:content-transfer-encoding :x-gm-message-state; bh=hKAOf0L7jBgNIW3eMfAI8vV/JjjACR7NytZWap13Quc=; b=V4+rWInOkKFTxOaKSe/eNIlQucd4q9EgDTQzfyAGZ3g3quCqfTsg4PjZBmZHrti1/u z8ExgVOObnanbjE0VhR2BntQLamyfjafX5A6E2WxI/KdxAcdm+DhZHE2zs6FitPY4WER UoezpJHieDAHrc/Ybz5Bgm5cM8NfoJzvXUU+aeMY9CEsQPniwK5rg6IM28G88G5sUrg8 SHEiP6MmwAch3EXQpjjQzNjXQOXI5sWFZlnmRoQYdOJIYbf7rJYl5lMo4xqPIr1LXiia E6RT9ysT6CSsEaTTIi0al59HYrlAJIz+/gq6EO0L8pkerAhhMjG+/6rkBArtDIgrvsHk poTQ== Received: by 10.204.156.216 with SMTP id y24mr2596782bkw.60.1333694711180; Thu, 05 Apr 2012 23:45:11 -0700 (PDT) Received: from [10.254.254.77] (ppp95-165-133-149.pppoe.spdop.ru.
[95.165.133.149]) by mx.google.com with ESMTPS id iv11sm8603238bkc.16.2012.04.05.23.45.10 (version=SSLv3 cipher=OTHER); Thu, 05 Apr 2012 23:45:10 -0700 (PDT) Message-ID: <4F7E90F4.9050107@zonov.org> Date: Fri, 06 Apr 2012 10:45:08 +0400 From: Andrey Zonov User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; ru; rv:1.8.1.24) Gecko/20100228 Thunderbird/2.0.0.24 Mnenhy/0.7.6.0 MIME-Version: 1.0 To: Konstantin Belousov References: <4F7B495D.3010402@zonov.org> <20120404071746.GJ2358@deviant.kiev.zoral.com.ua> <4F7DC037.9060803@rice.edu> <4F7DF39A.3000500@zonov.org> <20120405194122.GC2358@deviant.kiev.zoral.com.ua> <4F7DF88D.2050907@zonov.org> In-Reply-To: <4F7DF88D.2050907@zonov.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Gm-Message-State: ALoCoQkdwYdddZrerGoM1+3NLqiJO7BSaMOSClKGRRw0IAmJdGhZKRfb3Lz8wUEJyDU0HjA2Efdc Cc: alc@freebsd.org, freebsd-hackers@freebsd.org, Alan Cox Subject: Re: problems with mmap() and disk caching X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Apr 2012 06:45:12 -0000

On 05.04.2012 23:54, Andrey Zonov wrote:
> On 05.04.2012 23:41, Konstantin Belousov wrote:
>> You do use UFS, right ?
>
> Yes.
>

I've run the test on ZFS.

Mem: 2645M Active, 363M Inact, 2042M Wired, 1406M Buf, 42G Free

$ ./mmap /mnt/random

Mem: 3669M Active, 363M Inact, 3067M Wired, 1406M Buf, 40G Free

It eats 2 GB, as I understand it: Active and Wired each grew by about 1 GB.

# umount /mnt
# zfs mount -a

Mem: 2645M Active, 363M Inact, 2042M Wired, 1406M Buf, 42G Free

$ cat /mnt/random > /dev/null

Mem: 2645M Active, 363M Inact, 3067M Wired, 1406M Buf, 41G Free

That's correct: 1 GB, and only Wired grew.

About "Buf" memory: is it reasonable to set it to 10% of physical memory?
By default I lose 10 GB on machines with 96 GB.
-- Andrey Zonov From owner-freebsd-hackers@FreeBSD.ORG Fri Apr 6 08:13:54 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id B773B1065687; Fri, 6 Apr 2012 08:13:54 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id 446A58FC08; Fri, 6 Apr 2012 08:13:54 +0000 (UTC) Received: from skuns.kiev.zoral.com.ua (localhost [127.0.0.1]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id q368DnFk010533; Fri, 6 Apr 2012 11:13:49 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.5/8.14.5) with ESMTP id q368Dn5n058714; Fri, 6 Apr 2012 11:13:49 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.5/8.14.5/Submit) id q368DnM8058713; Fri, 6 Apr 2012 11:13:49 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Fri, 6 Apr 2012 11:13:49 +0300 From: Konstantin Belousov To: Andrey Zonov Message-ID: <20120406081349.GE2358@deviant.kiev.zoral.com.ua> References: <4F7B495D.3010402@zonov.org> <20120404071746.GJ2358@deviant.kiev.zoral.com.ua> <4F7DC037.9060803@rice.edu> <4F7DF39A.3000500@zonov.org> <20120405194122.GC2358@deviant.kiev.zoral.com.ua> <4F7DF88D.2050907@zonov.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="WO8U4WFuZV7m254k" Content-Disposition: inline In-Reply-To: <4F7DF88D.2050907@zonov.org> User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-4.0 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.2.5 
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: alc@freebsd.org, freebsd-hackers@freebsd.org, Alan Cox Subject: Re: problems with mmap() and disk caching X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Apr 2012 08:13:54 -0000 --WO8U4WFuZV7m254k Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Apr 05, 2012 at 11:54:53PM +0400, Andrey Zonov wrote: > On 05.04.2012 23:41, Konstantin Belousov wrote: > >On Thu, Apr 05, 2012 at 11:33:46PM +0400, Andrey Zonov wrote: > >>On 05.04.2012 19:54, Alan Cox wrote: > >>>On 04/04/2012 02:17, Konstantin Belousov wrote: > >>>>On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote: > >>[snip] > >>>>>This is what I expect. But why this doesn't work without reading file > >>>>>manually? > >>>>Issue seems to be in some change of the behaviour of the reserv or > >>>>phys allocator. I Cc:ed Alan. > >>> > >>>I'm pretty sure that the behavior here hasn't significantly changed in > >>>about twelve years. Otherwise, I agree with your analysis. > >>> > >>>On more than one occasion, I've been tempted to change: > >>> > >>>pmap_remove_all(mt); > >>>if (mt->dirty !=3D 0) > >>>vm_page_deactivate(mt); > >>>else > >>>vm_page_cache(mt); > >>> > >>>to: > >>> > >>>vm_page_dontneed(mt); > >>> > >> > >>Thanks Alan! Now it works as I expect! > >> > >>But I have more questions to you and kib@. They are in my test below. > >> > >>So, prepare file as earlier, and take information about memory usage > >>from top(1). 
After preparation, but before test:
> >>Mem: 80M Active, 55M Inact, 721M Wired, 215M Buf, 46G Free
> >>
> >>First run:
> >>$ ./mmap /mnt/random
> >>mmap: 1 pass took: 7.462865 (none: 0; res: 262144; super: 0; other: 0)
> >>
> >>No super pages after first run, why?..
> >>
> >>Mem: 79M Active, 1079M Inact, 722M Wired, 216M Buf, 45G Free
> >>
> >>Now the file is in inactive memory, that's good.
> >>
> >>Second run:
> >>$ ./mmap /mnt/random
> >>mmap: 1 pass took: 0.004191 (none: 0; res: 262144; super: 511; other: 0)
> >>
> >>All super pages are here, nice.
> >>
> >>Mem: 1103M Active, 55M Inact, 722M Wired, 216M Buf, 45G Free
> >>
> >>Wow, all inactive pages moved to active and sit there even after process
> >>was terminated, that's not good, what do you think?
> >Why do you think this is 'not good' ? You have plenty of free memory,
> >there is no memory pressure, and all pages were referenced recently.
> >THere is no reason for them to be deactivated.
> >
>
> I always thought that active memory this is a sum of resident memory of
> all processes, inactive shows disk cache and wired shows kernel itself.
So you are wrong. Both active and inactive memory can be mapped or not
mapped, and both can belong to vnode objects or to anonymous objects. The
active/inactive distinction reflects only the number of references noted
by the pagedaemon, or some other page history, such as the way the page
was unwired. Wired does not necessarily mean kernel-used pages; user
processes can wire their pages as well.

>
> >>
> >>Read the file:
> >>$ cat /mnt/random > /dev/null
> >>
> >>Mem: 79M Active, 55M Inact, 1746M Wired, 1240M Buf, 45G Free
> >>
> >>Now the file is in wired memory. I do not understand why so.
> >You do use UFS, right ?
>
> Yes.
>
> >There is enough buffer headers and buffer KVA
> >to have buffers allocated for the whole file content. Since buffers wire
> >corresponding pages, you get pages migrated to wired.
> >
> >When there appears a buffer pressure (i.e., any other i/o started),
> >the buffers will be repurposed and pages moved to inactive.
> >
>
> OK, how can I get amount of disk cache?
You cannot. At least, I am not aware of any counter that keeps track of
the resident pages belonging to the vnode pager.

Buffers should not be thought of as the disk cache; pages cache the disk
content. Instead, VMIO buffers only provide a bread()/bwrite()-compatible
interface to the page cache (*) for filesystems.

(*) The term "cache" is used here in the generic sense, not to be confused
with the cached pages counter from top etc.

>
> >>
> >>Could you please give me explanation about active/inactive/wired memory?
> >>
> >>
> >>>because I suspect that the current code does more harm than good. In
> >>>theory, it saves activations of the page daemon. However, more often
> >>>than not, I suspect that we are spending more on page reactivations than
> >>>we are saving on page daemon activations. The sequential access
> >>>detection heuristic is just too easily triggered. For example, I've seen
> >>>it triggered by demand paging of the gcc text segment. Also, I think
> >>>that pmap_remove_all() and especially vm_page_cache() are too severe for
> >>>a detection heuristic that is so easily triggered.
> >>> > >>[snip] > >> > >>-- > >>Andrey Zonov >=20 > --=20 > Andrey Zonov --WO8U4WFuZV7m254k Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (FreeBSD) iEYEARECAAYFAk9+pb0ACgkQC3+MBN1Mb4h1YQCg3KtPSZj31wnDH8WvahyI7xZ+ tg0AoO3p6lFlW8r6OSewR1gwYco7p7le =SUI1 -----END PGP SIGNATURE----- --WO8U4WFuZV7m254k-- From owner-freebsd-hackers@FreeBSD.ORG Fri Apr 6 08:39:03 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B81DE1065672; Fri, 6 Apr 2012 08:39:03 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id 13BCD8FC08; Fri, 6 Apr 2012 08:39:02 +0000 (UTC) Received: from skuns.kiev.zoral.com.ua (localhost [127.0.0.1]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id q368cwfp012655; Fri, 6 Apr 2012 11:38:58 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.5/8.14.5) with ESMTP id q368cwEn058974; Fri, 6 Apr 2012 11:38:58 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.5/8.14.5/Submit) id q368cwW7058973; Fri, 6 Apr 2012 11:38:58 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Fri, 6 Apr 2012 11:38:58 +0300 From: Konstantin Belousov To: Alan Cox Message-ID: <20120406083858.GG2358@deviant.kiev.zoral.com.ua> References: <4F7B495D.3010402@zonov.org> <20120404071746.GJ2358@deviant.kiev.zoral.com.ua> <4F7DC037.9060803@rice.edu> <20120405173138.GX2358@deviant.kiev.zoral.com.ua> <4F7DE3AD.5080401@rice.edu> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; 
protocol="application/pgp-signature"; boundary="iX3VwOUIQMdbvojH" Content-Disposition: inline In-Reply-To: <4F7DE3AD.5080401@rice.edu> User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-4.0 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: alc@freebsd.org, freebsd-hackers@freebsd.org, Andrey Zonov Subject: Re: problems with mmap() and disk caching X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Apr 2012 08:39:03 -0000 --iX3VwOUIQMdbvojH Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Apr 05, 2012 at 01:25:49PM -0500, Alan Cox wrote: > On 04/05/2012 12:31, Konstantin Belousov wrote: > >On Thu, Apr 05, 2012 at 10:54:31AM -0500, Alan Cox wrote: > >>On 04/04/2012 02:17, Konstantin Belousov wrote: > >>>On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote: > >>>>Hi, > >>>> > >>>>I open the file, then call mmap() on the whole file and get pointer, > >>>>then I work with this pointer. I expect that page should be only once > >>>>touched to get it into the memory (disk cache?), but this doesn't wor= k! 
> >>>>
> >>>>I wrote the test (attached) and ran it for the 1G file generated from
> >>>>/dev/random, the result is the following:
> >>>>
> >>>>Prepare file:
> >>>># swapoff -a
> >>>># newfs /dev/ada0b
> >>>># mount /dev/ada0b /mnt
> >>>># dd if=/dev/random of=/mnt/random-1024 bs=1m count=1024
> >>>>
> >>>>Purge cache:
> >>>># umount /mnt
> >>>># mount /dev/ada0b /mnt
> >>>>
> >>>>Run test:
> >>>>$ ./mmap /mnt/random-1024 30
> >>>>mmap: 1 pass took: 7.431046 (none: 262112; res: 32; super: 0; other: 0)
> >>>>mmap: 2 pass took: 7.356670 (none: 261648; res: 496; super: 0; other: 0)
> >>>>mmap: 3 pass took: 7.307094 (none: 260521; res: 1623; super: 0; other: 0)
> >>>>mmap: 4 pass took: 7.350239 (none: 258904; res: 3240; super: 0; other: 0)
> >>>>mmap: 5 pass took: 7.392480 (none: 257286; res: 4858; super: 0; other: 0)
> >>>>mmap: 6 pass took: 7.292069 (none: 255584; res: 6560; super: 0; other: 0)
> >>>>mmap: 7 pass took: 7.048980 (none: 251142; res: 11002; super: 0; other: 0)
> >>>>mmap: 8 pass took: 6.899387 (none: 247584; res: 14560; super: 0; other: 0)
> >>>>mmap: 9 pass took: 7.190579 (none: 242992; res: 19152; super: 0; other: 0)
> >>>>mmap: 10 pass took: 6.915482 (none: 239308; res: 22836; super: 0; other: 0)
> >>>>mmap: 11 pass took: 6.565909 (none: 232835; res: 29309; super: 0; other: 0)
> >>>>mmap: 12 pass took: 6.423945 (none: 226160; res: 35984; super: 0; other: 0)
> >>>>mmap: 13 pass took: 6.315385 (none: 208555; res: 53589; super: 0; other: 0)
> >>>>mmap: 14 pass took: 6.760780 (none: 192805; res: 69339; super: 0; other: 0)
> >>>>mmap: 15 pass took: 5.721513 (none: 174497; res: 87647; super: 0; other: 0)
> >>>>mmap: 16 pass took: 5.004424 (none: 155938; res: 106206; super: 0; other: 0)
> >>>>mmap: 17 pass took: 4.224926 (none: 135639; res: 126505; super: 0; other: 0)
> >>>>mmap: 18 pass took: 3.749608 (none: 117952; res: 144192; super: 0; other: 0)
> >>>>mmap: 19 pass took: 3.398084 (none: 99066; res: 163078; super: 0; other: 0)
> >>>>mmap: 20 pass took: 3.029557 (none: 74994; res: 187150; super: 0; other: 0)
> >>>>mmap: 21 pass took: 2.379430 (none: 55231; res: 206913; super: 0; other: 0)
> >>>>mmap: 22 pass took: 2.046521 (none: 40786; res: 221358; super: 0; other: 0)
> >>>>mmap: 23 pass took: 1.152797 (none: 30311; res: 231833; super: 0; other: 0)
> >>>>mmap: 24 pass took: 0.972617 (none: 16196; res: 245948; super: 0; other: 0)
> >>>>mmap: 25 pass took: 0.577515 (none: 8286; res: 253858; super: 0; other: 0)
> >>>>mmap: 26 pass took: 0.380738 (none: 3712; res: 258432; super: 0; other: 0)
> >>>>mmap: 27 pass took: 0.253583 (none: 1193; res: 260951; super: 0; other: 0)
> >>>>mmap: 28 pass took: 0.157508 (none: 0; res: 262144; super: 0; other: 0)
> >>>>mmap: 29 pass took: 0.156169 (none: 0; res: 262144; super: 0; other: 0)
> >>>>mmap: 30 pass took: 0.156550 (none: 0; res: 262144; super: 0; other: 0)
> >>>>
> >>>>If I ran this:
> >>>>$ cat /mnt/random-1024 > /dev/null
> >>>>before test, when result is the following:
> >>>>
> >>>>$ ./mmap /mnt/random-1024 5
> >>>>mmap: 1 pass took: 0.337657 (none: 0; res: 262144; super: 0; other: 0)
> >>>>mmap: 2 pass took: 0.186137 (none: 0; res: 262144; super: 0; other: 0)
> >>>>mmap: 3 pass took: 0.186132 (none: 0; res: 262144; super: 0; other: 0)
> >>>>mmap: 4 pass took: 0.186535 (none: 0; res: 262144; super: 0; other: 0)
> >>>>mmap: 5 pass took: 0.190353 (none: 0; res: 262144; super: 0; other: 0)
> >>>>
> >>>>This is what I expect. But why this doesn't work without reading file
> >>>>manually?
> >>>Issue seems to be in some change of the behaviour of the reserv or
> >>>phys allocator. I Cc:ed Alan.
> >>I'm pretty sure that the behavior here hasn't significantly changed in
> >>about twelve years. Otherwise, I agree with your analysis.
> >>
> >>On more than one occasion, I've been tempted to change:
> >>
> >>    pmap_remove_all(mt);
> >>    if (mt->dirty != 0)
> >>        vm_page_deactivate(mt);
> >>    else
> >>        vm_page_cache(mt);
> >>
> >>to:
> >>
> >>    vm_page_dontneed(mt);
> >>
> >>because I suspect that the current code does more harm than good. In
> >>theory, it saves activations of the page daemon. However, more often
> >>than not, I suspect that we are spending more on page reactivations than
> >>we are saving on page daemon activations. The sequential access
> >>detection heuristic is just too easily triggered. For example, I've
> >>seen it triggered by demand paging of the gcc text segment. Also, I
> >>think that pmap_remove_all() and especially vm_page_cache() are too
> >>severe for a detection heuristic that is so easily triggered.
> >Yes, I agree that such change shall be an improvement, and I expect
> >that Andrey will test it.
> >
> >On the other hand, I do think that allocator should prefer unnamed
> >pages to pages which still have valid content. On my 12G desktop,
> >I never saw more then 100MB of cached pages, and similar numbers
> >are observed on the 32-48GB servers. I suppose that this is related.
>
> On allocation, the system does prefer free pages over cached pages.
> When cached pages are added to the physical memory allocator, they are
> added to VM_FREEPOOL_CACHE. When pages are allocated, they are taken
> from VM_FREEPOOL_DEFAULT. Generally, pages only move from the CACHE
> pool to the DEFAULT pool when the DEFAULT pool is depleted. (However,
> occasionally, they do move because of coalescing.) When I redid the
> physical memory allocator, I looked at the rate of cached page
> reactivation under the old and the new allocators. At least for the
> tests that I did the rates weren't that different. It was low,
> single-digit percentages. 
I think the highest likelihood of
> reactivation comes from the pages that are cached by the sequential
> access heuristic because it is so overzealous.
>
> I don't think it's related. You see modest numbers of cached pages
> simply because the page daemon met its target for the sum of free and
> cached pages. So, it just stopped moving pages from the inactive queue
> into the physical memory allocator's cache/free queues.

No, I mean something else. Specifically, I mean that somehow the
preference for non-named pages does not work. At least, I cannot give
any other explanation for the following experiment.

Let's take stock HEAD without change in vm_fault.c. The initial
state of the 8GB machine is as follows, the test file was not even
stat(2)-ed yet.
Mem: 37M Active, 18M Inact, 150M Wired, 236K Cache, 27M Buf, 7612M Free

Now, run the unmodified original Andrey's test with only one pass,
making sequential read of the mmap of a 5GB file from UFS volume.
After the run
Mem: 38M Active, 18M Inact, 153M Wired, 21M Cache, 30M Buf, 7586M Free

Please note that the cached count increased by only about 20M, and this
is for calls to vm_page_cache() worth of 5GB. In other words, it seems
that the allocator almost never touches free memory, always preferring
cache. This mostly coincides with what I saw when I profiled the
original problem reported by Andrey. 
--iX3VwOUIQMdbvojH Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (FreeBSD) iEYEARECAAYFAk9+q6EACgkQC3+MBN1Mb4g3KACgrd/EEDznjuG/ZDQujCt3HLUf l7kAn2vcKkYTgsRfhcElYfsmBdSGaJvt =WXvc -----END PGP SIGNATURE----- --iX3VwOUIQMdbvojH-- From owner-freebsd-hackers@FreeBSD.ORG Fri Apr 6 08:37:25 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 442691065673; Fri, 6 Apr 2012 08:37:25 +0000 (UTC) (envelope-from alc@rice.edu) Received: from mh7.mail.rice.edu (mh7.mail.rice.edu [128.42.199.46]) by mx1.freebsd.org (Postfix) with ESMTP id 06C998FC08; Fri, 6 Apr 2012 08:37:25 +0000 (UTC) Received: from mh7.mail.rice.edu (localhost.localdomain [127.0.0.1]) by mh7.mail.rice.edu (Postfix) with ESMTP id 69C50291F5C; Fri, 6 Apr 2012 03:37:24 -0500 (CDT) Received: from mh7.mail.rice.edu (localhost.localdomain [127.0.0.1]) by mh7.mail.rice.edu (Postfix) with ESMTP id 52EE1292126; Fri, 6 Apr 2012 03:37:24 -0500 (CDT) X-Virus-Scanned: by amavis-2.6.4 at mh7.mail.rice.edu, auth channel Received: from mh7.mail.rice.edu ([127.0.0.1]) by mh7.mail.rice.edu (mh7.mail.rice.edu [127.0.0.1]) (amavis, port 10026) with ESMTP id SIbNeLWa7CVE; Fri, 6 Apr 2012 03:37:24 -0500 (CDT) Received: from adsl-216-63-78-18.dsl.hstntx.swbell.net (adsl-216-63-78-18.dsl.hstntx.swbell.net [216.63.78.18]) (using TLSv1 with cipher RC4-MD5 (128/128 bits)) (No client certificate requested) (Authenticated sender: alc) by mh7.mail.rice.edu (Postfix) with ESMTPSA id 634EB291F5C; Fri, 6 Apr 2012 03:37:23 -0500 (CDT) Message-ID: <4F7EAB40.7080706@rice.edu> Date: Fri, 06 Apr 2012 03:37:20 -0500 From: Alan Cox User-Agent: Mozilla/5.0 (X11; FreeBSD i386; rv:8.0) Gecko/20111113 Thunderbird/8.0 MIME-Version: 1.0 To: Konstantin Belousov References: <4F7B495D.3010402@zonov.org> <20120404071746.GJ2358@deviant.kiev.zoral.com.ua> 
In-Reply-To: <20120404071746.GJ2358@deviant.kiev.zoral.com.ua> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Mailman-Approved-At: Fri, 06 Apr 2012 10:44:54 +0000 Cc: alc@freebsd.org, freebsd-hackers@freebsd.org, Andrey Zonov Subject: Re: problems with mmap() and disk caching X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Apr 2012 08:37:25 -0000 On 04/04/2012 02:17, Konstantin Belousov wrote: > On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote: >> Hi, >> >> I open the file, then call mmap() on the whole file and get pointer, >> then I work with this pointer. I expect that page should be only once >> touched to get it into the memory (disk cache?), but this doesn't work! >> >> I wrote the test (attached) and ran it for the 1G file generated from >> /dev/random, the result is the following: >> >> Prepare file: >> # swapoff -a >> # newfs /dev/ada0b >> # mount /dev/ada0b /mnt >> # dd if=/dev/random of=/mnt/random-1024 bs=1m count=1024 >> >> Purge cache: >> # umount /mnt >> # mount /dev/ada0b /mnt >> >> Run test: >> $ ./mmap /mnt/random-1024 30 >> mmap: 1 pass took: 7.431046 (none: 262112; res: 32; super: >> 0; other: 0) >> mmap: 2 pass took: 7.356670 (none: 261648; res: 496; super: >> 0; other: 0) >> mmap: 3 pass took: 7.307094 (none: 260521; res: 1623; super: >> 0; other: 0) >> mmap: 4 pass took: 7.350239 (none: 258904; res: 3240; super: >> 0; other: 0) >> mmap: 5 pass took: 7.392480 (none: 257286; res: 4858; super: >> 0; other: 0) >> mmap: 6 pass took: 7.292069 (none: 255584; res: 6560; super: >> 0; other: 0) >> mmap: 7 pass took: 7.048980 (none: 251142; res: 11002; super: >> 0; other: 0) >> mmap: 8 pass took: 6.899387 (none: 247584; res: 14560; super: >> 0; other: 0) >> mmap: 9 pass took: 7.190579 (none: 242992; res: 
19152; super: >> 0; other: 0) >> mmap: 10 pass took: 6.915482 (none: 239308; res: 22836; super: >> 0; other: 0) >> mmap: 11 pass took: 6.565909 (none: 232835; res: 29309; super: >> 0; other: 0) >> mmap: 12 pass took: 6.423945 (none: 226160; res: 35984; super: >> 0; other: 0) >> mmap: 13 pass took: 6.315385 (none: 208555; res: 53589; super: >> 0; other: 0) >> mmap: 14 pass took: 6.760780 (none: 192805; res: 69339; super: >> 0; other: 0) >> mmap: 15 pass took: 5.721513 (none: 174497; res: 87647; super: >> 0; other: 0) >> mmap: 16 pass took: 5.004424 (none: 155938; res: 106206; super: >> 0; other: 0) >> mmap: 17 pass took: 4.224926 (none: 135639; res: 126505; super: >> 0; other: 0) >> mmap: 18 pass took: 3.749608 (none: 117952; res: 144192; super: >> 0; other: 0) >> mmap: 19 pass took: 3.398084 (none: 99066; res: 163078; super: >> 0; other: 0) >> mmap: 20 pass took: 3.029557 (none: 74994; res: 187150; super: >> 0; other: 0) >> mmap: 21 pass took: 2.379430 (none: 55231; res: 206913; super: >> 0; other: 0) >> mmap: 22 pass took: 2.046521 (none: 40786; res: 221358; super: >> 0; other: 0) >> mmap: 23 pass took: 1.152797 (none: 30311; res: 231833; super: >> 0; other: 0) >> mmap: 24 pass took: 0.972617 (none: 16196; res: 245948; super: >> 0; other: 0) >> mmap: 25 pass took: 0.577515 (none: 8286; res: 253858; super: >> 0; other: 0) >> mmap: 26 pass took: 0.380738 (none: 3712; res: 258432; super: >> 0; other: 0) >> mmap: 27 pass took: 0.253583 (none: 1193; res: 260951; super: >> 0; other: 0) >> mmap: 28 pass took: 0.157508 (none: 0; res: 262144; super: >> 0; other: 0) >> mmap: 29 pass took: 0.156169 (none: 0; res: 262144; super: >> 0; other: 0) >> mmap: 30 pass took: 0.156550 (none: 0; res: 262144; super: >> 0; other: 0) >> >> If I ran this: >> $ cat /mnt/random-1024> /dev/null >> before test, when result is the following: >> >> $ ./mmap /mnt/random-1024 5 >> mmap: 1 pass took: 0.337657 (none: 0; res: 262144; super: >> 0; other: 0) >> mmap: 2 pass took: 0.186137 (none: 0; 
res: 262144; super: 0; other: 0)
>> mmap: 3 pass took: 0.186132 (none: 0; res: 262144; super: 0; other: 0)
>> mmap: 4 pass took: 0.186535 (none: 0; res: 262144; super: 0; other: 0)
>> mmap: 5 pass took: 0.190353 (none: 0; res: 262144; super: 0; other: 0)
>>
>> This is what I expect. But why this doesn't work without reading file
>> manually?
> Issue seems to be in some change of the behaviour of the reserv or
> phys allocator. I Cc:ed Alan.
>
> What happens is that the fault handler deactivates or caches the pages
> previous to the one which would satisfy the fault. See the if()
> statement starting at line 463 of vm/vm_fault.c. Since all pages
> of the object in your test are clean, the pages are cached.
>
> Next fault would need to allocate some more pages for different index
> of the same object. What I see is that vm_reserv_alloc_page() returns a
> page that is from the cache for the same object, but different pindex.
> As an obvious result, the page is invalidated and repurposed. When next
> loop started, the page is not resident anymore, so it has to be re-read
> from disk.

I'm pretty sure that the pages aren't being repurposed this quickly.
Instead, I believe that the explanation is to be found in mincore().
mincore() is only reporting pages that are in the object's memq as
resident. It is not reporting cache pages as resident.

> The behaviour of the allocator is not consistent, so some pages are not
> reused, allowing the test to converge and to collect all pages of the
> object eventually.
>
> Calling madvise(MADV_RANDOM) fixes the issue, because the code to
> deactivate/cache the pages is turned off. On the other hand, it also
> turns off read-ahead for faulting, and the first loop becomes eternally
> long.
>
> Doing MADV_WILLNEED does not fix the problem indeed, since willneed
> reactivates the pages of the object at the time of call. To use
> MADV_WILLNEED, you would need to call it between faults/memcpy. 
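A side note on the counters printed by the test program quoted in this thread: its loop tests MINCORE_INCORE before MINCORE_SUPER. Assuming that a page mapped by a superpage is reported by mincore() with both bits set (this is an assumption about the kernel, not something stated in the thread), the "super" counter could never become non-zero with that ordering. Below is a minimal sketch of a classifier that checks the superpage bit first. The struct and function names are hypothetical, and the fallback flag values are assumptions that only exist so the sketch compiles where <sys/mman.h> does not define them.

```c
#include <stddef.h>
#include <sys/mman.h>

/* Fallback values: assumptions for systems whose headers lack these flags. */
#ifndef MINCORE_INCORE
#define MINCORE_INCORE 0x1
#endif
#ifndef MINCORE_SUPER
#define MINCORE_SUPER 0x20
#endif

/* Hypothetical helper type: per-category page counts. */
struct mincore_counts {
    size_t none;    /* status byte is zero */
    size_t incore;  /* resident, not part of a superpage mapping */
    size_t super;   /* superpage bit set */
    size_t other;   /* some other bit set, but neither of the above */
};

/*
 * Classify each mincore() status byte. The superpage bit is tested
 * before the resident bit, so a byte carrying both is counted as super.
 */
struct mincore_counts
classify_mincore(const char *vec, size_t npages)
{
    struct mincore_counts c = { 0, 0, 0, 0 };
    size_t i;

    for (i = 0; i < npages; i++) {
        if (vec[i] == 0)
            c.none++;
        else if (vec[i] & MINCORE_SUPER)
            c.super++;
        else if (vec[i] & MINCORE_INCORE)
            c.incore++;
        else
            c.other++;
    }
    return (c);
}
```

In the test program this would be called on the vector filled by mincore(p, size, vec) with npages = size / pagesize. It does not change Alan's point above: pages sitting in the cache queue would still be counted as "none".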
>
>> I've also never seen super pages, how to make them work?
> They just work, at least for me. Look at the output of procstat -v
> after enough loops finished to not cause disk activity.
>
>> I've been playing with madvise and posix_fadvise but no luck. BTW,
>> posix_fadvise(POSIX_FADV_WILLNEED) does nothing as the commentary says,
>> shouldn't this be documented in the manual page?
>>
>> All tests were run under 9.0-STABLE (r233744).
>>
>> --
>> Andrey Zonov
>> /*_
>>  * Andrey Zonov (c) 2011
>>  */
>>
>> #include <sys/types.h>
>> #include <sys/mman.h>
>> #include <sys/stat.h>
>> #include <sys/time.h>
>> #include <err.h>
>> #include <fcntl.h>
>> #include <stdlib.h>
>> #include <string.h>
>> #include <unistd.h>
>>
>> int
>> main(int argc, char **argv)
>> {
>>     int i;
>>     int fd;
>>     int num;
>>     int block;
>>     int pagesize;
>>     size_t n;
>>     size_t size;
>>     size_t none, incore, super, other;
>>     char *p;
>>     char *tmp;
>>     char *vec;
>>     char *vecp;
>>     struct stat sb;
>>     struct timeval tp, tp1, tp2;
>>
>>     if (argc < 2 || argc > 4)
>>         errx(1, "usage: mmap [num] [block]");
>>
>>     fd = open(argv[1], O_RDONLY);
>>     if (fd == -1)
>>         err(1, "open()");
>>
>>     num = 1;
>>     if (argc >= 3)
>>         num = atoi(argv[2]);
>>
>>     pagesize = getpagesize();
>>     block = pagesize;
>>     if (argc == 4)
>>         block = atoi(argv[3]);
>>
>>     if (fstat(fd, &sb) == -1)
>>         err(1, "fstat()");
>>     size = sb.st_size;
>>
>> #if 0
>>     if (posix_fadvise(fd, (off_t)0, (off_t)0, POSIX_FADV_WILLNEED) == -1)
>>         err(1, "posix_fadvise()");
>> #endif
>>
>>     p = mmap(NULL, sb.st_size, PROT_READ, /*MAP_PREFAULT_READ |*/ MAP_PRIVATE, fd, (off_t)0);
>>     if (p == MAP_FAILED)
>>         err(1, "mmap()");
>>
>> #if 0
>>     if (madvise(p, (size_t)size, MADV_WILLNEED) == -1)
>>         err(1, "madvise()");
>> #endif
>>
>>     tmp = calloc(1, block);
>>     if (tmp == NULL)
>>         err(1, "calloc()");
>>     vec = calloc(1, size / pagesize);
>>     if (vec == NULL)
>>         err(1, "calloc()");
>>     for (i = 0; i < num; i++) {
>>         gettimeofday(&tp1, NULL);
>>         for (n = 0; n < size / block; n++)
>>             memcpy(tmp, p + (n * block), block);
>>         gettimeofday(&tp2, NULL);
>>         timersub(&tp2, &tp1, &tp);
>>
>>         if 
(mincore(p, size, vec) == -1)
>>             err(1, "mincore()");
>>
>>         none = incore = super = other = 0;
>>         for (vecp = vec; (size_t)(vecp - vec) < size / pagesize; vecp++) {
>>             if (*vecp == 0)
>>                 none++;
>>             else if (*vecp & MINCORE_INCORE)
>>                 incore++;
>>             else if (*vecp & MINCORE_SUPER)
>>                 super++;
>>             else
>>                 other++;
>>         }
>>         warnx("%2d pass took: %3ld.%06ld (none: %6ld; res: %6ld; super: %6ld; other: %6ld)",
>>             i + 1, tp.tv_sec, tp.tv_usec, none, incore, super, other);
>>     }
>>     free(vec);
>>     free(tmp);
>>
>>     if (munmap(p, sb.st_size) == -1)
>>         err(1, "munmap()");
>>
>>     close(fd);
>>
>>     exit(0);
>> }
>> _______________________________________________
>> freebsd-hackers@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
>> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"

From owner-freebsd-hackers@FreeBSD.ORG Fri Apr 6 10:47:13 2012
Return-Path:
Delivered-To: freebsd-hackers@freebsd.org
Received: by hub.freebsd.org (Postfix, from userid 1233) id 8A7EE106566C; Fri, 6 Apr 2012 10:47:13 +0000 (UTC)
Date: Fri, 6 Apr 2012 10:47:13 +0000
From: Alexander Best
To: freebsd-hackers@freebsd.org
Message-ID: <20120406104713.GA12282@freebsd.org>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="ikeVEW9yuYc//A+q"
Content-Disposition: inline
Subject: possible signedness issue in aic7xxx
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 06 Apr 2012 10:47:13 -0000

--ikeVEW9yuYc//A+q
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

hi there,

i noticed the following warning from clang when building HEAD:

===> sys/modules/aic7xxx/aicasm (obj,build-tools)
/usr/github-freebsd-head/sys/modules/aic7xxx/aicasm/../../../dev/aic7xxx/aicasm/aicasm.c:604:5: warning: passing 'int *' to parameter of type 'unsigned int *' converts between 
pointers to integer types with different sign [-Wpointer-sign]
    &skip_addr, func_values) == 0) {
    ^~~~~~~~~~
/usr/github-freebsd-head/sys/modules/aic7xxx/aicasm/../../../dev/aic7xxx/aicasm/aicasm.c:83:24: note: passing argument to parameter 'skip_addr' here
    unsigned int *skip_addr, int *func_vals);
    ^
1 warning generated.

will the attached patch take care of the problem?

cheers.
alex

--ikeVEW9yuYc//A+q
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="aicasm.c.patch"

diff --git a/sys/dev/aic7xxx/aicasm/aicasm.c b/sys/dev/aic7xxx/aicasm/aicasm.c
index 1b88ba0..08a540f 100644
--- a/sys/dev/aic7xxx/aicasm/aicasm.c
+++ b/sys/dev/aic7xxx/aicasm/aicasm.c
@@ -353,7 +353,7 @@ output_code(void)
     patch_t *cur_patch;
     critical_section_t *cs;
     symbol_node_t *cur_node;
-    int instrcount;
+    unsigned int instrcount;
 
     instrcount = 0;
     fprintf(ofile,
@@ -455,7 +455,7 @@ output_code(void)
 "static const int num_critical_sections = sizeof(critical_sections)\n"
 " / sizeof(*critical_sections);\n");
 
-    fprintf(stderr, "%s: %d instructions used\n", appname, instrcount);
+    fprintf(stderr, "%s: %u instructions used\n", appname, instrcount);
 }
 
 static void
@@ -526,11 +526,11 @@ output_listing(char *ifilename)
     patch_t *cur_patch;
     symbol_node_t *cur_func;
     int *func_values;
-    int instrcount;
+    unsigned int instrcount;
     int instrptr;
     unsigned int line;
     int func_count;
-    int skip_addr;
+    unsigned int skip_addr;
 
     instrcount = 0;
     instrptr = 0;

--ikeVEW9yuYc//A+q--

From owner-freebsd-hackers@FreeBSD.ORG Fri Apr 6 09:06:59 2012
Return-Path:
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id B50D8106566C; Fri, 6 Apr 2012 09:06:59 +0000 (UTC) (envelope-from alc@rice.edu)
Received: from mh4.mail.rice.edu (mh4.mail.rice.edu [128.42.199.11]) by mx1.freebsd.org (Postfix) with ESMTP id 3C8448FC08; Fri, 6 Apr 2012 09:06:59 +0000 (UTC)
Received: from mh4.mail.rice.edu (localhost.localdomain 
[127.0.0.1]) by mh4.mail.rice.edu (Postfix) with ESMTP id E82E6291CED; Fri, 6 Apr 2012 04:06:52 -0500 (CDT) Received: from mh4.mail.rice.edu (localhost.localdomain [127.0.0.1]) by mh4.mail.rice.edu (Postfix) with ESMTP id D38732975F3; Fri, 6 Apr 2012 04:06:52 -0500 (CDT) X-Virus-Scanned: by amavis-2.6.4 at mh4.mail.rice.edu, auth channel Received: from mh4.mail.rice.edu ([127.0.0.1]) by mh4.mail.rice.edu (mh4.mail.rice.edu [127.0.0.1]) (amavis, port 10026) with ESMTP id HT8S0ODncpPr; Fri, 6 Apr 2012 04:06:52 -0500 (CDT) Received: from adsl-216-63-78-18.dsl.hstntx.swbell.net (adsl-216-63-78-18.dsl.hstntx.swbell.net [216.63.78.18]) (using TLSv1 with cipher RC4-MD5 (128/128 bits)) (No client certificate requested) (Authenticated sender: alc) by mh4.mail.rice.edu (Postfix) with ESMTPSA id 2A38B291CED; Fri, 6 Apr 2012 04:06:52 -0500 (CDT) Message-ID: <4F7EB22B.3000509@rice.edu> Date: Fri, 06 Apr 2012 04:06:51 -0500 From: Alan Cox User-Agent: Mozilla/5.0 (X11; FreeBSD i386; rv:8.0) Gecko/20111113 Thunderbird/8.0 MIME-Version: 1.0 To: Konstantin Belousov References: <4F7B495D.3010402@zonov.org> <20120404071746.GJ2358@deviant.kiev.zoral.com.ua> <4F7DC037.9060803@rice.edu> <20120405173138.GX2358@deviant.kiev.zoral.com.ua> <4F7DE3AD.5080401@rice.edu> <20120406083858.GG2358@deviant.kiev.zoral.com.ua> In-Reply-To: <20120406083858.GG2358@deviant.kiev.zoral.com.ua> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Mailman-Approved-At: Fri, 06 Apr 2012 11:00:43 +0000 Cc: alc@freebsd.org, freebsd-hackers@freebsd.org, Andrey Zonov Subject: Re: problems with mmap() and disk caching X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Apr 2012 09:06:59 -0000 On 04/06/2012 03:38, Konstantin Belousov wrote: > On Thu, Apr 05, 2012 at 01:25:49PM -0500, Alan Cox 
wrote: >> On 04/05/2012 12:31, Konstantin Belousov wrote: >>> On Thu, Apr 05, 2012 at 10:54:31AM -0500, Alan Cox wrote: >>>> On 04/04/2012 02:17, Konstantin Belousov wrote: >>>>> On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote: >>>>>> Hi, >>>>>> >>>>>> I open the file, then call mmap() on the whole file and get pointer, >>>>>> then I work with this pointer. I expect that page should be only once >>>>>> touched to get it into the memory (disk cache?), but this doesn't work! >>>>>> >>>>>> I wrote the test (attached) and ran it for the 1G file generated from >>>>>> /dev/random, the result is the following: >>>>>> >>>>>> Prepare file: >>>>>> # swapoff -a >>>>>> # newfs /dev/ada0b >>>>>> # mount /dev/ada0b /mnt >>>>>> # dd if=/dev/random of=/mnt/random-1024 bs=1m count=1024 >>>>>> >>>>>> Purge cache: >>>>>> # umount /mnt >>>>>> # mount /dev/ada0b /mnt >>>>>> >>>>>> Run test: >>>>>> $ ./mmap /mnt/random-1024 30 >>>>>> mmap: 1 pass took: 7.431046 (none: 262112; res: 32; super: >>>>>> 0; other: 0) >>>>>> mmap: 2 pass took: 7.356670 (none: 261648; res: 496; super: >>>>>> 0; other: 0) >>>>>> mmap: 3 pass took: 7.307094 (none: 260521; res: 1623; super: >>>>>> 0; other: 0) >>>>>> mmap: 4 pass took: 7.350239 (none: 258904; res: 3240; super: >>>>>> 0; other: 0) >>>>>> mmap: 5 pass took: 7.392480 (none: 257286; res: 4858; super: >>>>>> 0; other: 0) >>>>>> mmap: 6 pass took: 7.292069 (none: 255584; res: 6560; super: >>>>>> 0; other: 0) >>>>>> mmap: 7 pass took: 7.048980 (none: 251142; res: 11002; super: >>>>>> 0; other: 0) >>>>>> mmap: 8 pass took: 6.899387 (none: 247584; res: 14560; super: >>>>>> 0; other: 0) >>>>>> mmap: 9 pass took: 7.190579 (none: 242992; res: 19152; super: >>>>>> 0; other: 0) >>>>>> mmap: 10 pass took: 6.915482 (none: 239308; res: 22836; super: >>>>>> 0; other: 0) >>>>>> mmap: 11 pass took: 6.565909 (none: 232835; res: 29309; super: >>>>>> 0; other: 0) >>>>>> mmap: 12 pass took: 6.423945 (none: 226160; res: 35984; super: >>>>>> 0; other: 0) 
>>>>>> mmap: 13 pass took: 6.315385 (none: 208555; res: 53589; super: >>>>>> 0; other: 0) >>>>>> mmap: 14 pass took: 6.760780 (none: 192805; res: 69339; super: >>>>>> 0; other: 0) >>>>>> mmap: 15 pass took: 5.721513 (none: 174497; res: 87647; super: >>>>>> 0; other: 0) >>>>>> mmap: 16 pass took: 5.004424 (none: 155938; res: 106206; super: >>>>>> 0; other: 0) >>>>>> mmap: 17 pass took: 4.224926 (none: 135639; res: 126505; super: >>>>>> 0; other: 0) >>>>>> mmap: 18 pass took: 3.749608 (none: 117952; res: 144192; super: >>>>>> 0; other: 0) >>>>>> mmap: 19 pass took: 3.398084 (none: 99066; res: 163078; super: >>>>>> 0; other: 0) >>>>>> mmap: 20 pass took: 3.029557 (none: 74994; res: 187150; super: >>>>>> 0; other: 0) >>>>>> mmap: 21 pass took: 2.379430 (none: 55231; res: 206913; super: >>>>>> 0; other: 0) >>>>>> mmap: 22 pass took: 2.046521 (none: 40786; res: 221358; super: >>>>>> 0; other: 0) >>>>>> mmap: 23 pass took: 1.152797 (none: 30311; res: 231833; super: >>>>>> 0; other: 0) >>>>>> mmap: 24 pass took: 0.972617 (none: 16196; res: 245948; super: >>>>>> 0; other: 0) >>>>>> mmap: 25 pass took: 0.577515 (none: 8286; res: 253858; super: >>>>>> 0; other: 0) >>>>>> mmap: 26 pass took: 0.380738 (none: 3712; res: 258432; super: >>>>>> 0; other: 0) >>>>>> mmap: 27 pass took: 0.253583 (none: 1193; res: 260951; super: >>>>>> 0; other: 0) >>>>>> mmap: 28 pass took: 0.157508 (none: 0; res: 262144; super: >>>>>> 0; other: 0) >>>>>> mmap: 29 pass took: 0.156169 (none: 0; res: 262144; super: >>>>>> 0; other: 0) >>>>>> mmap: 30 pass took: 0.156550 (none: 0; res: 262144; super: >>>>>> 0; other: 0) >>>>>> >>>>>> If I ran this: >>>>>> $ cat /mnt/random-1024> /dev/null >>>>>> before test, when result is the following: >>>>>> >>>>>> $ ./mmap /mnt/random-1024 5 >>>>>> mmap: 1 pass took: 0.337657 (none: 0; res: 262144; super: >>>>>> 0; other: 0) >>>>>> mmap: 2 pass took: 0.186137 (none: 0; res: 262144; super: >>>>>> 0; other: 0) >>>>>> mmap: 3 pass took: 0.186132 (none: 0; res: 262144; 
super: >>>>>> 0; other: 0) >>>>>> mmap: 4 pass took: 0.186535 (none: 0; res: 262144; super: >>>>>> 0; other: 0) >>>>>> mmap: 5 pass took: 0.190353 (none: 0; res: 262144; super: >>>>>> 0; other: 0) >>>>>> >>>>>> This is what I expect. But why this doesn't work without reading file >>>>>> manually? >>>>> Issue seems to be in some change of the behaviour of the reserv or >>>>> phys allocator. I Cc:ed Alan. >>>> I'm pretty sure that the behavior here hasn't significantly changed in >>>> about twelve years. Otherwise, I agree with your analysis. >>>> >>>> On more than one occasion, I've been tempted to change: >>>> >>>> pmap_remove_all(mt); >>>> if (mt->dirty != 0) >>>> vm_page_deactivate(mt); >>>> else >>>> vm_page_cache(mt); >>>> >>>> to: >>>> >>>> vm_page_dontneed(mt); >>>> >>>> because I suspect that the current code does more harm than good. In >>>> theory, it saves activations of the page daemon. However, more often >>>> than not, I suspect that we are spending more on page reactivations than >>>> we are saving on page daemon activations. The sequential access >>>> detection heuristic is just too easily triggered. For example, I've >>>> seen it triggered by demand paging of the gcc text segment. Also, I >>>> think that pmap_remove_all() and especially vm_page_cache() are too >>>> severe for a detection heuristic that is so easily triggered. >>> Yes, I agree that such change shall be an improvement, and I expect >>> that Andrey will test it. >>> >>> On the other hand, I do think that allocator should prefer unnamed >>> pages to pages which still have valid content. On my 12G desktop, >>> I never saw more then 100MB of cached pages, and similar numbers >>> are observed on the 32-48GB servers. I suppose that this is related. >> On allocation, the system does prefer free pages over cached pages. >> When cached pages are added to the physical memory allocator, they are >> added to VM_FREEPOOL_CACHE. When pages are allocated, they are taken >> from VM_FREEPOOL_DEFAULT. 
Generally, pages only move from the CACHE
>> pool to the DEFAULT pool when the DEFAULT pool is depleted. (However,
>> occasionally, they do move because of coalescing.) When I redid the
>> physical memory allocator, I looked at the rate of cached page
>> reactivation under the old and the new allocators. At least for the
>> tests that I did the rates weren't that different. It was low,
>> single-digit percentages. I think the highest likelihood of
>> reactivation comes from the pages that are cached by the sequential
>> access heuristic because it is so overzealous.
>>
>> I don't think it's related. You see modest numbers of cached pages
>> simply because the page daemon met its target for the sum of free and
>> cached pages. So, it just stopped moving pages from the inactive queue
>> into the physical memory allocator's cache/free queues.
> No, I mean something else. Specifically, I mean that somehow the
> preference for non-named pages does not work. At least, I cannot give
> any other explanation for the following experiment.
>
> Let's take stock HEAD without change in vm_fault.c. The initial
> state of the 8GB machine is as follows, the test file was not even
> stat(2)-ed yet.
> Mem: 37M Active, 18M Inact, 150M Wired, 236K Cache, 27M Buf, 7612M Free
>
> Now, run the unmodified original Andrey's test with only one pass,
> making sequential read of the mmap of a 5GB file from UFS volume.
> After the run
> Mem: 38M Active, 18M Inact, 153M Wired, 21M Cache, 30M Buf, 7586M Free
>
> Please note that the cached count increased by only about 20M, and this
> is for calls to vm_page_cache() worth of 5GB. In other words, it seems
> that the allocator almost never touches free memory, always preferring
> cache. This mostly coincides with what I saw when I profiled the
> original problem reported by Andrey.

Ah, I understand. 
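To put the numbers from the experiment above in perspective, here is a rough back-of-the-envelope calculation. It assumes 4096-byte pages (consistent with the 262144 pages the test reports for the 1G file) and takes the rounded "Mem:" figures at face value, so the result is only an estimate. The helper names are hypothetical.

```c
#include <stdint.h>

#define PAGE_BYTES 4096ULL  /* assumed page size: 1 GB / 262144 pages */

/* Number of whole pages covered by a byte count. */
uint64_t
bytes_to_pages(uint64_t bytes)
{
    return (bytes / PAGE_BYTES);
}

/*
 * Percentage of the pages touched by the sequential read that ended up
 * as net growth of the Cache figure.
 */
double
cache_growth_percent(uint64_t cache_before, uint64_t cache_after,
    uint64_t bytes_read)
{
    uint64_t grown;

    grown = bytes_to_pages(cache_after) - bytes_to_pages(cache_before);
    return (100.0 * (double)grown / (double)bytes_to_pages(bytes_read));
}
```

With the figures quoted above (Cache 236K before, 21M after, 5GB read), the file covers 1310720 pages while the Cache figure grows by 5317 pages, so cache_growth_percent(236ULL << 10, 21ULL << 20, 5ULL << 30) comes out at roughly 0.4 percent.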
From owner-freebsd-hackers@FreeBSD.ORG Fri Apr 6 14:13:20 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EBA62106566B; Fri, 6 Apr 2012 14:13:20 +0000 (UTC) (envelope-from asmrookie@gmail.com) Received: from mail-lpp01m010-f54.google.com (mail-lpp01m010-f54.google.com [209.85.215.54]) by mx1.freebsd.org (Postfix) with ESMTP id 7D35E8FC12; Fri, 6 Apr 2012 14:13:19 +0000 (UTC) Received: by lagv3 with SMTP id v3so3256993lag.13 for ; Fri, 06 Apr 2012 07:13:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=0HKZXzzhla0zLLcbztegWtUXI92ZPMj/ck0vAu6yrBM=; b=Bo4hm4+ERIOFBL9nP7Kms0U1ki9RU6AyXEH6EZlYk2lz/hqPNCLnIiukqdXexZAm1Y Xb48ZTQgub5ocus7UA/oGNv3yOaZ30nIepAS2jxct95Y0LDDtJNhd8Ta4nXJ0S7GMECb slstf2QDThU9MPztOY6/Q3iIQteWWATt9QHFPIjylTc9ukYjBl0x+LmtxLdg5sDr7x8E 3nsPhL8xxtqJpEi8zmG3WzbgBn6HZg55Z2mEf2Be/egHzqSkH2uLaH/tjxy3K4P0+nMz O0HrdQB8IZVWMVo96SFa9IhwEegzQxDoBNfBScnmXAzXveiIlEKNsrAPqLzh3R3IrOga t+Og== MIME-Version: 1.0 Received: by 10.152.103.239 with SMTP id fz15mr8742722lab.42.1333721598380; Fri, 06 Apr 2012 07:13:18 -0700 (PDT) Sender: asmrookie@gmail.com Received: by 10.112.93.138 with HTTP; Fri, 6 Apr 2012 07:13:18 -0700 (PDT) In-Reply-To: References: <4F2F7B7F.40508@FreeBSD.org> <4F366E8F.9060207@FreeBSD.org> <4F367965.6000602@FreeBSD.org> <4F396B24.5090602@FreeBSD.org> <4F3978BC.6090608@FreeBSD.org> <4F3990EA.1080002@FreeBSD.org> <4F3C0BB9.6050101@FreeBSD.org> <4F3E807A.60103@FreeBSD.org> <4F3E8858.4000001@FreeBSD.org> Date: Fri, 6 Apr 2012 15:13:18 +0100 X-Google-Sender-Auth: KJiZxTVK21Sfhjp69bIgLvdGm2o Message-ID: From: Attilio Rao To: Arnaud Lacombe Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: Florian Smeets , 
freebsd-hackers@freebsd.org, Alexander Motin, Andriy Gapon, FreeBSD current, Jeff Roberson
Subject: Re: [RFT][patch] Scheduling for HTT and not only
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 06 Apr 2012 14:13:21 -0000

On 5 April 2012 19:12, Arnaud Lacombe wrote:
> Hi,
>
> [Sorry for the delay, I got a bit sidetrack'ed...]
>
> 2012/2/17 Alexander Motin:
>> On 17.02.2012 18:53, Arnaud Lacombe wrote:
>>>
>>> On Fri, Feb 17, 2012 at 11:29 AM, Alexander Motin wrote:
>>>>
>>>> On 02/15/12 21:54, Jeff Roberson wrote:
>>>>>
>>>>> On Wed, 15 Feb 2012, Alexander Motin wrote:
>>>>>>
>>>>>> I've decided to stop those cache black magic practices and focus on
>>>>>> things that really exist in this world -- SMT and CPU load. I've
>>>>>> dropped most of cache related things from the patch and made the rest
>>>>>> of things more strict and predictable:
>>>>>> http://people.freebsd.org/~mav/sched.htt34.patch
>>>>>
>>>>>
>>>>> This looks great. I think there is value in considering the other
>>>>> approach further but I would like to do this part first. It would be
>>>>> nice to also add priority as a greater influence in the load balancing
>>>>> as well.
>>>>
>>>>
>>>> I haven't got good idea yet about balancing priorities, but I've rewritten
>>>> balancer itself. As soon as sched_lowest() / sched_highest() are more
>>>> intelligent now, they allowed to remove topology traversing from the
>>>> balancer itself. That should fix double-swapping problem, allow to keep some
>>>> affinity while moving threads and make balancing more fair. I did number of
>>>> tests running 4, 8, 9 and 16 CPU-bound threads on 8 CPUs. With 4, 8 and 16
>>>> threads everything is stationary as it should. 
With 9 threads I see regular
>>>> and random load move between all 8 CPUs. Measurements on 5 minutes run show
>>>> deviation of only about 5 seconds. It is the same deviation as I see caused
>>>> by only scheduling of 16 threads on 8 cores without any balancing needed at
>>>> all. So I believe this code works as it should.
>>>>
>>>> Here is the patch: http://people.freebsd.org/~mav/sched.htt40.patch
>>>>
>>>> I plan this to be a final patch of this series (more to come :)) and if
>>>> there will be no problems or objections, I am going to commit it (except
>>>> some debugging KTRs) in about ten days. So now it's a good time for reviews
>>>> and testing. :)
>>>>
>>> is there a place where all the patches are available ?
>>
>>
>> All my scheduler patches are cumulative, so all you need is only the last
>> mentioned here sched.htt40.patch.
>>
> You may want to have a look to the result I collected in the
> `runs/freebsd-experiments' branch of:
>
> https://github.com/lacombar/hackbench/
>
> and compare them with vanilla FreeBSD 9.0 and -CURRENT results
> available in `runs/freebsd'. On the dual package platform, your patch
> is not a definite win.
>
>> But in some cases, especially for multi-socket systems, to let it show its
>> best, you may want to apply additional patch from avg@ to better detect CPU
>> topology:
>> https://gitorious.org/~avg/freebsd/avgbsd/commit/6bca4a2e4854ea3fc275946a023db65c483cb9dd
>>
> test I conducted specifically for this patch did not showed much improvement...

Can you please clarify on this point?
The test you did included cases where the topology was detected badly
against cases where the topology was detected correctly as a patched
kernel (and you still didn't see a performance improvement), in terms
of cache line sharing?

Attilio

--
Peace can only be achieved by understanding - A. 
Einstein From owner-freebsd-hackers@FreeBSD.ORG Fri Apr 6 14:27:20 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DFF831065672; Fri, 6 Apr 2012 14:27:20 +0000 (UTC) (envelope-from mavbsd@gmail.com) Received: from mail-wg0-f50.google.com (mail-wg0-f50.google.com [74.125.82.50]) by mx1.freebsd.org (Postfix) with ESMTP id AFFD78FC18; Fri, 6 Apr 2012 14:27:19 +0000 (UTC) Received: by wgbds12 with SMTP id ds12so2228200wgb.31 for ; Fri, 06 Apr 2012 07:27:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:content-type:content-transfer-encoding; bh=emZVa9b8Bc1ZMHPCqJo/EhEjOxbLz1GKWE7WZGF5a9o=; b=YAe/XBzLhJxuHsLOIWDvwe9GSTLSPviUny3gQuBWMmMK/VUZBD0JbPDYsI5B0CNne/ zcLHWlAM7LJKKPzHUHkbfs/fhLkyNx/gZrRHt4AO8ifC+Buc31ChKVBjXHnOMhHV1EXn VrKbZN+CatQr8l9v+o6PZtCou3K/eDZaLjEeklLe9pZeY3qhsCBAwoIJ+L8awDM0PMLl lvUUXjqPaH2vKegZS7//FgDEho8Bjp3zvol8of3lNWY3sEd20/18Yur7qyrvqZE6wleC WFNrteosfT9kjnM8JnaS0IQRubVW33yfnXOTgWYS4K0d10E3sUQDdyxvzgszm+0jck/f 7N/g== Received: by 10.180.101.230 with SMTP id fj6mr20202527wib.13.1333722437873; Fri, 06 Apr 2012 07:27:17 -0700 (PDT) Received: from mavbook2.mavhome.dp.ua (pc.mavhome.dp.ua. 
[212.86.226.226]) by mx.google.com with ESMTPS id gg2sm11209533wib.7.2012.04.06.07.27.15 (version=SSLv3 cipher=OTHER); Fri, 06 Apr 2012 07:27:17 -0700 (PDT) Sender: Alexander Motin Message-ID: <4F7EFD42.9010507@FreeBSD.org> Date: Fri, 06 Apr 2012 17:27:14 +0300 From: Alexander Motin User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:10.0.2) Gecko/20120226 Thunderbird/10.0.2 MIME-Version: 1.0 To: Attilio Rao References: <4F2F7B7F.40508@FreeBSD.org> <4F366E8F.9060207@FreeBSD.org> <4F367965.6000602@FreeBSD.org> <4F396B24.5090602@FreeBSD.org> <4F3978BC.6090608@FreeBSD.org> <4F3990EA.1080002@FreeBSD.org> <4F3C0BB9.6050101@FreeBSD.org> <4F3E807A.60103@FreeBSD.org> <4F3E8858.4000001@FreeBSD.org> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: Florian Smeets , freebsd-hackers@freebsd.org, Andriy Gapon , FreeBSD current , Jeff Roberson , Arnaud Lacombe Subject: Re: [RFT][patch] Scheduling for HTT and not only X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Apr 2012 14:27:21 -0000 On 04/06/12 17:13, Attilio Rao wrote: > Il 05 aprile 2012 19:12, Arnaud Lacombe ha scritto: >> Hi, >> >> [Sorry for the delay, I got a bit sidetrack'ed...] >> >> 2012/2/17 Alexander Motin: >>> On 17.02.2012 18:53, Arnaud Lacombe wrote: >>>> >>>> On Fri, Feb 17, 2012 at 11:29 AM, Alexander Motin wrote: >>>>> >>>>> On 02/15/12 21:54, Jeff Roberson wrote: >>>>>> >>>>>> On Wed, 15 Feb 2012, Alexander Motin wrote: >>>>>>> >>>>>>> I've decided to stop those cache black magic practices and focus on >>>>>>> things that really exist in this world -- SMT and CPU load. 
I've >>>>>>> dropped most of cache related things from the patch and made the rest >>>>>>> of things more strict and predictable: >>>>>>> http://people.freebsd.org/~mav/sched.htt34.patch >>>>>> >>>>>> >>>>>> This looks great. I think there is value in considering the other >>>>>> approach further but I would like to do this part first. It would be >>>>>> nice to also add priority as a greater influence in the load balancing >>>>>> as well. >>>>> >>>>> >>>>> I haven't got good idea yet about balancing priorities, but I've >>>>> rewritten >>>>> balancer itself. As soon as sched_lowest() / sched_highest() are more >>>>> intelligent now, they allowed to remove topology traversing from the >>>>> balancer itself. That should fix double-swapping problem, allow to keep >>>>> some >>>>> affinity while moving threads and make balancing more fair. I did number >>>>> of >>>>> tests running 4, 8, 9 and 16 CPU-bound threads on 8 CPUs. With 4, 8 and >>>>> 16 >>>>> threads everything is stationary as it should. With 9 threads I see >>>>> regular >>>>> and random load move between all 8 CPUs. Measurements on 5 minutes run >>>>> show >>>>> deviation of only about 5 seconds. It is the same deviation as I see >>>>> caused >>>>> by only scheduling of 16 threads on 8 cores without any balancing needed >>>>> at >>>>> all. So I believe this code works as it should. >>>>> >>>>> Here is the patch: http://people.freebsd.org/~mav/sched.htt40.patch >>>>> >>>>> I plan this to be a final patch of this series (more to come :)) and if >>>>> there will be no problems or objections, I am going to commit it (except >>>>> some debugging KTRs) in about ten days. So now it's a good time for >>>>> reviews >>>>> and testing. :) >>>>> >>>> is there a place where all the patches are available ? >>> >>> >>> All my scheduler patches are cumulative, so all you need is only the last >>> mentioned here sched.htt40.patch. 
>>> >> You may want to have a look to the result I collected in the >> `runs/freebsd-experiments' branch of: >> >> https://github.com/lacombar/hackbench/ >> >> and compare them with vanilla FreeBSD 9.0 and -CURRENT results >> available in `runs/freebsd'. On the dual package platform, your patch >> is not a definite win. >> >>> But in some cases, especially for multi-socket systems, to let it show its >>> best, you may want to apply additional patch from avg@ to better detect CPU >>> topology: >>> https://gitorious.org/~avg/freebsd/avgbsd/commit/6bca4a2e4854ea3fc275946a023db65c483cb9dd >>> >> test I conducted specifically for this patch did not showed much improvement... > > Can you please clarify on this point? > The test you did included cases where the topology was detected badly > against cases where the topology was detected correctly as a patched > kernel (and you still didn't see a performance improvement), in terms > of cache line sharing? At this moment SCHED_ULE does almost nothing in terms of cache line sharing affinity (though it probably worth some further experiments). What this patch may improve is opposite case -- reduce cache sharing pressure for cache-hungry applications. For example, proper cache topology detection (such as lack of global L3 cache, but shared L2 per pairs of cores on Core2Quad class CPUs) increases pbzip2 performance when number of threads is less then number of CPUs (i.e. when there is place for optimization). 
-- Alexander Motin From owner-freebsd-hackers@FreeBSD.ORG Fri Apr 6 14:30:35 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 3DB1F106564A; Fri, 6 Apr 2012 14:30:35 +0000 (UTC) (envelope-from asmrookie@gmail.com) Received: from mail-lb0-f182.google.com (mail-lb0-f182.google.com [209.85.217.182]) by mx1.freebsd.org (Postfix) with ESMTP id C306C8FC1D; Fri, 6 Apr 2012 14:30:33 +0000 (UTC) Received: by lbok6 with SMTP id k6so967605lbo.13 for ; Fri, 06 Apr 2012 07:30:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=tOILqdEcU3+tOlJFu8Kfm13oW7YAskpkPAOORjDaPrw=; b=MSjuwHUeBYrbdIDRV08hw8GBWVa0Hlaq5zUhQh2M0xhhL/YpizN+IKDyyerfQFdqRN P1TWrX7veFdFj+mqus1R7OCJRR21cFDhgUVHZtOA3J+A+DAaTiLAehG7vBk1v+flXT6v uPmg79eoRq6dbKBfYBkgYoo2tDfAkn5VF1Hkrm4g0jClr+K8Uvbk4ICsxgtWTIkUK7vz WVHb6q20WVy6mZrG8dVqOe4Dsji08eHVGse2o6xdgfCcCxq9ytnHmQD2/pJomAKgU8HD +qsUA3GwXkaUQNuTwtn/P1ggtUW18paTililqocUChC0SZBTPlblmNGd7Edsi51kYS0n 12QA== MIME-Version: 1.0 Received: by 10.152.103.239 with SMTP id fz15mr8813821lab.42.1333722632712; Fri, 06 Apr 2012 07:30:32 -0700 (PDT) Sender: asmrookie@gmail.com Received: by 10.112.93.138 with HTTP; Fri, 6 Apr 2012 07:30:32 -0700 (PDT) In-Reply-To: <4F7EFD42.9010507@FreeBSD.org> References: <4F2F7B7F.40508@FreeBSD.org> <4F366E8F.9060207@FreeBSD.org> <4F367965.6000602@FreeBSD.org> <4F396B24.5090602@FreeBSD.org> <4F3978BC.6090608@FreeBSD.org> <4F3990EA.1080002@FreeBSD.org> <4F3C0BB9.6050101@FreeBSD.org> <4F3E807A.60103@FreeBSD.org> <4F3E8858.4000001@FreeBSD.org> <4F7EFD42.9010507@FreeBSD.org> Date: Fri, 6 Apr 2012 15:30:32 +0100 X-Google-Sender-Auth: 7dogYljInpWmobDNMWo-gwj86CU Message-ID: From: Attilio Rao To: Alexander Motin Content-Type: text/plain; charset=UTF-8 
Content-Transfer-Encoding: quoted-printable Cc: Florian Smeets , freebsd-hackers@freebsd.org, Andriy Gapon , FreeBSD current , Jeff Roberson , Arnaud Lacombe Subject: Re: [RFT][patch] Scheduling for HTT and not only X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Apr 2012 14:30:35 -0000 Il 06 aprile 2012 15:27, Alexander Motin ha scritto: > On 04/06/12 17:13, Attilio Rao wrote: >> >> Il 05 aprile 2012 19:12, Arnaud Lacombe ha scritto: >>> >>> Hi, >>> >>> [Sorry for the delay, I got a bit sidetrack'ed...] >>> >>> 2012/2/17 Alexander Motin: >>>> >>>> On 17.02.2012 18:53, Arnaud Lacombe wrote: >>>>> >>>>> >>>>> On Fri, Feb 17, 2012 at 11:29 AM, Alexander Motin >>>>> wrote: >>>>>> >>>>>> >>>>>> On 02/15/12 21:54, Jeff Roberson wrote: >>>>>>> >>>>>>> >>>>>>> On Wed, 15 Feb 2012, Alexander Motin wrote: >>>>>>>> >>>>>>>> >>>>>>>> I've decided to stop those cache black magic practices and focus on >>>>>>>> things that really exist in this world -- SMT and CPU load. I've >>>>>>>> dropped most of cache related things from the patch and made the >>>>>>>> rest >>>>>>>> of things more strict and predictable: >>>>>>>> http://people.freebsd.org/~mav/sched.htt34.patch >>>>>>> >>>>>>> >>>>>>> >>>>>>> This looks great. I think there is value in considering the other >>>>>>> approach further but I would like to do this part first. It would be >>>>>>> nice to also add priority as a greater influence in the load >>>>>>> balancing >>>>>>> as well. >>>>>> >>>>>> >>>>>> >>>>>> I haven't got good idea yet about balancing priorities, but I've >>>>>> rewritten >>>>>> balancer itself. As soon as sched_lowest() / sched_highest() are more >>>>>> intelligent now, they allowed to remove topology traversing from the >>>>>> balancer itself.
That should fix double-swapping problem, allow to >>>>>> keep >>>>>> some >>>>>> affinity while moving threads and make balancing more fair. I did >>>>>> number >>>>>> of >>>>>> tests running 4, 8, 9 and 16 CPU-bound threads on 8 CPUs. With 4, 8 >>>>>> and >>>>>> 16 >>>>>> threads everything is stationary as it should. With 9 threads I see >>>>>> regular >>>>>> and random load move between all 8 CPUs. Measurements on 5 minutes run >>>>>> show >>>>>> deviation of only about 5 seconds. It is the same deviation as I see >>>>>> caused >>>>>> by only scheduling of 16 threads on 8 cores without any balancing >>>>>> needed >>>>>> at >>>>>> all. So I believe this code works as it should. >>>>>> >>>>>> Here is the patch: http://people.freebsd.org/~mav/sched.htt40.patch >>>>>> >>>>>> I plan this to be a final patch of this series (more to come :)) and >>>>>> if >>>>>> there will be no problems or objections, I am going to commit it >>>>>> (except >>>>>> some debugging KTRs) in about ten days. So now it's a good time for >>>>>> reviews >>>>>> and testing. :) >>>>>> >>>>> is there a place where all the patches are available ? >>>> >>>> >>>> >>>> All my scheduler patches are cumulative, so all you need is only the >>>> last >>>> mentioned here sched.htt40.patch. >>>> >>> You may want to have a look to the result I collected in the >>> `runs/freebsd-experiments' branch of: >>> >>> https://github.com/lacombar/hackbench/ >>> >>> and compare them with vanilla FreeBSD 9.0 and -CURRENT results >>> available in `runs/freebsd'. On the dual package platform, your patch >>> is not a definite win. >>> >>>> But in some cases, especially for multi-socket systems, to let it show >>>> its >>>> best, you may want to apply additional patch from avg@ to better detect >>>> CPU >>>> topology: >>>> >>>> https://gitorious.org/~avg/freebsd/avgbsd/commit/6bca4a2e4854ea3fc275946a023db65c483cb9dd >>>> >>> test I conducted specifically for this patch did not showed much >>> improvement...
>> >> >> Can you please clarify on this point? >> The test you did included cases where the topology was detected badly >> against cases where the topology was detected correctly as a patched >> kernel (and you still didn't see a performance improvement), in terms >> of cache line sharing? > > > At this moment SCHED_ULE does almost nothing in terms of cache line shari= ng > affinity (though it probably worth some further experiments). What this > patch may improve is opposite case -- reduce cache sharing pressure for > cache-hungry applications. For example, proper cache topology detection > (such as lack of global L3 cache, but shared L2 per pairs of cores on > Core2Quad class CPUs) increases pbzip2 performance when number of threads= is > less then number of CPUs (i.e. when there is place for optimization). My asking is not referred to your patch really. I just wanted to know if he correctly benchmark a case where the topology was screwed up and then correctly recognized by avg's patch in terms of cache level aggregation (it wasn't referred to your patch btw). Attilio --=20 Peace can only be achieved by understanding - A. 
Einstein From owner-freebsd-hackers@FreeBSD.ORG Fri Apr 6 14:41:14 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 145F0106564A; Fri, 6 Apr 2012 14:41:14 +0000 (UTC) (envelope-from mavbsd@gmail.com) Received: from mail-wg0-f50.google.com (mail-wg0-f50.google.com [74.125.82.50]) by mx1.freebsd.org (Postfix) with ESMTP id CD41C8FC0C; Fri, 6 Apr 2012 14:41:12 +0000 (UTC) Received: by wgbds12 with SMTP id ds12so2238046wgb.31 for ; Fri, 06 Apr 2012 07:41:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:content-type:content-transfer-encoding; bh=MS2dklQoXxfnle5eL7zTe/RcCJ+1c6Y0WNgi+VB4LXA=; b=tBn1iWlNZ2hHZAEFHq+YhDNBF3YN61F/hIHOF9OwXNnrT86oJBpWlW2N7thq6lH3YA ZLo1yha1jyVltfU103zoOwq1latnFLXGd2fVPt6UygHmGeeEsX4v26SVT1hBI2BBh5iI yJ/qnCdT0csd6vxPfk7SlN8OstAO4S0IPo1iugsCHuU24fLA1JVhDtpqRW7mlGZYeSRr Qc7JsD4ynt4hDT61E9RvO3heZLJ+WLwOwZGo79h5jLllVhuDGrnB624t6lvFGPqajNSC KiB8Carg8N44XLGsP6aqkSiQPv1aiANzLAwve72rH9/5UdbvtQGqyYDD7UTlFij6ggkj sgBA== Received: by 10.180.82.136 with SMTP id i8mr12007448wiy.19.1333723271834; Fri, 06 Apr 2012 07:41:11 -0700 (PDT) Received: from mavbook2.mavhome.dp.ua (pc.mavhome.dp.ua. 
[212.86.226.226]) by mx.google.com with ESMTPS id fn2sm11423355wib.0.2012.04.06.07.41.09 (version=SSLv3 cipher=OTHER); Fri, 06 Apr 2012 07:41:11 -0700 (PDT) Sender: Alexander Motin Message-ID: <4F7F0085.7090001@FreeBSD.org> Date: Fri, 06 Apr 2012 17:41:09 +0300 From: Alexander Motin User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:10.0.2) Gecko/20120226 Thunderbird/10.0.2 MIME-Version: 1.0 To: Attilio Rao References: <4F2F7B7F.40508@FreeBSD.org> <4F366E8F.9060207@FreeBSD.org> <4F367965.6000602@FreeBSD.org> <4F396B24.5090602@FreeBSD.org> <4F3978BC.6090608@FreeBSD.org> <4F3990EA.1080002@FreeBSD.org> <4F3C0BB9.6050101@FreeBSD.org> <4F3E807A.60103@FreeBSD.org> <4F3E8858.4000001@FreeBSD.org> <4F7EFD42.9010507@FreeBSD.org> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: Florian Smeets , freebsd-hackers@freebsd.org, Andriy Gapon , FreeBSD current , Jeff Roberson , Arnaud Lacombe Subject: Re: [RFT][patch] Scheduling for HTT and not only X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Apr 2012 14:41:14 -0000 On 04/06/12 17:30, Attilio Rao wrote: > Il 06 aprile 2012 15:27, Alexander Motin ha scritto: >> On 04/06/12 17:13, Attilio Rao wrote: >>> >>> Il 05 aprile 2012 19:12, Arnaud Lacombe ha scritto: >>>> >>>> Hi, >>>> >>>> [Sorry for the delay, I got a bit sidetrack'ed...] 
>>>> >>>> 2012/2/17 Alexander Motin: >>>>> >>>>> On 17.02.2012 18:53, Arnaud Lacombe wrote: >>>>>> >>>>>> >>>>>> On Fri, Feb 17, 2012 at 11:29 AM, Alexander Motin >>>>>> wrote: >>>>>>> >>>>>>> >>>>>>> On 02/15/12 21:54, Jeff Roberson wrote: >>>>>>>> >>>>>>>> >>>>>>>> On Wed, 15 Feb 2012, Alexander Motin wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> I've decided to stop those cache black magic practices and focus on >>>>>>>>> things that really exist in this world -- SMT and CPU load. I've >>>>>>>>> dropped most of cache related things from the patch and made the >>>>>>>>> rest >>>>>>>>> of things more strict and predictable: >>>>>>>>> http://people.freebsd.org/~mav/sched.htt34.patch >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> This looks great. I think there is value in considering the other >>>>>>>> approach further but I would like to do this part first. It would be >>>>>>>> nice to also add priority as a greater influence in the load >>>>>>>> balancing >>>>>>>> as well. >>>>>>> >>>>>>> >>>>>>> >>>>>>> I haven't got good idea yet about balancing priorities, but I've >>>>>>> rewritten >>>>>>> balancer itself. As soon as sched_lowest() / sched_highest() are more >>>>>>> intelligent now, they allowed to remove topology traversing from the >>>>>>> balancer itself. That should fix double-swapping problem, allow to >>>>>>> keep >>>>>>> some >>>>>>> affinity while moving threads and make balancing more fair. I did >>>>>>> number >>>>>>> of >>>>>>> tests running 4, 8, 9 and 16 CPU-bound threads on 8 CPUs. With 4, 8 >>>>>>> and >>>>>>> 16 >>>>>>> threads everything is stationary as it should. With 9 threads I see >>>>>>> regular >>>>>>> and random load move between all 8 CPUs. Measurements on 5 minutes run >>>>>>> show >>>>>>> deviation of only about 5 seconds. It is the same deviation as I see >>>>>>> caused >>>>>>> by only scheduling of 16 threads on 8 cores without any balancing >>>>>>> needed >>>>>>> at >>>>>>> all. So I believe this code works as it should. 
>>>>>>> >>>>>>> Here is the patch: http://people.freebsd.org/~mav/sched.htt40.patch >>>>>>> >>>>>>> I plan this to be a final patch of this series (more to come :)) and >>>>>>> if >>>>>>> there will be no problems or objections, I am going to commit it >>>>>>> (except >>>>>>> some debugging KTRs) in about ten days. So now it's a good time for >>>>>>> reviews >>>>>>> and testing. :) >>>>>>> >>>>>> is there a place where all the patches are available ? >>>>> >>>>> >>>>> >>>>> All my scheduler patches are cumulative, so all you need is only the >>>>> last >>>>> mentioned here sched.htt40.patch. >>>>> >>>> You may want to have a look to the result I collected in the >>>> `runs/freebsd-experiments' branch of: >>>> >>>> https://github.com/lacombar/hackbench/ >>>> >>>> and compare them with vanilla FreeBSD 9.0 and -CURRENT results >>>> available in `runs/freebsd'. On the dual package platform, your patch >>>> is not a definite win. >>>> >>>>> But in some cases, especially for multi-socket systems, to let it show >>>>> its >>>>> best, you may want to apply additional patch from avg@ to better detect >>>>> CPU >>>>> topology: >>>>> >>>>> https://gitorious.org/~avg/freebsd/avgbsd/commit/6bca4a2e4854ea3fc275946a023db65c483cb9dd >>>>> >>>> test I conducted specifically for this patch did not showed much >>>> improvement... >>> >>> >>> Can you please clarify on this point? >>> The test you did included cases where the topology was detected badly >>> against cases where the topology was detected correctly as a patched >>> kernel (and you still didn't see a performance improvement), in terms >>> of cache line sharing? >> >> >> At this moment SCHED_ULE does almost nothing in terms of cache line sharing >> affinity (though it probably worth some further experiments). What this >> patch may improve is opposite case -- reduce cache sharing pressure for >> cache-hungry applications. 
For example, proper cache topology detection >> (such as lack of global L3 cache, but shared L2 per pairs of cores on >> Core2Quad class CPUs) increases pbzip2 performance when number of threads is >> less then number of CPUs (i.e. when there is place for optimization). > > My asking is not referred to your patch really. > I just wanted to know if he correctly benchmark a case where the > topology was screwed up and then correctly recognized by avg's patch > in terms of cache level aggregation (it wasn't referred to your patch > btw). I understand. I've just described test case when properly detected topology could give benefit. What the test really does is indeed a good question. -- Alexander Motin From owner-freebsd-hackers@FreeBSD.ORG Sat Apr 7 06:39:20 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id A9FF71065673 for ; Sat, 7 Apr 2012 06:39:20 +0000 (UTC) (envelope-from scdbackup@gmx.net) Received: from mailout-de.gmx.net (mailout-de.gmx.net [213.165.64.23]) by mx1.freebsd.org (Postfix) with SMTP id 060FB8FC08 for ; Sat, 7 Apr 2012 06:39:19 +0000 (UTC) Received: (qmail invoked by alias); 07 Apr 2012 06:39:13 -0000 Received: from 165.126.46.212.adsl.ncore.de (HELO 192.168.2.69) [212.46.126.165] by mail.gmx.net (mp033) with SMTP; 07 Apr 2012 08:39:13 +0200 X-Authenticated: #2145628 X-Provags-ID: V01U2FsdGVkX18D181C5MQDK1g+4gtKEoYkTpahdBhVjJhNzuX239 cFq3uXgFHombKX Date: Sat, 07 Apr 2012 08:39:54 +0200 From: "Thomas Schmitt" To: freebsd-hackers@freebsd.org Message-Id: <100419431014380@192.168.2.69> X-Y-GMX-Trusted: 0 Subject: Did something change with ioctl CAMIOCOMMAND from 8.0 to 9.0 ? 
X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 07 Apr 2012 06:39:20 -0000 Hi, googling brought me to this forum post http://forums.freebsd.org/showthread.php?p=172885 which reports that xfburn fails to recognize optical drives on FreeBSD 9.0. There are error messages about a ioctl which might be emitted by libburn for getting a list of drives: xfburn: error sending CAMIOCOMMAND ioctl: Inappropriate ioctl for device On my FreeBSD 8.0 test system, everything seems ok with libburn. xorriso lists both drives and is willing to blank and burn a CD. Could somebody with a 9.0 system and a CD/DVD/BD drive please get xorriso (e.g. from ports) and try whether it shows all drives ? This command: xorriso -devices should report something like 0 -dev '/dev/cd0' rwrwr- : 'TSSTcorp' 'CDDVDW SH-S223B' 1 -dev '/dev/cd1' rwrwr- : 'TSSTcorp' 'DVD-ROM SH-D162C' One needs rw-permissions for the involved devices in order to get them listed. Up to now, this were: acd* cd* pass* xpt* If the CAMIOCOMMAND of libburn/sg-freebsd.c is wrong for 9.0, then i could need instructions how to perform drive listing and how to recognize 9.0 resp. the need for the new code at compile time. 
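On the compile-time question, one possible approach, offered only as a sketch: on FreeBSD, <sys/param.h> defines __FreeBSD_version, and its leading digits follow the major release, so the preprocessor can tell a 9.0-or-later build from an 8.x one. The helper names freebsd_major() and built_on_freebsd9_or_later() are hypothetical and nothing here is taken from libburn. Whether the CAMIOCOMMAND failure really calls for version-specific code is still an open question at this point in the thread.

```c
#include <assert.h>
#if defined(__FreeBSD__)
#include <sys/param.h>  /* defines __FreeBSD_version on FreeBSD */
#endif

/*
 * Major release encoded in a __FreeBSD_version style number,
 * e.g. a value in the 9xxxxx range gives 9.
 * Assumes the numbering scheme where the major release is the
 * value divided by 100000.
 */
static int freebsd_major(long version)
{
    assert(version >= 0);
    return (int)(version / 100000);
}

/*
 * Hypothetical helper: 1 when this translation unit was compiled on
 * FreeBSD 9 or later, 0 otherwise (including non-FreeBSD systems).
 */
static int built_on_freebsd9_or_later(void)
{
#if defined(__FreeBSD__) && defined(__FreeBSD_version)
    return freebsd_major(__FreeBSD_version) >= 9;
#else
    return 0;
#endif
}
```

In a real source file the same condition would more likely appear directly in an #if around the alternative enumeration code, so that the 8.x path stays untouched.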
The code can be inspected online at http://libburnia-project.org/browser/libburn/trunk/libburn/sg-freebsd.c The (union ccb) idx->ccb for this ioctl at line 231 if (ioctl(idx->fd, CAMIOCOMMAND, &(idx->ccb)) == -1) { is set up in this function beginning at line 160 static int sg_init_enumerator(burn_drive_enumerator_t *idx_) Have a nice day :) Thomas From owner-freebsd-hackers@FreeBSD.ORG Sat Apr 7 21:17:12 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AF890106566C for ; Sat, 7 Apr 2012 21:17:12 +0000 (UTC) (envelope-from ivoras@gmail.com) Received: from mail-gy0-f182.google.com (mail-gy0-f182.google.com [209.85.160.182]) by mx1.freebsd.org (Postfix) with ESMTP id 6F44A8FC15 for ; Sat, 7 Apr 2012 21:17:12 +0000 (UTC) Received: by ghrr20 with SMTP id r20so1872856ghr.13 for ; Sat, 07 Apr 2012 14:17:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:from:date:x-google-sender-auth:message-id :subject:to:content-type; bh=sls/5b9UwPi3UWoRg51HmgiUshuXDoVslJkS5JudoqM=; b=gUcv2NoRzYCTX0S/nVuxyAx8v9ZweN1ScXUlp5/gjzroGfzR16Jz4vcdhvtMm1SRvB Cc/XkNFT76VbMNl7UmWI3bMYN3POiBZSn599tyIwaLjpOVJEVXtan0sDaiydb9WSjKno C/eAYEay+KMQwO1+pfJ9UTuXhGlviY8U9NnB7gfZfwQq81ufAyMZJaebv4BeuRA8Mwda Iy3UrJ3JVkCwcah2HAyOjgx73dkjHu+aBcVCRP4pnlabX21NjJ5vW2dRMKyXAqxRNyor u4qe8f0mj8Ex6Z4s+Iu6QYnUNcbMe0TZOdkM2KWu0AiFJIr5JGt9pahb5Mnixc5ZwgK0 a4gg== Received: by 10.101.165.33 with SMTP id s33mr593065ano.70.1333833432036; Sat, 07 Apr 2012 14:17:12 -0700 (PDT) MIME-Version: 1.0 Sender: ivoras@gmail.com Received: by 10.100.3.20 with HTTP; Sat, 7 Apr 2012 14:16:31 -0700 (PDT) From: Ivan Voras Date: Sat, 7 Apr 2012 23:16:31 +0200 X-Google-Sender-Auth: pgJWzIusehvSlZ0rXm5ZGpZMo8o Message-ID: To: freebsd-hackers Content-Type: text/plain; charset=UTF-8 Subject: Socket buffer usage X-BeenThere: freebsd-hackers@freebsd.org 
X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 07 Apr 2012 21:17:12 -0000 Hi, I'm tracking down an obscure bug in my userland program and it might have something to do with the way I write&read data through a (Unix domain) socket. I'm setting SO_SNDBUF and SO_RCVBUF, and what I'm looking for is some way to query the amount of TX & RX buffered / free data on a socket. Is there something I can use? I'll even accept inspecting kernel structures if explained in detail and can be done on a running system. Alternatively, is there anything else which could cause poll(2) with POLLOUT on a socket to return no events ready on such a socket? (my expectation being that a socket is always ready to be written to if there is buffer space free...). From owner-freebsd-hackers@FreeBSD.ORG Sat Apr 7 21:36:15 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DCE571065676 for ; Sat, 7 Apr 2012 21:36:15 +0000 (UTC) (envelope-from dudu@dudu.ro) Received: from mail-wi0-f172.google.com (mail-wi0-f172.google.com [209.85.212.172]) by mx1.freebsd.org (Postfix) with ESMTP id 631128FC23 for ; Sat, 7 Apr 2012 21:36:15 +0000 (UTC) Received: by wibhj6 with SMTP id hj6so1181807wib.13 for ; Sat, 07 Apr 2012 14:36:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=dudu.ro; s=google; h=date:from:to:cc:message-id:in-reply-to:references:subject:x-mailer :mime-version:content-type:content-transfer-encoding :content-disposition; bh=RgihElmKKhuscJWUOmQWbnAXR9jzfnvVX47T4wHBOyU=; b=IoCb4SVU0ymyjpXDQZ+bt9VRlqawOHYCRWu7G9u9LXN5ppe8cvs8Z583OWGdGQZG8B dbXRlLWBcLrAZlFs4fPUYrkhuTnqSgb/2Tm3rI2Hib93lPSlr7iJKV9qELrzz1TyHB+f 65eSQwEjuoEzEOLTLSuHyUyJs6cwkgh71ETiM= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=google.com; s=20120113; h=date:from:to:cc:message-id:in-reply-to:references:subject:x-mailer :mime-version:content-type:content-transfer-encoding :content-disposition:x-gm-message-state; bh=RgihElmKKhuscJWUOmQWbnAXR9jzfnvVX47T4wHBOyU=; b=CTIn1iDQxWgfLBF2xc3L3v+Aa6Cq3hFun+hjaL2uYyFrIfCuwc8GPz6XLhH6rC/GFt NmDPzZTRAWxAAyL/Jbl3X0V/v1NVbbukFLV3uyCgJDsFbI4528Jl8D3w9KjOtrPOIOvj vVVftxFkdDOBqj/dyhNS4nV+6Xiw5OanJSNfYUIeE+w42pKpM0291/aQPFcr+f4JQwm/ afXFn6yeIBFVVBrYhgoMH4ad1rm/497nw2P9mb9xRtFLbJphOvAhJOHLI7ePdTfDTNyz I4d4jfmMpnp9xt8Iu4PuPO7Tn/6YYhr/6WQacRi/WZ2jd21AV09BUBix8E06ETArQjO0 w5HA== Received: by 10.216.139.12 with SMTP id b12mr1390241wej.4.1333834574534; Sat, 07 Apr 2012 14:36:14 -0700 (PDT) Received: from LesPaul.local ([82.76.253.74]) by mx.google.com with ESMTPS id j3sm28360193wiw.1.2012.04.07.14.36.12 (version=TLSv1/SSLv3 cipher=OTHER); Sat, 07 Apr 2012 14:36:13 -0700 (PDT) Date: Sat, 7 Apr 2012 22:36:08 +0100 From: Vlad Galu To: Ivan Voras Message-ID: <0D2E65B3D0AB4A6483C26A613FC73F83@dudu.ro> In-Reply-To: References: X-Mailer: sparrow 1.5 (build 1043.1) MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Content-Disposition: inline X-Gm-Message-State: ALoCoQlNtcr5STJ6/2Qj5ypDDvOhyekISFsnmHzPDNfVpkvADxPAGmc9oKy75px9w7vmOGblEmEI Cc: freebsd-hackers Subject: Re: Socket buffer usage X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 07 Apr 2012 21:36:15 -0000 This might not exactly be what you want, but struct kevent has a member called "data" which, for sockets and pipes, returns the number of available bytes to read (or write) for EVFILT_READ (or EVFILT_WRITE) events. -- Good, fast and cheap: pick any two.
On Saturday, April 7, 2012 at 10:16 PM, Ivan Voras wrote: > Hi, > > I'm tracking down an obscure bug in my userland program and it might > have something to do with the way I write&read data through a (Unix > domain) socket. I'm setting SO_SNDBUF and SO_RCVBUF, and what I'm > looking for is some way to query the amount of TX & RX buffered / free > data on a socket. Is there something I can use? I'll even accept > inspecting kernel structures if explained in detail and can be done on > a running system. > > Alternatively, is there anything else which could cause poll(2) with > POLLOUT on a socket to return no events ready on such a socket? (my > expectation being that a socket is always ready to be written to if > there is buffer space free...). > _______________________________________________ > freebsd-hackers@freebsd.org (mailto:freebsd-hackers@freebsd.org) mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org (mailto:freebsd-hackers-unsubscribe@freebsd.org)" From owner-freebsd-hackers@FreeBSD.ORG Sat Apr 7 23:01:43 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 328A9106566C for ; Sat, 7 Apr 2012 23:01:43 +0000 (UTC) (envelope-from ivoras@gmail.com) Received: from mail-ob0-f182.google.com (mail-ob0-f182.google.com [209.85.214.182]) by mx1.freebsd.org (Postfix) with ESMTP id EAA158FC08 for ; Sat, 7 Apr 2012 23:01:42 +0000 (UTC) Received: by obbwc18 with SMTP id wc18so5890209obb.13 for ; Sat, 07 Apr 2012 16:01:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type; bh=L+/exG3J+loeMSU4Dn3IHePTLcDcDJ4zkW4wO7/HpmA=; b=ubWZBcuiEp584jNg33s5PBzi9KQCrRMH7EREtOUsS2TtmBeXcIq2E4CWN5tJVEfqHm 
nTCJYu7oJedWIO86KPGTjdgIf0uLah53IRuxRzLogBSu5XkCBEM2TzUc71ysdtz3FgWq wPkfp5ZWuEFN5c5lpRxtaYZXjPvbFFNn7Op7WA5SvHl3tLa17aVcWCsPNIXDjIMB7mOb oxWQXKJji/NVUOPlR69GfioexuVFMypM040z1YW/GR9EonMo/Fs5+XRDgJTxEytctrWG 2iwI/tCCfLIit5C2H3doionvzDjT2Rw5xVoaSlSrACuSgFJP5MTX5GPOQ+PIhSKMbeRh Lm/w== Received: by 10.60.18.198 with SMTP id y6mr3636104oed.38.1333839702505; Sat, 07 Apr 2012 16:01:42 -0700 (PDT) MIME-Version: 1.0 Sender: ivoras@gmail.com Received: by 10.182.85.71 with HTTP; Sat, 7 Apr 2012 16:01:01 -0700 (PDT) In-Reply-To: <0D2E65B3D0AB4A6483C26A613FC73F83@dudu.ro> References: <0D2E65B3D0AB4A6483C26A613FC73F83@dudu.ro> From: Ivan Voras Date: Sun, 8 Apr 2012 01:01:01 +0200 X-Google-Sender-Auth: kqudcxh1SRU9JlXTE2BJwmMM91s Message-ID: To: Vlad Galu Content-Type: text/plain; charset=UTF-8 Cc: freebsd-hackers Subject: Re: Socket buffer usage X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 07 Apr 2012 23:01:43 -0000 On 7 April 2012 23:36, Vlad Galu wrote: > This might not exactly be what you want, but struct kevent has a member called "data" which, for sockets and pipes, returns the number of available bytes to read (or write) for EVFILT_READ (or EVFILT_WRITE) events. > That's a good idea but I'm actually trying to find out why my write events are not firing, so I can't get to that information.
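One way to narrow this down outside the real program is a small reproduction. The following is only a sketch, assuming a non-blocking AF_UNIX stream socketpair; the helper names (set_nonblocking, writable_now, fill_until_full, pending_bytes, send_space, drain) are made up for illustration and are not taken from Ivan's code. It fills the socket until write(2) fails with EAGAIN, after which poll(2) with POLLOUT is expected to report nothing until the peer reads. FIONREAD on the peer shows how much is queued. FIONSPACE is used only where the headers provide it; that FreeBSD's <sys/filio.h> defines it is an assumption here, so the call is guarded.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Put fd into non-blocking mode; 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* 1 if poll(2) reports POLLOUT right now, 0 if not, -1 on error. */
static int writable_now(int fd)
{
    struct pollfd pfd;
    pfd.fd = fd;
    pfd.events = POLLOUT;
    pfd.revents = 0;
    if (poll(&pfd, 1, 0) < 0)   /* timeout 0: just sample the state */
        return -1;
    return (pfd.revents & POLLOUT) ? 1 : 0;
}

/* Write until the kernel refuses more data; bytes queued, or -1 on error. */
static long fill_until_full(int fd)
{
    char chunk[4096];
    long total = 0;
    assert(fd >= 0);
    memset(chunk, 'x', sizeof(chunk));
    for (;;) {
        ssize_t n = write(fd, chunk, sizeof(chunk));
        if (n > 0)
            total += n;
        else if (n == -1 && errno == EINTR)
            continue;
        else if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return total;
        else
            return -1;
    }
}

/* Bytes waiting to be read on fd (FIONREAD), or -1 on error. */
static int pending_bytes(int fd)
{
    int n = 0;
    return ioctl(fd, FIONREAD, &n) == -1 ? -1 : n;
}

/* Free send-buffer space via FIONSPACE where available, else -1. */
static int send_space(int fd)
{
#ifdef FIONSPACE
    int n = 0;
    return ioctl(fd, FIONSPACE, &n) == -1 ? -1 : n;
#else
    (void)fd;
    return -1;
#endif
}

/* Read everything currently queued on fd; bytes read, or -1 on error. */
static long drain(int fd)
{
    char buf[4096];
    long total = 0;
    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n > 0)
            total += n;
        else if (n == 0)
            return total;
        else if (errno == EINTR)
            continue;
        else if (errno == EAGAIN || errno == EWOULDBLOCK)
            return total;
        else
            return -1;
    }
}
```

A test would call socketpair(AF_UNIX, SOCK_STREAM, 0, sv), fill sv[0], and sample writable_now(sv[0]) before and after drain(sv[1]). If POLLOUT stays clear here until the reader drains, then write events that never fire in the real program may simply mean that the peer is not reading. As far as I understand it, on a Unix domain socket the written data is held on the receiver's side, so "buffer space free" depends on the reader as much as on SO_SNDBUF.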