From owner-freebsd-hackers@FreeBSD.ORG Sun Feb 3 08:49:09 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 92DA68B9 for ; Sun, 3 Feb 2013 08:49:09 +0000 (UTC) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from wojtek.tensor.gdynia.pl (wojtek.tensor.gdynia.pl [188.252.31.196]) by mx1.freebsd.org (Postfix) with ESMTP id EC9DBD19 for ; Sun, 3 Feb 2013 08:49:08 +0000 (UTC) Received: from wojtek.tensor.gdynia.pl (localhost [127.0.0.1]) by wojtek.tensor.gdynia.pl (8.14.6/8.14.5) with ESMTP id r138muJN016612 for ; Sun, 3 Feb 2013 09:48:56 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from localhost (wojtek@localhost) by wojtek.tensor.gdynia.pl (8.14.6/8.14.5/Submit) with ESMTP id r138muUS016609 for ; Sun, 3 Feb 2013 09:48:56 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Date: Sun, 3 Feb 2013 09:48:55 +0100 (CET) From: Wojciech Puchar To: freebsd-hackers@freebsd.org Subject: gstripe Message-ID: User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; format=flowed; charset=US-ASCII X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.2.7 (wojtek.tensor.gdynia.pl [127.0.0.1]); Sun, 03 Feb 2013 09:48:56 +0100 (CET) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 03 Feb 2013 08:49:09 -0000 option -s stripe size - what is stripe size? 
If I striped 4 devices with -s $[512*1024*1024], does it mean that 1) it will take 512MB from device 1, then 512MB from device 2, then 512MB from device 3, then 512MB from device 4, or 2) it will take 128MB from each for 512MB total? From owner-freebsd-hackers@FreeBSD.ORG Sun Feb 3 10:33:03 2013 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 49039DED; Sun, 3 Feb 2013 10:33:03 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 62577124; Sun, 3 Feb 2013 10:33:01 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id MAA07161; Sun, 03 Feb 2013 12:33:00 +0200 (EET) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1U1wsd-0006DZ-NI; Sun, 03 Feb 2013 12:32:59 +0200 Message-ID: <510E3CDB.2070803@FreeBSD.org> Date: Sun, 03 Feb 2013 12:32:59 +0200 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130121 Thunderbird/17.0.2 MIME-Version: 1.0 To: Konstantin Belousov Subject: Re: scheduler->swapper, SI_SUB_RUN_SCHEDULER->SI_SUB_LAST References: <510CFD90.9000304@FreeBSD.org> <20130202145013.GV2522@kib.kiev.ua> In-Reply-To: <20130202145013.GV2522@kib.kiev.ua> X-Enigmail-Version: 1.4.6 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-hackers@FreeBSD.org, freebsd-current@FreeBSD.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 03 Feb 2013 10:33:03 -0000 on 02/02/2013 16:50 Konstantin Belousov said the
following: > On Sat, Feb 02, 2013 at 01:50:40PM +0200, Andriy Gapon wrote: >> >> I would like to propose the following mostly cosmetic change: >> http://people.freebsd.org/~avg/scheduler-swapper.diff >> >> This is something that bit me early in my FreeBSD days, so I am kind of obsessed >> with it. >> The current naming is confusing/misleading indeed. >> And magic properties of SI_SUB_RUN_SCHEDULER:SI_ORDER_LAST is a "hidden gem". > > You may remove the Giant unlock from the scheduler()/swapper() as well > then, doing it before the swapper() call in the mi_startup(). > > Note that the wait chain for the idle swapper is still called "sched". Thank you for the review. I am fixing both issues. -- Andriy Gapon From owner-freebsd-hackers@FreeBSD.ORG Sun Feb 3 17:45:03 2013 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 72507619; Sun, 3 Feb 2013 17:45:03 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 72E3F2C6; Sun, 3 Feb 2013 17:45:02 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id TAA09560; Sun, 03 Feb 2013 19:44:58 +0200 (EET) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1U23cg-0006ph-81; Sun, 03 Feb 2013 19:44:58 +0200 Message-ID: <510EA219.2090504@FreeBSD.org> Date: Sun, 03 Feb 2013 19:44:57 +0200 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130121 Thunderbird/17.0.2 MIME-Version: 1.0 To: freebsd-current@FreeBSD.org Subject: detect mwait capabilities and extensions X-Enigmail-Version: 1.4.6 Content-Type: text/plain; charset=X-VIET-VPS Content-Transfer-Encoding: 7bit Cc: freebsd-hackers@FreeBSD.org 
X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 03 Feb 2013 17:45:03 -0000 Guys, could you please review the following change? It is amd64-centric now, but obviously I plan equivalent changes for i386. I am mostly concerned about proper header files for various definitions and proper names for them. In particular, I am not sure where to put STATE_RUNNING, STATE_MWAIT, STATE_SLEEPING... Thank you. diff --git a/sys/amd64/amd64/identcpu.c b/sys/amd64/amd64/identcpu.c index 2517498..d831f95 100644 --- a/sys/amd64/amd64/identcpu.c +++ b/sys/amd64/amd64/identcpu.c @@ -513,6 +513,13 @@ identify_cpu(void) } } + if (cpu_high >= 5 && (cpu_feature2 & CPUID2_MON) != 0) { + do_cpuid(5, regs); + cpu_mon_mwait_flags = regs[2]; + cpu_mon_min_size = regs[0] & CPUID5_MON_MIN_SIZE; + cpu_mon_max_size = regs[1] & CPUID5_MON_MAX_SIZE; + } + if (cpu_high >= 7) { cpuid_count(7, 0, regs); cpu_stdext_feature = regs[1]; diff --git a/sys/amd64/amd64/initcpu.c b/sys/amd64/amd64/initcpu.c index 4abed4c..f7574b1 100644 --- a/sys/amd64/amd64/initcpu.c +++ b/sys/amd64/amd64/initcpu.c @@ -75,6 +75,9 @@ u_int cpu_mxcsr_mask; /* Valid bits in mxcsr */ u_int cpu_clflush_line_size = 32; u_int cpu_stdext_feature; u_int cpu_max_ext_state_size; +u_int cpu_mon_mwait_flags; /* MONITOR/MWAIT flags (CPUID.05H.ECX) */ +u_int cpu_mon_min_size; /* MONITOR minimum range size, bytes */ +u_int cpu_mon_max_size; /* MONITOR maximum range size, bytes */ SYSCTL_UINT(_hw, OID_AUTO, via_feature_rng, CTLFLAG_RD, &via_feature_rng, 0, "VIA RNG feature available in CPU"); diff --git a/sys/amd64/include/md_var.h b/sys/amd64/include/md_var.h index 5d7cb74..ddc5b9f 100644 --- a/sys/amd64/include/md_var.h +++ b/sys/amd64/include/md_var.h @@ -58,6 +58,9 @@ extern u_int cpu_procinfo; extern u_int cpu_procinfo2; extern char cpu_vendor[]; extern u_int
cpu_vendor_id; +extern u_int cpu_mon_mwait_flags; +extern u_int cpu_mon_min_size; +extern u_int cpu_mon_max_size; extern char ctx_switch_xsave[]; extern char kstack[]; extern char sigcode[]; diff --git a/sys/x86/include/specialreg.h b/sys/x86/include/specialreg.h index dbf9ba0..af64c1b 100644 --- a/sys/x86/include/specialreg.h +++ b/sys/x86/include/specialreg.h @@ -240,6 +240,29 @@ #define CPUID_LOCAL_APIC_ID 0xff000000 /* + * CPUID instruction 5 info + */ +#define CPUID5_MON_MIN_SIZE 0x0000ffff /* eax */ +#define CPUID5_MON_MAX_SIZE 0x0000ffff /* ebx */ +#define CPUID5_MON_MWAIT_EXT 0x00000001 /* ecx */ +#define CPUID5_MWAIT_INTRBREAK 0x00000002 /* ecx */ + +/* + * MWAIT cpu power states. Lower 4 bits are sub-states. + */ +#define MWAIT_C0 0xf0 +#define MWAIT_C1 0x00 +#define MWAIT_C2 0x10 +#define MWAIT_C3 0x20 +#define MWAIT_C4 0x30 + +/* + * MWAIT extensions. + */ +/* Interrupt breaks MWAIT even when masked. */ +#define MWAIT_INTRBREAK 0x00000001 + +/* * CPUID instruction 6 ecx info */ #define CPUID_PERF_STAT 0x00000001 --- a/sys/amd64/amd64/machdep.c +++ b/sys/amd64/amd64/machdep.c @@ -665,10 +665,6 @@ TUNABLE_INT("machdep.idle_mwait", &idle_mwait); SYSCTL_INT(_machdep, OID_AUTO, idle_mwait, CTLFLAG_RW, &idle_mwait, 0, "Use MONITOR/MWAIT for short idle"); -#define STATE_RUNNING 0x0 -#define STATE_MWAIT 0x1 -#define STATE_SLEEPING 0x2 - static void cpu_idle_acpi(int busy) { diff --git a/sys/amd64/include/cpu.h b/sys/amd64/include/cpu.h index 1c2871f..dc29a37 100644 --- a/sys/amd64/include/cpu.h +++ b/sys/amd64/include/cpu.h @@ -43,8 +43,14 @@ #include #include +/* + * CPU states for the purpose of communication using MONITOR+MWAIT. 
*/ +#define STATE_RUNNING 0x0 +#define STATE_MWAIT 0x1 +#define STATE_SLEEPING 0x2 + #define cpu_exec(p) /* nothing */ #define cpu_swapin(p) /* nothing */ #define cpu_getstack(td) ((td)->td_frame->tf_rsp) #define cpu_setstack(td, ap) ((td)->td_frame->tf_rsp = (ap)) #define cpu_spinwait() ia32_pause() -- Andriy Gapon From owner-freebsd-hackers@FreeBSD.ORG Mon Feb 4 10:43:41 2013 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 859F0373 for ; Mon, 4 Feb 2013 10:43:41 +0000 (UTC) (envelope-from mattblists@icritical.com) Received: from mail3.icritical.com (mail3.icritical.com [212.57.248.143]) by mx1.freebsd.org (Postfix) with SMTP id CDFD6AA5 for ; Mon, 4 Feb 2013 10:43:40 +0000 (UTC) Received: (qmail 9375 invoked from network); 4 Feb 2013 10:36:37 -0000 Received: from localhost (127.0.0.1) by mail3.icritical.com with SMTP; 4 Feb 2013 10:36:37 -0000 Received: (qmail 9367 invoked by uid 599); 4 Feb 2013 10:36:37 -0000 Received: from unknown (HELO PDC002.icritical.int) (212.57.254.146) by mail3.icritical.com (qpsmtpd/0.28) with ESMTP; Mon, 04 Feb 2013 10:36:37 +0000 Message-ID: <510F8F33.3030504@icritical.com> Date: Mon, 4 Feb 2013 10:36:35 +0000 From: Matt Burke User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130122 Thunderbird/17.0.2 MIME-Version: 1.0 To: Subject: kgdb modules Content-Type: text/plain; charset="ISO-8859-1" Content-Transfer-Encoding: 7bit X-TLS-Incoming: YES X-Virus-Scanned: by iCritical at mail3.icritical.com X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Feb 2013 10:43:41 -0000 How do I get kgdb to load kernel modules from somewhere other than /boot/kernel? 
Googling tells me I need to use asf to create a file, but I haven't managed to figure out how to get kgdb to use the output. Thanks From owner-freebsd-hackers@FreeBSD.ORG Mon Feb 4 11:43:49 2013 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 27C8AF88 for ; Mon, 4 Feb 2013 11:43:49 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 5543A1537 for ; Mon, 4 Feb 2013 11:43:47 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id NAA19325; Mon, 04 Feb 2013 13:43:44 +0200 (EET) (envelope-from avg@FreeBSD.org) Message-ID: <510F9EEF.8080402@FreeBSD.org> Date: Mon, 04 Feb 2013 13:43:43 +0200 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130113 Thunderbird/17.0.2 MIME-Version: 1.0 To: Matt Burke Subject: Re: kgdb modules References: <510F8F33.3030504@icritical.com> In-Reply-To: <510F8F33.3030504@icritical.com> X-Enigmail-Version: 1.4.6 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Cc: freebsd-hackers@FreeBSD.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
X-List-Received-Date: Mon, 04 Feb 2013 11:43:49 -0000 on 04/02/2013 12:36 Matt Burke said the following: > How do I get kgdb to load kernel modules from somewhere other than > /boot/kernel? > > Googling tells me I need to use asf to create a file, but I haven't managed > to figure out how to get kgdb use the output. Research in the direction of set sysroot, solib-absolute-prefix, solib-search-path. I would not be surprised if the ancient gdb version on which kgdb is based does not support some of these settings. -- Andriy Gapon From owner-freebsd-hackers@FreeBSD.ORG Mon Feb 4 19:11:56 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 1DECBA34; Mon, 4 Feb 2013 19:11:56 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Received: from mail-wg0-x229.google.com (mail-wg0-x229.google.com [IPv6:2a00:1450:400c:c00::229]) by mx1.freebsd.org (Postfix) with ESMTP id 68824788; Mon, 4 Feb 2013 19:11:55 +0000 (UTC) Received: by mail-wg0-f41.google.com with SMTP id ds1so3411531wgb.0 for ; Mon, 04 Feb 2013 11:11:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=JeGMNOcLUBLJjNWwoyTTZsoQrUyjrvrqfr0VRW6Fgz8=; b=vRSnelNusgVW06L9n6uVw+MOFtQdbhcIN47Q/twAjHQTSQU2IFuslN+OqBLidB/Pgz Oj8oL/3BVaPGgxir84FA/vc+IR0M0crLmoXPM7a9n16+2xdgLSSt+hnCzu5PLZegxLjy /upf22RL1L7IaiTGwuahgG9BPO0juppxDvpczKs2gtVIimfJuMBAykKN3/0DPLoj5N8A s/j2vgfy8X/eyDAvx6W2Vg1Gyyh9G45ZWc44uC2XK6YryTeVH2/jxMd9N2T0vXCkxH16 X5Z07OgEETlDZuHrCUXBxahczSQ2q1PQXaZWJvjZka68DZ/wFYur1UGpB3fVYG4D6guM wZYQ== MIME-Version: 1.0 X-Received: by 10.194.108.101 with SMTP id hj5mr37410921wjb.6.1360005114422; Mon, 04 Feb 2013 11:11:54 -0800 (PST) Sender: adrian.chadd@gmail.com Received: by 10.216.236.88 with HTTP; Mon, 4 Feb 2013 11:11:54 -0800 (PST) 
In-Reply-To: <51098A9E.1080100@FreeBSD.org> References: <51068B74.2070808@FreeBSD.org> <51098A9E.1080100@FreeBSD.org> Date: Mon, 4 Feb 2013 11:11:54 -0800 X-Google-Sender-Auth: AeBYf7cPCER7RB8BNxSngbL5K_Q Message-ID: Subject: Re: [clang] NMI while trying to read acpi timer register From: Adrian Chadd To: Andriy Gapon Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Feb 2013 19:11:56 -0000 On 30 January 2013 13:03, Andriy Gapon wrote: > on 28/01/2013 16:30 Andriy Gapon said the following: >> is there any reasonable explanation for getting an NMI while trying to read acpi >> timer register? >> Note: this happens only after ACPI suspend/resume. > > An update. > This happens only with clang compiled kernel, gcc compiled kernel is OK. > Also, this happens only in the depth of fwohci driver (where it calls DELAY). > If firewire is not loaded, then there is no problem. > > I suspect that perhaps there is some miscompilation that results in some > incorrect I/O access that later leads to NMI. Too many unknowns and guesses > here, obviously. Do you have stack traces showing where it's happening? Posting that and the disassembly from those areas may shed a clue. 
Adrian From owner-freebsd-hackers@FreeBSD.ORG Mon Feb 4 20:46:24 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id AD43443B; Mon, 4 Feb 2013 20:46:24 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id 8DB28D3F; Mon, 4 Feb 2013 20:46:24 +0000 (UTC) Received: from pakbsde14.localnet (unknown [38.105.238.108]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 0C14CB926; Mon, 4 Feb 2013 15:46:24 -0500 (EST) From: John Baldwin To: freebsd-hackers@freebsd.org Subject: Re: kgdb modules Date: Mon, 4 Feb 2013 14:58:16 -0500 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p22; KDE/4.5.5; amd64; ; ) References: <510F8F33.3030504@icritical.com> <510F9EEF.8080402@FreeBSD.org> In-Reply-To: <510F9EEF.8080402@FreeBSD.org> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201302041458.16404.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Mon, 04 Feb 2013 15:46:24 -0500 (EST) Cc: Matt Burke , Andriy Gapon X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Feb 2013 20:46:24 -0000 On Monday, February 04, 2013 6:43:43 am Andriy Gapon wrote: > on 04/02/2013 12:36 Matt Burke said the following: > > How do I get kgdb to load kernel modules from somewhere other than > > /boot/kernel? > > > > Googling tells me I need to use asf to create a file, but I haven't managed > > to figure out how to get kgdb use the output. > > Research in the direction of set sysroot, solib-absolute-prefix, > solib-search-path. 
I would not be surprised if the ancient gdb version on which > kgdb is based does not support some of these settings. It supports at least some of those. You can also load modules manually by using the add-kld command (give it a full path to an individual module). You may need to use 'nosharedlibrary' to unload symbols from the "wrong" module before add-kld will be useful however. -- John Baldwin From owner-freebsd-hackers@FreeBSD.ORG Mon Feb 4 21:00:14 2013 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 8C337F71; Mon, 4 Feb 2013 21:00:14 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id A5922E6D; Mon, 4 Feb 2013 21:00:13 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id XAA24666; Mon, 04 Feb 2013 23:00:12 +0200 (EET) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1U2T99-0009dY-UV; Mon, 04 Feb 2013 23:00:11 +0200 Message-ID: <5110215B.5000405@FreeBSD.org> Date: Mon, 04 Feb 2013 23:00:11 +0200 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130121 Thunderbird/17.0.2 MIME-Version: 1.0 To: Adrian Chadd Subject: Re: [clang] NMI while trying to read acpi timer register References: <51068B74.2070808@FreeBSD.org> <51098A9E.1080100@FreeBSD.org> In-Reply-To: X-Enigmail-Version: 1.4.6 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Feb 2013 21:00:14 
-0000 on 04/02/2013 21:11 Adrian Chadd said the following: > On 30 January 2013 13:03, Andriy Gapon wrote: >> on 28/01/2013 16:30 Andriy Gapon said the following: >>> is there any reasonable explanation for getting an NMI while trying to read acpi >>> timer register? >>> Note: this happens only after ACPI suspend/resume. >> >> An update. >> This happens only with clang compiled kernel, gcc compiled kernel is OK. >> Also, this happens only in the depth of fwohci driver (where it calls DELAY). >> If firewire is not loaded, then there is no problem. >> >> I suspect that perhaps there is some miscompilation that results in some >> incorrect I/O access that later leads to NMI. Too many unknowns and guesses >> here, obviously. > > Do you have stack traces showing where it's happening? > > Posting that and the disassembly from those areas may shed a clue. The information should be available from a user who got this issue. Are you willing to take a look? I'll connect you. -- Andriy Gapon From owner-freebsd-hackers@FreeBSD.ORG Mon Feb 4 21:06:38 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 56A9B4A4; Mon, 4 Feb 2013 21:06:38 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Received: from mail-wg0-f50.google.com (mail-wg0-f50.google.com [74.125.82.50]) by mx1.freebsd.org (Postfix) with ESMTP id C6FD0F0C; Mon, 4 Feb 2013 21:06:37 +0000 (UTC) Received: by mail-wg0-f50.google.com with SMTP id es5so5162858wgb.5 for ; Mon, 04 Feb 2013 13:06:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=wLK8cRstpZOxRxBVOyc5oCmfydrpoXJC50Ni4prk9zM=; b=cahkQO+hiB7UKfRCDmWN3gcK+JkCRcLyepjmb6X9GGE56rK82LhEfbSC84aR55FahS aBY2tO2ibbeClg6x7ttYJv2Rz44+xWZf0e6lMv59wFua0r8BlWXGErOKgRV9BwT3rGPK 
orwpIFAbcq94R+PRgfeZ5TmGbCx9fThEyxrMtpkufUa1/u+FSq6dbePfMjvYn93//xDm ZosbUSX1T/HTQYjmAe2z6QtRUzs5r3YHfa6OQOzMUTmAaMeUKaru5MDvGi/FAN/hwnqN XaYa5TeryM4o7UlkDcwulK6DnfXNyitP7kzObFOTuiX+CYDLZTXaqcpO7u8dMc/LIGZb xAEw== MIME-Version: 1.0 X-Received: by 10.180.85.226 with SMTP id k2mr12933749wiz.34.1360011996811; Mon, 04 Feb 2013 13:06:36 -0800 (PST) Sender: adrian.chadd@gmail.com Received: by 10.216.236.88 with HTTP; Mon, 4 Feb 2013 13:06:36 -0800 (PST) In-Reply-To: <5110215B.5000405@FreeBSD.org> References: <51068B74.2070808@FreeBSD.org> <51098A9E.1080100@FreeBSD.org> <5110215B.5000405@FreeBSD.org> Date: Mon, 4 Feb 2013 13:06:36 -0800 X-Google-Sender-Auth: 5usiK5Wv3ppTkPFxdosWs8oaX5Y Message-ID: Subject: Re: [clang] NMI while trying to read acpi timer register From: Adrian Chadd To: Andriy Gapon Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Feb 2013 21:06:38 -0000 I'm not the right person for it, but I think it's worth wrapping up all my requested details in a PR so Those Who Know can take a peek. Especially if it boils down to the choice of compiler. Who knows what other weird corner issues people will see with clang compiling their drivers? 
Adrian From owner-freebsd-hackers@FreeBSD.ORG Mon Feb 4 21:09:36 2013 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id A71495B7; Mon, 4 Feb 2013 21:09:36 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id C0A91F30; Mon, 4 Feb 2013 21:09:35 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id XAA24730; Mon, 04 Feb 2013 23:09:34 +0200 (EET) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1U2TIE-0009ea-1r; Mon, 04 Feb 2013 23:09:34 +0200 Message-ID: <5110238D.8090808@FreeBSD.org> Date: Mon, 04 Feb 2013 23:09:33 +0200 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130121 Thunderbird/17.0.2 MIME-Version: 1.0 To: Adrian Chadd Subject: Re: [clang] NMI while trying to read acpi timer register References: <51068B74.2070808@FreeBSD.org> <51098A9E.1080100@FreeBSD.org> <5110215B.5000405@FreeBSD.org> In-Reply-To: X-Enigmail-Version: 1.4.6 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Feb 2013 21:09:36 -0000 on 04/02/2013 23:06 Adrian Chadd said the following: > I'm not the right person for it, but I think it's worth wrapping up > all my requested details in a PR so Those Who Know can take a peek. > > Especially if it boils down to the choice of compiler. 
Who knows what > other weird corner issues people will see with clang compiling their > drivers? OK, I'll ask the user to open a PR. I'll just note that the problem seems to be too strange... There is a huge distance from compiler to nmi. -- Andriy Gapon From owner-freebsd-hackers@FreeBSD.ORG Mon Feb 4 21:13:51 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 22EB5997; Mon, 4 Feb 2013 21:13:51 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Received: from mail-wg0-f50.google.com (mail-wg0-f50.google.com [74.125.82.50]) by mx1.freebsd.org (Postfix) with ESMTP id 8BF95F9E; Mon, 4 Feb 2013 21:13:50 +0000 (UTC) Received: by mail-wg0-f50.google.com with SMTP id es5so5165381wgb.29 for ; Mon, 04 Feb 2013 13:13:49 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=oUHoUFcYIC5iymt0mguXKqK8/lBw6OIvJUgC1VWiMSg=; b=uOdwk/XKBMS555+peHXcVe+JAJR/Eq//ZlBX/P/SrKsVh55uB0Cpyntz1nWHapuhYL lLsisjU+cU+NQnu0ACj7VIPXh9m4ylAmuLH42Smpov6PhqvDy7ykeHU1t96D8Vm44AFw aidJTEGj9oDDY2K0jjKa5TvQXWuGlI1nmsLZtNwB0O4Qhn/ZfrwKflrgYLCxg0yYqM93 JnIxD0qiif6UB34QlrMmxNKmgsaWIhSxfgkqCrEmR/mGyFuG6UOYKJQrfcyXA3W/Yxb1 55ioai1U/zgBe9c3T6eRbDLMHfDc/iqBA2m8WjaQeLwcXPsiI4fzIK3lgwBjnVB++M8N th0A== MIME-Version: 1.0 X-Received: by 10.194.172.228 with SMTP id bf4mr38006840wjc.38.1360012429799; Mon, 04 Feb 2013 13:13:49 -0800 (PST) Sender: adrian.chadd@gmail.com Received: by 10.216.236.88 with HTTP; Mon, 4 Feb 2013 13:13:49 -0800 (PST) In-Reply-To: <5110238D.8090808@FreeBSD.org> References: <51068B74.2070808@FreeBSD.org> <51098A9E.1080100@FreeBSD.org> <5110215B.5000405@FreeBSD.org> <5110238D.8090808@FreeBSD.org> Date: Mon, 4 Feb 2013 13:13:49 -0800 X-Google-Sender-Auth: HlZI-HvOXh5nMqUQpQHlHtZsTsQ Message-ID: Subject: Re: [clang] NMI 
while trying to read acpi timer register From: Adrian Chadd To: Andriy Gapon Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Feb 2013 21:13:51 -0000 ... again, that's why I'm suggesting they post some further details, such as the disassembly in question. The contents of /etc/src.conf and /etc/make.conf would be useful too, as well as a verbose dmesg just to make sure devices and CPU flags are all there. Who knows, it could be some corner case of optimisation that's screwing him, or a bad choice of instruction for his given platform, etc, etc. Adrian On 4 February 2013 13:09, Andriy Gapon wrote: > on 04/02/2013 23:06 Adrian Chadd said the following: >> I'm not the right person for it, but I think it's worth wrapping up >> all my requested details in a PR so Those Who Know can take a peek. >> >> Especially if it boils down to the choice of compiler. Who knows what >> other weird corner issues people will see with clang compiling their >> drivers? > > OK, I'll ask the user to open a PR. > I'll just note that the problem seems to be too strange... > There is a huge distance from compiler to nmi. 
> > -- > Andriy Gapon From owner-freebsd-hackers@FreeBSD.ORG Mon Feb 4 23:05:16 2013 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 02CB69E5; Mon, 4 Feb 2013 23:05:16 +0000 (UTC) (envelope-from neelnatu@gmail.com) Received: from mail-ie0-x234.google.com (mail-ie0-x234.google.com [IPv6:2607:f8b0:4001:c03::234]) by mx1.freebsd.org (Postfix) with ESMTP id B17C67EE; Mon, 4 Feb 2013 23:05:15 +0000 (UTC) Received: by mail-ie0-f180.google.com with SMTP id bn7so5402730ieb.39 for ; Mon, 04 Feb 2013 15:05:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:date:message-id:subject:from:to:cc :content-type:content-transfer-encoding; bh=dCFHZnH0S7EBJwlG6lWlPrHYK7Nub90HahBReQS8w1o=; b=dDkQwwJRYMXKjsgyMfRwYfdMZAG3lzEEHDrzuMDjstEJ/MIJq/u4BByc4r3bN5LkX1 ij8VFcyZk5NZKfYio1dXxPsLsjo37ptmFkPtO1TidB9NHnK47IQBfoL999pFuela5d+8 DNgIyCRokkCXOE6YHHmlrTyhTz36IkX9qkd2olRYyY2WkUk1710pazEryxOgBjUK0y2q XNbWTPQjhmxkQRF0LzhkNABpd0DqnIB3xMxJ4s9HB5t29niK6S4rM6WyXOXUzMww+6Au IJutP1g/znmsA7nG+92QRZw5rxwCpKpj42AsFK3EFaAyVIw+6vQ8BnRhV6pR8kJxj0T5 +sxw== MIME-Version: 1.0 X-Received: by 10.43.17.199 with SMTP id qd7mr18919741icb.52.1360019115187; Mon, 04 Feb 2013 15:05:15 -0800 (PST) Received: by 10.42.23.132 with HTTP; Mon, 4 Feb 2013 15:05:15 -0800 (PST) Date: Mon, 4 Feb 2013 15:05:15 -0800 Message-ID: Subject: dynamically calculating NKPT [was: Re: huge ktr buffer] From: Neel Natu To: hackers@freebsd.org Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: alc@freebsd.org, davide@freebsd.org, rank1seeker@gmail.com, avg@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Feb 2013 
23:05:16 -0000 Hi, I have a patch to dynamically calculate NKPT for amd64 kernels. This should fix the various issues that people pointed out in the email thread. Please review and let me know if there are any objections to committing this. Also, thanks to Alan (alc@) for reviewing and providing feedback on the initial version of the patch. Patch (also available at http://people.freebsd.org/~neel/patches/nkpt_diff.txt): Index: sys/amd64/include/pmap.h =================================================================== --- sys/amd64/include/pmap.h (revision 246277) +++ sys/amd64/include/pmap.h (working copy) @@ -113,13 +113,7 @@ ((unsigned long)(l2) << PDRSHIFT) | \ ((unsigned long)(l1) << PAGE_SHIFT)) -/* Initial number of kernel page tables. */ -#ifndef NKPT -#define NKPT 32 -#endif - #define NKPML4E 1 /* number of kernel PML4 slots */ -#define NKPDPE howmany(NKPT, NPDEPG)/* number of kernel PDP slots */ #define NUPML4E (NPML4EPG/2) /* number of userland PML4 pages */ #define NUPDPE (NUPML4E*NPDPEPG)/* number of userland PDP pages */ @@ -181,6 +175,7 @@ #define PML4map ((pd_entry_t *)(addr_PML4map)) #define PML4pml4e ((pd_entry_t *)(addr_PML4pml4e)) +extern int nkpt; /* Initial number of kernel page tables */ extern u_int64_t KPDPphys; /* physical address of kernel level 3 */ extern u_int64_t KPML4phys; /* physical address of kernel level 4 */ Index: sys/amd64/amd64/minidump_machdep.c =================================================================== --- sys/amd64/amd64/minidump_machdep.c (revision 246277) +++ sys/amd64/amd64/minidump_machdep.c (working copy) @@ -232,7 +232,7 @@ /* Walk page table pages, set bits in vm_page_dump */ pmapsize = 0; pdp = (uint64_t 
*)PHYS_TO_DMAP(KPDPphys); - for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + NKPT * NBPDR, + for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + nkpt * NBPDR, kernel_vm_end); ) { /* * We always write a page, even if it is zero. Each @@ -364,7 +364,7 @@ /* Dump kernel page directory pages */ bzero(fakepd, sizeof(fakepd)); pdp = (uint64_t *)PHYS_TO_DMAP(KPDPphys); - for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + NKPT * NBPDR, + for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + nkpt * NBPDR, kernel_vm_end); va += NBPDP) { i = (va >> PDPSHIFT) & ((1ul << NPDPEPGSHIFT) - 1); Index: sys/amd64/amd64/pmap.c =================================================================== --- sys/amd64/amd64/pmap.c (revision 246277) +++ sys/amd64/amd64/pmap.c (working copy) @@ -202,6 +202,10 @@ vm_offset_t virtual_avail; /* VA of first avail page (after kernel bss) */ vm_offset_t virtual_end; /* VA of last avail page (end of kernel AS) */ +int nkpt; +SYSCTL_INT(_machdep, OID_AUTO, nkpt, CTLFLAG_RD, &nkpt, 0, + "Number of kernel page table pages allocated on bootup"); + static int ndmpdp; static vm_paddr_t dmaplimit; vm_offset_t kernel_vm_end = VM_MIN_KERNEL_ADDRESS; @@ -495,17 +499,42 @@ CTASSERT(powerof2(NDMPML4E)); +/* number of kernel PDP slots */ +#define NKPDPE(ptpgs) howmany((ptpgs), NPDEPG) + static void +nkpt_init(vm_paddr_t addr) +{ + int pt_pages; + +#ifdef NKPT + pt_pages = NKPT; +#else + pt_pages = howmany(addr, 1 << PDRSHIFT); + pt_pages += NKPDPE(pt_pages); + + /* + * Add some slop beyond the bare minimum required for bootstrapping + * the kernel. + * + * This is quite important when allocating KVA for kernel modules. + * The modules are required to be linked in the negative 2GB of + * the address space.
If we run out of KVA in this region then + * pmap_growkernel() will need to allocate page table pages to map + * the entire 512GB of KVA space which is an unnecessary tax on + * physical memory. + */ + pt_pages += 4; /* 8MB additional slop for kernel modules */ +#endif + nkpt = pt_pages; +} + +static void create_pagetables(vm_paddr_t *firstaddr) { - int i, j, ndm1g; + int i, j, ndm1g, nkpdpe; - /* Allocate pages */ - KPTphys = allocpages(firstaddr, NKPT); - KPML4phys = allocpages(firstaddr, 1); - KPDPphys = allocpages(firstaddr, NKPML4E); - KPDphys = allocpages(firstaddr, NKPDPE); - + /* Allocate page table pages for the direct map */ ndmpdp = (ptoa(Maxmem) + NBPDP - 1) >> PDPSHIFT; if (ndmpdp < 4) /* Minimum 4GB of dirmap */ ndmpdp = 4; @@ -517,6 +546,22 @@ DMPDphys = allocpages(firstaddr, ndmpdp - ndm1g); dmaplimit = (vm_paddr_t)ndmpdp << PDPSHIFT; + /* Allocate pages */ + KPML4phys = allocpages(firstaddr, 1); + KPDPphys = allocpages(firstaddr, NKPML4E); + + /* + * Allocate the initial number of kernel page table pages required to + * bootstrap. We defer this until after all memory-size dependent + * allocations are done (e.g. direct map), so that we don't have to + * build in too much slop in our estimate.
+ */ + nkpt_init(*firstaddr); + nkpdpe = NKPDPE(nkpt); + + KPTphys = allocpages(firstaddr, nkpt); + KPDphys = allocpages(firstaddr, nkpdpe); + /* Fill in the underlying page table pages */ /* Read-only from zero to physfree */ /* XXX not fully used, underneath 2M pages */ @@ -526,7 +571,7 @@ } /* Now map the page tables at their location within PTmap */ - for (i = 0; i < NKPT; i++) { + for (i = 0; i < nkpt; i++) { ((pd_entry_t *)KPDphys)[i] = KPTphys + (i << PAGE_SHIFT); ((pd_entry_t *)KPDphys)[i] |= PG_RW | PG_V; } @@ -539,7 +584,7 @@ } /* And connect up the PD to the PDP */ - for (i = 0; i < NKPDPE; i++) { + for (i = 0; i < nkpdpe; i++) { ((pdp_entry_t *)KPDPphys)[i + KPDPI] = KPDphys + (i << PAGE_SHIFT); ((pdp_entry_t *)KPDPphys)[i + KPDPI] |= PG_RW | PG_V | PG_U; @@ -768,7 +813,7 @@ * Initialize the vm page array entries for the kernel pmap's * page table pages. */ - for (i = 0; i < NKPT; i++) { + for (i = 0; i < nkpt; i++) { mpte = PHYS_TO_VM_PAGE(KPTphys + (i << PAGE_SHIFT)); KASSERT(mpte >= vm_page_array && mpte < &vm_page_array[vm_page_array_size], @@ -1995,7 +2040,7 @@ * any new kernel page table pages between "kernel_vm_end" and * "KERNBASE". */ - if (KERNBASE < addr && addr <= KERNBASE + NKPT * NBPDR) + if (KERNBASE < addr && addr <= KERNBASE + nkpt * NBPDR) return; addr = roundup2(addr, NBPDR); best Neel On Sun, Dec 9, 2012 at 5:41 AM, wrote: >> As also Alan suggested, a way to work around the problem is to increase >> NKPT value (e.g. from 32 to 64). Obviously, this is not a proper fix. >> For a proper fix the kernel needs to be able to dynamically set the >> size of NKPT. In this particular case, this wouldn't be too hard, but >> there is a different case, where people preload a large memory disk >> image at boot time that isn't so easy to fix. >> >> Thanks, >> >> Davide > > > Had the same issue. > I use very big preloaded images, with full world + many compiled ports in it. > > Fix: > 'sh' code snip ...
> ---- > # Get default NKPT value > nkpt=`cat "/sys/$arch/include/pmap.h" | sed -En 's/.+NKPT[[:blank:]]+([0-9]{2})$/\1/p'` > > # How many additional NKPT (4 Mb each), for our image, added to amount of NKPT? > # Calculated in Kb > : $((nkpt += "$img_size" / 4096)) > ---- > > But it loads sooooo slow into the RAM. > That should be enhanced, too. > > > Domagoj Smolčić > _______________________________________________ > freebsd-hackers@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" From owner-freebsd-hackers@FreeBSD.ORG Tue Feb 5 02:49:06 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id E062379E for ; Tue, 5 Feb 2013 02:49:06 +0000 (UTC) (envelope-from lists@eitanadler.com) Received: from mail-lb0-f173.google.com (mail-lb0-f173.google.com [209.85.217.173]) by mx1.freebsd.org (Postfix) with ESMTP id 6EF4CFD8 for ; Tue, 5 Feb 2013 02:49:06 +0000 (UTC) Received: by mail-lb0-f173.google.com with SMTP id gf7so7281011lbb.32 for ; Mon, 04 Feb 2013 18:49:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=eitanadler.com; s=0xdeadbeef; h=x-received:mime-version:from:date:message-id:subject:to :content-type; bh=ANl7F0X4orqi2P3jXiAnJAUzhSk0lQ4X9zs7WO6WmCU=; b=mCeIxeEkxAlsFWObH/f4XjWzUfj2L3z9WEtlT4pEM0pLuyhE7eHvIqPISyAHwIs7vr C1BD3sctlygFf+mkNInPsJLjwy2OLyDX6XktCfJYyF124PgfHhJ7fv7RGpHPDPcijHcn U8VFT0POjbLLVKPGw38m6exISBB7eLbfibf5Y= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-received:mime-version:from:date:message-id:subject:to :content-type:x-gm-message-state; bh=ANl7F0X4orqi2P3jXiAnJAUzhSk0lQ4X9zs7WO6WmCU=; b=GUEq7I8av6qUggqrxbdc7brA/dDExZZqCoJ86IAYUXKG2YICdY60e1dgtfXBpKlQD8 k4FadkZDb29W1rPkWqtPIouY5r0jlOLwJyGW2xm6GosZSpLDsmGWK2+4igqVhxsNARk0
/0rjuWpfFrsjBef87YPr4ZJSPm9cKxLoj04kmrY97MnPiHxSxumXlwfXTGj3HWu4M2hS eIjt8KUC5zhlJjfDGekwwgK03orZgtbtQkrt5PXxYuV3Q5lBYQye6aPIlKVnDtTvmd9F K4U5bYQS+OGtbDS/TBO/1VAa0av2dhuO8GXjKAPT2BJFpLlWHFNnYHchA6UdEHHIFP3e 9XwQ== X-Received: by 10.152.147.103 with SMTP id tj7mr21447060lab.54.1360032545104; Mon, 04 Feb 2013 18:49:05 -0800 (PST) MIME-Version: 1.0 Received: by 10.112.91.164 with HTTP; Mon, 4 Feb 2013 18:48:35 -0800 (PST) From: Eitan Adler Date: Mon, 4 Feb 2013 21:48:35 -0500 Message-ID: Subject: c99 project To: freebsd-doc@freebsd.org, FreeBSD Hackers Content-Type: text/plain; charset=UTF-8 X-Gm-Message-State: ALoCoQk3RbUV+LZOSjlwTsAVa7MTV3uRYgaAzhu19dbF5NVQYMKVtWg1+iMkwBGP2DOScEMf7X7n X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Feb 2013 02:49:06 -0000 Is the following page still useful? Would there be any objection to me removing it? 
http://www.freebsd.org/projects/c99/index.html -- Eitan Adler From owner-freebsd-hackers@FreeBSD.ORG Tue Feb 5 03:08:47 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id DB8A0EC9; Tue, 5 Feb 2013 03:08:47 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Received: from mail-we0-x234.google.com (we-in-x0234.1e100.net [IPv6:2a00:1450:400c:c03::234]) by mx1.freebsd.org (Postfix) with ESMTP id 2FEDE1A0; Tue, 5 Feb 2013 03:08:47 +0000 (UTC) Received: by mail-we0-f180.google.com with SMTP id k14so5260461wer.25 for ; Mon, 04 Feb 2013 19:08:46 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=Q9hMiToWvFZIIxeSWq2bP2L53FfqA4hRYR1VKKnMXVI=; b=hXYn2dbeY/E9IbdraWZkMTK0otPOYx0/wrE5ApQ0/h6A77cGfTjsAcoDBex2SFTBeK pAIfnpgYbHVsbCwhQ4nELNRTdYQtTHPTjP6a0eS7y6ihWGbMNhL3LhchhFdYRsXQAbO0 Qx8UXcxBq57yy+rnyEIIHC1ovqPI8cn0y0ZWxwTqxHSU1ATe03TpxWCGwAwMS1HvZDrH VF4PyO+Z/S9EnU1fLU+wj8kfharPyDxzKD71zfR/F5GJVeHiETliYN/UQYl4rnNQCrQT iPfSfeTKcIb7rIwLxlRYPNsJNRLZXHBGh7xB4xa1xCvSfpO9EvgwN3fQPlzZyXl9WJOS qWZQ== MIME-Version: 1.0 X-Received: by 10.194.108.101 with SMTP id hj5mr39239819wjb.6.1360033726048; Mon, 04 Feb 2013 19:08:46 -0800 (PST) Sender: adrian.chadd@gmail.com Received: by 10.216.236.88 with HTTP; Mon, 4 Feb 2013 19:08:45 -0800 (PST) In-Reply-To: References: Date: Mon, 4 Feb 2013 19:08:45 -0800 X-Google-Sender-Auth: e0nl0aPfDxkfI_VLtfcwDverZNo Message-ID: Subject: Re: c99 project From: Adrian Chadd To: Eitan Adler Content-Type: text/plain; charset=ISO-8859-1 Cc: FreeBSD Hackers , freebsd-doc@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: 
List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Feb 2013 03:08:47 -0000 .. is it actually completed? Adrian On 4 February 2013 18:48, Eitan Adler wrote: > Is the following page still useful? > > Would there be any objection to me removing it? > > http://www.freebsd.org/projects/c99/index.html > > -- > Eitan Adler > _______________________________________________ > freebsd-hackers@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" From owner-freebsd-hackers@FreeBSD.ORG Tue Feb 5 03:12:06 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 1F4F52BD for ; Tue, 5 Feb 2013 03:12:06 +0000 (UTC) (envelope-from lists@eitanadler.com) Received: from mail-la0-x22f.google.com (mail-la0-x22f.google.com [IPv6:2a00:1450:4010:c03::22f]) by mx1.freebsd.org (Postfix) with ESMTP id 89FEA1DE for ; Tue, 5 Feb 2013 03:12:05 +0000 (UTC) Received: by mail-la0-f47.google.com with SMTP id fj20so5216474lab.20 for ; Mon, 04 Feb 2013 19:12:04 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=eitanadler.com; s=0xdeadbeef; h=x-received:mime-version:in-reply-to:references:from:date:message-id :subject:to:cc:content-type; bh=0MJPIS060YVwAdmkhCvCLt4uKdON6pDijpE52rS97E0=; b=HyVZ4lx3Gte8EIJjnlv8vO75WSos0aVTxXj/GDqiSoSqVmo3bmNMKUnUHXjvONWEg+ MaUu6DcKh4ir6Pn2kWAR6Z4AT+d96P8AqyvvHGvzJx6hMg7YZDT23JOcjNzCijt2+Cd3 wfFdfLmgWzf+ndVXwdml2VDJK6Hc46s1rtIvU= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-received:mime-version:in-reply-to:references:from:date:message-id :subject:to:cc:content-type:x-gm-message-state; bh=0MJPIS060YVwAdmkhCvCLt4uKdON6pDijpE52rS97E0=; b=EkvHs6e3jSk35a6GGWxpCsqRL0JN+AoSQlJ97ZFQBfqODhvX6Yi3PY+vqe5RJVn15i ULY/GwBv9fR/abcxftMmBCHvq5gpGatWYjjM6AxCOh7DVo3ziq4xFG6O9TuQofkJ6KA+ 
vek/oDUKu4vIptv9xLTLuo9EJoBdXpBC7oFRBjJnALgmSsQKdg8xdMfb6ptnm68ZgjiV PmTHK3EA8AoruOy6/jFzg5CDGHxIBw1TZ0qNJQYHuzA8wMKpwiiMcv5eAy5y1SUz5Ted V53nYPwYrkT/8PhwZttwrmU/GU4kR4E5gBu26BjoaohX9YNGkb5Bvn8DVv61Us+5qaQL U2OA== X-Received: by 10.112.87.66 with SMTP id v2mr9339396lbz.130.1360033924192; Mon, 04 Feb 2013 19:12:04 -0800 (PST) MIME-Version: 1.0 Received: by 10.112.91.164 with HTTP; Mon, 4 Feb 2013 19:11:34 -0800 (PST) In-Reply-To: References: From: Eitan Adler Date: Mon, 4 Feb 2013 22:11:34 -0500 Message-ID: Subject: Re: c99 project To: Adrian Chadd Content-Type: text/plain; charset=UTF-8 X-Gm-Message-State: ALoCoQn7AFnTawB/kkAcm9LF+aGABX+Cp+MUOsnhzJe7bBPESZN9qiGha77m2iKV+tXLEQDv6Y0t Cc: FreeBSD Hackers , freebsd-doc@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Feb 2013 03:12:06 -0000 On 4 February 2013 22:08, Adrian Chadd wrote: > .. is it actually completed? 
No idea, hence my question ;) -- Eitan Adler From owner-freebsd-hackers@FreeBSD.ORG Tue Feb 5 03:18:22 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 3DB0C733 for ; Tue, 5 Feb 2013 03:18:22 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id E3A0A235 for ; Tue, 5 Feb 2013 03:18:21 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.6/8.14.6) with ESMTP id r153IDo1014403 for ; Mon, 4 Feb 2013 21:18:13 -0600 (CST) (envelope-from stephen@missouri.edu) Message-ID: <511079F5.50300@missouri.edu> Date: Mon, 04 Feb 2013 21:18:13 -0600 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:17.0) Gecko/20130106 Thunderbird/17.0.2 MIME-Version: 1.0 To: freebsd-hackers@freebsd.org Subject: Re: c99 project References: In-Reply-To: X-Enigmail-Version: 1.5 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Feb 2013 03:18:22 -0000 On 02/04/2013 08:48 PM, Eitan Adler wrote: > Is the following page still useful? > > Would there be any objection to me removing it? > > http://www.freebsd.org/projects/c99/index.html We are still working on complex and long double functions. 
From owner-freebsd-hackers@FreeBSD.ORG Tue Feb 5 08:53:23 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id A4010ABC; Tue, 5 Feb 2013 08:53:23 +0000 (UTC) (envelope-from gabor@FreeBSD.org) Received: from server.mypc.hu (server.mypc.hu [87.229.73.95]) by mx1.freebsd.org (Postfix) with ESMTP id 5246DF86; Tue, 5 Feb 2013 08:53:23 +0000 (UTC) Received: from server.mypc.hu (localhost [127.0.0.1]) by server.mypc.hu (Postfix) with ESMTP id 0023814D2507; Tue, 5 Feb 2013 09:53:13 +0100 (CET) X-Virus-Scanned: amavisd-new at !change-mydomain-variable!.example.com Received: from server.mypc.hu ([127.0.0.1]) by server.mypc.hu (server.mypc.hu [127.0.0.1]) (amavisd-new, port 10024) with LMTP id Tl4n7GXsRfdZ; Tue, 5 Feb 2013 09:53:13 +0100 (CET) Received: from [192.168.1.100] (5403A6BE.catv.pool.telekom.hu [84.3.166.190]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by server.mypc.hu (Postfix) with ESMTPSA id 601C314D2410; Tue, 5 Feb 2013 09:53:12 +0100 (CET) Message-ID: <5110C878.5080706@FreeBSD.org> Date: Tue, 05 Feb 2013 09:53:12 +0100 From: Gabor Kovesdan User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130107 Thunderbird/17.0.2 MIME-Version: 1.0 To: Eitan Adler Subject: Re: c99 project References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: FreeBSD Hackers , freebsd-doc@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Feb 2013 08:53:23 -0000 Em 05-02-2013 03:48, Eitan Adler escreveu: > Is the following page still useful? Yes. 
Gabor From owner-freebsd-hackers@FreeBSD.ORG Tue Feb 5 10:35:20 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 85ED61C4; Tue, 5 Feb 2013 10:35:20 +0000 (UTC) (envelope-from gkeramidas@gmail.com) Received: from mail-bk0-f53.google.com (mail-bk0-f53.google.com [209.85.214.53]) by mx1.freebsd.org (Postfix) with ESMTP id D347CCB2; Tue, 5 Feb 2013 10:35:19 +0000 (UTC) Received: by mail-bk0-f53.google.com with SMTP id j10so3171870bkw.26 for ; Tue, 05 Feb 2013 02:35:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:sender:date:from:to:cc:subject:message-id:references :mime-version:content-type:content-disposition:in-reply-to; bh=UgyGzeF1PcrG30wWgQPgWmyVx/u9ig1RMr/V6UJcmLo=; b=N0yjmidDo2OMHdusTGexthfp6c3RvvCAxkiKp3cugQGHMhC1imiFkxkpVBaqvH8UiX fx+SMncXWljNdH/MJe7Re6bx0qbJwq2vLmUA1ICHcP4tPrEwytNBnFikmtAgz0OWHE9Q /ooX9jyrzraHfqJskK/ljGwraqFziATPCP+F50MvIQ59Vgct1Djc3p3X2jmVoyA+xhP4 GCNn1iCJ9PHMJ1N2MD+TaGliI7QVtZywmObYRuMbfHrDF2JbOwIqbxKeKrjz3nqB3vN9 E0MarbSn+kbiMZZjfHji3s9U2JcG7KgH+LncOC7f7zEgOlLV8apfi1Zy+tfVRQd3/gYm C4og== X-Received: by 10.204.149.140 with SMTP id t12mr1251949bkv.123.1360060513298; Tue, 05 Feb 2013 02:35:13 -0800 (PST) Received: from saturn (217-162-217-29.dynamic.hispeed.ch. 
[217.162.217.29]) by mx.google.com with ESMTPS id r17sm6252748bkw.21.2013.02.05.02.35.12 (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Tue, 05 Feb 2013 02:35:12 -0800 (PST) Sender: Giorgos Keramidas Date: Tue, 5 Feb 2013 11:35:10 +0100 From: Giorgos Keramidas To: Eitan Adler Subject: Re: c99 project Message-ID: <20130205103509.GB28045@saturn> References: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: Cc: FreeBSD Hackers , freebsd-doc@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Feb 2013 10:35:20 -0000 On 2013-02-04 21:48, Eitan Adler wrote: > Is the following page still useful? > Would there be any objection to me removing it? > > http://www.freebsd.org/projects/c99/index.html I think this is useful until we have full C99 support in at least one compiler toolchain. To the best of my knowledge this is not entirely true for either GCC or LLVM. So we should keep the page alive, until the project is done or canceled. 
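
The sizing rule in nkpt_init() from the patch above can be checked by hand in user space. What follows is a minimal sketch, not the kernel code: PDRSHIFT (21) and NPDEPG (512) are hard-coded here on the assumption that they match the usual amd64 values, and nkpt_estimate() and HOWMANY() are hypothetical names introduced only for illustration. It mirrors the three steps of the #else branch: howmany(addr, 1 << PDRSHIFT), plus the patch's NKPDPE() of that count, plus the fixed slop of 4.

```c
#include <stdint.h>

/* Assumed amd64 constants; the kernel takes these from its own headers. */
#define PDRSHIFT 21   /* log2 of the 2MB covered by one page table page */
#define NPDEPG   512  /* page directory entries per page */

/* Round-up integer division, like the kernel's howmany(). */
#define HOWMANY(x, y) (((x) + ((y) - 1)) / (y))

/*
 * Hypothetical user-space mirror of the #else branch of nkpt_init().
 * 'addr' plays the role of *firstaddr at the point where the patch
 * calls nkpt_init().
 */
int
nkpt_estimate(uint64_t addr)
{
	int pt_pages;

	pt_pages = (int)HOWMANY(addr, UINT64_C(1) << PDRSHIFT);
	pt_pages += HOWMANY(pt_pages, NPDEPG);	/* NKPDPE(pt_pages) in the patch */
	pt_pages += 4;				/* 8MB of slop, as in the patch */
	return (pt_pages);
}
```

With addr at 64MB this gives 32 + 1 + 4 = 37, and at 1GB it gives 512 + 1 + 4 = 517. Since each page table page covers 2MB, raising the slop to 16MB would mean adding 8 instead of 4.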
From owner-freebsd-hackers@FreeBSD.ORG Tue Feb 5 15:14:24 2013 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 0DF3068C; Tue, 5 Feb 2013 15:14:24 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) by mx1.freebsd.org (Postfix) with ESMTP id 5E691F76; Tue, 5 Feb 2013 15:14:23 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.6/8.14.6) with ESMTP id r15FEDQQ060971; Tue, 5 Feb 2013 17:14:13 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.7.4 kib.kiev.ua r15FEDQQ060971 Received: (from kostik@localhost) by tom.home (8.14.6/8.14.6/Submit) id r15FEDgt060970; Tue, 5 Feb 2013 17:14:13 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Tue, 5 Feb 2013 17:14:13 +0200 From: Konstantin Belousov To: Neel Natu Subject: Re: dynamically calculating NKPT [was: Re: huge ktr buffer] Message-ID: <20130205151413.GL2522@kib.kiev.ua> References: MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="oImBTl0TNA0mSDFD" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on tom.home Cc: alc@freebsd.org, davide@freebsd.org, hackers@freebsd.org, avg@freebsd.org, rank1seeker@gmail.com X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Feb 2013 15:14:24 -0000 --oImBTl0TNA0mSDFD Content-Type: 
text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Mon, Feb 04, 2013 at 03:05:15PM -0800, Neel Natu wrote: > Hi, > > I have a patch to dynamically calculate NKPT for amd64 kernels. This > should fix the various issues that people pointed out in the email > thread. > > Please review and let me know if there are any objections to committing this. > > Also, thanks to Alan (alc@) for reviewing and providing feedback on > the initial version of the patch. > > Patch (also available at http://people.freebsd.org/~neel/patches/nkpt_diff.txt): > > Index: sys/amd64/include/pmap.h > =================================================================== > --- sys/amd64/include/pmap.h (revision 246277) > +++ sys/amd64/include/pmap.h (working copy) > @@ -113,13 +113,7 @@ > ((unsigned long)(l2) << PDRSHIFT) | \ > ((unsigned long)(l1) << PAGE_SHIFT)) > > -/* Initial number of kernel page tables.
*/ > -#ifndef NKPT > -#define NKPT 32 > -#endif > - > #define NKPML4E 1 /* number of kernel PML4 slots */ > -#define NKPDPE howmany(NKPT, NPDEPG)/* number of kernel PDP slots */ > > #define NUPML4E (NPML4EPG/2) /* number of userland PML4 pages */ > #define NUPDPE (NUPML4E*NPDPEPG)/* number of userland PDP pages */ > @@ -181,6 +175,7 @@ > #define PML4map ((pd_entry_t *)(addr_PML4map)) > #define PML4pml4e ((pd_entry_t *)(addr_PML4pml4e)) > > +extern int nkpt; /* Initial number of kernel page tables */ > extern u_int64_t KPDPphys; /* physical address of kernel level 3 */ > extern u_int64_t KPML4phys; /* physical address of kernel level 4 */ > > Index: sys/amd64/amd64/minidump_machdep.c > =================================================================== > --- sys/amd64/amd64/minidump_machdep.c (revision 246277) > +++ sys/amd64/amd64/minidump_machdep.c (working copy) > @@ -232,7 +232,7 @@ > /* Walk page table pages, set bits in vm_page_dump */ > pmapsize = 0; > pdp = (uint64_t *)PHYS_TO_DMAP(KPDPphys); > - for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + NKPT * NBPDR, > + for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + nkpt * NBPDR, > kernel_vm_end); ) { > /* > * We always write a page, even if it is zero.
Each > @@ -364,7 +364,7 @@ > /* Dump kernel page directory pages */ > bzero(fakepd, sizeof(fakepd)); > pdp = (uint64_t *)PHYS_TO_DMAP(KPDPphys); > - for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + NKPT * NBPDR, > + for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + nkpt * NBPDR, > kernel_vm_end); va += NBPDP) { > i = (va >> PDPSHIFT) & ((1ul << NPDPEPGSHIFT) - 1); > > Index: sys/amd64/amd64/pmap.c > =================================================================== > --- sys/amd64/amd64/pmap.c (revision 246277) > +++ sys/amd64/amd64/pmap.c (working copy) > @@ -202,6 +202,10 @@ > vm_offset_t virtual_avail; /* VA of first avail page (after kernel bss) */ > vm_offset_t virtual_end; /* VA of last avail page (end of kernel AS) */ > > +int nkpt; > +SYSCTL_INT(_machdep, OID_AUTO, nkpt, CTLFLAG_RD, &nkpt, 0, > + "Number of kernel page table pages allocated on bootup"); > + > static int ndmpdp; > static vm_paddr_t dmaplimit; > vm_offset_t kernel_vm_end = VM_MIN_KERNEL_ADDRESS; > @@ -495,17 +499,42 @@ > > CTASSERT(powerof2(NDMPML4E)); > > +/* number of kernel PDP slots */ > +#define NKPDPE(ptpgs) howmany((ptpgs), NPDEPG) > + > static void > +nkpt_init(vm_paddr_t addr) > +{ > + int pt_pages; > + > +#ifdef NKPT > + pt_pages = NKPT; > +#else > + pt_pages = howmany(addr, 1 << PDRSHIFT); > + pt_pages += NKPDPE(pt_pages); > + > + /* > + * Add some slop beyond the bare minimum required for bootstrapping > + * the kernel. > + * > + * This is quite important when allocating KVA for kernel modules. > + * The modules are required to be linked in the negative 2GB of > + * the address space. If we run out of KVA in this region then > + * pmap_growkernel() will need to allocate page table pages to map > + * the entire 512GB of KVA space which is an unnecessary tax on > + * physical memory.
> + */ > + pt_pages += 4; /* 8MB additional slop for kernel modules */ 8MB might be too low. I just checked one of my machines with fully modularized kernel, it takes slightly more than 6 MB to load 50 modules. I think that 16MB would be safer, but it probably needs to be scaled down based on the available phys memory. amd64 kernel could be booted on 128MB machine still. --oImBTl0TNA0mSDFD Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iQIcBAEBAgAGBQJRESHEAAoJEJDCuSvBvK1BokcP/3oUV+2JJu9FdumDYQlPlkk4 jciIQpb/tl/Z/J9DJf6vThdOaw3R2QXhh1JrvkQFONTno2USeUJWivz7Rtvfdluq n200D5RsgkiWEBuBBLSE5PdKiioMePGFhuRed+67ISxgYWdC+5ZXXwvjHivdN52u +bDgV9d9D1iOX17Fcxu/yAlI5Aed1mlJw4o5YsQCnhw/vzXi2e0/gidqvbX+5JpM 42g8D5V35RWj+xtUvFDuGcFq0aGME0JMmJ/T9txIsWAawgZFqWM5gOVNESgtsLXc 82SWA6jnLy+/Vs889vQVDD6jVq3qIu7S4CnDAcXClfzCX172K6ImTiORAGkQhcoc mKZCEqPB7QyKic4N0jFVI9PdAeMhSzC9NJLxMIBvs7RasmU1QzMaMRpEfNPKEKiv +uG29qatSC3HaxEbmeK/Ix12RRnry9DJGUimZk6qiya3/rZAGFBBv425bQwNMe12 6rSJzUS1zR+Gus519xCqMs6Gxcn35qk+7gSnblsWNiVxRuDqTqodu/spfzRNYZ4I VfDZc0wG/aeeMPWwqLzAzzbWcZLuVaVN2nuN+Em2sVVy0YL/0QVmtSg7XmRJmYnX oqedILnMerIADtpQme5Rr7mh0j4bLcPeBJdYe34jVQqqk/9dzHOPoYU9N0Fsy5H5 NCfa+nr1IEM75JuU1h/G =N9W3 -----END PGP SIGNATURE----- --oImBTl0TNA0mSDFD-- From owner-freebsd-hackers@FreeBSD.ORG Tue Feb 5 15:45:26 2013 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id BC842F24; Tue, 5 Feb 2013 15:45:26 +0000 (UTC) (envelope-from mdf356@gmail.com) Received: from mail-qe0-f47.google.com (mail-qe0-f47.google.com [209.85.128.47]) by mx1.freebsd.org (Postfix) with ESMTP id 21824205; Tue, 5 Feb 2013 15:45:25 +0000 (UTC) Received: by mail-qe0-f47.google.com with SMTP id 2so138997qea.20 for ; Tue, 05 Feb 2013 07:45:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
h=mime-version:x-received:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=sz//YS7u3mMP2dcjHx6tMJMuPro0UOsHC4oX06oMF4E=; b=mJj5KsThDTsGBgE5d+fWzumDNfPrB+P8n53Pfrfg0n7yFYUrloFnvDIEGOSCBWQkTP XI5lGo8PWhdpU6Vb9A9wRXahXIKBakviXJQEr3XD5CBKiPnw75XuQDYCvSPeEvtHDgwE kr6okf6w5bTPiu0uRv/ZY1WDwlTtIkRsZHjYKE/qNlbdMFND1GDXKa/o+VxjrHXt2v8V D2N5icf/l8xdGH2kEKciOs3HZkQafilzhI40n2dW+Jgc7DG+kxurVp+EdJA4FVIr/pvA kPTIEilqPGNRsqc1UvgpWKo6hsniT2oE6jjiTEwirA3B5VCCJ2+n2iE983WcWgIHYbAf 3zFA== MIME-Version: 1.0 X-Received: by 10.49.104.108 with SMTP id gd12mr22785699qeb.37.1360079125068; Tue, 05 Feb 2013 07:45:25 -0800 (PST) Sender: mdf356@gmail.com Received: by 10.229.179.42 with HTTP; Tue, 5 Feb 2013 07:45:24 -0800 (PST) In-Reply-To: <20130205151413.GL2522@kib.kiev.ua> References: <20130205151413.GL2522@kib.kiev.ua> Date: Tue, 5 Feb 2013 07:45:24 -0800 X-Google-Sender-Auth: 7gBZSQnM1G8dA3vDltxzkH97Crc Message-ID: Subject: Re: dynamically calculating NKPT [was: Re: huge ktr buffer] From: mdf@FreeBSD.org To: Konstantin Belousov Content-Type: text/plain; charset=ISO-8859-1 Cc: davide@freebsd.org, alc@freebsd.org, avg@freebsd.org, rank1seeker@gmail.com, hackers@freebsd.org, Neel Natu X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Feb 2013 15:45:26 -0000 On Tue, Feb 5, 2013 at 7:14 AM, Konstantin Belousov wrote: > On Mon, Feb 04, 2013 at 03:05:15PM -0800, Neel Natu wrote: >> Hi, >> >> I have a patch to dynamically calculate NKPT for amd64 kernels. This >> should fix the various issues that people pointed out in the email >> thread. >> >> Please review and let me know if there are any objections to committing this. >> >> Also, thanks to Alan (alc@) for reviewing and providing feedback on >> the initial version of the patch. 
>> >> Patch (also available at http://people.freebsd.org/~neel/patches/nkpt_diff.txt): >> >> Index: sys/amd64/include/pmap.h >> =================================================================== >> --- sys/amd64/include/pmap.h (revision 246277) >> +++ sys/amd64/include/pmap.h (working copy) >> @@ -113,13 +113,7 @@ >> ((unsigned long)(l2) << PDRSHIFT) | \ >> ((unsigned long)(l1) << PAGE_SHIFT)) >> >> -/* Initial number of kernel page tables. */ >> -#ifndef NKPT >> -#define NKPT 32 >> -#endif >> - >> #define NKPML4E 1 /* number of kernel PML4 slots */ >> -#define NKPDPE howmany(NKPT, NPDEPG)/* number of kernel PDP slots */ >> >> #define NUPML4E (NPML4EPG/2) /* number of userland PML4 pages */ >> #define NUPDPE (NUPML4E*NPDPEPG)/* number of userland PDP pages */ >> @@ -181,6 +175,7 @@ >> #define PML4map ((pd_entry_t *)(addr_PML4map)) >> #define PML4pml4e ((pd_entry_t *)(addr_PML4pml4e)) >> >> +extern int nkpt; /* Initial number of kernel page tables */ >> extern u_int64_t KPDPphys; /* physical address of kernel level 3 */ >> extern u_int64_t KPML4phys; /* physical address of kernel level 4 */ >> >> Index: sys/amd64/amd64/minidump_machdep.c >> =================================================================== >> --- sys/amd64/amd64/minidump_machdep.c (revision 246277) >> +++ sys/amd64/amd64/minidump_machdep.c (working copy) >> @@ -232,7 +232,7 @@ >> /* Walk page table pages, set bits in vm_page_dump */ >> pmapsize = 0; >> pdp = (uint64_t *)PHYS_TO_DMAP(KPDPphys); >> - for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + NKPT * NBPDR, >> + for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + nkpt * NBPDR, >> kernel_vm_end); ) { >> /* >> * We always write a page, even if it is zero. 
Each >> @@ -364,7 +364,7 @@ >> /* Dump kernel page directory pages */ >> bzero(fakepd, sizeof(fakepd)); >> pdp = (uint64_t *)PHYS_TO_DMAP(KPDPphys); >> - for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + NKPT * NBPDR, >> + for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + nkpt * NBPDR, >> kernel_vm_end); va += NBPDP) { >> i = (va >> PDPSHIFT) & ((1ul << NPDPEPGSHIFT) - 1); >> >> Index: sys/amd64/amd64/pmap.c >> =================================================================== >> --- sys/amd64/amd64/pmap.c (revision 246277) >> +++ sys/amd64/amd64/pmap.c (working copy) >> @@ -202,6 +202,10 @@ >> vm_offset_t virtual_avail; /* VA of first avail page (after kernel bss) */ >> vm_offset_t virtual_end; /* VA of last avail page (end of kernel AS) */ >> >> +int nkpt; >> +SYSCTL_INT(_machdep, OID_AUTO, nkpt, CTLFLAG_RD, &nkpt, 0, >> + "Number of kernel page table pages allocated on bootup"); >> + >> static int ndmpdp; >> static vm_paddr_t dmaplimit; >> vm_offset_t kernel_vm_end = VM_MIN_KERNEL_ADDRESS; >> @@ -495,17 +499,42 @@ >> >> CTASSERT(powerof2(NDMPML4E)); >> >> +/* number of kernel PDP slots */ >> +#define NKPDPE(ptpgs) howmany((ptpgs), NPDEPG) >> + >> static void >> +nkpt_init(vm_paddr_t addr) >> +{ >> + int pt_pages; >> + >> +#ifdef NKPT >> + pt_pages = NKPT; >> +#else >> + pt_pages = howmany(addr, 1 << PDRSHIFT); >> + pt_pages += NKPDPE(pt_pages); >> + >> + /* >> + * Add some slop beyond the bare minimum required for bootstrapping >> + * the kernel. >> + * >> + * This is quite important when allocating KVA for kernel modules. >> + * The modules are required to be linked in the negative 2GB of >> + * the address space. If we run out of KVA in this region then >> + * pmap_growkernel() will need to allocate page table pages to map >> + * the entire 512GB of KVA space which is an unnecessary tax on >> + * physical memory. >> + */ >> + pt_pages += 4; /* 8MB additional slop for kernel modules */ > 8MB might be too low.
I just checked one of my machines with fully > modularized kernel, it takes slightly more than 6 MB to load 50 modules. > I think that 16MB would be safer, but it probably needs to be scaled > down based on the available phys memory. amd64 kernel could be booted > on 128MB machine still. Is there no way to not map the entire 512GB? Otherwise this patch could really hose some vendors. E.g. the kernel module for the OneFS file system is around 8MB all by itself. I found when we moved from FreeBSD 6 to 7 that the NKPT of 32 was insufficient for our system to even boot so I put it back to 240 (I didn't want to spend a lot of time playing). At that time our module was loaded by the boot loader; now we do it during init to save some seconds on boot. But we're probably not the only ones with a large kernel module. Cheers, matthew From owner-freebsd-hackers@FreeBSD.ORG Tue Feb 5 16:13:41 2013 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 28BCB8DC; Tue, 5 Feb 2013 16:13:41 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) by mx1.freebsd.org (Postfix) with ESMTP id 8F96B3EA; Tue, 5 Feb 2013 16:13:40 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.6/8.14.6) with ESMTP id r15GDaS0067284; Tue, 5 Feb 2013 18:13:36 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.7.4 kib.kiev.ua r15GDaS0067284 Received: (from kostik@localhost) by tom.home (8.14.6/8.14.6/Submit) id r15GDZN8067283; Tue, 5 Feb 2013 18:13:35 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Tue, 5 Feb 2013 18:13:35 +0200 From: Konstantin Belousov To: mdf@FreeBSD.org Subject: Re: dynamically calculating NKPT [was: Re: huge ktr buffer] Message-ID: <20130205161335.GM2522@kib.kiev.ua> 
References: <20130205151413.GL2522@kib.kiev.ua> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="ZKQlerlNKW0xCYkU" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on tom.home Cc: davide@freebsd.org, alc@freebsd.org, avg@freebsd.org, rank1seeker@gmail.com, hackers@freebsd.org, Neel Natu X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Feb 2013 16:13:41 -0000

--ZKQlerlNKW0xCYkU
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Tue, Feb 05, 2013 at 07:45:24AM -0800, mdf@FreeBSD.org wrote:
> On Tue, Feb 5, 2013 at 7:14 AM, Konstantin Belousov wrote:
> > On Mon, Feb 04, 2013 at 03:05:15PM -0800, Neel Natu wrote:
> >> Hi,
> >>
> >> I have a patch to dynamically calculate NKPT for amd64 kernels. This
> >> should fix the various issues that people pointed out in the email
> >> thread.
> >>
> >> Please review and let me know if there are any objections to committing this.
> >>
> >> Also, thanks to Alan (alc@) for reviewing and providing feedback on
> >> the initial version of the patch.
> >>
> >> Patch (also available at http://people.freebsd.org/~neel/patches/nkpt_diff.txt):
> >>
> >> Index: sys/amd64/include/pmap.h
> >> ===================================================================
> >> --- sys/amd64/include/pmap.h (revision 246277)
> >> +++ sys/amd64/include/pmap.h (working copy)
> >> @@ -113,13 +113,7 @@
> >> ((unsigned long)(l2) << PDRSHIFT) | \
> >> ((unsigned long)(l1) << PAGE_SHIFT))
> >>
> >> -/* Initial number of kernel page tables. */
> >> -#ifndef NKPT
> >> -#define NKPT 32
> >> -#endif
> >> -
> >> #define NKPML4E 1 /* number of kernel PML4 slots */
> >> -#define NKPDPE howmany(NKPT, NPDEPG)/* number of kernel PDP slots */
> >>
> >> #define NUPML4E (NPML4EPG/2) /* number of userland PML4 pages */
> >> #define NUPDPE (NUPML4E*NPDPEPG)/* number of userland PDP pages */
> >> @@ -181,6 +175,7 @@
> >> #define PML4map ((pd_entry_t *)(addr_PML4map))
> >> #define PML4pml4e ((pd_entry_t *)(addr_PML4pml4e))
> >>
> >> +extern int nkpt; /* Initial number of kernel page tables */
> >> extern u_int64_t KPDPphys; /* physical address of kernel level 3 */
> >> extern u_int64_t KPML4phys; /* physical address of kernel level 4 */
> >>
> >> Index: sys/amd64/amd64/minidump_machdep.c
> >> ===================================================================
> >> --- sys/amd64/amd64/minidump_machdep.c (revision 246277)
> >> +++ sys/amd64/amd64/minidump_machdep.c (working copy)
> >> @@ -232,7 +232,7 @@
> >> /* Walk page table pages, set bits in vm_page_dump */
> >> pmapsize = 0;
> >> pdp = (uint64_t *)PHYS_TO_DMAP(KPDPphys);
> >> - for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + NKPT * NBPDR,
> >> + for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + nkpt * NBPDR,
> >> kernel_vm_end); ) {
> >> /*
> >> * We always write a page, even if it is zero. Each
> >> @@ -364,7 +364,7 @@
> >> /* Dump kernel page directory pages */
> >> bzero(fakepd, sizeof(fakepd));
> >> pdp = (uint64_t *)PHYS_TO_DMAP(KPDPphys);
> >> - for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + NKPT * NBPDR,
> >> + for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + nkpt * NBPDR,
> >> kernel_vm_end); va += NBPDP) {
> >> i = (va >> PDPSHIFT) & ((1ul << NPDPEPGSHIFT) - 1);
> >>
> >> Index: sys/amd64/amd64/pmap.c
> >> ===================================================================
> >> --- sys/amd64/amd64/pmap.c (revision 246277)
> >> +++ sys/amd64/amd64/pmap.c (working copy)
> >> @@ -202,6 +202,10 @@
> >> vm_offset_t virtual_avail; /* VA of first avail page (after kernel bss) */
> >> vm_offset_t virtual_end; /* VA of last avail page (end of kernel AS) */
> >>
> >> +int nkpt;
> >> +SYSCTL_INT(_machdep, OID_AUTO, nkpt, CTLFLAG_RD, &nkpt, 0,
> >> + "Number of kernel page table pages allocated on bootup");
> >> +
> >> static int ndmpdp;
> >> static vm_paddr_t dmaplimit;
> >> vm_offset_t kernel_vm_end = VM_MIN_KERNEL_ADDRESS;
> >> @@ -495,17 +499,42 @@
> >>
> >> CTASSERT(powerof2(NDMPML4E));
> >>
> >> +/* number of kernel PDP slots */
> >> +#define NKPDPE(ptpgs) howmany((ptpgs), NPDEPG)
> >> +
> >> static void
> >> +nkpt_init(vm_paddr_t addr)
> >> +{
> >> + int pt_pages;
> >> +
> >> +#ifdef NKPT
> >> + pt_pages = NKPT;
> >> +#else
> >> + pt_pages = howmany(addr, 1 << PDRSHIFT);
> >> + pt_pages += NKPDPE(pt_pages);
> >> +
> >> + /*
> >> + * Add some slop beyond the bare minimum required for bootstrapping
> >> + * the kernel.
> >> + *
> >> + * This is quite important when allocating KVA for kernel modules.
> >> + * The modules are required to be linked in the negative 2GB of
> >> + * the address space. If we run out of KVA in this region then
> >> + * pmap_growkernel() will need to allocate page table pages to map
> >> + * the entire 512GB of KVA space which is an unnecessary tax on
> >> + * physical memory.
> >> + */
> >> + pt_pages += 4; /* 8MB additional slop for kernel modules */
> > 8MB might be too low. I just checked one of my machines with fully
> > modularized kernel, it takes slightly more than 6 MB to load 50 modules.
> > I think that 16MB would be safer, but it probably needs to be scaled
> > down based on the available phys memory. amd64 kernel could be booted
> > on 128MB machine still.
>
> Is there no way to not map the entire 512GB? Otherwise this patch
> could really hose some vendors. E.g. the kernel module for the OneFS
> file system is around 8MB all by itself.

No, I do not think that this patch would hose somebody with the 8MB
module, esp. if the slack is increased.

But yes, I believe it is possible to note that the growth happens after
the KERNBASE point and only allocate the page tables in this region.
We would need to not update the kernel_vm_end then, probably creating
some other var to keep track of the other tail.

>
> I found when we moved from FreeBSD 6 to 7 that the NKPT of 32 was
> insufficient for our system to even boot so I put it back to 240 (I
> didn't want to spend a lot of time playing). At that time our module
> was loaded by the boot loader; now we do it during init to save some
> seconds on boot. But we're probably not the only ones with a large
> kernel module.
>=20 > Cheers, > matthew --ZKQlerlNKW0xCYkU Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iQIcBAEBAgAGBQJRES+vAAoJEJDCuSvBvK1BMmwP/1k8k4vPKEcDet3jQUN4JZu9 KAOoffF0GcJpRMEwvnm5ouGtNleCNeHYk2nlaQQaEUgh9A9qMBHtAE20+cLcTSvw TLjwTu+3GUxB1hljNobyNgDuNYaT32M8JPecH0VBY3z+uPd8ZM2i/eUHM6a25n2C 0SI1+FmXApDJh27nONzUwwgnkIE/Ak40STJVpXSW035QLoun+wYrlq2ut0jKFTAW 2tzx5+D//Zc/PsBnTxVPnk8PjTuj0lXnPBYG+ODYuXZe3QkGf7P3zfNLKD1BL8JQ NAd53eduaoGQYkSgBn87HPW5UW9f3lprgSqkcZxh5x3iZoEWOc/dZXnFTssmbPbA ox+JPwENq3P//SWWwv7C7Um3UIoagGLeiCk2Gq9dkW3HQwd3rpAlJ66BkbAnZ+lR +RsRiiESPR4uwq0jgGbW+OtTHfNUAtpk1TZkAlJx7wNZdVsJY0UmIpeDXW6mAj8v WlUMmW9a7DTIrFcctZ+NdFphM96Tk43lZ/9zyJg5qJNdASq+Aw4RMN9onZ0ssNvl 0v/aQCsF2bjDSFuv5uuUtst68oNCrB7BUs1mXqyXFKSq7BEonTuIfoUKOUX7UrVb puQCTH/bhMsfqpA6EtWp0iN+hCzwZwrSPq8R0C3wejNtqwozJ5mToMBUjZpFO9EF 9iClrroqW5FCHGfUfRPz =AJvn -----END PGP SIGNATURE----- --ZKQlerlNKW0xCYkU-- From owner-freebsd-hackers@FreeBSD.ORG Tue Feb 5 14:37:08 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id D31AF6C2 for ; Tue, 5 Feb 2013 14:37:08 +0000 (UTC) (envelope-from drbaud@yahoo.com) Received: from nm26-vm0.bullet.mail.bf1.yahoo.com (nm26-vm0.bullet.mail.bf1.yahoo.com [98.139.213.74]) by mx1.freebsd.org (Postfix) with ESMTP id 812BED4F for ; Tue, 5 Feb 2013 14:37:07 +0000 (UTC) Received: from [98.139.215.143] by nm26.bullet.mail.bf1.yahoo.com with NNFMP; 05 Feb 2013 14:37:05 -0000 Received: from [98.139.212.200] by tm14.bullet.mail.bf1.yahoo.com with NNFMP; 05 Feb 2013 14:37:05 -0000 Received: from [127.0.0.1] by omp1009.mail.bf1.yahoo.com with NNFMP; 05 Feb 2013 14:37:05 -0000 X-Yahoo-Newman-Property: ymail-3 X-Yahoo-Newman-Id: 639609.97030.bm@omp1009.mail.bf1.yahoo.com Received: (qmail 72292 invoked by uid 60001); 5 Feb 2013 14:37:05 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=yahoo.com; s=s1024; t=1360075025; bh=c3hyC4UdGgk8ckWlrGVZBUPeVAOcmrofG1xSWWrZIGE=; h=X-YMail-OSG:Received:X-Rocket-MIMEInfo:X-Mailer:Message-ID:Date:From:Reply-To:Subject:To:MIME-Version:Content-Type; b=WkxAIXnZu8hISE1oKUIytsDpV5NaqMLaXBOnSTUpd3MqFKdlR1aTW4SLPCyXGgCH9KFGfkQG/NcLKJf6QSs7ubQA1vSTZl6OfObpcWe0Kbqz3xmrtegQ468cM1z86cWSkCjIZr69QIH45HMzBlttghUQMyZjsBqtgae3Za749nk= DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=X-YMail-OSG:Received:X-Rocket-MIMEInfo:X-Mailer:Message-ID:Date:From:Reply-To:Subject:To:MIME-Version:Content-Type; b=MK9doxxtcYLNqKB3W5KVyqQwszmgOD+AMcl+mXhdjM0VQwrO4Brl6q4/hVOJ2oSDF4DTvzp6uGh+0YKw4IUACQJFtaCV//W4LPOKhOXJhGZ1cmJY1nIe4r1uQcBo0rSakCRX4qfwzdpHpM96gzMtKHqjlpLeHPv4/rZlwKmPlsY=; X-YMail-OSG: jRkGuXAVM1l3gD9D8qMlRpU9zZAwF2.HgkOiKQUNT7Kdl.j z6QbNw2lMGufrFH.Tr85kVR0_NP0kYaiWF1G0OlQ4Z60ZtTZg9P3I89Thb_S mGy8lYO_X40Oat2iKIwx0fi8qIb1FAw1jomK48.e6MSkTQkqdEytO6PoqbD6 aLvzGhpaq_tXIIuIroSgGDkIl5KtcMphR1wpkI2ssAgR4S3U80GLXukBEVYc uCkjiHrvz1V_N9mNvuWAZTeLmyjoC7bOYOYZgHx_y3VldLHqQqhj3wHt1vns dDTnPmsQdYJn_soeso56vZEfTCprlxoEC.oe54RxVO.MCDbsyb5ueNhryAhv .DhZCF4pz2xTBJ.kytdy3WIWjGBOCfrP8I2618d_kZ2T4zWanufslIbcXCll Woj_LDw2n76xbbWSVtJh0cX7P7t4IP0Ik7_Z2zWQesBB5a5nKLA8mbUp_Tud dc7YzRm_xl2hsZrjtEZApmXz1wikYxHECba6xM6myezXgNOAe6hI- Received: from [64.238.244.146] by web142505.mail.bf1.yahoo.com via HTTP; Tue, 05 Feb 2013 06:37:05 PST X-Rocket-MIMEInfo: 001.001, QWxsLAoKwqDCoMKgwqAgQW55b25lIHVzZSBtdXRleF9vd25lciBpbiBhIGR0cmFjZSBzY3JpcHQsIGFzIHRoZSBvYnZpb3VzIGRvZXMgbm90IHdvcmsgZm9yIG1lOgoKwqDCoMKgIENvbnRlbnQgb2Ygc3Bpbi5kOgoKIyEvdXNyL3NiaW4vZHRyYWNlIC1xcwoKOjo6KnNwaW4KewpzZWxmLT5tdXRleCA9IChrbXV0ZXhfdCAqKSBhcmcwOwpzZWxmLT5tdXRleF9vd25lciA9IG11dGV4X293bmVyKChrbXV0ZXhfdCAqKSA6c2VsZi0.bXV0ZXgpOwp9CgoKCiMgZHRyYWNlIC1zIHNwaW4uZApkdHJhY2U6IGZhaWxlZCABMAEBAQE- X-Mailer: YahooMailWebService/0.8.132.503 Message-ID: <1360075025.71615.YahooMailNeo@web142505.mail.bf1.yahoo.com> Date: Tue, 5 Feb 2013 06:37:05 -0800 (PST) From: "Dr. 
Baud" Subject: mutex_owner To: "freebsd-hackers@freebsd.org" MIME-Version: 1.0 X-Mailman-Approved-At: Tue, 05 Feb 2013 16:36:17 +0000 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: "Dr. Baud" List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Feb 2013 14:37:08 -0000

All,

    Anyone use mutex_owner in a dtrace script, as the obvious does not work for me:

    Content of spin.d:

#!/usr/sbin/dtrace -qs

:::*spin
{
self->mutex = (kmutex_t *) arg0;
self->mutex_owner = mutex_owner((kmutex_t *) :self->mutex);
}

# dtrace -s spin.d
dtrace: failed to compile script spin.d: line 5: mutex_owner( ) argument #1 is incompatible with prototype:
        prototype: struct mtx *
         argument: kmutex_t *

    Dr.
From owner-freebsd-hackers@FreeBSD.ORG Tue Feb 5 16:38:51 2013 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 2C9A755E; Tue, 5 Feb 2013 16:38:51 +0000 (UTC) (envelope-from alc@rice.edu) Received: from mh11.mail.rice.edu (mh11.mail.rice.edu [128.42.199.30]) by mx1.freebsd.org (Postfix) with ESMTP id 04DBA7C6; Tue, 5 Feb 2013 16:38:50 +0000 (UTC) Received: from mh11.mail.rice.edu (localhost.localdomain [127.0.0.1]) by mh11.mail.rice.edu (Postfix) with ESMTP id 008B44C0665; Tue, 5 Feb 2013 10:38:44 -0600 (CST) Received: from mh11.mail.rice.edu (localhost.localdomain [127.0.0.1]) by mh11.mail.rice.edu (Postfix) with ESMTP id F28994C0653; Tue, 5 Feb 2013 10:38:43 -0600 (CST) X-Virus-Scanned: by amavis-2.7.0 at mh11.mail.rice.edu, auth channel Received: from mh11.mail.rice.edu ([127.0.0.1]) by mh11.mail.rice.edu (mh11.mail.rice.edu [127.0.0.1]) (amavis, port 10026) with ESMTP id 1FBFauo7Y3fz; Tue, 5 Feb 2013 10:38:43 -0600 (CST) Received: from adsl-216-63-78-18.dsl.hstntx.swbell.net (adsl-216-63-78-18.dsl.hstntx.swbell.net [216.63.78.18]) (using TLSv1 with cipher RC4-MD5 (128/128 bits)) (No client certificate requested) (Authenticated sender: alc) by mh11.mail.rice.edu (Postfix) with ESMTPSA id 311A94C0633; Tue, 5 Feb 2013 10:38:43 -0600 (CST) Message-ID: <51113591.8050709@rice.edu> Date: Tue, 05 Feb 2013 10:38:41 -0600 From: Alan Cox User-Agent: Mozilla/5.0 (X11; FreeBSD i386; rv:17.0) Gecko/20130127 Thunderbird/17.0.2 MIME-Version: 1.0 To: mdf@FreeBSD.org Subject: Re: dynamically calculating NKPT [was: Re: huge ktr buffer] References: <20130205151413.GL2522@kib.kiev.ua> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: davide@freebsd.org, alc@freebsd.org, avg@freebsd.org, rank1seeker@gmail.com, hackers@freebsd.org, Konstantin Belousov , Neel Natu X-BeenThere: freebsd-hackers@freebsd.org 
X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Feb 2013 16:38:51 -0000 On 02/05/2013 09:45, mdf@FreeBSD.org wrote: > On Tue, Feb 5, 2013 at 7:14 AM, Konstantin Belousov wrote: >> On Mon, Feb 04, 2013 at 03:05:15PM -0800, Neel Natu wrote: >>> Hi, >>> >>> I have a patch to dynamically calculate NKPT for amd64 kernels. This >>> should fix the various issues that people pointed out in the email >>> thread. >>> >>> Please review and let me know if there are any objections to committing this. >>> >>> Also, thanks to Alan (alc@) for reviewing and providing feedback on >>> the initial version of the patch. >>> >>> Patch (also available at http://people.freebsd.org/~neel/patches/nkpt_diff.txt): >>> >>> Index: sys/amd64/include/pmap.h >>> =================================================================== >>> --- sys/amd64/include/pmap.h (revision 246277) >>> +++ sys/amd64/include/pmap.h (working copy) >>> @@ -113,13 +113,7 @@ >>> ((unsigned long)(l2) << PDRSHIFT) | \ >>> ((unsigned long)(l1) << PAGE_SHIFT)) >>> >>> -/* Initial number of kernel page tables. 
*/ >>> -#ifndef NKPT >>> -#define NKPT 32 >>> -#endif >>> - >>> #define NKPML4E 1 /* number of kernel PML4 slots */ >>> -#define NKPDPE howmany(NKPT, NPDEPG)/* number of kernel PDP slots */ >>> >>> #define NUPML4E (NPML4EPG/2) /* number of userland PML4 pages */ >>> #define NUPDPE (NUPML4E*NPDPEPG)/* number of userland PDP pages */ >>> @@ -181,6 +175,7 @@ >>> #define PML4map ((pd_entry_t *)(addr_PML4map)) >>> #define PML4pml4e ((pd_entry_t *)(addr_PML4pml4e)) >>> >>> +extern int nkpt; /* Initial number of kernel page tables */ >>> extern u_int64_t KPDPphys; /* physical address of kernel level 3 */ >>> extern u_int64_t KPML4phys; /* physical address of kernel level 4 */ >>> >>> Index: sys/amd64/amd64/minidump_machdep.c >>> =================================================================== >>> --- sys/amd64/amd64/minidump_machdep.c (revision 246277) >>> +++ sys/amd64/amd64/minidump_machdep.c (working copy) >>> @@ -232,7 +232,7 @@ >>> /* Walk page table pages, set bits in vm_page_dump */ >>> pmapsize = 0; >>> pdp = (uint64_t *)PHYS_TO_DMAP(KPDPphys); >>> - for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + NKPT * NBPDR, >>> + for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + nkpt * NBPDR, >>> kernel_vm_end); ) { >>> /* >>> * We always write a page, even if it is zero. 
Each >>> @@ -364,7 +364,7 @@ >>> /* Dump kernel page directory pages */ >>> bzero(fakepd, sizeof(fakepd)); >>> pdp = (uint64_t *)PHYS_TO_DMAP(KPDPphys); >>> - for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + NKPT * NBPDR, >>> + for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + nkpt * NBPDR, >>> kernel_vm_end); va += NBPDP) { >>> i = (va >> PDPSHIFT) & ((1ul << NPDPEPGSHIFT) - 1); >>> >>> Index: sys/amd64/amd64/pmap.c >>> =================================================================== >>> --- sys/amd64/amd64/pmap.c (revision 246277) >>> +++ sys/amd64/amd64/pmap.c (working copy) >>> @@ -202,6 +202,10 @@ >>> vm_offset_t virtual_avail; /* VA of first avail page (after kernel bss) */ >>> vm_offset_t virtual_end; /* VA of last avail page (end of kernel AS) */ >>> >>> +int nkpt; >>> +SYSCTL_INT(_machdep, OID_AUTO, nkpt, CTLFLAG_RD, &nkpt, 0, >>> + "Number of kernel page table pages allocated on bootup"); >>> + >>> static int ndmpdp; >>> static vm_paddr_t dmaplimit; >>> vm_offset_t kernel_vm_end = VM_MIN_KERNEL_ADDRESS; >>> @@ -495,17 +499,42 @@ >>> >>> CTASSERT(powerof2(NDMPML4E)); >>> >>> +/* number of kernel PDP slots */ >>> +#define NKPDPE(ptpgs) howmany((ptpgs), NPDEPG) >>> + >>> static void >>> +nkpt_init(vm_paddr_t addr) >>> +{ >>> + int pt_pages; >>> + >>> +#ifdef NKPT >>> + pt_pages = NKPT; >>> +#else >>> + pt_pages = howmany(addr, 1 << PDRSHIFT); >>> + pt_pages += NKPDPE(pt_pages); >>> + >>> + /* >>> + * Add some slop beyond the bare minimum required for bootstrapping >>> + * the kernel. >>> + * >>> + * This is quite important when allocating KVA for kernel modules. >>> + * The modules are required to be linked in the negative 2GB of >>> + * the address space. If we run out of KVA in this region then >>> + * pmap_growkernel() will need to allocate page table pages to map >>> + * the entire 512GB of KVA space which is an unnecessary tax on >>> + * physical memory. 
>>> + */
>>> + pt_pages += 4; /* 8MB additional slop for kernel modules */
>> 8MB might be too low. I just checked one of my machines with fully
>> modularized kernel, it takes slightly more than 6 MB to load 50 modules.
>> I think that 16MB would be safer, but it probably needs to be scaled
>> down based on the available phys memory. amd64 kernel could be booted
>> on 128MB machine still.
> Is there no way to not map the entire 512GB? Otherwise this patch
> could really hose some vendors. E.g. the kernel module for the OneFS
> file system is around 8MB all by itself.

Mapping the entire 512 GB from the start would require the preallocation
of 1 GB of memory for page table pages.

> I found when we moved from FreeBSD 6 to 7 that the NKPT of 32 was
> insufficient for our system to even boot so I put it back to 240 (I
> didn't want to spend a lot of time playing). At that time our module
> was loaded by the boot loader; now we do it during init to save some
> seconds on boot. But we're probably not the only ones with a large
> kernel module.

This patch should make life easier for people who are loading modules
through the boot loader. It will account for the size of these modules
in sizing NKPT (or now nkpt).
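
The figures traded in this exchange (4 extra page table pages giving 8 MB of slop, 16 MB as the safer value, and 1 GB of page table pages to map 512 GB) all follow from amd64 paging geometry. The following is only a user-space sketch of that arithmetic, not code from the patch. It assumes 4 KB pages, a PDRSHIFT of 21 and 512 entries per table, and redefines those constants locally. The helper names nkpt_estimate, slop_bytes and pt_bytes_to_map and the HOWMANY macro are hypothetical, chosen for illustration; HOWMANY stands in for the kernel's howmany().

```c
#include <stdint.h>

/* amd64 paging constants, redefined locally as assumptions for this sketch. */
#define PAGE_SHIFT	12	/* 4 KB pages */
#define PDRSHIFT	21	/* one page table page maps 2 MB of KVA */
#define NPDEPG		512	/* entries per page directory page */

/* Stand-in for the kernel's howmany(): ceiling division. */
#define HOWMANY(x, y)	(((x) + ((y) - 1)) / (y))
/* Same shape as the NKPDPE(ptpgs) macro introduced by the patch. */
#define NKPDPE(ptpgs)	HOWMANY((ptpgs), NPDEPG)

/*
 * Hypothetical helper following the arithmetic of nkpt_init() in the
 * patch for the case where NKPT is not preset: page table pages to
 * cover [0, addr), plus NKPDPE() of that count, plus the slop.
 */
static uint64_t
nkpt_estimate(uint64_t addr, uint64_t slop_pages)
{
	uint64_t pt_pages;

	pt_pages = HOWMANY(addr, (uint64_t)1 << PDRSHIFT);
	pt_pages += NKPDPE(pt_pages);
	pt_pages += slop_pages;
	return (pt_pages);
}

/* KVA covered by the slop pages: 4 pages cover 8 MB, 8 pages cover 16 MB. */
static uint64_t
slop_bytes(uint64_t slop_pages)
{
	return (slop_pages << PDRSHIFT);
}

/*
 * Bytes of leaf page table pages needed to map kva_bytes with 4 KB
 * pages. Page directory pages are not counted here.
 */
static uint64_t
pt_bytes_to_map(uint64_t kva_bytes)
{
	return (HOWMANY(kva_bytes, (uint64_t)1 << PDRSHIFT) << PAGE_SHIFT);
}
```

With an invented boot-time address of 48 MB, nkpt_estimate(48ULL << 20, 4) comes to 24 + 1 + 4 = 29 pages, and 33 with the larger slop of 8 pages. pt_bytes_to_map(512ULL << 30) comes to 262144 pages of 4 KB each, which is the 1 GB quoted above; the 512 page directory pages on top of that would add only 2 MB.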
From owner-freebsd-hackers@FreeBSD.ORG Tue Feb 5 16:48:37 2013 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 0089BEDC; Tue, 5 Feb 2013 16:48:36 +0000 (UTC) (envelope-from alc@rice.edu) Received: from mh3.mail.rice.edu (mh3.mail.rice.edu [128.42.199.10]) by mx1.freebsd.org (Postfix) with ESMTP id CDB7F86F; Tue, 5 Feb 2013 16:48:36 +0000 (UTC) Received: from mh3.mail.rice.edu (localhost.localdomain [127.0.0.1]) by mh3.mail.rice.edu (Postfix) with ESMTP id 060454033C; Tue, 5 Feb 2013 10:48:30 -0600 (CST) Received: from mh3.mail.rice.edu (localhost.localdomain [127.0.0.1]) by mh3.mail.rice.edu (Postfix) with ESMTP id 024814033D; Tue, 5 Feb 2013 10:48:30 -0600 (CST) X-Virus-Scanned: by amavis-2.7.0 at mh3.mail.rice.edu, auth channel Received: from mh3.mail.rice.edu ([127.0.0.1]) by mh3.mail.rice.edu (mh3.mail.rice.edu [127.0.0.1]) (amavis, port 10026) with ESMTP id a1gWwsFap9aw; Tue, 5 Feb 2013 10:48:29 -0600 (CST) Received: from adsl-216-63-78-18.dsl.hstntx.swbell.net (adsl-216-63-78-18.dsl.hstntx.swbell.net [216.63.78.18]) (using TLSv1 with cipher RC4-MD5 (128/128 bits)) (No client certificate requested) (Authenticated sender: alc) by mh3.mail.rice.edu (Postfix) with ESMTPSA id E8FF54033C; Tue, 5 Feb 2013 10:48:28 -0600 (CST) Message-ID: <511137DC.20303@rice.edu> Date: Tue, 05 Feb 2013 10:48:28 -0600 From: Alan Cox User-Agent: Mozilla/5.0 (X11; FreeBSD i386; rv:17.0) Gecko/20130127 Thunderbird/17.0.2 MIME-Version: 1.0 To: Konstantin Belousov Subject: Re: dynamically calculating NKPT [was: Re: huge ktr buffer] References: <20130205151413.GL2522@kib.kiev.ua> <20130205161335.GM2522@kib.kiev.ua> In-Reply-To: <20130205161335.GM2522@kib.kiev.ua> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: davide@freebsd.org, mdf@FreeBSD.org, alc@freebsd.org, avg@freebsd.org, rank1seeker@gmail.com, hackers@freebsd.org, Neel Natu 
X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Feb 2013 16:48:37 -0000 On 02/05/2013 10:13, Konstantin Belousov wrote: > On Tue, Feb 05, 2013 at 07:45:24AM -0800, mdf@FreeBSD.org wrote: >> On Tue, Feb 5, 2013 at 7:14 AM, Konstantin Belousov wrote: >>> On Mon, Feb 04, 2013 at 03:05:15PM -0800, Neel Natu wrote: >>>> Hi, >>>> >>>> I have a patch to dynamically calculate NKPT for amd64 kernels. This >>>> should fix the various issues that people pointed out in the email >>>> thread. >>>> >>>> Please review and let me know if there are any objections to committing this. >>>> >>>> Also, thanks to Alan (alc@) for reviewing and providing feedback on >>>> the initial version of the patch. >>>> >>>> Patch (also available at http://people.freebsd.org/~neel/patches/nkpt_diff.txt): >>>> >>>> Index: sys/amd64/include/pmap.h >>>> =================================================================== >>>> --- sys/amd64/include/pmap.h (revision 246277) >>>> +++ sys/amd64/include/pmap.h (working copy) >>>> @@ -113,13 +113,7 @@ >>>> ((unsigned long)(l2) << PDRSHIFT) | \ >>>> ((unsigned long)(l1) << PAGE_SHIFT)) >>>> >>>> -/* Initial number of kernel page tables. 
*/ >>>> -#ifndef NKPT >>>> -#define NKPT 32 >>>> -#endif >>>> - >>>> #define NKPML4E 1 /* number of kernel PML4 slots */ >>>> -#define NKPDPE howmany(NKPT, NPDEPG)/* number of kernel PDP slots */ >>>> >>>> #define NUPML4E (NPML4EPG/2) /* number of userland PML4 pages */ >>>> #define NUPDPE (NUPML4E*NPDPEPG)/* number of userland PDP pages */ >>>> @@ -181,6 +175,7 @@ >>>> #define PML4map ((pd_entry_t *)(addr_PML4map)) >>>> #define PML4pml4e ((pd_entry_t *)(addr_PML4pml4e)) >>>> >>>> +extern int nkpt; /* Initial number of kernel page tables */ >>>> extern u_int64_t KPDPphys; /* physical address of kernel level 3 */ >>>> extern u_int64_t KPML4phys; /* physical address of kernel level 4 */ >>>> >>>> Index: sys/amd64/amd64/minidump_machdep.c >>>> =================================================================== >>>> --- sys/amd64/amd64/minidump_machdep.c (revision 246277) >>>> +++ sys/amd64/amd64/minidump_machdep.c (working copy) >>>> @@ -232,7 +232,7 @@ >>>> /* Walk page table pages, set bits in vm_page_dump */ >>>> pmapsize = 0; >>>> pdp = (uint64_t *)PHYS_TO_DMAP(KPDPphys); >>>> - for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + NKPT * NBPDR, >>>> + for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + nkpt * NBPDR, >>>> kernel_vm_end); ) { >>>> /* >>>> * We always write a page, even if it is zero. 
Each >>>> @@ -364,7 +364,7 @@ >>>> /* Dump kernel page directory pages */ >>>> bzero(fakepd, sizeof(fakepd)); >>>> pdp = (uint64_t *)PHYS_TO_DMAP(KPDPphys); >>>> - for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + NKPT * NBPDR, >>>> + for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + nkpt * NBPDR, >>>> kernel_vm_end); va += NBPDP) { >>>> i = (va >> PDPSHIFT) & ((1ul << NPDPEPGSHIFT) - 1); >>>> >>>> Index: sys/amd64/amd64/pmap.c >>>> =================================================================== >>>> --- sys/amd64/amd64/pmap.c (revision 246277) >>>> +++ sys/amd64/amd64/pmap.c (working copy) >>>> @@ -202,6 +202,10 @@ >>>> vm_offset_t virtual_avail; /* VA of first avail page (after kernel bss) */ >>>> vm_offset_t virtual_end; /* VA of last avail page (end of kernel AS) */ >>>> >>>> +int nkpt; >>>> +SYSCTL_INT(_machdep, OID_AUTO, nkpt, CTLFLAG_RD, &nkpt, 0, >>>> + "Number of kernel page table pages allocated on bootup"); >>>> + >>>> static int ndmpdp; >>>> static vm_paddr_t dmaplimit; >>>> vm_offset_t kernel_vm_end = VM_MIN_KERNEL_ADDRESS; >>>> @@ -495,17 +499,42 @@ >>>> >>>> CTASSERT(powerof2(NDMPML4E)); >>>> >>>> +/* number of kernel PDP slots */ >>>> +#define NKPDPE(ptpgs) howmany((ptpgs), NPDEPG) >>>> + >>>> static void >>>> +nkpt_init(vm_paddr_t addr) >>>> +{ >>>> + int pt_pages; >>>> + >>>> +#ifdef NKPT >>>> + pt_pages = NKPT; >>>> +#else >>>> + pt_pages = howmany(addr, 1 << PDRSHIFT); >>>> + pt_pages += NKPDPE(pt_pages); >>>> + >>>> + /* >>>> + * Add some slop beyond the bare minimum required for bootstrapping >>>> + * the kernel. >>>> + * >>>> + * This is quite important when allocating KVA for kernel modules. >>>> + * The modules are required to be linked in the negative 2GB of >>>> + * the address space. If we run out of KVA in this region then >>>> + * pmap_growkernel() will need to allocate page table pages to map >>>> + * the entire 512GB of KVA space which is an unnecessary tax on >>>> + * physical memory. 
>>>> + */
>>>> + pt_pages += 4; /* 8MB additional slop for kernel modules */
>>> 8MB might be too low. I just checked one of my machines with fully
>>> modularized kernel, it takes slightly more than 6 MB to load 50 modules.
>>> I think that 16MB would be safer, but it probably needs to be scaled
>>> down based on the available phys memory. amd64 kernel could be booted
>>> on 128MB machine still.
>> Is there no way to not map the entire 512GB? Otherwise this patch
>> could really hose some vendors. E.g. the kernel module for the OneFS
>> file system is around 8MB all by itself.
> No, I do not think that this patch would hose somebody with the 8MB
> module, esp. if the slack is increased.

Agreed. With an increase in the slack, this patch can't possibly harm
anyone. On the other hand, it will eliminate the need for some people
to manually tune NKPT.

> But yes, I believe it is possible to note that the growth happens after
> the KERNBASE point and only allocate the page tables in this region.
> We would need to not update the kernel_vm_end then, probably creating
> some other var to keep track of the other tail.

Yes, this can probably be done. However, what Neel has is already an
improvement, so I see no point in not committing it with an increase in
the slack.

>> I found when we moved from FreeBSD 6 to 7 that the NKPT of 32 was
>> insufficient for our system to even boot so I put it back to 240 (I
>> didn't want to spend a lot of time playing). At that time our module
>> was loaded by the boot loader; now we do it during init to save some
>> seconds on boot. But we're probably not the only ones with a large
>> kernel module.
>> >> Cheers, >> matthew From owner-freebsd-hackers@FreeBSD.ORG Tue Feb 5 16:58:43 2013 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 129665DA; Tue, 5 Feb 2013 16:58:43 +0000 (UTC) (envelope-from neelnatu@gmail.com) Received: from mail-ie0-x22e.google.com (mail-ie0-x22e.google.com [IPv6:2607:f8b0:4001:c03::22e]) by mx1.freebsd.org (Postfix) with ESMTP id B4719912; Tue, 5 Feb 2013 16:58:42 +0000 (UTC) Received: by mail-ie0-f174.google.com with SMTP id k10so502989iea.19 for ; Tue, 05 Feb 2013 08:58:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=IbJwIy1p4vOS7604rZlkF7qU8A/mvqlkKugqGXNL/Bw=; b=tYsgMlLt62Hi/T/27zFO1HYoQk3MJ8u0LLpCEDG/KGt1nl1sNTPgZwwMN0Zw3rNR+D 9winzRrtU15vXXTJwvuLUOnF1wM1NkPKAu6otMLv1WdZm/ZPmwHANStgWhOluqeFL5Oj z4Bz3yfXrDovXzpwu5EZnimc0lsxAvUiHSbq903qMlnRI1+m43v+tPnUIp3NzcOfwnpO PideCity0HI3xN3UTeFawixTx8g8V1SrhtpJIA+LYnftPLQd5KFMWqALCdFkd2Ffv3C9 CmzgWFvmoyWxmPlkjlInUBpDYK7xsC9MXHt+IBUsBWeliK1zz60V0iNuhg2XjtqQhqcg 868g== MIME-Version: 1.0 X-Received: by 10.50.6.230 with SMTP id e6mr14548974iga.3.1360083522234; Tue, 05 Feb 2013 08:58:42 -0800 (PST) Received: by 10.42.23.132 with HTTP; Tue, 5 Feb 2013 08:58:42 -0800 (PST) In-Reply-To: References: <20130205151413.GL2522@kib.kiev.ua> Date: Tue, 5 Feb 2013 08:58:42 -0800 Message-ID: Subject: Re: dynamically calculating NKPT [was: Re: huge ktr buffer] From: Neel Natu To: mdf@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Cc: davide@freebsd.org, alc@freebsd.org, avg@freebsd.org, rank1seeker@gmail.com, hackers@freebsd.org, Konstantin Belousov X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Tue, 05 Feb 2013 16:58:43 -0000 Hi Matthew, On Tue, Feb 5, 2013 at 7:45 AM, wrote: > On Tue, Feb 5, 2013 at 7:14 AM, Konstantin Belousov wrote: >> On Mon, Feb 04, 2013 at 03:05:15PM -0800, Neel Natu wrote: >>> Hi, >>> >>> I have a patch to dynamically calculate NKPT for amd64 kernels. This >>> should fix the various issues that people pointed out in the email >>> thread. >>> >>> Please review and let me know if there are any objections to committing this. >>> >>> Also, thanks to Alan (alc@) for reviewing and providing feedback on >>> the initial version of the patch. >>> >>> Patch (also available at http://people.freebsd.org/~neel/patches/nkpt_diff.txt): >>> >>> Index: sys/amd64/include/pmap.h >>> =================================================================== >>> --- sys/amd64/include/pmap.h (revision 246277) >>> +++ sys/amd64/include/pmap.h (working copy) >>> @@ -113,13 +113,7 @@ >>> ((unsigned long)(l2) << PDRSHIFT) | \ >>> ((unsigned long)(l1) << PAGE_SHIFT)) >>> >>> -/* Initial number of kernel page tables. 
*/ >>> -#ifndef NKPT >>> -#define NKPT 32 >>> -#endif >>> - >>> #define NKPML4E 1 /* number of kernel PML4 slots */ >>> -#define NKPDPE howmany(NKPT, NPDEPG)/* number of kernel PDP slots */ >>> >>> #define NUPML4E (NPML4EPG/2) /* number of userland PML4 pages */ >>> #define NUPDPE (NUPML4E*NPDPEPG)/* number of userland PDP pages */ >>> @@ -181,6 +175,7 @@ >>> #define PML4map ((pd_entry_t *)(addr_PML4map)) >>> #define PML4pml4e ((pd_entry_t *)(addr_PML4pml4e)) >>> >>> +extern int nkpt; /* Initial number of kernel page tables */ >>> extern u_int64_t KPDPphys; /* physical address of kernel level 3 */ >>> extern u_int64_t KPML4phys; /* physical address of kernel level 4 */ >>> >>> Index: sys/amd64/amd64/minidump_machdep.c >>> =================================================================== >>> --- sys/amd64/amd64/minidump_machdep.c (revision 246277) >>> +++ sys/amd64/amd64/minidump_machdep.c (working copy) >>> @@ -232,7 +232,7 @@ >>> /* Walk page table pages, set bits in vm_page_dump */ >>> pmapsize = 0; >>> pdp = (uint64_t *)PHYS_TO_DMAP(KPDPphys); >>> - for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + NKPT * NBPDR, >>> + for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + nkpt * NBPDR, >>> kernel_vm_end); ) { >>> /* >>> * We always write a page, even if it is zero. 
Each >>> @@ -364,7 +364,7 @@ >>> /* Dump kernel page directory pages */ >>> bzero(fakepd, sizeof(fakepd)); >>> pdp = (uint64_t *)PHYS_TO_DMAP(KPDPphys); >>> - for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + NKPT * NBPDR, >>> + for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + nkpt * NBPDR, >>> kernel_vm_end); va += NBPDP) { >>> i = (va >> PDPSHIFT) & ((1ul << NPDPEPGSHIFT) - 1); >>> >>> Index: sys/amd64/amd64/pmap.c >>> =================================================================== >>> --- sys/amd64/amd64/pmap.c (revision 246277) >>> +++ sys/amd64/amd64/pmap.c (working copy) >>> @@ -202,6 +202,10 @@ >>> vm_offset_t virtual_avail; /* VA of first avail page (after kernel bss) */ >>> vm_offset_t virtual_end; /* VA of last avail page (end of kernel AS) */ >>> >>> +int nkpt; >>> +SYSCTL_INT(_machdep, OID_AUTO, nkpt, CTLFLAG_RD, &nkpt, 0, >>> + "Number of kernel page table pages allocated on bootup"); >>> + >>> static int ndmpdp; >>> static vm_paddr_t dmaplimit; >>> vm_offset_t kernel_vm_end = VM_MIN_KERNEL_ADDRESS; >>> @@ -495,17 +499,42 @@ >>> >>> CTASSERT(powerof2(NDMPML4E)); >>> >>> +/* number of kernel PDP slots */ >>> +#define NKPDPE(ptpgs) howmany((ptpgs), NPDEPG) >>> + >>> static void >>> +nkpt_init(vm_paddr_t addr) >>> +{ >>> + int pt_pages; >>> + >>> +#ifdef NKPT >>> + pt_pages = NKPT; >>> +#else >>> + pt_pages = howmany(addr, 1 << PDRSHIFT); >>> + pt_pages += NKPDPE(pt_pages); >>> + >>> + /* >>> + * Add some slop beyond the bare minimum required for bootstrapping >>> + * the kernel. >>> + * >>> + * This is quite important when allocating KVA for kernel modules. >>> + * The modules are required to be linked in the negative 2GB of >>> + * the address space. If we run out of KVA in this region then >>> + * pmap_growkernel() will need to allocate page table pages to map >>> + * the entire 512GB of KVA space which is an unnecessary tax on >>> + * physical memory. 
>>> + */ >>> + pt_pages += 4; /* 8MB additional slop for kernel modules */ >> 8MB might be to low. I just checked one of my machines with fully >> modularized kernel, it takes slightly more than 6 MB to load 50 modules. >> I think that 16MB would be safer, but it probably needs to be scaled >> down based on the available phys memory. amd64 kernel could be booted >> on 128MB machine still. > > Is there no way to not map the entire 512GB? Otherwise this patch > could really hose some vendors. E.g. the kernel module for the OneFS > file system is around 8MB all by itself. > > I found when we moved from FreeBSD 6 to 7 that the NKPT of 32 was > insufficient for our system to even boot so I put it back to 240 (I > didn't want to spend a lot of time playing). At that time our module > was loaded by the boot loader; now we do it during init to save some > seconds on boot. But we're probably not the only ones with a large > kernel module. > I work for a vendor with the same feature :-) I don't think it will break your use case - if you override NKPT in your kernel config file there will not be any dynamic calculation of NKPT. It will accept whatever value you have configured without any modification (240 in your case). 
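For readers following the arithmetic, the estimate made by the quoted nkpt_init() can be sketched in userland C roughly as below. This is only an illustration under stated assumptions, not the committed code: PDRSHIFT and NPDEPG are given their usual amd64 values, while nkpt_estimate(), its nkpt_override argument (standing in for a compile-time NKPT) and slop_pages are hypothetical names introduced here.

```c
#include <stdint.h>

#define PDRSHIFT 21                 /* assumed amd64 value: one page table page maps 2MB */
#define NPDEPG   512                /* assumed amd64 value: entries per page directory page */
#define HOWMANY(x, y) (((x) + ((y) - 1)) / (y))

/*
 * Hypothetical userland model of the patch's nkpt_init().
 * addr:          first free physical address after early allocations
 * nkpt_override: > 0 models "NKPT defined in the kernel config"
 * slop_pages:    extra page table pages (4 in the patch, 8 proposed later)
 */
static int
nkpt_estimate(uint64_t addr, int nkpt_override, int slop_pages)
{
	int pt_pages;

	/* A configured NKPT is taken as-is; nothing is calculated. */
	if (nkpt_override > 0)
		return (nkpt_override);

	/* One page table page per 2MB up to addr. */
	pt_pages = (int)HOWMANY(addr, (uint64_t)1 << PDRSHIFT);
	/* Same extra term as the patch: NKPDPE(pt_pages). */
	pt_pages += HOWMANY(pt_pages, NPDEPG);
	/* Slop for kernel modules, 2MB of KVA per page. */
	pt_pages += slop_pages;
	return (pt_pages);
}
```

With these assumptions, a first free address of 64MB and a slop of 4 pages gives 32 + 1 + 4 pages, while any positive override (240 in the case discussed) is returned unchanged.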
best Neel > Cheers, > matthew From owner-freebsd-hackers@FreeBSD.ORG Tue Feb 5 17:12:42 2013 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id E90C1B9E; Tue, 5 Feb 2013 17:12:42 +0000 (UTC) (envelope-from neelnatu@gmail.com) Received: from mail-ie0-x22b.google.com (ie-in-x022b.1e100.net [IPv6:2607:f8b0:4001:c03::22b]) by mx1.freebsd.org (Postfix) with ESMTP id 7A3C39E0; Tue, 5 Feb 2013 17:12:42 +0000 (UTC) Received: by mail-ie0-f171.google.com with SMTP id 10so541108ied.2 for ; Tue, 05 Feb 2013 09:12:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=pPawft0ejh5FStjKuVEev75Wc/tPMWbLg3Lmwls9Uq4=; b=gMuq+mbQFL23xP4oxjyNfib/GbDCoB8e3HSDhN4j/vBjZgCVwZgsdfsMicJDlAtlY2 TAbSPM2SDd0RsjlN0EfiyI0vzSscEmpi6xi8ZJxA+Pqa/1WERA14F4oUP9Qy2ffcPF47 ufhRHfmESHrvbB4Xn8GygyVHR2y7rMLTOSzZGzLDZ7Vv974tz6Tw1bwJ8bEk5+UQlP00 KvwEcYJvH7tGcOAn0bIBbELqR3lwB/oDjJCWPsOKOXn3UNh5rYBtAnOg/2sB3kJBOsoo IVs7B7/k5QiSR5evRxImHgIyP/hnW1wCQHMhZLwX/qXt7rzB5C/sHXBjXrNZXX1XjOXm rp9w== MIME-Version: 1.0 X-Received: by 10.50.161.135 with SMTP id xs7mr14638188igb.3.1360084362040; Tue, 05 Feb 2013 09:12:42 -0800 (PST) Received: by 10.42.23.132 with HTTP; Tue, 5 Feb 2013 09:12:41 -0800 (PST) In-Reply-To: <20130205151413.GL2522@kib.kiev.ua> References: <20130205151413.GL2522@kib.kiev.ua> Date: Tue, 5 Feb 2013 09:12:41 -0800 Message-ID: Subject: Re: dynamically calculating NKPT [was: Re: huge ktr buffer] From: Neel Natu To: Konstantin Belousov Content-Type: text/plain; charset=ISO-8859-1 Cc: alc@freebsd.org, davide@freebsd.org, hackers@freebsd.org, avg@freebsd.org, rank1seeker@gmail.com X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: 
List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Feb 2013 17:12:43 -0000 Hi Konstantin, On Tue, Feb 5, 2013 at 7:14 AM, Konstantin Belousov wrote: > On Mon, Feb 04, 2013 at 03:05:15PM -0800, Neel Natu wrote: >> Hi, >> >> I have a patch to dynamically calculate NKPT for amd64 kernels. This >> should fix the various issues that people pointed out in the email >> thread. >> >> Please review and let me know if there are any objections to committing this. >> >> Also, thanks to Alan (alc@) for reviewing and providing feedback on >> the initial version of the patch. >> >> Patch (also available at http://people.freebsd.org/~neel/patches/nkpt_diff.txt): >> >> Index: sys/amd64/include/pmap.h >> =================================================================== >> --- sys/amd64/include/pmap.h (revision 246277) >> +++ sys/amd64/include/pmap.h (working copy) >> @@ -113,13 +113,7 @@ >> ((unsigned long)(l2) << PDRSHIFT) | \ >> ((unsigned long)(l1) << PAGE_SHIFT)) >> >> -/* Initial number of kernel page tables. 
*/ >> -#ifndef NKPT >> -#define NKPT 32 >> -#endif >> - >> #define NKPML4E 1 /* number of kernel PML4 slots */ >> -#define NKPDPE howmany(NKPT, NPDEPG)/* number of kernel PDP slots */ >> >> #define NUPML4E (NPML4EPG/2) /* number of userland PML4 pages */ >> #define NUPDPE (NUPML4E*NPDPEPG)/* number of userland PDP pages */ >> @@ -181,6 +175,7 @@ >> #define PML4map ((pd_entry_t *)(addr_PML4map)) >> #define PML4pml4e ((pd_entry_t *)(addr_PML4pml4e)) >> >> +extern int nkpt; /* Initial number of kernel page tables */ >> extern u_int64_t KPDPphys; /* physical address of kernel level 3 */ >> extern u_int64_t KPML4phys; /* physical address of kernel level 4 */ >> >> Index: sys/amd64/amd64/minidump_machdep.c >> =================================================================== >> --- sys/amd64/amd64/minidump_machdep.c (revision 246277) >> +++ sys/amd64/amd64/minidump_machdep.c (working copy) >> @@ -232,7 +232,7 @@ >> /* Walk page table pages, set bits in vm_page_dump */ >> pmapsize = 0; >> pdp = (uint64_t *)PHYS_TO_DMAP(KPDPphys); >> - for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + NKPT * NBPDR, >> + for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + nkpt * NBPDR, >> kernel_vm_end); ) { >> /* >> * We always write a page, even if it is zero. 
Each >> @@ -364,7 +364,7 @@ >> /* Dump kernel page directory pages */ >> bzero(fakepd, sizeof(fakepd)); >> pdp = (uint64_t *)PHYS_TO_DMAP(KPDPphys); >> - for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + NKPT * NBPDR, >> + for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + nkpt * NBPDR, >> kernel_vm_end); va += NBPDP) { >> i = (va >> PDPSHIFT) & ((1ul << NPDPEPGSHIFT) - 1); >> >> Index: sys/amd64/amd64/pmap.c >> =================================================================== >> --- sys/amd64/amd64/pmap.c (revision 246277) >> +++ sys/amd64/amd64/pmap.c (working copy) >> @@ -202,6 +202,10 @@ >> vm_offset_t virtual_avail; /* VA of first avail page (after kernel bss) */ >> vm_offset_t virtual_end; /* VA of last avail page (end of kernel AS) */ >> >> +int nkpt; >> +SYSCTL_INT(_machdep, OID_AUTO, nkpt, CTLFLAG_RD, &nkpt, 0, >> + "Number of kernel page table pages allocated on bootup"); >> + >> static int ndmpdp; >> static vm_paddr_t dmaplimit; >> vm_offset_t kernel_vm_end = VM_MIN_KERNEL_ADDRESS; >> @@ -495,17 +499,42 @@ >> >> CTASSERT(powerof2(NDMPML4E)); >> >> +/* number of kernel PDP slots */ >> +#define NKPDPE(ptpgs) howmany((ptpgs), NPDEPG) >> + >> static void >> +nkpt_init(vm_paddr_t addr) >> +{ >> + int pt_pages; >> + >> +#ifdef NKPT >> + pt_pages = NKPT; >> +#else >> + pt_pages = howmany(addr, 1 << PDRSHIFT); >> + pt_pages += NKPDPE(pt_pages); >> + >> + /* >> + * Add some slop beyond the bare minimum required for bootstrapping >> + * the kernel. >> + * >> + * This is quite important when allocating KVA for kernel modules. >> + * The modules are required to be linked in the negative 2GB of >> + * the address space. If we run out of KVA in this region then >> + * pmap_growkernel() will need to allocate page table pages to map >> + * the entire 512GB of KVA space which is an unnecessary tax on >> + * physical memory. >> + */ >> + pt_pages += 4; /* 8MB additional slop for kernel modules */ > 8MB might be to low. 
I just checked one of my machines with fully > modularized kernel, it takes slightly more than 6 MB to load 50 modules. > I think that 16MB would be safer, but it probably needs to be scaled > down based on the available phys memory. amd64 kernel could be booted > on 128MB machine still. Sounds fine. I can bump it up to 8 pages. Also, wrt your comment about scaling this number based on available memory, I wonder if it makes sense to optimize for 16KB of additional space. I would much rather work with you and Alan to fix pmap_growkernel() so we don't need to care about this slack in the first place :-) best Neel From owner-freebsd-hackers@FreeBSD.ORG Tue Feb 5 17:18:20 2013 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id C1CF3F5A; Tue, 5 Feb 2013 17:18:20 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) by mx1.freebsd.org (Postfix) with ESMTP id 39000A35; Tue, 5 Feb 2013 17:18:20 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.6/8.14.6) with ESMTP id r15HIDKv022216; Tue, 5 Feb 2013 19:18:13 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.7.4 kib.kiev.ua r15HIDKv022216 Received: (from kostik@localhost) by tom.home (8.14.6/8.14.6/Submit) id r15HIDLP022215; Tue, 5 Feb 2013 19:18:13 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Tue, 5 Feb 2013 19:18:13 +0200 From: Konstantin Belousov To: Neel Natu Subject: Re: dynamically calculating NKPT [was: Re: huge ktr buffer] Message-ID: <20130205171813.GP2522@kib.kiev.ua> References: <20130205151413.GL2522@kib.kiev.ua> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="iXvAUMgb137SbSJR" Content-Disposition: inline In-Reply-To: 
User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on tom.home Cc: alc@freebsd.org, davide@freebsd.org, hackers@freebsd.org, avg@freebsd.org, rank1seeker@gmail.com X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Feb 2013 17:18:20 -0000 --iXvAUMgb137SbSJR Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Tue, Feb 05, 2013 at 09:12:41AM -0800, Neel Natu wrote:
> Hi Konstantin,
>
> Sounds fine. I can bump it up to 8 pages.
>
> Also, wrt your comment about scaling this number based on available
> memory, I wonder if it makes sense to optimize for 16KB of additional
> space.
If it boots on 128MB as is, I am completely fine with the existing
patch (after slack increase) as well.

>
> I would much rather work with you and Alan to fix pmap_growkernel() so
> we don't need to care about this slack in the first place :-)
Sure, if you consider this important enough.

I do agree with mdf that having this happen without any user intervention
is good. But having just a tunable for the slack might be much less
labor-intensive and good enough.
--iXvAUMgb137SbSJR Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iQIcBAEBAgAGBQJRET7UAAoJEJDCuSvBvK1B3gsP/2uBt9C/AQ4ssADWSqdw/5+s 3fkLNbqhuse/kDHgbhVOBRqRtLr98epNsdu+TGiTyISufUpUhYIl/ou6OSkY/V0N 0AdTNZnCmYVagtDwG11hIibI9lI76FBQi6v1BwQJQ0B+D929OTzstObqDJMUqPx0 fjkrIr1SodJuZt2VBwN/X4DY+wXUFm82QVk3JJkEl60GEy5sxp5T/7SnGm8WJbQk j23Q+syaxL58RHHMKO/6UIZoHjO3u711YPe2E9KYC4MsIFpNfH/a0yXOT9DSepiQ Tz852Cj1U6PHy66BXPyaJaKimGgc1cLxUapuWfa08OI3r3/fT2hMk9FFVcG8/cJh yODPh6vxFtlnjlioz2i2SkQclo28TYoH80ncxQQQoaCpi2tectJIqDCLvz5Fe2Eu uyr/h1ONzFwEh0nOIjhxX1fTlOPfqepUFEuLS0gX9LtH+LtoAYyonphnZUmKkcgE rwLSKJ9tOC4pXZ4l6diEZHp7MxueGypjsK50nTy1aaVgbUBTJgzFdCR8Scn4anit 8yrmrFFN8FiCpGnMJA+KWs8yUzHaGLqAi2L3PryloGUDJ88Juyf52lLt0Vnie2Hy kESANulvu6GGPOUeuWp9rdrNMqcUxRmx3Xvl0lj1ANLT9upSQQVspnbbuWv/oBKz fYXU7ddH+u51U0FhluFN =2QJB -----END PGP SIGNATURE----- --iXvAUMgb137SbSJR-- From owner-freebsd-hackers@FreeBSD.ORG Wed Feb 6 04:41:42 2013 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 5FB814E2 for ; Wed, 6 Feb 2013 04:41:42 +0000 (UTC) (envelope-from ian@FreeBSD.org) Received: from duck.symmetricom.us (duck.symmetricom.us [206.168.13.214]) by mx1.freebsd.org (Postfix) with ESMTP id 3FA46C0F for ; Wed, 6 Feb 2013 04:41:41 +0000 (UTC) Received: from damnhippie.dyndns.org (daffy.symmetricom.us [206.168.13.218]) by duck.symmetricom.us (8.14.6/8.14.6) with ESMTP id r164ffFR003702 for ; Tue, 5 Feb 2013 21:41:41 -0700 (MST) (envelope-from ian@FreeBSD.org) Received: from [172.22.42.240] (revolution.hippie.lan [172.22.42.240]) by damnhippie.dyndns.org (8.14.3/8.14.3) with ESMTP id r164fcRL032480 for ; Tue, 5 Feb 2013 21:41:38 -0700 (MST) (envelope-from ian@FreeBSD.org) Subject: Request for review, time_pps_fetch() enhancement From: Ian Lepore To: "freebsd-hackers@freebsd.org" Content-Type: multipart/mixed; 
boundary="=-6J1XeQ0MmbsUd9NHPIee" Date: Tue, 05 Feb 2013 21:41:38 -0700 Message-ID: <1360125698.93359.566.camel@revolution.hippie.lan> Mime-Version: 1.0 X-Mailer: Evolution 2.32.1 FreeBSD GNOME Team Port X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Feb 2013 04:41:42 -0000 --=-6J1XeQ0MmbsUd9NHPIee Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit I'd like feedback on the attached patch, which adds support to our time_pps_fetch() implementation for the blocking behaviors described in section 3.4.3 of RFC 2783. The existing implementation can only return the most recently captured data without blocking. These changes add the ability to block (forever or with timeout) until a new event occurs. -- Ian --=-6J1XeQ0MmbsUd9NHPIee Content-Disposition: inline; filename="pps_fetchwait.diff" Content-Type: text/x-patch; name="pps_fetchwait.diff"; charset="us-ascii" Content-Transfer-Encoding: 7bit Index: sys/kern/kern_tc.c =================================================================== --- sys/kern/kern_tc.c (revision 246337) +++ sys/kern/kern_tc.c (working copy) @@ -1446,6 +1446,50 @@ * RFC 2783 PPS-API implementation. */ +static int +pps_fetch(struct pps_fetch_args *fapi, struct pps_state *pps) +{ + int err, timo; + pps_seq_t aseq, cseq; + struct timeval tv; + + if (fapi->tsformat && fapi->tsformat != PPS_TSFMT_TSPEC) + return (EINVAL); + + /* + * If no timeout is requested, immediately return whatever values were + * most recently captured. If timeout seconds is -1, that's a request + * to block without a timeout. WITNESS won't let us sleep forever + * without a lock (we really don't need a lock), so just repeatedly + * sleep a long time. 
+ */ + if (fapi->timeout.tv_sec || fapi->timeout.tv_nsec) { + if (fapi->timeout.tv_sec == -1) + timo = 0x7fffffff; + else { + tv.tv_sec = fapi->timeout.tv_sec; + tv.tv_usec = fapi->timeout.tv_nsec / 1000; + timo = tvtohz(&tv); + } + aseq = pps->ppsinfo.assert_sequence; + cseq = pps->ppsinfo.clear_sequence; + while (aseq == pps->ppsinfo.assert_sequence && + cseq == pps->ppsinfo.clear_sequence) { + err = tsleep(pps, PCATCH, "ppsfch", timo); + if (err == EWOULDBLOCK && fapi->timeout.tv_sec == -1) { + continue; + } else if (err != 0) { + return (err); + } + } + } + + pps->ppsinfo.current_mode = pps->ppsparam.mode; + fapi->pps_info_buf = pps->ppsinfo; + + return (0); +} + int pps_ioctl(u_long cmd, caddr_t data, struct pps_state *pps) { @@ -1485,13 +1529,7 @@ return (0); case PPS_IOC_FETCH: fapi = (struct pps_fetch_args *)data; - if (fapi->tsformat && fapi->tsformat != PPS_TSFMT_TSPEC) - return (EINVAL); - if (fapi->timeout.tv_sec || fapi->timeout.tv_nsec) - return (EOPNOTSUPP); - pps->ppsinfo.current_mode = pps->ppsparam.mode; - fapi->pps_info_buf = pps->ppsinfo; - return (0); + return (pps_fetch(fapi, pps)); #ifdef FFCLOCK case PPS_IOC_FETCH_FFCOUNTER: fapi_ffc = (struct pps_fetch_ffc_args *)data; @@ -1540,7 +1578,7 @@ void pps_init(struct pps_state *pps) { - pps->ppscap |= PPS_TSFMT_TSPEC; + pps->ppscap |= PPS_TSFMT_TSPEC | PPS_CANWAIT; if (pps->ppscap & PPS_CAPTUREASSERT) pps->ppscap |= PPS_OFFSETASSERT; if (pps->ppscap & PPS_CAPTURECLEAR) @@ -1680,6 +1718,9 @@ hardpps(tsp, ts.tv_nsec + 1000000000 * ts.tv_sec); } #endif + + /* Wakeup anyone sleeping in pps_fetch(). 
*/ + wakeup(pps); } /* --=-6J1XeQ0MmbsUd9NHPIee-- From owner-freebsd-hackers@FreeBSD.ORG Wed Feb 6 09:54:44 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 2A145FBF for ; Wed, 6 Feb 2013 09:54:44 +0000 (UTC) (envelope-from lsanfil@marvell.com) Received: from na3sys009aog110.obsmtp.com (na3sys009aog110.obsmtp.com [74.125.149.203]) by mx1.freebsd.org (Postfix) with ESMTP id E6E3BD84 for ; Wed, 6 Feb 2013 09:54:43 +0000 (UTC) Received: from SC-OWA.marvell.com ([199.233.58.135]) (using TLSv1) by na3sys009aob110.postini.com ([74.125.148.12]) with SMTP ID DSNKURIoYxq1QsyWfY8xnIs+dQihIDKEmyCA@postini.com; Wed, 06 Feb 2013 01:54:43 PST Received: from SC-VEXCH4.marvell.com ([::1]) by SC-OWA.marvell.com ([::1]) with mapi; Wed, 6 Feb 2013 01:50:41 -0800 From: Lino Sanfilippo To: "freebsd-hackers@freebsd.org" Date: Wed, 6 Feb 2013 01:50:39 -0800 Subject: Mbuf memory handling Thread-Topic: Mbuf memory handling Thread-Index: Ac4ET2tf3PBUzyIhR66C8VAs8jlWQQ== Message-ID: <175CCF5F49938B4D99B2E3EF7F558EBE1C73F401F3@SC-VEXCH4.marvell.com> Accept-Language: de-DE, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: acceptlanguage: de-DE, en-US MIME-Version: 1.0 X-Mailman-Approved-At: Wed, 06 Feb 2013 12:30:07 +0000 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: Axel Fischer , Markus Althoff X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Feb 2013 09:54:44 -0000 Hi all, I want to implement a device driver for a NIC which stores received data into chunks within a page (>=4k) in host memory.
One page shall be used for multiple packets and freed after all mbufs linked to that page have been processed. So I would like to know: what is the recommended way to handle this in FreeBSD? Any hints are much appreciated. Regards, Lino From owner-freebsd-hackers@FreeBSD.ORG Wed Feb 6 12:50:51 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id CC46FD93 for ; Wed, 6 Feb 2013 12:50:51 +0000 (UTC) (envelope-from andrey@zonov.org) Received: from mail-lb0-f169.google.com (mail-lb0-f169.google.com [209.85.217.169]) by mx1.freebsd.org (Postfix) with ESMTP id 2BE8C801 for ; Wed, 6 Feb 2013 12:50:50 +0000 (UTC) Received: by mail-lb0-f169.google.com with SMTP id m4so1159004lbo.28 for ; Wed, 06 Feb 2013 04:50:49 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-received:sender:message-id:date:from:user-agent:mime-version:to :cc:subject:references:in-reply-to:x-enigmail-version:content-type :x-gm-message-state; bh=nlOeQiygQmgSZwgXLwE+kOnAj3MtrgccQgXlvlECH2E=; b=ZF7O2WxZqoz+phX6pLtJKkAxe9TkbEe7BhkNoMFcwF38TEF/IkIdCZwCAoUTlFvdhl g+zTyFovARJX9e+Pz5B3ootK4rCEb6udbvBMoW6DtyNVzW0Y06O86icCG21YZ6SzqN5j F9HqgtkbnoyoJx4ravts1cVlCiSc95C4qfjQFmp31dTFeLKHyaySiJZXUaGxrtLiuVIX jMOiTMw9voSOeSFZorT6tGgUlzTL+/SytPQaoWnLqZ6AJa0AIlsdseJ6bkRrxC7QhPix 5y60G9PK9jEdsmHJ8aXpqXjb2UUjNNV0lb1YYe6Q6Gk9tndTm7BP6P3DjacbZD7F3YUU KMoA== X-Received: by 10.112.43.232 with SMTP id z8mr3649385lbl.135.1360155049591; Wed, 06 Feb 2013 04:50:49 -0800 (PST) Received: from dhcp170-82-red.yandex.net (dhcp170-82-red.yandex.net.
[95.108.170.82]) by mx.google.com with ESMTPS id t7sm7761958lbf.12.2013.02.06.04.50.47 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Wed, 06 Feb 2013 04:50:48 -0800 (PST) Sender: Andrey Zonov Message-ID: <511251A3.1090404@FreeBSD.org> Date: Wed, 06 Feb 2013 16:50:43 +0400 From: Andrey Zonov User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:17.0) Gecko/20130107 Thunderbird/17.0.2 MIME-Version: 1.0 To: "Dr. Baud" Subject: Re: mutex_owner References: <1360075025.71615.YahooMailNeo@web142505.mail.bf1.yahoo.com> In-Reply-To: <1360075025.71615.YahooMailNeo@web142505.mail.bf1.yahoo.com> X-Enigmail-Version: 1.5 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="----enig2DXMVDQTOEQMUVFRHTIKI" X-Gm-Message-State: ALoCoQlSYjOeD1owKtvUAotQywR6gAnoKC6OygQPGvSGTIqx8AAV8TmAJBv6uVNgAluubCg7+wlt Cc: "freebsd-hackers@freebsd.org" X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Feb 2013 12:50:51 -0000 This is an OpenPGP/MIME signed message (RFC 4880 and 3156) ------enig2DXMVDQTOEQMUVFRHTIKI Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable

On 2/5/13 6:37 PM, Dr. Baud wrote:
> All,
>
> Anyone use mutex_owner in a dtrace script, as the obvious does not work for me:
>
> Content of spin.d:
>
> #!/usr/sbin/dtrace -qs
>
> :::*spin
> {
>     self->mutex = (kmutex_t *) arg0;
>     self->mutex_owner = mutex_owner((kmutex_t *) :self->mutex);
> }
>

The lock implementation in FreeBSD differs from the one in Solaris. The script below should do what you want.

:::*spin
{
	self->mtx = (struct mtx *)arg0;
	self->mtx_owner = mutex_owner(self->mtx);
}

You can find the implementation details of mutexes in sys/sys/_mutex.h, sys/sys/mutex.h, and sys/kern/kern_mutex.c.
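As a rough illustration of why the FreeBSD structure differs from a Solaris kmutex_t, the owner of a default mutex can be modelled in userland C as below. This is a sketch under assumptions: the flag values are written as recalled from sys/sys/mutex.h of that era and should be checked against the tree in use, and struct fake_mtx and fake_mtx_owner() are hypothetical stand-ins, not kernel names.

```c
#include <stdint.h>

/* Assumed flag bits kept in the low bits of the lock word; verify in sys/sys/mutex.h. */
#define MTX_RECURSED  0x00000001u
#define MTX_CONTESTED 0x00000002u
#define MTX_UNOWNED   0x00000004u
#define MTX_FLAGMASK  (MTX_RECURSED | MTX_CONTESTED | MTX_UNOWNED)

/* Hypothetical userland stand-in for struct mtx: only the lock word is modelled. */
struct fake_mtx {
	volatile uintptr_t mtx_lock;
};

/*
 * The owning thread pointer and the state flags share one word, so the
 * owner is recovered by masking the flag bits off.
 */
static uintptr_t
fake_mtx_owner(const struct fake_mtx *m)
{
	return (m->mtx_lock & ~(uintptr_t)MTX_FLAGMASK);
}
```

The point of the sketch is only that the owner is derived from the lock word rather than stored in a separate field, which is why the cast in the script has to name struct mtx.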
--=20 Andrey Zonov ------enig2DXMVDQTOEQMUVFRHTIKI Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG/MacGPG2 v2.0.18 (Darwin) Comment: GPGTools - http://gpgtools.org iQEcBAEBAgAGBQJRElGmAAoJEBWLemxX/CvTsykH/39B7lxGuYTTw0Yfj6NWu5EN XiJoNDRpr6H/14grx8Ps1rGlL3nMc5lsrZeEREIpbQPaEHsiu+SfFqJxt/BlQZZv KAjkvVj5mimDQFIZSBMsFORU7feAKLVAcWMo9WtsyoXkeqjYUrvr9KqJqNGvPr/s x2W43aAQWLF5MWta9cjYapzrumaV9s/hamIO0sNcXFeaRkAStXrp1qhy7Uf6gDN6 TCgCkHzid/WkwE/FjzBPHVX0LPUbYpCvqisCd8D2UniWQb2FR8G2b3W66eIxgDAX tUPYcUI7YQGulr1zNOpfj72dVdt2FXASkuDMdMiD5jXZyNAOdasTRqekm58dd8g= =46fn -----END PGP SIGNATURE----- ------enig2DXMVDQTOEQMUVFRHTIKI-- From owner-freebsd-hackers@FreeBSD.ORG Wed Feb 6 14:05:10 2013 Return-Path: Delivered-To: hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 70A9840D; Wed, 6 Feb 2013 14:05:10 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 3C8D1BF3; Wed, 6 Feb 2013 14:05:08 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id QAA16906; Wed, 06 Feb 2013 16:05:06 +0200 (EET) (envelope-from avg@FreeBSD.org) Message-ID: <51126311.4060907@FreeBSD.org> Date: Wed, 06 Feb 2013 16:05:05 +0200 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130206 Thunderbird/17.0.2 MIME-Version: 1.0 To: Neel Natu Subject: Re: dynamically calculating NKPT [was: Re: huge ktr buffer] References: In-Reply-To: X-Enigmail-Version: 1.4.6 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Cc: alc@FreeBSD.org, davide@FreeBSD.org, hackers@FreeBSD.org X-BeenThere: freebsd-hackers@freebsd.org 
X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Feb 2013 14:05:10 -0000 on 05/02/2013 01:05 Neel Natu said the following: > Hi, > > I have a patch to dynamically calculate NKPT for amd64 kernels. This > should fix the various issues that people pointed out in the email > thread. > > Please review and let me know if there are any objections to committing this. > > Also, thanks to Alan (alc@) for reviewing and providing feedback on > the initial version of the patch. > > Patch (also available at http://people.freebsd.org/~neel/patches/nkpt_diff.txt): It seems I am a little bit late with a review, but not too late :-) Some comments below: > Index: sys/amd64/include/pmap.h > =================================================================== > --- sys/amd64/include/pmap.h (revision 246277) > +++ sys/amd64/include/pmap.h (working copy) > @@ -113,13 +113,7 @@ > ((unsigned long)(l2) << PDRSHIFT) | \ > ((unsigned long)(l1) << PAGE_SHIFT)) > > -/* Initial number of kernel page tables. */ > -#ifndef NKPT > -#define NKPT 32 > -#endif I think that we could still keep this, if the below code is done slightly different: [snip] > +/* number of kernel PDP slots */ > +#define NKPDPE(ptpgs) howmany((ptpgs), NPDEPG) > + > static void > +nkpt_init(vm_paddr_t addr) > +{ > + int pt_pages; > + > +#ifdef NKPT > + pt_pages = NKPT; > +#else > + pt_pages = howmany(addr, 1 << PDRSHIFT); A very minor cosmetic note: perhaps NBPDR would look more concise here. > + pt_pages += NKPDPE(pt_pages); > + > + /* > + * Add some slop beyond the bare minimum required for bootstrapping > + * the kernel. > + * > + * This is quite important when allocating KVA for kernel modules. > + * The modules are required to be linked in the negative 2GB of > + * the address space. 
If we run out of KVA in this region then > + * pmap_growkernel() will need to allocate page table pages to map > + * the entire 512GB of KVA space which is an unnecessary tax on > + * physical memory. > + */ > + pt_pages += 4; /* 8MB additional slop for kernel modules */ > +#endif > + nkpt = pt_pages; > +} I would slightly re-organize this code so that it uses NKPT, if defined, as a default value for nkpt. Then, only if the calculated value is greater then it would override the default. There are tradeoffs, of course. So I am just voicing my opinion/preference. The "slack" thing is a little bit imperfect, but I am not a perfectionist :-) Thank you very much for this great feature. > +static void > create_pagetables(vm_paddr_t *firstaddr) > { > - int i, j, ndm1g; > + int i, j, ndm1g, nkpdpe; > > - /* Allocate pages */ > - KPTphys = allocpages(firstaddr, NKPT); > - KPML4phys = allocpages(firstaddr, 1); > - KPDPphys = allocpages(firstaddr, NKPML4E); > - KPDphys = allocpages(firstaddr, NKPDPE); > - > + /* Allocate page table pages for the direct map */ > ndmpdp = (ptoa(Maxmem) + NBPDP - 1) >> PDPSHIFT; > if (ndmpdp < 4) /* Minimum 4GB of dirmap */ > ndmpdp = 4; > @@ -517,6 +546,22 @@ > DMPDphys = allocpages(firstaddr, ndmpdp - ndm1g); > dmaplimit = (vm_paddr_t)ndmpdp << PDPSHIFT; > > + /* Allocate pages */ > + KPML4phys = allocpages(firstaddr, 1); > + KPDPphys = allocpages(firstaddr, NKPML4E); > + > + /* > + * Allocate the initial number of kernel page table pages required to > + * bootstrap. We defer this until after all memory-size dependent > + * allocations are done (e.g. direct map), so that we don't have to > + * build in too much slop in our estimate. 
> + */ > + nkpt_init(*firstaddr); > + nkpdpe = NKPDPE(nkpt); > + > + KPTphys = allocpages(firstaddr, nkpt); > + KPDphys = allocpages(firstaddr, nkpdpe); > + > /* Fill in the underlying page table pages */ > /* Read-only from zero to physfree */ > /* XXX not fully used, underneath 2M pages */ -- Andriy Gapon From owner-freebsd-hackers@FreeBSD.ORG Wed Feb 6 14:21:07 2013 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id F0854C4F; Wed, 6 Feb 2013 14:21:07 +0000 (UTC) (envelope-from melifaro@FreeBSD.org) Received: from mail.ipfw.ru (unknown [IPv6:2a01:4f8:120:6141::2]) by mx1.freebsd.org (Postfix) with ESMTP id BB139DDB; Wed, 6 Feb 2013 14:21:07 +0000 (UTC) Received: from v6.mpls.in ([2a02:978:2::5] helo=ws.su29.net) by mail.ipfw.ru with esmtpsa (TLSv1:CAMELLIA256-SHA:256) (Exim 4.76 (FreeBSD)) (envelope-from ) id 1U35vR-000HbU-Mn; Wed, 06 Feb 2013 18:24:37 +0400 Message-ID: <5112666F.3050904@FreeBSD.org> Date: Wed, 06 Feb 2013 18:19:27 +0400 From: "Alexander V. Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:9.0) Gecko/20120121 Thunderbird/9.0 MIME-Version: 1.0 To: net@freebsd.org, freebsd-hackers@FreeBSD.org Subject: Make kernel aware of NIC queues Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Feb 2013 14:21:08 -0000

Hello list!

Today more and more NICs are capable of splitting traffic across different RX/TX rings, permitting the OS to dispatch this traffic on different CPU cores. However, some problems arise from using multi-NIC (or even single multi-port NIC) configurations.

Typical (OS) questions are:
* how many queues should we allocate per port?
* how should we mark packets received on a given queue?
* What traffic pattern is the NIC used for: should we bind queues to CPU
  cores and, if so, to which ones?

Currently, there is some "AI" (heuristics) implemented in the Intel drivers,
such as:
* use the maximum number of available queues if the CPU has a large number
  of cores
* bind every queue to a CPU core sequentially.

Problems with this (and probably with any AI) are:

* Which NICs (ports) will _actually_ be used?
E.g.: I have an 8-core system with a dual-port Intel 82576 NIC (which is
capable of using 8 RX queues per port). If only one port is used, I can
allocate 8 (or 7) queues and bind them to given cores, which is generally
good for forwarding traffic. For 2-port setups it is probably better to set
up 4 queues per port to make sure ithreads from different cards do not
interfere with each other.

* How exactly should we mark packets?
There are traffic flows which are not hashed properly by the NIC (mostly
non-IP/IPv6 traffic; PPPoE and various tunnels are good examples), so the
driver receives all such packets on q0 and marks them with FLOWID 0, which
can be inconvenient in some situations. It would be better if we could
instruct the NIC not to mark such packets with any id, permitting the OS to
re-calculate the hash via the (probably more powerful) netisr hash function.

* Traffic flow inside the OS / flowid marking
Smarter flowid marking may be needed in some cases: for example, if we are
using lagg with 2 NICs for traffic forwarding, this results in increased
contention on the transmit side.

From the previous example:
port 0 has q0-q3 bound to cores 0-3
port 1 has q0-q3 bound to cores 4-7
flow ids are the same as core numbers.
lagg uses (flowid % number_nics), which leads to TX contention:

0 (0 % 2)=port0, (0 % 4)=queue0
1 (1 % 2)=port1, (1 % 4)=queue1
2 (2 % 2)=port0, (2 % 4)=queue2
3 (3 % 2)=port1, (3 % 4)=queue3
4 (4 % 2)=port0, (4 % 4)=queue0
5 (5 % 2)=port1, (5 % 4)=queue1
6 (6 % 2)=port0, (6 % 4)=queue2
7 (7 % 2)=port1, (7 % 4)=queue3

Flow IDs 0 and 4, 1 and 5, 2 and 6, 3 and 7 use the same TX queues on the
same egress NICs.

This can be minimized by using configurations where GCD(queues, ports)=1
(3 queues should do the trick in this case), but this leads to suboptimal
CPU usage.

We internally use a patched igb/ix driver which permits setting flow ids
manually (and I have heard that other people are using hacks to
enable/disable setting M_FLOWID).

I propose implementing a common API to permit drivers to:
* read the user-supplied number of queues/other queue options (e.g:
* notify the kernel of each RX/TX queue being created/destroyed
* bind queues to cores via the given API
* export data to userland (for example, via sysctl) to permit users to:
  a) quickly see the current configuration
  b) change CPU binding on the fly
  c) change flowid numbers on the fly (with the possibility to set
     1) the NIC-supplied hash, 2) a manually supplied value, or
     3) disable setting M_FLOWID)

Having a common interface will make network stack tuning easier for users
and puts us one step closer to a (probably userland) AI which can auto-tune
the system according to a template ("router", "webserver") and the rc.conf
configuration (lagg presence, etc.).

What do you guys think?
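The collision pattern in the flow ID table above can be reproduced with a
small userland sketch. It is only an illustration: the sk_* names and the
array bounds are made up for this example, and it assumes, as the table
does, that the egress port is (flowid % ports) and the TX queue is
(flowid % queues). It is not taken from the lagg or igb/ix sources.

```c
#include <assert.h>
#include <stdio.h>

#define SK_MAX_PORTS	8	/* arbitrary bounds for this sketch */
#define SK_MAX_QUEUES	16

/* Egress port, following the (flowid % number_nics) rule quoted above. */
unsigned
sk_pick_port(unsigned flowid, unsigned nports)
{
	assert(nports > 0);
	return (flowid % nports);
}

/* TX queue, assuming the (flowid % queues) mapping shown in the table. */
unsigned
sk_pick_queue(unsigned flowid, unsigned nqueues)
{
	assert(nqueues > 0);
	return (flowid % nqueues);
}

/* Count the distinct (port, queue) pairs used by flow IDs 0..nflows-1. */
unsigned
sk_distinct_tx_slots(unsigned nflows, unsigned nports, unsigned nqueues)
{
	unsigned char seen[SK_MAX_PORTS][SK_MAX_QUEUES] = {{0}};
	unsigned flowid, port, queue, count = 0;

	assert(nports > 0 && nports <= SK_MAX_PORTS);
	assert(nqueues > 0 && nqueues <= SK_MAX_QUEUES);
	for (flowid = 0; flowid < nflows; flowid++) {
		port = sk_pick_port(flowid, nports);
		queue = sk_pick_queue(flowid, nqueues);
		if (!seen[port][queue]) {
			seen[port][queue] = 1;
			count++;
		}
	}
	return (count);
}

/* Print one line per flow ID, in the same spirit as the table above. */
void
sk_print_mapping(unsigned nflows, unsigned nports, unsigned nqueues)
{
	unsigned flowid;

	for (flowid = 0; flowid < nflows; flowid++)
		printf("%u -> port%u, queue%u\n", flowid,
		    sk_pick_port(flowid, nports),
		    sk_pick_queue(flowid, nqueues));
}
```

With 8 flow IDs, 2 ports and 4 queues per port, sk_distinct_tx_slots()
counts 4 of the 8 available (port, queue) pairs; with 3 queues per port it
counts all 6, which is the GCD(queues, ports)=1 effect described above.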
From owner-freebsd-hackers@FreeBSD.ORG Wed Feb 6 14:37:17 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 09C85837; Wed, 6 Feb 2013 14:37:17 +0000 (UTC) (envelope-from luigi@onelab2.iet.unipi.it) Received: from onelab2.iet.unipi.it (onelab2.iet.unipi.it [131.114.59.238]) by mx1.freebsd.org (Postfix) with ESMTP id C4AFEEDA; Wed, 6 Feb 2013 14:37:16 +0000 (UTC) Received: by onelab2.iet.unipi.it (Postfix, from userid 275) id 34E6D73029; Wed, 6 Feb 2013 15:37:14 +0100 (CET) Date: Wed, 6 Feb 2013 15:37:14 +0100 From: Luigi Rizzo To: "Alexander V. Chernikov" Subject: Re: Make kernel aware of NIC queues Message-ID: <20130206143714.GA45782@onelab2.iet.unipi.it> References: <5112666F.3050904@FreeBSD.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <5112666F.3050904@FreeBSD.org> User-Agent: Mutt/1.4.2.3i Cc: freebsd-hackers@freebsd.org, net@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Feb 2013 14:37:17 -0000 On Wed, Feb 06, 2013 at 06:19:27PM +0400, Alexander V. Chernikov wrote: > Hello list! > > Today more and more NICs are capable of splitting traffic to different > Rx/TX rings permitting OS to dispatch this traffic on different CPU > cores. However, there are some problems that arises from using multi-nic > (or even singe multi-port NIC) configurations: ... 
> I propose implementing common API to permit drivers: > * read user-supplied number of queues/other queue options (e.g: > * notify kernel of each RX/TX queue being created/destroyed > * make binding queues to cores via given API > * Export data to userland (for example, via sysctl) to permit users: > a) quickly see current configuration > b) change CPU binding on-fly > c) change flowid numbers on-fly (with the possibility to set 1) > NIC-supplied hash 2) manually supplied value 3) disable setting M_FLOWID) > > Having common interface will help users to make network stack tuning > easier and puts us one step further to make (probably userland) AI which > can auto-tune system according to template ("router", "webserver") and > rc.conf configuration (lagg presense, etc..) > > > What do you guys think? this is certainly a good idea and a welcome one. Linux has tried to come up with a common framework to implement this kind of controls using "ethtool", and we should probably have a look at their approach and reuse it (or at least the good ideas) to avoid reinventing the same thing. 
cheers luigi From owner-freebsd-hackers@FreeBSD.ORG Wed Feb 6 14:41:54 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id E7B33A88 for ; Wed, 6 Feb 2013 14:41:54 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id C1881F25 for ; Wed, 6 Feb 2013 14:41:54 +0000 (UTC) Received: from pakbsde14.localnet (unknown [38.105.238.108]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 06299B911; Wed, 6 Feb 2013 09:41:54 -0500 (EST) From: John Baldwin To: freebsd-hackers@freebsd.org Subject: Re: Mbuf memory handling Date: Wed, 6 Feb 2013 08:36:55 -0500 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p22; KDE/4.5.5; amd64; ; ) References: <175CCF5F49938B4D99B2E3EF7F558EBE1C73F401F3@SC-VEXCH4.marvell.com> In-Reply-To: <175CCF5F49938B4D99B2E3EF7F558EBE1C73F401F3@SC-VEXCH4.marvell.com> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201302060836.55404.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Wed, 06 Feb 2013 09:41:54 -0500 (EST) Cc: Axel Fischer , Lino Sanfilippo , Markus Althoff X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Feb 2013 14:41:55 -0000 On Wednesday, February 06, 2013 4:50:39 am Lino Sanfilippo wrote: > > Hi all, > > I want to implement a device driver for a NIC which stores received data into chunks within > a page (>=4k) in host memory. One page shall be used for multiple packets and freed > after all mbufs linked to that page have been processed. 
So I would like to know what is the recommended way > to handle this in FreeBSD? Any hints are very appreciated. I think you can get what you want by allocating M_JUMBOP mbuf clusters for your receive buffers. When you want to split out a packet, allocate a new packet header mbuf and use m_split() to let it take over the rest of the 4k buffer and pass the original mbuf up to if_input() as the new packet. The new mbufs you attach to the cluster via m_split() will all hold a reference on the backing cluster and it won't be freed until all the mbufs are freed. -- John Baldwin From owner-freebsd-hackers@FreeBSD.ORG Wed Feb 6 14:47:24 2013 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id B98C6D6B for ; Wed, 6 Feb 2013 14:47:24 +0000 (UTC) (envelope-from mattblists@icritical.com) Received: from mail3.icritical.com (mail3.icritical.com [212.57.248.143]) by mx1.freebsd.org (Postfix) with SMTP id 296E0F6C for ; Wed, 6 Feb 2013 14:47:23 +0000 (UTC) Received: (qmail 13796 invoked from network); 6 Feb 2013 14:47:05 -0000 Received: from localhost (127.0.0.1) by mail3.icritical.com with SMTP; 6 Feb 2013 14:47:05 -0000 Received: (qmail 13789 invoked by uid 599); 6 Feb 2013 14:47:02 -0000 Received: from unknown (HELO PDC002.icritical.int) (212.57.254.146) by mail3.icritical.com (qpsmtpd/0.28) with ESMTP; Wed, 06 Feb 2013 14:47:02 +0000 Message-ID: <51126CEA.5090302@icritical.com> Date: Wed, 6 Feb 2013 14:47:06 +0000 From: Matt Burke User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130122 Thunderbird/17.0.2 MIME-Version: 1.0 To: Subject: Progress display on multiple core dumps Content-Type: text/plain; charset="ISO-8859-1" Content-Transfer-Encoding: 7bit X-TLS-Incoming: YES X-Virus-Scanned: by iCritical at mail3.icritical.com X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Feb 2013 14:47:24 -0000

I've been doing a lot of panicking recently while trying to track down a
dtrace problem, and have noticed that only the first call of doadump() shows
a progress display. This results in uncertainty as to whether or not the
dump is happening, at least with a low amount of RAM dumping to a ramdisk
(the panicking machine is a VM).

This fixes it for me:

--- a/sys/amd64/amd64/minidump_machdep.c
+++ b/sys/amd64/amd64/minidump_machdep.c
@@ -226,6 +226,9 @@ minidumpsys(struct dumperinfo *di)
 	struct minidumphdr mdhdr;
 
 	retry_count = 0;
+	for (i = 0; i < 10; i++)
+		progress_track[i].visited = 0;
+
 retry:
 	retry_count++;
 	counter = 0;

-- 
Sorry for the below...

The information contained in this message is confidential and intended for
the addressee only. If you have received this message in error, or there are
any problems with its content, please contact the sender.

iCritical is a trading name of Critical Software Ltd. Registered in England:
04909220. Registered Office: IC2, Keele Science Park, Keele, Staffordshire,
ST5 5NH.

This message has been scanned for security threats by iCritical.
www.icritical.com From owner-freebsd-hackers@FreeBSD.ORG Wed Feb 6 15:20:52 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 1F043126; Wed, 6 Feb 2013 15:20:52 +0000 (UTC) (envelope-from jacques.fourie@gmail.com) Received: from mail-wg0-x22a.google.com (wg-in-x022a.1e100.net [IPv6:2a00:1450:400c:c00::22a]) by mx1.freebsd.org (Postfix) with ESMTP id 8EE1F1F9; Wed, 6 Feb 2013 15:20:51 +0000 (UTC) Received: by mail-wg0-f42.google.com with SMTP id 12so5105630wgh.1 for ; Wed, 06 Feb 2013 07:20:50 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=RpIspa7UDDad0DOKWti2hvt9abWVcEa0ywHx1xw06P4=; b=iA1z2fjmIvgd/VXnn/3yI9vfMCrnTSChgtnsX4mt/nkcai1KAe9Lk3l+bAJ7DLGn/9 kj7Nw9jUdDgCld2+j+Iq0un59FOcuQVF+Sa0sVlNcJnBdCgYOE0+DVloopjOvwiqg0KT g8eZrS/r48giYy62rwCS8DjOOspQA9a0nrcCi9TCHyTSsdGQWwOmUuat9tRcll6RJ0cw qD0eiJhcX8HuaqddjmEZ61WO54I5Fgub7f+GKbwIQa4Q6QApjIJJXF0lMewxN+P5LUUk NO5gyebc9lkM8T1L4WTZOeu463DBmPmEC4mXvQGPFhLMp3fL+ZmlPuOL0o9cKGNOxOdC yiEw== MIME-Version: 1.0 X-Received: by 10.194.158.100 with SMTP id wt4mr50509815wjb.37.1360164050803; Wed, 06 Feb 2013 07:20:50 -0800 (PST) Received: by 10.194.110.132 with HTTP; Wed, 6 Feb 2013 07:20:50 -0800 (PST) In-Reply-To: <201302060836.55404.jhb@freebsd.org> References: <175CCF5F49938B4D99B2E3EF7F558EBE1C73F401F3@SC-VEXCH4.marvell.com> <201302060836.55404.jhb@freebsd.org> Date: Wed, 6 Feb 2013 17:20:50 +0200 Message-ID: Subject: Re: Mbuf memory handling From: Jacques Fourie To: John Baldwin Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: Hackers freeBSD , Axel Fischer , Lino Sanfilippo , Markus Althoff X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions 
relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Feb 2013 15:20:52 -0000 On Wed, Feb 6, 2013 at 3:36 PM, John Baldwin wrote: > On Wednesday, February 06, 2013 4:50:39 am Lino Sanfilippo wrote: > > > > Hi all, > > > > I want to implement a device driver for a NIC which stores received data > into chunks within > > a page (>=4k) in host memory. One page shall be used for multiple > packets and freed > > after all mbufs linked to that page have been processed. So I would like > to know what is the recommended way > > to handle this in FreeBSD? Any hints are very appreciated. > > I think you can get what you want by allocating M_JUMBOP mbuf clusters for > your receive buffers. When you want to split out a packet, allocate a new > packet header mbuf and use m_split() to let it take over the rest of the 4k > buffer and pass the original mbuf up to if_input() as the new packet. The > new mbufs you attach to the cluster via m_split() will all hold a reference > on the backing cluster and it won't be freed until all the mbufs are freed. > > The resulting mbufs will not be writeable (M_WRITABLE() will evaluate to 0), right? I don't know if this will be an issue in this particular application. 
> -- > John Baldwin > _______________________________________________ > freebsd-hackers@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" > From owner-freebsd-hackers@FreeBSD.ORG Wed Feb 6 15:58:35 2013 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 91410607; Wed, 6 Feb 2013 15:58:35 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) by mx1.freebsd.org (Postfix) with ESMTP id 05C4C633; Wed, 6 Feb 2013 15:58:34 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.6/8.14.6) with ESMTP id r16FwUu3014881; Wed, 6 Feb 2013 17:58:30 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.7.4 kib.kiev.ua r16FwUu3014881 Received: (from kostik@localhost) by tom.home (8.14.6/8.14.6/Submit) id r16FwUVK014880; Wed, 6 Feb 2013 17:58:30 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Wed, 6 Feb 2013 17:58:30 +0200 From: Konstantin Belousov To: Ian Lepore Subject: Re: Request for review, time_pps_fetch() enhancement Message-ID: <20130206155830.GX2522@kib.kiev.ua> References: <1360125698.93359.566.camel@revolution.hippie.lan> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="tcpvmIcflXmDwk6w" Content-Disposition: inline In-Reply-To: <1360125698.93359.566.camel@revolution.hippie.lan> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on tom.home Cc: "freebsd-hackers@freebsd.org" X-BeenThere: 
freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Feb 2013 15:58:35 -0000 --tcpvmIcflXmDwk6w Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Tue, Feb 05, 2013 at 09:41:38PM -0700, Ian Lepore wrote:
> I'd like feedback on the attached patch, which adds support to our
> time_pps_fetch() implementation for the blocking behaviors described in
> section 3.4.3 of RFC 2783. The existing implementation can only return
> the most recently captured data without blocking. These changes add the
> ability to block (forever or with timeout) until a new event occurs.
>
> -- Ian
>
> Index: sys/kern/kern_tc.c
> ===================================================================
> --- sys/kern/kern_tc.c	(revision 246337)
> +++ sys/kern/kern_tc.c	(working copy)
> @@ -1446,6 +1446,50 @@
>   * RFC 2783 PPS-API implementation.
>   */
>  
> +static int
> +pps_fetch(struct pps_fetch_args *fapi, struct pps_state *pps)
> +{
> +	int err, timo;
> +	pps_seq_t aseq, cseq;
> +	struct timeval tv;
> +
> +	if (fapi->tsformat && fapi->tsformat != PPS_TSFMT_TSPEC)
> +		return (EINVAL);
> +
> +	/*
> +	 * If no timeout is requested, immediately return whatever values were
> +	 * most recently captured. If timeout seconds is -1, that's a request
> +	 * to block without a timeout. WITNESS won't let us sleep forever
> +	 * without a lock (we really don't need a lock), so just repeatedly
> +	 * sleep a long time.
> +	 */

Regarding there being no need for the lock: going without one would simply
make this a low-quality implementation, since a timestamp capture can be
lost and the caller of time_pps_fetch() then sleeps until the next PPS event
is generated.

I understand the desire to avoid the lock, especially in pps_event(), which
is called from arbitrary driver context. But the race is also real.

> +	if (fapi->timeout.tv_sec || fapi->timeout.tv_nsec) {
> +		if (fapi->timeout.tv_sec == -1)
> +			timo = 0x7fffffff;
> +		else {
> +			tv.tv_sec = fapi->timeout.tv_sec;
> +			tv.tv_usec = fapi->timeout.tv_nsec / 1000;
> +			timo = tvtohz(&tv);
> +		}
> +		aseq = pps->ppsinfo.assert_sequence;
> +		cseq = pps->ppsinfo.clear_sequence;
> +		while (aseq == pps->ppsinfo.assert_sequence &&
> +		    cseq == pps->ppsinfo.clear_sequence) {

Note that compilers are allowed to optimize these accesses even across the
sequence point, which is the tsleep() call. Only accesses to volatile
objects are forbidden from being rearranged. I suggest adding volatile casts
to pps in the loop condition.
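As a rough userland illustration of the volatile-cast suggestion above (not
the code that was committed, and not a fix for the lost-event race), the
loop condition could read the two counters through volatile-qualified
pointers. The struct, the macro and the function name below are hypothetical
stand-ins for the fields of struct pps_state used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two sequence counters in the patch. */
struct sk_seq_pair {
	uint32_t assert_sequence;
	uint32_t clear_sequence;
};

/*
 * Read a 32-bit value through a volatile-qualified pointer, so that the
 * compiler has to perform the load each time instead of reusing a value
 * it cached before a sleep.
 */
#define SK_READ_VOLATILE_U32(p)	(*(const volatile uint32_t *)(p))

/* Nonzero while neither counter has moved away from the snapshot. */
int
sk_seq_unchanged(const struct sk_seq_pair *sp, uint32_t aseq, uint32_t cseq)
{
	assert(sp != NULL);
	return (SK_READ_VOLATILE_U32(&sp->assert_sequence) == aseq &&
	    SK_READ_VOLATILE_U32(&sp->clear_sequence) == cseq);
}
```

In a loop shaped like the one in the patch, the condition would then be
sk_seq_unchanged(sp, aseq, cseq), re-evaluated after every wakeup.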
> +			err = tsleep(pps, PCATCH, "ppsfch", timo);
> +			if (err == EWOULDBLOCK && fapi->timeout.tv_sec == -1) {
> +				continue;
> +			} else if (err != 0) {
> +				return (err);
> +			}
> +		}
> +	}
> +
> +	pps->ppsinfo.current_mode = pps->ppsparam.mode;
> +	fapi->pps_info_buf = pps->ppsinfo;
> +
> +	return (0);
> +}
> +
>  int
>  pps_ioctl(u_long cmd, caddr_t data, struct pps_state *pps)
>  {
> @@ -1485,13 +1529,7 @@
>  		return (0);
>  	case PPS_IOC_FETCH:
>  		fapi = (struct pps_fetch_args *)data;
> -		if (fapi->tsformat && fapi->tsformat != PPS_TSFMT_TSPEC)
> -			return (EINVAL);
> -		if (fapi->timeout.tv_sec || fapi->timeout.tv_nsec)
> -			return (EOPNOTSUPP);
> -		pps->ppsinfo.current_mode = pps->ppsparam.mode;
> -		fapi->pps_info_buf = pps->ppsinfo;
> -		return (0);
> +		return (pps_fetch(fapi, pps));
>  #ifdef FFCLOCK
>  	case PPS_IOC_FETCH_FFCOUNTER:
>  		fapi_ffc = (struct pps_fetch_ffc_args *)data;
> @@ -1540,7 +1578,7 @@
>  void
>  pps_init(struct pps_state *pps)
>  {
> -	pps->ppscap |= PPS_TSFMT_TSPEC;
> +	pps->ppscap |= PPS_TSFMT_TSPEC | PPS_CANWAIT;
>  	if (pps->ppscap & PPS_CAPTUREASSERT)
>  		pps->ppscap |= PPS_OFFSETASSERT;
>  	if (pps->ppscap & PPS_CAPTURECLEAR)
> @@ -1680,6 +1718,9 @@
>  		hardpps(tsp, ts.tv_nsec + 1000000000 * ts.tv_sec);
>  	}
>  #endif
> +
> +	/* Wakeup anyone sleeping in pps_fetch().
*/ > + wakeup(pps); > } > =20 > /* > _______________________________________________ > freebsd-hackers@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" --tcpvmIcflXmDwk6w Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iQIcBAEBAgAGBQJREn2lAAoJEJDCuSvBvK1Bz7YP/R5+JihMt78WpNfaMSMlPHyE Kl5GEmAZVojEt8kBp0/uGHzR7VyN9U18lTgQlrC/BJhAlJZlC7vJE856+VS+L5zG Cz3D4+Psx+5IeUBeL1Vq52fJ3BcRWz3dN9XRyK66vFxS0/xWqv2+F6VBMN/pWXI3 1vJmcvcanhLF17hdSF7HwLPweqVyVLf3k43SR/MxVCWkX+Yga9hz0RXfWSJAH4hz zClNLKHsTFJfoHeSLAEJkDXoBoC/JbpWywVjhlmecnldNZKjvE+d5l2Egz39uNyn mRus+gUKbYdYEffadBL49WrGveV3MRNnJNN8QnuB+/9Dt+/x8zkqSqDYSBxoA9Ac b2gYhMw8ytn+oMfGVlqzO9rw4/EBfGAQrkUrLDl9orSYj+P4+Fm8p8Kb6pmzWfLN e7za4VZrPE3fTQecDlZH9ZUmY3Gq9HAdM2SyNeuZiOJYoNEghXch/VXAHQOUpJCC n55ipGFTgzRxfaIE810uu9NkSvlx2iAVJJw48qYjABCejvmNbg59WiBL7IyF/5Ka j/wH2ZnuhLfj4OhSfqVXuMzMSizkGLDNB6VtXmEmvehKhSZ+7VDjBmcaLCcbkgsg NdahVIxLoJtPDW4dv1LxmBonMPvji/nZZUz9Lc1MXJTLs9/BnCHrnVir+PDkgyHl sIC4C6JKgoqjq41k/Flm =9TbN -----END PGP SIGNATURE----- --tcpvmIcflXmDwk6w-- From owner-freebsd-hackers@FreeBSD.ORG Wed Feb 6 16:06:03 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 3C21476B; Wed, 6 Feb 2013 16:06:03 +0000 (UTC) (envelope-from gnn@neville-neil.com) Received: from vps.hungerhost.com (vps.hungerhost.com [216.38.53.176]) by mx1.freebsd.org (Postfix) with ESMTP id 01AFF684; Wed, 6 Feb 2013 16:06:02 +0000 (UTC) Received: from [38.105.238.108] (port=56842 helo=[10.7.1.235]) by vps.hungerhost.com with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.80) (envelope-from ) id 1U37VU-0002Sj-0r; Wed, 06 Feb 2013 11:05:56 -0500 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\)) Subject: Re: Make kernel aware of NIC queues From: George 
Neville-Neil In-Reply-To: <20130206143714.GA45782@onelab2.iet.unipi.it> Date: Wed, 6 Feb 2013 11:05:59 -0500 Content-Transfer-Encoding: 7bit Message-Id: References: <5112666F.3050904@FreeBSD.org> <20130206143714.GA45782@onelab2.iet.unipi.it> To: Luigi Rizzo X-Mailer: Apple Mail (2.1499) X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - vps.hungerhost.com X-AntiAbuse: Original Domain - freebsd.org X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address Domain - neville-neil.com X-Get-Message-Sender-Via: vps.hungerhost.com: authenticated_id: gnn@neville-neil.com X-Mailman-Approved-At: Wed, 06 Feb 2013 16:32:53 +0000 Cc: freebsd-hackers@freebsd.org, "Alexander V. Chernikov" , net@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Feb 2013 16:06:03 -0000 On Feb 6, 2013, at 09:37 , Luigi Rizzo wrote: > On Wed, Feb 06, 2013 at 06:19:27PM +0400, Alexander V. Chernikov wrote: >> Hello list! >> >> Today more and more NICs are capable of splitting traffic to different >> Rx/TX rings permitting OS to dispatch this traffic on different CPU >> cores. However, there are some problems that arises from using multi-nic >> (or even singe multi-port NIC) configurations: > ... 
>> I propose implementing common API to permit drivers: >> * read user-supplied number of queues/other queue options (e.g: >> * notify kernel of each RX/TX queue being created/destroyed >> * make binding queues to cores via given API >> * Export data to userland (for example, via sysctl) to permit users: >> a) quickly see current configuration >> b) change CPU binding on-fly >> c) change flowid numbers on-fly (with the possibility to set 1) >> NIC-supplied hash 2) manually supplied value 3) disable setting M_FLOWID) >> >> Having common interface will help users to make network stack tuning >> easier and puts us one step further to make (probably userland) AI which >> can auto-tune system according to template ("router", "webserver") and >> rc.conf configuration (lagg presense, etc..) >> >> >> What do you guys think? > > this is certainly a good idea and a welcome one. > > Linux has tried to come up with a common framework to implement > this kind of controls using "ethtool", and we should probably > have a look at their approach and reuse it (or at least the good ideas) > to avoid reinventing the same thing. > And, though Luigi didn't say it, I will, this should integrate with netmap. 
Best, George From owner-freebsd-hackers@FreeBSD.ORG Wed Feb 6 16:38:14 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 3DC6133B for ; Wed, 6 Feb 2013 16:38:14 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id 142437E7 for ; Wed, 6 Feb 2013 16:38:14 +0000 (UTC) Received: from pakbsde14.localnet (unknown [38.105.238.108]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 7EDF1B915; Wed, 6 Feb 2013 11:38:13 -0500 (EST) From: John Baldwin To: Jacques Fourie Subject: Re: Mbuf memory handling Date: Wed, 6 Feb 2013 11:37:35 -0500 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p22; KDE/4.5.5; amd64; ; ) References: <175CCF5F49938B4D99B2E3EF7F558EBE1C73F401F3@SC-VEXCH4.marvell.com> <201302060836.55404.jhb@freebsd.org> In-Reply-To: MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-15" Content-Transfer-Encoding: 7bit Message-Id: <201302061137.35651.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Wed, 06 Feb 2013 11:38:13 -0500 (EST) Cc: Hackers freeBSD , Axel Fischer , Lino Sanfilippo , Markus Althoff X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Feb 2013 16:38:14 -0000 On Wednesday, February 06, 2013 10:20:50 am Jacques Fourie wrote: > On Wed, Feb 6, 2013 at 3:36 PM, John Baldwin wrote: > > > On Wednesday, February 06, 2013 4:50:39 am Lino Sanfilippo wrote: > > > > > > Hi all, > > > > > > I want to implement a device driver for a NIC which stores received data > > into chunks within > > > a page (>=4k) in host memory. 
One page shall be used for multiple packets and freed
> > > after all mbufs linked to that page have been processed. So I would like
> > > to know what is the recommended way to handle this in FreeBSD? Any hints
> > > are very appreciated.
> >
> > I think you can get what you want by allocating M_JUMBOP mbuf clusters for
> > your receive buffers. When you want to split out a packet, allocate a new
> > packet header mbuf and use m_split() to let it take over the rest of the 4k
> > buffer and pass the original mbuf up to if_input() as the new packet. The
> > new mbufs you attach to the cluster via m_split() will all hold a reference
> > on the backing cluster and it won't be freed until all the mbufs are freed.
>
> The resulting mbufs will not be writeable (M_WRITABLE() will evaluate to
> 0), right? I don't know if this will be an issue in this particular
> application.

No, they only propagate an existing M_RDONLY flag:

	n->m_flags |= m->m_flags & M_RDONLY;

If the first mbuf is writable the splits remain writable from my reading of
the code. OTOH, I think in this case read-only buffers passed up to the stack
are probably fine since they are already contiguous so any pullup should be a
NOP, etc.
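The flag propagation quoted above can be tried out in isolation with a tiny
userland model. This is only a sketch: struct sk_mbuf, the SK_* flag values
and the helper names are invented for the example and are not the kernel's
definitions. It models just the M_RDONLY bit and says nothing about any
other condition the real M_WRITABLE() macro may check.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag bits for this sketch only. */
#define SK_M_EXT	0x1u
#define SK_M_RDONLY	0x8u

/* Hypothetical, heavily reduced stand-in for struct mbuf. */
struct sk_mbuf {
	unsigned m_flags;
};

/* Same shape as the quoted line: n->m_flags |= m->m_flags & M_RDONLY; */
void
sk_propagate_rdonly(struct sk_mbuf *n, const struct sk_mbuf *m)
{
	assert(n != NULL && m != NULL);
	n->m_flags |= m->m_flags & SK_M_RDONLY;
}

/* Nonzero if the read-only bit is set on this mbuf model. */
int
sk_is_rdonly(const struct sk_mbuf *m)
{
	assert(m != NULL);
	return ((m->m_flags & SK_M_RDONLY) != 0);
}
```

If the source carries only SK_M_EXT, the new mbuf stays without the
read-only bit; if the source has SK_M_RDONLY set, the bit is carried over,
and no other flag is copied by this statement.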
-- John Baldwin From owner-freebsd-hackers@FreeBSD.ORG Wed Feb 6 16:06:52 2013 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id CF31A80C; Wed, 6 Feb 2013 16:06:52 +0000 (UTC) (envelope-from gnn@neville-neil.com) Received: from vps.hungerhost.com (vps.hungerhost.com [216.38.53.176]) by mx1.freebsd.org (Postfix) with ESMTP id A9FA3695; Wed, 6 Feb 2013 16:06:52 +0000 (UTC) Received: from [38.105.238.108] (port=56842 helo=[10.7.1.235]) by vps.hungerhost.com with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.80) (envelope-from ) id 1U37WK-0002Sj-KG; Wed, 06 Feb 2013 11:06:51 -0500 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\)) Subject: Re: dtrace vs module unloading From: George Neville-Neil In-Reply-To: <51051C61.4060608@FreeBSD.org> Date: Wed, 6 Feb 2013 11:06:52 -0500 Content-Transfer-Encoding: quoted-printable Message-Id: References: <51051C61.4060608@FreeBSD.org> To: Andriy Gapon X-Mailer: Apple Mail (2.1499) X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - vps.hungerhost.com X-AntiAbuse: Original Domain - freebsd.org X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address Domain - neville-neil.com X-Get-Message-Sender-Via: vps.hungerhost.com: authenticated_id: gnn@neville-neil.com X-Mailman-Approved-At: Wed, 06 Feb 2013 16:46:41 +0000 Cc: freebsd-hackers@FreeBSD.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Feb 2013 16:06:52 -0000 On Jan 27, 2013, at 07:24 , Andriy Gapon wrote: >=20 > It seems that FreeBSD DTrace currently does not track module loading / = unloading > at all. 
dtrace_module_loaded/dtrace_module_unloaded are both under ifdef sun.
>
> I think that this is a root cause of e.g. fbt probes for some functions
> remaining after a module that provides the functions is unloaded.
>
> It looks like currently we do not post any event when a module gets loaded /
> unloaded. Perhaps this is one of the factors in current situation.
> OTOH, in Solaris they just have some dtrace hooks in the form of function
> pointers directly in the module handling code (equivalent of our kern_linker).
>

Hrm, sounds like a bug more than anything else. I don't know enough yet to
say how to solve this, but if you want to track this you're welcome to create
a PR and assign it to me.

Best,
George

From owner-freebsd-hackers@FreeBSD.ORG Wed Feb 6 16:55:05 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id C0B10E2C; Wed, 6 Feb 2013 16:55:05 +0000 (UTC) (envelope-from luigi@onelab2.iet.unipi.it) Received: from onelab2.iet.unipi.it (onelab2.iet.unipi.it [131.114.59.238]) by mx1.freebsd.org (Postfix) with ESMTP id 8476094D; Wed, 6 Feb 2013 16:55:05 +0000 (UTC) Received: by onelab2.iet.unipi.it (Postfix, from userid 275) id 69ACC73027; Wed, 6 Feb 2013 17:55:03 +0100 (CET) Date: Wed, 6 Feb 2013 17:55:03 +0100 From: Luigi Rizzo To: George Neville-Neil Subject: Re: Make kernel aware of NIC queues Message-ID: <20130206165503.GA46925@onelab2.iet.unipi.it> References: <5112666F.3050904@FreeBSD.org> <20130206143714.GA45782@onelab2.iet.unipi.it> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i Cc: freebsd-hackers@freebsd.org, "Alexander V. 
Chernikov" , net@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Feb 2013 16:55:05 -0000 On Wed, Feb 06, 2013 at 11:05:59AM -0500, George Neville-Neil wrote: > > On Feb 6, 2013, at 09:37 , Luigi Rizzo wrote: ... > > Linux has tried to come up with a common framework to implement > > this kind of controls using "ethtool", and we should probably > > have a look at their approach and reuse it (or at least the good ideas) > > to avoid reinventing the same thing. > > > And, though Luigi didn't say it, I will, this should integrate with netmap. i did not say it because it will work without any extra effort: - the netmap version i committed a few days ago already fetch the number of queues and the ring sizes at runtime; - ethtool (or whatever we will call it) only operates on the configuration/control plane (number of queues and slots, partitioning of packets onto input queues, etc.), whereas netmap operates only on the data plane cheers luigi From owner-freebsd-hackers@FreeBSD.ORG Wed Feb 6 17:04:09 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 127216B8; Wed, 6 Feb 2013 17:04:09 +0000 (UTC) (envelope-from jacques.fourie@gmail.com) Received: from mail-we0-x22e.google.com (mail-we0-x22e.google.com [IPv6:2a00:1450:400c:c03::22e]) by mx1.freebsd.org (Postfix) with ESMTP id 56C78AC2; Wed, 6 Feb 2013 17:04:08 +0000 (UTC) Received: by mail-we0-f174.google.com with SMTP id r6so1314141wey.5 for ; Wed, 06 Feb 2013 09:04:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; 
bh=NopDyO40xAK74thNTluM2wmELVwmb4wQx4dxHhoQ3hE=; b=c107JQq8lXkbdsUkOrbknUfuWx1RYHjgNFNl45TrxrNSRsO/eSxiciD96IymgGgPgJ jf3ZtHMfDpm6MYNoPs2UrEqoidoEH/a+1JMF+m/51x6Sge3Ekvitn5Sg49uEHS5s+WuI rM0nj2Mre+pIOaPlY25J/HGPyYOd02HGhtTieY/r34DWlexGGVPfV0EJT15721XIYDf7 Q6QkjvBewhRSLo9rRDgsH0e/4SrcVwttDVdF0FWZNOaMoKP4vZFeBTOlskkkt3VsDGKx gE3WbCHMTNYrhU4BLuhDQz4PEWPkbtRb6C2khmzg9CvSlGpd2s81UdMful2/oER9jDhX LguA== MIME-Version: 1.0 X-Received: by 10.194.158.100 with SMTP id wt4mr51215150wjb.37.1360170247407; Wed, 06 Feb 2013 09:04:07 -0800 (PST) Received: by 10.194.110.132 with HTTP; Wed, 6 Feb 2013 09:04:07 -0800 (PST) In-Reply-To: <201302061137.35651.jhb@freebsd.org> References: <175CCF5F49938B4D99B2E3EF7F558EBE1C73F401F3@SC-VEXCH4.marvell.com> <201302060836.55404.jhb@freebsd.org> <201302061137.35651.jhb@freebsd.org> Date: Wed, 6 Feb 2013 19:04:07 +0200 Message-ID: Subject: Re: Mbuf memory handling From: Jacques Fourie To: John Baldwin Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: Hackers freeBSD , Axel Fischer , Lino Sanfilippo , Markus Althoff X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Feb 2013 17:04:09 -0000 On Wed, Feb 6, 2013 at 6:37 PM, John Baldwin wrote: > On Wednesday, February 06, 2013 10:20:50 am Jacques Fourie wrote: > > On Wed, Feb 6, 2013 at 3:36 PM, John Baldwin wrote: > > > > > On Wednesday, February 06, 2013 4:50:39 am Lino Sanfilippo wrote: > > > > > > > > Hi all, > > > > > > > > I want to implement a device driver for a NIC which stores received > data > > > into chunks within > > > > a page (>=4k) in host memory. One page shall be used for multiple > > > packets and freed > > > > after all mbufs linked to that page have been processed. 
So I would > like > > > to know what is the recommended way > > > > to handle this in FreeBSD? Any hints are very appreciated. > > > > > > I think you can get what you want by allocating M_JUMBOP mbuf clusters > for > > > your receive buffers. When you want to split out a packet, allocate a > new > > > packet header mbuf and use m_split() to let it take over the rest of > the 4k > > > buffer and pass the original mbuf up to if_input() as the new packet. > The > > > new mbufs you attach to the cluster via m_split() will all hold a > reference > > > on the backing cluster and it won't be freed until all the mbufs are > freed. > > > > > > The resulting mbufs will not be writeable (M_WRITABLE() will evaluate > to > > 0), right? I don't know if this will be an issue in this particular > > application. > > No, they only propagate an existing M_RDONLY flag: > > n->m_flags |= m->m_flags & M_RDONLY; > > If the first mbuf is writable the splits remain writable from my reading > of the code. OTOH, I think in this case read-only buffers passed up to > the stack are probably fine since they are already contiguous so any > pullup should be a NOP, etc. > > I agree that read-only buffers may be ok in this case but would like to point out that the M_WRITABLE() macro will evaluate to 0 if the refcount on the cluster is >1, even if the M_RDONLY flag is not set. So the various parts of the networking code that uses M_WRITABLE() to decide if the mbuf is writeable will treat the mbuf as read-only. 
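The M_WRITABLE() behaviour described above can be modelled in a few lines of userland C. This is only a sketch under stated assumptions: the struct, the MODEL_ flag names and both functions are hypothetical stand-ins, not the kernel's definitions. It assumes, as the thread describes, that the macro first rejects M_RDONLY mbufs and then, for external storage, requires a cluster reference count of exactly 1, and that a split only propagates an existing M_RDONLY flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model flags; the names and values are illustrative only. */
#define MODEL_M_EXT    0x1u  /* data lives in external storage (a cluster) */
#define MODEL_M_RDONLY 0x8u  /* mbuf is explicitly marked read-only */

struct model_mbuf {
	uint32_t  flags;
	unsigned *ext_refcnt;  /* shared reference count of the cluster */
};

/* Model of the writability check: not read-only, and not a shared cluster. */
static bool
model_writable(const struct model_mbuf *m)
{
	assert(m != NULL);
	if (m->flags & MODEL_M_RDONLY)
		return false;
	if ((m->flags & MODEL_M_EXT) == 0)
		return true;
	return *m->ext_refcnt == 1;
}

/*
 * Model of a second mbuf taking a reference on the same cluster:
 * only an existing read-only flag is propagated, and the shared
 * reference count is incremented.
 */
static void
model_share_cluster(const struct model_mbuf *m, struct model_mbuf *n)
{
	assert(m != NULL && n != NULL && m->ext_refcnt != NULL);
	n->flags = MODEL_M_EXT | (m->flags & MODEL_M_RDONLY);
	n->ext_refcnt = m->ext_refcnt;
	(*n->ext_refcnt)++;
}
```

In this model a freshly allocated cluster mbuf is writable, but after model_share_cluster() both mbufs report not writable even though neither carries the read-only flag, which is the distinction being drawn in the reply.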
> -- > John Baldwin > From owner-freebsd-hackers@FreeBSD.ORG Wed Feb 6 19:43:26 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 4A0337BC; Wed, 6 Feb 2013 19:43:26 +0000 (UTC) (envelope-from lsanfil@marvell.com) Received: from na3sys009aog110.obsmtp.com (na3sys009aog110.obsmtp.com [74.125.149.203]) by mx1.freebsd.org (Postfix) with ESMTP id 0B58237A; Wed, 6 Feb 2013 19:43:14 +0000 (UTC) Received: from SC-OWA.marvell.com ([199.233.58.135]) (using TLSv1) by na3sys009aob110.postini.com ([74.125.148.12]) with SMTP ID DSNKURKyQ2mFExrBQYhOxAcfT3iJ7bQsWFbl@postini.com; Wed, 06 Feb 2013 11:43:15 PST Received: from SC-VEXCH4.marvell.com ([::1]) by SC-OWA.marvell.com ([::1]) with mapi; Wed, 6 Feb 2013 11:41:33 -0800 From: Lino Sanfilippo To: Jacques Fourie , John Baldwin Date: Wed, 6 Feb 2013 11:41:32 -0800 Subject: RE: Mbuf memory handling Thread-Topic: Mbuf memory handling Thread-Index: Ac4Ei/tlxSbHW20uRt6xXMY67TQq1gACPnQg Message-ID: <175CCF5F49938B4D99B2E3EF7F558EBE1C73F4038E@SC-VEXCH4.marvell.com> References: <175CCF5F49938B4D99B2E3EF7F558EBE1C73F401F3@SC-VEXCH4.marvell.com> <201302060836.55404.jhb@freebsd.org> <201302061137.35651.jhb@freebsd.org> In-Reply-To: Accept-Language: de-DE, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: acceptlanguage: de-DE, en-US MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: Hackers freeBSD , Axel Fischer , Ralf Assmann , Markus Althoff X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Feb 2013 19:43:26 -0000 John, Jacques, thank you very much for your help. 
An mbuf cluster seems to be the right direction. So I would have to do something like mbuf = m_getjcl(how, MT_DATA, M_PKTHDR, MJUMPAGESIZE); left_for_next_rcv = m_split(mbuf, chunksize); if_input(ifp, mbuf); right? >I agree that read-only buffers may be ok in this case but would like to point out that the M_WRITABLE() macro will evaluate to 0 if the refcount on the cluster is >1 The fact that the resulting mbufs returned by m_split() are not writeable any more is indeed a problem: What I would like to do is keep the 'left_for_next_rcv' mbuf until the next packet arrives and then fill it with the next packet's data only up to 'chunksize', split it again to pass the new mbuf to the protocol stack and so on until 'left_for_next_rcv' becomes too small to be split further. Only then I would want to allocate a new "fresh" jumbo sized mbuf. Is it possible to realize this with cluster mbufs? Thx, Lino From owner-freebsd-hackers@FreeBSD.ORG Wed Feb 6 21:05:08 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 0DF92D55 for ; Wed, 6 Feb 2013 21:05:08 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id D899DA13 for ; Wed, 6 Feb 2013 21:05:07 +0000 (UTC) Received: from pakbsde14.localnet (unknown [38.105.238.108]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 198F5B91E; Wed, 6 Feb 2013 16:05:07 -0500 (EST) From: John Baldwin To: Lino Sanfilippo Subject: Re: Mbuf memory handling Date: Wed, 6 Feb 2013 16:05:00 -0500 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p22; KDE/4.5.5; amd64; ; ) References: <175CCF5F49938B4D99B2E3EF7F558EBE1C73F401F3@SC-VEXCH4.marvell.com> <175CCF5F49938B4D99B2E3EF7F558EBE1C73F4038E@SC-VEXCH4.marvell.com> In-Reply-To:
<175CCF5F49938B4D99B2E3EF7F558EBE1C73F4038E@SC-VEXCH4.marvell.com> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-15" Content-Transfer-Encoding: 7bit Message-Id: <201302061605.00583.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Wed, 06 Feb 2013 16:05:07 -0500 (EST) Cc: Hackers freeBSD , Axel Fischer , Jacques Fourie , Ralf Assmann , Markus Althoff X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Feb 2013 21:05:08 -0000 On Wednesday, February 06, 2013 2:41:32 pm Lino Sanfilippo wrote: > John, Jacques, > > thank you very much for your help. An mbuf cluster seems to be the right direction. > So I would have to do something like > > mbuf = m_getjcl(how, MT_DATA, M_PKTHDR, MJUMPAGESIZE); > left_for_next_rcv = m_split(mbuf, chunksize); > if_input(ifp, mbuf); > > right? > > >I agree that read-only buffers may be ok in this case but would like to point out that the M_WRITABLE() macro will evaluate to 0 if the refcount on the cluster is >1 > > The fact that the resulting mbufs returned by m_split() are not writeable any more is indeed a problem: > What I would like to do is keep the 'left_for_next_rcv' mbuf until the next packet arrives and > then fill it with the next packets data only up to 'chunksize', split it again to pass the new mbuf to > the protocol stack and so on until 'left_for_next_rcv' becomes too small to be splitted further. > Only then I would want to allocate a new "fresh" jumbo sized mbuf. Is it possible to > realize this with cluster mbufs? They are only read-only in the sense that you can't call routines like m_pullup() or m_prepend(), etc. Your device should still be able to DMA into the buffer, but once the buffer is passed up to the stack the stack can't mess with it. 
This is probably what you want anyway as you wouldn't want the stack appending to a buffer and spilling over into the cluster where your device is going to DMA the next packet. -- John Baldwin From owner-freebsd-hackers@FreeBSD.ORG Thu Feb 7 20:22:15 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 70556C89 for ; Thu, 7 Feb 2013 20:22:15 +0000 (UTC) (envelope-from garym@oedata.com) Received: from mail-qc0-f176.google.com (mail-qc0-f176.google.com [209.85.216.176]) by mx1.freebsd.org (Postfix) with ESMTP id 13A10B53 for ; Thu, 7 Feb 2013 20:22:14 +0000 (UTC) Received: by mail-qc0-f176.google.com with SMTP id n41so1135505qco.21 for ; Thu, 07 Feb 2013 12:22:08 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:x-received:date:message-id:subject:from:to :content-type:x-gm-message-state; bh=+xjuZSTyxx9LT7i5qUA3rTRTVfv72SevqEa2PRgA11Q=; b=YRz3hAsswm4N16SfQN/ce2gqyyjlPMPLP6nERy5rZd71M6+cwgwlcGxSYpvQmQHG7I FLf3pt4UAUSQhJ+7NivkGtzzzbo3YYC6VfjnUp34dyO9nB28SsMjBRiOzDg9hQrzdf1w N45Klb/dGeGWJvqiLgaTzPtEYUXIO6KpktELJTwYXRsqRMsVWgUDyGfPMT4Fer8+3O5D fvKh5ACw+qSn8i/p0JBcyuUQmpMU5BhiMBcFEyWDun97iE+xPY0QqdGhG8OXFvH3j6lV YEPAxuvG7PCqNbFnneJEF7bIXJGUXJwCXYaNbRdlD/oa6aVTQLGI9N8ODlAdSV2zK6bx IG7g== MIME-Version: 1.0 X-Received: by 10.229.170.194 with SMTP id e2mr249411qcz.48.1360268528452; Thu, 07 Feb 2013 12:22:08 -0800 (PST) Received: by 10.224.100.133 with HTTP; Thu, 7 Feb 2013 12:22:08 -0800 (PST) Date: Thu, 7 Feb 2013 13:22:08 -0700 Message-ID: Subject: Need advice on sys5 shm and zero copy sockets From: gary mazzaferro To: freebsd-hackers@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-Gm-Message-State: ALoCoQmkoFBPQG6qmIEHwyUELL/nDWLy/vcpbiEwWFc9CC/ZiCjCYRhzTQfYFPRaKR85kEfd1uVS X-Mailman-Approved-At: Thu, 07 Feb 2013 20:46:49 +0000 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 
2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Feb 2013 20:22:15 -0000 Hi, I was told to post this question here (Ken Merry), it would be a good place to get some help. I'm not sure this is doable without a kernel module, which I don't want to add. I'll explain what I'm attempting.. I'm designing a high speed rest motor for cloud execution environment. 1) I'd like to eliminate copy from the tcp stack to the application(s). 2) I'm also sharing the buffers across processes and jails. So I'd like to preserve the zero-copy in a msg pipe/unix socket 3) Some buffers will go to disk file systems. Wish list: 4) I'd like it to work with sctp because I like it for local networking :) 5) I'd like to provision memory pools on a per application/connection/ip port basis. Ultimate Goal: 6) Additionally, I'm injecting "code" from a foreign process into the workflow of another process (state machine). The connection between them will be a signal and shared state information. I'm assuming item (6) is a separate issue, but it may impact the direction.. I've tried shm with zero copy sockets with linux and it just will not work !! 
BTW, I'm returning to freebsd after far too many years cheers, gary From owner-freebsd-hackers@FreeBSD.ORG Fri Feb 8 00:33:30 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 1A1FEF21 for ; Fri, 8 Feb 2013 00:33:30 +0000 (UTC) (envelope-from lists@eitanadler.com) Received: from mail-da0-f45.google.com (mail-da0-f45.google.com [209.85.210.45]) by mx1.freebsd.org (Postfix) with ESMTP id E988B999 for ; Fri, 8 Feb 2013 00:33:29 +0000 (UTC) Received: by mail-da0-f45.google.com with SMTP id w4so1474041dam.18 for ; Thu, 07 Feb 2013 16:33:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=eitanadler.com; s=0xdeadbeef; h=x-received:mime-version:from:date:message-id:subject:to :content-type; bh=icc8+26o3/W3d9Ugb3eAsMSjHs/nHsEmPz3c3KmbNaI=; b=Ng+RTl8F/HjJKQrAYmu5xojsHtOtajKdWPmFWEbUhMQzVVDpHnhiiL/nUspy333gPz LwK5GyeHs39x8cjLBCNQJEVpASB5rzec3jYeVhNi+dSuivY84Yd96zAHV1YpGSI/F0xC Nym472NgdH73q2D3m1CGmSMjkGqIVYru9/++w= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-received:mime-version:from:date:message-id:subject:to :content-type:x-gm-message-state; bh=icc8+26o3/W3d9Ugb3eAsMSjHs/nHsEmPz3c3KmbNaI=; b=aphrk1JX1h41Q22cTGfBwhcblH8Zhg7dHbxo1z+dIfH+BQvyzfFIgR7k8kTDQM5ox0 j4dWYo6aAX6X6yeKZqKRLZA3ZerKtAs/ZIld9PmDMv0qwPtYeEGEF9CGTGsZmCW3aU7f +dRihMe4KYV8dM2mcvZ+ozgfICUCTo8IKjqT4TVEVejWBxnfCfnjQNNNxv1NE2dgjY77 c/XpJ6WVNp2ILbjKutVqsAUr8yWzm21fl7hq3MKx1Ur5V4STNAQAiTuTPoqdbjbCWzKy 3/2W4kF4ibSaIbbQxawSSudkcEXJavnw1UyMEKQhkLMDvgcHslJKq9C2CflNSF6P1Gsh bWFg== X-Received: by 10.66.81.7 with SMTP id v7mr11019653pax.69.1360283608956; Thu, 07 Feb 2013 16:33:28 -0800 (PST) MIME-Version: 1.0 Received: by 10.66.148.10 with HTTP; Thu, 7 Feb 2013 16:32:58 -0800 (PST) From: Eitan Adler Date: Thu, 7 Feb 2013 19:32:58 -0500 Message-ID: Subject: Reviewing a FAQ change about LORs To: FreeBSD Hackers Content-Type: text/plain; 
charset=UTF-8 X-Gm-Message-State: ALoCoQmSxKbmILjXD420YU5oVeclLG7/9YGDchkOrYvJNhxcDoeP0NN0IyC99SYJ1rUsVwMnqdSO X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Feb 2013 00:33:30 -0000 Does someone here mind reviewing http://www.freebsd.org/cgi/query-pr.cgi?pr=174226 for correctness. Please feel free to post alternate diffs as a reply as well. -- Eitan Adler From owner-freebsd-hackers@FreeBSD.ORG Fri Feb 8 00:41:23 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 8EDAB4B6 for ; Fri, 8 Feb 2013 00:41:23 +0000 (UTC) (envelope-from erichsfreebsdlist@alogt.com) Received: from alogt.com (alogt.com [69.36.191.58]) by mx1.freebsd.org (Postfix) with ESMTP id 54E9D9EC for ; Fri, 8 Feb 2013 00:41:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=alogt.com; s=default; h=Content-Transfer-Encoding:Content-Type:Mime-Version:References:In-Reply-To:Message-ID:Subject:Cc:To:From:Date; bh=IlPeMqS7PiRcJNZAAeztjyBC1MvP1FfzEEtfKAzUZ3U=; b=lXtEdhu2gR0wJpqKDgUBQHrvtTTJc4hqlS4WdOKz+qfxK+TKkLG8udolJf8jOqI1lS+wBiLIzU4aUe5ZlxSHCk+233xxldLCPks7ctW1jSM54Byhbt1FPL1n1xPqTJ4H; Received: from [122.129.203.50] (port=38955 helo=X220.ovitrap.com) by sl-508-2.slc.westdc.net with esmtpsa (SSLv3:DHE-RSA-AES128-SHA:128) (Exim 4.80) (envelope-from ) id 1U3c1j-001i5Z-JQ; Thu, 07 Feb 2013 17:41:16 -0700 Date: Fri, 8 Feb 2013 07:41:11 +0700 From: Erich Dollansky To: Eitan Adler Subject: Re: Reviewing a FAQ change about LORs Message-ID: <20130208074111.62d661e4@X220.ovitrap.com> In-Reply-To: References: X-Mailer: Claws Mail 3.9.0 (GTK+ 2.24.6; amd64-portbld-freebsd10.0) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 
7bit X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - sl-508-2.slc.westdc.net X-AntiAbuse: Original Domain - freebsd.org X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address Domain - alogt.com X-Get-Message-Sender-Via: sl-508-2.slc.westdc.net: authenticated_id: erichsfreebsdlist@alogt.com X-Source: X-Source-Args: X-Source-Dir: Cc: FreeBSD Hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Feb 2013 00:41:23 -0000 Hi, On Thu, 7 Feb 2013 19:32:58 -0500 Eitan Adler wrote: > Does someone here mind reviewing > http://www.freebsd.org/cgi/query-pr.cgi?pr=174226 for correctness. > > Please feel free to post alternate diffs as a reply as well. > your text makes sense and is easy to understand. Erich From owner-freebsd-hackers@FreeBSD.ORG Fri Feb 8 21:32:19 2013 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id BA04F158 for ; Fri, 8 Feb 2013 21:32:19 +0000 (UTC) (envelope-from ian@FreeBSD.org) Received: from duck.symmetricom.us (duck.symmetricom.us [206.168.13.214]) by mx1.freebsd.org (Postfix) with ESMTP id 67F01211 for ; Fri, 8 Feb 2013 21:32:19 +0000 (UTC) Received: from damnhippie.dyndns.org (daffy.symmetricom.us [206.168.13.218]) by duck.symmetricom.us (8.14.6/8.14.6) with ESMTP id r18LWBOx016364 for ; Fri, 8 Feb 2013 14:32:12 -0700 (MST) (envelope-from ian@FreeBSD.org) Received: from [172.22.42.240] (revolution.hippie.lan [172.22.42.240]) by damnhippie.dyndns.org (8.14.3/8.14.3) with ESMTP id r18LVxSN036260; Fri, 8 Feb 2013 14:31:59 -0700 (MST) (envelope-from ian@FreeBSD.org) Subject: Re: Reviewing a FAQ change about LORs From: Ian Lepore To: Eitan Adler 
In-Reply-To: References: Content-Type: text/plain; charset="us-ascii" Date: Fri, 08 Feb 2013 14:31:58 -0700 Message-ID: <1360359118.4545.28.camel@revolution.hippie.lan> Mime-Version: 1.0 X-Mailer: Evolution 2.32.1 FreeBSD GNOME Team Port Content-Transfer-Encoding: 7bit Cc: FreeBSD Hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Feb 2013 21:32:19 -0000 On Thu, 2013-02-07 at 19:32 -0500, Eitan Adler wrote: > Does someone here mind reviewing > http://www.freebsd.org/cgi/query-pr.cgi?pr=174226 for correctness. > > Please feel free to post alternate diffs as a reply as well. > Does it make sense to reference a web page on LOR status that hasn't been updated in four years? -- Ian From owner-freebsd-hackers@FreeBSD.ORG Fri Feb 8 22:33:08 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id A0A885AE for ; Fri, 8 Feb 2013 22:33:08 +0000 (UTC) (envelope-from seanbru@yahoo-inc.com) Received: from mrout1-b.corp.bf1.yahoo.com (mrout1-b.corp.bf1.yahoo.com [98.139.253.104]) by mx1.freebsd.org (Postfix) with ESMTP id 639086E9 for ; Fri, 8 Feb 2013 22:33:08 +0000 (UTC) Received: from [127.0.0.1] (proxy6.corp.yahoo.com [216.145.48.19]) by mrout1-b.corp.bf1.yahoo.com (8.14.4/8.14.4/y.out) with ESMTP id r18MWiNH092420 for ; Fri, 8 Feb 2013 14:32:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=yahoo-inc.com; s=cobra; t=1360362765; bh=PmLLz28RKzwbSyEnx5KG/KrrwhZafK3qwSfpT4i9Uu8=; h=Subject:From:Reply-To:To:Content-Type:Date:Message-ID: Mime-Version:Content-Transfer-Encoding; b=SWdUvWiPAk85x4w8mKQA9y3UsJ/t7YT5nzi7yQ9ApqRuVuxjFxntaZxyj9ZpfaN+6 YGQD9rmKx8gZU/tUxMTeTBhQ8yhvWIlesi8KkOUqq4fr+X4ghqt5ju6FWPykgyL3Z5 wOWqPM0tT9nkXxzV4kZXD9cf67E7zgtvn/RMM088= Subject: 
clang/llvm failure on a project branch From: Sean Bruno To: "freebsd-hackers@freebsd.org" Content-Type: text/plain; charset="UTF-8" Date: Fri, 08 Feb 2013 14:32:44 -0800 Message-ID: <1360362764.4618.3.camel@powernoodle> Mime-Version: 1.0 X-Mailer: Evolution 2.32.1 FreeBSD GNOME Team Port Content-Transfer-Encoding: 7bit X-Milter-Version: master.31+4-gbc07cd5+ X-CLX-ID: 362764002 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: sbruno@freebsd.org List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Feb 2013 22:33:08 -0000 Not sure if I really need to have this code yet, but I found a bit of code that works under gcc but fails under clang/llvm on my project branch of pxe_http: http://svnweb.freebsd.org/base/user/sbruno/pxe_http_head/sys/boot/i386/pxe_http/ ===> i386/pxe_http (all) pxe_isr.S:45:3: error: unexpected directive .code16 .code16 ^ pxe_isr.S:45:10: error: .code16 not supported yet .code16 ^ *** [pxe_isr.o] Error code 1 Stop in /home/sbruno/sbruno/pxe_http_head/sys/boot/i386/pxe_http. *** [all] Error code 1 Stop in /home/sbruno/sbruno/pxe_http_head/sys/boot/i386. *** [all] Error code 1 Stop in /home/sbruno/sbruno/pxe_http_head/sys/boot. 
From owner-freebsd-hackers@FreeBSD.ORG Fri Feb 8 22:51:24 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id B6E3FDD4 for ; Fri, 8 Feb 2013 22:51:24 +0000 (UTC) (envelope-from seanwbruno@gmail.com) Received: from mail-pa0-f47.google.com (mail-pa0-f47.google.com [209.85.220.47]) by mx1.freebsd.org (Postfix) with ESMTP id 943847B1 for ; Fri, 8 Feb 2013 22:51:24 +0000 (UTC) Received: by mail-pa0-f47.google.com with SMTP id bj3so2342952pad.34 for ; Fri, 08 Feb 2013 14:51:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:subject:from:reply-to:to:in-reply-to:references :content-type:date:message-id:mime-version:x-mailer :content-transfer-encoding; bh=u8AnbmaRlcg1QL1VSXgENaD2nxi+oxsEcbVLL3IwPD4=; b=PkbyRIs6T8uCfXsFXmdmWwi1LLtRZy3yDnkVbGLHcGCltGvpaaJ7vNTYzLrIoc3J7B ROFIXdsvCBscMjrwHgX/dMnejKQJAwpkcedgN3hjqEcyoYxSdwCgomk9tUAqZZpeSIhk x2K1Gk/Y9+W80d0nFHrlkDAwvNw7QO6y6rN/B03HekIEPBiNeNOyA8tTsVSKFHTQ9Thc GQl+9TvwWSZIEehL78ol8nqOIzAHw/JUR0Tu02ZlfxgX+9g/axZPxNwDiQz7kMPP7EI6 rHEMUuSBAwHC3qsFudRgbSwbRnDIyd5/uynxncclK0WGTSoVxFk3ujsQbbdXWLMQeE3j 2Oug== X-Received: by 10.66.89.132 with SMTP id bo4mr21720483pab.62.1360363883607; Fri, 08 Feb 2013 14:51:23 -0800 (PST) Received: from [192.168.1.210] (c-71-202-40-63.hsd1.ca.comcast.net. 
[71.202.40.63]) by mx.google.com with ESMTPS id d8sm57772443pax.23.2013.02.08.14.51.21 (version=SSLv3 cipher=RC4-SHA bits=128/128); Fri, 08 Feb 2013 14:51:22 -0800 (PST) Subject: Re: clang/llvm failure on a project branch [fixed] From: Sean Bruno To: "freebsd-hackers@freebsd.org" In-Reply-To: <1360362764.4618.3.camel@powernoodle> References: <1360362764.4618.3.camel@powernoodle> Content-Type: text/plain; charset="UTF-8" Date: Fri, 08 Feb 2013 14:51:20 -0800 Message-ID: <1360363880.4618.4.camel@powernoodle> Mime-Version: 1.0 X-Mailer: Evolution 2.32.1 FreeBSD GNOME Team Port Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: sbruno@freebsd.org List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Feb 2013 22:51:24 -0000 On Fri, 2013-02-08 at 14:32 -0800, Sean Bruno wrote: > Not sure if I really need to have this code yet, but I found a bit of > code that works under gcc but fails under clang/llvm on my project > branch of pxe_http: > > http://svnweb.freebsd.org/base/user/sbruno/pxe_http_head/sys/boot/i386/pxe_http/ > > ===> i386/pxe_http (all) > pxe_isr.S:45:3: error: unexpected directive .code16 > .code16 > ^ > pxe_isr.S:45:10: error: .code16 not supported yet > .code16 > ^ > *** [pxe_isr.o] Error code 1 > Thanks to Hiren for the suggestion. Thanks to dim for the code. 
:-) http://svnweb.freebsd.org/base?view=revision&revision=246569 Sean From owner-freebsd-hackers@FreeBSD.ORG Fri Feb 8 23:13:48 2013 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 03C6A8B4 for ; Fri, 8 Feb 2013 23:13:48 +0000 (UTC) (envelope-from ian@FreeBSD.org) Received: from duck.symmetricom.us (duck.symmetricom.us [206.168.13.214]) by mx1.freebsd.org (Postfix) with ESMTP id C6D2E8C9 for ; Fri, 8 Feb 2013 23:13:47 +0000 (UTC) Received: from damnhippie.dyndns.org (daffy.symmetricom.us [206.168.13.218]) by duck.symmetricom.us (8.14.6/8.14.6) with ESMTP id r18NDhsd019349 for ; Fri, 8 Feb 2013 16:13:47 -0700 (MST) (envelope-from ian@FreeBSD.org) Received: from [172.22.42.240] (revolution.hippie.lan [172.22.42.240]) by damnhippie.dyndns.org (8.14.3/8.14.3) with ESMTP id r18NDeSJ036364; Fri, 8 Feb 2013 16:13:40 -0700 (MST) (envelope-from ian@FreeBSD.org) Subject: Re: Request for review, time_pps_fetch() enhancement From: Ian Lepore To: Konstantin Belousov In-Reply-To: <20130206155830.GX2522@kib.kiev.ua> References: <1360125698.93359.566.camel@revolution.hippie.lan> <20130206155830.GX2522@kib.kiev.ua> Content-Type: text/plain; charset="us-ascii" Date: Fri, 08 Feb 2013 16:13:40 -0700 Message-ID: <1360365220.4545.42.camel@revolution.hippie.lan> Mime-Version: 1.0 X-Mailer: Evolution 2.32.1 FreeBSD GNOME Team Port Content-Transfer-Encoding: 7bit Cc: "freebsd-hackers@freebsd.org" X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Feb 2013 23:13:48 -0000 On Wed, 2013-02-06 at 17:58 +0200, Konstantin Belousov wrote: > On Tue, Feb 05, 2013 at 09:41:38PM -0700, Ian Lepore wrote: > > I'd like feedback on the attached patch, which adds support to our > > 
time_pps_fetch() implementation for the blocking behaviors described in > > section 3.4.3 of RFC 2783. The existing implementation can only return > > the most recently captured data without blocking. These changes add the > > ability to block (forever or with timeout) until a new event occurs. > > > > -- Ian > > > > > Index: sys/kern/kern_tc.c > > =================================================================== > > --- sys/kern/kern_tc.c (revision 246337) > > +++ sys/kern/kern_tc.c (working copy) > > @@ -1446,6 +1446,50 @@ > > * RFC 2783 PPS-API implementation. > > */ > > > > +static int > > +pps_fetch(struct pps_fetch_args *fapi, struct pps_state *pps) > > +{ > > + int err, timo; > > + pps_seq_t aseq, cseq; > > + struct timeval tv; > > + > > + if (fapi->tsformat && fapi->tsformat != PPS_TSFMT_TSPEC) > > + return (EINVAL); > > + > > + /* > > + * If no timeout is requested, immediately return whatever values were > > + * most recently captured. If timeout seconds is -1, that's a request > > + * to block without a timeout. WITNESS won't let us sleep forever > > + * without a lock (we really don't need a lock), so just repeatedly > > + * sleep a long time. > > + */ > Regarding no need for the lock, it would just move the implementation into > the low quality one, for the case when one timestamp capture is lost > and caller of time_pps_fetch() sleeps until next pps event is generated. > > I understand the desire to avoid lock, esp. in the pps_event() called > from the arbitrary driver context. But the race is also real. > What race? A user of the pps interface understands that there is one event per second, and understands that if you ask to block until the next event at approximately the time that event is expected to occur, then it is ambiguous whether the call completes almost-immediately or in about 1 second. 
Looking at it another way, if a blocking call is made right around the time of the PPS, the thread could get preempted before getting to pps_fetch() function and not get control again until after the PPS has occurred. In that case it's going to block for about a full second, even though the call was made before top-of-second. That situation is exactly the same with or without locking, so what extra functionality is gained with locking? What guarantee does locking let us make to the caller that the lockless code doesn't? > > + if (fapi->timeout.tv_sec || fapi->timeout.tv_nsec) { > > + if (fapi->timeout.tv_sec == -1) > > + timo = 0x7fffffff; > > + else { > > + tv.tv_sec = fapi->timeout.tv_sec; > > + tv.tv_usec = fapi->timeout.tv_nsec / 1000; > > + timo = tvtohz(&tv); > > + } > > + aseq = pps->ppsinfo.assert_sequence; > > + cseq = pps->ppsinfo.clear_sequence; > > + while (aseq == pps->ppsinfo.assert_sequence && > > + cseq == pps->ppsinfo.clear_sequence) { > Note that compilers are allowed to optimize these accesses even over > the sequential point, which is the tsleep() call. Only accesses to > volatile objects are forbidden to be rearranged. > > I suggest to add volatile casts to pps in the loop condition. > Thank you. I pondered volatility, but was under the impression that the function call took care of it. I'll fix that. 
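The suggested change to the loop condition can be illustrated with a small userland model. This is a sketch under stated assumptions, not the revised patch: struct model_pps, MODEL_EWOULDBLOCK, model_wait_for_event() and the sleep stub are hypothetical names, and the sleep is a plain callback standing in for tsleep(). The point it shows is that the sequence counters are re-read through a volatile-qualified pointer on every pass, so the compiler may not reuse values loaded before the sleep.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_EWOULDBLOCK 35  /* illustrative "sleep timed out" code */

struct model_pps {
	unsigned assert_seq;  /* bumped on each assert edge */
	unsigned clear_seq;   /* bumped on each clear edge */
};

typedef int (*model_sleep_fn)(struct model_pps *);

/*
 * Block until either sequence number changes. When "forever" is set,
 * a timed-out sleep is simply retried; otherwise any sleep error is
 * returned to the caller.
 */
static int
model_wait_for_event(struct model_pps *pps, model_sleep_fn sleep_fn,
    int forever)
{
	volatile struct model_pps *vp = pps;  /* force a fresh load per test */
	unsigned aseq, cseq;
	int err;

	assert(pps != NULL && sleep_fn != NULL);
	aseq = vp->assert_seq;
	cseq = vp->clear_seq;
	while (aseq == vp->assert_seq && cseq == vp->clear_seq) {
		err = sleep_fn(pps);
		if (err == MODEL_EWOULDBLOCK && forever)
			continue;
		if (err != 0)
			return (err);
	}
	return (0);
}

static int model_sleep_calls;

/* Stub sleep: the first two calls time out, the third sees an assert event. */
static int
model_sleep_stub(struct model_pps *pps)
{
	if (++model_sleep_calls < 3)
		return (MODEL_EWOULDBLOCK);
	pps->assert_seq++;
	return (0);
}
```

With the stub above, a "forever" wait returns 0 after three sleeps, while a wait with a finite timeout returns the timeout error after the first sleep.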
-- Ian

> > +			err = tsleep(pps, PCATCH, "ppsfch", timo);
> > +			if (err == EWOULDBLOCK && fapi->timeout.tv_sec == -1) {
> > +				continue;
> > +			} else if (err != 0) {
> > +				return (err);
> > +			}
> > +		}
> > +	}
> > +
> > +	pps->ppsinfo.current_mode = pps->ppsparam.mode;
> > +	fapi->pps_info_buf = pps->ppsinfo;
> > +
> > +	return (0);
> > +}
> > +
> >  int
> >  pps_ioctl(u_long cmd, caddr_t data, struct pps_state *pps)
> >  {
> > @@ -1485,13 +1529,7 @@
> >  		return (0);
> >  	case PPS_IOC_FETCH:
> >  		fapi = (struct pps_fetch_args *)data;
> > -		if (fapi->tsformat && fapi->tsformat != PPS_TSFMT_TSPEC)
> > -			return (EINVAL);
> > -		if (fapi->timeout.tv_sec || fapi->timeout.tv_nsec)
> > -			return (EOPNOTSUPP);
> > -		pps->ppsinfo.current_mode = pps->ppsparam.mode;
> > -		fapi->pps_info_buf = pps->ppsinfo;
> > -		return (0);
> > +		return (pps_fetch(fapi, pps));
> >  #ifdef FFCLOCK
> >  	case PPS_IOC_FETCH_FFCOUNTER:
> >  		fapi_ffc = (struct pps_fetch_ffc_args *)data;
> > @@ -1540,7 +1578,7 @@
> >  void
> >  pps_init(struct pps_state *pps)
> >  {
> > -	pps->ppscap |= PPS_TSFMT_TSPEC;
> > +	pps->ppscap |= PPS_TSFMT_TSPEC | PPS_CANWAIT;
> >  	if (pps->ppscap & PPS_CAPTUREASSERT)
> >  		pps->ppscap |= PPS_OFFSETASSERT;
> >  	if (pps->ppscap & PPS_CAPTURECLEAR)
> > @@ -1680,6 +1718,9 @@
> >  		hardpps(tsp, ts.tv_nsec + 1000000000 * ts.tv_sec);
> >  	}
> >  #endif
> > +
> > +	/* Wakeup anyone sleeping in pps_fetch(). */
> > +	wakeup(pps);
> >  }
> >
> >  /*
>
> > _______________________________________________
> > freebsd-hackers@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
>

From owner-freebsd-hackers@FreeBSD.ORG Fri Feb 8 23:52:23 2013
From: Eitan Adler
Date: Fri, 8 Feb 2013 18:51:52 -0500
Subject: Re: Reviewing a FAQ change about LORs
To: Ian Lepore
Cc: Bas Smeelen, FreeBSD Hackers
In-Reply-To: <1360359118.4545.28.camel@revolution.hippie.lan>
References: <1360359118.4545.28.camel@revolution.hippie.lan>

On 8 February 2013 16:31, Ian Lepore wrote:
> On Thu, 2013-02-07 at 19:32 -0500, Eitan Adler wrote:
>> Does someone here mind reviewing
>> http://www.freebsd.org/cgi/query-pr.cgi?pr=174226 for correctness.
>>
>> Please feel free to post alternate diffs as a reply as well.
>>
>
> Does it make sense to reference a web page on LOR status that hasn't
> been updated in four years?

I was unaware of this, which is the reason I asked for review. ;)

Is there an updated page or is there no such service anymore?
-- 
Eitan Adler

From owner-freebsd-hackers@FreeBSD.ORG Fri Feb 8 23:58:22 2013
From: Ian Lepore
Date: Fri, 08 Feb 2013 16:58:09 -0700
Subject: fcntl(2) F_READAHEAD set to zero doesn't work [patch]
To: "freebsd-hackers@freebsd.org"
Message-ID: <1360367889.4545.58.camel@revolution.hippie.lan>
Content-Type: multipart/mixed; boundary="=-c2AO8q53G0nK+3Z7nGkh"

--=-c2AO8q53G0nK+3Z7nGkh
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

I discovered today that fcntl(fd, F_READAHEAD, 0) doesn't work as
advertised. It's supposed to disable readahead, but instead it restores
the default readahead behavior (if it had previously been changed), and
there is no way to disable readahead.[1]

I think the attached patch fixes it, but it's not immediately clear from
the patch why; here's the deal...

The amount of readahead is calculated by sequential_heuristic() in
vfs_vnops.c. If the FRDAHEAD flag is set on the file it uses the value
stored in the file's f_seqcount, otherwise it calculates a value (and
updates f_seqcount, which doesn't ever happen when FRDAHEAD is set).

So the patch causes the FRDAHEAD flag to be set even in the case of the
readahead amount being zero. Because it seems like a useful concept, it
still allows the readahead to be restored to default behavior, now by
passing a negative value.

Does this look right to those of you who understand this part of the
system better than I do?

-- Ian

[1] No way using F_READAHEAD; I know about POSIX_FADV_RANDOM.

--=-c2AO8q53G0nK+3Z7nGkh
Content-Disposition: inline; filename="fcntl_readahead.diff"
Content-Type: text/x-patch; name="fcntl_readahead.diff"; charset="us-ascii"
Content-Transfer-Encoding: 7bit

Index: sys/kern/kern_descrip.c
===================================================================
--- sys/kern/kern_descrip.c	(revision 246337)
+++ sys/kern/kern_descrip.c	(working copy)
@@ -776,7 +776,7 @@
 		}
 		fhold(fp);
 		FILEDESC_SUNLOCK(fdp);
-		if (arg != 0) {
+		if (arg >= 0) {
 			vp = fp->f_vnode;
 			error = vn_lock(vp, LK_SHARED);
 			if (error != 0) {
Index: lib/libc/sys/fcntl.2
===================================================================
--- lib/libc/sys/fcntl.2	(revision 246337)
+++ lib/libc/sys/fcntl.2	(working copy)
@@ -28,7 +28,7 @@
 .\"	@(#)fcntl.2	8.2 (Berkeley) 1/12/94
 .\" $FreeBSD$
 .\"
-.Dd July 27, 2012
+.Dd February 8, 2013
 .Dt FCNTL 2
 .Os
 .Sh NAME
@@ -171,7 +171,7 @@
 which is rounded up to the nearest block size.
 A zero value in
 .Fa arg
-turns off read ahead.
+turns off read ahead, a negative value restores the system default.
 .It Dv F_RDAHEAD
 Equivalent to Darwin counterpart which sets read ahead amount of 128KB
 when the third argument,

--=-c2AO8q53G0nK+3Z7nGkh--

From owner-freebsd-hackers@FreeBSD.ORG Sat Feb 9 09:47:34 2013
From: Andriy Gapon
Date: Sat, 09 Feb 2013 11:47:28 +0200
Subject: Re: Reviewing a FAQ change about LORs
To: Eitan Adler
Cc: Bas Smeelen, FreeBSD Hackers, Ian Lepore
Message-ID: <51161B30.8060508@FreeBSD.org>
References: <1360359118.4545.28.camel@revolution.hippie.lan>

on 09/02/2013 01:51 Eitan Adler said the following:
> On 8 February 2013 16:31, Ian Lepore wrote:
>> On Thu, 2013-02-07 at 19:32 -0500, Eitan Adler wrote:
>>> Does someone here mind reviewing
>>> http://www.freebsd.org/cgi/query-pr.cgi?pr=174226 for correctness.
>>>
>>> Please feel free to post alternate diffs as a reply as well.
>>>
>>
>> Does it make sense to reference a web page on LOR status that hasn't
>> been updated in four years?
>
> I was unaware of this, which is the reason I asked for review. ;)
>
> Is there an updated page or is there no such service anymore?

I suspect that the list of LORs doesn't get updated because we don't get
many new LORs that are here to stay. Those old LORs are well known,
harmless and hard to fix. We try to not introduce any new LORs of that
kind. So the new LORs are either not introduced or get fixed. Hence no
strong need for an up-to-date list.

It also seems that the interest in LORs diminished over the years as
FreeBSD SMP / locking stabilized to the point of being taken for granted
(as opposed to the early SMP days). So nobody (except developers adding
new locks) really looks at LORs until a deadlock/livelock is really hit.

On the other hand, the referenced page makes it look like new reports
are welcome and actually get processed, which is not true. Also, the
list of fixed/patched LORs has no practical use. Additionally, many LORs
there are duplicates (e.g. a LOR between devfs and any during unmount is
replicated for many values of ). There also seem to be some fixed LORs,
etc.

It probably would make sense to reference some static page with a list
of some well known LORs. But that page doesn't seem to be very useful.
-- 
Andriy Gapon

From owner-freebsd-hackers@FreeBSD.ORG Sat Feb 9 09:44:22 2013
From: Bas Smeelen
Date: Sat, 09 Feb 2013 10:44:11 +0100
Subject: Re: Reviewing a FAQ change about LORs
To: Eitan Adler
Cc: FreeBSD Hackers, Ian Lepore
Message-ID: <51161A6B.8060206@ose.nl>
References: <1360359118.4545.28.camel@revolution.hippie.lan>

On 02/09/2013 12:51 AM, Eitan Adler wrote:
> On 8 February 2013 16:31, Ian Lepore wrote:
>> On Thu, 2013-02-07 at 19:32 -0500, Eitan Adler wrote:
>>> Does someone here mind reviewing
>>> http://www.freebsd.org/cgi/query-pr.cgi?pr=174226 for correctness.
>>>
>>> Please feel free to post alternate diffs as a reply as well.
>>>
>> Does it make sense to reference a web page on LOR status that hasn't
>> been updated in four years?
> I was unaware of this, which is the reason I asked for review. ;)
>
> Is there an updated page or is there no such service anymore?
>
Hi

Sorry I didn't check this either.
There seems to be no updated page on LORs, so maybe just remove the URL?

From owner-freebsd-hackers@FreeBSD.ORG Sat Feb 9 13:47:09 2013
From: Jilles Tjoelker
Date: Sat, 9 Feb 2013 14:47:06 +0100
Subject: Re: Request for review, time_pps_fetch() enhancement
To: Konstantin Belousov
Cc: "freebsd-hackers@freebsd.org", Ian Lepore
Message-ID: <20130209134706.GB19909@stack.nl>
In-Reply-To: <20130206155830.GX2522@kib.kiev.ua>
References: <1360125698.93359.566.camel@revolution.hippie.lan> <20130206155830.GX2522@kib.kiev.ua>

On Wed, Feb 06, 2013 at 05:58:30PM +0200, Konstantin Belousov wrote:
> On Tue, Feb 05, 2013 at 09:41:38PM -0700, Ian Lepore wrote:
> > I'd like feedback on the attached patch, which adds support to our
> > time_pps_fetch() implementation for the blocking behaviors described in
> > section 3.4.3 of RFC 2783. The existing implementation can only return
> > the most recently captured data without blocking. These changes add the
> > ability to block (forever or with timeout) until a new event occurs.

> > Index: sys/kern/kern_tc.c
> > ===================================================================
> > --- sys/kern/kern_tc.c	(revision 246337)
> > +++ sys/kern/kern_tc.c	(working copy)
> > @@ -1446,6 +1446,50 @@
> >   * RFC 2783 PPS-API implementation.
> >   */
> >
> > +static int
> > +pps_fetch(struct pps_fetch_args *fapi, struct pps_state *pps)
> > +{
> > [snip]
> > +		aseq = pps->ppsinfo.assert_sequence;
> > +		cseq = pps->ppsinfo.clear_sequence;
> > +		while (aseq == pps->ppsinfo.assert_sequence &&
> > +		    cseq == pps->ppsinfo.clear_sequence) {
> Note that compilers are allowed to optimize these accesses even over
> the sequential point, which is the tsleep() call. Only accesses to
> volatile objects are forbidden to be rearranged.

> I suggest to add volatile casts to pps in the loop condition.

The memory pointed to by pps is global (other code may have a pointer to
it); therefore, the compiler must assume that the tsleep() call (which
invokes code in a different compilation unit) may modify it.

Because volatile does not make concurrent access by multiple threads
defined either, adding it here only seems to slow down the code
(potentially).
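
[Illustration, not part of the original message.] The argument above can
be sketched in plain user-space C. This is only a single-threaded
approximation under stated assumptions: struct seq_state, fake_sleep()
and wait_for_change() are hypothetical names, and fake_sleep() stands in
for tsleep() as a call the compiler has to treat as possibly modifying
the pointed-to memory. It says nothing about concurrent modification by
another thread, which is the separate issue raised in the thread.

```c
/* Hypothetical stand-in for the sequence counter in struct pps_state. */
struct seq_state {
	unsigned int seq;		/* bumped when an "event" occurs */
	int calls_until_event;		/* knob: sleeps before the event */
};

/*
 * Hypothetical stand-in for tsleep(): it receives a pointer to the
 * state, so the caller cannot assume the state is left untouched.
 */
static void
fake_sleep(void *arg)
{
	struct seq_state *st = arg;

	if (--st->calls_until_event == 0)
		st->seq++;
}

/*
 * Wait until st->seq differs from its value on entry. Returns the
 * number of times sleep_fn was called. No volatile is used: st->seq
 * is re-read after every call because the callee may have changed it.
 */
static int
wait_for_change(struct seq_state *st, void (*sleep_fn)(void *))
{
	unsigned int old = st->seq;
	int sleeps = 0;

	while (old == st->seq) {
		sleep_fn(st);
		sleeps++;
	}
	return (sleeps);
}
```

Calling wait_for_change(&st, fake_sleep) with st.calls_until_event set
to 3 loops three times and then exits once the counter has been bumped.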

> > +			err = tsleep(pps, PCATCH, "ppsfch", timo);
> > +			if (err == EWOULDBLOCK && fapi->timeout.tv_sec == -1) {
> > +				continue;
> > +			} else if (err != 0) {
> > +				return (err);
> > +			}
> > +		}
> > +	}

-- 
Jilles Tjoelker

From owner-freebsd-hackers@FreeBSD.ORG Sat Feb 9 21:06:02 2013
From: Adrian Chadd
Date: Sat, 9 Feb 2013 13:05:55 -0800
Subject: Re: Request for review, time_pps_fetch() enhancement
To: Jilles Tjoelker
Cc: Konstantin Belousov, "freebsd-hackers@freebsd.org", Ian Lepore
In-Reply-To: <20130209134706.GB19909@stack.nl>
References: <1360125698.93359.566.camel@revolution.hippie.lan> <20130206155830.GX2522@kib.kiev.ua> <20130209134706.GB19909@stack.nl>

... why aren't you using atomics? or read/write barriers?

Adrian
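
[Illustration, not part of the original message.] One way to picture the
atomics suggestion is a user-space sketch with C11 <stdatomic.h>. This is
an assumption-laden analogue, not the patch under review and not the
kernel's atomic(9) interface: struct seq_pair, struct seq_snapshot and
the three helper functions are hypothetical names that mirror the
assert_sequence/clear_sequence comparison in the quoted loop.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two counters kept in pps->ppsinfo. */
struct seq_pair {
	atomic_uint assert_sequence;
	atomic_uint clear_sequence;
};

/* Values remembered before going to sleep. */
struct seq_snapshot {
	unsigned int aseq;
	unsigned int cseq;
};

/* Record both counters with acquire loads. */
static struct seq_snapshot
seq_snapshot_take(struct seq_pair *sp)
{
	struct seq_snapshot s;

	s.aseq = atomic_load_explicit(&sp->assert_sequence,
	    memory_order_acquire);
	s.cseq = atomic_load_explicit(&sp->clear_sequence,
	    memory_order_acquire);
	return (s);
}

/* True once either counter differs from the snapshot. */
static bool
seq_changed(struct seq_pair *sp, const struct seq_snapshot *old)
{
	return (atomic_load_explicit(&sp->assert_sequence,
	    memory_order_acquire) != old->aseq ||
	    atomic_load_explicit(&sp->clear_sequence,
	    memory_order_acquire) != old->cseq);
}

/* Event side: publish a new assert event with a release increment. */
static void
seq_note_assert(struct seq_pair *sp)
{
	atomic_fetch_add_explicit(&sp->assert_sequence, 1,
	    memory_order_release);
}
```

A waiter would take a snapshot, then sleep in a loop until seq_changed()
returns true. The atomic loads make each individual read well defined
under concurrent updates, but they do not by themselves close the window
between the check and the sleep; that still depends on how sleep and
wakeup are ordered, which is the locking question debated earlier in the
thread.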