From owner-freebsd-hackers@FreeBSD.ORG Sun Aug 25 06:39:33 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id ACB3B9A6 for ; Sun, 25 Aug 2013 06:39:33 +0000 (UTC) (envelope-from jlh@FreeBSD.org) Received: from caravan.chchile.org (caravan.chchile.org [178.32.125.136]) (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 744DB2B3D for ; Sun, 25 Aug 2013 06:39:33 +0000 (UTC) Received: by caravan.chchile.org (Postfix, from userid 1000) id 77EE7C0191; Sun, 25 Aug 2013 06:39:31 +0000 (UTC) Date: Sun, 25 Aug 2013 08:39:31 +0200 From: Jeremie Le Hen To: Eitan Adler Subject: Re: weekly periodic security status Message-ID: <20130825063931.GH24767@caravan.chchile.org> Mail-Followup-To: Eitan Adler , FreeBSD Hackers References: <20130822204958.GC24767@caravan.chchile.org> <20130824204725.GF24767@caravan.chchile.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Cc: FreeBSD Hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 25 Aug 2013 06:39:33 -0000 On Sat, Aug 24, 2013 at 06:03:37PM -0400, Eitan Adler wrote: > On Sat, Aug 24, 2013 at 4:47 PM, Jeremie Le Hen wrote: > > On Sat, Aug 24, 2013 at 10:41:56AM -0400, Eitan Adler wrote: > >> On Thu, Aug 22, 2013 at 4:49 PM, Jeremie Le Hen wrote: > >> > Well, whatever, if you have any concerns, objections or comments, please > >> > speak now :). 
> >> > >> This LGTM but please include a comment above the warning with a date / > >> release number when this compatibility can be removed. > > > > If the old variable names are deprecated in releng/10, they can be > > removed in releng/11, can't they? > > Yes, and this should be indicated in a comment. When I see > "deprecated" or "old hack" or similar terms in code it takes some > archaeology to figure out when it was added and when it could be > removed. It would be nice to help the future reader a bit. The purpose of my question was to know what to put in the comment ;-). Thanks for pointing this out. -- Jeremie Le Hen Scientists say the world is made up of Protons, Neutrons and Electrons. They forgot to mention Morons. From owner-freebsd-hackers@FreeBSD.ORG Sun Aug 25 11:05:23 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id DABF04FD for ; Sun, 25 Aug 2013 11:05:22 +0000 (UTC) (envelope-from jlh@FreeBSD.org) Received: from caravan.chchile.org (caravan.chchile.org [178.32.125.136]) (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 70409278A for ; Sun, 25 Aug 2013 11:05:22 +0000 (UTC) Received: by caravan.chchile.org (Postfix, from userid 1000) id 6F72FC08E3; Sun, 25 Aug 2013 11:05:20 +0000 (UTC) Date: Sun, 25 Aug 2013 13:05:20 +0200 From: Jeremie Le Hen To: Royce Williams , Darren Pilgrim , FreeBSD Hackers Subject: Re: weekly periodic security status Message-ID: <20130825110520.GJ24767@caravan.chchile.org> Mail-Followup-To: Royce Williams , Darren Pilgrim , FreeBSD Hackers References: <20130822204958.GC24767@caravan.chchile.org> <5217AD9E.1000100@bluerosetech.com> <20130824165704.GD24767@caravan.chchile.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii 
Content-Disposition: inline In-Reply-To: <20130824165704.GD24767@caravan.chchile.org> User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 25 Aug 2013 11:05:23 -0000 On Sat, Aug 24, 2013 at 06:57:04PM +0200, Jeremie Le Hen wrote: > On Fri, Aug 23, 2013 at 08:35:55PM -0800, Royce Williams wrote: > > On Fri, Aug 23, 2013 at 10:44 AM, Darren Pilgrim < > > list_freebsd@bluerosetech.com> wrote: > > > > > Thank you for this, but if I may make one suggestion: don't combine all > > > the security report settings--keep both daily_* and weekly_*. This makes > > > possible running some security tasks on a daily basis and others on a > > > weekly basis. For example, daily pkg/portaudit checks, but weekly > > > filesystem scans. > > > > > > > Agreed. I welcome and would use the weekly option at this level of > > granularity, but would like to retain daily for many checks, and so would > > not use weekly if was an all-or-nothing option. > > Sounds like a good idea. However I don't know how to implement this > because, in the current state of the periodic security scripts, there is > no way to know whether a script had been called from daily or weekly > periodic scripts, so no way to know which variable to check. > > The easy way to work around this would be to declare an environment > variable from 450.status-security, but it sounds like a hackish way > because you create an additional dependency for the periodic security > scripts. I've modified periodic(8) to set the $PERIODIC environment variable in r254829. The attached patch does more or less what you requested, but slightly differently. 
We now have the following variables to control daily/weekly security runs:

    daily_status_security_enable="YES"
    daily_status_security_inline="NO"
    daily_status_security_output="root"

    weekly_status_security_enable="YES"
    weekly_status_security_inline="NO"
    weekly_status_security_output="root"

And the following variables to control whether you want each check to run "daily", "weekly" or directly from "crontab" (the default, backward-compatible values are shown):

    security_status_chksetuid_enable="daily"
    security_status_neggrpperm_enable="daily"
    security_status_chkmounts_enable="daily"
    security_status_chkuid0_enable="daily"
    security_status_passwdless_enable="daily"
    security_status_logincheck_enable="daily"
    security_status_chkportsum_enable="NO"
    security_status_ipfwdenied_enable="daily"
    security_status_ipfdenied_enable="daily"
    security_status_pfdenied_enable="daily"
    security_status_ipfwlimit_enable="daily"
    security_status_ipf6denied_enable="daily"
    security_status_kernelmsg_enable="daily"
    security_status_loginfail_enable="daily"
    security_status_tcpwrap_enable="daily"

The periodic.conf(5) manpage and default/periodic.conf have been updated accordingly, but I plan to rework them further after the patch is committed (in particular, grouping the security-related variables into their own section). That way the modifications made by the patch remain clear.

Patch available here: http://people.freebsd.org/~jlh/daily_or_weekly_status_security.diff

-- 
Jeremie Le Hen

Scientists say the world is made up of Protons, Neutrons and Electrons. They forgot to mention Morons.
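The per-check values described in the message above could be consumed along the following lines. This is only a minimal sketch, not code from the patch: `should_run_check` is a hypothetical helper, and it assumes the name of the current run is available as the plain string "daily" or "weekly" (for instance from the $PERIODIC variable that periodic(8) is said to set; the thread does not state its exact contents).

```shell
#!/bin/sh
# Hypothetical helper (not from the patch): decide whether a check runs now.
#   $1 - value of the check's security_status_*_enable variable
#   $2 - name of the current periodic run, assumed to be "daily" or "weekly"
should_run_check()
{
	case "$1" in
	[Dd][Aa][Ii][Ll][Yy])     [ "$2" = "daily" ] ;;
	[Ww][Ee][Ee][Kk][Ll][Yy]) [ "$2" = "weekly" ] ;;
	*)                        return 1 ;;  # "NO" or an unrecognized value: skip
	esac
}
```

A check script might then guard its body with something like `should_run_check "$security_status_chksetuid_enable" "$PERIODIC" || exit 0`, so that a check configured as "weekly" stays silent during the daily run.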
From owner-freebsd-hackers@FreeBSD.ORG Sun Aug 25 12:51:26 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id F2D9F54A; Sun, 25 Aug 2013 12:51:25 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Received: from mail-qa0-x234.google.com (mail-qa0-x234.google.com [IPv6:2607:f8b0:400d:c00::234]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id A35FD2C40; Sun, 25 Aug 2013 12:51:25 +0000 (UTC) Received: by mail-qa0-f52.google.com with SMTP id l18so474680qak.4 for ; Sun, 25 Aug 2013 05:51:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=3vJJAUnpL99nThqHiqSkK4tS7mw64j/P3K75r5YVWII=; b=Mj9RHmq4yWDmXlNS7mkXTXb2ZfkJMk8T2ZrUp238/P8sqwI+TVDWjqQLRCNapdHSmI 9OuXUzkiu77InH0cH15h7/vs+eodrVFZ0QV3xj20vMhuijTokb7V9DX2RrmaRkDkPvKy QY7Zp+eVjYj/WtU7iAs66qyYzc7Z74th4jGjrtO+JaZX9AR+Da8cR2QT5RBz+QsLnVSe jueHtqapIJCsW9SEAOKdg4vhX6IL0CAV/K2qQ/tvn1ch2acGTuMaGGBrFfObs93rzkNR /WS+TsGZPUMxlVgvbnOO888rch/CllSv1ykI/h51NLZ9TPutQanoD5lWRK/EFv6X8jOp AlDg== MIME-Version: 1.0 X-Received: by 10.49.62.3 with SMTP id u3mr11080852qer.6.1377435084885; Sun, 25 Aug 2013 05:51:24 -0700 (PDT) Sender: adrian.chadd@gmail.com Received: by 10.224.128.70 with HTTP; Sun, 25 Aug 2013 05:51:24 -0700 (PDT) In-Reply-To: <201308251227.r7PCRGog067825@svn.freebsd.org> References: <201308251227.r7PCRGog067825@svn.freebsd.org> Date: Sun, 25 Aug 2013 05:51:24 -0700 X-Google-Sender-Auth: 79w5bHKBHR1VyvC0wY-sXF8KTT0 Message-ID: Subject: Re: svn commit: r254853 - head/sys/dev/drm2 From: Adrian Chadd To: Jean-Sebastien Pedron Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: 
Mailman/MimeDel 2.1.14 Cc: "freebsd-hackers@freebsd.org" X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 25 Aug 2013 12:51:26 -0000 Hi! I'd just like to publicly thank you for all your hard work on improving the DRM2 support. This is something that's been sorely lacking lately. So, thank you! -adrian On 25 August 2013 05:27, Jean-Sebastien Pedron wrote: > Author: dumbbell > Date: Sun Aug 25 12:27:15 2013 > New Revision: 254853 > URL: http://svnweb.freebsd.org/changeset/base/254853 > > Log: > drm: Import drm_fixed.h from Linux 3.8 > > Added: > head/sys/dev/drm2/drm_fixed.h (contents, props changed) > > Added: head/sys/dev/drm2/drm_fixed.h > > ============================================================================== > --- /dev/null 00:00:00 1970 (empty, because file is newly added) > +++ head/sys/dev/drm2/drm_fixed.h Sun Aug 25 12:27:15 2013 > (r254853) > @@ -0,0 +1,72 @@ > +/* > + * Copyright 2009 Red Hat Inc. > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the > "Software"), > + * to deal in the Software without restriction, including without > limitation > + * the rights to use, copy, modify, merge, publish, distribute, > sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice shall be > included in > + * all copies or substantial portions of the Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, > EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF > MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT > SHALL > + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES > OR > + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, > + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR > + * OTHER DEALINGS IN THE SOFTWARE. > + * > + * Authors: Dave Airlie > + */ > + > +#include > +__FBSDID("$FreeBSD$"); > + > +#ifndef DRM_FIXED_H > +#define DRM_FIXED_H > + > +typedef union dfixed { > + u32 full; > +} fixed20_12; > + > + > +#define dfixed_const(A) (u32)(((A) << 12))/* + ((B + 0.000122)*4096)) */ > +#define dfixed_const_half(A) (u32)(((A) << 12) + 2048) > +#define dfixed_const_666(A) (u32)(((A) << 12) + 2731) > +#define dfixed_const_8(A) (u32)(((A) << 12) + 3277) > +#define dfixed_mul(A, B) ((u64)((u64)(A).full * (B).full + 2048) >> 12) > +#define dfixed_init(A) { .full = dfixed_const((A)) } > +#define dfixed_init_half(A) { .full = dfixed_const_half((A)) } > +#define dfixed_trunc(A) ((A).full >> 12) > +#define dfixed_frac(A) ((A).full & ((1 << 12) - 1)) > + > +static inline u32 dfixed_floor(fixed20_12 A) > +{ > + u32 non_frac = dfixed_trunc(A); > + > + return dfixed_const(non_frac); > +} > + > +static inline u32 dfixed_ceil(fixed20_12 A) > +{ > + u32 non_frac = dfixed_trunc(A); > + > + if (A.full > dfixed_const(non_frac)) > + return dfixed_const(non_frac + 1); > + else > + return dfixed_const(non_frac); > +} > + > +static inline u32 dfixed_div(fixed20_12 A, fixed20_12 B) > +{ > + u64 tmp = ((u64)A.full << 13); > + > + do_div(tmp, B.full); > + tmp += 1; > + tmp /= 2; > + return lower_32_bits(tmp); > +} > +#endif > From owner-freebsd-hackers@FreeBSD.ORG Sun Aug 25 13:44:57 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id E2DFF18B; Sun, 25 Aug 2013 13:44:57 +0000 (UTC) (envelope-from 
dumbbell@FreeBSD.org) Received: from mail.made4.biz (unknown [IPv6:2001:41d0:1:7018::1:3]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id A5B7A2E4B; Sun, 25 Aug 2013 13:44:57 +0000 (UTC) Received: from 141.7.19.93.rev.sfr.net ([93.19.7.141] helo=magellan.dumbbell.fr) by mail.made4.biz with esmtpsa (TLSv1:DHE-RSA-CAMELLIA256-SHA:256) (Exim 4.80.1 (FreeBSD)) (envelope-from ) id 1VDacg-000Nv1-Jg; Sun, 25 Aug 2013 15:44:55 +0200 Message-ID: <521A0A52.5010605@FreeBSD.org> Date: Sun, 25 Aug 2013 15:44:50 +0200 From: =?ISO-8859-1?Q?Jean-S=E9bastien_P=E9dron?= User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130816 Thunderbird/17.0.8 MIME-Version: 1.0 To: Adrian Chadd Subject: Re: svn commit: r254853 - head/sys/dev/drm2 References: <201308251227.r7PCRGog067825@svn.freebsd.org> In-Reply-To: X-Enigmail-Version: 1.5.1 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="----enig2KPWPOFWOPBNGCELWOSRB" Cc: "freebsd-hackers@freebsd.org" X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 25 Aug 2013 13:44:58 -0000 This is an OpenPGP/MIME signed message (RFC 4880 and 3156) ------enig2KPWPOFWOPBNGCELWOSRB Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable On 25.08.2013 14:51, Adrian Chadd wrote: > I'd just like to publicly thank you for all your hard work on improving the > DRM2 support. This is something that's been sorely lacking lately. > > So, thank you! Hello! Thank you very much for the kind words! There's still a long way to go until we have an up-to-date graphic stack. Several people (the FreeBSD X11 team, contributors, etc.), not just me, are really motivated to achieve this goal. 
They all deserve a "thank you"! -- Jean-Sébastien Pédron ------enig2KPWPOFWOPBNGCELWOSRB Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.21 (FreeBSD) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iEYEARECAAYFAlIaClYACgkQa+xGJsFYOlOuuACaAyJJ2Z3jT8z7riQ+8NMz4kpM V/kAnjAoIoHupUs20aFMM5mH6OF+B00i =NNdL -----END PGP SIGNATURE----- ------enig2KPWPOFWOPBNGCELWOSRB-- From owner-freebsd-hackers@FreeBSD.ORG Sun Aug 25 15:39:48 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 977A4187; Sun, 25 Aug 2013 15:39:48 +0000 (UTC) (envelope-from royce.williams@gmail.com) Received: from mail-lb0-x236.google.com (mail-lb0-x236.google.com [IPv6:2a00:1450:4010:c04::236]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id C4A6923D9; Sun, 25 Aug 2013 15:39:47 +0000 (UTC) Received: by mail-lb0-f182.google.com with SMTP id r12so814329lbi.27 for ; Sun, 25 Aug 2013 08:39:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:content-type; bh=cL7OmDXF0/rjXmrfQtGy+6CuHcAOznrGhTf/I7nmtwc=; b=l2Nm6HrtLCTOqmY3m3EoF2mzpI+aJ3tKJBK5jj9E5BAcY1BZvYTgnUM62g+0KimBON 6eLMcewaFjSiP7uukx9HvmxVqYO9I+TwQmX3GbjibkqacpLNUeOpJyLmucqmr64WFCsV R0co+Ams3HirN8HEnOlsl9AYs1UCUHcwoHuPnVguFWKwuofamimRcKXHsvZdzUHKsxor wXPNb5rYNI0oE6gr8zrlCKOQPFqdICGnjqGv6d2RntXA8MX2dYuqHzWShdz1FCyLEdmO HzjsCfYUKl76/PijJEl5cSXBTy1Xok2sc4cz7js8ka/il//FSRaz4NXkKvGvh07osVO2 mFbA== X-Received: by 10.152.2.226 with SMTP id 
2mr9364183lax.14.1377445185590; Sun, 25 Aug 2013 08:39:45 -0700 (PDT) MIME-Version: 1.0 Sender: royce.williams@gmail.com Received: by 10.112.138.227 with HTTP; Sun, 25 Aug 2013 08:39:25 -0700 (PDT) In-Reply-To: <20130825110520.GJ24767@caravan.chchile.org> References: <20130822204958.GC24767@caravan.chchile.org> <5217AD9E.1000100@bluerosetech.com> <20130824165704.GD24767@caravan.chchile.org> <20130825110520.GJ24767@caravan.chchile.org> From: Royce Williams Date: Sun, 25 Aug 2013 07:39:25 -0800 X-Google-Sender-Auth: 9wo9CrTGzKoa9p0crTn9116LJxc Message-ID: Subject: Re: weekly periodic security status To: FreeBSD Hackers , Jeremie Le Hen Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 25 Aug 2013 15:39:48 -0000 On Sun, Aug 25, 2013 at 3:05 AM, Jeremie Le Hen wrote: > On Sat, Aug 24, 2013 at 06:57:04PM +0200, Jeremie Le Hen wrote: > > On Fri, Aug 23, 2013 at 08:35:55PM -0800, Royce Williams wrote: > > > On Fri, Aug 23, 2013 at 10:44 AM, Darren Pilgrim < > > > list_freebsd@bluerosetech.com> wrote: > > > > > > > Thank you for this, but if I may make one suggestion: don't combine > all > > > > the security report settings--keep both daily_* and weekly_*. This > makes > > > > possible running some security tasks on a daily basis and others on a > > > > weekly basis. For example, daily pkg/portaudit checks, but weekly > > > > filesystem scans. > > > > > > > > > > Agreed. I welcome and would use the weekly option at this level of > > > granularity, but would like to retain daily for many checks, and so > would > > > not use weekly if was an all-or-nothing option. > > > > Sounds like a good idea. 
However I don't know how to implement this > > because, in the current state of the periodic security scripts, there is > > no way to know whether a script had been called from daily or weekly > > periodic scripts, so no way to know which variable to check. > > > > The easy way to work around this would be to declare an environment > > variable from 450.status-security, but it sounds like a hackish way > > because you create an additional dependency for the periodic security > > scripts. > > I've modified periodic(8) to set the $PERIODIC environment variable in > r254829. > > The attached patch does more or less what you requested, but slightly > differently. > > We now have the following variables to control daily/weekly security > runs: > daily_status_security_enable="YES" > daily_status_security_inline="NO" > daily_status_security_output="root" > > weekly_status_security_enable="YES" > weekly_status_security_inline="NO" > weekly_status_security_output="root" > > > And the following variables to control whether you want each check to > run "daily", "weekly" or directly from "crontab" (the default, backward > compatible values are shown): > security_status_chksetuid_enable="daily" > security_status_neggrpperm_enable="daily" > security_status_chkmounts_enable="daily" > security_status_chkuid0_enable="daily" > security_status_passwdless_enable="daily" > security_status_logincheck_enable="daily" > security_status_chkportsum_enable="NO" > security_status_ipfwdenied_enable="daily" > security_status_ipfdenied_enable="daily" > security_status_pfdenied_enable="daily" > security_status_ipfwlimit_enable="daily" > security_status_ipf6denied_enable="daily" > security_status_kernelmsg_enable="daily" > security_status_loginfail_enable="daily" > security_status_tcpwrap_enable="daily" > > > The periodic.conf(5) manpage and default/periodic.conf have been > updated accordingly, but I plan to further rework them after the patch > is committed (especially, grouping security related 
variable into their
> own section). That way the modification done by the patch remain clear.
>
> Patch available here:
> http://people.freebsd.org/~jlh/daily_or_weekly_status_security.diff
>
>

This approach creates the granularity that I was looking for, and represents a non-trivial amount of work; thanks for taking this on!

I haven't looked closely at the patch, but I do have a couple of style comments.

If someone uses an unrecognized value in the config, the phrase "this is incorrect", while strictly accurate, is a little harsh (and less FreeBSD-ish, I think). I would expect something more along the lines of "Valid values are now (daily|weekly|NO). See periodic.conf(5) for more details." This gives the user the minimum information, leaves breadcrumbs ... and is a little less potentially pejorative. :-)

Also, while I see the utility in toggling daily/weekly in the *_enable variables, how much precedent is there for overloading *_enable in this way? It's the first time that I've seen it, and it may be a mild POLA violation. Most scripts seem to use *_enable solely as a binary YES/NO toggle, and then modify script behavior with other variables (perhaps "_period" in this case?). That would double the config size, though, so I definitely see why you went this route.

Both of the above are closely related. If the period were stored in a different variable, and the default _period were "daily", then the vast majority of the user base would still be "correct" and Just Keep Working ... and only those interested in weekly updates would need to modify anything.

Just my $0.04.

Royce From owner-freebsd-hackers@FreeBSD.ORG Sun Aug 25 16:45:34 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id BC23F727; Sun, 25 Aug 2013 16:45:34 +0000 (UTC) (envelope-from list_freebsd@bluerosetech.com) Received: from rush.bluerosetech.com (rush.bluerosetech.com [IPv6:2607:fc50:1000:9b00::25]) (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 60D2526FB; Sun, 25 Aug 2013 16:45:34 +0000 (UTC) Received: from chombo.houseloki.net (c-76-27-220-79.hsd1.wa.comcast.net [76.27.220.79]) by rush.bluerosetech.com (Postfix) with ESMTPSA id 6F29711434; Sun, 25 Aug 2013 09:45:26 -0700 (PDT) Received: from [192.168.1.102] (static-71-242-248-73.phlapa.east.verizon.net [71.242.248.73]) by chombo.houseloki.net (Postfix) with ESMTPSA id 1CF8ED8D; Sun, 25 Aug 2013 09:45:23 -0700 (PDT) Message-ID: <521A34A2.303@bluerosetech.com> Date: Sun, 25 Aug 2013 12:45:22 -0400 From: Darren Pilgrim User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:17.0) Gecko/20130801 Thunderbird/17.0.8 MIME-Version: 1.0 To: Jeremie Le Hen Subject: Re: weekly periodic security status References: <20130822204958.GC24767@caravan.chchile.org> <5217AD9E.1000100@bluerosetech.com> <20130824165704.GD24767@caravan.chchile.org> <20130825110520.GJ24767@caravan.chchile.org> In-Reply-To: <20130825110520.GJ24767@caravan.chchile.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: FreeBSD Hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 25 Aug 2013 16:45:34 -0000 On 8/25/2013 7:05 
AM, Jeremie Le Hen wrote: > And the following variables to control whether you want each check to > run "daily", "weekly" or directly from "crontab" (the default, backward > compatible values are shown): What do we do if we want to run a check both daily and weekly? From owner-freebsd-hackers@FreeBSD.ORG Sun Aug 25 17:37:23 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 98C5AEA1; Sun, 25 Aug 2013 17:37:23 +0000 (UTC) (envelope-from jlh@FreeBSD.org) Received: from caravan.chchile.org (caravan.chchile.org [178.32.125.136]) (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 5F06D28DF; Sun, 25 Aug 2013 17:37:23 +0000 (UTC) Received: by caravan.chchile.org (Postfix, from userid 1000) id 4F04DC1B02; Sun, 25 Aug 2013 17:37:15 +0000 (UTC) Date: Sun, 25 Aug 2013 19:37:15 +0200 From: Jeremie Le Hen To: Darren Pilgrim Subject: Re: weekly periodic security status Message-ID: <20130825173715.GK24767@caravan.chchile.org> Mail-Followup-To: Darren Pilgrim , FreeBSD Hackers References: <20130822204958.GC24767@caravan.chchile.org> <5217AD9E.1000100@bluerosetech.com> <20130824165704.GD24767@caravan.chchile.org> <20130825110520.GJ24767@caravan.chchile.org> <521A34A2.303@bluerosetech.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <521A34A2.303@bluerosetech.com> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Jeremie Le Hen , FreeBSD Hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 25 Aug 2013 17:37:23 -0000 Hi Darren, On Sun, Aug 
25, 2013 at 12:45:22PM -0400, Darren Pilgrim wrote: > On 8/25/2013 7:05 AM, Jeremie Le Hen wrote: > > And the following variables to control whether you want each check to > > run "daily", "weekly" or directly from "crontab" (the default, backward > > compatible values are shown): > > What do we do if we want to run a check both daily and weekly? I really don't see the point of running some checks weekly when you do daily. Do you have a particular example in mind? -- Jeremie Le Hen Scientists say the world is made up of Protons, Neutrons and Electrons. They forgot to mention Morons. From owner-freebsd-hackers@FreeBSD.ORG Sun Aug 25 18:00:50 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 2492C305 for ; Sun, 25 Aug 2013 18:00:50 +0000 (UTC) (envelope-from kientzle@freebsd.org) Received: from monday.kientzle.com (99-115-135-74.uvs.sntcca.sbcglobal.net [99.115.135.74]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id CA2AA29E9 for ; Sun, 25 Aug 2013 18:00:49 +0000 (UTC) Received: (from root@localhost) by monday.kientzle.com (8.14.4/8.14.4) id r7PI0fa3016993 for freebsd-hackers@freebsd.org; Sun, 25 Aug 2013 18:00:41 GMT (envelope-from kientzle@freebsd.org) Received: from [192.168.2.123] (CiscoE3000 [192.168.1.65]) by kientzle.com with SMTP id k4kjaqr9frhnwu6edtqrb9i89w; for freebsd-hackers@freebsd.org; Sun, 25 Aug 2013 18:00:41 +0000 (UTC) (envelope-from kientzle@freebsd.org) From: Tim Kientzle Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Subject: Torture test for atomics and userland locking? 
Date: Sun, 25 Aug 2013 11:00:29 -0700 Message-Id: <615EAF1F-A719-4821-BB94-FF8C2F041AAB@freebsd.org> To: FreeBSD Hackers Mime-Version: 1.0 (Apple Message framework v1283) X-Mailer: Apple Mail (2.1283) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 25 Aug 2013 18:00:50 -0000 Can anyone suggest a good test suite for stressing atomic primitives and/or userland locking? There were some questions on freebsd-arm about verifying that our atomics are correct; a few people have looked at the code and everything looks good so far, but it would be reassuring to have some test suite that could provide additional confidence. Tim From owner-freebsd-hackers@FreeBSD.ORG Sun Aug 25 20:04:01 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id D93A1CA4; Sun, 25 Aug 2013 20:04:01 +0000 (UTC) (envelope-from jlh@FreeBSD.org) Received: from caravan.chchile.org (caravan.chchile.org [178.32.125.136]) (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 48F4D2FA7; Sun, 25 Aug 2013 20:04:00 +0000 (UTC) Received: by caravan.chchile.org (Postfix, from userid 1000) id C193EC1BD8; Sun, 25 Aug 2013 20:03:58 +0000 (UTC) Date: Sun, 25 Aug 2013 22:03:58 +0200 From: Jeremie Le Hen To: Royce Williams Subject: Re: weekly periodic security status Message-ID: <20130825200358.GL24767@caravan.chchile.org> Mail-Followup-To: Royce Williams , FreeBSD Hackers References: <20130822204958.GC24767@caravan.chchile.org> <5217AD9E.1000100@bluerosetech.com> <20130824165704.GD24767@caravan.chchile.org> 
<20130825110520.GJ24767@caravan.chchile.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Jeremie Le Hen , FreeBSD Hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 25 Aug 2013 20:04:01 -0000 On Sun, Aug 25, 2013 at 07:39:25AM -0800, Royce Williams wrote: > On Sun, Aug 25, 2013 at 3:05 AM, Jeremie Le Hen wrote: > > > On Sat, Aug 24, 2013 at 06:57:04PM +0200, Jeremie Le Hen wrote: > > > On Fri, Aug 23, 2013 at 08:35:55PM -0800, Royce Williams wrote: > > > > On Fri, Aug 23, 2013 at 10:44 AM, Darren Pilgrim < > > > > list_freebsd@bluerosetech.com> wrote: > > > > > > > > > Thank you for this, but if I may make one suggestion: don't combine > > all > > > > > the security report settings--keep both daily_* and weekly_*. This > > makes > > > > > possible running some security tasks on a daily basis and others on a > > > > > weekly basis. For example, daily pkg/portaudit checks, but weekly > > > > > filesystem scans. > > > > > > > > > > > > > Agreed. I welcome and would use the weekly option at this level of > > > > granularity, but would like to retain daily for many checks, and so > > would > > > > not use weekly if was an all-or-nothing option. > > > > > > Sounds like a good idea. However I don't know how to implement this > > > because, in the current state of the periodic security scripts, there is > > > no way to know whether a script had been called from daily or weekly > > > periodic scripts, so no way to know which variable to check. > > > > > > The easy way to work around this would be to declare an environment > > > variable from 450.status-security, but it sounds like a hackish way > > > because you create an additional dependency for the periodic security > > > scripts. 
> >
> > I've modified periodic(8) to set the $PERIODIC environment variable in
> > r254829.
> >
> > The attached patch does more or less what you requested, but slightly
> > differently.
> >
> > We now have the following variables to control daily/weekly security
> > runs:
> >     daily_status_security_enable="YES"
> >     daily_status_security_inline="NO"
> >     daily_status_security_output="root"
> >
> >     weekly_status_security_enable="YES"
> >     weekly_status_security_inline="NO"
> >     weekly_status_security_output="root"
> >
> >
> > And the following variables to control whether you want each check to
> > run "daily", "weekly" or directly from "crontab" (the default, backward
> > compatible values are shown):
> >     security_status_chksetuid_enable="daily"
> >     security_status_neggrpperm_enable="daily"
> >     security_status_chkmounts_enable="daily"
> >     security_status_chkuid0_enable="daily"
> >     security_status_passwdless_enable="daily"
> >     security_status_logincheck_enable="daily"
> >     security_status_chkportsum_enable="NO"
> >     security_status_ipfwdenied_enable="daily"
> >     security_status_ipfdenied_enable="daily"
> >     security_status_pfdenied_enable="daily"
> >     security_status_ipfwlimit_enable="daily"
> >     security_status_ipf6denied_enable="daily"
> >     security_status_kernelmsg_enable="daily"
> >     security_status_loginfail_enable="daily"
> >     security_status_tcpwrap_enable="daily"
> >
> >
> > The periodic.conf(5) manpage and default/periodic.conf have been
> > updated accordingly, but I plan to further rework them after the patch
> > is committed (especially, grouping security-related variables into their
> > own section).  That way the modifications done by the patch remain clear.
> >
> > Patch available here:
> > http://people.freebsd.org/~jlh/daily_or_weekly_status_security.diff
> >
>
> This approach creates the granularity that I was looking for, and
> represents a non-trivial amount of work; thanks for taking this on!
>
> I haven't looked closely at the patch, but I do have a couple of style
> comments.
>
> If someone uses an unrecognized value in the config, the phrase "this is
> incorrect", while strictly accurate, is a little harsh (and less
> FreeBSD-ish, I think).  I would expect something more along the lines of
> "Valid values are now (daily|weekly|NO).  See periodic.conf(5) for more
> details."  This gives the user the minimum information, leaves breadcrumbs
> ... and is a little less potentially pejorative. :-)
>
> Also, while I see the utility in toggling daily/weekly in the *_enable
> variables, how much precedent is there for overloading *_enable in this
> way?  It's the first time that I've seen it, and may be a mild POLA
> violation.  Most scripts seem to use *_enable solely as a binary YES/NO
> toggle, and then modify script behavior with other variables (perhaps
> "_period" in this case?).  That would double the config size, though, so I
> definitely see why you went this route.
>
> Both of the above are closely related.  If the period was stored in a
> different variable, and the default _period was "daily", then the vast
> majority of the user base would still be "correct" and Just Keep Working
> ... and only those interested in weekly updates would need to modify
> anything.
>
> Just my $0.04.

It's more than that; I really like your proposal.

I've implemented it here:
http://people.freebsd.org/~jlh/security_status_period.diff

-- 
Jeremie Le Hen

Scientists say the world is made up of Protons, Neutrons and Electrons.
They forgot to mention Morons.
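
The split between a YES/NO *_enable toggle and a separate *_period variable, as proposed above, could be sketched roughly as follows. This is only an illustration and not the code from the linked diff: the helper name should_run, the security_status_<name>_period variable naming, the NO default for an unset *_enable, and the assumption that $PERIODIC contains just the calling period name (such as "daily") are all hypothetical.

```shell
# Hypothetical helper: decide whether one security check should run.
# Assumption: $PERIODIC holds the name of the calling period ("daily",
# "weekly", ...); the value really set by periodic(8) may differ.
should_run() {
    name=$1    # check name, e.g. "chksetuid"

    # Look up security_status_<name>_enable and _period, with defaults
    # chosen for this sketch only.
    eval enable=\"\${security_status_${name}_enable:-NO}\"
    eval period=\"\${security_status_${name}_period:-daily}\"

    # _enable stays a plain YES/NO toggle.
    case "$enable" in
    [Yy][Ee][Ss]) ;;
    *) return 1 ;;
    esac

    # _period selects which periodic run performs the check.
    [ "$period" = "${PERIODIC:-}" ]
}
```

With a "daily" default for the period, a configuration that only sets *_enable="YES" would keep running the check from the daily run, which seems to be the backward-compatible behaviour the proposal is after.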
From owner-freebsd-hackers@FreeBSD.ORG Sun Aug 25 20:24:41 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 8DF25F48 for ; Sun, 25 Aug 2013 20:24:41 +0000 (UTC) (envelope-from jamesgosnell@gmail.com) Received: from mail-ee0-x22a.google.com (mail-ee0-x22a.google.com [IPv6:2a00:1450:4013:c00::22a]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 0C3BE208A for ; Sun, 25 Aug 2013 20:24:40 +0000 (UTC) Received: by mail-ee0-f42.google.com with SMTP id b45so1238589eek.29 for ; Sun, 25 Aug 2013 13:24:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=meiusFolAYWdqrdOpnaMDPG/OzNnBtwd8hoyL5gMwxQ=; b=DVgXclucAbO1zd1huGd683D62mLfTflOXX5nAsEirkGWwMOBZ1d56lDSGaCL47pYOj lRUexZ8HsIvQAkTwR5voJuqvf8ngP3wf/3PuuyG4PVlI5+kAX4j7f1kFKiaBkQaN9aNV NQ3UoTiM6ZB/Kd/iL151TU2SPKasgs7S/o89nvBbLuMPLw7fEECPt526YiowigLanZUu 7JdejUvVuLTciU6mW902LNMFU1UJ5oR/TMsafIyyt2Nb7h7S2WqY+BjZ9lZppwW8Y/bK D5WAR8SQKJP4zPOm//xN606OpgZM1r4WU3GWauKaW9PPWwKwCu5Ve3fV6FHlPSvfS30r g9mw== MIME-Version: 1.0 X-Received: by 10.14.103.69 with SMTP id e45mr577572eeg.51.1377462279176; Sun, 25 Aug 2013 13:24:39 -0700 (PDT) Received: by 10.223.197.8 with HTTP; Sun, 25 Aug 2013 13:24:39 -0700 (PDT) In-Reply-To: <20130825200358.GL24767@caravan.chchile.org> References: <20130822204958.GC24767@caravan.chchile.org> <5217AD9E.1000100@bluerosetech.com> <20130824165704.GD24767@caravan.chchile.org> <20130825110520.GJ24767@caravan.chchile.org> <20130825200358.GL24767@caravan.chchile.org> Date: Sun, 25 Aug 2013 15:24:39 -0500 Message-ID: Subject: Re: weekly periodic security status From: James Gosnell To: FreeBSD Hackers 
Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 25 Aug 2013 20:24:41 -0000 *Throughout the land, hard drives rejoice* On Sun, Aug 25, 2013 at 3:03 PM, Jeremie Le Hen wrote: > On Sun, Aug 25, 2013 at 07:39:25AM -0800, Royce Williams wrote: > > On Sun, Aug 25, 2013 at 3:05 AM, Jeremie Le Hen wrote: > > > > > On Sat, Aug 24, 2013 at 06:57:04PM +0200, Jeremie Le Hen wrote: > > > > On Fri, Aug 23, 2013 at 08:35:55PM -0800, Royce Williams wrote: > > > > > On Fri, Aug 23, 2013 at 10:44 AM, Darren Pilgrim < > > > > > list_freebsd@bluerosetech.com> wrote: > > > > > > > > > > > Thank you for this, but if I may make one suggestion: don't > combine > > > all > > > > > > the security report settings--keep both daily_* and weekly_*. > This > > > makes > > > > > > possible running some security tasks on a daily basis and others > on a > > > > > > weekly basis. For example, daily pkg/portaudit checks, but > weekly > > > > > > filesystem scans. > > > > > > > > > > > > > > > > Agreed. I welcome and would use the weekly option at this level of > > > > > granularity, but would like to retain daily for many checks, and so > > > would > > > > > not use weekly if was an all-or-nothing option. > > > > > > > > Sounds like a good idea. However I don't know how to implement this > > > > because, in the current state of the periodic security scripts, > there is > > > > no way to know whether a script had been called from daily or weekly > > > > periodic scripts, so no way to know which variable to check. 
> _______________________________________________ > freebsd-hackers@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" > -- James Gosnell, ACP From owner-freebsd-hackers@FreeBSD.ORG Mon Aug 26 06:32:11 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id DA864E45 for ; Mon, 26 Aug 2013 06:32:11 +0000 (UTC) (envelope-from julian@freebsd.org) Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id AA30A2B98 for ; Mon, 26 Aug 2013 06:32:11 +0000 (UTC) Received: from jre-mbp.elischer.org (ppp121-45-245-177.lns20.per2.internode.on.net [121.45.245.177]) (authenticated bits=0) by vps1.elischer.org (8.14.6/8.14.6) with ESMTP id r7Q6Vuxn011424 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Sun, 25 Aug 2013 23:31:59 -0700 (PDT) (envelope-from julian@freebsd.org) Message-ID: <521AF656.1040200@freebsd.org> Date: Mon, 26 Aug 2013 14:31:50 +0800 From: Julian Elischer User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:17.0) Gecko/20130801 Thunderbird/17.0.8 MIME-Version: 1.0 To: Royce Williams , Darren Pilgrim , FreeBSD Hackers Subject: Re: weekly periodic security status References: <20130822204958.GC24767@caravan.chchile.org> <5217AD9E.1000100@bluerosetech.com> <20130824165704.GD24767@caravan.chchile.org> <20130825110520.GJ24767@caravan.chchile.org> In-Reply-To: <20130825110520.GJ24767@caravan.chchile.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 
2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 26 Aug 2013 06:32:11 -0000 On 8/25/13 7:05 PM, Jeremie Le Hen wrote: > > And the following variables to control whether you want each check to > run "daily", "weekly" or directly from "crontab" (the default, backward > compatible values are shown): > security_status_chksetuid_enable="daily" > security_status_neggrpperm_enable="daily" > security_status_chkmounts_enable="daily" > security_status_chkuid0_enable="daily" > security_status_passwdless_enable="daily" > security_status_logincheck_enable="daily" > security_status_chkportsum_enable="NO" > security_status_ipfwdenied_enable="daily" > security_status_ipfdenied_enable="daily" > security_status_pfdenied_enable="daily" > security_status_ipfwlimit_enable="daily" > security_status_ipf6denied_enable="daily" > security_status_kernelmsg_enable="daily" > security_status_loginfail_enable="daily" > security_status_tcpwrap_enable="daily" excellent.. 
From owner-freebsd-hackers@FreeBSD.ORG Mon Aug 26 11:45:07 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 16ABC4B4 for ; Mon, 26 Aug 2013 11:45:07 +0000 (UTC) (envelope-from freebsd-hackers@m.gmane.org) Received: from plane.gmane.org (plane.gmane.org [80.91.229.3]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id C6ADB2C7B for ; Mon, 26 Aug 2013 11:45:06 +0000 (UTC) Received: from list by plane.gmane.org with local (Exim 4.69) (envelope-from ) id 1VDvEF-0003YX-Q5 for freebsd-hackers@freebsd.org; Mon, 26 Aug 2013 13:45:03 +0200 Received: from lara.cc.fer.hr ([161.53.72.113]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Mon, 26 Aug 2013 13:45:03 +0200 Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Mon, 26 Aug 2013 13:45:03 +0200 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-hackers@freebsd.org From: Ivan Voras Subject: Re: About CPU cores numbering an processor affinity Date: Mon, 26 Aug 2013 13:42:38 +0200 Lines: 64 Message-ID: References: <1D21F5BC-63CD-4B33-9286-6687E62FDB15@gmail.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="----enig2MVIUSWNMUCBQNGBXWJED" X-Complaints-To: usenet@ger.gmane.org X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130322 Thunderbird/17.0.4 In-Reply-To: <1D21F5BC-63CD-4B33-9286-6687E62FDB15@gmail.com> X-Enigmail-Version: 1.5.1 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Mon, 26 Aug 2013 11:45:07 -0000

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
------enig2MVIUSWNMUCBQNGBXWJED
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

On 23/08/2013 15:23, Dmitry Sivachenko wrote:
> Hello!
>
> I am using FreeBSD-9-STABLE on the following hardware:
>
> FreeBSD/SMP: Multiprocessor System Detected: 24 CPUs
> FreeBSD/SMP: 2 package(s) x 6 core(s) x 2 SMT threads
>
> So I have 2 physical CPUs with 6 cores each.
>
> # cpuset -g
> pid -1 mask: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23
>
>
> So each of the 24 cores is numbered 0..23.
>
> 1) In what particular order are these cores numbered?  Can I assume that 0..11 correspond to the 1st physical CPU and 12..23 to the second?  How are SMT threads numbered within each core?

You could look at the kern.sched.topology_spec sysctl, which produces
output like this:

0, 1, 2, 3, 4, 5, 6, 7
0, 1, 2, 3
4, 5, 6, 7

Note that this output is created from the kernel's own interpretation of
the physical CPUs, not necessarily from what the physical topology
actually is (but if there is a mismatch, it's a bug).
------enig2MVIUSWNMUCBQNGBXWJED Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iEYEARECAAYFAlIbPy4ACgkQ/QjVBj3/HSw26gCfdNnHGXPHILLSCoZ2ZwXt4ef6 R+wAni0wLHPCFmfqx3fUNbCYRyhN7pRP =3QNt -----END PGP SIGNATURE----- ------enig2MVIUSWNMUCBQNGBXWJED-- From owner-freebsd-hackers@FreeBSD.ORG Mon Aug 26 16:03:39 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 1E501549 for ; Mon, 26 Aug 2013 16:03:39 +0000 (UTC) (envelope-from rwmaillists@googlemail.com) Received: from mail-we0-x22f.google.com (mail-we0-x22f.google.com [IPv6:2a00:1450:400c:c03::22f]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id A4BF62D92 for ; Mon, 26 Aug 2013 16:03:38 +0000 (UTC) Received: by mail-we0-f175.google.com with SMTP id n5so2923908wev.34 for ; Mon, 26 Aug 2013 09:03:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=20120113; h=date:from:to:subject:message-id:in-reply-to:references:mime-version :content-type:content-transfer-encoding; bh=0cy+Ld9ZnpOFY3t3s90YtY0F+ti5FadDJG35Ep3ixmQ=; b=WAJEbxXzyYEKKpLfeVeV0g5a2SJCT+/pm9weF/iAFJHsvqI6kD6NGrINQvWYWcfJIF 78IXV0xeN15ekyr7zWY2qjGJBRKNyS6GmGB5nM6HeFlsc3QShJ39tV/aNH1PJFa1x4WK u3DGP/ha33Wsuz2e+V+4ai2OsshjG3VKZb+xzXwI2PqIyr0nG6pij5elu6HjaR2nhpca pv+sqB07Ud9406A9x5uk/YFO/D67O2LHuSQyQrTpfxsz8F9ZTEDo+hxbG145c8XNGIPl V8b9Y5QmVVuS+DabAx70BlLfFBC5o7atdYVw+cgzfbrCpmTv9HZmHjRjiGz1HpgrFGWa yRjw== X-Received: by 10.180.85.133 with SMTP id h5mr8176957wiz.1.1377533016912; Mon, 26 Aug 2013 
09:03:36 -0700 (PDT) Received: from gumby.homeunix.com (87-194-105-247.bethere.co.uk. [87.194.105.247]) by mx.google.com with ESMTPSA id gg10sm19426770wib.1.1969.12.31.16.00.00 (version=SSLv3 cipher=RC4-SHA bits=128/128); Mon, 26 Aug 2013 09:03:36 -0700 (PDT) Date: Mon, 26 Aug 2013 17:03:32 +0100 From: RW To: freebsd-hackers@freebsd.org Subject: Re: weekly periodic security status Message-ID: <20130826170332.416b5c55@gumby.homeunix.com> In-Reply-To: <20130825200358.GL24767@caravan.chchile.org> References: <20130822204958.GC24767@caravan.chchile.org> <5217AD9E.1000100@bluerosetech.com> <20130824165704.GD24767@caravan.chchile.org> <20130825110520.GJ24767@caravan.chchile.org> <20130825200358.GL24767@caravan.chchile.org> X-Mailer: Claws Mail 3.9.0 (GTK+ 2.24.17; amd64-portbld-freebsd10.0) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 26 Aug 2013 16:03:39 -0000

On Sun, 25 Aug 2013 22:03:58 +0200
Jeremie Le Hen wrote:

> I've implemented it here:
> http://people.freebsd.org/~jlh/security_status_period.diff
>

Doesn't this mean that if you want to run "periodic security" from
crontab or manually etc., you have to override every single entry to
"crontab" in periodic.conf?

IMO when "periodic security" is run directly, every test should run
unless it's explicitly set to NO.
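
The rule suggested above (a direct "periodic security" run executes every test that is not explicitly set to NO) might look something like the sketch below. The function name should_run_check and the assumption that $PERIODIC is unset, or at least not a bare period name, on a direct run are hypothetical; the real patch may detect a direct invocation differently.

```shell
# Hypothetical sketch of the "run everything on a direct invocation" rule.
# Assumption: $PERIODIC is "daily", "weekly" or "monthly" when the security
# scripts are started from one of those runs, and anything else (or unset)
# when "periodic security" is started from crontab or by hand.
should_run_check() {
    enable=$1    # value of the check's _enable variable
    period=$2    # value of the check's _period variable

    # An explicit NO always wins.
    case "$enable" in
    [Nn][Oo]) return 1 ;;
    esac

    case "${PERIODIC:-}" in
    daily|weekly|monthly)
        # Started from a periodic run: only matching checks run.
        [ "$period" = "$PERIODIC" ]
        ;;
    *)
        # Started directly: run every check that is not disabled.
        return 0
        ;;
    esac
}
```

The point of this shape is that nobody has to touch the per-check period settings just to get a full report from a manual run.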
From owner-freebsd-hackers@FreeBSD.ORG Mon Aug 26 16:17:29 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id B6D5BC1B for ; Mon, 26 Aug 2013 16:17:29 +0000 (UTC) (envelope-from jlh@FreeBSD.org) Received: from caravan.chchile.org (caravan.chchile.org [178.32.125.136]) (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 7D20D2E98 for ; Mon, 26 Aug 2013 16:17:29 +0000 (UTC) Received: by caravan.chchile.org (Postfix, from userid 1000) id 92A97C04FF; Mon, 26 Aug 2013 16:17:25 +0000 (UTC) Date: Mon, 26 Aug 2013 18:17:25 +0200 From: Jeremie Le Hen To: RW Subject: Re: weekly periodic security status Message-ID: <20130826161725.GN24767@caravan.chchile.org> Mail-Followup-To: RW , freebsd-hackers@freebsd.org References: <20130822204958.GC24767@caravan.chchile.org> <5217AD9E.1000100@bluerosetech.com> <20130824165704.GD24767@caravan.chchile.org> <20130825110520.GJ24767@caravan.chchile.org> <20130825200358.GL24767@caravan.chchile.org> <20130826170332.416b5c55@gumby.homeunix.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20130826170332.416b5c55@gumby.homeunix.com> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-hackers@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 26 Aug 2013 16:17:29 -0000 On Mon, Aug 26, 2013 at 05:03:32PM +0100, RW wrote: > On Sun, 25 Aug 2013 22:03:58 +0200 > Jeremie Le Hen wrote: > > > I've implemented it here: > > http://people.freebsd.org/~jlh/security_status_period.diff > > > > Doesn't this 
mean that if you want to run "periodic security" from > crontab or manually etc, you have to override every single entry to > "crontab" in period.conf. > > IMO when "periodic security" is run directly every test should run > unless it's explicitly set to NO. You are right, I've updated the patch accordingly. Thanks for your review. -- Jeremie Le Hen Scientists say the world is made up of Protons, Neutrons and Electrons. They forgot to mention Morons. From owner-freebsd-hackers@FreeBSD.ORG Mon Aug 26 16:30:15 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 4055FFC8; Mon, 26 Aug 2013 16:30:15 +0000 (UTC) (envelope-from list_freebsd@bluerosetech.com) Received: from yoshi.bluerosetech.com (yoshi.bluerosetech.com [IPv6:2607:f2f8:a450::66]) (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 262AA2F3A; Mon, 26 Aug 2013 16:30:15 +0000 (UTC) Received: from chombo.houseloki.net (c-76-27-220-79.hsd1.wa.comcast.net [76.27.220.79]) by yoshi.bluerosetech.com (Postfix) with ESMTPSA id 0851DE6001; Mon, 26 Aug 2013 09:30:06 -0700 (PDT) Received: from [192.168.1.102] (static-71-242-248-73.phlapa.east.verizon.net [71.242.248.73]) by chombo.houseloki.net (Postfix) with ESMTPSA id 86F1CE4F; Mon, 26 Aug 2013 09:29:33 -0700 (PDT) Message-ID: <521B826A.6020402@bluerosetech.com> Date: Mon, 26 Aug 2013 12:29:30 -0400 From: Darren Pilgrim User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:17.0) Gecko/20130801 Thunderbird/17.0.8 MIME-Version: 1.0 To: Jeremie Le Hen Subject: Re: weekly periodic security status References: <20130822204958.GC24767@caravan.chchile.org> <5217AD9E.1000100@bluerosetech.com> <20130824165704.GD24767@caravan.chchile.org> <20130825110520.GJ24767@caravan.chchile.org> 
<521A34A2.303@bluerosetech.com> <20130825173715.GK24767@caravan.chchile.org> In-Reply-To: <20130825173715.GK24767@caravan.chchile.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: FreeBSD Hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 26 Aug 2013 16:30:15 -0000 On 8/25/2013 1:37 PM, Jeremie Le Hen wrote: > Hi Darren, > > On Sun, Aug 25, 2013 at 12:45:22PM -0400, Darren Pilgrim wrote: >> On 8/25/2013 7:05 AM, Jeremie Le Hen wrote: >>> And the following variables to control whether you want each check to >>> run "daily", "weekly" or directly from "crontab" (the default, backward >>> compatible values are shown): >> >> What do we do if we want to run a check both daily and weekly? > > I really don't see the point of running some checks weekly when you do > daily. Do you have a particular example in mind? On one set of systems, I have a log analyser run as a periodic script. On a daily run, it grabs and filters logs into a database. On weekly runs, it does some statistical analysis of the filtered logs in the database. On monthly runs, it does a larger set of stats and a bit of housekeeping. The script lives in /usr/local/libexec and is hardlinked into the /usr/local/etc/periodic/ subtree and cases out the value of $0. The new framework would let me rely on the environment instead of $0, which, IMO, is more reliable. I'd need to be able to tell periodic to run that script with the daily, weekly and monthly security runs, though. 
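
A script like the one described above could pick its mode from the environment first and fall back to its own path. The following is a minimal sketch under stated assumptions: detect_period and run_for_period are hypothetical names, the task descriptions are placeholders, and the exact contents of $PERIODIC are assumed to include the period name somewhere in the string.

```shell
# Hypothetical sketch: choose the period from $PERIODIC, else from $0.
detect_period() {
    # $1: value of $PERIODIC (may be empty); $2: path of the script ($0)
    case "$1" in
    *daily*)   echo daily;   return ;;
    *weekly*)  echo weekly;  return ;;
    *monthly*) echo monthly; return ;;
    esac

    # Fallback: infer the period from the directory the hardlink lives in,
    # e.g. /usr/local/etc/periodic/weekly/<script>.
    case "$2" in
    */daily/*)   echo daily ;;
    */weekly/*)  echo weekly ;;
    */monthly/*) echo monthly ;;
    *)           echo unknown ;;
    esac
}

run_for_period() {
    # Placeholder actions standing in for the real log-analysis steps.
    case "$1" in
    daily)   echo "grab and filter logs into the database" ;;
    weekly)  echo "run statistical analysis on filtered logs" ;;
    monthly) echo "run larger statistics and housekeeping" ;;
    *)       echo "unknown period, nothing to do" ;;
    esac
}

# Typical use inside the script:
#   run_for_period "$(detect_period "${PERIODIC:-}" "$0")"
```

Keeping the $0 fallback means the same script would still behave sensibly on systems whose periodic(8) does not export $PERIODIC.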
From owner-freebsd-hackers@FreeBSD.ORG Mon Aug 26 21:09:14 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 13D9279D; Mon, 26 Aug 2013 21:09:14 +0000 (UTC) (envelope-from jlh@FreeBSD.org) Received: from caravan.chchile.org (caravan.chchile.org [178.32.125.136]) (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id CB6E520C1; Mon, 26 Aug 2013 21:09:13 +0000 (UTC) Received: by caravan.chchile.org (Postfix, from userid 1000) id 3D84DC0645; Mon, 26 Aug 2013 21:09:06 +0000 (UTC) Date: Mon, 26 Aug 2013 23:09:06 +0200 From: Jeremie Le Hen To: Darren Pilgrim Subject: Re: weekly periodic security status Message-ID: <20130826210906.GO24767@caravan.chchile.org> Mail-Followup-To: Darren Pilgrim , FreeBSD Hackers References: <20130822204958.GC24767@caravan.chchile.org> <5217AD9E.1000100@bluerosetech.com> <20130824165704.GD24767@caravan.chchile.org> <20130825110520.GJ24767@caravan.chchile.org> <521A34A2.303@bluerosetech.com> <20130825173715.GK24767@caravan.chchile.org> <521B826A.6020402@bluerosetech.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <521B826A.6020402@bluerosetech.com> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Jeremie Le Hen , FreeBSD Hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 26 Aug 2013 21:09:14 -0000 Darren On Mon, Aug 26, 2013 at 12:29:30PM -0400, Darren Pilgrim wrote: > >> On 8/25/2013 7:05 AM, Jeremie Le Hen wrote: > >>> And the following variables to control whether you want each check to > >>> run "daily", "weekly" or 
directly from "crontab" (the default, backward > >>> compatible values are shown): > >> > >> What do we do if we want to run a check both daily and weekly? > > > > I really don't see the point of running some checks weekly when you do > > daily. Do you have a particular example in mind? > > On one set of systems, I have a log analyser run as a periodic script. > On a daily run, it grabs and filters logs into a database. On weekly > runs, it does some statistical analysis of the filtered logs in the > database. On monthly runs, it does a larger set of stats and a bit of > housekeeping. The script lives in /usr/local/libexec and is hardlinked > into the /usr/local/etc/periodic/ subtree and cases out the value of $0. > > The new framework would let me rely on the environment instead of $0, > which, IMO, is more reliable. I'd need to be able to tell periodic to > run that script with the daily, weekly and monthly security runs, though. If I understand what you say correctly, this should continue to work. -- Jeremie Le Hen Scientists say the world is made up of Protons, Neutrons and Electrons. They forgot to mention Morons. 
From owner-freebsd-hackers@FreeBSD.ORG Mon Aug 26 22:12:10 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 8AD92B99 for ; Mon, 26 Aug 2013 22:12:10 +0000 (UTC) (envelope-from list_freebsd@bluerosetech.com) Received: from rush.bluerosetech.com (rush.bluerosetech.com [IPv6:2607:fc50:1000:9b00::25]) (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 5E3BB24B2 for ; Mon, 26 Aug 2013 22:12:10 +0000 (UTC) Received: from chombo.houseloki.net (c-76-27-220-79.hsd1.wa.comcast.net [76.27.220.79]) by rush.bluerosetech.com (Postfix) with ESMTPSA id 0E8691141D for ; Mon, 26 Aug 2013 15:12:09 -0700 (PDT) Received: from [192.168.1.102] (static-71-242-248-73.phlapa.east.verizon.net [71.242.248.73]) by chombo.houseloki.net (Postfix) with ESMTPSA id 2A894E9A for ; Mon, 26 Aug 2013 15:11:07 -0700 (PDT) Message-ID: <521BD278.80007@bluerosetech.com> Date: Mon, 26 Aug 2013 18:11:04 -0400 From: Darren Pilgrim User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:17.0) Gecko/20130801 Thunderbird/17.0.8 MIME-Version: 1.0 To: FreeBSD Hackers Subject: Re: weekly periodic security status References: <20130822204958.GC24767@caravan.chchile.org> <5217AD9E.1000100@bluerosetech.com> <20130824165704.GD24767@caravan.chchile.org> <20130825110520.GJ24767@caravan.chchile.org> <521A34A2.303@bluerosetech.com> <20130825173715.GK24767@caravan.chchile.org> <521B826A.6020402@bluerosetech.com> <20130826210906.GO24767@caravan.chchile.org> In-Reply-To: <20130826210906.GO24767@caravan.chchile.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to 
FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 26 Aug 2013 22:12:10 -0000

On 8/26/2013 5:09 PM, Jeremie Le Hen wrote:
> On Mon, Aug 26, 2013 at 12:29:30PM -0400, Darren Pilgrim wrote:
>> The new framework would let me rely on the environment instead of $0,
>> which, IMO, is more reliable.  I'd need to be able to tell periodic to
>> run that script with the daily, weekly and monthly security runs, though.
>
> If I understand what you say correctly, this should continue to work.

As I understand the framework, I can only set the enable to "NO",
"daily", "weekly" or "monthly".  Periodic will only run it in one of the
periods.  How would I set it to run in all three?  How would I set it to
run in only two of the three periods?

It seems like you could add this functionality by adding stars to each
end of the case patterns in check_yesno_period.

From owner-freebsd-hackers@FreeBSD.ORG Tue Aug 27 21:32:48 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 0505E5DA for ; Tue, 27 Aug 2013 21:32:48 +0000 (UTC) (envelope-from jlh@FreeBSD.org) Received: from caravan.chchile.org (caravan.chchile.org [178.32.125.136]) (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id BDF3D2427 for ; Tue, 27 Aug 2013 21:32:47 +0000 (UTC) Received: by caravan.chchile.org (Postfix, from userid 1000) id 74096C0E5E; Tue, 27 Aug 2013 21:32:46 +0000 (UTC) Date: Tue, 27 Aug 2013 23:32:46 +0200 From: Jeremie Le Hen To: Darren Pilgrim Subject: Re: weekly periodic security status Message-ID: <20130827213246.GS24767@caravan.chchile.org> Mail-Followup-To: Darren Pilgrim , FreeBSD Hackers References: <20130822204958.GC24767@caravan.chchile.org>
<5217AD9E.1000100@bluerosetech.com> <20130824165704.GD24767@caravan.chchile.org> <20130825110520.GJ24767@caravan.chchile.org> <521A34A2.303@bluerosetech.com> <20130825173715.GK24767@caravan.chchile.org> <521B826A.6020402@bluerosetech.com> <20130826210906.GO24767@caravan.chchile.org> <521BD278.80007@bluerosetech.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <521BD278.80007@bluerosetech.com> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: FreeBSD Hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 27 Aug 2013 21:32:48 -0000 Hi Darren, On Mon, Aug 26, 2013 at 06:11:04PM -0400, Darren Pilgrim wrote: > On 8/26/2013 5:09 PM, Jeremie Le Hen wrote: > > On Mon, Aug 26, 2013 at 12:29:30PM -0400, Darren Pilgrim wrote: > >> The new framework would let me rely on the environment instead of $0, > >> which, IMO, is more reliable. I'd need to be able to tell periodic to > >> run that script with the daily, weekly and monthly security runs, though. > > > > If I understand what you say correctly, this should continue to work. > > As I understand the framework, I can only set the enable to "NO", > "daily", "weekly" or "monthly". Periodic will only run it one of the > periods. How would I set it to run in all three? How would I set to > run in only two of the three periods? > > It seems like you could add this functionality by adding stars to each > end of the case patterns in check_yesno_period. I've committed my work. Can you propose me a patch implementing what you request please? Thanks, -- Jeremie Le Hen Scientists say the world is made up of Protons, Neutrons and Electrons. They forgot to mention Morons. 
From owner-freebsd-hackers@FreeBSD.ORG Wed Aug 28 12:10:38 2013 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 10A20A9F; Wed, 28 Aug 2013 12:10:38 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id D5E18222C; Wed, 28 Aug 2013 12:10:32 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id PAA08882; Wed, 28 Aug 2013 15:10:30 +0300 (EEST) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1VEeZy-000OS2-7U; Wed, 28 Aug 2013 15:10:30 +0300 Message-ID: <521DE891.9070107@FreeBSD.org> Date: Wed, 28 Aug 2013 15:09:53 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130810 Thunderbird/17.0.8 MIME-Version: 1.0 To: freebsd-emulation@FreeBSD.org, freebsd-gnome@FreeBSD.org Subject: Re: [kde-freebsd] virtualbox file dialog problem References: <51E6B030.1080009@FreeBSD.org> <51E793DB.2020607@FreeBSD.org> In-Reply-To: <51E793DB.2020607@FreeBSD.org> X-Enigmail-Version: 1.5.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: Greg Rivers , freebsd-standards@FreeBSD.org, kde@FreeBSD.org, freebsd-security@FreeBSD.org, freebsd-hackers@FreeBSD.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 28 Aug 2013 12:10:38 -0000 on 18/07/2013 10:06 Andriy Gapon said the following: > on 18/07/2013 03:25 Greg Rivers said the following: >> On 
Wed, 17 Jul 2013, Andriy Gapon wrote: >> >>> I run virtualbox in KDE environment. A while ago (can't say exactly when) I >>> started to have a problem where any file opening dialog would fail with this >>> message: "Cannot talk to klauncher: Not connected to D-Bus server" >>> >>> I found that setting KDE_FORK_SLAVES=1 in environment works around the problem. >> >> I reported this same problem in this[1] thread on freebsd-ports@. In that post >> I provided a link to a similar report for KDE on openSUSE that required a dbus >> patch to fix. >> >> I'm guessing that either the latest versions of VirtualBox have a bug in their >> dbus interface, or the version of dbus we have needs to be updated. >> >> [1] http://lists.freebsd.org/pipermail/freebsd-ports/2013-July/084783.html > > I saw those OpenSUSE reports but I think that they were against the much older > version of dbus. I have done some more investigation and the problem turns out to be dbus-related indeed. The problem has only a tangential relation to KDE, so I plan to drop kde@ from this thread. It has a relation to what VirtualBox does, so I am keeping emulation@. It is related to dbus, and gnome@ is its maintainer. It is also related to how issetugid(2) works, so I am including standards@, security@ and hackers@. So, please excuse me for such a wide distribution list, but I think that the solution should be negotiated among the parties involved. Now a description of the problem. 1. The VirtualBox executable is installed setuid root. Apparently, when it is run it does some privileged things and then drops all of the uids and gids (real, effective and saved) back to what they should have been originally. VirtualBox does not do any (re-)exec of itself after the above manipulations. 2. issetugid(2) (which is apparently a BSD extension) on FreeBSD does not consider the above manipulations sufficient to mark an executable as untainted. So it would return 1 for the VirtualBox process. 3.
dbus code seems to impose some limitations on communication by such "tainted" processes. It has the following code: http://cgit.freedesktop.org/dbus/dbus/tree/dbus/dbus-sysdeps-unix.c#n4139 For web-impaired :) the gist is that on BSD systems the code uses issetugid but on other systems (like Linux) it uses getresuid and getresgid and checks that all 3 uids are the same and all 3 gids are the same. As a result, on FreeBSD the dbus code would consider the VirtualBox process tainted and that impairs its communication with KDE components. On systems without issetugid or those that implement it differently, dbus would work as for a normal process and all the communications are OK. I've also verified this conclusion by forcing dbus to use the alternative logic on FreeBSD. So, possible solutions: A. change how issetugid(2) works on FreeBSD; a comment in sys_issetugid hints that other BSDs may have different behaviors B. change VirtualBox to be friendly to FreeBSD issetugid(2) and exec itself after dropping the privileges C. patch dbus port to not use issetugid(2) D. something else What do you guys think? 
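To make the difference between the two checks concrete, here is a minimal sketch in C of the getresuid/getresgid-style comparison described in point 3. It is not the dbus source: ids_all_equal() and process_looks_tainted() are hypothetical names chosen for illustration, and the _GNU_SOURCE define is only an assumption for building on glibc (FreeBSD declares these calls in unistd.h without it).

```c
#define _GNU_SOURCE	/* assumption: needed on glibc for getresuid()/getresgid() */
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Pure comparison: true when all three uids match and all three gids match. */
static bool
ids_all_equal(uid_t ruid, uid_t euid, uid_t suid,
    gid_t rgid, gid_t egid, gid_t sgid)
{
	return (ruid == euid && euid == suid &&
	    rgid == egid && egid == sgid);
}

/*
 * Hypothetical helper, not a dbus function: applies the comparison to the
 * calling process. A process that started setuid but has since dropped
 * real, effective and saved ids to the same value passes this check,
 * whereas FreeBSD's issetugid(2) would still report it as tainted.
 */
static bool
process_looks_tainted(void)
{
	uid_t ruid, euid, suid;
	gid_t rgid, egid, sgid;

	/* Fail closed: if the ids cannot be read, assume tainted. */
	if (getresuid(&ruid, &euid, &suid) != 0 ||
	    getresgid(&rgid, &egid, &sgid) != 0)
		return (true);
	return (!ids_all_equal(ruid, euid, suid, rgid, egid, sgid));
}
```

Under this check a process with ids (1000, 1000, 1000) is untainted regardless of how it was started, which is why the same VirtualBox binary behaves differently depending on which branch dbus takes.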
-- Andriy Gapon From owner-freebsd-hackers@FreeBSD.ORG Wed Aug 28 12:24:29 2013 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id C3749E1; Wed, 28 Aug 2013 12:24:29 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 59B962368; Wed, 28 Aug 2013 12:24:28 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id PAA09014; Wed, 28 Aug 2013 15:24:26 +0300 (EEST) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1VEenS-000OTH-Fs; Wed, 28 Aug 2013 15:24:26 +0300 Message-ID: <521DEBC2.1080602@FreeBSD.org> Date: Wed, 28 Aug 2013 15:23:30 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130810 Thunderbird/17.0.8 MIME-Version: 1.0 To: freebsd-emulation@FreeBSD.org Subject: Re: [kde-freebsd] virtualbox file dialog problem References: <51E6B030.1080009@FreeBSD.org> <51E793DB.2020607@FreeBSD.org> <521DE891.9070107@FreeBSD.org> In-Reply-To: <521DE891.9070107@FreeBSD.org> X-Enigmail-Version: 1.5.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-hackers@FreeBSD.org, freebsd-standards@FreeBSD.org, freebsd-security@FreeBSD.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 28 Aug 2013 12:24:29 -0000 on 28/08/2013 15:09 Andriy Gapon said the following: > Now a description of the problem. > > 1. VirtualBox executable is installed setuid root. 
Apparently, when it is run > it does some privileged things and then drops all of the uids and gids (real, > effective and saved) back to what they should have been originally. > VirtualBox does not do any (re-)exec of itself after the above manipulations. > > 2. issetugid(2) (which is apparently a BSD extension) on FreeBSD does not > consider the above manipulations as sufficient to mark an executable as > untainted. So it would return 1 for the VirtualBox process. > > 3. dbus code seems to impose some limitations on communication by such "tainted" > processes. It has the following code: > http://cgit.freedesktop.org/dbus/dbus/tree/dbus/dbus-sysdeps-unix.c#n4139 > For web-impaired :) the gist is that on BSD systems the code uses issetugid but > on other systems (like Linux) it uses getresuid and getresgid and checks that > all 3 uids are the same and all 3 gids are the same. > > As a result, on FreeBSD the dbus code would consider the VirtualBox process > tainted and that impairs its communication with KDE components. > On systems without issetugid or those that implement it differently, dbus would > work as for a normal process and all the communications are OK. > > I've also verified this conclusion by forcing dbus to use the alternative logic > on FreeBSD. > > So, possible solutions: [snip] > B. change VirtualBox to be friendly to FreeBSD issetugid(2) and exec itself > after dropping the privileges [snip] BTW, I've just found this "interesting" code in the VirtualBox sources (forgive me a full paste, but I couldn't resist): #if defined(RT_OS_DARWIN) # include # include # include # include /** Really ugly hack to shut up a silly check in AppKit. */ static void ShutUpAppKit(void) { /* Check for Snow Leopard or higher */ char szInfo[64]; int rc = RTSystemQueryOSInfo (RTSYSOSINFO_RELEASE, szInfo, sizeof(szInfo)); if ( RT_SUCCESS (rc) && szInfo[0] == '1') /* higher than 1x.x.x */ { /* * Find issetguid() and make it always return 0 by modifying the code. 
*/ void *addr = dlsym(RTLD_DEFAULT, "issetugid"); int rc = mprotect((void *)((uintptr_t)addr & ~(uintptr_t)0xfff), 0x2000, PROT_WRITE|PROT_READ|PROT_EXEC); if (!rc) ASMAtomicWriteU32((volatile uint32_t *)addr, 0xccc3c031); /* xor eax, eax; ret; int3 */ } } #endif /* DARWIN */ -- Andriy Gapon From owner-freebsd-hackers@FreeBSD.ORG Wed Aug 28 16:12:34 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id CB85C1EF; Wed, 28 Aug 2013 16:12:34 +0000 (UTC) (envelope-from gljennjohn@googlemail.com) Received: from mail-bk0-x236.google.com (mail-bk0-x236.google.com [IPv6:2a00:1450:4008:c01::236]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 0903A24EF; Wed, 28 Aug 2013 16:12:33 +0000 (UTC) Received: by mail-bk0-f54.google.com with SMTP id mz12so2303987bkb.13 for ; Wed, 28 Aug 2013 09:12:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=20120113; h=date:from:to:cc:subject:message-id:in-reply-to:references:reply-to :mime-version:content-type:content-transfer-encoding; bh=Q0UFBFG3OQNSfIVBxQ8BX4LdUN5McWVDaf4Dnx2WrqI=; b=c8xMzHl0G+Fn3wODqLRuLBFKLiniTwzjpks5wM9E/TIoQ/lAEHpGn2R2QzeA9mfKl3 TMKJHNZAnJ/JTtjYBAag7S65Pe7eLHVa0IpnHuMsEDKcjA7kUYXlbeRgn9B95HOZIv+B 3k5mIouz9LzVrfcf+ECPUK1G1uBMmYGUQszNrGnry9n5YUkt2CTgW9jh4QoUj4GdGnZG llh24dTP2PhSzr/nbU7GiZ0qAjDwUxWIiesCfjghB2hqAuZipWeG+mreRLQzSW1dsLNa NGYjEVj6/2MvJQP6ilbLHf0kapWSAOMTPv7H7TpmqGSUfiAiwVdQlbY/NoPJL4h6Big8 V5DA== X-Received: by 10.204.62.70 with SMTP id w6mr101436bkh.43.1377706351474; Wed, 28 Aug 2013 09:12:31 -0700 (PDT) Received: from ernst.home (p578E0A66.dip0.t-ipconnect.de. 
[87.142.10.102]) by mx.google.com with ESMTPSA id jt14sm6184548bkb.0.1969.12.31.16.00.00 (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Wed, 28 Aug 2013 09:12:30 -0700 (PDT) Date: Wed, 28 Aug 2013 18:12:28 +0200 From: Gary Jennejohn To: Ivan Voras Subject: Re: Call fo comments - raising vfs.ufs.dirhash_reclaimage? Message-ID: <20130828181228.0d3618dd@ernst.home> In-Reply-To: References: X-Mailer: Claws Mail 3.9.2 (GTK+ 2.24.17; amd64-portbld-freebsd10.0) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, freebsd-hackers@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: gljennjohn@googlemail.com List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 28 Aug 2013 16:12:34 -0000 On Wed, 28 Aug 2013 15:56:30 +0200 Ivan Voras wrote: [jump to the chase] > Why not leave it for sysadmins to tune it themselves if they want it: > > 1) They usually don't know about it until it's too late. > > 2) Dirhash is typically miniscule compared to todays memory sizes - a > few dozen MBs even on very busy systems, and there are no typical > situations where a large number of entries are filled in at the same > time which block eviction of a large-ish amount of memory, so having > reclaimage higher will automatically help in file-system intensive > spikes without harming other uses. > So, if I understand this correctly, a normal desktop user won't notice any real change, except that buildworld might get faster, and big servers will benefit? But could this negatively impact small, embedded systems, which usually have only small memory footprints? Although I suppose one could argue that they usually don't have large numbers of files cached in memory at any given time. 
-- Gary Jennejohn From owner-freebsd-hackers@FreeBSD.ORG Wed Aug 28 16:40:57 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 70AD89AE; Wed, 28 Aug 2013 16:40:57 +0000 (UTC) (envelope-from ivoras@gmail.com) Received: from mail-vc0-x232.google.com (mail-vc0-x232.google.com [IPv6:2607:f8b0:400c:c03::232]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 1FD9B270F; Wed, 28 Aug 2013 16:40:57 +0000 (UTC) Received: by mail-vc0-f178.google.com with SMTP id ha12so4428788vcb.9 for ; Wed, 28 Aug 2013 09:40:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:cc:content-type; bh=0pUJUXgWtnJQUozJSPQTWj89DGpC1GMmt9qnXPY4x2w=; b=tWne3NFlSrHmxKcgtqdth5eXpHIC/50CIAYQhHYD5nkYfuV6s/JcrgLE9wfY1mt1Qi 7q0AiychkTJdpW5A1ZALdkzVaq2Cjjh3anQTT0dLuj0tFmZwkIk7UDhSVBknzVRKz6TB qE2/qMdALJtlC9BpIZVXDkzxxAftBQKQ7115t7Djuaysw8k3Y9IWGAq35dVcieFGUhJ1 /Hk8+YTisTYkIm1flsHvsFYuZh1OMIoC7FJMadaDVHpUK9/w3ClgYsbuAalcxhNfMoO7 PU3KF6tTe877+HmxRi4CJd8sjAJiRX8Nak6UKhqnLWZXGoJVMG0rFFA7Ad82K9C8WmCp isqw== X-Received: by 10.52.165.111 with SMTP id yx15mr1099455vdb.33.1377708056260; Wed, 28 Aug 2013 09:40:56 -0700 (PDT) MIME-Version: 1.0 Sender: ivoras@gmail.com Received: by 10.58.229.167 with HTTP; Wed, 28 Aug 2013 09:40:15 -0700 (PDT) In-Reply-To: <20130828181228.0d3618dd@ernst.home> References: <20130828181228.0d3618dd@ernst.home> From: Ivan Voras Date: Wed, 28 Aug 2013 18:40:15 +0200 X-Google-Sender-Auth: U9LZAgQfiX3G_bOqoiVEpR_JU6g Message-ID: Subject: Re: Call fo comments - raising vfs.ufs.dirhash_reclaimage? 
To: gljennjohn@googlemail.com Content-Type: text/plain; charset=UTF-8 Cc: freebsd-fs , freebsd-hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 28 Aug 2013 16:40:57 -0000

On 28 August 2013 18:12, Gary Jennejohn wrote:
> So, if I understand this correctly, a normal desktop user won't
> notice any real change, except that buildworld might get faster,
> and big servers will benefit?

Basically, yes, but read on...

> But could this negatively impact small, embedded systems, which
> usually have only small memory footprints? Although I suppose
> one could argue that they usually don't have large numbers of
> files cached in memory at any given time.

Unless I'm wrong, the only pathological case coming from this change would be the following sequence of events:

1) Memory is scarce [*]
2) There's a sudden surge of requests for a huge number of different directories
3) There's an urgent lowmem event which is observed by dirhash, which attempts to free memory but is prevented from doing so for the next 60 seconds because all entries are young (the idea behind dirhash being that if a directory is accessed, it will probably soon be accessed again - think "ls" then "fopen", so we won't evict it until reclaimage seconds have passed)
4) the kernel runs out of memory, game over.

Note that this sequence of events could still happen right now, only over a span of 5 seconds, not 60 seconds. Note also that all of this has nothing to do with the regular file cache; dirhash is a very specific corner-case of UFS.

[*] Keep in mind that the dirhash cache even on large and busy systems is usually roughly 15-25 MB; on 16 GB machines the auto-tuning code caps it at 25 MB. As an illustration of how tiny dirhash is: a "du -c" on /usr/ports increases dirhash_mem on my desktop from 103945 to 501507 bytes.
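The role of reclaimage in step 3 can be illustrated with a toy model. This is only a sketch under stated assumptions, not the UFS dirhash code: struct dh_entry and dh_lowmem() are hypothetical names, and the ">=" age comparison is an assumption about the exact boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for one cached directory hash (hypothetical layout). */
struct dh_entry {
	long	last_used;	/* time of last access, in seconds */
	size_t	bytes;		/* memory held by this hash */
	int	live;		/* non-zero while the hash is cached */
};

/*
 * Toy lowmem handler: free only the entries that have been idle for at
 * least `reclaimage` seconds and return the number of bytes released.
 * Entries younger than that are left alone, which is the window the
 * sequence of events above depends on.
 */
static size_t
dh_lowmem(struct dh_entry *e, size_t n, long now, long reclaimage)
{
	size_t freed = 0;

	for (size_t i = 0; i < n; i++) {
		if (e[i].live && now - e[i].last_used >= reclaimage) {
			e[i].live = 0;
			freed += e[i].bytes;
		}
	}
	return (freed);
}
```

With entries that are 2, 5 and 62 seconds old, a reclaimage of 5 releases the two older ones while a reclaimage of 60 releases only the oldest; the trade-off under discussion is exactly how long recently used hashes are shielded from a lowmem event.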
One of the issues raised by davide is that the benefits from this are also miniscule and hard to prove. A simple buildworld is not a big enough load. I've seen on my own skin how increasing reclaimage helped, but that was under such specific circumstances that I'm still trying to figure out how to create a self-sustained benchmark (basically - how to provoke lowmem events?). Basically, this change will have no effect for 99.9% of users, but could save that 0.1% from going crazy. From owner-freebsd-hackers@FreeBSD.ORG Wed Aug 28 18:30:46 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 624D2B52; Wed, 28 Aug 2013 18:30:46 +0000 (UTC) (envelope-from melifaro@yandex-team.ru) Received: from forward-corp1e.mail.yandex.net (forward-corp1e.mail.yandex.net [IPv6:2a02:6b8:0:202::10]) by mx1.freebsd.org (Postfix) with ESMTP id 3E5232D9C; Wed, 28 Aug 2013 18:30:45 +0000 (UTC) Received: from smtpcorp4.mail.yandex.net (smtpcorp4.mail.yandex.net [95.108.252.2]) by forward-corp1e.mail.yandex.net (Yandex) with ESMTP id 1CAD064006D; Wed, 28 Aug 2013 22:30:42 +0400 (MSK) Received: from smtpcorp4.mail.yandex.net (localhost [127.0.0.1]) by smtpcorp4.mail.yandex.net (Yandex) with ESMTP id 018FC2C0173; Wed, 28 Aug 2013 22:30:41 +0400 (MSK) Received: from dhcp170-36-red.yandex.net (dhcp170-36-red.yandex.net [95.108.170.36]) by smtpcorp4.mail.yandex.net (nwsmtp/Yandex) with ESMTP id fyMijftbk1-UfD0Ynrx; Wed, 28 Aug 2013 22:30:41 +0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.ru; s=default; t=1377714641; bh=AW6ufh6kIvAxiM+MsIJACctEm8ZbDDYBUX9jB61JP/4=; h=Message-ID:Date:From:User-Agent:MIME-Version:To:CC:Subject: Content-Type; b=u8yrzEOZBJx9qwddK0RtoZ49tkgibPrkr2JZ+Jf85+Jaj0CrmBmy77f0747mqE+AS w7TDf3gcCt+QbncXKKcIFCxas4ITtcPy4R4fhXsELRhDjBGz7WyBYNLhnt1KibM3je 
uhHqth/NVDpzXMr5toFTgvyh0qUB0BRcXtFa/P2I= Authentication-Results: smtpcorp4.mail.yandex.net; dkim=pass header.i=@yandex-team.ru Message-ID: <521E41CB.30700@yandex-team.ru> Date: Wed, 28 Aug 2013 22:30:35 +0400 From: "Alexander V. Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130418 Thunderbird/17.0.5 MIME-Version: 1.0 To: FreeBSD Net , freebsd-hackers@freebsd.org, freebsd-arch@freebsd.org Subject: Network stack changes Content-Type: multipart/mixed; boundary="------------010308000904000207080306" X-Mailman-Approved-At: Wed, 28 Aug 2013 18:36:22 +0000 Cc: ae@FreeBSD.org, adrian@freebsd.org, Gleb Smirnoff , andre@freebsd.org, luigi@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 28 Aug 2013 18:30:46 -0000 This is a multi-part message in MIME format. --------------010308000904000207080306 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit

Hello list!

There are a lot of constantly recurring discussions about networking stack performance and changes. I'll try to summarize the current problems and possible solutions from my point of view. (Generally this is one problem: the stack is slooooooooooooooooooooooooooow, but we need to know why and what to do.)

Let's start with the current IPv4 packet flow on a typical router: http://static.ipfw.ru/images/freebsd_ipv4_flow.png (I'm sorry I can't provide this as text, since Visio doesn't have any 'ascii-art' exporter.)

Note that we are using a process-to-completion model, i.e. we process each packet in the ISR until it is either consumed by the L4+ stack, dropped, or put on the egress NIC queue.
(There is also a deferred ISR model implemented inside netisr, but it does not change much: it can help to do more fine-grained hashing (for GRE or other similar traffic), but 1) it uses per-packet mutex locking, which kills all performance, and 2) it currently does not have _any_ hashing functions (see the absence of flags in `netstat -Q`). People using http://static.ipfw.ru/patches/netisr_ip_flowid.diff (or a modified PPPoE/GRE version) report some gain, but without fixing (1) it can't help much.)

So, let's start:

1) Ixgbe uses a mutex to protect each RX ring, which is perfectly fine since there is nearly no contention (the only thing that can happen is driver reconfiguration, which is rare, and, more significantly, we take the lock once for the batch of packets received in a given interrupt). However, due to some (im)possible deadlocks the current code does a per-packet ring unlock/lock (see ixgbe_rx_input()). There was a discussion that ended with nothing: http://lists.freebsd.org/pipermail/freebsd-net/2012-October/033520.html

1*) Possible BPF users. Here we have one rlock if there are any readers present (and a mutex for any matching packets, but this is more or less OK; additionally, there is WIP to implement multiqueue BPF and there is a chance that we can reduce lock contention there). There is also an "optimize_writers" hack permitting applications like CDP to use BPF as writers without registering them as receivers (which implies an rlock).

2/3) Virtual interfaces (laggs/vlans over lagg and other similar constructions). Currently we simply use an rlock to do s/ix0/lagg0/ and, what is much funnier, we use a complex vlan_hash with another rlock to get the vlan interface from the underlying one. This is definitely not how things should be done, and it can be changed more or less easily.

There are some useful terms/techniques in the world of software/hardware routing: they have a clear 'control plane' and 'data plane' separation.
The former deals with control traffic (IGP, MLD, IGMP snooping, lagg hellos, ARP/NDP, etc.) and some data traffic (packets with TTL=1, with options, destined to hosts without an ARP/NDP record, and similar). The latter is done in hardware (or in an efficient software implementation). The control plane is responsible for providing data for efficient data plane operation. This is the point we are missing nearly everywhere.

What I want to say is: lagg is pure control-plane stuff and vlan is nearly the same. We can't apply this approach to complex cases like lagg-over-vlans-over-vlans-over-(pppoe_ng0-and_wifi0), but we definitely can do this for the most common setups (igb* or ix* in lagg, with or without vlans on top of lagg). We already have some capabilities like VLANHWFILTER/VLANHWTAG, and we can add some more. We even have per-driver hooks to program HW filtering.

One small step is to throw the packet to the vlan interface directly (P1); proof-of-concept (working in production): http://lists.freebsd.org/pipermail/freebsd-net/2013-April/035270.html

Another is to change lagg packet accounting: http://lists.freebsd.org/pipermail/svn-src-all/2013-April/067570.html Again, this is more like what HW boxes do (aggregate all counters including errors) (and I can't imagine what real error we can get from _lagg_).

4) If we are a router, we can either do the slooow ip_input() -> ip_forward() -> ip_output() cycle or use the optimized ip_fastfwd(), which falls back to the 'slow' path for multicast/options/local traffic (i.e. works exactly like the 'data plane' part). (Btw, we can consider turning net.inet.ip.fastforwarding on by default, at least for non-IPSEC kernels.)

Here we have to determine whether this is a local packet or not, i.e. F(dst_ip) returning 1 or 0. Currently we are simply using a standard rlock + hash of iface addresses. (And some consumers like ipfw(4) do the same, but without the lock.) We don't need to do this!
We can build a sorted array of IPv4 addresses, or another efficient structure, on every address change and use it unlocked with delayed garbage collection (proof-of-concept attached). (There is another thing to discuss: maybe we can do this once somewhere in ip_input and mark the mbuf as 'local/non-local'?)

5, 9) Currently we have L3 ingress/egress PFIL hooks protected by rmlocks. This is OK. However, 6) and 7) are not. The firewall can use the same pfil lock as reader protection without imposing its own lock. Currently the pfil and ipfw code is ready to do this.

8) Radix/rt* api. This is probably the worst place in the entire stack. It is toooo generic, tooo slow and buggy (do you use IPv6? you definitely know what I'm talking about).

A) It really is too generic, and the assumption that it can be (effectively) used for every family is wrong. Two examples: we don't need to look up all 128 bits of an IPv6 address. Subnets with a mask longer than /64 are not widely used (actually the only reason to use them is p2p links, due to potential ND problems). One common solution is to look up 64 bits and build another trie (or other structure) in case of collision. Another example is MPLS, where we can simply do a direct array lookup based on the ingress label.

B) It is terribly slow (AFAIR luigi@ did some performance measurements; numbers are available in one of the netmap pdfs).

C) It is not multipath-capable. Stateful (and non-working) multipath is definitely not the right way.

8*) rtentry. We are doing it wrong. Currently _every_ lookup locks/unlocks the given rte twice. The first lock is related to an old, old story about trusting IP redirects (and auto-adding host routes for them). Fortunately, it is currently disabled automatically when you turn forwarding on. The second one is much more complicated: we are assuming that rte's with a non-zero refcount value can stop the egress interface from being destroyed. This is a wrong (but widely used) assumption.
We can use delayed GC instead of locking for rte's, and this won't break things more than they are broken now (patch attached). We can't do the same for ifp structures since a) virtual ones can assume some state in the underlying physical NIC, and b) physical ones just _can_ be destroyed (maybe regardless of whether the user wants this or not, like an SFP being unplugged from the NIC) or simply lead to a kernel crash due to SW/HW inconsistency. One possible solution is to implement stable refcounts based on PCPU counters and apply those counters to ifp, but this seems to be non-trivial.

Another rtalloc(9) problem is the fact that radix is used as both the 'control plane' and the 'data plane' structure/api. Some users always want to put more information in the rte, while others want to make the rte more compact. We just need _different_ structures for that: a feature-rich, lot-of-data control plane one (to store everything we want to store, including, for example, the PID of the process originating the route) - the current radix can be modified to do this - and another, address-family-dependent structure (array, trie, or anything) which contains _only_ the data necessary to put the packet on the wire.

11) arpresolve. Currently (this was decoupled in 8.x) we have a) the ifaddr rlock and b) the lle rlock. We don't need those locks. We need to a) make the lle layer per-interface instead of global (and this can also solve the issue of multiple fibs and L2 mappings done in fib.0), b) use the rtalloc(9)-provided lock instead of separate locking, c) actually, we need to rewrite this layer, because d) lle is actually the place to do real multipath: briefly, you have an rte pointing to some special nexthop structure pointing to an lle, which has the following data: num_of_egress_ifaces: [ifindex1, ifindex2, ifindex3] | L2 data to prepend to header. A separate post will follow.
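The nexthop layout outlined in 11d) could be sketched roughly as follows. This is only an illustration under assumptions, not an existing FreeBSD structure: struct nexthop_sketch, nh_select_ifindex(), the array sizes and the modulo-based selection are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NH_MAX_IFACES	4	/* arbitrary cap chosen for the sketch */
#define NH_L2_MAXLEN	18	/* Ethernet header plus one 802.1Q tag */

/* Hypothetical nexthop record: egress candidates plus the L2 header. */
struct nexthop_sketch {
	uint8_t		n_ifaces;			/* num_of_egress_ifaces */
	uint16_t	ifindex[NH_MAX_IFACES];		/* egress interface indexes */
	uint8_t		l2_len;				/* bytes valid in l2_hdr */
	uint8_t		l2_hdr[NH_L2_MAXLEN];		/* L2 data to prepend */
};

/*
 * Pick an egress interface from the flow id. A given flow always maps to
 * the same interface, so packets within one flow keep their ordering.
 */
static uint16_t
nh_select_ifindex(const struct nexthop_sketch *nh, uint32_t flowid)
{
	assert(nh != NULL && nh->n_ifaces > 0 &&
	    nh->n_ifaces <= NH_MAX_IFACES);
	return (nh->ifindex[flowid % nh->n_ifaces]);
}
```

Because the selection depends only on the flow id and data already in the nexthop, the data path needs no per-packet lookup through a separate aggregation layer; a real implementation would also have to handle interface removal and rebalancing, which this sketch ignores.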
With this in place, we can achieve lagg traffic distribution without actually using lagg_transmit and similar stuff (at least in the most common scenarios). (For example, TCP output can definitely benefit from this, since we can compute the flowid once per TCP session and use it in every mbuf.)

So. Imagine we have done all this. How can we estimate the difference? There was a thread, started a year ago, describing 'stock' performance and the difference for various modifications. It was done on 8.x; however, I've got similar results on recent 9.x: http://lists.freebsd.org/pipermail/freebsd-net/2012-July/032680.html

Briefly: 2xE5645 @ Intel 82599 NIC. Kernel: FreeBSD-8-S r237994, stock drivers, stock routing, no FLOWTABLE, no firewall. Ixia XM2 (traffic generator) <> ix0 (FreeBSD). Ixia sends 64-byte IP packets from vlan10 (10.100.0.64 - 10.100.0.156) to destinations in vlan11 (10.100.1.128 - 10.100.1.192). Static arps are configured for all destination addresses. The traffic level is slightly above or slightly below system performance. We start from 1.4 MPPS (if we are using several routes to minimize mutex contention).

My 'current' result for the same test, on the same HW, with the following modifications:
* 1) ixgbe per-packet ring unlock removed
* P1) ixgbe is modified to do direct vlan input (so 2,3 are not used)
* 4) separate lockless in_localip() version
* 6) using existing pfil lock
* 7) using lockless version
* 8) radix converted to use rmlock instead of rlock. Delayed GC is used instead of mutexes
* 10) using existing pfil lock
* 11) using radix lock to do arpresolve(). Not using lle rlock
(so the rmlocks are the only locks used on the data path).

Additionally, ipstat counters are converted to PCPU (no real performance implications). ixgbe does not do per-packet accounting (as in head).
if_vlan counters are converted to PCPU lagg is converted to rmlock, per-packet accounting is removed (using stat from underlying interfaces) lle hash size is bumped to 1024 instead of 32 (not applicable here, but slows things down for large L2 domains) The result is 5.6 MPPS for single port (11 cores) and 6.5MPPS for lagg (16 cores), nearly the same for HT on and 22 cores. .. while Intel DPDK claims 80MPPS (and 6windgate talks about 160 or so) on the same-class hardware and _userland_ forwarding. One of key features making all such products possible (DPDK, netmap, packetshader, Cisco SW forwarding) - is use of batching instead of process-to-completion model. Batching mitigates locking cost, batching does not wash out CPU cache, and so on. So maybe we can consider passing batches from NIC to at least L2 layer with netisr? or even up to ip_input() ? Another question is about making some sort of reliable GC like ("passive serialization" or other similar not-to-pronounce-words about Linux and lockless objects). P.S. Attached patches are 1) for 8.x 2) mostly 'hacks' showing roughly how can this be done and what benefit can be achieved. --------------010308000904000207080306 Content-Type: text/plain; charset=UTF-8; name="1_ixgbe_unlock.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="1_ixgbe_unlock.diff" commit 20a52503455c80cd149d2232bdc0d37e14381178 Author: Charlie Root Date: Tue Oct 23 21:20:13 2012 +0000 Remove RX ring unlock/lock before calling if_input() from ixgbe drivers. 
diff --git a/sys/dev/ixgbe/ixgbe.c b/sys/dev/ixgbe/ixgbe.c index 5d8752b..fc1491e 100644 --- a/sys/dev/ixgbe/ixgbe.c +++ b/sys/dev/ixgbe/ixgbe.c @@ -4171,9 +4171,7 @@ ixgbe_rx_input(struct rx_ring *rxr, struct ifnet *ifp, struct mbuf *m, u32 ptype if (tcp_lro_rx(&rxr->lro, m, 0) == 0) return; } - IXGBE_RX_UNLOCK(rxr); (*ifp->if_input)(ifp, m); - IXGBE_RX_LOCK(rxr); } static __inline void --------------010308000904000207080306 Content-Type: text/plain; charset=UTF-8; name="2_ixgbe_vlans2.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="2_ixgbe_vlans2.diff" Index: sys/dev/ixgbe/ixgbe.c =================================================================== --- sys/dev/ixgbe/ixgbe.c (revision 248704) +++ sys/dev/ixgbe/ixgbe.c (working copy) @@ -2880,6 +2880,14 @@ ixgbe_allocate_queues(struct adapter *adapter) error = ENOMEM; goto err_rx_desc; } + + if ((rxr->vlans = malloc(sizeof(struct ifvlans), M_DEVBUF, + M_NOWAIT | M_ZERO)) == NULL) { + device_printf(dev, + "Critical Failure setting up vlan index\n"); + error = ENOMEM; + goto err_rx_desc; + } } /* @@ -4271,6 +4279,11 @@ ixgbe_free_receive_buffers(struct rx_ring *rxr) rxr->ptag = NULL; } + if (rxr->vlans != NULL) { + free(rxr->vlans, M_DEVBUF); + rxr->vlans = NULL; + } + return; } @@ -4303,7 +4316,7 @@ ixgbe_rx_input(struct rx_ring *rxr, struct ifnet * return; } IXGBE_RX_UNLOCK(rxr); - (*ifp->if_input)(ifp, m); + (*ifp->if_input)(m->m_pkthdr.rcvif, m); IXGBE_RX_LOCK(rxr); } @@ -4360,6 +4373,7 @@ ixgbe_rxeof(struct ix_queue *que) u16 count = rxr->process_limit; union ixgbe_adv_rx_desc *cur; struct ixgbe_rx_buf *rbuf, *nbuf; + struct ifnet *ifp_dst; IXGBE_RX_LOCK(rxr); @@ -4522,9 +4536,19 @@ ixgbe_rxeof(struct ix_queue *que) (staterr & IXGBE_RXD_STAT_VP)) vtag = le16toh(cur->wb.upper.vlan); if (vtag) { - sendmp->m_pkthdr.ether_vtag = vtag; - sendmp->m_flags |= M_VLANTAG; - } + ifp_dst = rxr->vlans->idx[EVL_VLANOFTAG(vtag)]; + + if (ifp_dst != NULL) { + ifp_dst->if_ipackets++; + 
sendmp->m_pkthdr.rcvif = ifp_dst; + } else { + sendmp->m_pkthdr.ether_vtag = vtag; + sendmp->m_flags |= M_VLANTAG; + sendmp->m_pkthdr.rcvif = ifp; + } + } else + sendmp->m_pkthdr.rcvif = ifp; + if ((ifp->if_capenable & IFCAP_RXCSUM) != 0) ixgbe_rx_checksum(staterr, sendmp, ptype); #if __FreeBSD_version >= 800000 @@ -4625,7 +4649,32 @@ ixgbe_rx_checksum(u32 staterr, struct mbuf * mp, u return; } +/* + * This routine gets real vlan ifp based on + * underlying ifp and vlan tag. + */ +static struct ifnet * +ixgbe_get_vlan(struct ifnet *ifp, uint16_t vtag) +{ + /* XXX: IFF_MONITOR */ +#if 0 + struct lagg_port *lp = ifp->if_lagg; + struct lagg_softc *sc = lp->lp_softc; + + /* Skip lagg nesting */ + while (ifp->if_type == IFT_IEEE8023ADLAG) { + lp = ifp->if_lagg; + sc = lp->lp_softc; + ifp = sc->sc_ifp; + } +#endif + /* Get vlan interface based on tag */ + ifp = VLAN_DEVAT(ifp, vtag); + + return (ifp); +} + /* ** This routine is run via an vlan config EVENT, ** it enables us to use the HW Filter table since @@ -4637,7 +4686,9 @@ static void ixgbe_register_vlan(void *arg, struct ifnet *ifp, u16 vtag) { struct adapter *adapter = ifp->if_softc; - u16 index, bit; + u16 index, bit, j; + struct rx_ring *rxr; + struct ifnet *ifv; if (ifp->if_softc != arg) /* Not our event */ return; @@ -4645,7 +4696,20 @@ ixgbe_register_vlan(void *arg, struct ifnet *ifp, if ((vtag == 0) || (vtag > 4095)) /* Invalid */ return; + ifv = ixgbe_get_vlan(ifp, vtag); + IXGBE_CORE_LOCK(adapter); + + if (ifp->if_capenable & IFCAP_VLAN_HWFILTER) { + rxr = adapter->rx_rings; + + for (j = 0; j < adapter->num_queues; j++, rxr++) { + IXGBE_RX_LOCK(rxr); + rxr->vlans->idx[vtag] = ifv; + IXGBE_RX_UNLOCK(rxr); + } + } + index = (vtag >> 5) & 0x7F; bit = vtag & 0x1F; adapter->shadow_vfta[index] |= (1 << bit); @@ -4663,7 +4727,8 @@ static void ixgbe_unregister_vlan(void *arg, struct ifnet *ifp, u16 vtag) { struct adapter *adapter = ifp->if_softc; - u16 index, bit; + u16 index, bit, j; + struct rx_ring *rxr; if 
(ifp->if_softc != arg) return; @@ -4672,6 +4737,15 @@ ixgbe_unregister_vlan(void *arg, struct ifnet *ifp return; IXGBE_CORE_LOCK(adapter); + + rxr = adapter->rx_rings; + + for (j = 0; j < adapter->num_queues; j++, rxr++) { + IXGBE_RX_LOCK(rxr); + rxr->vlans->idx[vtag] = NULL; + IXGBE_RX_UNLOCK(rxr); + } + index = (vtag >> 5) & 0x7F; bit = vtag & 0x1F; adapter->shadow_vfta[index] &= ~(1 << bit); @@ -4686,8 +4760,8 @@ ixgbe_setup_vlan_hw_support(struct adapter *adapte { struct ifnet *ifp = adapter->ifp; struct ixgbe_hw *hw = &adapter->hw; + u32 ctrl, j; struct rx_ring *rxr; - u32 ctrl; /* @@ -4713,6 +4787,15 @@ ixgbe_setup_vlan_hw_support(struct adapter *adapte if (ifp->if_capenable & IFCAP_VLAN_HWFILTER) { ctrl &= ~IXGBE_VLNCTRL_CFIEN; ctrl |= IXGBE_VLNCTRL_VFE; + } else { + /* Zero vlan table */ + rxr = adapter->rx_rings; + + for (j = 0; j < adapter->num_queues; j++, rxr++) { + IXGBE_RX_LOCK(rxr); + memset(rxr->vlans->idx, 0, sizeof(struct ifvlans)); + IXGBE_RX_UNLOCK(rxr); + } } if (hw->mac.type == ixgbe_mac_82598EB) ctrl |= IXGBE_VLNCTRL_VME; Index: sys/dev/ixgbe/ixgbe.h =================================================================== --- sys/dev/ixgbe/ixgbe.h (revision 248704) +++ sys/dev/ixgbe/ixgbe.h (working copy) @@ -284,6 +284,11 @@ struct ix_queue { u64 irqs; }; +struct ifvlans { + struct ifnet *idx[4096]; +}; + + /* * The transmit ring, one per queue */ @@ -307,7 +312,6 @@ struct tx_ring { } queue_status; u32 txd_cmd; bus_dma_tag_t txtag; - char mtx_name[16]; #ifndef IXGBE_LEGACY_TX struct buf_ring *br; struct task txq_task; @@ -324,6 +328,7 @@ struct tx_ring { unsigned long no_tx_dma_setup; u64 no_desc_avail; u64 total_packets; + char mtx_name[16]; }; @@ -346,8 +351,8 @@ struct rx_ring { u16 num_desc; u16 mbuf_sz; u16 process_limit; - char mtx_name[16]; struct ixgbe_rx_buf *rx_buffers; + struct ifvlans *vlans; bus_dma_tag_t ptag; u32 bytes; /* Used for AIM calc */ @@ -363,6 +368,7 @@ struct rx_ring { #ifdef IXGBE_FDIR u64 flm; #endif + char 
mtx_name[16]; }; /* Our adapter structure */ --------------010308000904000207080306 Content-Type: text/plain; charset=UTF-8; name="3_in_localip_fast.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="3_in_localip_fast.diff" commit 7f1103ac622881182642b2d3ae17b6ff484c1293 Author: Charlie Root Date: Sun Apr 7 23:50:26 2013 +0000 Use lockles in_localip_fast() function. diff --git a/sys/net/route.h b/sys/net/route.h index 4d9371b..f588f03 100644 --- a/sys/net/route.h +++ b/sys/net/route.h @@ -365,6 +365,7 @@ void rt_maskedcopy(struct sockaddr *, struct sockaddr *, struct sockaddr *); */ #define RTGC_ROUTE 1 #define RTGC_IF 3 +#define RTGC_IFADDR 4 int rtexpunge(struct rtentry *); diff --git a/sys/netinet/in.c b/sys/netinet/in.c index 5341918..a83b8a9 100644 --- a/sys/netinet/in.c +++ b/sys/netinet/in.c @@ -93,6 +93,20 @@ VNET_DECLARE(struct inpcbinfo, ripcbinfo); VNET_DECLARE(struct arpstat, arpstat); /* ARP statistics, see if_arp.h */ #define V_arpstat VNET(arpstat) +struct in_ifaddrf { + struct in_ifaddrf *next; + struct in_addr addr; +}; + +struct in_ifaddrhashf { + uint32_t hmask; + uint32_t count; + struct in_ifaddrf **hash; +}; + +VNET_DEFINE(struct in_ifaddrhashf *, in_ifaddrhashtblf) = NULL; /* inet addr fast hash table */ +#define V_in_ifaddrhashtblf VNET(in_ifaddrhashtblf) + /* * Return 1 if an internet address is for a ``local'' host * (one to which we have a connection). 
If subnetsarelocal @@ -145,6 +159,120 @@ in_localip(struct in_addr in) return (0); } +int +in_localip_fast(struct in_addr in) +{ + struct in_ifaddrf *rec; + struct in_ifaddrhashf *f; + + if ((f = V_in_ifaddrhashtblf) == NULL) + return (0); + + rec = f->hash[INADDR_HASHVAL(in) & f->hmask]; + + while (rec != NULL && rec->addr.s_addr != in.s_addr) + rec = rec->next; + + if (rec != NULL) + return (1); + + return (0); +} + +struct in_ifaddrhashf * +in_hash_alloc(int additional) +{ + int count, hsize, i; + struct in_ifaddr *ia; + struct in_ifaddrhashf *new; + + count = additional + 1; + + IN_IFADDR_RLOCK(); + for (i = 0; i < INADDR_NHASH; i++) { + LIST_FOREACH(ia, &V_in_ifaddrhashtbl[i], ia_hash) + count++; + } + IN_IFADDR_RUNLOCK(); + + /* roundup to the next power of 2 */ + hsize = (1UL << flsl(count - 1)); + + new = malloc(sizeof(struct in_ifaddrhashf) + + sizeof(void *) * hsize + + sizeof(struct in_ifaddrf) * count, M_IFADDR, + M_NOWAIT | M_ZERO); + + if (new == NULL) + return (NULL); + + new->count = count; + new->hmask = hsize - 1; + new->hash = (struct in_ifaddrf **)(new + 1); + + return (new); +} + +int +in_hash_build(struct in_ifaddrhashf *new) +{ + struct in_ifaddr *ia; + int i, j, count, hsize, r; + struct in_ifaddrhashf *old; + struct in_ifaddrf *rec, *tmp; + + count = new->count - 1; + hsize = new->hmask + 1; + rec = (struct in_ifaddrf *)&new->hash[hsize]; + + IN_IFADDR_RLOCK(); + for (i = 0; i < INADDR_NHASH; i++) { + LIST_FOREACH(ia, &V_in_ifaddrhashtbl[i], ia_hash) { + rec->addr.s_addr = IA_SIN(ia)->sin_addr.s_addr; + + j = INADDR_HASHVAL(rec->addr) & new->hmask; + if ((tmp = new->hash[j]) == NULL) + new->hash[j] = rec; + else { + while (tmp->next) + tmp = tmp->next; + tmp->next = rec; + } + + rec++; + count--; + + /* End of memory */ + if (count < 0) + break; + } + + /* End of memory */ + if (count < 0) + break; + } + IN_IFADDR_RUNLOCK(); + + /* If count >0 then we succeeded in building hash. 
Stop cycle */ + + if (count >= 0) { + old = V_in_ifaddrhashtblf; + V_in_ifaddrhashtblf = new; + + rtgc_free(RTGC_IFADDR, old, 0); + + return (1); + } + + /* Fail. */ + if (new) + free(new, M_IFADDR); + + return (0); +} + + + /* * Determine whether an IP address is in a reserved set of addresses * that may not be forwarded, or whether datagrams to that destination @@ -239,6 +367,7 @@ in_control(struct socket *so, u_long cmd, caddr_t data, struct ifnet *ifp, struct sockaddr_in oldaddr; int error, hostIsNew, iaIsNew, maskIsNew; int iaIsFirst; + struct in_ifaddrhashf *new_hash; ia = NULL; iaIsFirst = 0; @@ -405,6 +534,11 @@ in_control(struct socket *so, u_long cmd, caddr_t data, struct ifnet *ifp, goto out; } + if ((new_hash = in_hash_alloc(1)) == NULL) { + error = ENOBUFS; + goto out; + } + ifa = &ia->ia_ifa; ifa_init(ifa); ifa->ifa_addr = (struct sockaddr *)&ia->ia_addr; @@ -427,6 +561,8 @@ in_control(struct socket *so, u_long cmd, caddr_t data, struct ifnet *ifp, IN_IFADDR_WLOCK(); TAILQ_INSERT_TAIL(&V_in_ifaddrhead, ia, ia_link); IN_IFADDR_WUNLOCK(); + + in_hash_build(new_hash); iaIsNew = 1; } break; @@ -649,6 +785,8 @@ in_control(struct socket *so, u_long cmd, caddr_t data, struct ifnet *ifp, ifa_free(&if_ia->ia_ifa); } else IN_IFADDR_WUNLOCK(); + if ((new_hash = in_hash_alloc(0)) != NULL) + in_hash_build(new_hash); ifa_free(&ia->ia_ifa); /* in_ifaddrhead */ out: if (ia != NULL) @@ -852,6 +990,7 @@ in_ifinit(struct ifnet *ifp, struct in_ifaddr *ia, struct sockaddr_in *sin, register u_long i = ntohl(sin->sin_addr.s_addr); struct sockaddr_in oldaddr; int s = splimp(), flags = RTF_UP, error = 0; + struct in_ifaddrhashf *new_hash; oldaddr = ia->ia_addr; if (oldaddr.sin_family == AF_INET) @@ -862,6 +1001,9 @@ in_ifinit(struct ifnet *ifp, struct in_ifaddr *ia, struct sockaddr_in *sin, LIST_INSERT_HEAD(INADDR_HASH(ia->ia_addr.sin_addr.s_addr), ia, ia_hash); IN_IFADDR_WUNLOCK(); + + if ((new_hash = in_hash_alloc(1)) != NULL) + in_hash_build(new_hash); } /* * Give the 
interface a chance to initialize @@ -887,6 +1029,8 @@ in_ifinit(struct ifnet *ifp, struct in_ifaddr *ia, struct sockaddr_in *sin, */ LIST_REMOVE(ia, ia_hash); IN_IFADDR_WUNLOCK(); + if ((new_hash = in_hash_alloc(1)) != NULL) + in_hash_build(new_hash); return (error); } } diff --git a/sys/netinet/in.h b/sys/netinet/in.h index b03e74c..948938a 100644 --- a/sys/netinet/in.h +++ b/sys/netinet/in.h @@ -741,6 +741,7 @@ int in_broadcast(struct in_addr, struct ifnet *); int in_canforward(struct in_addr); int in_localaddr(struct in_addr); int in_localip(struct in_addr); +int in_localip_fast(struct in_addr); int inet_aton(const char *, struct in_addr *); /* in libkern */ char *inet_ntoa(struct in_addr); /* in libkern */ char *inet_ntoa_r(struct in_addr ina, char *buf); /* in libkern */ diff --git a/sys/netinet/ip_fastfwd.c b/sys/netinet/ip_fastfwd.c index 692e3e5..f7734a9 100644 --- a/sys/netinet/ip_fastfwd.c +++ b/sys/netinet/ip_fastfwd.c @@ -347,7 +347,7 @@ ip_fastforward(struct mbuf *m) /* * Is it for a local address on this host? */ - if (in_localip(ip->ip_dst)) + if (in_localip_fast(ip->ip_dst)) return m; //IPSTAT_INC(ips_total); @@ -390,7 +390,7 @@ ip_fastforward(struct mbuf *m) /* * Is it now for a local address on this host? */ - if (in_localip(dest)) + if (in_localip_fast(dest)) goto forwardlocal; /* * Go on with new destination address @@ -479,7 +479,7 @@ passin: /* * Is it now for a local address on this host? */ - if (m->m_flags & M_FASTFWD_OURS || in_localip(dest)) { + if (m->m_flags & M_FASTFWD_OURS || in_localip_fast(dest)) { forwardlocal: /* * Return packet for processing by ip_input(). 
diff --git a/sys/netinet/ipfw/ip_fw2.c b/sys/netinet/ipfw/ip_fw2.c index b76a638..53f6e97 100644 --- a/sys/netinet/ipfw/ip_fw2.c +++ b/sys/netinet/ipfw/ip_fw2.c @@ -1450,10 +1450,7 @@ do { \ case O_IP_SRC_ME: if (is_ipv4) { - struct ifnet *tif; - - INADDR_TO_IFP(src_ip, tif); - match = (tif != NULL); + match = in_localip_fast(src_ip); break; } #ifdef INET6 @@ -1490,10 +1487,7 @@ do { \ case O_IP_DST_ME: if (is_ipv4) { - struct ifnet *tif; - - INADDR_TO_IFP(dst_ip, tif); - match = (tif != NULL); + match = in_localip_fast(dst_ip); break; } #ifdef INET6 diff --git a/sys/netinet/ipfw/ip_fw_pfil.c b/sys/netinet/ipfw/ip_fw_pfil.c index a21f501..bdf8beb 100644 --- a/sys/netinet/ipfw/ip_fw_pfil.c +++ b/sys/netinet/ipfw/ip_fw_pfil.c @@ -184,7 +184,7 @@ again: bcopy(args.next_hop, (fwd_tag+1), sizeof(struct sockaddr_in)); m_tag_prepend(*m0, fwd_tag); - if (in_localip(args.next_hop->sin_addr)) + if (in_localip_fast(args.next_hop->sin_addr)) (*m0)->m_flags |= M_FASTFWD_OURS; } #endif /* INET || INET6 */ --------------010308000904000207080306 Content-Type: text/plain; charset=UTF-8; name="80_use_rtgc.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="80_use_rtgc.diff" commit 67a74d91a7b4a47a83fcfa5e79a6c6f0b4b1122d Author: Charlie Root Date: Fri Oct 26 17:10:52 2012 +0000 Remove rte locking for IPv4. Remove one of 2 locks from IPv6 rtes diff --git a/sys/net/if.c b/sys/net/if.c index a875326..eb6a723 100644 --- a/sys/net/if.c +++ b/sys/net/if.c @@ -487,6 +487,13 @@ if_alloc(u_char type) return (ifp); } + +void +if_free_real(struct ifnet *ifp) +{ + free(ifp, M_IFNET); +} + /* * Do the actual work of freeing a struct ifnet, and layer 2 common * structure. 
This call is made when the last reference to an @@ -499,6 +506,15 @@ if_free_internal(struct ifnet *ifp) KASSERT((ifp->if_flags & IFF_DYING), ("if_free_internal: interface not dying")); + if (rtgc_is_enabled()) { + /* + * FIXME: Sleep some time to permit packets + * using fastforwarding routine without locking + * die withour side effects. + */ + pause("if_free_gc", hz / 20); /* Sleep 50 milliseconds */ + } + if (if_com_free[ifp->if_alloctype] != NULL) if_com_free[ifp->if_alloctype](ifp->if_l2com, ifp->if_alloctype); @@ -511,7 +527,10 @@ if_free_internal(struct ifnet *ifp) IF_AFDATA_DESTROY(ifp); IF_ADDR_LOCK_DESTROY(ifp); ifq_delete(&ifp->if_snd); - free(ifp, M_IFNET); + if (rtgc_is_enabled()) + rtgc_free(RTGC_IF, ifp, 0); + else + if_free_real(ifp); } /* diff --git a/sys/net/if_var.h b/sys/net/if_var.h index 39c499f..5ef6264 100644 --- a/sys/net/if_var.h +++ b/sys/net/if_var.h @@ -857,6 +857,7 @@ void if_down(struct ifnet *); struct ifmultiaddr * if_findmulti(struct ifnet *, struct sockaddr *); void if_free(struct ifnet *); +void if_free_real(struct ifnet *); void if_free_type(struct ifnet *, u_char); void if_initname(struct ifnet *, const char *, int); void if_link_state_change(struct ifnet *, int); diff --git a/sys/net/route.c b/sys/net/route.c index 3059f5a..97965b3 100644 --- a/sys/net/route.c +++ b/sys/net/route.c @@ -142,6 +142,175 @@ VNET_DEFINE(int, rttrash); /* routes not in table but not freed */ static VNET_DEFINE(uma_zone_t, rtzone); /* Routing table UMA zone. 
*/ #define V_rtzone VNET(rtzone) +SYSCTL_NODE(_net, OID_AUTO, gc, CTLFLAG_RW, 0, "Garbage collector"); + +MALLOC_DEFINE(M_RTGC, "rtgc", "route GC"); +void rtgc_func(void *_unused); +void rtfree_real(struct rtentry *rt); + +int _rtgc_default_enabled = 1; +TUNABLE_INT("net.gc.enable", &_rtgc_default_enabled); + +#define RTGC_CALLOUT_DELAY 1 +#define RTGC_EXPIRE_DELAY 3 + +VNET_DEFINE(struct mtx, rtgc_mtx); +#define V_rtgc_mtx VNET(rtgc_mtx) +VNET_DEFINE(struct callout, rtgc_callout); +#define V_rtgc_callout VNET(rtgc_callout) +VNET_DEFINE(int, rtgc_enabled); +#define V_rtgc_enabled VNET(rtgc_enabled) +SYSCTL_VNET_INT(_net_gc, OID_AUTO, enable, CTLFLAG_RW, + &VNET_NAME(rtgc_enabled), 1, + "Enable garbage collector"); +VNET_DEFINE(int, rtgc_expire_delay) = RTGC_EXPIRE_DELAY; +#define V_rtgc_expire_delay VNET(rtgc_expire_delay) +SYSCTL_VNET_INT(_net_gc, OID_AUTO, expire, CTLFLAG_RW, + &VNET_NAME(rtgc_expire_delay), 1, + "Object expiration delay"); +VNET_DEFINE(int, rtgc_numfailures); +#define V_rtgc_numfailures VNET(rtgc_numfailures) +SYSCTL_VNET_INT(_net_gc, OID_AUTO, failures, CTLFLAG_RD, + &VNET_NAME(rtgc_numfailures), 0, + "Number of objects leaked from route garbage collector"); +VNET_DEFINE(int, rtgc_numqueued); +#define V_rtgc_numqueued VNET(rtgc_numqueued) +SYSCTL_VNET_INT(_net_gc, OID_AUTO, queued, CTLFLAG_RD, + &VNET_NAME(rtgc_numqueued), 0, + "Number of objects queued for deletion"); +VNET_DEFINE(int, rtgc_numfreed); +#define V_rtgc_numfreed VNET(rtgc_numfreed) +SYSCTL_VNET_INT(_net_gc, OID_AUTO, freed, CTLFLAG_RD, + &VNET_NAME(rtgc_numfreed), 0, + "Number of objects deleted"); +VNET_DEFINE(int, rtgc_numinvoked); +#define V_rtgc_numinvoked VNET(rtgc_numinvoked) +SYSCTL_VNET_INT(_net_gc, OID_AUTO, invoked, CTLFLAG_RD, + &VNET_NAME(rtgc_numinvoked), 0, + "Number of times GC was invoked"); + +struct rtgc_item { + time_t expire; /* Whe we can delete this entry */ + int etype; /* Entry type */ + void *data; /* data to free */ + TAILQ_ENTRY(rtgc_item) items; +}; + 
+VNET_DEFINE(TAILQ_HEAD(, rtgc_item), rtgc_queue); +#define V_rtgc_queue VNET(rtgc_queue) + +int +rtgc_is_enabled() +{ + return V_rtgc_enabled; +} + +void +rtgc_func(void *_unused) +{ + struct rtgc_item *item, *temp_item; + TAILQ_HEAD(, rtgc_item) rtgc_tq; + int empty, deleted; + + CTR2(KTR_NET, "%s: started with %d objects", __func__, V_rtgc_numqueued); + + TAILQ_INIT(&rtgc_tq); + + /* Move all contents of current queue to new empty queue */ + mtx_lock(&V_rtgc_mtx); + V_rtgc_numinvoked++; + TAILQ_SWAP(&rtgc_queue, &rtgc_tq, rtgc_item, items); + mtx_unlock(&V_rtgc_mtx); + + deleted = 0; + + /* Dispatch as much as we can */ + TAILQ_FOREACH_SAFE(item, &rtgc_tq, items, temp_item) { + if (item->expire > time_uptime) + break; + + /* We can definitely delete this item */ + TAILQ_REMOVE(&rtgc_tq, item, items); + + switch (item->etype) { + case RTGC_ROUTE: + CTR1(KTR_NET, "Freeing route structure %p", item->data); + rtfree_real((struct rtentry *)item->data); + break; + case RTGC_IF: + CTR1(KTR_NET, "Freeing iface structure %p", item->data); + if_free_real((struct ifnet *)item->data); + break; + default: + CTR2(KTR_NET, "Unknown type: %d %p", item->etype, item->data); + break; + } + + /* Remove item itself */ + free(item, M_RTGC); + deleted++; + } + + /* + * Add remaining data back to mail queue. + * Note items are still sorted by time_uptime after merge. + */ + + mtx_lock(&V_rtgc_mtx); + /* Add new items to the end of our temporary queue */ + TAILQ_CONCAT(&rtgc_tq, &rtgc_queue, items); + /* Move items back to stable storage */ + TAILQ_SWAP(&rtgc_queue, &rtgc_tq, rtgc_item, items); + /* Check if we need to run callout another time */ + empty = TAILQ_EMPTY(&rtgc_queue); + /* Update counters */ + V_rtgc_numfreed += deleted; + V_rtgc_numqueued -= deleted; + mtx_unlock(&V_rtgc_mtx); + + CTR4(KTR_NET, "%s: ended with %d object(s) (%d deleted), callout: %s", + __func__, V_rtgc_numqueued, deleted, empty ? 
"stopped" : "sheduled"); + /* Schedule ourself iff there are items to delete */ + if (!empty) + callout_reset(&V_rtgc_callout, hz * RTGC_CALLOUT_DELAY, rtgc_func, NULL); +} + +void +rtgc_free(int etype, void *data, int can_sleep) +{ + struct rtgc_item *item; + + item = malloc(sizeof(struct rtgc_item), M_RTGC, (can_sleep ? M_WAITOK : M_NOWAIT) | M_ZERO); + if (item == NULL) { + V_rtgc_numfailures++; /* XXX: locking */ + return; /* Skip route freeing. Memory leak is much better than panic */ + } + + item->expire = time_uptime + V_rtgc_expire_delay; + item->etype = etype; + item->data = data; + + if ((!can_sleep) && (mtx_trylock(&V_rtgc_mtx) == 0)) { + /* Fail to acquire lock. Add another leak */ + free(item, M_RTGC); + V_rtgc_numfailures++; /* XXX: locking */ + return; + } + + if (can_sleep) + mtx_lock(&V_rtgc_mtx); + + TAILQ_INSERT_TAIL(&rtgc_queue, item, items); + V_rtgc_numqueued++; + + mtx_unlock(&V_rtgc_mtx); + + /* Schedule callout if not running */ + if (!callout_pending(&V_rtgc_callout)) + callout_reset(&V_rtgc_callout, hz * RTGC_CALLOUT_DELAY, rtgc_func, NULL); +} + + /* * handler for net.my_fibnum */ @@ -241,6 +410,17 @@ vnet_route_init(const void *unused __unused) dom->dom_rtattach((void **)rnh, dom->dom_rtoffset); } } + + /* Init garbage collector */ + mtx_init(&V_rtgc_mtx, "routeGC", NULL, MTX_DEF); + /* Init queue */ + TAILQ_INIT(&V_rtgc_queue); + /* Init garbage callout */ + memset(&V_rtgc_callout, 0, sizeof(rtgc_callout)); + callout_init(&V_rtgc_callout, 1); + /* Set default from loader tunable */ + V_rtgc_enabled = _rtgc_default_enabled; + //callout_reset(&V_rtgc_callout, 3 * hz, &rtgc_func, NULL); } VNET_SYSINIT(vnet_route_init, SI_SUB_PROTO_DOMAIN, SI_ORDER_FOURTH, vnet_route_init, 0); @@ -351,6 +531,74 @@ rtalloc1(struct sockaddr *dst, int report, u_long ignflags) } struct rtentry * +rtalloc1_fib_nolock(struct sockaddr *dst, int report, u_long ignflags, + u_int fibnum) +{ + struct radix_node_head *rnh; + struct radix_node *rn; + struct rtentry 
*newrt; + struct rt_addrinfo info; + int err = 0, msgtype = RTM_MISS; + int needlock; + + KASSERT((fibnum < rt_numfibs), ("rtalloc1_fib: bad fibnum")); + switch (dst->sa_family) { + case AF_INET6: + case AF_INET: + /* We support multiple FIBs. */ + break; + default: + fibnum = RT_DEFAULT_FIB; + break; + } + rnh = rt_tables_get_rnh(fibnum, dst->sa_family); + newrt = NULL; + if (rnh == NULL) + goto miss; + + /* + * Look up the address in the table for that Address Family + */ + needlock = !(ignflags & RTF_RNH_LOCKED); + if (needlock) + RADIX_NODE_HEAD_RLOCK(rnh); +#ifdef INVARIANTS + else + RADIX_NODE_HEAD_LOCK_ASSERT(rnh); +#endif + rn = rnh->rnh_matchaddr(dst, rnh); + if (rn && ((rn->rn_flags & RNF_ROOT) == 0)) { + newrt = RNTORT(rn); + if (needlock) + RADIX_NODE_HEAD_RUNLOCK(rnh); + goto done; + + } else if (needlock) + RADIX_NODE_HEAD_RUNLOCK(rnh); + + /* + * Either we hit the root or couldn't find any match, + * Which basically means + * "caint get there frm here" + */ +miss: + V_rtstat.rts_unreach++; + + if (report) { + /* + * If required, report the failure to the supervising + * Authorities. + * For a delete, this is not an error. (report == 0) + */ + bzero(&info, sizeof(info)); + info.rti_info[RTAX_DST] = dst; + rt_missmsg_fib(msgtype, &info, 0, err, fibnum); + } +done: + return (newrt); +} + +struct rtentry * rtalloc1_fib(struct sockaddr *dst, int report, u_long ignflags, u_int fibnum) { @@ -422,6 +670,23 @@ done: return (newrt); } + +void +rtfree_real(struct rtentry *rt) +{ + /* + * The key is separatly alloc'd so free it (see rt_setgate()). + * This also frees the gateway, as they are always malloc'd + * together. + */ + Free(rt_key(rt)); + + /* + * and the rtentry itself of course + */ + uma_zfree(V_rtzone, rt); +} + /* * Remove a reference count from an rtentry. 
* If the count gets low enough, take it out of the routing table @@ -484,18 +749,13 @@ rtfree(struct rtentry *rt) */ if (rt->rt_ifa) ifa_free(rt->rt_ifa); - /* - * The key is separatly alloc'd so free it (see rt_setgate()). - * This also frees the gateway, as they are always malloc'd - * together. - */ - Free(rt_key(rt)); - /* - * and the rtentry itself of course - */ RT_LOCK_DESTROY(rt); - uma_zfree(V_rtzone, rt); + + if (V_rtgc_enabled) + rtgc_free(RTGC_ROUTE, rt, 0); + else + rtfree_real(rt); return; } done: diff --git a/sys/net/route.h b/sys/net/route.h index b26ac44..3aa694d 100644 --- a/sys/net/route.h +++ b/sys/net/route.h @@ -363,9 +363,14 @@ void rt_maskedcopy(struct sockaddr *, struct sockaddr *, struct sockaddr *); * * RTFREE() uses an unlocked entry. */ +#define RTGC_ROUTE 1 +#define RTGC_IF 3 + int rtexpunge(struct rtentry *); void rtfree(struct rtentry *); +void rtgc_free(int etype, void *data, int can_sleep); +int rtgc_is_enabled(void); int rt_check(struct rtentry **, struct rtentry **, struct sockaddr *); /* XXX MRT COMPAT VERSIONS THAT SET UNIVERSE to 0 */ @@ -394,6 +399,7 @@ int rt_getifa_fib(struct rt_addrinfo *, u_int fibnum); void rtalloc_ign_fib(struct route *ro, u_long ignflags, u_int fibnum); void rtalloc_fib(struct route *ro, u_int fibnum); struct rtentry *rtalloc1_fib(struct sockaddr *, int, u_long, u_int); +struct rtentry *rtalloc1_fib_nolock(struct sockaddr *, int, u_long, u_int); int rtioctl_fib(u_long, caddr_t, u_int); void rtredirect_fib(struct sockaddr *, struct sockaddr *, struct sockaddr *, int, struct sockaddr *, u_int); diff --git a/sys/netinet/in_rmx.c b/sys/netinet/in_rmx.c index 1389873..1c9d9db 100644 --- a/sys/netinet/in_rmx.c +++ b/sys/netinet/in_rmx.c @@ -122,12 +122,12 @@ in_matroute(void *v_arg, struct radix_node_head *head) struct rtentry *rt = (struct rtentry *)rn; if (rt) { - RT_LOCK(rt); +// RT_LOCK(rt); if (rt->rt_flags & RTPRF_OURS) { rt->rt_flags &= ~RTPRF_OURS; rt->rt_rmx.rmx_expire = 0; } - RT_UNLOCK(rt); +// 
RT_UNLOCK(rt); } return rn; } @@ -365,7 +365,7 @@ in_inithead(void **head, int off) rnh = *head; rnh->rnh_addaddr = in_addroute; - rnh->rnh_matchaddr = in_matroute; + rnh->rnh_matchaddr = rn_match; rnh->rnh_close = in_clsroute; if (_in_rt_was_here == 0 ) { callout_init(&V_rtq_timer, CALLOUT_MPSAFE); diff --git a/sys/netinet/ip_fastfwd.c b/sys/netinet/ip_fastfwd.c index d7fe411..d2b98b3 100644 --- a/sys/netinet/ip_fastfwd.c +++ b/sys/netinet/ip_fastfwd.c @@ -112,6 +112,22 @@ static VNET_DEFINE(int, ipfastforward_active); SYSCTL_VNET_INT(_net_inet_ip, OID_AUTO, fastforwarding, CTLFLAG_RW, &VNET_NAME(ipfastforward_active), 0, "Enable fast IP forwarding"); +void +rtalloc_ign_fib_nolock(struct route *ro, u_long ignore, u_int fibnum); + +void +rtalloc_ign_fib_nolock(struct route *ro, u_long ignore, u_int fibnum) +{ + struct rtentry *rt; + + if ((rt = ro->ro_rt) != NULL) { + if (rt->rt_ifp != NULL && rt->rt_flags & RTF_UP) + return; + ro->ro_rt = NULL; + } + ro->ro_rt = rtalloc1_fib_nolock(&ro->ro_dst, 1, ignore, fibnum); +} + static struct sockaddr_in * ip_findroute(struct route *ro, struct in_addr dest, struct mbuf *m) { @@ -126,7 +142,7 @@ ip_findroute(struct route *ro, struct in_addr dest, struct mbuf *m) dst->sin_family = AF_INET; dst->sin_len = sizeof(*dst); dst->sin_addr.s_addr = dest.s_addr; - in_rtalloc_ign(ro, 0, M_GETFIB(m)); + rtalloc_ign_fib_nolock(ro, 0, M_GETFIB(m)); /* * Route there and interface still up? @@ -140,8 +156,10 @@ ip_findroute(struct route *ro, struct in_addr dest, struct mbuf *m) } else { IPSTAT_INC(ips_noroute); IPSTAT_INC(ips_cantforward); +#if 0 if (rt) RTFREE(rt); +#endif icmp_error(m, ICMP_UNREACH, ICMP_UNREACH_HOST, 0, 0); return NULL; } @@ -334,10 +352,11 @@ ip_fastforward(struct mbuf *m) if (in_localip(ip->ip_dst)) return m; - IPSTAT_INC(ips_total); + //IPSTAT_INC(ips_total); /* * Step 3: incoming packet firewall processing + in_rtalloc_ign(ro, 0, M_GETFIB(m)); */ /* @@ -476,8 +495,10 @@ forwardlocal: * "ours"-label. 
*/ m->m_flags |= M_FASTFWD_OURS; +/* if (ro.ro_rt) RTFREE(ro.ro_rt); +*/ return m; } /* @@ -490,7 +511,7 @@ forwardlocal: m_tag_delete(m, fwd_tag); } #endif /* IPFIREWALL_FORWARD */ - RTFREE(ro.ro_rt); +// RTFREE(ro.ro_rt); if ((dst = ip_findroute(&ro, dest, m)) == NULL) return NULL; /* icmp unreach already sent */ ifp = ro.ro_rt->rt_ifp; @@ -601,17 +622,21 @@ passout: if (error != 0) IPSTAT_INC(ips_odropped); else { +#if 0 ro.ro_rt->rt_rmx.rmx_pksent++; IPSTAT_INC(ips_forward); IPSTAT_INC(ips_fastforward); +#endif } consumed: - RTFREE(ro.ro_rt); +// RTFREE(ro.ro_rt); return NULL; drop: if (m) m_freem(m); +/* if (ro.ro_rt) RTFREE(ro.ro_rt); +*/ return NULL; } diff --git a/sys/netinet6/in6_rmx.c b/sys/netinet6/in6_rmx.c index b526030..9aabe63 100644 --- a/sys/netinet6/in6_rmx.c +++ b/sys/netinet6/in6_rmx.c @@ -195,12 +195,12 @@ in6_matroute(void *v_arg, struct radix_node_head *head) struct rtentry *rt = (struct rtentry *)rn; if (rt) { - RT_LOCK(rt); + //RT_LOCK(rt); if (rt->rt_flags & RTPRF_OURS) { rt->rt_flags &= ~RTPRF_OURS; rt->rt_rmx.rmx_expire = 0; } - RT_UNLOCK(rt); + //RT_UNLOCK(rt); } return rn; } @@ -440,7 +440,7 @@ in6_inithead(void **head, int off) rnh = *head; rnh->rnh_addaddr = in6_addroute; - rnh->rnh_matchaddr = in6_matroute; + rnh->rnh_matchaddr = rn_match; if (V__in6_rt_was_here == 0) { callout_init(&V_rtq_timer6, CALLOUT_MPSAFE); --------------010308000904000207080306 Content-Type: text/plain; charset=UTF-8; name="81_radix_rmlock.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="81_radix_rmlock.diff" commit 0e7cebd1753c3b77bdc00d728fbd5910c2d2afec Author: Charlie Root Date: Mon Apr 8 15:35:00 2013 +0000 Make radix use rmlock. 
diff --git a/sys/contrib/ipfilter/netinet/ip_compat.h b/sys/contrib/ipfilter/netinet/ip_compat.h index 31e5b11..5e74da4 100644 --- a/sys/contrib/ipfilter/netinet/ip_compat.h +++ b/sys/contrib/ipfilter/netinet/ip_compat.h @@ -870,6 +870,7 @@ typedef u_int32_t u_32_t; # if (__FreeBSD_version >= 500043) # include # if (__FreeBSD_version > 700014) +# include # include # define KRWLOCK_T struct rwlock # ifdef _KERNEL diff --git a/sys/contrib/pf/net/pf_table.c b/sys/contrib/pf/net/pf_table.c index 40c9f67..b1dd703 100644 --- a/sys/contrib/pf/net/pf_table.c +++ b/sys/contrib/pf/net/pf_table.c @@ -44,6 +44,7 @@ __FBSDID("$FreeBSD$"); #include #include #include +#include #include #ifdef __FreeBSD__ #include diff --git a/sys/kern/subr_witness.c b/sys/kern/subr_witness.c index e565d01..f913d27 100644 --- a/sys/kern/subr_witness.c +++ b/sys/kern/subr_witness.c @@ -508,7 +508,7 @@ static struct witness_order_list_entry order_lists[] = { * Routing */ { "so_rcv", &lock_class_mtx_sleep }, - { "radix node head", &lock_class_rw }, + { "radix node head", &lock_class_rm }, { "rtentry", &lock_class_mtx_sleep }, { "ifaddr", &lock_class_mtx_sleep }, { NULL, NULL }, diff --git a/sys/kern/sys_socket.c b/sys/kern/sys_socket.c index 4cbae74..fea12d0 100644 --- a/sys/kern/sys_socket.c +++ b/sys/kern/sys_socket.c @@ -50,6 +50,8 @@ __FBSDID("$FreeBSD$"); #include #include +#include +#include #include #include diff --git a/sys/kern/vfs_export.c b/sys/kern/vfs_export.c index 4185211..848c232 100644 --- a/sys/kern/vfs_export.c +++ b/sys/kern/vfs_export.c @@ -47,7 +47,7 @@ __FBSDID("$FreeBSD$"); #include #include #include -#include +#include #include #include #include @@ -427,6 +427,7 @@ vfs_export_lookup(struct mount *mp, struct sockaddr *nam) register struct netcred *np; register struct radix_node_head *rnh; struct sockaddr *saddr; + RADIX_NODE_HEAD_READER; nep = mp->mnt_export; if (nep == NULL) diff --git a/sys/net/if.c b/sys/net/if.c index 5ecde8c..351e046 100644 --- a/sys/net/if.c +++ 
b/sys/net/if.c @@ -51,6 +51,7 @@ #include #include #include +#include #include #include #include diff --git a/sys/net/radix.c b/sys/net/radix.c index 33fcf82..d8d1e8b 100644 --- a/sys/net/radix.c +++ b/sys/net/radix.c @@ -37,7 +37,7 @@ #ifdef _KERNEL #include #include -#include +#include #include #include #include diff --git a/sys/net/radix.h b/sys/net/radix.h index 29659b5..2d130f0 100644 --- a/sys/net/radix.h +++ b/sys/net/radix.h @@ -36,7 +36,7 @@ #ifdef _KERNEL #include #include -#include +#include #endif #ifdef MALLOC_DECLARE @@ -133,7 +133,7 @@ struct radix_node_head { struct radix_node rnh_nodes[3]; /* empty tree for common case */ int rnh_multipath; /* multipath capable ? */ #ifdef _KERNEL - struct rwlock rnh_lock; /* locks entire radix tree */ + struct rmlock rnh_lock; /* locks entire radix tree */ #endif }; @@ -146,18 +146,21 @@ struct radix_node_head { #define R_Zalloc(p, t, n) (p = (t) malloc((unsigned long)(n), M_RTABLE, M_NOWAIT | M_ZERO)) #define Free(p) free((caddr_t)p, M_RTABLE); +#define RADIX_NODE_HEAD_READER struct rm_priotracker tracker #define RADIX_NODE_HEAD_LOCK_INIT(rnh) \ - rw_init_flags(&(rnh)->rnh_lock, "radix node head", 0) -#define RADIX_NODE_HEAD_LOCK(rnh) rw_wlock(&(rnh)->rnh_lock) -#define RADIX_NODE_HEAD_UNLOCK(rnh) rw_wunlock(&(rnh)->rnh_lock) -#define RADIX_NODE_HEAD_RLOCK(rnh) rw_rlock(&(rnh)->rnh_lock) -#define RADIX_NODE_HEAD_RUNLOCK(rnh) rw_runlock(&(rnh)->rnh_lock) -#define RADIX_NODE_HEAD_LOCK_TRY_UPGRADE(rnh) rw_try_upgrade(&(rnh)->rnh_lock) - - -#define RADIX_NODE_HEAD_DESTROY(rnh) rw_destroy(&(rnh)->rnh_lock) -#define RADIX_NODE_HEAD_LOCK_ASSERT(rnh) rw_assert(&(rnh)->rnh_lock, RA_LOCKED) -#define RADIX_NODE_HEAD_WLOCK_ASSERT(rnh) rw_assert(&(rnh)->rnh_lock, RA_WLOCKED) + rm_init(&(rnh)->rnh_lock, "radix node head") +#define RADIX_NODE_HEAD_LOCK(rnh) rm_wlock(&(rnh)->rnh_lock) +#define RADIX_NODE_HEAD_UNLOCK(rnh) rm_wunlock(&(rnh)->rnh_lock) +#define RADIX_NODE_HEAD_RLOCK(rnh) rm_rlock(&(rnh)->rnh_lock, &tracker) 
+#define RADIX_NODE_HEAD_RUNLOCK(rnh) rm_runlock(&(rnh)->rnh_lock, &tracker) +//#define RADIX_NODE_HEAD_LOCK_TRY_UPGRADE(rnh) rw_try_upgrade(&(rnh)->rnh_lock) + + +#define RADIX_NODE_HEAD_DESTROY(rnh) rm_destroy(&(rnh)->rnh_lock) +#define RADIX_NODE_HEAD_LOCK_ASSERT(rnh) +#define RADIX_NODE_HEAD_WLOCK_ASSERT(rnh) +//#define RADIX_NODE_HEAD_LOCK_ASSERT(rnh) rw_assert(&(rnh)->rnh_lock, RA_LOCKED) +//#define RADIX_NODE_HEAD_WLOCK_ASSERT(rnh) rw_assert(&(rnh)->rnh_lock, RA_WLOCKED) #endif /* _KERNEL */ void rn_init(int); diff --git a/sys/net/radix_mpath.c b/sys/net/radix_mpath.c index ee7826f..c69888e 100644 --- a/sys/net/radix_mpath.c +++ b/sys/net/radix_mpath.c @@ -45,6 +45,8 @@ __FBSDID("$FreeBSD$"); #include #include #include +#include +#include #include #include #include diff --git a/sys/net/route.c b/sys/net/route.c index 5d56688..2cf6ea5 100644 --- a/sys/net/route.c +++ b/sys/net/route.c @@ -52,6 +52,8 @@ #include #include #include +#include +#include #include #include @@ -544,6 +546,7 @@ rtalloc1_fib_nolock(struct sockaddr *dst, int report, u_long ignflags, struct rtentry *newrt; struct rt_addrinfo info; int err = 0, msgtype = RTM_MISS; + RADIX_NODE_HEAD_READER; int needlock; KASSERT((fibnum < rt_numfibs), ("rtalloc1_fib: bad fibnum")); @@ -612,6 +615,7 @@ rtalloc1_fib(struct sockaddr *dst, int report, u_long ignflags, struct rtentry *newrt; struct rt_addrinfo info; int err = 0, msgtype = RTM_MISS; + RADIX_NODE_HEAD_READER; int needlock; KASSERT((fibnum < rt_numfibs), ("rtalloc1_fib: bad fibnum")); @@ -799,6 +803,7 @@ rtredirect_fib(struct sockaddr *dst, struct rt_addrinfo info; struct ifaddr *ifa; struct radix_node_head *rnh; + RADIX_NODE_HEAD_READER; ifa = NULL; rnh = rt_tables_get_rnh(fibnum, dst->sa_family); diff --git a/sys/net/rtsock.c b/sys/net/rtsock.c index 58c46a6..18d3e06 100644 --- a/sys/net/rtsock.c +++ b/sys/net/rtsock.c @@ -45,6 +45,7 @@ #include #include #include +#include #include #include #include @@ -577,6 +578,7 @@ route_output(struct mbuf 
*m, struct socket *so) struct ifnet *ifp = NULL; union sockaddr_union saun; sa_family_t saf = AF_UNSPEC; + RADIX_NODE_HEAD_READER; #define senderr(e) { error = e; goto flush;} if (m == NULL || ((m->m_len < sizeof(long)) && @@ -1818,6 +1820,7 @@ sysctl_rtsock(SYSCTL_HANDLER_ARGS) int i, lim, error = EINVAL; u_char af; struct walkarg w; + RADIX_NODE_HEAD_READER; name ++; namelen--; diff --git a/sys/netinet/in_rmx.c b/sys/netinet/in_rmx.c index 1c9d9db..775ba5a 100644 --- a/sys/netinet/in_rmx.c +++ b/sys/netinet/in_rmx.c @@ -53,6 +53,8 @@ __FBSDID("$FreeBSD$"); #include #include +#include +#include #include #include diff --git a/sys/netinet6/in6_ifattach.c b/sys/netinet6/in6_ifattach.c index 80eb022..cbfe1d8 100644 --- a/sys/netinet6/in6_ifattach.c +++ b/sys/netinet6/in6_ifattach.c @@ -42,6 +42,8 @@ __FBSDID("$FreeBSD$"); #include #include #include +#include +#include #include #include diff --git a/sys/netinet6/in6_rmx.c b/sys/netinet6/in6_rmx.c index 9aabe63..a291db2 100644 --- a/sys/netinet6/in6_rmx.c +++ b/sys/netinet6/in6_rmx.c @@ -84,6 +84,7 @@ __FBSDID("$FreeBSD$"); #include #include #include +#include #include #include #include diff --git a/sys/netinet6/nd6_rtr.c b/sys/netinet6/nd6_rtr.c index 687d84d..7737d47 100644 --- a/sys/netinet6/nd6_rtr.c +++ b/sys/netinet6/nd6_rtr.c @@ -45,6 +45,7 @@ __FBSDID("$FreeBSD: stable/8/sys/netinet6/nd6_rtr.c 233201 2012-03-19 20:49:42Z #include #include #include +#include #include #include #include --------------010308000904000207080306 Content-Type: text/plain; charset=UTF-8; name="11_no_lle_rlock.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="11_no_lle_rlock.diff" commit 963196095589c03880ddd13a5c16f9e50cf6d7ce Author: Charlie Root Date: Sun Nov 4 15:52:50 2012 +0000 Do not require locking arp lle diff --git a/sys/net/if_llatbl.h b/sys/net/if_llatbl.h index 9f6531b..c1b2af9 100644 --- a/sys/net/if_llatbl.h +++ b/sys/net/if_llatbl.h @@ -169,6 +169,7 @@ MALLOC_DECLARE(M_LLTABLE); #define 
LLE_PUB 0x0020 /* publish entry ??? */ #define LLE_DELETE 0x4000 /* delete on a lookup - match LLE_IFADDR */ #define LLE_CREATE 0x8000 /* create on a lookup miss */ +#define LLE_UNLOCKED 0x1000 /* return lle unlocked */ #define LLE_EXCLUSIVE 0x2000 /* return lle xlocked */ #define LLATBL_HASH(key, mask) \ diff --git a/sys/netinet/if_ether.c b/sys/netinet/if_ether.c index f61b803..ecb9b8e 100644 --- a/sys/netinet/if_ether.c +++ b/sys/netinet/if_ether.c @@ -283,10 +283,10 @@ arpresolve(struct ifnet *ifp, struct rtentry *rt0, struct mbuf *m, struct sockaddr *dst, u_char *desten, struct llentry **lle) { struct llentry *la = 0; - u_int flags = 0; + u_int flags = LLE_UNLOCKED; struct mbuf *curr = NULL; struct mbuf *next = NULL; - int error, renew; + int error, renew = 0; *lle = NULL; if (m != NULL) { @@ -307,7 +307,41 @@ arpresolve(struct ifnet *ifp, struct rtentry *rt0, struct mbuf *m, retry: IF_AFDATA_RLOCK(ifp); la = lla_lookup(LLTABLE(ifp), flags, dst); + + /* + * Fast path. Do not require rlock on llentry. + */ + if ((la != NULL) && (flags & LLE_UNLOCKED)) { + if ((la->la_flags & LLE_VALID) && + ((la->la_flags & LLE_STATIC) || la->la_expire > time_uptime)) { + bcopy(&la->ll_addr, desten, ifp->if_addrlen); + /* + * If entry has an expiry time and it is approaching, + * see if we need to send an ARP request within this + * arpt_down interval. 
+ */ + if (!(la->la_flags & LLE_STATIC) && + time_uptime + la->la_preempt > la->la_expire) { + renew = 1; + la->la_preempt--; + } + + IF_AFDATA_RUNLOCK(ifp); + if (renew != 0) + arprequest(ifp, NULL, &SIN(dst)->sin_addr, NULL); + + return (0); + } + + /* Revert to normal path for other cases */ + *lle = la; + LLE_RLOCK(la); + } + + flags &= ~LLE_UNLOCKED; + IF_AFDATA_RUNLOCK(ifp); + if ((la == NULL) && ((flags & LLE_EXCLUSIVE) == 0) && ((ifp->if_flags & (IFF_NOARP | IFF_STATICARP)) == 0)) { flags |= (LLE_CREATE | LLE_EXCLUSIVE); @@ -324,27 +358,6 @@ retry: return (EINVAL); } - if ((la->la_flags & LLE_VALID) && - ((la->la_flags & LLE_STATIC) || la->la_expire > time_second)) { - bcopy(&la->ll_addr, desten, ifp->if_addrlen); - /* - * If entry has an expiry time and it is approaching, - * see if we need to send an ARP request within this - * arpt_down interval. - */ - if (!(la->la_flags & LLE_STATIC) && - time_second + la->la_preempt > la->la_expire) { - arprequest(ifp, NULL, - &SIN(dst)->sin_addr, IF_LLADDR(ifp)); - - la->la_preempt--; - } - - *lle = la; - error = 0; - goto done; - } - if (la->la_flags & LLE_STATIC) { /* should not happen! 
*/ log(LOG_DEBUG, "arpresolve: ouch, empty static llinfo for %s\n", inet_ntoa(SIN(dst)->sin_addr)); diff --git a/sys/netinet/in.c b/sys/netinet/in.c index eaba4e5..5341918 100644 --- a/sys/netinet/in.c +++ b/sys/netinet/in.c @@ -1561,7 +1561,7 @@ in_lltable_lookup(struct lltable *llt, u_int flags, const struct sockaddr *l3add if (LLE_IS_VALID(lle)) { if (flags & LLE_EXCLUSIVE) LLE_WLOCK(lle); - else + else if (!(flags & LLE_UNLOCKED)) LLE_RLOCK(lle); } done: --------------010308000904000207080306-- From owner-freebsd-hackers@FreeBSD.ORG Wed Aug 28 19:37:12 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id BDB86FF7; Wed, 28 Aug 2013 19:37:12 +0000 (UTC) (envelope-from jfvogel@gmail.com) Received: from mail-ve0-x235.google.com (mail-ve0-x235.google.com [IPv6:2607:f8b0:400c:c01::235]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id DEA2221E4; Wed, 28 Aug 2013 19:37:11 +0000 (UTC) Received: by mail-ve0-f181.google.com with SMTP id jz10so4684066veb.12 for ; Wed, 28 Aug 2013 12:37:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=dxnKa8hM+AefrXStvsj3F1WeY+ZAph3Wgx8EEc4Juck=; b=nDn2BH6n6WYTCIyZd96zHN8qhwNcQ0pJVWb4lKM7AK7rcBXSJRUP5XpTA3vnpkd5Qz 5uPda4ecEq7EqxRDWBYRA4Schfm5gAeoGby4K41DMofd8RpKrdlnWj+7ZT8JND8Hedh5 tAPciWe84X9MKbEc4HINMV7Yku+OAZn2/zgpeaye7vPXzAsGnHwUEpFOcWWPTR05qs43 8usgrgTfZx3ua2xF9o1tCACdJTt7edXUX0o9mGvYSsaCSTDTJOTVjltYPOmw716R9Fwl 6k+XnaZw8cUQmXH6qpoWM4ijNVJEv5LgiNCZ6A6JWdDuv0U76pIIt5Z0f7hkNytI40AL Phew== MIME-Version: 1.0 X-Received: by 10.58.235.69 with SMTP id uk5mr27194246vec.17.1377718630983; Wed, 28 Aug 2013 
12:37:10 -0700 (PDT) Received: by 10.220.159.141 with HTTP; Wed, 28 Aug 2013 12:37:10 -0700 (PDT) In-Reply-To: <521E41CB.30700@yandex-team.ru> References: <521E41CB.30700@yandex-team.ru> Date: Wed, 28 Aug 2013 12:37:10 -0700 Message-ID: Subject: Re: Network stack changes From: Jack Vogel To: "Alexander V. Chernikov" X-Mailman-Approved-At: Wed, 28 Aug 2013 19:48:23 +0000 Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: Adrian Chadd , Andre Oppermann , FreeBSD Hackers , FreeBSD Net , Luigi Rizzo , "Andrey V. Elsukov" , Gleb Smirnoff , freebsd-arch@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 28 Aug 2013 19:37:12 -0000 Very interesting material Alexander, only had time to glance at it now, will look in more depth later, thanks! Jack On Wed, Aug 28, 2013 at 11:30 AM, Alexander V. Chernikov < melifaro@yandex-team.ru> wrote: > Hello list! > > There is a lot constantly raising discussions related to networking stack > performance/changes. > > I'll try to summarize current problems and possible solutions from my > point of view. > (Generally this is one problem: stack is slooooooooooooooooooooooooooow, > but we need to know why and what to do). > > Let's start with current IPv4 packet flow on a typical router: > http://static.ipfw.ru/images/freebsd_ipv4_flow.png > > (I'm sorry I can't provide this as text since Visio don't have any > 'ascii-art' exporter). > > Note that we are using process-to-completion model, e.g. process any > packet in ISR until it is either > consumed by L4+ stack or dropped or put to egress NIC queue.
> > (There is also deferred ISR model implemented inside netisr but it does > not change much: > it can help to do more fine-grained hashing (for GRE or other similar > traffic), but > 1) it uses per-packet mutex locking which kills all performance > 2) it currently does not have _any_ hashing functions (see absence of > flags in `netstat -Q`) > People using http://static.ipfw.ru/patches/netisr_ip_flowid.diff (or modified PPPoe/GRE version) > report some profit, but without fixing (1) it can't help much > ) > > So, let's start: > > 1) Ixgbe uses mutex to protect each RX ring which is perfectly fine since > there is nearly no contention > (the only thing that can happen is driver reconfiguration which is rare > and, more significant, we do this once > for the batch of packets received in given interrupt). However, due to > some (im)possible deadlocks current code > does per-packet ring unlock/lock (see ixgbe_rx_input()). > There was a discussion that ended with nothing: > http://lists.freebsd.org/pipermail/freebsd-net/2012-October/033520.html > > 1*) Possible BPF users. Here we have one rlock if there are any readers > present > (and mutex for any matching packets, but this is more or less OK. > Additionally, there is WIP to implement multiqueue BPF > and there is chance that we can reduce lock contention there). There is > also an "optimize_writers" hack permitting applications > like CDP to use BPF as writers but not registering them as receivers > (which implies rlock) > > 2/3) Virtual interfaces (laggs/vlans over lagg and other similar > constructions). > Currently we simply use rlock to make s/ix0/lagg0/ and, what is much more > funny - we use complex vlan_hash with another rlock to > get vlan interface from underlying one. > > This is definitely not like things should be done and this can be changed > more or less easily.
> > There are some useful terms/techniques in world of software/hardware > routing: they have clear 'control plane' and 'data plane' separation. > Former one is for dealing control traffic (IGP, MLD, IGMP snooping, lagg > hellos, ARP/NDP, etc..) and some data traffic (packets with TTL=1, with > options, destined to hosts without ARP/NDP record, and similar). Latter one > is done in hardware (or effective software implementation). > Control plane is responsible to provide data for efficient data plane > operations. This is the point we are missing nearly everywhere. > > What I want to say is: lagg is pure control-plane stuff and vlan is nearly > the same. We can't apply this approach to complex cases like > lagg-over-vlans-over-vlans-over-(pppoe_ng0-and_wifi0) > but we definitely can do this for most common setups like (igb* or ix* in > lagg with or without vlans on top of lagg). > > We already have some capabilities like VLANHWFILTER/VLANHWTAG, we can add > some more. We even have per-driver hooks to program HW filtering. > > One small step to do is to throw packet to vlan interface directly (P1), > proof-of-concept(working in production): > http://lists.freebsd.org/pipermail/freebsd-net/2013-April/035270.html > > Another is to change lagg packet accounting: > http://lists.freebsd.org/pipermail/svn-src-all/2013-April/067570.html > Again, this is more like HW boxes do (aggregate all counters including > errors) (and I can't imagine what real error we can get from _lagg_). > > 4) If we are router, we can do either slooow ip_input() -> ip_forward() -> > ip_output() cycle or use optimized ip_fastfwd() which falls back to 'slow' > path for multicast/options/local traffic (e.g. works exactly like 'data > plane' part). > (Btw, we can consider net.inet.ip.fastforwarding to be turned on by > default at least for non-IPSEC kernels) > > Here we have to determine if this is local packet or not, e.g. F(dst_ip) > returning 1 or 0.
Currently we are simply using standard rlock + hash of > iface addresses. > (And some consumers like ipfw(4) do the same, but without lock). > We don't need to do this! We can build sorted array of IPv4 addresses or > other efficient structure on every address change and use it unlocked with > delayed garbage collection (proof-of-concept attached) > (There is another thing to discuss: maybe we can do this once somewhere in > ip_input and mark mbuf as 'local/non-local' ? ) > > 5, 9) Currently we have L3 ingress/egress PFIL hooks protected by rmlocks. > This is OK. > > However, 6) and 7) are not. > Firewall can use the same pfil lock as reader protection without imposing > its own lock. currently pfil&ipfw code is ready to do this. > > 8) Radix/rt* api. This is probably the worst place in entire stack. It is > toooo generic, tooo slow and buggy (do you use IPv6? you definitely know > what I'm talking about). > A) It really is too generic and assumption that it can be (effectively) > used for every family is wrong. Two examples: > we don't need to lookup all 128 bits of IPv6 address. Subnets with mask > >/64 are not used widely (actually the only reason to use them are p2p > links due to ND potential problems). > One of common solutions is to lookup 64bits, and build another trie (or > other structure) in case of collision. > Another example is MPLS where we can simply do direct array lookup based > on ingress label. > > B) It is terribly slow (AFAIR luigi@ did some performance management, > numbers available in one of netmap pdfs) > C) It is not multipath-capable. Stateful (and non-working) multipath is > definitely not the right way. > > 8*) rtentry > We are doing it wrong. > Currently _every_ lookup locks/unlocks given rte twice. > First lock is related to and old-old story for trusting IP redirects (and > auto-adding host routes for them). Hopefully currently it is disabled > automatically when you turn forwarding on. 
> The second one is much more complicated: we are assuming that rte's with > non-zero refcount value can stop egress interface from being destroyed. > This is wrong (but widely used) assumption. > > We can use delayed GC instead of locking for rte's and this won't break > things more than they are broken now (patch attached). > We can't do the same for ifp structures since > a) virtual ones can assume some state in underlying physical NIC > b) physical ones just _can_ be destroyed (maybe regardless of user wants > this or not, like: SFP being unplugged from NIC) or simply lead to kernel > crash due to SW/HW inconsistency > > One possible solution is to implement stable refcounts based on PCPU > counters, and apply those counters to ifp, but this seems to be non-trivial. > > > Another rtalloc(9) problem is the fact that radix is used as both 'control > plane' and 'data plane' structure/api. Some users always want to put more > information in rte, while others > want to make rte more compact. We just need _different_ structures for > that. > Feature-rich, lot-of-data control plane one (to store everything we want > to store, including, for example, PID of process originating the route) - > current radix can be modified to do this. > And another, address-family-dependent structure (array, trie, or anything) > which contains _only_ data necessary to put packet on the wire. > > 11) arpresolve. Currently (this was decoupled in 8.x) we have > a) ifaddr rlock > b) lle rlock. > > We don't need those locks.
> We need to > a) make lle layer per-interface instead of global (and this can also solve > multiple fibs and L2 mappings done in fib.0 issue) > b) use rtalloc(9)-provided lock instead of separate locking > c) actually, we need to rewrite this layer because > d) lle actually is the place to do real multipath: > > briefly, > you have rte pointing to some special nexthop structure pointing to lle, > which has the following data: > num_of_egress_ifaces: [ifindex1, ifindex2, ifindex3] | L2 data to prepend > to header > Separate post will follow. > > With the following, we can achieve lagg traffic distribution without > actually using lagg_transmit and similar stuff (at least in most common > scenarios) > (for example, TCP output definitely can benefit from this, since we can > account flowid once for TCP session and use it in every mbuf) > > > So. Imagine we have done all this. How can we estimate the difference? > > There was a thread, started a year ago, describing 'stock' performance and > difference for various modifications. > It is done on 8.x, however I've got similar results on recent 9.x > > http://lists.freebsd.org/pipermail/freebsd-net/2012-July/032680.html > > Briefly: > > 2xE5645 @ Intel 82599 NIC. > Kernel: FreeBSD-8-S r237994, stock drivers, stock routing, no FLOWTABLE, > no firewall > Ixia XM2 (traffic generator) <> ix0 (FreeBSD). Ixia sends 64byte > IP packets from vlan10 (10.100.0.64 - 10.100.0.156) to destinations in > vlan11 (10.100.1.128 - 10.100.1.192). Static arps are configured for all > destination addresses. Traffic level is slightly above or slightly below > system performance. > > we start from 1.4MPPS (if we are using several routes to minimize mutex > contention).
> > My 'current' result for the same test, on same HW, with the following > modifications: > > * 1) ixgbe per-packet ring unlock removed > * P1) ixgbe is modified to do direct vlan input (so 2,3 are not used) > * 4) separate lockless in_localip() version > * 6) - using existing pfil lock > * 7) using lockless version > * 8) radix converted to use rmlock instead of rlock. Delayed GC is used > instead of mutexes > * 10) - using existing pfil lock > * 11) using radix lock to do arpresolve(). Not using lle rlock > > (so the rmlocks are the only locks used on data path). > > Additionally, ipstat counters are converted to PCPU (no real performance > implications). > ixgbe does not do per-packet accounting (as in head). > if_vlan counters are converted to PCPU > lagg is converted to rmlock, per-packet accounting is removed (using stat > from underlying interfaces) > lle hash size is bumped to 1024 instead of 32 (not applicable here, but > slows things down for large L2 domains) > > The result is 5.6 MPPS for single port (11 cores) and 6.5MPPS for lagg (16 > cores), nearly the same for HT on and 22 cores. > > .. > while Intel DPDK claims 80MPPS (and 6windgate talks about 160 or so) on > the same-class hardware and _userland_ forwarding. > > One of key features making all such products possible (DPDK, netmap, > packetshader, Cisco SW forwarding) - is use of batching instead of > process-to-completion model. > Batching mitigates locking cost, batching does not wash out CPU cache, and > so on. > > So maybe we can consider passing batches from NIC to at least L2 layer > with netisr? or even up to ip_input() ? > > Another question is about making some sort of reliable GC like ("passive > serialization" or other similar not-to-pronounce-words about Linux and > lockless objects). > > > P.S. Attached patches are 1) for 8.x 2) mostly 'hacks' showing roughly how > can this be done and what benefit can be achieved. 
> > > > > > > > > > > _______________________________________________ > freebsd-arch@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" > From owner-freebsd-hackers@FreeBSD.ORG Wed Aug 28 22:13:20 2013 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 794C963B; Wed, 28 Aug 2013 22:13:20 +0000 (UTC) (envelope-from jilles@stack.nl) Received: from mx1.stack.nl (unknown [IPv6:2001:610:1108:5012::107]) (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 207F62BB4; Wed, 28 Aug 2013 22:13:20 +0000 (UTC) Received: from snail.stack.nl (snail.stack.nl [IPv6:2001:610:1108:5010::131]) by mx1.stack.nl (Postfix) with ESMTP id 3CDF9120207; Thu, 29 Aug 2013 00:13:04 +0200 (CEST) Received: by snail.stack.nl (Postfix, from userid 1677) id 20BF828494; Thu, 29 Aug 2013 00:13:04 +0200 (CEST) Date: Thu, 29 Aug 2013 00:13:04 +0200 From: Jilles Tjoelker To: Andriy Gapon Subject: Re: [kde-freebsd] virtualbox file dialog problem Message-ID: <20130828221303.GA53931@stack.nl> References: <51E6B030.1080009@FreeBSD.org> <51E793DB.2020607@FreeBSD.org> <521DE891.9070107@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <521DE891.9070107@FreeBSD.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Greg Rivers , kde@FreeBSD.org, freebsd-gnome@FreeBSD.org, freebsd-hackers@FreeBSD.org, freebsd-security@FreeBSD.org, freebsd-emulation@FreeBSD.org, freebsd-standards@FreeBSD.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , 
List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 28 Aug 2013 22:13:20 -0000 On Wed, Aug 28, 2013 at 03:09:53PM +0300, Andriy Gapon wrote: > on 18/07/2013 10:06 Andriy Gapon said the following: > > on 18/07/2013 03:25 Greg Rivers said the following: > >> On Wed, 17 Jul 2013, Andriy Gapon wrote: > >>> I run virtualbox in KDE environment. A while ago (can't say > >>> exactly when) I started to have a problem where any file opening > >>> dialog would fail with this message: "Cannot talk to klauncher: > >>> Not connected to D-Bus server" > >>> > >>> I found that setting KDE_FORK_SLAVES=1 in environment works around > >>> the problem. > >> > >> I reported this same problem in this[1] thread on freebsd-ports@. > >> In that post I provided a link to a similar report for KDE on > >> openSUSE that required a dbus patch to fix. > >> I'm guessing that either the latest versions of VirtualBox have a > >> bug in their dbus interface, or the version of dbus we have needs > >> to be updated. > >> [1] http://lists.freebsd.org/pipermail/freebsd-ports/2013-July/084783.html > > I saw those OpenSUSE reports but I think that they were against the > > much older version of dbus. > I have done some more investigation and the problems turns out to be dbus > related indeed. > The problem has only a tangential relation to KDE, so I plan to drop > kde@ from this thread. It has a relation to what VirtualBox does, so > I am keeping emulation@. It is related to dbus and gnome@ is its > maintainer(s). It is also related to how issetugid(2) works, so I am > including standards@, security@ and hackers@. So, please excuse me for > such a wide distribution list, but I think that the solution should be > negotiated among the parties involved. > Now a description of the problem. > 1. VirtualBox executable is installed setuid root. 
Apparently, when > it is run it does some privileged things and then drops all of the > uids and gids (real, effective and saved) back to what they should > have been originally. VirtualBox does not do any (re-)exec of itself > after the above manipulations. > 2. issetugid(2) (which is apparently a BSD extension) on FreeBSD does > not consider the above manipulations as sufficient to mark an > executable as untainted. So it would return 1 for the VirtualBox > process. The manipulations do not guarantee that all privileged information and descriptors are no longer in the process. Often, a process will acquire some privileged resource and then drop to user credentials; for example, a raw socket in ping(8). Also, calls like getpwuid() might leave privileged information in memory. > 3. dbus code seems to impose some limitations on communication by such > "tainted" processes. It has the following code: > http://cgit.freedesktop.org/dbus/dbus/tree/dbus/dbus-sysdeps-unix.c#n4139 > For web-impaired :) the gist is that on BSD systems the code uses > issetugid but on other systems (like Linux) it uses getresuid and > getresgid and checks that all 3 uids are the same and all 3 gids are > the same. > As a result, on FreeBSD the dbus code would consider the VirtualBox > process tainted and that impairs its communication with KDE > components. On systems without issetugid or those that implement it > differently, dbus would work as for a normal process and all the > communications are OK. > I've also verified this conclusion by forcing dbus to use the > alternative logic on FreeBSD. I think dbus is doing the right thing on BSD and the getresuid/getresgid-based check on Linux is a bug. This bug was reported on https://bugs.freedesktop.org/show_bug.cgi?id=52202 however it was decided not to fix the bug because gnome-keyring-daemon relies on it. 
The gnome-keyring-daemon obtains cap_ipc_lock privilege (capability in Linux terms) from the filesystem and needs untrusted environment variables to work. (Note that this also means that moving a program from setuid root to capabilities may decrease security, since dbus and glib no longer know to be careful.) > So, possible solutions: > A. change how issetugid(2) works on FreeBSD; a comment in > sys_issetugid hints that other BSDs may have different behaviors I think it works correctly. By the way, issetugid(2) man page appears a bit too focused on UIDs/GIDs. The implementation also sets the bit (and rightly so) if MAC causes a transition on execve(2) or if jail_attach(2) is called. > B. change VirtualBox to be friendly to FreeBSD issetugid(2) and exec itself > after dropping the privileges This would be good, but it may need invasive changes to VirtualBox that its developers do not want to make. > C. patch dbus port to not use issetugid(2) This may open up security holes. > D. something else Two ideas. Firstly, a hack in VirtualBox that subverts issetugid() or _dbus_check_setuid() somehow may be appropriate. This does not require invasive changes to VirtualBox, and if you want a secure system you do not install VirtualBox anyway. This subversion could be done by overwriting the code of issetugid() or by inserting a dummy implementation of issetugid() with FBSD_1.0 version before libc.so in the lookup order, for example. Secondly, if setting KDE_FORK_SLAVES=1 works around the problem, perhaps KDE should behave that way automatically if it is called from a process with issetugid() true. 
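To make the difference between the two checks concrete, here is a minimal sketch of the credential-only test described above for systems without issetugid(2). It is an illustration, not the dbus source: the struct cred_ids type and the creds_look_untainted() name are hypothetical, and the IDs are passed in as plain numbers so the logic can be read in isolation.

```c
#include <stdbool.h>

/* Hypothetical container for the six IDs of a process. */
struct cred_ids {
    unsigned long ruid, euid, suid;   /* real, effective, saved UID */
    unsigned long rgid, egid, sgid;   /* real, effective, saved GID */
};

/*
 * Credential-only check: the process is treated as untainted when all
 * three UIDs match and all three GIDs match.
 *
 * This looks only at the current IDs. A process that started setuid
 * root and later reset every ID (as VirtualBox is described as doing)
 * passes, even though it may still hold privileged descriptors or data.
 * issetugid(2) on FreeBSD keeps reporting such a process as tainted.
 */
static bool creds_look_untainted(const struct cred_ids *c)
{
    return c->ruid == c->euid && c->euid == c->suid &&
           c->rgid == c->egid && c->egid == c->sgid;
}
```

In a real program the six values would come from getresuid(2) and getresgid(2). The sketch shows why the two approaches disagree for VirtualBox: after the drop the IDs are all equal, so only a check that remembers the setuid start still flags the process.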
-- Jilles Tjoelker From owner-freebsd-hackers@FreeBSD.ORG Wed Aug 28 22:25:04 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id AD3F6D3E for ; Wed, 28 Aug 2013 22:25:04 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 8E8D52CD0 for ; Wed, 28 Aug 2013 22:25:03 +0000 (UTC) Received: (qmail 22174 invoked from network); 28 Aug 2013 23:06:41 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 28 Aug 2013 23:06:41 -0000 Message-ID: <521E78B0.6080709@freebsd.org> Date: Thu, 29 Aug 2013 00:24:48 +0200 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130801 Thunderbird/17.0.8 MIME-Version: 1.0 To: "Alexander V. Chernikov" Subject: Re: Network stack changes References: <521E41CB.30700@yandex-team.ru> In-Reply-To: <521E41CB.30700@yandex-team.ru> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: adrian@freebsd.org, freebsd-hackers@freebsd.org, FreeBSD Net , luigi@freebsd.org, ae@FreeBSD.org, Gleb Smirnoff , freebsd-arch@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 28 Aug 2013 22:25:04 -0000 On 28.08.2013 20:30, Alexander V. Chernikov wrote: > Hello list! Hello Alexander, you sent quite a few things in the same email. I'll try to respond as much as I can right now. 
Later you should split it up to have more in-depth discussions on the individual parts. If you could make it to the EuroBSDcon 2013 DevSummit that would be even more awesome. Most of the active network stack people will be there too. > There is a lot constantly raising discussions related to networking stack performance/changes. > > I'll try to summarize current problems and possible solutions from my point of view. > (Generally this is one problem: stack is slooooooooooooooooooooooooooow, but we need to know why and > what to do). Compared to others its not thaaaaaaat slow. ;) > Let's start with current IPv4 packet flow on a typical router: > http://static.ipfw.ru/images/freebsd_ipv4_flow.png > > (I'm sorry I can't provide this as text since Visio don't have any 'ascii-art' exporter). > > Note that we are using process-to-completion model, e.g. process any packet in ISR until it is either > consumed by L4+ stack or dropped or put to egress NIC queue. > > (There is also deferred ISR model implemented inside netisr but it does not change much: > it can help to do more fine-grained hashing (for GRE or other similar traffic), but > 1) it uses per-packet mutex locking which kills all performance > 2) it currently does not have _any_ hashing functions (see absence of flags in `netstat -Q`) > People using http://static.ipfw.ru/patches/netisr_ip_flowid.diff (or modified PPPoe/GRE version) > report some profit, but without fixing (1) it can't help much > ) > > So, let's start: > > 1) Ixgbe uses mutex to protect each RX ring which is perfectly fine since there is nearly no contention > (the only thing that can happen is driver reconfiguration which is rare and, more signifficant, we > do this once > for the batch of packets received in given interrupt). However, due to some (im)possible deadlocks > current code > does per-packet ring unlock/lock (see ixgbe_rx_input()). 
> There was a discussion that ended with nothing:
> http://lists.freebsd.org/pipermail/freebsd-net/2012-October/033520.html
>
> 1*) Possible BPF users. Here we have one rlock if there are any readers present
> (and mutex for any matching packets, but this is more or less OK. Additionally, there is WIP to
> implement multiqueue BPF
> and there is chance that we can reduce lock contention there).

Rlock to rmlock?

> There is also an "optimize_writers" hack permitting applications
> like CDP to use BPF as writers but not registering them as receivers (which implies rlock)

I believe longer term we should solve this with a protocol type
"ethernet" so that one can send/receive ethernet frames through a normal
socket.

> 2/3) Virtual interfaces (laggs/vlans over lagg and other similar constructions).
> Currently we simply use rlock to make s/ix0/lagg0/ and, what is much more funny - we use complex
> vlan_hash with another rlock to
> get vlan interface from underlying one.
>
> This is definitely not how things should be done and this can be changed more or less easily.

Indeed.

> There are some useful terms/techniques in world of software/hardware routing: they have clear
> 'control plane' and 'data plane' separation.
> The former deals with control traffic (IGP, MLD, IGMP snooping, lagg hellos, ARP/NDP, etc.) and
> some data traffic (packets with TTL=1, with options, destined to hosts without ARP/NDP record, and
> similar). Latter one is done in hardware (or effective software implementation).
> Control plane is responsible to provide data for efficient data plane operations. This is the point
> we are missing nearly everywhere.

ACK.

> What I want to say is: lagg is pure control-plane stuff and vlan is nearly the same. We can't apply
> this approach to complex cases like lagg-over-vlans-over-vlans-over-(pppoe_ng0-and_wifi0)
> but we definitely can do this for most common setups like (igb* or ix* in lagg with or without vlans
> on top of lagg).

ACK.
> We already have some capabilities like VLANHWFILTER/VLANHWTAG, we can add some more. We even have
> per-driver hooks to program HW filtering.

We could. Though for vlan it looks like it would be easier to remove the
hardware vlan tag stripping and insertion. It only adds complexity in all
drivers for no gain.

> One small step to do is to throw packet to vlan interface directly (P1), proof-of-concept (working in
> production):
> http://lists.freebsd.org/pipermail/freebsd-net/2013-April/035270.html
>
> Another is to change lagg packet accounting:
> http://lists.freebsd.org/pipermail/svn-src-all/2013-April/067570.html
> Again, this is more like HW boxes do (aggregate all counters including errors) (and I can't imagine
> what real error we can get from _lagg_).
>
> 4) If we are a router, we can do either slooow ip_input() -> ip_forward() -> ip_output() cycle or use
> optimized ip_fastfwd() which falls back to 'slow' path for multicast/options/local traffic (e.g.
> works exactly like 'data plane' part).
> (Btw, we can consider net.inet.ip.fastforwarding to be turned on by default at least for non-IPSEC
> kernels)

ACK.

> Here we have to determine if this is local packet or not, e.g. F(dst_ip) returning 1 or 0. Currently
> we are simply using standard rlock + hash of iface addresses.
> (And some consumers like ipfw(4) do the same, but without lock).
> We don't need to do this! We can build sorted array of IPv4 addresses or other efficient structure
> on every address change and use it unlocked with delayed garbage collection (proof-of-concept attached)

I'm a bit uneasy with unlocked access. On very weakly ordered
architectures this could trip over cache coherency issues. An rmlock is
essentially free in the read case.

> (There is another thing to discuss: maybe we can do this once somewhere in ip_input and mark mbuf as
> 'local/non-local' ? )

The problem is packet filters may change the destination address and thus
can invalidate such a lookup.
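
As a rough illustration of the sorted-array idea for F(dst_ip) quoted
above, here is a minimal userland C sketch. It is not the attached
proof-of-concept: the names local_addrs_sort() and is_local_ip() are
hypothetical, addresses are assumed to be in host byte order, and the
question of how readers are protected (delayed GC versus an rmlock, as
debated above) is deliberately left out.

```c
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of two IPv4 addresses held as uint32_t. */
static int
cmp_u32(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return ((x > y) - (x < y));
}

/*
 * Control-plane side (hypothetical): rebuild the sorted snapshot whenever
 * an interface address is added or removed.
 */
static void
local_addrs_sort(uint32_t *addrs, size_t n)
{
	qsort(addrs, n, sizeof(addrs[0]), cmp_u32);
}

/*
 * Data-plane side (hypothetical): F(dst_ip), returns 1 if dst is one of
 * our addresses and 0 otherwise. Read-only binary search, O(log n).
 */
static int
is_local_ip(const uint32_t *addrs, size_t n, uint32_t dst)
{
	return (bsearch(&dst, addrs, n, sizeof(addrs[0]), cmp_u32) != NULL);
}
```

The lookup itself never writes to the array, so the only open question is
how a reader is guaranteed to see a complete snapshot while the control
plane swaps in a new one.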
> 5, 9) Currently we have L3 ingress/egress PFIL hooks protected by rmlocks. This is OK.
>
> However, 6) and 7) are not.
> Firewall can use the same pfil lock as reader protection without imposing its own lock. Currently
> pfil&ipfw code is ready to do this.

The problem with the global pfil rmlock is the comparatively long time it
is held in a locked state. Also packet filters may have to acquire
additional locks when they have to modify state tables. Rmlocks are not
made for that because they pin the thread to the cpu they're currently on.
This is what Gleb is complaining about.

My idea is to hold the pfil rmlock only for the lookup of the first/next
packet filter that will run, not for the entire duration. That would solve
the problem. However, packet filters then have to use their own locks
again, which could be rmlocks too.

> 8) Radix/rt* api. This is probably the worst place in entire stack. It is toooo generic, tooo slow
> and buggy (do you use IPv6? you definitely know what I'm talking about).
> A) It really is too generic and assumption that it can be (effectively) used for every family is
> wrong. Two examples:
> we don't need to lookup all 128 bits of IPv6 address. Subnets with mask >/64 are not used widely
> (actually the only reason to use them are p2p links due to ND potential problems).
> One of common solutions is to lookup 64bits, and build another trie (or other structure) in case of
> collision.
> Another example is MPLS where we can simply do direct array lookup based on ingress label.

Yes. While we shouldn't throw it out, it should be run as RIB and allow a
much more protocol specific FIB for the hot packet path.

> B) It is terribly slow (AFAIR luigi@ did some performance measurement, numbers available in one of
> netmap pdfs)

Again not thaaaat slow but inefficient enough.

> C) It is not multipath-capable. Stateful (and non-working) multipath is definitely not the right way.

Indeed.

> 8*) rtentry
> We are doing it wrong.
> Currently _every_ lookup locks/unlocks given rte twice.
> First lock is related to an old-old story for trusting IP redirects (and auto-adding host routes
> for them). Hopefully currently it is disabled automatically when you turn forwarding on.

They're disabled.

> The second one is much more complicated: we are assuming that rte's with non-zero refcount value can
> stop egress interface from being destroyed.
> This is wrong (but widely used) assumption.

Not really. The reason for the refcount is not the ifp reference but other
code parts that may hold direct pointers to the rtentry and do direct
dereferencing to access information in it.

> We can use delayed GC instead of locking for rte's and this won't break things more than they are
> broken now (patch attached).

Nope. Delayed GC is not the way to go here. To do away with rtentry
locking and refcounting we have to change rtalloc(9) to return the
information the caller wants (e.g. ifp, ia, others) and not the rtentry
address anymore. So instead of rtalloc() we have rtlookup().

> We can't do the same for ifp structures since
> a) virtual ones can assume some state in underlying physical NIC
> b) physical ones just _can_ be destroyed (maybe regardless of whether the user wants this or not, like: SFP
> being unplugged from NIC) or simply lead to kernel crash due to SW/HW inconsistency

Here I actually believe we can do a GC or stable storage based approach.
Ifp pointers are kept in too many places and properly refcounting it is
very (too) hard. So whenever an interface gets destroyed or disappears its
callable function pointers are replaced with dummies returning an error.
The ifp in memory will stay for some time and even may be reused for
another new interface later again (Cisco does it that way in their IOS).

> One possible solution is to implement stable refcounts based on PCPU counters, and apply those
> > > Another rtalloc(9) problem is the fact that radix is used as both 'control plane' and 'data plane' > structure/api. Some users always want to put more information in rte, while others > want to make rte more compact. We just need _different_ structures for that. ACK. > Feature-rich, lot-of-data control plane one (to store everything we want to store, including, for > example, PID of process originating the route) - current radix can be modified to do this. > And address-family-depended another structure (array, trie, or anything) which contains _only_ data > necessary to put packet on the wire. ACK. > 11) arpresolve. Currently (this was decoupled in 8.x) we have > a) ifaddr rlock > b) lle rlock. > > We don't need those locks. > We need to > a) make lle layer per-interface instead of global (and this can also solve multiple fibs and L2 > mappings done in fib.0 issue) Yes! > b) use rtalloc(9)-provided lock instead of separate locking No. Interface rmlock. > c) actually, we need to do rewrite this layer because > d) lle actually is the place to do real multipath: No, you can do multipath through more than one interface. If lle is per interface that wont work and is not the right place. > briefly, > you have rte pointing to some special nexthop structure pointing to lle, which has the following data: > num_of_egress_ifaces: [ifindex1, ifindex2, ifindex3] | L2 data to prepend to header > Separate post will follow. This should be part of the RIB/FIB and select on of the ifp+nexthops to return on lookup. > With the following, we can achieve lagg traffic distribution without actually using lagg_transmit > and similar stuff (at least in most common scenarious) This seems to be a rather nasty layering violation. > (for example, TCP output definitely can benefit from this, since we can account flowid once for TCP > session and use in in every mbuf) > > So. Imagine we have done all this. How we can estimate the difference? 
>
> There was a thread, started a year ago, describing 'stock' performance and difference for various
> modifications.
> It is done on 8.x, however I've got similar results on recent 9.x
>
> http://lists.freebsd.org/pipermail/freebsd-net/2012-July/032680.html
>
> Briefly:
>
> 2xE5645 @ Intel 82599 NIC.
> Kernel: FreeBSD-8-S r237994, stock drivers, stock routing, no FLOWTABLE, no firewall
> Ixia XM2 (traffic generator) <> ix0 (FreeBSD). Ixia sends 64byte IP packets from vlan10 (10.100.0.64 -
> 10.100.0.156) to destinations in vlan11 (10.100.1.128 - 10.100.1.192). Static arps are configured
> for all destination addresses. Traffic level is slightly above or slightly below system performance.
>
> we start from 1.4MPPS (if we are using several routes to minimize mutex contention).
>
> My 'current' result for the same test, on same HW, with the following modifications:
>
> * 1) ixgbe per-packet ring unlock removed
> * P1) ixgbe is modified to do direct vlan input (so 2,3 are not used)
> * 4) separate lockless in_localip() version
> * 6) - using existing pfil lock
> * 7) using lockless version
> * 8) radix converted to use rmlock instead of rlock. Delayed GC is used instead of mutexes
> * 10) - using existing pfil lock
> * 11) using radix lock to do arpresolve(). Not using lle rlock
>
> (so the rmlocks are the only locks used on data path).
>
> Additionally, ipstat counters are converted to PCPU (no real performance implications).
> ixgbe does not do per-packet accounting (as in head).
> if_vlan counters are converted to PCPU
> lagg is converted to rmlock, per-packet accounting is removed (using stat from underlying interfaces)
> lle hash size is bumped to 1024 instead of 32 (not applicable here, but slows things down for large
> L2 domains)
>
> The result is 5.6 MPPS for single port (11 cores) and 6.5MPPS for lagg (16 cores), nearly the same
> for HT on and 22 cores.

That's quite good, but we want more. ;)

> ..
> while Intel DPDK claims 80MPPS (and 6windgate talks about 160 or so) on the same-class hardware and
> _userland_ forwarding.

Those numbers sound a bit far out. Maybe if the packet isn't touched or
looked at at all in a pure netmap interface to interface bridging
scenario. I don't believe these numbers.

> One of key features making all such products possible (DPDK, netmap, packetshader, Cisco SW
> forwarding) - is use of batching instead of process-to-completion model.
> Batching mitigates locking cost, batching does not wash out CPU cache, and so on.

The work has to be done eventually. Batching doesn't relieve us of it.
IMHO batch moving is only the last step we should look at. It makes the
stack rather complicated and introduces other issues like packet latency.

> So maybe we can consider passing batches from NIC to at least L2 layer with netisr? or even up to
> ip_input() ?

And then? You probably won't win much in the end (if the lock path is
optimized).

> Another question is about making some sort of reliable GC like ("passive serialization" or other
> similar not-to-pronounce-words about Linux and lockless objects).

Rmlocks are our secret weapon and just as good.

> P.S. Attached patches are 1) for 8.x 2) mostly 'hacks' showing roughly how can this be done and what
> benefit can be achieved.
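
For a sense of scale on the packet rates quoted above, here is a
back-of-the-envelope sketch in C. It assumes minimum-size 64-byte Ethernet
frames and the standard 20 bytes of per-frame wire overhead (preamble,
start-of-frame delimiter and inter-frame gap); the quoted claims do not
state a frame size, so that part is an assumption, and the helper names
are made up for illustration.

```c
#include <stdint.h>

/* Preamble (7) + SFD (1) + inter-frame gap (12), in bytes per frame. */
#define ETH_WIRE_OVERHEAD	20

/* Bits per second on the wire for a given packet rate and frame size. */
static uint64_t
wire_bits_per_sec(uint64_t pps, uint64_t frame_bytes)
{
	return (pps * (frame_bytes + ETH_WIRE_OVERHEAD) * 8);
}

/* Maximum whole packets per second a link can carry at this frame size. */
static uint64_t
max_pps(uint64_t link_bps, uint64_t frame_bytes)
{
	return (link_bps / ((frame_bytes + ETH_WIRE_OVERHEAD) * 8));
}
```

Under these assumptions max_pps(10000000000, 64) gives 14880952, the
familiar 14.88 Mpps ceiling of one 10GbE port, and
wire_bits_per_sec(80000000, 64) gives 53760000000, so 80 Mpps of 64-byte
frames would need at least six 10GbE ports' worth of wire bandwidth.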
-- Andre From owner-freebsd-hackers@FreeBSD.ORG Thu Aug 29 01:30:37 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id A19FC88D; Thu, 29 Aug 2013 01:30:37 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) by mx1.freebsd.org (Postfix) with ESMTP id 5B62426E5; Thu, 29 Aug 2013 01:30:37 +0000 (UTC) Received: from slw by zxy.spb.ru with local (Exim 4.69 (FreeBSD)) (envelope-from ) id 1VEr6H-000In5-Q7; Thu, 29 Aug 2013 05:32:41 +0400 Date: Thu, 29 Aug 2013 05:32:41 +0400 From: Slawa Olhovchenkov To: Andre Oppermann Subject: Re: Network stack changes Message-ID: <20130829013241.GB70584@zxy.spb.ru> References: <521E41CB.30700@yandex-team.ru> <521E78B0.6080709@freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <521E78B0.6080709@freebsd.org> User-Agent: Mutt/1.5.21 (2010-09-15) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false X-Mailman-Approved-At: Thu, 29 Aug 2013 02:25:59 +0000 Cc: "Alexander V. Chernikov" , adrian@freebsd.org, freebsd-hackers@freebsd.org, freebsd-arch@freebsd.org, luigi@freebsd.org, ae@FreeBSD.org, Gleb Smirnoff , FreeBSD Net X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 01:30:37 -0000 On Thu, Aug 29, 2013 at 12:24:48AM +0200, Andre Oppermann wrote: > > .. > > while Intel DPDK claims 80MPPS (and 6windgate talks about 160 or so) on the same-class hardware and > > _userland_ forwarding. > > Those numbers sound a bit far out. 
Maybe if the packet isn't touched > or looked at at all in a pure netmap interface to interface bridging > scenario. I don't believe these numbers. 80*64*8 = 40.960 Gb/s. Maybe DCA? And use a CPU with 40 PCIe lanes and 4 memory channels. From owner-freebsd-hackers@FreeBSD.ORG Thu Aug 29 06:46:54 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 0F26FE7D; Thu, 29 Aug 2013 06:46:54 +0000 (UTC) (envelope-from bryanv@daemoninthecloset.org) Received: from torment.daemoninthecloset.org (torment.daemoninthecloset.org [94.242.209.234]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id C098F2912; Thu, 29 Aug 2013 06:46:53 +0000 (UTC) Received: from sage.daemoninthecloset.org (unknown [70.114.209.60]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "sage.daemoninthecloset.org", Issuer "daemoninthecloset.org" (verified OK)) by torment.daemoninthecloset.org (Postfix) with ESMTPS id DFBE342C08C6; Thu, 29 Aug 2013 08:52:03 +0200 (CEST) X-Virus-Scanned: amavisd-new at daemoninthecloset.org X-Virus-Scanned: amavisd-new at daemoninthecloset.org Date: Thu, 29 Aug 2013 01:46:32 -0500 (CDT) From: Bryan Venteicher To: Andre Oppermann Message-ID: <2112475076.435.1377758792082.JavaMail.root@daemoninthecloset.org> In-Reply-To: <521E78B0.6080709@freebsd.org> References: <521E41CB.30700@yandex-team.ru> <521E78B0.6080709@freebsd.org> Subject: Re: Network stack changes MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [192.168.10.20] X-Mailer: Zimbra 8.0.2_GA_5569 (ZimbraWebClient - GC20 ([unknown])/8.0.2_GA_5569) Thread-Topic: Network stack changes Thread-Index: anDUShTn7iVw7wFEqZDuK6ld/6VXsQ== Cc: "Alexander V.
Chernikov" , adrian@freebsd.org, freebsd-hackers@freebsd.org, freebsd-arch@freebsd.org, luigi@freebsd.org, ae@FreeBSD.org, Gleb Smirnoff , FreeBSD Net X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 06:46:54 -0000 ----- Original Message ----- > On 28.08.2013 20:30, Alexander V. Chernikov wrote: > > Hello list! > > Hello Alexander, > > you sent quite a few things in the same email. I'll try to respond > as much as I can right now. Later you should split it up to have > more in-depth discussions on the individual parts. > > > > We already have some capabilities like VLANHWFILTER/VLANHWTAG, we can add > > some more. We even have > > per-driver hooks to program HW filtering. > > We could. Though for vlan it looks like it would be easier to remove the > hardware vlan tag stripping and insertion. It only adds complexity in all > drivers for no gain. > In the shorter term, can we remove the requirement for the parent interface to support IFCAP_VLAN_HWTAGGING in order to do checksum offloading on the VLAN interface (see vlan_capabilities())? 
From owner-freebsd-hackers@FreeBSD.ORG Thu Aug 29 11:49:34 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id B9C08243; Thu, 29 Aug 2013 11:49:34 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Received: from mail-wi0-x234.google.com (mail-wi0-x234.google.com [IPv6:2a00:1450:400c:c05::234]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 566572FB7; Thu, 29 Aug 2013 11:49:33 +0000 (UTC) Received: by mail-wi0-f180.google.com with SMTP id l12so352069wiv.13 for ; Thu, 29 Aug 2013 04:49:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=t5rRGR7p2lccT5dIVAFMMz5iEhB8EN3uCWH29jaGBug=; b=jG6EG9SHnUTL+LWD0mt8ZdlyhVyFrBZee9eO0wArZSQi7Kxa4sipEEiBbicH27NlRE WCFqBBALUkxOLkfAinjqMBlaV/iJhly1bozkC2JSX40PczqetRoSxgspp1/Uf8S+/7Y/ SAPOMG5R/RfYBn/5LaIxPpziJpJ8uJvmxiuc1U90ViJZGA7R/XjoJgRyDWubRr53+sIM prBz7ivSPp48uUqSxvRc6u09Edy/XM3+hSFHKyMWPMoP/isaPhtr5W6IrGK0lz1Cm4Oc SNbYlKGLpk1MD4mzIVxrREDDjMFyu8VVJjqpBIM9jv8oxMF2TcPAFeMlBXnnKrl9XP9A PNgw== MIME-Version: 1.0 X-Received: by 10.194.79.33 with SMTP id g1mr2141120wjx.79.1377776971643; Thu, 29 Aug 2013 04:49:31 -0700 (PDT) Sender: adrian.chadd@gmail.com Received: by 10.216.146.2 with HTTP; Thu, 29 Aug 2013 04:49:31 -0700 (PDT) In-Reply-To: <521E41CB.30700@yandex-team.ru> References: <521E41CB.30700@yandex-team.ru> Date: Thu, 29 Aug 2013 04:49:31 -0700 X-Google-Sender-Auth: fjTZLF4GZ_Hxxlda_cdxncMN6aA Message-ID: Subject: Re: Network stack changes From: Adrian Chadd To: "Alexander V. 
Chernikov" Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: Luigi Rizzo , Andre Oppermann , "freebsd-hackers@freebsd.org" , FreeBSD Net , "Andrey V. Elsukov" , Gleb Smirnoff , "freebsd-arch@freebsd.org" X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 11:49:34 -0000

Hi,

There's a lot of good stuff to review here, thanks!

Yes, the ixgbe RX lock needs to die in a fire. It's kinda pointless to
keep locking things like that on a per-packet basis. We should be able to
do this in a cleaner way - we can defer RX into a CPU pinned taskqueue and
convert the interrupt handler to a fast handler that just schedules that
taskqueue. We can ignore the ithread entirely here.

What do you think?

Totally pie in the sky handwaving at this point:

* create an array of mbuf pointers for completed mbufs;
* populate the mbuf array;
* pass the array up to ether_demux().

For vlan handling, it may end up populating its own list of mbufs to push
up to ether_demux(). So maybe we should extend the API to have a bitmap of
packets to actually handle from the array, so we can pass up a larger
array of mbufs, note which ones are for the destination and then the
upcall can mark which frames it's consumed.

I specifically wonder how much work/benefit we may see by doing:

* batching packets into lists so various steps can batch process things
  rather than run to completion;
* batching the processing of a list of frames under a single lock instance
  - eg, if the forwarding code could do the forwarding lookup for 'n'
  packets under a single lock, then pass that list of frames up to
  inet_pfil_hook() to do the work under one lock, etc, etc.
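
To make the array-plus-bitmap handwaving above a little more concrete,
here is a small userland C sketch. Everything in it is hypothetical:
struct pkt stands in for struct mbuf, and pkt_batch, batch_add() and
batch_consume_vlan() are invented names, not an existing or proposed
FreeBSD API. It only shows one way a layer could mark which frames of a
batch it has consumed.

```c
#include <stddef.h>
#include <stdint.h>

#define BATCH_MAX	64	/* one bit per slot in a 64-bit bitmap */

/* Stand-in for struct mbuf; only the field this sketch needs. */
struct pkt {
	int	vlan;
};

struct pkt_batch {
	struct pkt	*pkts[BATCH_MAX];
	size_t		 count;
	uint64_t	 pending;	/* bit i set: pkts[i] not yet consumed */
};

static void
batch_init(struct pkt_batch *b)
{
	b->count = 0;
	b->pending = 0;
}

/* Append a completed packet; returns -1 when the batch is full. */
static int
batch_add(struct pkt_batch *b, struct pkt *p)
{
	if (b->count >= BATCH_MAX)
		return (-1);
	b->pending |= (uint64_t)1 << b->count;
	b->pkts[b->count++] = p;
	return (0);
}

/*
 * An upper layer walks the batch once, takes the frames that belong to
 * it (here: a given vlan) and clears their bits so the caller knows
 * which frames are still to be dispatched. Returns the number consumed.
 */
static size_t
batch_consume_vlan(struct pkt_batch *b, int vlan)
{
	size_t i, n = 0;

	for (i = 0; i < b->count; i++) {
		uint64_t bit = (uint64_t)1 << i;

		if ((b->pending & bit) != 0 && b->pkts[i]->vlan == vlan) {
			b->pending &= ~bit;
			n++;
		}
	}
	return (n);
}
```

A real version would also have to decide who owns the lock while the batch
is walked and what happens to frames nobody claims; the sketch says
nothing about either.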
Here, the processing would look less like "grab lock and process to completion" and more like "mark and sweep" - ie, we have a list of frames that we mark as needing processing and mark as having been processed at each layer, so we know where to next dispatch them. I still have some tool coding to do with PMC before I even think about tinkering with this as I'd like to measure stuff like per-packet latency as well as top-level processing overhead (ie, CPU_CLK_UNHALTED.THREAD_P / lagg0 TX bytes/pkts, RX bytes/pkts, NIC interrupts on that core, etc.) Thanks, -adrian From owner-freebsd-hackers@FreeBSD.ORG Thu Aug 29 16:37:51 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id F195315E for ; Thu, 29 Aug 2013 16:37:51 +0000 (UTC) (envelope-from gibblertron@gmail.com) Received: from mail-oa0-x22a.google.com (mail-oa0-x22a.google.com [IPv6:2607:f8b0:4003:c02::22a]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id C187226B6 for ; Thu, 29 Aug 2013 16:37:51 +0000 (UTC) Received: by mail-oa0-f42.google.com with SMTP id j10so724109oah.29 for ; Thu, 29 Aug 2013 09:37:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=6ZnesQJvSuzdxQdcLuP9OUxx3ja8H5DGa1xqJ9zM3Ts=; b=BlddKEO8gJM5yCTVLnwNORfrY8/tB8EK1oepQ37aw0itGsZCeCXHVJlEQtoiUxWTTL YL4HUYOhnNi7iGb+RFH1CMBItczgeQx24gFV7Cyfriv4cltpLZJeIC1VTPTC8G1ZNsdY GZPFNQVfzuA2GzRQXZZjhsZn/cwlhCb415E+KeznJeJEkWjhRzpamCGJRkqR/MzYtMdU i1N6gLjlifm0rmsZpmQxxRDaefAse3njIqand+YkWfZZBXmgQNOvCk02gGFA09cLGDFF P2Ob/AQkXgNwtxK7BmMmhyUwM8ZCdn7hlmOX6NFcy6/CN4C1v+b3KBT2JkahSoNqIa1O UB+w== MIME-Version: 1.0 X-Received: by 10.60.133.233 with SMTP id 
pf9mr3210637oeb.46.1377794271003; Thu, 29 Aug 2013 09:37:51 -0700 (PDT) Received: by 10.182.45.228 with HTTP; Thu, 29 Aug 2013 09:37:50 -0700 (PDT) Date: Thu, 29 Aug 2013 09:37:50 -0700 Message-ID: Subject: Fatal trap 12 going from 8.2 to 8.4 with ZFS From: Patrick To: freebsd-hackers@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 16:37:52 -0000 I've got a system running on a VPS that I'm trying to upgrade from 8.2 to 8.4. It has a ZFS root. After booting the new kernel, I get: Fatal trap 12: page fault while in kernel mode cpuid = 0; apic id = 00 fault virtual address = 0x40 fault code = supervisor read data, page not present instruction pointer = 0x20:0xffffffff810d7691 stack pointer = 0x28:0xffffff800001ba60 frame pointer = 0x28:0xffffff800001ba90 code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 1 (kernel) trap number = 12 panic: page fault cpuid = 0 KDB: stack backtrace: #0 0xffffffff8066cb96 at kdb_backtrace+0x66 #1 0xffffffff8063925e at panic+0x1ce #2 0xffffffff809c21d0 at trap_fatal+0x290 #3 0xffffffff809c255e at trap_pfault+0x23e #4 0xffffffff809c2a2e at trap+0x3ce #5 0xffffffff809a9624 at calltrap+0x8 #6 0xffffffff810df517 at vdev_mirror_child_select+0x67 #7 0xffffffff810dfacc at vdev_mirror_io_start+0x24c #8 0xffffffff810f7c52 at zio_vdev_io_start+0x232 #9 0xffffffff810f76f3 at zio_execute+0xc3 #10 0xffffffff810f77ad at zio_wait+0x2d #11 0xffffffff8108991e at arc_read+0x6ce #12 0xffffffff8109d9d4 at dmu_objset_open_impl+0xd4 #13 0xffffffff810b4014 at dsl_pool_init+0x34 #14 0xffffffff810c7eea at spa_load+0x6aa #15 0xffffffff810c90b2 at spa_load_best+0x52 #16 0xffffffff810cb0ca at 
spa_open_common+0x14a #17 0xffffffff810a892d at dsl_dir_open_spa+0x2cd Uptime: 3s Cannot dump. Device not defined or unavailable. I've booted back into the 8.2 kernel without any problems, but I'm wondering if anyone can suggest what I should try to get this working? I used freebsd-update to upgrade, and this was after the first "freebsd-update install" where it installs the kernel. My /boot/loader.conf has: zfs_load="YES" vfs.root.mountfrom="zfs:zroot" Should I be going from 8.2 -> 8.3 -> 8.4? Patrick From owner-freebsd-hackers@FreeBSD.ORG Thu Aug 29 21:33:53 2013 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id D9205A54 for ; Thu, 29 Aug 2013 21:33:53 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 294042B32 for ; Thu, 29 Aug 2013 21:33:52 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id AAA03030; Fri, 30 Aug 2013 00:33:49 +0300 (EEST) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1VF9qf-0003h2-JN; Fri, 30 Aug 2013 00:33:49 +0300 Message-ID: <521FBE05.6020007@FreeBSD.org> Date: Fri, 30 Aug 2013 00:32:53 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130810 Thunderbird/17.0.8 MIME-Version: 1.0 To: Patrick Subject: Re: Fatal trap 12 going from 8.2 to 8.4 with ZFS References: In-Reply-To: X-Enigmail-Version: 1.5.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-hackers@FreeBSD.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical 
Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 21:33:53 -0000 on 29/08/2013 19:37 Patrick said the following: > I've got a system running on a VPS that I'm trying to upgrade from 8.2 > to 8.4. It has a ZFS root. After booting the new kernel, I get: > > Fatal trap 12: page fault while in kernel mode > cpuid = 0; apic id = 00 > fault virtual address = 0x40 > fault code = supervisor read data, page not present > instruction pointer = 0x20:0xffffffff810d7691 > stack pointer = 0x28:0xffffff800001ba60 > frame pointer = 0x28:0xffffff800001ba90 > code segment = base 0x0, limit 0xfffff, type 0x1b > = DPL 0, pres 1, long 1, def32 0, gran 1 > processor eflags = interrupt enabled, resume, IOPL = 0 > current process = 1 (kernel) > trap number = 12 > panic: page fault > cpuid = 0 > KDB: stack backtrace: > #0 0xffffffff8066cb96 at kdb_backtrace+0x66 > #1 0xffffffff8063925e at panic+0x1ce > #2 0xffffffff809c21d0 at trap_fatal+0x290 > #3 0xffffffff809c255e at trap_pfault+0x23e > #4 0xffffffff809c2a2e at trap+0x3ce > #5 0xffffffff809a9624 at calltrap+0x8 > #6 0xffffffff810df517 at vdev_mirror_child_select+0x67 If possible, please run 'kgdb /path/to/8.4/kernel' and then in kgdb do 'list *vdev_mirror_child_select+0x67' > #7 0xffffffff810dfacc at vdev_mirror_io_start+0x24c > #8 0xffffffff810f7c52 at zio_vdev_io_start+0x232 > #9 0xffffffff810f76f3 at zio_execute+0xc3 > #10 0xffffffff810f77ad at zio_wait+0x2d > #11 0xffffffff8108991e at arc_read+0x6ce > #12 0xffffffff8109d9d4 at dmu_objset_open_impl+0xd4 > #13 0xffffffff810b4014 at dsl_pool_init+0x34 > #14 0xffffffff810c7eea at spa_load+0x6aa > #15 0xffffffff810c90b2 at spa_load_best+0x52 > #16 0xffffffff810cb0ca at spa_open_common+0x14a > #17 0xffffffff810a892d at dsl_dir_open_spa+0x2cd > Uptime: 3s > Cannot dump. Device not defined or unavailable. 
> > I've booted back into the 8.2 kernel without any problems, but I'm > wondering if anyone can suggest what I should try to get this working? > I used freebsd-update to upgrade, and this was after the first > "freebsd-update install" where it installs the kernel. -- Andriy Gapon From owner-freebsd-hackers@FreeBSD.ORG Fri Aug 30 08:17:19 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 12F8281F for ; Fri, 30 Aug 2013 08:17:19 +0000 (UTC) (envelope-from gibblertron@gmail.com) Received: from mail-oa0-x22a.google.com (mail-oa0-x22a.google.com [IPv6:2607:f8b0:4003:c02::22a]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id D5B262082 for ; Fri, 30 Aug 2013 08:17:18 +0000 (UTC) Received: by mail-oa0-f42.google.com with SMTP id j10so1502113oah.1 for ; Fri, 30 Aug 2013 01:17:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=7X2N0VeTmec4lqlxHpM+plbpS9NE1ivYBL313Q9mqZA=; b=MdfkgkpRuqSmxATRoy2xo9YiE5nVqZR7PKC0WnS4XI98Y4Mtu7EV42xRAvVF8VzUrK NyIcat79hpeCM9aRGpGHVde0nicaXgUK+IlS4WK6yfYzos/3PhdjYfwxnipXXDZIXKv1 T46zqLxg1HKuxeRNg8jMphrc0DwIV/fdcovoS31QimlHJtUBZ/ZYHJ+4jzpWNgq/uTVm RBjMXl2qrwq4dCPZOR4P2RZFeuUncBRGr68JjPEf7BV0P1qt6hGY964ErE7SNJ5nHH7l l9bpHmAuM4muRolZf+QoEUZ7/Hp9nq69rzltjMN5kWhV4iC5HCibMmEgxdDl3oP/kwnG xHnQ== MIME-Version: 1.0 X-Received: by 10.182.158.42 with SMTP id wr10mr1126282obb.92.1377850638165; Fri, 30 Aug 2013 01:17:18 -0700 (PDT) Received: by 10.182.45.228 with HTTP; Fri, 30 Aug 2013 01:17:18 -0700 (PDT) In-Reply-To: <521FBE05.6020007@FreeBSD.org> References: <521FBE05.6020007@FreeBSD.org> Date: Fri, 30 Aug 2013 
01:17:18 -0700 Message-ID: Subject: Re: Fatal trap 12 going from 8.2 to 8.4 with ZFS From: Patrick To: freebsd-hackers@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Aug 2013 08:17:19 -0000 On Thu, Aug 29, 2013 at 2:32 PM, Andriy Gapon wrote: > on 29/08/2013 19:37 Patrick said the following: >> I've got a system running on a VPS that I'm trying to upgrade from 8.2 >> to 8.4. It has a ZFS root. After booting the new kernel, I get: >> >> Fatal trap 12: page fault while in kernel mode >> cpuid = 0; apic id = 00 >> fault virtual address = 0x40 >> fault code = supervisor read data, page not present >> instruction pointer = 0x20:0xffffffff810d7691 >> stack pointer = 0x28:0xffffff800001ba60 >> frame pointer = 0x28:0xffffff800001ba90 >> code segment = base 0x0, limit 0xfffff, type 0x1b >> = DPL 0, pres 1, long 1, def32 0, gran 1 >> processor eflags = interrupt enabled, resume, IOPL = 0 >> current process = 1 (kernel) >> trap number = 12 >> panic: page fault >> cpuid = 0 >> KDB: stack backtrace: >> #0 0xffffffff8066cb96 at kdb_backtrace+0x66 >> #1 0xffffffff8063925e at panic+0x1ce >> #2 0xffffffff809c21d0 at trap_fatal+0x290 >> #3 0xffffffff809c255e at trap_pfault+0x23e >> #4 0xffffffff809c2a2e at trap+0x3ce >> #5 0xffffffff809a9624 at calltrap+0x8 >> #6 0xffffffff810df517 at vdev_mirror_child_select+0x67 > > If possible, please run 'kgdb /path/to/8.4/kernel' and then in kgdb do 'list > *vdev_mirror_child_select+0x67' Hmmmm... (kgdb) list *vdev_mirror_child_select+0x67 No symbol table is loaded. Use the "file" command. Do I need to build the kernel from source myself? This kernel is what freebsd-update installed during part 1 of the upgrade. 
Patrick
From owner-freebsd-hackers@FreeBSD.ORG Fri Aug 30 08:31:42 2013 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 0AC716D for ; Fri, 30 Aug 2013 08:31:42 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 55A9521A6 for ; Fri, 30 Aug 2013 08:31:40 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id LAA09686; Fri, 30 Aug 2013 11:31:31 +0300 (EEST) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1VFK79-0007RI-2H; Fri, 30 Aug 2013 11:31:31 +0300 Message-ID: <52205811.1020109@FreeBSD.org> Date: Fri, 30 Aug 2013 11:30:09 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130810 Thunderbird/17.0.8 MIME-Version: 1.0 To: Patrick Subject: Re: Fatal trap 12 going from 8.2 to 8.4 with ZFS References: <521FBE05.6020007@FreeBSD.org> In-Reply-To: X-Enigmail-Version: 1.5.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-hackers@FreeBSD.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Aug 2013 08:31:42 -0000

on 30/08/2013 11:17 Patrick said the following:
> Hmmmm...
>
> (kgdb) list *vdev_mirror_child_select+0x67
> No symbol table is loaded. Use the "file" command.
>
> Do I need to build the kernel from source myself? This kernel is what
> freebsd-update installed during part 1 of the upgrade.

I don't have an exact recollection of what is installed by freebsd-update - are *.symbols files installed? -- Andriy Gapon From owner-freebsd-hackers@FreeBSD.ORG Fri Aug 30 23:37:42 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 221C59DF; Fri, 30 Aug 2013 23:37:42 +0000 (UTC) (envelope-from gibblertron@gmail.com) Received: from mail-oa0-x231.google.com (mail-oa0-x231.google.com [IPv6:2607:f8b0:4003:c02::231]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id D73A32753; Fri, 30 Aug 2013 23:37:41 +0000 (UTC) Received: by mail-oa0-f49.google.com with SMTP id i7so3036738oag.36 for ; Fri, 30 Aug 2013 16:37:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=18RkNI11tbDMAFUEwSMN0G66yCbPT59Xh+OvcKeOcU4=; b=iuIDZqk6UxgVEPKYq+xthrQy+/DoKkQcf3VeXjczI0TFbNALxxhDnMH8ZOumYkZ8Ui Oi4wdmfa/QbMpF8dT+uVt3L3KhNAoRuAgGWo1vkmCLi117iZFAws/GDyCpJQU7mPEHho TXuCUIgUUneCp9NbtoMdRP/gfMVhvkKGS/EMJA4bq7ouGlacHVTjxRiX3QxhOCrf215W x0Kaxig3SYdNCaZI9uHf9WsHe3BNXls4RkRrV8MFp11RoXwqdNeG9xU2CCf27Jqfs/Cv FwKC4rGVuftPGn/Yv7buL2Z4UQhI9AYtZgyJr9hxOGBFMfoTNU7wfUY1rH3hfcCw9Ybq AcPw== MIME-Version: 1.0 X-Received: by 10.182.110.202 with SMTP id ic10mr8689087obb.73.1377905861177; Fri, 30 Aug 2013 16:37:41 -0700 (PDT) Received: by 10.182.45.228 with HTTP; Fri, 30 Aug 2013 16:37:41 -0700 (PDT) In-Reply-To: <52205811.1020109@FreeBSD.org> References: <521FBE05.6020007@FreeBSD.org> <52205811.1020109@FreeBSD.org> Date: Fri, 30 Aug 2013 16:37:41 -0700 Message-ID: Subject: Re: Fatal trap 12 going from 8.2 to 8.4 with ZFS From: Patrick To: Andriy Gapon Content-Type: text/plain; 
charset=ISO-8859-1 Cc: freebsd-hackers@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Aug 2013 23:37:42 -0000 On Fri, Aug 30, 2013 at 1:30 AM, Andriy Gapon wrote: > > I don't have an exact recollection of what is installed by freebsd-update - are > *.symbols files installed? Doesn't look like it. I wonder if I can grab that from a distro site or somewhere?
From owner-freebsd-hackers@FreeBSD.ORG Sat Aug 31 13:27:04 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 626AFCC3; Sat, 31 Aug 2013 13:27:04 +0000 (UTC) (envelope-from marck@rinet.ru) Received: from woozle.rinet.ru (woozle.rinet.ru [195.54.192.68]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id E0180218F; Sat, 31 Aug 2013 13:27:03 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by woozle.rinet.ru (8.14.5/8.14.5) with ESMTP id r7VDQtmc059811; Sat, 31 Aug 2013 17:26:55 +0400 (MSK) (envelope-from marck@rinet.ru) Date: Sat, 31 Aug 2013 17:26:55 +0400 (MSK) From: Dmitry Morozovsky To: Patrick Subject: Re: Fatal trap 12 going from 8.2 to 8.4 with ZFS In-Reply-To: Message-ID: References: <521FBE05.6020007@FreeBSD.org> <52205811.1020109@FreeBSD.org> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) X-NCC-RegID: ru.rinet X-OpenPGP-Key-ID: 6B691B03 MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3 (woozle.rinet.ru [0.0.0.0]); Sat, 31 Aug 2013 17:26:55 +0400 (MSK) Cc: freebsd-hackers@freebsd.org, Andriy Gapon X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 31 Aug 2013 13:27:04 -0000 On
Fri, 30 Aug 2013, Patrick wrote:

> On Fri, Aug 30, 2013 at 1:30 AM, Andriy Gapon wrote:
> >
> > I don't have an exact recollection of what is installed by freebsd-update - are
> > *.symbols files installed?
>
> Doesn't look like it. I wonder if I can grab that from a distro site
> or somewhere?

it seems so:

marck@woozle:/pub/FreeBSD/releases/amd64/8.4-RELEASE/kernels> grep -c symbol generic.mtree
636

So, get kernels subdir from the release and extract symbols from them:

cat generic.?? | tar tvjf - \*.symbols

--
Sincerely,
D.Marck [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer: marck@FreeBSD.org ]
------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru ***
------------------------------------------------------------------------
From owner-freebsd-hackers@FreeBSD.ORG Sat Aug 31 14:09:59 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id C53B1A4A; Sat, 31 Aug 2013 14:09:59 +0000 (UTC) (envelope-from marck@rinet.ru) Received: from woozle.rinet.ru (woozle.rinet.ru [195.54.192.68]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 4C0E023E1; Sat, 31 Aug 2013 14:09:58 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by woozle.rinet.ru (8.14.5/8.14.5) with ESMTP id r7VE9vHH060245; Sat, 31 Aug 2013 18:09:57 +0400 (MSK) (envelope-from marck@rinet.ru) Date: Sat, 31 Aug 2013 18:09:57 +0400 (MSK) From: Dmitry Morozovsky To: Patrick Subject: Re: Fatal trap 12 going from 8.2 to 8.4 with ZFS In-Reply-To: Message-ID: References: <521FBE05.6020007@FreeBSD.org> <52205811.1020109@FreeBSD.org> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) X-NCC-RegID: ru.rinet X-OpenPGP-Key-ID: 6B691B03 MIME-Version:
1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3 (woozle.rinet.ru [0.0.0.0]); Sat, 31 Aug 2013 18:09:57 +0400 (MSK) Cc: freebsd-hackers@freebsd.org, Andriy Gapon X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 31 Aug 2013 14:09:59 -0000

On Sat, 31 Aug 2013, Dmitry Morozovsky wrote:

> > > I don't have an exact recollection of what is installed by freebsd-update - are
> > > *.symbols files installed?
> >
> > Doesn't look like it. I wonder if I can grab that from a distro site
> > or somewhere?
>
> it seems so:
>
> marck@woozle:/pub/FreeBSD/releases/amd64/8.4-RELEASE/kernels> grep -c symbol generic.mtree
> 636
>
> So, get kernels subdir from the release and extract symbols from them:
>
> cat generic.?? | tar tvjf - \*.symbols

ah, ``tar xvjf'' of course -- I did test-run

--
Sincerely,
D.Marck [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer: marck@FreeBSD.org ]
------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru ***
------------------------------------------------------------------------
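
The extraction step discussed in this thread can be sketched as a small shell function. This is only an illustration, not what anyone in the thread ran: the function name extract_symbols and both directory arguments are hypothetical, and it assumes the split pieces (generic.aa, generic.ab, ...) from the release's kernels directory have already been downloaded. It uses two passes (list, then extract by name) instead of the one-line pattern form, because the pattern form relies on FreeBSD's tar matching member names by default, while GNU tar would need --wildcards for the same thing.

```shell
#!/bin/sh
# Sketch only: extract_symbols and its arguments are hypothetical names.
extract_symbols() {
    src_dir=$1     # directory holding the split pieces generic.aa, generic.ab, ...
    dest_dir=$2    # directory the *.symbols files are unpacked into
    list=$(mktemp) || return 1
    mkdir -p "$dest_dir" || { rm -f "$list"; return 1; }

    # Pass 1: list the archive members and keep only names ending in .symbols.
    cat "$src_dir"/generic.?? | tar -tjf - | grep '\.symbols$' > "$list"
    if [ ! -s "$list" ]; then
        # No symbol files found in the archive; nothing to extract.
        rm -f "$list"
        return 1
    fi

    # Pass 2: extract exactly the listed members into dest_dir.
    cat "$src_dir"/generic.?? | tar -xjf - -C "$dest_dir" -T "$list"
    status=$?
    rm -f "$list"
    return "$status"
}
```

Once kernel.symbols has been extracted, the kgdb step suggested earlier in the thread ('kgdb /path/to/8.4/kernel', then 'list *vdev_mirror_child_select+0x67') can be retried. Placing kernel.symbols next to the matching kernel file should let kgdb pick it up, though that is an assumption about the setup rather than something confirmed in this thread.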