In-Reply-To: <511CECCC.60400@grosbein.pp.ru>
References: <511CECCC.60400@grosbein.pp.ru>
Date: Sat, 16 Feb 2013 12:25:25 -0600
Message-ID: <CAJUyCcOHFgGqzvm43y-dojvsXdNydYmfd9rwHMkdjqDyvjJuKQ@mail.gmail.com>
Subject: Re: i386: vm.pmap kernel local race condition
From: Alan Cox <alan.l.cox@gmail.com>
To: Eugene Grosbein <eugen@grosbein.pp.ru>
Content-Type: text/plain; charset=ISO-8859-1
Cc: stable@freebsd.org
Reply-To: alc@freebsd.org
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>

On Thu, Feb 14, 2013 at 7:55 AM, Eugene Grosbein <eugen@grosbein.pp.ru> wrote:

> Hi!
>
> I've got FreeBSD 8.3-STABLE/i386 server that can be reliably panicked
> using just 'squid -k rotatelog' command. It seems the system suffers
> from the problem described here:
>
> http://cxsecurity.com/issue/WLB-2010090156
>
> I could not find any FreeBSD Security Advisory containing a fix.
>
> My server has 4G physical RAM (about 3.2G available) and runs
> squid (about 110M VSS) with 500 ntlm_auth subprocesses.
> Lesser number of ntlm_auth sometimes results in squid crash
> as it sometimes has several hundreds requests per second to authorize
> and is intolerant to exhaustion of free ntlm_auth.
>
> "squid -k rotatelog" at midnight results in crash:
>
> Feb 14 00:03:00 irl savecore: reboot after panic: get_pv_entry: increase
> vm.pmap.shpgperproc
> Feb 14 00:03:00 irl savecore: writing core to vmcore.1
>
> Btw, I have coredump.
>
> vm.pmap.shpgperproc has default value (200) here, as well as vm.v_free_min,
> vm.v_free_reserved, and vm.v_free_target and KVA_PAGES.
>
> These crashes are pretty regular
>
> # last|fgrep reboot
> reboot           ~                         Thu Feb 14 00:03
> reboot           ~                         Wed Feb 13 19:08
> reboot           ~                         Wed Feb 13 10:40
> reboot           ~                         Wed Feb 13 00:04
> reboot           ~                         Tue Feb 12 00:09
> reboot           ~                         Mon Feb 11 00:03
> reboot           ~                         Sun Feb 10 00:03
> reboot           ~                         Thu Feb  7 00:03
> reboot           ~                         Wed Feb  6 10:52
> reboot           ~                         Sun Feb  3 00:03
> reboot           ~                         Sat Feb  2 00:03
>
> May this be considered as security problem?
> Can it be fixed without switch to amd64?
> I have only remote access to this production server, no serial console.
>
>
Regardless of what that web site says, this is not really a race
condition.  Instead, you're exhausting a resource in the kernel because of
the characteristics of your workload.  The kernel tries to handle this
gracefully, but in extreme cases, the kernel can't keep up with the
demand.  Have you simply tried doing what the panic message suggests, i.e.,
increasing vm.pmap.shpgperproc?  Alternatively, you can increase
vm.pmap.pv_entry_max to accomplish the same thing more directly.
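
To illustrate why the two knobs amount to the same thing, here is a rough
sketch of the arithmetic.  It assumes the i386 pmap derives the limit at
boot as shpgperproc * maxproc + (number of physical pages) and starts
reclaiming entries at about 90% of that limit; treat both formulas as
assumptions to check against pmap.c for your branch.  The function names
and the MAXPROC and PAGES values are hypothetical and are not taken from
the reporter's machine.

```python
def pv_entry_max(shpgperproc: int, maxproc: int, page_count: int) -> int:
    """Approximate pv entry limit (assumed formula, rounding ignored)."""
    return shpgperproc * maxproc + page_count


def high_water(limit: int) -> int:
    """Assumed threshold (about 90% of the limit) where reclamation starts."""
    return 9 * (limit // 10)


# Hypothetical inputs for illustration only.
MAXPROC = 6000
PAGES = 800_000  # roughly 3.2G of 4K pages

default_limit = pv_entry_max(200, MAXPROC, PAGES)  # default shpgperproc
raised_limit = pv_entry_max(400, MAXPROC, PAGES)   # doubled shpgperproc

print("limit with shpgperproc=200:", default_limit)
print("limit with shpgperproc=400:", raised_limit)
print("high-water mark after the increase:", high_water(raised_limit))
```

Under these assumptions, raising vm.pmap.shpgperproc only matters through
its effect on the computed limit, which is why setting
vm.pmap.pv_entry_max is the more direct route.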

That said, if possible, you should do as Adrian suggests and change your
Squid configuration to not use 500 helper processes.  That will allow a lot
more of your machine's physical memory to go to caching data rather than
bookkeeping data structures in the kernel.

Regards,
Alan