From owner-freebsd-pf@freebsd.org  Sun Mar  5 12:43:11 2017
From: "Kristof Provost" <kp@FreeBSD.org>
To: Ross <basarevych@gmail.com>
Cc: freebsd-pf@freebsd.org
Subject: Re: sonewconn: pru_attach() failed and kernel panic in PF
Date: Sun, 05 Mar 2017 21:42:59 +0900
Message-ID: <D0CD7B4C-2C21-4ABE-9F1B-41E5414A9A8A@FreeBSD.org>
In-Reply-To: <CANmv3=xB0Kce4ZQ4GYBE0cNpam0jzGPX7dSYSVBPiT-sryCyHA@mail.gmail.com>
References: <CANmv3=xB0Kce4ZQ4GYBE0cNpam0jzGPX7dSYSVBPiT-sryCyHA@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
X-Mailer: MailMate (2.0BETAr6080)


On 27 Feb 2017, at 21:08, Ross wrote:

> Hello
>
> One of my machines panics almost every day. It is always like this: first
> there is a number of messages about "sonewconn: pcb 0xfffff80085478740:
> pru_attach() failed" at the same time and then panic. Here's an example:
>
> ... many lines of sonewconn ...
> Feb 27 13:41:43 core kernel: sonewconn: pcb 0xfffff8008575bcb0:
> pru_attach() failed
> Feb 27 13:41:43 core kernel:

I wonder if you’re running low on memory by any chance.

I think I know why you’re crashing, but I suspect your root problem is that
you’re running low on memory, and that’s why you’re seeing the pru_attach()
failures and eventually running into the pf panic.

> Feb 27 13:41:43 core kernel: KDB: stack backtrace:
> Feb 27 13:41:43 core kernel: #0 0xffffffff80b312c7 at kdb_backtrace+0x67
> Feb 27 13:41:43 core kernel: #1 0xffffffff80ae5c92 at vpanic+0x182
> Feb 27 13:41:43 core kernel: #2 0xffffffff80ae5b03 at panic+0x43
> Feb 27 13:41:43 core kernel: #3 0xffffffff80fd6d51 at trap_fatal+0x351
> Feb 27 13:41:43 core kernel: #4 0xffffffff80fd6f43 at trap_pfault+0x1e3
> Feb 27 13:41:43 core kernel: #5 0xffffffff80fd64ec at trap+0x26c
> Feb 27 13:41:43 core kernel: #6 0xffffffff80fb9d61 at calltrap+0x8
> Feb 27 13:41:43 core kernel: #7 0xffffffff80e4185e at uma_zfree_arg+0x4fe
> Feb 27 13:41:43 core kernel: #8 0xffffffff82442165 at pf_get_translation+0x2c5

There are only three calls to uma_zfree() in pf_get_translation().

These are:
  * uma_zfree(V_pf_state_key_z, skp);
  * uma_zfree(V_pf_state_key_z, *nkp);
  * uma_zfree(V_pf_state_key_z, *skp);

Going by the inconsistent pointer use, the first one is rather suspect.
Looking a bit deeper, pf_get_translation() is only called from one place,
and it always passes the addresses of stack variables for skp and nkp, so
the first call ends up handing uma_zfree() a pointer into the caller’s stack
rather than the allocated state key, which won’t work too well.

That’s a bug (and I’ll fix it), but you’re only running into it because
pf_state_key_clone() returned NULL, which will only happen under memory
pressure.
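
To illustrate the pattern (this is a userland sketch, not the actual pf
code): the struct, key_clone() and get_translation() below are hypothetical
stand-ins, with malloc()/free() in place of the UMA zone allocator and a flag
to simulate the clone failing under memory pressure. The point is only that
the error path has to free *skp, not skp.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a pf state key; not the real struct layout. */
struct state_key {
	int af;
	int port;
};

/*
 * Stand-in for pf_state_key_clone(): returns NULL when the allocation
 * fails. simulate_failure fakes an out-of-memory condition.
 */
static struct state_key *
key_clone(const struct state_key *orig, int simulate_failure)
{
	struct state_key *copy;

	if (simulate_failure)
		return (NULL);
	copy = malloc(sizeof(*copy));
	if (copy != NULL)
		memcpy(copy, orig, sizeof(*copy));
	return (copy);
}

/*
 * skp and nkp are the addresses of the caller's (stack) pointer variables.
 * Returns 0 on success, -1 on failure. On failure nothing stays allocated
 * and both *skp and *nkp are NULL.
 */
static int
get_translation(struct state_key **skp, struct state_key **nkp,
    int fail_clone)
{
	struct state_key tmpl = { 2, 80 };

	*nkp = NULL;
	*skp = key_clone(&tmpl, 0);
	if (*skp == NULL)
		return (-1);

	*nkp = key_clone(*skp, fail_clone);
	if (*nkp == NULL) {
		/*
		 * The suspect form would be free(skp): that passes the
		 * address of the caller's stack variable to the allocator.
		 * The allocation itself lives at *skp.
		 */
		free(*skp);
		*skp = NULL;
		return (-1);
	}
	return (0);
}
```

A caller would declare two struct state_key pointers, pass their addresses,
and free both keys itself once a successful call’s results are no longer
needed.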

> What should I do to fix it?
>
You’ll need to look at your system and figure out who’s running away with
all of your memory.

Regards,
Kristof