Date:      Mon, 2 May 2016 23:20:31 -0700
From:      Mark Johnston <markj@FreeBSD.org>
To:        Steve Wills <swills@FreeBSD.org>
Cc:        FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>, freebsd-current@freebsd.org, scottl@FreeBSD.org, Warner Losh <imp@bsdimp.com>
Subject:   Re: wired memory leak at r298785
Message-ID:  <20160503062031.GA2209@raichu>
In-Reply-To: <5727F71E.20101@FreeBSD.org>
References:  <572756DF.1010809@FreeBSD.org> <5727F71E.20101@FreeBSD.org>

On Mon, May 02, 2016 at 08:55:58PM -0400, Steve Wills wrote:
> Hi,
> 
> On 05/ 2/16 09:32 AM, Steve Wills wrote:
> > Hi,
> > 
> > Just did my monthly update and r298785 seems to be leaking wired memory
> > rather rapidly. My system has 8 GB of RAM and the amount of wired memory
> > just goes up and up continuously. It takes about 12 hours before it
> > exhausts all the RAM and sort of locks up (though shutdown still works).
> > 
> > I also made one other change to the system at the same time as updating,
> > which was to add another disk and configure it using ZFS. Perhaps this
> > is a ZFS on PowerPC64 issue? My amd64 box running the same rev of
> > CURRENT doesn't have the issue.
> > 
> 
> I've rebooted the box and started repeatedly logging the output of
> vmstat -m. It seems to show CAM CCB using a lot of memory and growing
> rather rapidly. For example, here's a few lines of diff output:
> 
> - CAM CCB 91418 182836K - 187149 2048
> + CAM CCB 447070 894140K - 900292 2048
> 
> from two samples that are 60 minutes apart.
> 
> The box isn't terribly busy; it's just running the monitoring daemons
> (snmpd, collectd), whatever web requests are hitting it (very few if
> any), this logging process, my shell, etc.

This was causing problems on one of my amd64 systems, so it's not
specific to powerpc64. It turns out to be due to r298004: the CCB
allocated in cam_periph_devctl_notify() never gets freed. The patch
below seems to fix it.
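
For a sense of scale, the two vmstat -m samples quoted above can be turned
into a leak rate. The following is only a rough sketch: it assumes the
trailing 2048 in each line is the allocation size in bytes, it treats 8 GB
as 8 GiB, and the function names are made up for illustration.

```c
#include <stdint.h>

/* Figures copied from the two "vmstat -m" samples quoted above. */
#define CCB_SIZE_BYTES   2048ULL   /* assumed: last column is the size in bytes */
#define INUSE_FIRST      91418ULL  /* in-use CCBs, first sample */
#define INUSE_SECOND     447070ULL /* in-use CCBs, second sample */
#define SAMPLE_GAP_SECS  3600ULL   /* samples were taken 60 minutes apart */

/* Number of CCBs that accumulated between the two samples. */
uint64_t
leaked_ccbs(void)
{
	return (INUSE_SECOND - INUSE_FIRST);
}

/* Leak rate, scaled to bytes per hour. */
uint64_t
leak_bytes_per_hour(void)
{
	return (leaked_ccbs() * CCB_SIZE_BYTES * 3600ULL / SAMPLE_GAP_SECS);
}

/* Whole hours until ram_bytes would be consumed at that rate. */
uint64_t
hours_to_exhaust(uint64_t ram_bytes)
{
	return (ram_bytes / leak_bytes_per_hour());
}
```

Under those assumptions the leak is 355652 CCBs per hour, a little under
700 MB per hour, and hours_to_exhaust(8ULL << 30) comes out at 11. That is
in line with the roughly 12 hours reported before the box ran out of RAM.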

It's possible to trace CCB allocations/frees using dtrace, which makes
many of these sorts of problems trivial to find. Running

# dtrace -n 'dtmalloc::CAM_CCB: {printf("%s", execname); stack();}'

and examining the output showed that hald was frequently allocating CCBs
at cam_periph_error+0x48f, but never freeing them. This corresponds to
the allocation in cam_periph_devctl_notify().
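
The pattern the trace exposed, an allocation on every call with no matching
free, can be modelled in userland. This is a sketch and not the kernel
code: struct fake_ccb, ccb_alloc(), ccb_free(), notify() and
outstanding_ccbs() are all hypothetical stand-ins, and the two counters
play the role of the allocation and free events seen in the trace.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a CCB; not the kernel's union ccb. */
struct fake_ccb {
	struct fake_ccb *next;	/* links every live CCB */
	int status;
};

static struct fake_ccb *live_list;
static unsigned long n_allocs, n_frees;

/* Allocate a CCB, record it on the live list, count the allocation. */
struct fake_ccb *
ccb_alloc(void)
{
	struct fake_ccb *c = calloc(1, sizeof(*c));

	if (c == NULL)
		return (NULL);
	c->next = live_list;
	live_list = c;
	n_allocs++;
	return (c);
}

/* Unlink a CCB from the live list, count the free, release it. */
void
ccb_free(struct fake_ccb *c)
{
	struct fake_ccb **pp;

	for (pp = &live_list; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == c) {
			*pp = c->next;
			n_frees++;
			free(c);
			return;
		}
	}
}

/* Shape of the notify path: a temporary CCB is allocated for a query. */
void
notify(int apply_fix)
{
	struct fake_ccb *cgd = ccb_alloc();

	if (cgd != NULL) {
		/* ... query the device and copy out the serial number ... */
		if (apply_fix)
			ccb_free(cgd);	/* the added free */
		/* Without it, the last reference to cgd is dropped here. */
	}
}

/* Allocations that have not yet been matched by a free. */
unsigned long
outstanding_ccbs(void)
{
	return (n_allocs - n_frees);
}
```

Calling notify(0) in a loop makes outstanding_ccbs() grow by one per call,
which is the imbalance the trace showed; with notify(1) it stays at zero.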

diff --git a/sys/cam/cam_periph.c b/sys/cam/cam_periph.c
index 85b2ff9..1f7be4f 100644
--- a/sys/cam/cam_periph.c
+++ b/sys/cam/cam_periph.c
@@ -1876,6 +1876,7 @@ cam_periph_devctl_notify(union ccb *ccb)
 
 		if (cgd->ccb_h.status == CAM_REQ_CMP)
 			sbuf_bcat(&sb, cgd->serial_num, cgd->serial_num_len);
+		xpt_free_ccb((union ccb *)cgd);
 	}
 	sbuf_printf(&sb, "\" ");
 	sbuf_printf(&sb, "cam_status=\"0x%x\" ", ccb->ccb_h.status);


