From owner-freebsd-performance@FreeBSD.ORG Thu Apr  3 17:46:54 2008
Date: Thu, 03 Apr 2008 10:46:53 -0700
From: JINMEI Tatuya / 神明達哉 <Jinmei_Tatuya@isc.org>
To: Attila Nagy
Cc: freebsd-performance@freebsd.org, bind-users@isc.org
Subject: Re: max-cache-size doesn't work with 9.5.0b1

At Thu, 03 Apr 2008 17:22:38 +0200,
Attila Nagy wrote:

> Sorry again for the long delay, I've got other work to do, and our 9.4
> servers work fine (at least on FreeBSD 6, though; see the other
> -performance- problem)...

No problem; I understand that testing a beta version cannot be
high-priority work.

> > BTW, is this reproducible on FreeBSD 6.x?  If so, then I'd like to
> > see what happens if you specify some small value of datasize
> > (e.g. 512MB) and have named abort when malloc() fails, via the "X"
> > _malloc_options.  (This option doesn't seem to work for FreeBSD 7.x,
> > at least at the moment.)
> >
> Yes, it's the same, even when a different threading library
> (libpthread, KSE) is in use.
> I've recompiled named with the following in main():
> ./work/bind-9.5.0b2/bin/named/main.c: _malloc_options="X";
>
> And set max-cache-size to 32MB.
>
> At:
> 21664 bind 4 20 0 819M 819M kserel 0 5:32 0.00% named.950
> I pressed a CTRL-C:
> mem.c:1114: REQUIRE((((ctx) != ((void *)0)) && (((const isc__magic_t
> *)(ctx))->magic == ((('M') << 24 | ('e') << 16 | ('m') << 8 | ('C'))))))
> failed.

Hmm, this is odd in two respects:

1. The "X" malloc option doesn't seem to work as expected.  I expected
   a call to malloc() to trigger an assertion failure (within the
   malloc library) at a much earlier stage.  Does it change if you try
   the alternative debugging approach I mentioned before?  That is:

   - create a symbolic link from "/etc/malloc.conf" to "X":
     # ln -s X /etc/malloc.conf
   - start named with a moderate limit on virtual memory size, e.g.:
     # /usr/bin/limits -v 384m $path_to_named/named

2. Whether or not it's related to this max-cache-size issue, the
   assertion failure in mem.c wasn't an expected result; this is
   likely to be a bug anyway.  If the process dumped a core, can you
   show the stack backtrace of it?
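   For example (a minimal sketch, assuming FreeBSD's default core
   naming of "named.core" in named's working directory; adjust the
   paths for your installation):

   % gdb $path_to_named/named named.core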
   (gdb) thread apply all bt full

> > Some other questions:
> > - can we see your named.conf?  If you specify non-default
> >   configuration options, that might be the reason for, or related
> >   to, this problem.
> >
> Of course (see at the end).
>
> > - does your named produce a lot of log messages?  If so, it might
> >   also be a reason (simply because it relies on standard libraries).
> >
> grep named ns20080403.log | wc -l
> 1930006
> For today (17 hours and 18 minutes of logs).
> Is this a lot?

This means about 31 log messages per second (17 hours and 18 minutes
is 62,280 seconds, and 1,930,006 / 62,280 is roughly 31).  That may
not be extremely frequent, but if some memory is lost for every log
message, I guess it could be a reason for the memory growing at the
high rate we've seen.  What if you change the channel setting from:

> channel syslog-ng {
>     syslog local5;
>     severity info;
>     print-category yes;
>     print-severity yes;
> };

to this one?

channel syslog-ng {
    null;
    severity info;
    print-category yes;
    print-severity yes;
};

BTW,

> -hmm, I haven't tried changing cleaning-interval; it was needed
> because the default cache housekeeping effectively stopped the NS
> during the cleanup-

This doesn't matter for 9.5: it doesn't perform periodic cache
cleaning, regardless of the value of cleaning-interval.

---
JINMEI, Tatuya
Internet Systems Consortium, Inc.