Date:      Mon, 06 May 2002 09:54:59 -0700
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Patrick Thomas <root@utility.clubscholarship.com>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: what causes a userland to stop, but allows kernel to continue?
Message-ID:  <3CD6B563.ECF6A475@mindspring.com>
References:  <20020506080159.K86733-100000@utility.clubscholarship.com>

Patrick Thomas wrote:
> > No denied requests.  It's not mbufs.  It must be something else.
> 
> How do you feel about this:

[ ... ]

You have 24M in vnodes, which is surprising for a machine whose
job is supposedly postgres.  You have another 17M in PV ENTRY
values, which is for page mapping.  You have 81M in swap metadata;
12M in VM OBJECTS.

You don't tell us when you took this sample, relative to the crash
time... right after the start?  Right before the crash?

Do you restart postgres?  Does it fork for each client connection?

Also, not all memory is accounted to zones, which is why I suggested
"vmstat -m", *NOT* "vmstat -z".


> anything interesting ?


You claim really small numbers for the shared memory segments,
but then in another message, you say you are running multiple
instances of postgres in jails.  We don't have totals on these
numbers.

You set the physmap tunable that Alfred said would help *unless
you run out of memory* ...and are maybe hitting that wall.

You aren't telling us the output of "ps -gaxl" at the time of
the crash (which is only interesting for the top VSZ/RSS numbers,
the WCHAN's, the STAT, and the commands for the large VSZ/RSS).

This really isn't going to be interesting or useful data until
you can show us trends.  The way to show us trends is to capture
the information at fixed intervals (e.g. with a cron job), so
that it's there from start to lockup.  You should calculate the
lockup interval, and pick an update interval based on that.
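
A rough sketch of that kind of capture, in Python, in case it helps.
Everything here is an assumption on my part rather than a tested
recipe for your box: the function names, the default of roughly 200
samples per lockup cycle, and the log path you pass in are all made
up for illustration.  The only things taken from this thread are the
three commands.  The 60 second floor is there because cron can't
fire more often than once a minute.  Each record is fsync'ed so the
last samples before the lockup have a chance of reaching the disk.

```python
import os
import subprocess
import time

# The commands mentioned in this thread; edit the list to taste.
COMMANDS = [["vmstat", "-m"], ["vmstat", "-z"], ["ps", "-gaxl"]]


def pick_interval(secs_to_lockup, samples=200, floor=60):
    """Spread about `samples` snapshots over the time between boot and
    lockup, but never sample more often than every `floor` seconds."""
    return max(floor, secs_to_lockup // samples)


def snapshot(commands=COMMANDS, now=None):
    """Return one timestamped text record holding each command's output."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    parts = ["=== " + stamp + " ==="]
    for cmd in commands:
        parts.append("--- " + " ".join(cmd) + " ---")
        try:
            out = subprocess.run(cmd, stdout=subprocess.PIPE,
                                 stderr=subprocess.STDOUT,
                                 universal_newlines=True).stdout
        except OSError as err:
            # Command missing or not executable: record that and carry on.
            out = "could not run: %s" % err
        parts.append(out.rstrip("\n"))
    return "\n".join(parts) + "\n"


def append_snapshot(path, commands=COMMANDS):
    """Append one record to `path` and push it to disk right away."""
    with open(path, "a") as log:
        log.write(snapshot(commands))
        log.flush()
        os.fsync(log.fileno())
```

For example, if the machine locks up about every 48 hours,
pick_interval(48 * 3600) gives 864 seconds, so a cron entry that
calls append_snapshot() every 15 minutes would be about right.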

I'm personally not going to look at that amount of data unless
you use gnuplot or Excel or some other tool to graph it, so
that we can see time on one axis and resource consumption on
the other.  So don't post it directly to the list.
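
If you go the gnuplot route, the data file only needs one sample per
line, with whitespace-separated columns.  A minimal sketch in Python
of producing that, assuming you have already reduced your captures to
(epoch seconds, kilobytes) pairs for whichever resource you care
about; the function name and the choice of minutes and megabytes as
units are mine, purely for illustration:

```python
def gnuplot_rows(samples):
    """Turn (epoch_seconds, kbytes) pairs into gnuplot data-file text:
    column 1 is minutes since the first sample, column 2 is megabytes."""
    samples = sorted(samples)
    if not samples:
        return ""
    start = samples[0][0]
    # gnuplot treats lines beginning with '#' in a data file as comments.
    lines = ["# minutes_since_first_sample  megabytes"]
    for when, kbytes in samples:
        lines.append("%.1f %.1f" % ((when - start) / 60.0, kbytes / 1024.0))
    return "\n".join(lines) + "\n"
```

Write the result to a file, say trend.dat, and in gnuplot something
like: plot "trend.dat" using 1:2 with lines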

-- Terry
