Date:      Mon, 13 Dec 2004 18:53:28 +0100
From:      "J. Martin Petersen" <techlists@motrix.dk>
To:        <freebsd-stable@freebsd.org>
Subject:    RE: netstat fails with memory allocation error and error in kvm_read
Message-ID:  <20041213175329.2930BC2C6@brugere.aub.dk>
In-Reply-To: <20041212215318.S83257@carver.gumbysoft.com>


> > We are trying to gather some debug information for a problem that is 
> > difficult to diagnose, as the machine always ends up hard frozen 
> > (does not do anything, can not break to debugger, does not respond 
> > to keyboard, etc.), so we are dumping output from netstat, vmstat, 
> > iostat etc. quite often.
> >
> > The cron jobs fail ever so often with error messages I do not quite 
> > understand, and I can not find anything relevant in the archives. 
> > Can anyone shed some light on this?
> >
> > Command:	netstat -r
> > Error message: netstat: kvm_read: Bad address
> > Debug before: http://www.aub.dk/~jmp/fw/debug-2004.12.09-21.34.41.gz
> > Debug after: http://www.aub.dk/~jmp/fw/debug-2004.12.09-21.35.06.gz
> > # errors:	7
> >
> > Command:	netstat -an
> > Error message: netstat: sysctl: net.inet.udp.pcblist: Cannot allocate memory
> > Debug before: http://www.aub.dk/~jmp/fw/debug-2004.12.09-07.38.48.gz
> > Debug after: http://www.aub.dk/~jmp/fw/debug-2004.12.09-07.39.04.gz
> > # errors:	3
> 
> You appear to be running out of kernel memory. Since you're capturing 
> the output of vmstat -m, you should check that for any bins that are 
> growing at a high rate of speed.
> 
> Seems possible that its in pf :)

I've compared the numbers from just before the freeze (within 15 seconds
of it) with two other sets of data: one from a fresh boot and one from five
minutes before the freeze.

Here is what changed significantly between the fresh boot and just before
the freeze:

Just before the freeze
(http://www.aub.dk/~jmp/fw/tmp/debug-2004.12.11-22.59.01.gz):
   AR driver     2     1K    268K  2922822  64,256,512,2048
       kqueue     0     0K     38K 13304405  128,1024
  UFS dirhash   444    88K    107K     2559  16,32,64,128,256,512,1024,2048,4096
     freeblks    13     4K     29K   103030  256
     freefrag     0     0K      1K   164217  32
   allocindir   287    18K    162K  1966413  64
     indirdep     1     1K    209K     9925  32
  allocdirect    27     4K     18K   301048  128
     inodedep    18   131K    150K   164032  128,256
     routetbl   566    47K     67K   800649  16,32,64,128,256
      subproc    99   301K    849K  1873146  32,4096

Five minutes before the freeze
(http://www.aub.dk/~jmp/fw/tmp/debug-2004.12.11-22.55.42.gz):
     AR driver     2     1K    268K  2921793  64,256,512,2048
       kqueue     0     0K     38K 13296556  128,1024
  UFS dirhash   444    88K    107K     2559  16,32,64,128,256,512,1024,2048,4096
     freeblks     1     1K     29K   102978  256
     freefrag     0     0K      1K   164153  32
   allocindir     0     0K    162K  1965284  64
     indirdep     0     0K    209K     9921  32
  allocdirect     1     1K     18K   300886  128
     inodedep    14   130K    150K   163954  128,256
     routetbl   562    46K     67K   800255  16,32,64,128,256
      subproc    99   301K    849K  1872250  32,4096

From a fresh boot
(http://www.aub.dk/~jmp/fw/tmp/debug-2004.12.11-23.31.31.gz):
    AR driver     2     1K    190K    23450  64,256,512,2048
       kqueue     0     0K      3K     1062  128,1024
  UFS dirhash    36    13K     13K       42  16,32,512,2048,4096
     freeblks   115    29K     29K      253  256
     freefrag     0     0K      1K       51  32
   allocindir     2     1K    135K     3332  64
     indirdep    10     1K    173K      630  32
  allocdirect     2     1K     40K      456  128
     inodedep   137   145K    168K      506  128,256
     routetbl   306    26K     27K      495  16,32,64,128,256
      subproc   107   317K    466K     1554  32,4096
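
For reference, the change between the two pre-freeze listings can be
computed directly. This is a minimal Python sketch, assuming the columns are
Type, InUse, MemUse, HighUse, Requests and Size(s) (the vmstat -m header is
not shown in this message). The dictionary names and the deltas() helper are
made up for illustration; the numbers are copied from the listings above.

```python
# (InUse, MemUse in KB, Requests) per malloc type, copied from the two
# pre-freeze listings. Column meanings are an assumption (header not shown).
FIVE_MIN_BEFORE = {
    "AR driver": (2, 1, 2921793),
    "kqueue": (0, 0, 13296556),
    "UFS dirhash": (444, 88, 2559),
    "freeblks": (1, 1, 102978),
    "freefrag": (0, 0, 164153),
    "allocindir": (0, 0, 1965284),
    "indirdep": (0, 0, 9921),
    "allocdirect": (1, 1, 300886),
    "inodedep": (14, 130, 163954),
    "routetbl": (562, 46, 800255),
    "subproc": (99, 301, 1872250),
}
JUST_BEFORE = {
    "AR driver": (2, 1, 2922822),
    "kqueue": (0, 0, 13304405),
    "UFS dirhash": (444, 88, 2559),
    "freeblks": (13, 4, 103030),
    "freefrag": (0, 0, 164217),
    "allocindir": (287, 18, 1966413),
    "indirdep": (1, 1, 9925),
    "allocdirect": (27, 4, 301048),
    "inodedep": (18, 131, 164032),
    "routetbl": (566, 47, 800649),
    "subproc": (99, 301, 1873146),
}


def deltas(earlier, later):
    """Return {type: (d_inuse, d_memuse_kb, d_requests)} for each type in `later`."""
    out = {}
    for name, (inuse1, mem1, req1) in later.items():
        inuse0, mem0, req0 = earlier[name]
        out[name] = (inuse1 - inuse0, mem1 - mem0, req1 - req0)
    return out


if __name__ == "__main__":
    for name, (d_inuse, d_mem, d_req) in deltas(FIVE_MIN_BEFORE, JUST_BEFORE).items():
        print(f"{name:>12}  InUse {d_inuse:+5d}  MemUse {d_mem:+4d}K  Requests {d_req:+8d}")
```

Under that column assumption, the types whose InUse went up over this
interval are freeblks, allocindir, indirdep, allocdirect, inodedep and
routetbl. Requests is a running total, so it rises for every type except
UFS dirhash and says little on its own.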

The numbers for pflog and pf_if do not change at all. I checked vmstat -z,
and the highest pf-related entries were actually decreasing at the time of
the deadlock, but I noticed the following:
VM OBJECT:       132,        0,  31508,   2132, 14364021
128 Bucket:      524,        0,    727,      1,        0
64 Bucket:       268,        0,     23,     19,        0
32 Bucket:       140,        0,     34,     22,        0
16 Bucket:        76,        0,     15,     35,        0

Can you or anyone else deduce anything from the numbers? If not, I'll whip
something together that runs vmstat -m every so often, parses the output and
removes the non-increasing entries so it'll be easier to spot the trends.

Thanks, Martin


