Date:      Fri, 13 Mar 1998 12:31:13 +0200 (SAT)
From:      Reinier Bezuidenhout <rbezuide@oskar.nanoteq.co.za>
To:        freebsd-hackers@FreeBSD.ORG
Subject:   2.2.5 PANIC when out of mbufs
Message-ID:  <199803131032.MAA04804@oskar.nanoteq.co.za>

Hi ...

It seems that there is something afoot in uipc_mbuf.c

We have an application program that basically does a relaying
function at the user level.  I have the following test setup:

PII-266/64MB <-100 Mbit-> P166/128MB <-100 Mbit-> PII-266/64MB

connected with crossover Cat 5 cables.

I am using ttcp on the one PII to connect to the "relay" on
the P166 that reconnects me to a small get-and-dump server
on the other PII.

Between the PII's I can start 400 of these sessions.

I try to start 300 sessions through the relay by starting
them one at a time with a 1 sec delay between them, each transferring
about 26M of data.

The MBUF clusters start to increase on the P166 to about

2942 mbufs in use
2703/2714 mbuf clusters in use

( the P166 kernel has 128 max users = (512 + 128 * 16) = 2560 
  mbuf clusters according to param.c )
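
The arithmetic above can be checked with a small sketch. It only
restates the formula quoted from param.c; the function names are
hypothetical, and 2714 is the second figure from the
"2703/2714 mbuf clusters in use" line.

```c
#include <stdio.h>

/* Formula as quoted in this message from param.c: 512 + maxusers * 16. */
static int nmbclusters(int maxusers)
{
    return 512 + maxusers * 16;
}

/* How far a reported cluster count sits above the computed limit. */
static int clusters_over_limit(int reported, int maxusers)
{
    return reported - nmbclusters(maxusers);
}

/* Print both figures for one configuration (hypothetical helper). */
static void report(int reported, int maxusers)
{
    printf("limit=%d over=%d\n", nmbclusters(maxusers),
           clusters_over_limit(reported, maxusers));
}
```

Calling report(2714, 128) shows that the reported count is already
above the 2560 computed from maxusers.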

The kernel panics with the following on screen message


Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x18
fault code              = supervisor write, page not present
instruction pointer     = 0x8:0xf0122d1d
stack pointer           = 0x10:0xefbffe94
frame pointer           = 0x10:0xefbffeb8
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 998 (tcpr)
interrupt mask          = 
panic: page fault



I have reproduced this about 8 times, with the instruction pointer
always being 0xf0122d1d

nm /kernel | sort 


---- cut ------
f0122528 T _solisten
f01225d4 T _sofree
f01226a0 T _soclose
f01227c0 T _soabort
f01227e8 T _soaccept
f0122850 T _soconnect
f01228d8 T _soconnect2
f0122924 T _sodisconnect
f0122990 T _sosend     <------ in here
f012304c T _soreceive
f01239c8 T _soshutdown
f0123a04 T _sorflush
f0123ad0 T _sosetopt
f0123d98 T _sogetopt
----- cut ------
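
The lookup done by eye in the sorted nm output can be sketched as
follows: take the last symbol whose address is not above the
instruction pointer.  The table is copied from the excerpt above; the
struct and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct ksym {
    uint32_t    addr;
    const char *name;
};

/* Excerpt of the sorted "nm /kernel" output quoted in this message. */
static const struct ksym ksyms[] = {
    { 0xf0122528U, "_solisten" },   { 0xf01225d4U, "_sofree" },
    { 0xf01226a0U, "_soclose" },    { 0xf01227c0U, "_soabort" },
    { 0xf01227e8U, "_soaccept" },   { 0xf0122850U, "_soconnect" },
    { 0xf01228d8U, "_soconnect2" }, { 0xf0122924U, "_sodisconnect" },
    { 0xf0122990U, "_sosend" },     { 0xf012304cU, "_soreceive" },
    { 0xf01239c8U, "_soshutdown" }, { 0xf0123a04U, "_sorflush" },
    { 0xf0123ad0U, "_sosetopt" },   { 0xf0123d98U, "_sogetopt" },
};
#define KSYM_COUNT (sizeof(ksyms) / sizeof(ksyms[0]))

/*
 * Return the last symbol with addr <= pc, or NULL if pc lies below
 * the first entry.  The table must be sorted by address.
 */
static const struct ksym *
find_symbol(const struct ksym *tab, size_t n, uint32_t pc)
{
    const struct ksym *best = NULL;
    size_t i;

    for (i = 0; i < n && tab[i].addr <= pc; i++)
        best = &tab[i];
    return best;
}
```

For pc = 0xf0122d1d this returns the _sosend entry, so the fault is
0x38d bytes into sosend.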


I then had a look at the files being used, and saw the following

gdb -k kernel /a/tmp/vmcore.2

(kgdb) bt
#0  0xf010e7d3 in boot ()
#1  0xf010ea92 in panic ()
#2  0xf0192fb6 in trap_fatal ()
#3  0xf0192aa4 in trap_pfault ()
#4  0xf019277f in trap ()
#5  0xf0122d1d in sosend (so=0xf13d4500, addr=0x0, uio=0xefbffef4, top=0x0, 
    control=0x0, flags=0) at ../../kern/uipc_socket.c:427
#6  0xf0125a81 in sendit ()
#7  0xf0125b60 in sendto ()
#8  0xf019324f in syscall ()
#9  0x200c8bd1 in ?? ()
#10 0xc55e in ?? ()
#11 0xcbf0 in ?? ()
#12 0x2419 in ?? ()
#13 0x2374 in ?? ()
#14 0x1095 in ?? ()
(kgdb) 


#5  0xf0122d1d in sosend (so=0xf13d4500, addr=0x0, uio=0xefbffef4, top=0x0, 
    control=0x0, flags=0) at ../../kern/uipc_socket.c:427
427                                     mlen = MHLEN;
(kgdb) li
422                             if (flags & MSG_EOR)
423                                     top->m_flags |= M_EOR;
424                         } else do {
425                             if (top == 0) {
426                                     MGETHDR(m, M_WAIT, MT_DATA);
427                                     mlen = MHLEN;
428                                     m->m_pkthdr.len = 0;
429                                     m->m_pkthdr.rcvif = (struct ifnet *)0;
430                             } else {
431                                     MGET(m, M_WAIT, MT_DATA);
(kgdb) p m
$1 = (struct mbuf *) 0x0
(kgdb) 
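
A null m would be consistent with the fault address 0x18 if the
faulting store is the m->m_pkthdr.len = 0 on line 428.  The sketch
below models that on a 32-bit layout.  The field order and sizes are
an ASSUMPTION (a 20-byte header followed by rcvif, then len) and have
not been checked against the 2.2.5 sys/mbuf.h; the struct names are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * ASSUMED i386 layout, for illustration only.  Pointers are modelled
 * as 32-bit integers so the offsets do not depend on the host.
 */
struct m_hdr32 {
    uint32_t mh_next;       /* next buffer in chain */
    uint32_t mh_nextpkt;    /* next chain in queue */
    uint32_t mh_data;       /* location of data */
    int32_t  mh_len;        /* amount of data in this mbuf */
    int16_t  mh_type;
    int16_t  mh_flags;
};

struct pkthdr32 {
    uint32_t rcvif;         /* assumed to come first */
    int32_t  len;
};

struct mbuf32 {
    struct m_hdr32  m_hdr;
    struct pkthdr32 m_pkthdr;
};

/* Offset that a store to m->m_pkthdr.len would hit when m == 0. */
static size_t pkthdr_len_offset(void)
{
    return offsetof(struct mbuf32, m_pkthdr) +
           offsetof(struct pkthdr32, len);
}
```

Under these assumptions the offset comes out as 0x18, matching the
"supervisor write" to virtual address 0x18 in the panic message.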


I then had a look in sys/sys/mbuf.h and saw the following

#define MGETHDR(m, how, type) { \
          int _ms = splimp(); \
          if (mmbfree == 0) \
                (void)m_mballoc(1, (how)); \
          if (((m) = mmbfree) != 0) { \
                mmbfree = (m)->m_next; \
                mbstat.m_mtypes[MT_FREE]--; \
                (m)->m_type = (type); \
                mbstat.m_mtypes[type]++; \
                (m)->m_next = (struct mbuf *)NULL; \
                (m)->m_nextpkt = (struct mbuf *)NULL; \
                (m)->m_data = (m)->m_pktdat; \
                (m)->m_flags = M_PKTHDR; \
                splx(_ms); \
        } else { \
                splx(_ms); \
                (m) = m_retryhdr((how), (type)); \
        } \
}

Say it goes to the else branch because no mbufs are available; it will
then call m_retryhdr

in uipc_mbuf.c

struct mbuf *
m_retryhdr(i, t)
        int i, t;
{
        register struct mbuf *m;      

        m_reclaim();
#define m_retryhdr(i, t) (struct mbuf *)0
        MGETHDR(m, i, t);
#undef m_retryhdr
        if (m != NULL)
                mbstat.m_wait++;      
        else
                mbstat.m_drops++;     
        return (m);
}

Say m_reclaim doesn't free anything because everything is in use.
m_retryhdr then redefines m_retryhdr(i, t) as a null pointer and
re-invokes MGETHDR(m, i, t), which still can't allocate anything and
so executes
(m) = m_retryhdr((how), (type)); which is now defined as 0x0.
So MGETHDR, even with M_WAIT, happily returns m = 0x0, and no
one checks for that.
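
The macro trick described above can be reproduced in user space.  This
is a minimal sketch, not the kernel code: struct buf, GET, retry,
reclaim and get_buf are hypothetical stand-ins for struct mbuf,
MGETHDR, m_retryhdr, m_reclaim and a caller such as sosend, and the
free list starts empty to model running out of mbufs.

```c
#include <stddef.h>

struct buf {
    struct buf *next;
};

static struct buf *freelist = NULL; /* empty pool: nothing to hand out */
static int drops = 0;               /* stands in for mbstat.m_drops */

/* Models an m_reclaim() call that manages to free nothing. */
static void reclaim(void)
{
}

static struct buf *retry(void);

/* Take a buffer from the free list, or fall back to retry(). */
#define GET(m) do {                         \
        if (((m) = freelist) != NULL)       \
            freelist = (m)->next;           \
        else                                \
            (m) = retry();                  \
    } while (0)

static struct buf *retry(void)
{
    struct buf *m;

    reclaim();
    /* Redefine retry() as a null pointer so GET cannot recurse. */
#define retry() ((struct buf *)0)
    GET(m);
#undef retry
    if (m == NULL)
        drops++;
    return m;
}

/* A caller that, like the code at line 426, trusts GET to succeed. */
static struct buf *get_buf(void)
{
    struct buf *m;

    GET(m);
    return m;       /* NULL when the pool is empty and reclaim fails */
}
```

With the pool empty, get_buf() returns NULL after a single retry, so
any caller that dereferences the result without testing it faults.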

It then causes the kernel to panic.

:) would it not have been easier to call panic from within the else
in m_retryhdr :) instead of waiting for the mbuf to be referenced :)

Am I missing anything obvious here ???

Thanx
Reinier






