Date: Tue, 11 Aug 1998 19:54:49 +0200
From: Stefan Bethke <stb@FreeBSD.ORG>
To: Thomas Gellekum <tg@ihf.rwth-aachen.de>, "Jordan K. Hubbard" <jkh@time.cdrom.com>
Cc: freebsd-stable@FreeBSD.ORG
Subject: Re: Huge Bug in FreeBSD not fixed?
Message-ID: <1682190.3111854089@d254.promo.de>
In-Reply-To: <87yasvsqfv.fsf@ghpc6.ihf.rwth-aachen.de>
On Tue, 11 Aug 1998 13:33 +0200 Thomas Gellekum <tg@ihf.rwth-aachen.de> wrote:

> I have run this program five times and it finished once. The other
> four occasions I got
>
> Fatal trap 12: page fault while in kernel mode
> fault virtual address = 0x18
> fault code = supervisor write, page not present
> instruction pointer = 0x8:0xf0126d21
> stack pointer = 0x10:0xefbffe50
> frame pointer = 0x10:0xefbffe74
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, def32 1, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 395 (crashbsd)
> interrupt mask =
> kernel: type 12 trap, code=0
> Stopped at _sosend+0x391: movl $0, 0x18(%ebx)
>
> After saving the core dump and recompiling a few object files with -g:
> #9 0xf01c0a37 in trap (frame={tf_es = -2147483632, tf_ds = -272695280,
> tf_edi = -272630136, tf_esi = -2147483648, tf_ebp = -272630156,
> tf_isp = -272630212, tf_ebx = 0, tf_edx = 2147483647,
> tf_ecx = -1073277766, tf_eax = 0, tf_trapno = 12, tf_err = 2,
> tf_eip = -267227871, tf_cs = 8, tf_eflags = 66198, tf_esp = 0,
> tf_ss = 1}) at ../../i386/i386/trap.c:324
> #10 0xf0126d21 in sosend (so=0xf0937f00, addr=0x0, uio=0xefbffeb0,
> top=0x0, control=0xf06fff00, flags=0) at ../../kern/uipc_socket.c:432

Looking at kern/uipc_socket.c:sosend(), one can easily spot the problem
(which IIRC even Stevens mentions?): sosend() uses MGETHDR() to get a
fresh mbuf and expects it to always succeed.

Looking through MGETHDR (in sys/mbuf.h) and m_mballoc() and m_retryhdr()
(in kern/uipc_mbuf.c), the following can happen:

1. The free list mmbfree is empty.
2. MGETHDR calls m_mballoc(), which in turn calls kmem_malloc().
3. kmem_malloc() fails because the map mb_map is full (this is where the
   message is logged), and returns NULL.
4. MGETHDR then calls m_retryhdr().
5. m_retryhdr() tries to get mbufs back from the protocols by calling
   m_reclaim().
6. If no mbufs can be recovered this way, m_retryhdr() returns NULL.
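The failure path described above can be modeled in user space. The following is only a rough sketch and not the kernel code: every name prefixed with model_, the fixed-size pool, and the map_full and reclaimable flags are made up for illustration, and the real MGETHDR is a macro with more bookkeeping than shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an mbuf; only what the model needs. */
struct model_mbuf {
    struct model_mbuf *next;
    int pkthdr_len;
};

#define POOL_SIZE 4

static struct model_mbuf pool[POOL_SIZE];
static int pool_used = 0;
static struct model_mbuf *free_list = NULL; /* models an empty mmbfree */
static int map_full = 1;      /* models mb_map being exhausted */
static int reclaimable = 0;   /* mbufs the protocols could give back */

/* Push one unused pool entry onto the free list. */
static void push_from_pool(void)
{
    assert(pool_used < POOL_SIZE);
    struct model_mbuf *m = &pool[pool_used++];
    m->next = free_list;
    free_list = m;
}

/* Pop the head of the free list, or return NULL if it is empty. */
static struct model_mbuf *take_free(void)
{
    struct model_mbuf *m = free_list;
    if (m != NULL)
        free_list = m->next;
    return m;
}

/* Models m_mballoc(): refill the free list; fails when the map is full. */
static int model_mballoc(void)
{
    if (map_full || pool_used >= POOL_SIZE)
        return 0;
    push_from_pool();
    return 1;
}

/* Models m_reclaim(): ask the protocols to release mbufs. */
static void model_reclaim(void)
{
    while (reclaimable > 0 && pool_used < POOL_SIZE) {
        push_from_pool();
        reclaimable--;
    }
}

/* Models m_retryhdr(): reclaim, then try the free list once more. */
static struct model_mbuf *model_retryhdr(void)
{
    model_reclaim();
    return take_free();
}

/* Models MGETHDR: free list, then allocation, then retry.
 * The result can still be NULL, so the caller has to check it. */
static struct model_mbuf *model_gethdr(void)
{
    if (free_list == NULL)
        (void)model_mballoc();
    struct model_mbuf *m = take_free();
    if (m == NULL)
        m = model_retryhdr();
    return m;
}
```

With the initial state in this sketch (empty free list, full map, nothing to reclaim), model_gethdr() returns NULL, so any caller that writes through the result without testing it first is writing through a null pointer.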
Because sosend() expects a MGET(m, M_WAIT, MT_DATA) to always succeed, it
page-faults while trying to manipulate the non-allocated mbuf
(m->m_pkthdr.len at 0+0x18).

As a workaround you can try to increase the number of mbufs; however,
this will only make the case less likely to occur. The solution would be
either to make MGET() and MGETHDR() always succeed (or sleep
indefinitely), or to check the result of every one of those calls (as
many callers already do).

This is in both -stable and -current. A patch might be trivial for
someone who understands sosend() fully; I currently don't :-(

> Anything else I can do?

Fix the bug :-) ?

Stefan

--
Hamburg | Voice: +49-177-3504009
Germany | e-mail: stb@freebsd.org