Date: Thu, 18 Dec 2003 08:40:45 +0800 From: "=?gb2312?B?WGluIExJL8Du9s4=?=" <delphij@frontfree.net> To: <stable@FreeBSD.org>, <current@freebsd.org> Subject: [Backtrace] 4.9 and 5.1-RELEASE occasionly panic on RAM > 4GB without PAE (long) Message-ID: <20031218004038.B3EE45302@ftp.bjpu.edu.cn>
next in thread | raw e-mail | index | archive | help
Hello, One of my friend is operating servers on a big ICP in China. They found that if they run FreeBSD 4.9 or FreeBSD 5.1-RELEASE on their server, which has more than 4GB RAM installed, then they will occasionaly have the server panic'ed, or even rebooted without any reason. The server hardware is: P4-Xeon-2.4G*2, 4G RAM(DDR266), 146G*2 on asr(4) RAID-1. The kernel is not quite different from GENERIC, except that we have AUTO_EOI1 enabled. When PAE is enabled, then the kernel will not recognize the asr(4) device, and thus would refuse to boot from it, so we have it disabled. This problem afffects 4.9-RELEASE and 5.1-RELEASE. As the ICP is a conservative user, they will not run -STABLE, nor -CURRENT. I have turned the debugging symbols on their server and grabbed the following backtraces. Also, I have found how to reproduce this problem on a machine running with > 4GB memory without a PAE configuration, which, in my opinion, should not cause a kernel panic because it's just a resource hungry program and all it will do should, at most, hang the system, or even being killed directly by the operating system. ----------------------------------------------------------------------- #include <stdio.h> #include <stdlib.h> #include <sys/types.h> #include <unistd.h> int main() { char* s; int p; int d; d = 4096; for(;d>0;d--) { p=fork(); if(p==0) { s=malloc(1024*1024); if(s==NULL) printf("Pid %d malloc(1024*1024) failed\n",getpid()); else memset(s,0,1024*1024); while(1); } else if(p==-1) printf("Fork() %d failed\n",d); else continue; } return 0; } ----------------------------------------------------------------------- Unfortuantelly, this will cause panic after a short while. We got these backtrace: gdb /usr/src/sys/compile/DELPHIJ/kernel.debug -k vmcore.2 GNU gdb 4.18 (FreeBSD) Copyright 1998 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-unknown-freebsd"...Deprecated bfd_read called at /usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 2627 in elfstab_build_psymtabs Deprecated bfd_read called at /usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 933 in fill_symbuf SMP 4 cpus IdlePTD at phsyical address 0x003af000 initial pcb at physical address 0x00313fe0 panicstr: pmap_new_proc: u_map allocation failed panic messages: --- panic: pmap_new_proc: u_map allocation failed mp_lock = 02000001; cpuid = 2; lapic.id = 01000000 boot() called on cpu#2 syncing disks... 3 done Uptime: 2h44m38s dumping to dev #da/0x20001, offset 263168 dump 3967 3966 3965.................. ..................................... ..................................... ...............10 9 8 7 6 5 4 3 2 1 0 --- #0 dumpsys () at ../../kern/kern_shutdown.c:487 487 if (dumping++) { (kgdb) bt #0 dumpsys () at ../../kern/kern_shutdown.c:487 #1 0xc0185beb in boot (howto=256) at ../../kern/kern_shutdown.c:316 #2 0xc018605d in panic (fmt=0xc02e15e0 "pmap_new_proc: u_map allocation failed") at ../../kern/kern_shutdown.c:595 #3 0xc028d575 in pmap_new_proc (p=0xffb5d1e0) at ../../i386/i386/pmap.c:919 #4 0xc0239944 in vm_fork (p1=0xfbfe2ba0, p2=0xffb5d1e0, flags=20) at ../../vm/vm_glue.c:240 #5 0xc017fa85 in fork1 (p1=0xfbfe2ba0, flags=20, procp=0xfc0fef24) at ../.. /kern/kern_fork.c:517 #6 0xc017f1fa in fork (p=0xfbfe2ba0, uap=0xfc0fef80) at ../../kern/kern_fork.c:100 #7 0xc02916e1 in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = -1077937296, tf_esi = -1077937304, tf_ebp = -1077937392, tf_isp = -66064428, tf_ebx = 1, tf_edx = 0, tf_ecx = -1077937066, tf_eax = 2, tf_trapno = 12, tf_err = 2, tf_eip = 1208488192, tf_cs = 31, tf_eflags = 514, tf_esp = -1077937420, tf_ss = 47}) at ../../i386/i386/trap.c:1175 #8 0xc027e09b in Xint0x80_syscall () #9 0x8048506 in ?? () (kgdb) up #1 0xc0185beb in boot (howto=256) at ../../kern/kern_shutdown.c:316 316 dumpsys(); (kgdb) #2 0xc018605d in panic (fmt=0xc02e15e0 "pmap_new_proc: u_map allocation failed") at ../../kern/kern_shutdown.c:595 595 boot(bootopt); (kgdb) #3 0xc028d575 in pmap_new_proc (p=0xffb5d1e0) at ../../i386/i386/pmap.c:919 919 panic("pmap_new_proc: u_map allocation failed"); (kgdb) l 914 915 /* get a kernel virtual address for the UPAGES for this proc */ 916 if ((up = (vm_offset_t) p->p_addr) == 0) { 917 up = kmem_alloc_nofault(kernel_map, UPAGES * PAGE_SIZE); 918 if (up == 0) 919 panic("pmap_new_proc: u_map allocation failed"); 920 p->p_addr = (struct user *) up; 921 } 922 923 for(i = 0; i < UPAGES; i++) { (kgdb) p up $1 = 0 (kgdb) p {struct proc *}p $2 = (struct proc *) 0xffb5d040 (kgdb) p {struct proc}p $3 = {p_procq = {tqe_next = 0xffb5d040, tqe_prev = 0x0}, p_list = {le_next = 0xffb5d380, le_prev = 0xc032ac38}, p_cred = 0xd2898620, p_fd = 0xd289e900, p_stats = 0x0, p_limit = 0xd275e700, p_upages_obj = 0xffbf9e60, p_procsig = 0xd289dac0, p_flag = 6, p_stat = 1 '\001', p_pad1 = "\000\000", p_pid = 5832, p_hash = {le_next = 0xfd8641a0, le_prev = 0xd2198b20}, p_pglist = {le_next = 0xffb5d380, le_prev = 0xfbfe2bdc}, p_pptr = 0xfbfe2ba0, p_sibling = {le_next = 0xffb5d380, le_prev = 0xfbfe2bf0}, p_children = {lh_first = 0x0}, p_ithandle = {callout = 0x0}, p_oppid = 0, p_dupfd = 0, p_vmspace = 0xffb239c0, p_estcpu = 7, p_cpticks = 0, p_pctcpu = 0, p_wchan = 0x0, p_wmesg = 0x0, p_swtime = 0, p_slptime = 0, p_realtimer = {it_interval = {tv_sec = 0, tv_usec = 0}, it_value = {tv_sec = 0, tv_usec = 0}}, p_runtime = 0, p_uu = 0, p_su = 0, p_iu = 0, p_uticks = 0, p_sticks = 0, p_iticks = 0, p_traceflag = 0, p_tracep = 0x0, p_siglist = { __bits = {0, 0, 0, 0}}, p_textvp = 0xfc0310c0, p_lock = 0 '\000', p_oncpu = 0 '\000', p_lastcpu = 0 '\000', p_rqindex = 0 '\000', p_locks = 0, p_simple_locks = 0, p_stops = 0, p_stype = 0, p_step = 0 '\000', p_pfsflags = 0 '\000', p_pad3 = "\000", p_retval = {0, 0}, p_sigiolst = {slh_first = 0x0}, p_sigparent = 20, p_oldsigmask = {__bits = {0, 0, 0, 0}}, p_sig = 0, p_code = 0, p_klist = { slh_first = 0x0}, p_sigmask = {__bits = {0, 0, 0, 0}}, p_sigstk = {ss_sp = 0x0, ss_size = 0, ss_flags = 4}, p_priority = 50 '2', p_usrpri = 50 '2', p_nice = 0 '\000', p_comm = "fork_and_malloc_", p_pgrp = 0xd2740f20, p_sysent = 0xc02f13e0, p_rtprio = {type = 1, prio = 0}, p_prison = 0x0, p_args = 0xd2770b40, p_addr = 0x0, p_md = {md_regs = 0x0}, p_xstat = 0, p_acflag = 0, p_ru = 0x0, p_nthreads = 0, p_aioinfo = 0x0, p_wakeup = 0, p_peers = 0x0, p_leader = 0xffb5d1e0, p_asleep = {as_priority = 0, as_timo = 0}, p_emuldata = 0x0, p_fdtol = 0x0} (kgdb) ----------------------------------------------------------------------- On another machine, we have tested the following CGI script. When having heavy load, it will cause panic, too: ----------------------------------------------------------------------- #!/bin/sh printf "content-type:text/html\n\n" printf "<table bordercolor=a0a0a0 border=1 cellpadding=1 cellspacing=1>\n" printf "<tr>\n" printf "<td colspan=8 align=center>`/sbin/ifconfig|grep broadcast|head -n 1|perl -pi -e 's/ +/\t/g'|cut -f3|perl -pi -e 's/\n//g'`</td>\n" printf "</tr>\n" printf "<tr>\n" printf "<td align=center width=100>Load</td>\n" printf "<td align=center width=100>Connections</td>\n" printf "<td align=center width=100>Processes</td>\n" printf "<td align=center width=100>Open Files</td>\n" printf "<td align=center width=300>Processor(User%% System%% Free%%)</td>\n" printf "<td align=center width=100>Active Memory</td>\n" printf "<td align=center width=100>Swap</td>\n" printf "</tr>\n" printf "<tr>\n" printf "<td align=right>`w w|head -n 1|perl -pi -e 's/^.*, //g'|perl -pi -e 's/\n/ /g'`</td>\n" printf "<td align=right>`netstat -an|wc|cut -c1-8|perl -pi -e 's/ //g'|perl -pi -e 's/\n/ /g'`</td>\n" printf "<td align=right>`ps ax|wc|cut -c1-8|perl -pi -e 's/ //g'|perl -pi -e 's/\n/ /g'`</td>\n" printf "<td align=right>`fstat|wc|cut -c1-8|perl -pi -e 's/ //g'|perl -pi -e 's/\n/ /g'`</td>\n" printf "<td align=right>`vmstat|tail -n 1|perl -pi -e 's/ +/\t/g'|cut -f18-|perl -pi -e 's/\t/%% /g'|perl -pi -e 's/\n/%%/g'`</td>\n" printf "<td align=right>`top -n|head -n 4|tail -n 1|perl -pi -e 's/ +/\t/g'|cut -f2|perl -pi -e 's/\n/ /g'`</td>\n" printf "<td align=right>`top -n|head -n 5|tail -n 1|perl -pi -e 's/ +/\t/g'|cut -f4|perl -pi -e 's/\n/ /g'`</td>\n" printf "</tr>\n" printf "</table>\n" ----------------------------------------------------------------------- To simulate the situation, we have used ab: /usr/local/apache/bin/ab -n 16384000 -c 1024 And we got this: ----------------------------------------------------------------------- gdb /usr/src/sys/compile/DELPHIJ/kernel.debug -k vmcore.1 GNU gdb 4.18 (FreeBSD) Copyright 1998 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-unknown-freebsd"...Deprecated bfd_read called at /usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 2627 in elfstab_build_psymtabs Deprecated bfd_read called at /usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 933 in fill_symbuf SMP 4 cpus IdlePTD at phsyical address 0x003af000 initial pcb at physical address 0x00313fe0 panicstr: pmap_new_proc: u_map allocation failed panic messages: --- panic: pmap_new_proc: u_map allocation failed mp_lock = 01000001; cpuid = 1; lapic.id = 06000000 boot() called on cpu#1 syncing disks... 1 done Uptime: 1d0h36m40s dumping to dev #da/0x20001, offset 263168 dump 3967 3966 3965.................. ..................................... ..................................... ...............10 9 8 7 6 5 4 3 2 1 0 --- #0 dumpsys () at ../../kern/kern_shutdown.c:487 487 if (dumping++) { (kgdb) bt #0 dumpsys () at ../../kern/kern_shutdown.c:487 #1 0xc0185beb in boot (howto=256) at ../../kern/kern_shutdown.c:316 #2 0xc018605d in panic (fmt=0xc02e15e0 "pmap_new_proc: u_map allocation failed") at ../../kern/kern_shutdown.c:595 #3 0xc028d575 in pmap_new_proc (p=0xffa7f680) at ../../i386/i386/pmap.c:919 #4 0xc0239944 in vm_fork (p1=0xfed67780, p2=0xffa7f680, flags=20) at ../../vm/vm_glue.c:240 #5 0xc017fa85 in fork1 (p1=0xfed67780, flags=20, procp=0xfed6ef24) at ../.. /kern/kern_fork.c:517 #6 0xc017f1fa in fork (p=0xfed67780, uap=0xfed6ef80) at ../../kern/kern_fork.c:100 #7 0xc02916e1 in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 134960572, tf_esi = 135016448, tf_ebp = -1077938448, tf_isp = -19468332, tf_ebx = 134959244, tf_edx = 4, tf_ecx = -1077938376, tf_eax = 2, tf_trapno = 22, tf_err = 2, tf_eip = 134672192, tf_cs = 31, tf_eflags = 582, tf_esp = -1077938476, tf_ss = 47}) at ../../i386/i386/trap.c:1175 #8 0xc027e09b in Xint0x80_syscall () #9 0x804b34f in ?? () #10 0x804d915 in ?? () #11 0x804d403 in ?? () #12 0x804d17e in ?? () #13 0x804b4fb in ?? () #14 0x804ac40 in ?? () #15 0x80526be in ?? () #16 0x805262a in ?? () #17 0x804813e in ?? () (kgdb) up #1 0xc0185beb in boot (howto=256) at ../../kern/kern_shutdown.c:316 316 dumpsys(); (kgdb) #2 0xc018605d in panic (fmt=0xc02e15e0 "pmap_new_proc: u_map allocation failed") at ../../kern/kern_shutdown.c:595 595 boot(bootopt); (kgdb) #3 0xc028d575 in pmap_new_proc (p=0xffa7f680) at ../../i386/i386/pmap.c:919 919 panic("pmap_new_proc: u_map allocation failed"); (kgdb) l 914 915 /* get a kernel virtual address for the UPAGES for this proc */ 916 if ((up = (vm_offset_t) p->p_addr) == 0) { 917 up = kmem_alloc_nofault(kernel_map, UPAGES * PAGE_SIZE); 918 if (up == 0) 919 panic("pmap_new_proc: u_map allocation failed"); 920 p->p_addr = (struct user *) up; 921 } 922 923 for(i = 0; i < UPAGES; i++) { (kgdb) p p $1 = (struct proc *) 0xffa7f680 (kgdb) p {struct proc *}p $2 = (struct proc *) 0xffa7f4e0 (kgdb) p {struct proc}p $3 = {p_procq = {tqe_next = 0xffa7f4e0, tqe_prev = 0x0}, p_list = {le_next = 0xffa7f820, le_prev = 0xc032ac38}, p_cred = 0xd3cb9f20, p_fd = 0xd3c95d00, p_stats = 0x0, p_limit = 0xd3c9e100, p_upages_obj = 0xff64de04, p_procsig = 0xd3ca7400, p_flag = 4, p_stat = 1 '\001', p_pad1 = "\000\000", p_pid = 26450, p_hash = {le_next = 0xfd812dc0, le_prev = 0xd2198d48}, p_pglist = {le_next = 0xff18c3c0, le_prev = 0xfed677bc}, p_pptr = 0xfed67780, p_sibling = {le_next = 0x0, le_prev = 0xfed677d0}, p_children = {lh_first = 0x0}, p_ithandle = {callout = 0x0}, p_oppid = 0, p_dupfd = 0, p_vmspace = 0xffae2500, p_estcpu = 184, p_cpticks = 0, p_pctcpu = 0, p_wchan = 0x0, p_wmesg = 0x0, p_swtime = 0, p_slptime = 0, p_realtimer = {it_interval = {tv_sec = 0, tv_usec = 0}, it_value = {tv_sec = 0, tv_usec = 0}}, p_runtime = 0, p_uu = 0, p_su = 0, p_iu = 0, p_uticks = 0, p_sticks = 0, p_iticks = 0, p_traceflag = 0, p_tracep = 0x0, p_siglist = { __bits = {0, 0, 0, 0}}, p_textvp = 0xf9b5c740, p_lock = 0 '\000', p_oncpu = 0 '\000', p_lastcpu = 0 '\000', p_rqindex = 0 '\000', p_locks = 0, p_simple_locks = 0, p_stops = 0, p_stype = 0, p_step = 0 '\000', p_pfsflags = 0 '\000', p_pad3 = "\000", p_retval = {0, 0}, p_sigiolst = {slh_first = 0x0}, p_sigparent = 20, p_oldsigmask = {__bits = {0, 0, 0, 0}}, p_sig = 0, p_code = 0, p_klist = { slh_first = 0x0}, p_sigmask = {__bits = {0, 0, 0, 0}}, p_sigstk = {ss_sp = 0x0, ss_size = 0, ss_flags = 4}, p_priority = 52 '4', p_usrpri = 52 '4', p_nice = 0 '\000', p_comm = "sh\000pd\000r\000\000\000\000\000\000\000\000\000", p_pgrp = 0xd21ce460, p_sysent = 0xc02f13e0, p_rtprio = {type = 1, prio = 0}, p_prison = 0x0, p_args = 0xd3c97240, p_addr = 0x0, p_md = {md_regs = 0x0}, p_xstat = 0, p_acflag = 0, p_ru = 0x0, p_nthreads = 0, p_aioinfo = 0x0, p_wakeup = 0, p_peers = 0x0, p_leader = 0xffa7f680, p_asleep = { as_priority = 0, as_timo = 0}, p_emuldata = 0x0, p_fdtol = 0x0} (kgdb) p kernel_map $4 = 0xc031a60c ----------------------------------------------------------------------- I have took a sight on the pmap and vm code, and did not figured out why the panic was caused. So I post it here, and I will have to say that -CURRENT may have the same problem, and it may affect the upcoming 5.2-RELEASE. For additional information, please fell free to contact me, and I can act as liaison. Any suggestions? Thank you in advance! Xin LI, FreeBSD (Simplified) Chinese Project
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20031218004038.B3EE45302>