From: Julian Stecklina
To: freebsd-stable@freebsd.org
Subject: Reproducable Infiniband panic
Date: Thu, 06 Jun 2013 13:48:21 +0200
Message-ID: <51B07705.207@os.inf.tu-dresden.de>

Hello,

I see a reproducible panic when running ibping and aborting it with ^C. My setup is two machines with Mellanox InfiniHost III HCAs (one Linux, one FreeBSD) connected back-to-back. Details below. I can upload two crash dumps if that is useful.
For some reason the port doesn't become ACTIVE, so no packets arrive, but that is probably unrelated.

% uname -a
FreeBSD cosel.inf.tu-dresden.de 9.1-STABLE FreeBSD 9.1-STABLE #0 r+b6547e3: Wed Jun 5 18:29:51 CEST 2013 julian@cosel.inf.tu-dresden.de:/usr/obj/usr/home/julian/src/freebsd/sys/COSEL amd64

% sudo ibping 2
^C
--- (Lid 2) ibping statistics ---
6 packets transmitted, 0 received, 100% packet loss, time 5161 ms
rtt min/avg/max = 0.000/0.000/0.000 ms

Fatal trap 12: page fault while in kernel mode
cpuid = 6; apic id = 06
fault virtual address = 0x18
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff807a3d83
stack pointer = 0x28:0xffffff8092c97890
frame pointer = 0x28:0xffffff8092c978b0
code segment = base 0x0, limit 0xfffff, type 0x1b
             = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 1489 (ibping)
trap number = 12
panic: page fault
cpuid = 6
KDB: stack backtrace:
#0 0xffffffff80632a96 at kdb_backtrace+0x66
#1 0xffffffff805f9fce at panic+0x1ce
#2 0xffffffff808a7380 at trap_fatal+0x290
#3 0xffffffff808a76e1 at trap_pfault+0x211
#4 0xffffffff808a7c94 at trap+0x344
#5 0xffffffff80891043 at calltrap+0x8
#6 0xffffffff80513c39 at devfs_destroy_cdevpriv+0x69
#7 0xffffffff80513e47 at devfs_close_f+0x57
#8 0xffffffff805b4b23 at _fdrop+0x23
#9 0xffffffff805b65ec at closef+0x4c
#10 0xffffffff805b76cc at fdfree+0x23c
#11 0xffffffff805c4945 at exit1+0x305
#12 0xffffffff805c5d0e at sys_sys_exit+0xe
#13 0xffffffff808a6b56 at amd64_syscall+0x5d6
#14 0xffffffff80891327 at Xfast_syscall+0xf7

Full backtrace from kgdb:

#0 doadump (textdump=) at pcpu.h:234
No locals.
#1 0xffffffff805f9aa4 in kern_reboot (howto=260) at /usr/home/julian/src/freebsd/sys/kern/kern_shutdown.c:449
        _ep = (struct eventhandler_entry *) 0x0
        _el =
        first_buf_printf = 1
#2 0xffffffff805f9fa7 in panic (fmt=0x1
) at /usr/home/julian/src/freebsd/sys/kern/kern_shutdown.c:637
        td = (struct thread *) 0x1
        bootopt =
        newpanic =
        ap = {{gp_offset = 16, fp_offset = 48, overflow_arg_area = 0xffffff8092b934f0, reg_save_area = 0xffffff8092b93410}}
        panic_cpu = 7
        buf = "page fault", '\0'
#3 0xffffffff808a7380 in trap_fatal (frame=0xc, eva=) at /usr/home/julian/src/freebsd/sys/amd64/amd64/trap.c:878
        code =
        ss = 40
        type = 12
        esp =
        softseg = {ssd_base = 0, ssd_limit = 1048575, ssd_type = 27, ssd_dpl = 0, ssd_p = 1, ssd_long = 1, ssd_def32 = 0, ssd_gran = 1}
        msg =
#4 0xffffffff808a76e1 in trap_pfault (frame=0xffffff8092b937e0, usermode=0) at /usr/home/julian/src/freebsd/sys/amd64/amd64/trap.c:794
        id =
        va = 0
        vm =
        map = 0xfffffe000b0a3498
        rv = 0
        ftype = 255 'ÿ'
        td = (struct thread *) 0xfffffe000b0af000
        p = (struct proc *) 0xfffffe000b181950
        eva = 24
#5 0xffffffff808a7c94 in trap (frame=0xffffff8092b937e0) at /usr/home/julian/src/freebsd/sys/amd64/amd64/trap.c:463
        regs = {r_r15 = -2136840320, r_r14 = -547294202144, r_r13 = -547294202208, r_r12 = -2140660471, r_r11 = -2136840320, r_r10 = 594, r_r9 = -547294202160, r_r8 = -2198830683992, r_rdi = 0, r_rsi = -2136780531, r_rbp = 219043332096, r_rbx = -2198837989376, r_rdx = -547294202048, r_rcx = 2154444695, r_rax = -2133265824, r_trapno = 192571360, r_fs = 65024, r_gs = 65535, r_err = 525312, r_es = 65408, r_ds = 65535, r_rip = -2136840320, r_cs = -547294202064, r_rflags = -2133515200, r_rsp = -547294201968, r_ss = 0}
        td = (struct thread *) 0xfffffe000b0af000
        p =
        i =
        ucode =
        code = 0
        type = 12
        addr =
        ksi = {ksi_link = {tqe_next = 0xfffffe000553ac00, tqe_prev = 0xfffffe000b0af000}, ksi_info = {si_signo = -1833355440, si_errno = -128, si_code = -2140661293, si_pid = -1, si_uid = 0, si_status = 0, si_addr = 0xfffffe0000000000, si_value = {sival_int = -1833355392, sival_ptr = 0xffffff8092b93780, sigval_int = -1833355392, sigval_ptr = 0xffffff8092b93780}, _reason = {_fault = {_trapno = -2138032854}, _timer = {_timerid = -2138032854, _overrun = -1},
_mesgq = {_mqd = -2138032854}, _poll = {_band = -2138032854}, __spare__ = {__spare1__ = -2138032854, __spare2__ = {192571360, -512, 192571360, -512, 2, 0, 0}}}}, ksi_flags = -1833355296, ksi_sigq = 0xffffffff806955f3}
#6 0xffffffff80891043 in calltrap () at /usr/home/julian/src/freebsd/sys/amd64/amd64/exception.S:228
No locals.
#7 0xffffffff807a3d83 in linux_file_dtor (cdp=0xfffffe000aeabb80) at /usr/home/julian/src/freebsd/sys/ofed/include/linux/linux_compat.c:214
        filp = (struct linux_file *) 0xfffffe000aeabb80
#8 0xffffffff80513c39 in devfs_destroy_cdevpriv (p=0xfffffe0005772980) at /usr/home/julian/src/freebsd/sys/fs/devfs/devfs_vnops.c:159
No locals.
#9 0xffffffff80513e47 in devfs_close_f (fp=0xfffffe000b0e9aa0, td=) at /usr/home/julian/src/freebsd/sys/fs/devfs/devfs_vnops.c:619
        error = 0
        fpop = (struct file *) 0x0
#10 0xffffffff805b4b23 in _fdrop (fp=0xfffffe000b0e9aa0, td=) at file.h:334
        error = 0
#11 0xffffffff805b65ec in closef (fp=0xfffffe000b0e9aa0, td=0xfffffe000b0af000) at /usr/home/julian/src/freebsd/sys/kern/kern_descrip.c:2272
        vp =
        lf = {l_start = 0, l_len = -2198837126832, l_pid = 4, l_type = 0, l_whence = 0, l_sysid = 185266176}
        fdtol = (struct filedesc_to_leader *) 0x0
        fdp =
        fp_object = (struct file *) 0xfffffe000b0e9aa0
#12 0xffffffff805b76cc in fdfree (td=0xfffffe000b0af000) at /usr/home/julian/src/freebsd/sys/kern/kern_descrip.c:1976
        fdp = (struct filedesc *) 0xfffffe000b1a8000
        fpp = (struct file **) 0xfffffe000b1a8098
        i = 3
        fdtol =
        fp = (struct file *) 0xfffffe000b0e9aa0
        cdir =
        jdir =
        rdir =
        vp =
        lf = {l_start = -547294201248, l_len = -2140995476, l_pid = 4, l_type = 0, l_whence = 0, l_sysid = 0}
#13 0xffffffff805c4945 in exit1 (td=0xfffffe000b0af000, rv=0) at /usr/home/julian/src/freebsd/sys/kern/kern_exit.c:301
        id =
        p = (struct proc *) 0xfffffe000b181950
        nq =
        q = (struct proc *) 0x4
        vtmp =
        ttyvp =
        plim =
        reason =
#14 0xffffffff805c5d0e in sys_sys_exit (td=, uap=) at /usr/home/julian/src/freebsd/sys/kern/kern_exit.c:122
No locals.
#15 0xffffffff808a6b56 in amd64_syscall (td=0xfffffe000b0af000, traced=0) at subr_syscall.c:135
        sa = {code = 1, callp = 0xffffffff80d31330, args = {0, 0, 10, 0, 0, 0, 133124, -547294200768}, narg = 1}
        error = 0
        ksi = {ksi_link = {tqe_next = 0x0, tqe_prev = 0x0}, ksi_info = {si_signo = 2, si_errno = 0, si_code = 65542, si_pid = 0, si_uid = 0, si_status = 0, si_addr = 0x0, si_value = {sival_int = 0, sival_ptr = 0x0, sigval_int = 0, sigval_ptr = 0x0}, _reason = {_fault = {_trapno = 0}, _timer = {_timerid = 0, _overrun = 0}, _mesgq = {_mqd = 0}, _poll = {_band = 0}, __spare__ = {__spare1__ = 0, __spare2__ = {0, 0, 0, 0, 0, 0, 0}}}}, ksi_flags = 0, ksi_sigq = 0x0}
#16 0xffffffff80891327 in Xfast_syscall () at /usr/home/julian/src/freebsd/sys/amd64/amd64/exception.S:387
No locals.
#17 0x0000000800eda82c in ?? ()

Julian