From owner-freebsd-amd64@FreeBSD.ORG Sun Jan 7 11:54:08 2007 Return-Path: X-Original-To: freebsd-amd64@hub.freebsd.org Delivered-To: freebsd-amd64@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id BC69216A40F; Sun, 7 Jan 2007 11:54:08 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [69.147.83.40]) by mx1.freebsd.org (Postfix) with ESMTP id 954ED13C45D; Sun, 7 Jan 2007 11:54:08 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from freefall.freebsd.org (rwatson@localhost [127.0.0.1]) by freefall.freebsd.org (8.13.4/8.13.4) with ESMTP id l07Bs8Zr092346; Sun, 7 Jan 2007 11:54:08 GMT (envelope-from rwatson@freefall.freebsd.org) Received: (from rwatson@localhost) by freefall.freebsd.org (8.13.4/8.13.4/Submit) id l07Bs8nN092342; Sun, 7 Jan 2007 11:54:08 GMT (envelope-from rwatson) Date: Sun, 7 Jan 2007 11:54:08 GMT From: Robert Watson Message-Id: <200701071154.l07Bs8nN092342@freefall.freebsd.org> To: rwatson@FreeBSD.org, freebsd-amd64@FreeBSD.org, sos@FreeBSD.org Cc: Subject: Re: amd64/107639: Kernel Panic/Crash on dd if=/dev/ad4 of=/dev/ad6 bs=1m X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 07 Jan 2007 11:54:08 -0000 Synopsis: Kernel Panic/Crash on dd if=/dev/ad4 of=/dev/ad6 bs=1m Responsible-Changed-From-To: freebsd-amd64->sos Responsible-Changed-By: rwatson Responsible-Changed-When: Sun Jan 7 11:53:13 UTC 2007 Responsible-Changed-Why: Assign to sos, as this may be an ATA-related problem. If not, please assign back to me. 
http://www.freebsd.org/cgi/query-pr.cgi?pr=107639 From owner-freebsd-amd64@FreeBSD.ORG Sun Jan 7 23:12:48 2007 Return-Path: X-Original-To: freebsd-amd64@freebsd.org Delivered-To: freebsd-amd64@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 70C4E16A407 for ; Sun, 7 Jan 2007 23:12:48 +0000 (UTC) (envelope-from sven@dmv.com) Received: from smtp-gw-cl-c.dmv.com (smtp-gw-cl-c.dmv.com [216.240.97.41]) by mx1.freebsd.org (Postfix) with ESMTP id 2C06F13C45B for ; Sun, 7 Jan 2007 23:12:47 +0000 (UTC) (envelope-from sven@dmv.com) Received: from mail-gw-cl-a.dmv.com (mail-gw-cl-a.dmv.com [216.240.97.38]) by smtp-gw-cl-c.dmv.com (8.12.10/8.12.10) with ESMTP id l07N0k0F061559; Sun, 7 Jan 2007 18:00:46 -0500 (EST) (envelope-from sven@dmv.com) Received: from lanshark.dmv.com (lanshark.dmv.com [216.240.97.46]) by mail-gw-cl-a.dmv.com (8.12.9/8.12.9) with ESMTP id l07N0jXi009733; Sun, 7 Jan 2007 18:00:45 -0500 (EST) (envelope-from sven@dmv.com) From: Sven Willenberger To: stable@freebsd.org Content-Type: text/plain Date: Sun, 07 Jan 2007 18:06:45 -0500 Message-Id: <1168211205.22629.6.camel@lanshark.dmv.com> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.39 X-Scanned-By: MIMEDefang 2.48 on 216.240.97.38 Cc: freebsd-amd64@freebsd.org Subject: Panic in 6.2-PRERELEASE with bge on amd64 X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 07 Jan 2007 23:12:48 -0000 I am starting a new thread on this as what I had assumed was a panic in nfsd turns out to be an issue with the bge driver. This is an amd64 box, dual processor (SMP kernel) that happens to be running nfsd. About every 3-5 days the kernel panics and I have finally managed to get a core dump. 
The system: FreeBSD 6.2-PRERELEASE #8: Tue Jan 2 10:57:39 EST 2007

The short and dirty of the dump:

# kgdb /usr/obj/usr/src/sys/MSPOOL/kernel.debug /var/crash/vmcore.0
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd".

Unread portion of the kernel message buffer:
lock order reversal: (sleepable after non-sleepable)
 1st 0xffffffff8836b010 bge0 (network driver) @ /usr/src/sys/dev/bge/if_bge.c:2675
 2nd 0xffffffff805f26b0 user map (user map) @ /usr/src/sys/vm/vm_map.c:3074
KDB: stack backtrace:
witness_checkorder() at witness_checkorder+0x4da
_sx_xlock() at _sx_xlock+0x51
vm_map_lookup() at vm_map_lookup+0x44
vm_fault() at vm_fault+0xba
trap_pfault() at trap_pfault+0x13c
trap() at trap+0x1f9
calltrap() at calltrap+0x5
--- trap 0xc, rip = 0xffffffff801d5f17, rsp = 0xffffffffb371ab50, rbp = 0xffffffffb371aba0 ---
bge_rxeof() at bge_rxeof+0x3b7
bge_intr() at bge_intr+0x1c8
ithread_loop() at ithread_loop+0x14c
fork_exit() at fork_exit+0xbb
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffffffb371ad00, rbp = 0 ---

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address = 0x28
fault code = supervisor write, page not present
instruction pointer = 0x8:0xffffffff801d5f17
stack pointer = 0x10:0xffffffffb371ab50
frame pointer = 0x10:0xffffffffb371aba0
code segment = base 0x0, limit 0xfffff, type 0x1b
 = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 28 (irq24: bge0)
trap number = 12
panic: page fault
cpuid = 1
Uptime: 3d4h18m42s

#0 doadump () at pcpu.h:172
172 pcpu.h: No such file or directory.
 in pcpu.h
(kgdb) bt
#0 doadump () at pcpu.h:172
#1 0xffffffff802771b9 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
#2 0xffffffff80276c4b in panic (fmt=0xffffffff8044160c "%s") at /usr/src/sys/kern/kern_shutdown.c:565
#3 0xffffffff803ebba6 in trap_fatal (frame=0xc, eva=18446742978291675136) at /usr/src/sys/amd64/amd64/trap.c:660
#4 0xffffffff803ebee3 in trap_pfault (frame=0xffffffffb371aaa0, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:573
#5 0xffffffff803ec0f9 in trap (frame=
 {tf_rdi = 0, tf_rsi = 0, tf_rdx = 1, tf_rcx = 499, tf_r8 = 2521427970, tf_r9 = -1099500152320, tf_rax = 0, tf_rbx = -1263948192, tf_rbp = -1284396128, tf_r10 = 0, tf_r11 = 0, tf_r12 = -2009681920, tf_r13 = 0, tf_r14 = 0, tf_r15 = -1099499984896, tf_trapno = 12, tf_addr = 40, tf_flags = -1263948192, tf_err = 2, tf_rip = -2145558761, tf_cs = 8, tf_rflags = 66071, tf_rsp = -1284396192, tf_ss = 16})
 at /usr/src/sys/amd64/amd64/trap.c:352
#6 0xffffffff803d779b in calltrap () at /usr/src/sys/amd64/amd64/exception.S:168
#7 0xffffffff801d5f17 in bge_rxeof (sc=0xffffffff8836b000) at /usr/src/sys/dev/bge/if_bge.c:2528
#8 0xffffffff801db818 in bge_intr (xsc=0x0) at /usr/src/sys/dev/bge/if_bge.c:2707
#9 0xffffffff8025f2bc in ithread_loop (arg=0xffffff0000b1b320) at /usr/src/sys/kern/kern_intr.c:682
#10 0xffffffff8025e00b in fork_exit (callout=0xffffffff8025f170 , arg=0xffffff0000b1b320, frame=0xffffffffb371ac50) at /usr/src/sys/kern/kern_fork.c:821
#11 0xffffffff803d7afe in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:394

If more information is needed (disassemble, etc) please let me know. In
the interim I may switch to either using the base100 ethernet port (fxp)
or turn off SMP.
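A note on reading frame #5 of the backtrace: kgdb prints the trap-frame registers as signed decimals, so they do not visually match the hexadecimal addresses elsewhere in the dump. Reinterpreted as unsigned 64-bit values they line up: tf_rip = -2145558761 is the faulting instruction at 0xffffffff801d5f17, tf_r12 = -2009681920 is the sc pointer 0xffffffff8836b000 passed to bge_rxeof(), tf_addr = 40 is the fault virtual address 0x28, and tf_r14 = 0. A minimal sketch in C, assuming only the values printed in the dump; the helper names tf_to_u64() and tf_to_hex() are made up for illustration and are not part of kgdb or FreeBSD:

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a kgdb or FreeBSD API): reinterpret a
 * trap-frame value that gdb printed as a signed decimal as the
 * unsigned 64-bit quantity the register actually held.
 */
static uint64_t
tf_to_u64(int64_t printed)
{
	return ((uint64_t)printed);
}

/*
 * Format the value in hexadecimal, the way the backtrace prints
 * addresses, into a caller-supplied buffer.
 */
static void
tf_to_hex(int64_t printed, char *buf, size_t len)
{
	snprintf(buf, len, "0x%" PRIx64, tf_to_u64(printed));
}
```

For example, tf_to_hex(-2145558761, buf, sizeof(buf)) leaves "0xffffffff801d5f17" in buf, which is the rip reported in the trap line.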
Sven From owner-freebsd-amd64@FreeBSD.ORG Mon Jan 8 05:06:48 2007 Return-Path: X-Original-To: freebsd-amd64@FreeBSD.org Delivered-To: freebsd-amd64@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 6862C16A407; Mon, 8 Jan 2007 05:06:48 +0000 (UTC) (envelope-from bde@zeta.org.au) Received: from mailout2.pacific.net.au (mailout2-3.pacific.net.au [61.8.2.226]) by mx1.freebsd.org (Postfix) with ESMTP id 0C35013C43E; Mon, 8 Jan 2007 05:06:48 +0000 (UTC) (envelope-from bde@zeta.org.au) Received: from mailproxy1.pacific.net.au (mailproxy1.pacific.net.au [61.8.2.162]) by mailout2.pacific.net.au (Postfix) with ESMTP id 923DD6E305; Mon, 8 Jan 2007 16:06:44 +1100 (EST) Received: from katana.zip.com.au (katana.zip.com.au [61.8.7.246]) by mailproxy1.pacific.net.au (Postfix) with ESMTP id 0D4398C08; Mon, 8 Jan 2007 16:06:44 +1100 (EST) Date: Mon, 8 Jan 2007 16:06:44 +1100 (EST) From: Bruce Evans X-X-Sender: bde@delplex.bde.org To: Sven Willenberger In-Reply-To: <1168211205.22629.6.camel@lanshark.dmv.com> Message-ID: <20070108154433.C75042@delplex.bde.org> References: <1168211205.22629.6.camel@lanshark.dmv.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: stable@FreeBSD.org, freebsd-amd64@FreeBSD.org Subject: Re: Panic in 6.2-PRERELEASE with bge on amd64 X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Jan 2007 05:06:48 -0000 On Sun, 7 Jan 2007, Sven Willenberger wrote: > I am starting a new thread on this as what I had assumed was a panic in > nfsd turns out to be an issue with the bge driver. This is an amd64 box, > dual processor (SMP kernel) that happens to be running nfsd. About every > 3-5 days the kernel panics and I have finally managed to get a core > dump. 
> The system: FreeBSD 6.2-PRERELEASE #8: Tue Jan 2 10:57:39 EST 2007

Like most NIC drivers, bge unlocks and re-locks around its call to
ether_input() in its interrupt handler. This isn't very safe, and it
certainly causes panics for bge. I often see it panic when bringing
the interface down and up while input is arriving, on a non-SMP non-amd64
(actually i386) non-6.x (actually -current) system. Bringing the
interface down is probably the worst case. It creates a null pointer
for bge_intr() to follow.

> The short and dirty of the dump:
> ...
> --- trap 0xc, rip = 0xffffffff801d5f17, rsp = 0xffffffffb371ab50, rbp = 0xffffffffb371aba0 ---
> bge_rxeof() at bge_rxeof+0x3b7

What is the instruction here?

> bge_intr() at bge_intr+0x1c8
> ithread_loop() at ithread_loop+0x14c
> fork_exit() at fork_exit+0xbb
> fork_trampoline() at fork_trampoline+0xe
> --- trap 0, rip = 0, rsp = 0xffffffffb371ad00, rbp = 0 ---

> Fatal trap 12: page fault while in kernel mode
> cpuid = 1; apic id = 01
> fault virtual address = 0x28

Looks like a null pointer panic anyway. I guess the instruction is
movl to/from 0x28(%reg) where %reg is a null pointer.

> ...
> #8 0xffffffff801db818 in bge_intr (xsc=0x0) at /usr/src/sys/dev/bge/if_bge.c:2707

What is the statement here? It presumably follows a null pointer and only
the expression for the pointer is interesting. xsc is already null but
that is probably a bug in gdb, or the result of excessive optimization.
Compiling kernels with -O2 has little effect except to break debugging.

I rarely use gdb on kernels and haven't looked closely enough using ddb
to see where the null pointer for the panic on down/up came from.

BTW, the sbdrop panic in -current isn't bge-only or SMP-only. I saw
it once for sk on a non-SMP system. It rarely happens for non-SMP
(much more rarely than the panic in bge_intr()).
Under -current, on an SMP amd64 system with bge, It happens almost every time on close of the socket for a ttcp server if input is arriving at the time of the close. I haven't seen it for 6.x. Bruce From owner-freebsd-amd64@FreeBSD.ORG Mon Jan 8 11:08:13 2007 Return-Path: X-Original-To: freebsd-amd64@FreeBSD.org Delivered-To: freebsd-amd64@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 7BF5B16A415 for ; Mon, 8 Jan 2007 11:08:13 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [69.147.83.40]) by mx1.freebsd.org (Postfix) with ESMTP id 6811813C480 for ; Mon, 8 Jan 2007 11:08:13 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (linimon@localhost [127.0.0.1]) by freefall.freebsd.org (8.13.4/8.13.4) with ESMTP id l08B8DWH016403 for ; Mon, 8 Jan 2007 11:08:13 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.13.4/8.13.4/Submit) id l08B8BQ3016399 for freebsd-amd64@FreeBSD.org; Mon, 8 Jan 2007 11:08:11 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 8 Jan 2007 11:08:11 GMT Message-Id: <200701081108.l08B8BQ3016399@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: linimon set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-amd64@FreeBSD.org Cc: Subject: Current problem reports assigned to you X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Jan 2007 11:08:13 -0000 Current FreeBSD problem reports Critical problems S Tracker Resp. 
Description
--------------------------------------------------------------------------------
o amd64/89202  amd64 [ufs] [panic] Kernel crash when accessing filesystem w

1 problem total.

Serious problems

S Tracker      Resp. Description
--------------------------------------------------------------------------------
o amd64/69704  amd64 ext2/ext3 unstable in amd64
o amd64/69707  amd64 IPC32 dont work OK in amd64 FreeBSD
o amd64/71644  amd64 [panic] amd64 5.3-BETA4 crash when heavy load
o amd64/73252  amd64 ad6: WARNING - READ_DMA interrupt was seen but timeout
o amd64/73322  amd64 [msdosfs] [hang] unarchiving /etc to msdosfs locks up
o amd64/73650  amd64 5.3-release panics on boot
o amd64/73775  amd64 Kernel panic (trap 12) when booting with (not from!) P
o amd64/74747  amd64 System panic on shutdown when process will not die
o amd64/76136  amd64 system halts before reboot
o amd64/76336  amd64 racoon/setkey -D cases instant "Fatal Trap 12: Page fa
o amd64/78406  amd64 [panic]AMD64 w/ SCSI: issue 'rm -r /usr/ports' and sys
o amd64/78848  amd64 [sis] sis driver on FreeBSD 5.x does not work on amd64
o amd64/80114  amd64 kldload snd_ich causes interrupt storm when ACPI is en
o amd64/80691  amd64 amd64 kernel hangs on load
o amd64/81037  amd64 SATA problem
o amd64/81602  amd64 SATA crashes with parallel pcm access
o amd64/82425  amd64 [fxp] fxp0: device timeout, fxp interface dies on 5.4/
o amd64/82555  amd64 Kernel Panic - after i connect to my "amd64" from anot
o amd64/83005  amd64 Memory Occupied during installation of the FreeBSD 5.4
o amd64/84832  amd64 Installation crashes just at boot AMD64/ Version 5.4
o amd64/84930  amd64 [msdosfs] something wrong with msdosfs on amd64
o amd64/85431  amd64 AMD64 has short but temporary freezes (hangups) on Sun
o amd64/85451  amd64 [hang] 6.0-BETA3 lockups on AMD64 (PREEMPTION only)
o amd64/86080  amd64 [radeon] [hang] radeon DRI causes system hang on amd64
o amd64/86503  amd64 [atapicam] [panic] k3b crash the system like hardware
o amd64/87156  amd64 First Installation: Kernel crashes
o amd64/87258  amd64 [smp] [boot] cannot boot with SMP and Areca ARC-1160 r
o amd64/87305  amd64 [smp] Dual Opteron / FreeBSD 5 & 6 / powerd results in
o amd64/87316  amd64 [vge] "vge0 attach returned 6" on FreeBSD 6.0-RC1 amd6
o amd64/87348  amd64 amd64+smp+startkde always crashing
o amd64/87472  amd64 I downloaded 5.4 and went to install it, but it keeps
o amd64/87689  amd64 [powerd] [hang] powerd hangs SMP Opteron 244 5-STABLE
o amd64/87977  amd64 [busdma] [panic] amd64 busdma dflt_lock called (by ata
o amd64/88299  amd64 swapcontext fails with errno 0
f amd64/88568  amd64 [panic] 6.0-RELEASE install cd does not boot with usb
o amd64/88790  amd64 kernel panic on first boot (after the FreeBSD installa
o amd64/89501  amd64 System crashes on install using ftp on local subnet
o amd64/89503  amd64 Cant Boot Installation Disk
o amd64/89546  amd64 [geom] GEOM error
o amd64/89549  amd64 [amd64] nve timeouts on 6.0-release
o amd64/89550  amd64 [amd64] sym0: VTOBUS failed (6.0 Release)
o amd64/89968  amd64 [ata] Asus A8N-E MediaShield RAID problem (read-only s
o amd64/91405  amd64 [asr] [panic] Kernel panic caused by asr on 6.0-amd64
o amd64/91492  amd64 BTX halted
o amd64/92337  amd64 [em] FreeBSD 6.0 Release Intel Pro 1000 MT em1 no buff
o amd64/92889  amd64 [libc] xdr double buffer overflow
o amd64/92991  amd64 FreeBSD(amd64) freezes when primary disk is on a SiI 3
o amd64/93961  amd64 [busdma] Problem in bounce buffer handling in sys/amd6
o amd64/94677  amd64 panic in amd64 install at non-root user creation
o amd64/94989  amd64 BTX Halts on Sun Fire X2100 w/6.1-BETA4 (amd64) and 5.
f amd64/95167  amd64 driver for SuperMicro H8DAR-T (Adaptec AIC-8130: (Marv
o amd64/95414  amd64 kernel crashes during install
o amd64/95888  amd64 kernel: ad2: TIMEOUT - WRITE_DMA retrying on HP DL140G
o amd64/96400  amd64 FreeBSD 6.0 Bootin Conflict between Broadcom on-broad
o amd64/97075  amd64 Panic, Trap 12
o amd64/97337  amd64 xorg reboots system if dri module is enabled
o amd64/99561  amd64 system hangs in FreeBSD AMD64 when writting ext2fs
o amd64/102122 amd64 6.1-RELEASE amd64 Install Media panics on boot.
s amd64/104311 amd64 ports/wine should be installable on amd64
o amd64/105187 amd64 make -j2 buildworld renders FreeBSD 6.2-PRE/AMD64 unus
o amd64/105207 amd64 nVidia MCP55 drivers fail to boot on 6.2B3 amd64
o amd64/105513 amd64 Kernel Panic during package installation on 6.2 beta3
o amd64/105531 amd64 gigabyte GA-M51GM-S2G / nVidia nForce 430 - does not d
o amd64/105629 amd64 [re] Issue with re driver
p amd64/106109 amd64 amd64: si_addr is not set when sending a signal
o amd64/106604 amd64 saslauthd crashes with signal 6 on FreeBSD 6.2-PREREL
o amd64/106918 amd64 Asus P5B with internal RealTek PCIe Ethernet
o amd64/107345 amd64 Kernel Panic/Crash on dd if=/dev/ad4 of=/dev/ad6 bs=1m
o amd64/107433 amd64 i can't install FreeBSD Release 6.1 on HP Pavillion dv

69 problems total.

Non-critical problems

S Tracker      Resp. Description
--------------------------------------------------------------------------------
o amd64/61209  amd64 ppc0: cannot reserve I/O port range
o amd64/63188  amd64 [ti] ti(4) broken on amd64
o amd64/69705  amd64 IPC problem (msq_queues)
o amd64/74608  amd64 [mpt] [hang] mpt hangs 5 minutes when booting
o amd64/74811  amd64 [nfs] df, nfs mount, negative Avail -> 32/64-bit confu
s amd64/85273  amd64 FreeBSD (NetBSD or OpenBSD) not install on laptop Comp
o amd64/87882  amd64 emu10k1 and APCI on amd64 is just noisy
o amd64/88730  amd64 kernel panics during booting from the installation CD
o amd64/91195  amd64 FreeBSD 6.0(amd64) and Asus A8R-MVP
a amd64/92527  amd64 [ciphy.c] ][patch] no driver for "CICADA VSC 8201 Giga
o amd64/93002  amd64 amd64 (6.0) coredumps at unpredictable times
a amd64/93090  amd64 NIC on GA-K8NF-9 motherboard is recognized, but does n
o amd64/95282  amd64 [ed] fix ed for RELENG_5 amd64 so that it has network
o amd64/97489  amd64 nForce 410 ATA controller dma time out
o amd64/100326 amd64 /dev/fd0 not created after installation FreeBSD 6.1 AM
o amd64/100347 amd64 No hardware support Silicon Image SiI 3132
o amd64/100838 amd64 FreeBSD 6.0/6.1 kernel panics when booting with EIST e
o amd64/101132 amd64 Incorrect cpu idle and usage statistics in top and sys
o amd64/101248 amd64 vi(1) can crash in ncurses(3) on amd64
o amd64/102716 amd64 ex with no argument in an xterm gets SIGSEGV
f amd64/102975 amd64 NIC unknown
o amd64/103259 amd64 Cannot use ataraid on nvidia nForce4+amd64
o amd64/104875 amd64 unsupported intel Desktop Board DG965WH
o amd64/105129 amd64 Compatibility with Intel D
o amd64/106186 amd64 panic in swap_pager_swap_init (amd64/smp/6.2-pre)

25 problems total.
From owner-freebsd-amd64@FreeBSD.ORG Mon Jan 8 15:52:55 2007 Return-Path: X-Original-To: freebsd-amd64@freebsd.org Delivered-To: freebsd-amd64@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8E9C516A403; Mon, 8 Jan 2007 15:52:55 +0000 (UTC) (envelope-from sven@dmv.com) Received: from smtp-gw-cl-c.dmv.com (smtp-gw-cl-c.dmv.com [216.240.97.41]) by mx1.freebsd.org (Postfix) with ESMTP id 4953613C428; Mon, 8 Jan 2007 15:52:54 +0000 (UTC) (envelope-from sven@dmv.com) Received: from mail-gw-cl-a.dmv.com (mail-gw-cl-a.dmv.com [216.240.97.38]) by smtp-gw-cl-c.dmv.com (8.12.10/8.12.10) with ESMTP id l08Fqr0F077568; Mon, 8 Jan 2007 10:52:53 -0500 (EST) (envelope-from sven@dmv.com) Received: from lanshark.dmv.com (lanshark.dmv.com [216.240.97.46]) by mail-gw-cl-a.dmv.com (8.12.9/8.12.9) with ESMTP id l08FqpXi024980; Mon, 8 Jan 2007 10:52:51 -0500 (EST) (envelope-from sven@dmv.com) From: Sven Willenberger To: Bruce Evans In-Reply-To: <20070108154433.C75042@delplex.bde.org> References: <1168211205.22629.6.camel@lanshark.dmv.com> <20070108154433.C75042@delplex.bde.org> Content-Type: text/plain Date: Mon, 08 Jan 2007 10:58:55 -0500 Message-Id: <1168271935.23549.10.camel@lanshark.dmv.com> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.39 X-Scanned-By: MIMEDefang 2.48 on 216.240.97.38 Cc: stable@freebsd.org, freebsd-amd64@freebsd.org Subject: Re: Panic in 6.2-PRERELEASE with bge on amd64 X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Jan 2007 15:52:55 -0000 On Mon, 2007-01-08 at 16:06 +1100, Bruce Evans wrote: > On Sun, 7 Jan 2007, Sven Willenberger wrote: > > > I am starting a new thread on this as what I had assumed was a panic in > > nfsd turns out to be an issue with the 
bge driver. This is an amd64 box,
> > dual processor (SMP kernel) that happens to be running nfsd. About every
> > 3-5 days the kernel panics and I have finally managed to get a core
> > dump.
> > The system: FreeBSD 6.2-PRERELEASE #8: Tue Jan 2 10:57:39 EST 2007
>
> Like most NIC drivers, bge unlocks and re-locks around its call to
> ether_input() in its interrupt handler. This isn't very safe, and it
> certainly causes panics for bge. I often see it panic when bringing
> the interface down and up while input is arriving, on a non-SMP non-amd64
> (actually i386) non-6.x (actually -current) system. Bringing the
> interface down is probably the worst case. It creates a null pointer
> for bge_intr() to follow.
>
> > The short and dirty of the dump:
> > ...
> > --- trap 0xc, rip = 0xffffffff801d5f17, rsp = 0xffffffffb371ab50, rbp = 0xffffffffb371aba0 ---
> > bge_rxeof() at bge_rxeof+0x3b7
>
> What is the instruction here?

I will do my best to ferret out the information you need. For the
bge_rxeof() at bge_rxeof+0x3b7 line, the instruction is:

0xffffffff801d5f17: mov %r15,0x28(%r14)

For the bge_intr() at bge_intr+0x1c8 line, the instruction is:

0xffffffff801db818: mov %rbx,%rdi

> > bge_intr() at bge_intr+0x1c8
> > ithread_loop() at ithread_loop+0x14c
> > fork_exit() at fork_exit+0xbb
> > fork_trampoline() at fork_trampoline+0xe
> > --- trap 0, rip = 0, rsp = 0xffffffffb371ad00, rbp = 0 ---
>
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 1; apic id = 01
> > fault virtual address = 0x28
>
> Looks like a null pointer panic anyway. I guess the instruction is
> movl to/from 0x28(%reg) where %reg is a null pointer.
>

From the above lines, apparently %r14 is null then.

> > ...
> > #8 0xffffffff801db818 in bge_intr (xsc=0x0) at /usr/src/sys/dev/bge/if_bge.c:2707
>
> What is the statement here? It presumably follows a null pointer and only
> the expression for the pointer is interesting. xsc is already null but
> that is probably a bug in gdb, or the result of excessive optimization.
> Compiling kernels with -O2 has little effect except to break debugging.
>

The block of code from if_bge.c:

2705 if (ifp->if_drv_flags & IFF_DRV_RUNNING) {
2706 /* Check RX return ring producer/consumer. */
2707 bge_rxeof(sc);
2708
2709 /* Check TX ring producer/consumer. */
2710 bge_txeof(sc);
2711 }

By default -O2 is passed to CC (I don't use any custom make flags; I
only define CPUTYPE in my /etc/make.conf).

> I rarely use gdb on kernels and haven't looked closely enough using ddb
> to see where the null pointer for the panic on down/up came from.
>
> BTW, the sbdrop panic in -current isn't bge-only or SMP-only. I saw
> it once for sk on a non-SMP system. It rarely happens for non-SMP
> (much more rarely than the panic in bge_intr()). Under -current, on
> an SMP amd64 system with bge, it happens almost every time on close
> of the socket for a ttcp server if input is arriving at the time of
> the close. I haven't seen it for 6.x.
>
> Bruce

The short of it is that this interface sees pretty much non-stop traffic,
as this is a mailserver (final destination) that is constantly being
delivered to (direct disk access) and having mail retrieved from it
(remote machine(s) with nfs mounted mail spools). If a momentary down of
the interface is enough to completely panic the driver and then the
kernel, this hardly seems "robust" if, in fact, this is what is
happening. So the question arises as to what would be causing the down/up
of the interface; I could start looking at the cable, the switch it's
connected to and ... any other ideas? (I don't have watchdog enabled or
anything like that, for example).
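The unlock/re-lock hazard Bruce describes above can be illustrated with a toy model. The sketch below is not the code in if_bge.c and is not a proposed fix; every name in it (toy_softc, toy_rxeof(), toy_down(), toy_deliver()) is hypothetical, and the "lock" is a plain flag rather than a real mutex. It only shows the general pattern under discussion: a receive loop that drops its lock while handing a packet up the stack, and therefore has to re-validate the driver state after re-locking, because the interface may have been taken down in the meantime.

```c
#include <stddef.h>

/* Toy stand-ins; none of these names exist in if_bge.c. */
struct toy_softc {
	int	lock_held;	/* models the driver mutex */
	int	running;	/* models IFF_DRV_RUNNING */
	int	*rx_ring;	/* ring state; NULL once the interface is down */
	size_t	rx_count;	/* number of entries in rx_ring */
};

typedef void (*deliver_fn)(struct toy_softc *, int);

static void toy_lock(struct toy_softc *sc)   { sc->lock_held = 1; }
static void toy_unlock(struct toy_softc *sc) { sc->lock_held = 0; }

/* Models taking the interface down: tears down ring state under the lock. */
static void
toy_down(struct toy_softc *sc)
{
	toy_lock(sc);
	sc->running = 0;
	sc->rx_ring = NULL;
	sc->rx_count = 0;
	toy_unlock(sc);
}

/*
 * Receive loop that drops the lock around delivery, but re-checks the
 * running flag and the ring pointer every time it re-acquires the lock
 * instead of trusting values seen before the unlock.
 * Returns the number of packets delivered.
 */
static size_t
toy_rxeof(struct toy_softc *sc, deliver_fn deliver)
{
	size_t i, delivered = 0;

	toy_lock(sc);
	for (i = 0; sc->running && sc->rx_ring != NULL && i < sc->rx_count; i++) {
		int pkt = sc->rx_ring[i];

		toy_unlock(sc);		/* window in which state can change */
		deliver(sc, pkt);
		delivered++;
		toy_lock(sc);		/* loop condition re-validates state */
	}
	toy_unlock(sc);
	return (delivered);
}

/* Example delivery hook: the interface goes down while packet 2 is handled. */
static void
toy_deliver(struct toy_softc *sc, int pkt)
{
	if (pkt == 2)
		toy_down(sc);
}
```

With a four-entry ring and toy_deliver() as the hook, toy_rxeof() delivers two packets and then stops without touching the torn-down ring. A loop that cached the ring pointer before unlocking and kept using it afterwards would instead follow stale or null state, which is the kind of failure the fault address in the dump suggests.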
Sven From owner-freebsd-amd64@FreeBSD.ORG Mon Jan 8 21:59:43 2007 Return-Path: X-Original-To: amd64@freebsd.org Delivered-To: freebsd-amd64@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 7AE5C16A415; Mon, 8 Jan 2007 21:59:43 +0000 (UTC) (envelope-from tinderbox@freebsd.org) Received: from smarthost2.sentex.ca (smarthost2.sentex.ca [205.211.164.50]) by mx1.freebsd.org (Postfix) with ESMTP id 3DC2113C44C; Mon, 8 Jan 2007 21:59:41 +0000 (UTC) (envelope-from tinderbox@freebsd.org) Received: from smtp2.sentex.ca (smtp2.sentex.ca [199.212.134.9]) by smarthost2.sentex.ca (8.13.8/8.13.8) with ESMTP id l08LxeGK010409; Mon, 8 Jan 2007 16:59:40 -0500 (EST) (envelope-from tinderbox@freebsd.org) Received: from freebsd-current.sentex.ca (freebsd-current.sentex.ca [64.7.128.98]) by smtp2.sentex.ca (8.13.8/8.13.8) with ESMTP id l08LxeUm039567; Mon, 8 Jan 2007 16:59:40 -0500 (EST) (envelope-from tinderbox@freebsd.org) Received: by freebsd-current.sentex.ca (Postfix, from userid 666) id 7B15773034; Mon, 8 Jan 2007 16:59:40 -0500 (EST) Sender: FreeBSD Tinderbox From: FreeBSD Tinderbox To: FreeBSD Tinderbox , , Precedence: bulk Message-Id: <20070108215940.7B15773034@freebsd-current.sentex.ca> Date: Mon, 8 Jan 2007 16:59:40 -0500 (EST) X-Virus-Scanned: ClamAV version devel-20070108, clamav-milter version devel-111206 on news X-Virus-Status: Clean Cc: Subject: [head tinderbox] failure on amd64/amd64 X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Jan 2007 21:59:43 -0000 TB --- 2007-01-08 21:05:00 - tinderbox 2.3 running on freebsd-current.sentex.ca TB --- 2007-01-08 21:05:00 - starting HEAD tinderbox run for amd64/amd64 TB --- 2007-01-08 21:05:00 - cleaning the object tree TB --- 2007-01-08 21:05:50 - checking out the source tree TB --- 2007-01-08 
21:05:50 - cd /tinderbox/HEAD/amd64/amd64
TB --- 2007-01-08 21:05:50 - /usr/bin/cvs -f -R -q -d/home/ncvs update -Pd -A src
TB --- 2007-01-08 21:15:55 - building world (CFLAGS=-O2 -pipe)
TB --- 2007-01-08 21:15:55 - cd /src
TB --- 2007-01-08 21:15:55 - /usr/bin/make -B buildworld
>>> World build started on Mon Jan 8 21:15:57 UTC 2007
>>> Rebuilding the temporary build tree
>>> stage 1.1: legacy release compatibility shims
>>> stage 1.2: bootstrap tools
>>> stage 2.1: cleaning up the object tree
>>> stage 2.2: rebuilding the object tree
>>> stage 2.3: build tools
>>> stage 3: cross tools
>>> stage 4.1: building includes
>>> stage 4.2: building libraries
>>> stage 4.3: make dependencies
>>> stage 4.4: building everything
[...]
(cd /src/rescue/rescue/../../sbin/dhclient && /usr/bin/make -DRESCUE CRUNCH_CFLAGS=-DRESCUE DIRPRFX=rescue/rescue/dhclient/ -DRELEASE_CRUNCH -Dlint depend && /usr/bin/make -DRESCUE CRUNCH_CFLAGS=-DRESCUE DIRPRFX=rescue/rescue/dhclient/ -DRELEASE_CRUNCH -Dlint dhclient.o clparse.o alloc.o dispatch.o hash.o bpf.o options.o tree.o conflex.o errwarn.o inet.o packet.o convert.o tables.o parse.o privsep.o)
rm -f .depend
mkdep -f .depend -a -DRESCUE /src/sbin/dhclient/dhclient.c /src/sbin/dhclient/clparse.c /src/sbin/dhclient/alloc.c /src/sbin/dhclient/dispatch.c /src/sbin/dhclient/hash.c /src/sbin/dhclient/bpf.c /src/sbin/dhclient/options.c /src/sbin/dhclient/tree.c /src/sbin/dhclient/conflex.c /src/sbin/dhclient/errwarn.c /src/sbin/dhclient/inet.c /src/sbin/dhclient/packet.c /src/sbin/dhclient/convert.c /src/sbin/dhclient/tables.c /src/sbin/dhclient/parse.c /src/sbin/dhclient/privsep.c
echo dhclient: /obj/amd64/src/tmp/usr/lib/libc.a >> .depend
cc -O2 -pipe -DRESCUE -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized -c /src/sbin/dhclient/dhclient.c
In file included from /src/sbin/dhclient/dhclient.c:62:
/obj/amd64/src/tmp/usr/include/net80211/ieee80211_freebsd.h:151: warning: "struct ifqueue" declared inside parameter list
/obj/amd64/src/tmp/usr/include/net80211/ieee80211_freebsd.h:151: warning: its scope is only this definition or declaration, which is probably not what you want
*** Error code 1

Stop in /src/sbin/dhclient.
*** Error code 1

Stop in /obj/amd64/src/rescue/rescue.
*** Error code 1

Stop in /src/rescue/rescue.
*** Error code 1

Stop in /src/rescue.
*** Error code 1

Stop in /src.
*** Error code 1

Stop in /src.
*** Error code 1

Stop in /src.
TB --- 2007-01-08 21:59:40 - WARNING: /usr/bin/make returned exit code 1
TB --- 2007-01-08 21:59:40 - ERROR: failed to build world
TB --- 2007-01-08 21:59:40 - tinderbox aborted
TB --- 0.96 user 3.41 system 3279.68 real

http://tinderbox.des.no/tinderbox-head-HEAD-amd64-amd64.full

From owner-freebsd-amd64@FreeBSD.ORG Tue Jan 9 01:50:55 2007
Return-Path:
X-Original-To: freebsd-amd64@FreeBSD.org
Delivered-To: freebsd-amd64@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8BEFE16A403; Tue, 9 Jan 2007 01:50:55 +0000 (UTC) (envelope-from bde@zeta.org.au)
Received: from mailout1.pacific.net.au (mailout1-3.pacific.net.au [61.8.2.210]) by mx1.freebsd.org (Postfix) with ESMTP id 28B0313C44B; Tue, 9 Jan 2007 01:50:55 +0000 (UTC) (envelope-from bde@zeta.org.au)
Received: from mailproxy1.pacific.net.au (mailproxy1.pacific.net.au [61.8.2.162]) by mailout1.pacific.net.au (Postfix) with ESMTP id A49835A7C29; Tue, 9 Jan 2007 12:50:53 +1100 (EST)
Received: from katana.zip.com.au (katana.zip.com.au [61.8.7.246]) by mailproxy1.pacific.net.au (Postfix) with ESMTP id 34C958C13; Tue, 9 Jan 2007 12:50:52 +1100 (EST)
Date: Tue, 9 Jan 2007 12:50:51 +1100 (EST)
From: Bruce Evans
X-X-Sender: bde@delplex.bde.org
To: Sven Willenberger
In-Reply-To: <1168271935.23549.10.camel@lanshark.dmv.com>
Message-ID: <20070109124826.M79616@delplex.bde.org>
References: <1168211205.22629.6.camel@lanshark.dmv.com> <20070108154433.C75042@delplex.bde.org> <1168271935.23549.10.camel@lanshark.dmv.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: stable@FreeBSD.org, freebsd-amd64@FreeBSD.org Subject: Re: Panic in 6.2-PRERELEASE with bge on amd64 X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Jan 2007 01:50:55 -0000 On Mon, 8 Jan 2007, Sven Willenberger wrote: > On Mon, 2007-01-08 at 16:06 +1100, Bruce Evans wrote: >> On Sun, 7 Jan 2007, Sven Willenberger wrote: >>> The short and dirty of the dump: >>> ... >>> --- trap 0xc, rip = 0xffffffff801d5f17, rsp = 0xffffffffb371ab50, rbp = 0xffffffffb371aba0 --- >>> bge_rxeof() at bge_rxeof+0x3b7 >> >> What is the instruction here? > > I will do my best to ferret out the information you need. For the > bge_rxeof() at bge_rxeof+0x3b7 line, the instruction is: > > 0xffffffff801d5f17 : mov %r15,0x28(%r14) > ... >> Looks like a null pointer panic anyway. I guess the instruction is >> movl to/from 0x28(%reg) where %reg is a null pointer. >> > > from the above lines, apparently %r14 is null then. Yes. It's a bit suprising that the access is a write. >>> ... >>> #8 0xffffffff801db818 in bge_intr (xsc=0x0) at /usr/src/sys/dev/bge/if_bge.c:2707 >> >> What is the statement here? It presumably follow a null pointer and only >> the exprssion for the pointer is interesting. xsc is already null but >> that is probably a bug in gdb, or the result of excessive optimization. >> Compiling kernels with -O2 has little effect except to break debugging. > > the block of code from if_bge.c: > > 2705 if (ifp->if_drv_flags & IFF_DRV_RUNNING) { > 2706 /* Check RX return ring producer/consumer. */ > 2707 bge_rxeof(sc); > 2708 > 2709 /* Check TX ring producer/consumer. */ > 2710 bge_txeof(sc); > 2711 } Oops. I should have asked for the statment in bge_rxeof(). 
> By default -O2 is passed to CC (I don't use any custom make flags other
> than and only define CPUTYPE in my /etc/make.conf).

-O2 is unfortunately the default for COPTFLAGS for most arches in
sys/conf/kern.pre.mk. All of my machines and most FreeBSD cluster
machines override this default in /etc/make.conf.

With the override overridden for RELENG_6 amd64, gcc inlines bge_rxeof(),
so your environment must be a little different to get even the above
info. I think gdb can show the correct line numbers but not the call
frames (since there is no call). ddb and the kernel stack trace can
only show the call frames for actual calls.

With -O1, I couldn't find any instruction similar to the mov to the
null pointer + 28. 28 is a popular offset in mbufs.

> The short of it is that this interface sees pretty much non-stop traffic
> as this is a mailserver (final destination) and is constantly being
> delivered to (direct disk access) and mail being retrieved (remote
> machine(s) with nfs mounted mail spools. If a momentary down of the
> interface is enough to completely panic the driver and then the kernel,
> this hardly seems "robust" if, in fact, this is what is happening. So
> the question arises as to what would be causing the down/up of the
> interface; I could start looking at the cable, the switch it's connected
> to and ... any other ideas? (I don't have watchdog enabled or anything
> like that, for example).

I don't think down/up can occur in normal operation, since it takes ioctls
or a watchdog timeout to do it. Maybe some ioctls other than a full
down/up can cause problems... bge_init() is called for the following
ioctls:
- mtu changes
- some near down/up (possibly only these)
Suspend/resume and of course detach/attach do much the same things as
down/up.

BTW, I added some sysctls and found it annoying to have to do down/up
to make the sysctls take effect. Sysctls in several other NIC drivers
require the same, since doing a full reinitialization is easiest.
Since I am tuning using sysctls, I got used to doing down/up too much.

Similarly for the mtu ioctl. I think a full reinitialization is used
for mtu changes mainly in cases where the change switches on/off support
for jumbo buffers. Then there is a lot of buffer reallocation to be
done, and interfaces have to be stopped to ensure that the buffers
being deallocated are not in use, etc.

Bruce

From owner-freebsd-amd64@FreeBSD.ORG Tue Jan 9 11:57:34 2007
Return-Path:
X-Original-To: freebsd-amd64@freebsd.org
Delivered-To: freebsd-amd64@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8952F16A40F for ; Tue, 9 Jan 2007 11:57:34 +0000 (UTC) (envelope-from caiusthegreat@cashette.com)
Received: from rtr.cashette.com (rtr.cashette.com [216.218.254.132]) by mx1.freebsd.org (Postfix) with SMTP id 61ECE13C44B for ; Tue, 9 Jan 2007 11:57:34 +0000 (UTC) (envelope-from caiusthegreat@cashette.com)
Received: CashetteMail 9 Jan 2007 11:30:54 -0000
Received: from [139.130.36.190] by mail.cashette.com via HTTP; Tue Jan 09 03:35:16 PST 2007
Message-ID: <32864251.1168342516734.JavaMail.Administrator@appsrv>
Date: Tue, 9 Jan 2007 03:35:16 -0800 (PST)
From: Caius Theophrastus
To: freebsd-amd64@freebsd.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailer: CashetteMail
Subject: Re: freebsd-amd64 Digest, Vol 187, Issue 1
X-BeenThere: freebsd-amd64@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Porting FreeBSD to the AMD64 platform
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 09 Jan 2007 11:57:34 -0000

--- freebsd-amd64-request@freebsd.org wrote:
> Send freebsd-amd64 mailing list submissions to
> freebsd-amd64@freebsd.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> http://lists.freebsd.org/mailman/listinfo/freebsd-amd64
> or, via email, send a message with subject or body 'help' to
>
freebsd-amd64-request@freebsd.org > > You can reach the person managing the list at > freebsd-amd64-owner@freebsd.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of freebsd-amd64 digest..." --- freebsd-amd64-request@freebsd.org wrote: > Today's Topics: > > 1. Panic in 6.2-PRERELEASE with bge on amd64 (Sven Willenberger) > 2. Re: Panic in 6.2-PRERELEASE with bge on amd64 (Bruce Evans) > 3. Current problem reports assigned to you (FreeBSD bugmaster) --- freebsd-amd64-request@freebsd.org wrote: > _______________________________________________ > freebsd-amd64@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-amd64 > To unsubscribe, send any mail to "freebsd-amd64-unsubscribe@freebsd.org" __________________________ Stops spam 100% for your email accounts or you get paid. http://www.cashette.com From owner-freebsd-amd64@FreeBSD.ORG Tue Jan 9 14:30:55 2007 Return-Path: X-Original-To: freebsd-amd64@FreeBSD.org Delivered-To: freebsd-amd64@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id E3F7E16A4AB; Tue, 9 Jan 2007 14:30:55 +0000 (UTC) (envelope-from sven@dmv.com) Received: from smtp-gw-cl-c.dmv.com (smtp-gw-cl-c.dmv.com [216.240.97.41]) by mx1.freebsd.org (Postfix) with ESMTP id B545213C469; Tue, 9 Jan 2007 14:30:55 +0000 (UTC) (envelope-from sven@dmv.com) Received: from mail-gw-cl-a.dmv.com (mail-gw-cl-a.dmv.com [216.240.97.38]) by smtp-gw-cl-c.dmv.com (8.12.10/8.12.10) with ESMTP id l09EUs0F006724; Tue, 9 Jan 2007 09:30:54 -0500 (EST) (envelope-from sven@dmv.com) Received: from lanshark.dmv.com (lanshark.dmv.com [216.240.97.46]) by mail-gw-cl-a.dmv.com (8.12.9/8.12.9) with ESMTP id l09EUrXi052697; Tue, 9 Jan 2007 09:30:54 -0500 (EST) (envelope-from sven@dmv.com) From: Sven Willenberger To: Bruce Evans In-Reply-To: <20070109124826.M79616@delplex.bde.org> References: <1168211205.22629.6.camel@lanshark.dmv.com> 
<20070108154433.C75042@delplex.bde.org> <1168271935.23549.10.camel@lanshark.dmv.com> <20070109124826.M79616@delplex.bde.org> Content-Type: text/plain Date: Tue, 09 Jan 2007 09:37:05 -0500 Message-Id: <1168353425.29047.8.camel@lanshark.dmv.com> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.39 X-Scanned-By: MIMEDefang 2.48 on 216.240.97.38 Cc: stable@FreeBSD.org, freebsd-amd64@FreeBSD.org Subject: Re: Panic in 6.2-PRERELEASE with bge on amd64 X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Jan 2007 14:30:56 -0000 On Tue, 2007-01-09 at 12:50 +1100, Bruce Evans wrote: > On Mon, 8 Jan 2007, Sven Willenberger wrote: > > > On Mon, 2007-01-08 at 16:06 +1100, Bruce Evans wrote: > >> On Sun, 7 Jan 2007, Sven Willenberger wrote: > > >>> The short and dirty of the dump: > >>> ... > >>> --- trap 0xc, rip = 0xffffffff801d5f17, rsp = 0xffffffffb371ab50, rbp = 0xffffffffb371aba0 --- > >>> bge_rxeof() at bge_rxeof+0x3b7 > >> > >> What is the instruction here? > > > > I will do my best to ferret out the information you need. For the > > bge_rxeof() at bge_rxeof+0x3b7 line, the instruction is: > > > > 0xffffffff801d5f17 : mov %r15,0x28(%r14) > > ... > >> Looks like a null pointer panic anyway. I guess the instruction is > >> movl to/from 0x28(%reg) where %reg is a null pointer. > >> > > > > from the above lines, apparently %r14 is null then. > > Yes. It's a bit suprising that the access is a write. > > >>> ... > >>> #8 0xffffffff801db818 in bge_intr (xsc=0x0) at /usr/src/sys/dev/bge/if_bge.c:2707 > >> > >> What is the statement here? It presumably follow a null pointer and only > >> the exprssion for the pointer is interesting. xsc is already null but > >> that is probably a bug in gdb, or the result of excessive optimization. 
> >> Compiling kernels with -O2 has little effect except to break debugging. > > > > the block of code from if_bge.c: > > > > 2705 if (ifp->if_drv_flags & IFF_DRV_RUNNING) { > > 2706 /* Check RX return ring producer/consumer. */ > > 2707 bge_rxeof(sc); > > 2708 > > 2709 /* Check TX ring producer/consumer. */ > > 2710 bge_txeof(sc); > > 2711 } > > Oops. I should have asked for the statment in bge_rxeof(). #7 0xffffffff801d5f17 in bge_rxeof (sc=0xffffffff8836b000) at /usr/src/sys/dev/bge/if_bge.c:2528 2528 m->m_pkthdr.len = m->m_len = cur_rx->bge_len - ETHER_CRC_LEN; (where m is defined as: 2449 struct mbuf *m = NULL; ) > > > By default -O2 is passed to CC (I don't use any custom make flags other > > than and only define CPUTYPE in my /etc/make.conf). > > -O2 is unfortunately the default for COPTFLAGS for most arches in > sys/conf/kern.pre.mk. All of my machines and most FreeBSD cluster > machines override this default in /etc/make.conf. > > With the override overridden for RELENG_6 amd64, gcc inlines bge_rxeof(), > so your environment must be a little different to get even the above > ifo. I think gdb can show the correct line numbers but not the call > frames (since there is no call). ddb and the kernel stack trace can > only show the call frames for actual calls. > > With -O1, I couldn't find any instruction similar to the mov to the > null pointer + 28. 28 is a popular offset in mbufs If you have a suggestion for an /etc/make.conf line, I can recompile the kernel accordingly assuming it still panics or locks up after the change of interface noted below. > > > The short of it is that this interface sees pretty much non-stop traffic > > as this is a mailserver (final destination) and is constantly being > > delivered to (direct disk access) and mail being retrieved (remote > > machine(s) with nfs mounted mail spools. 
If a momentary down of the
> > interface is enough to completely panic the driver and then the kernel,
> > this hardly seems "robust" if, in fact, this is what is happening. So
> > the question arises as to what would be causing the down/up of the
> > interface; I could start looking at the cable, the switch it's connected
> > to and ... any other ideas? (I don't have watchdog enabled or anything
> > like that, for example).
>
> I don't think down/up can occur in normal operation, since it takes ioctls
> or a watchdog timeout to do it. Maybe some ioctls other than a full
> down/up can cause problems... bge_init() is called for the following
> ioctls:
> - mtu changes
> - some near down/up (possibly only these)
> Suspend/resume and of course detach/attach do much the same things as
> down/up.
>
> BTW, I added some sysctls and found it annoying to have to do down/up
> to make the sysctls take effect. Sysctls in several other NIC drivers
> require the same, since doing a full reinitialization is easiest.
> Since I am tuning using sysctls, I got used to doing down/up too much.
>
> Similarly for the mtu ioctl. I think a full reinitialization is used
> for mtu changes mainly in cases where the change switches on/off support
> for jumbo buffers. Then there is a lot of buffer reallocation to be
> done, and interfaces have to be stopped to ensure that the buffers
> being deallocated are not in use, etc.
>
> Bruce

As this was connected to a gigE switch with mtu left at 1500, I suppose
it is possible that some mtu discovery/change may have been happening on
the switch, but that seems a bit out in left field. For now I am using
the fxp interface connected to the same switch to see if the issue
continues (the change of interface was driven by a hard lockup yesterday
where I could not even type anything on the term).
Sven From owner-freebsd-amd64@FreeBSD.ORG Tue Jan 9 17:22:19 2007 Return-Path: X-Original-To: freebsd-amd64@freebsd.org Delivered-To: freebsd-amd64@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 13F9C16A407; Tue, 9 Jan 2007 17:22:19 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from server.baldwin.cx (66-23-211-162.clients.speedfactory.net [66.23.211.162]) by mx1.freebsd.org (Postfix) with ESMTP id B1DD313C468; Tue, 9 Jan 2007 17:22:18 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from localhost.corp.yahoo.com (john@localhost [127.0.0.1]) (authenticated bits=0) by server.baldwin.cx (8.13.6/8.13.6) with ESMTP id l09HM8sS077062; Tue, 9 Jan 2007 12:22:14 -0500 (EST) (envelope-from jhb@freebsd.org) From: John Baldwin To: Bruce Evans Date: Tue, 9 Jan 2007 11:51:16 -0500 User-Agent: KMail/1.9.1 References: <1168211205.22629.6.camel@lanshark.dmv.com> <20070108154433.C75042@delplex.bde.org> In-Reply-To: <20070108154433.C75042@delplex.bde.org> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200701091151.17166.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (server.baldwin.cx [127.0.0.1]); Tue, 09 Jan 2007 12:22:15 -0500 (EST) X-Virus-Scanned: ClamAV 0.88.3/2429/Tue Jan 9 09:23:53 2007 on server.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-4.4 required=4.2 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.1.3 X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on server.baldwin.cx Cc: rwatson@freebsd.org, freebsd-amd64@freebsd.org Subject: Re: Panic in 6.2-PRERELEASE with bge on amd64 X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Jan 
2007 17:22:19 -0000 On Monday 08 January 2007 00:06, Bruce Evans wrote: > On Sun, 7 Jan 2007, Sven Willenberger wrote: > > > I am starting a new thread on this as what I had assumed was a panic in > > nfsd turns out to be an issue with the bge driver. This is an amd64 box, > > dual processor (SMP kernel) that happens to be running nfsd. About every > > 3-5 days the kernel panics and I have finally managed to get a core > > dump. > > The system: FreeBSD 6.2-PRERELEASE #8: Tue Jan 2 10:57:39 EST 2007 > > Like most NIC drivers, bge unlocks and re-locks around its call to > ether_input() in its interrupt handler. This isn't very safe, and it > certainly causes panics for bge. I often see it panic when bringing > the interface down and up while input is arriving, on a non-SMP non-amd64 > (actually i386) non-6.x (actually -current) system. Bringing the > interface down is probably the worst case. It creates a null pointer > for bge_intr() to follow. Why do you feel that it is unsafe to drop the lock around if_input()? 
-- John Baldwin From owner-freebsd-amd64@FreeBSD.ORG Tue Jan 9 17:22:25 2007 Return-Path: X-Original-To: freebsd-amd64@freebsd.org Delivered-To: freebsd-amd64@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 0199116A5F5 for ; Tue, 9 Jan 2007 17:22:25 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from server.baldwin.cx (66-23-211-162.clients.speedfactory.net [66.23.211.162]) by mx1.freebsd.org (Postfix) with ESMTP id 9239913C45B for ; Tue, 9 Jan 2007 17:22:24 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from localhost.corp.yahoo.com (john@localhost [127.0.0.1]) (authenticated bits=0) by server.baldwin.cx (8.13.6/8.13.6) with ESMTP id l09HM8sR077062; Tue, 9 Jan 2007 12:22:09 -0500 (EST) (envelope-from jhb@freebsd.org) From: John Baldwin To: freebsd-amd64@freebsd.org Date: Tue, 9 Jan 2007 11:50:14 -0500 User-Agent: KMail/1.9.1 References: <1168211205.22629.6.camel@lanshark.dmv.com> <20070109124826.M79616@delplex.bde.org> <1168353425.29047.8.camel@lanshark.dmv.com> In-Reply-To: <1168353425.29047.8.camel@lanshark.dmv.com> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-15" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200701091150.15274.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (server.baldwin.cx [127.0.0.1]); Tue, 09 Jan 2007 12:22:12 -0500 (EST) X-Virus-Scanned: ClamAV 0.88.3/2429/Tue Jan 9 09:23:53 2007 on server.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-4.4 required=4.2 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.1.3 X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on server.baldwin.cx Cc: stable@freebsd.org Subject: Re: Panic in 6.2-PRERELEASE with bge on amd64 X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: 
List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Jan 2007 17:22:25 -0000 On Tuesday 09 January 2007 09:37, Sven Willenberger wrote: > On Tue, 2007-01-09 at 12:50 +1100, Bruce Evans wrote: > > On Mon, 8 Jan 2007, Sven Willenberger wrote: > > > > > On Mon, 2007-01-08 at 16:06 +1100, Bruce Evans wrote: > > >> On Sun, 7 Jan 2007, Sven Willenberger wrote: > > > > >>> The short and dirty of the dump: > > >>> ... > > >>> --- trap 0xc, rip = 0xffffffff801d5f17, rsp = 0xffffffffb371ab50, rbp = 0xffffffffb371aba0 --- > > >>> bge_rxeof() at bge_rxeof+0x3b7 > > >> > > >> What is the instruction here? > > > > > > I will do my best to ferret out the information you need. For the > > > bge_rxeof() at bge_rxeof+0x3b7 line, the instruction is: > > > > > > 0xffffffff801d5f17 : mov %r15,0x28(%r14) > > > ... > > >> Looks like a null pointer panic anyway. I guess the instruction is > > >> movl to/from 0x28(%reg) where %reg is a null pointer. > > >> > > > > > > from the above lines, apparently %r14 is null then. > > > > Yes. It's a bit suprising that the access is a write. > > > > >>> ... > > >>> #8 0xffffffff801db818 in bge_intr (xsc=0x0) at /usr/src/sys/dev/bge/if_bge.c:2707 > > >> > > >> What is the statement here? It presumably follow a null pointer and only > > >> the exprssion for the pointer is interesting. xsc is already null but > > >> that is probably a bug in gdb, or the result of excessive optimization. > > >> Compiling kernels with -O2 has little effect except to break debugging. > > > > > > the block of code from if_bge.c: > > > > > > 2705 if (ifp->if_drv_flags & IFF_DRV_RUNNING) { > > > 2706 /* Check RX return ring producer/consumer. */ > > > 2707 bge_rxeof(sc); > > > 2708 > > > 2709 /* Check TX ring producer/consumer. */ > > > 2710 bge_txeof(sc); > > > 2711 } > > > > Oops. I should have asked for the statment in bge_rxeof(). 
> > #7 0xffffffff801d5f17 in bge_rxeof (sc=0xffffffff8836b000) at /usr/src/sys/dev/bge/if_bge.c:2528 > 2528 m->m_pkthdr.len = m->m_len = cur_rx->bge_len - ETHER_CRC_LEN; > > (where m is defined as: > 2449 struct mbuf *m = NULL; > ) It's assigned earlier in between those two places. Can you 'p rxidx' as well as 'p sc->bge_cdata.bge_rx_std_chain[rxidx]' and 'p sc->bge_cdata.bge_rx_jumbo_chain[rxidx]'? Also, are you using jumbo frames at all? -- John Baldwin From owner-freebsd-amd64@FreeBSD.ORG Tue Jan 9 17:47:19 2007 Return-Path: X-Original-To: freebsd-amd64@freebsd.org Delivered-To: freebsd-amd64@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9BDF816A412; Tue, 9 Jan 2007 17:47:19 +0000 (UTC) (envelope-from sven@dmv.com) Received: from smtp-gw-cl-d.dmv.com (smtp-gw-cl-d.dmv.com [216.240.97.42]) by mx1.freebsd.org (Postfix) with ESMTP id 5722813C458; Tue, 9 Jan 2007 17:47:19 +0000 (UTC) (envelope-from sven@dmv.com) Received: from mail-gw-cl-a.dmv.com (mail-gw-cl-a.dmv.com [216.240.97.38]) by smtp-gw-cl-d.dmv.com (8.12.10/8.12.10) with ESMTP id l09HlHJX098815; Tue, 9 Jan 2007 12:47:17 -0500 (EST) (envelope-from sven@dmv.com) Received: from lanshark.dmv.com (lanshark.dmv.com [216.240.97.46]) by mail-gw-cl-a.dmv.com (8.12.9/8.12.9) with ESMTP id l09HlFXi060226; Tue, 9 Jan 2007 12:47:16 -0500 (EST) (envelope-from sven@dmv.com) From: Sven Willenberger To: John Baldwin In-Reply-To: <200701091150.15274.jhb@freebsd.org> References: <1168211205.22629.6.camel@lanshark.dmv.com> <20070109124826.M79616@delplex.bde.org> <1168353425.29047.8.camel@lanshark.dmv.com> <200701091150.15274.jhb@freebsd.org> Content-Type: text/plain Date: Tue, 09 Jan 2007 12:53:29 -0500 Message-Id: <1168365209.29047.23.camel@lanshark.dmv.com> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.48 on 216.240.97.42 X-Scanned-By: MIMEDefang 2.48 on 216.240.97.38 Cc: stable@freebsd.org, 
freebsd-amd64@freebsd.org Subject: Re: Panic in 6.2-PRERELEASE with bge on amd64 X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Jan 2007 17:47:19 -0000 On Tue, 2007-01-09 at 11:50 -0500, John Baldwin wrote: > On Tuesday 09 January 2007 09:37, Sven Willenberger wrote: > > On Tue, 2007-01-09 at 12:50 +1100, Bruce Evans wrote: > > > On Mon, 8 Jan 2007, Sven Willenberger wrote: > > > > > > > On Mon, 2007-01-08 at 16:06 +1100, Bruce Evans wrote: > > > >> On Sun, 7 Jan 2007, Sven Willenberger wrote: > > > > > > >>> The short and dirty of the dump: > > > >>> ... > > > >>> --- trap 0xc, rip = 0xffffffff801d5f17, rsp = 0xffffffffb371ab50, rbp > = 0xffffffffb371aba0 --- > > > >>> bge_rxeof() at bge_rxeof+0x3b7 > > > >> > > > >> What is the instruction here? > > > > > > > > I will do my best to ferret out the information you need. For the > > > > bge_rxeof() at bge_rxeof+0x3b7 line, the instruction is: > > > > > > > > 0xffffffff801d5f17 : mov %r15,0x28(%r14) > > > > ... > > > >> Looks like a null pointer panic anyway. I guess the instruction is > > > >> movl to/from 0x28(%reg) where %reg is a null pointer. > > > >> > > > > > > > > from the above lines, apparently %r14 is null then. > > > > > > Yes. It's a bit suprising that the access is a write. > > > > > > >>> ... > > > >>> #8 0xffffffff801db818 in bge_intr (xsc=0x0) > at /usr/src/sys/dev/bge/if_bge.c:2707 > > > >> > > > >> What is the statement here? It presumably follow a null pointer and > only > > > >> the exprssion for the pointer is interesting. xsc is already null but > > > >> that is probably a bug in gdb, or the result of excessive optimization. > > > >> Compiling kernels with -O2 has little effect except to break debugging. 
> > > > > > > > the block of code from if_bge.c: > > > > > > > > 2705 if (ifp->if_drv_flags & IFF_DRV_RUNNING) { > > > > 2706 /* Check RX return ring producer/consumer. */ > > > > 2707 bge_rxeof(sc); > > > > 2708 > > > > 2709 /* Check TX ring producer/consumer. */ > > > > 2710 bge_txeof(sc); > > > > 2711 } > > > > > > Oops. I should have asked for the statment in bge_rxeof(). > > > > #7 0xffffffff801d5f17 in bge_rxeof (sc=0xffffffff8836b000) > at /usr/src/sys/dev/bge/if_bge.c:2528 > > 2528 m->m_pkthdr.len = m->m_len = cur_rx->bge_len - > ETHER_CRC_LEN; > > > > (where m is defined as: > > 2449 struct mbuf *m = NULL; > > ) > > It's assigned earlier in between those two places. Can you 'p rxidx' as well > as 'p sc->bge_cdata.bge_rx_std_chain[rxidx]' and 'p > sc->bge_cdata.bge_rx_jumbo_chain[rxidx]'? Also, are you using jumbo frames > at all? > (kgdb) p rxidx $1 = 499 (kgdb) p sc->bge_cdata.bge_rx_std_chain[rxidx] $2 = (struct mbuf *) 0xffffff0097a27900 (kgdb) p sc->bge_cdata.bge_rx_jumbo_chain[rxidx] $3 = (struct mbuf *) 0x0 And no, I am not using jumbo frames: bge0: flags=8843 mtu 1500 options=1b Sven From owner-freebsd-amd64@FreeBSD.ORG Tue Jan 9 19:46:41 2007 Return-Path: X-Original-To: freebsd-amd64@freebsd.org Delivered-To: freebsd-amd64@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 0C5E216A509; Tue, 9 Jan 2007 19:46:41 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from server.baldwin.cx (66-23-211-162.clients.speedfactory.net [66.23.211.162]) by mx1.freebsd.org (Postfix) with ESMTP id 93EA213C465; Tue, 9 Jan 2007 19:46:40 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from localhost.corp.yahoo.com (john@localhost [127.0.0.1]) (authenticated bits=0) by server.baldwin.cx (8.13.6/8.13.6) with ESMTP id l09JkQBm077947; Tue, 9 Jan 2007 14:46:27 -0500 (EST) (envelope-from jhb@freebsd.org) From: John Baldwin To: Sven Willenberger Date: Tue, 9 Jan 2007 14:09:29 -0500 User-Agent: 
KMail/1.9.1 References: <1168211205.22629.6.camel@lanshark.dmv.com> <200701091150.15274.jhb@freebsd.org> <1168365209.29047.23.camel@lanshark.dmv.com> In-Reply-To: <1168365209.29047.23.camel@lanshark.dmv.com> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-15" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200701091409.29828.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (server.baldwin.cx [127.0.0.1]); Tue, 09 Jan 2007 14:46:29 -0500 (EST) X-Virus-Scanned: ClamAV 0.88.3/2430/Tue Jan 9 12:35:51 2007 on server.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-4.4 required=4.2 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.1.3 X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on server.baldwin.cx Cc: stable@freebsd.org, freebsd-amd64@freebsd.org Subject: Re: Panic in 6.2-PRERELEASE with bge on amd64 X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Jan 2007 19:46:41 -0000 On Tuesday 09 January 2007 12:53, Sven Willenberger wrote: > On Tue, 2007-01-09 at 11:50 -0500, John Baldwin wrote: > > On Tuesday 09 January 2007 09:37, Sven Willenberger wrote: > > > On Tue, 2007-01-09 at 12:50 +1100, Bruce Evans wrote: > > > > On Mon, 8 Jan 2007, Sven Willenberger wrote: > > > > > > > > > On Mon, 2007-01-08 at 16:06 +1100, Bruce Evans wrote: > > > > >> On Sun, 7 Jan 2007, Sven Willenberger wrote: > > > > > > > > >>> The short and dirty of the dump: > > > > >>> ... > > > > >>> --- trap 0xc, rip = 0xffffffff801d5f17, rsp = 0xffffffffb371ab50, rbp > > = 0xffffffffb371aba0 --- > > > > >>> bge_rxeof() at bge_rxeof+0x3b7 > > > > >> > > > > >> What is the instruction here? > > > > > > > > > > I will do my best to ferret out the information you need. 
For the > > > > > bge_rxeof() at bge_rxeof+0x3b7 line, the instruction is: > > > > > > > > > > 0xffffffff801d5f17 : mov %r15,0x28(%r14) > > > > > ... > > > > >> Looks like a null pointer panic anyway. I guess the instruction is > > > > >> movl to/from 0x28(%reg) where %reg is a null pointer. > > > > >> > > > > > > > > > > from the above lines, apparently %r14 is null then. > > > > > > > > Yes. It's a bit suprising that the access is a write. > > > > > > > > >>> ... > > > > >>> #8 0xffffffff801db818 in bge_intr (xsc=0x0) > > at /usr/src/sys/dev/bge/if_bge.c:2707 > > > > >> > > > > >> What is the statement here? It presumably follow a null pointer and > > only > > > > >> the exprssion for the pointer is interesting. xsc is already null but > > > > >> that is probably a bug in gdb, or the result of excessive optimization. > > > > >> Compiling kernels with -O2 has little effect except to break debugging. > > > > > > > > > > the block of code from if_bge.c: > > > > > > > > > > 2705 if (ifp->if_drv_flags & IFF_DRV_RUNNING) { > > > > > 2706 /* Check RX return ring producer/consumer. */ > > > > > 2707 bge_rxeof(sc); > > > > > 2708 > > > > > 2709 /* Check TX ring producer/consumer. */ > > > > > 2710 bge_txeof(sc); > > > > > 2711 } > > > > > > > > Oops. I should have asked for the statment in bge_rxeof(). > > > > > > #7 0xffffffff801d5f17 in bge_rxeof (sc=0xffffffff8836b000) > > at /usr/src/sys/dev/bge/if_bge.c:2528 > > > 2528 m->m_pkthdr.len = m->m_len = cur_rx->bge_len - > > ETHER_CRC_LEN; > > > > > > (where m is defined as: > > > 2449 struct mbuf *m = NULL; > > > ) > > > > It's assigned earlier in between those two places. Can you 'p rxidx' as well > > as 'p sc->bge_cdata.bge_rx_std_chain[rxidx]' and 'p > > sc->bge_cdata.bge_rx_jumbo_chain[rxidx]'? Also, are you using jumbo frames > > at all? 
> > > > (kgdb) p rxidx > $1 = 499 > (kgdb) p sc->bge_cdata.bge_rx_std_chain[rxidx] > $2 = (struct mbuf *) 0xffffff0097a27900 > (kgdb) p sc->bge_cdata.bge_rx_jumbo_chain[rxidx] > $3 = (struct mbuf *) 0x0 > > And no, I am not using jumbo frames: > bge0: flags=8843 mtu 1500 > options=1b Did you do a 'p m' to verify that m is NULL? If you can reproduce this, I'd add some KASSERT's where it fetches the mbuf out of the descriptor data to see if m is NULL. -- John Baldwin From owner-freebsd-amd64@FreeBSD.ORG Tue Jan 9 20:30:17 2007 Return-Path: X-Original-To: freebsd-amd64@hub.freebsd.org Delivered-To: freebsd-amd64@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id E97AA16A40F for ; Tue, 9 Jan 2007 20:30:17 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [69.147.83.40]) by mx1.freebsd.org (Postfix) with ESMTP id BD68013C465 for ; Tue, 9 Jan 2007 20:30:17 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1]) by freefall.freebsd.org (8.13.4/8.13.4) with ESMTP id l09KUHvw079569 for ; Tue, 9 Jan 2007 20:30:17 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.13.4/8.13.4/Submit) id l09KUHGl079564; Tue, 9 Jan 2007 20:30:17 GMT (envelope-from gnats) Resent-Date: Tue, 9 Jan 2007 20:30:17 GMT Resent-Message-Id: <200701092030.l09KUHGl079564@freefall.freebsd.org> Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer) Resent-To: freebsd-amd64@FreeBSD.org Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Richard Moeller Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 6C10E16A412 for ; Tue, 9 Jan 2007 20:29:52 +0000 (UTC) (envelope-from nobody@FreeBSD.org) Received: from www.freebsd.org (www.freebsd.org [69.147.83.33]) by mx1.freebsd.org (Postfix) with ESMTP id 4686B13C44B for ; 
Tue, 9 Jan 2007 20:29:52 +0000 (UTC) (envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1]) by www.freebsd.org (8.13.1/8.13.1) with ESMTP id l09KTq4k018378 for ; Tue, 9 Jan 2007 20:29:52 GMT (envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost) by www.freebsd.org (8.13.1/8.13.1/Submit) id l09KTq1n018377; Tue, 9 Jan 2007 20:29:52 GMT (envelope-from nobody)
Message-Id: <200701092029.l09KTq1n018377@www.freebsd.org>
Date: Tue, 9 Jan 2007 20:29:52 GMT
From: Richard Moeller
To: freebsd-gnats-submit@FreeBSD.org
X-Send-Pr-Version: www-3.0
Cc:
Subject: amd64/107716: Addition for "FreeBSD/amd64 Project -- motherboards "
X-BeenThere: freebsd-amd64@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Porting FreeBSD to the AMD64 platform
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 09 Jan 2007 20:30:18 -0000

>Number: 107716
>Category: amd64
>Synopsis: Addition for "FreeBSD/amd64 Project -- motherboards "
>Confidential: no
>Severity: non-critical
>Priority: low
>Responsible: freebsd-amd64
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: update
>Submitter-Id: current-users
>Arrival-Date: Tue Jan 09 20:30:17 GMT 2007
>Closed-Date:
>Last-Modified:
>Originator: Richard Moeller
>Release: 6.1 - RELEASE
>Organization:
>Environment:
>Description:
Here is an updated HW/Motherboard for the AMD64 listing:
Manufacturer: HP
Model: Proliant DL360 G5
Chipset/Socket: Intel 5000p / Socket LGA771
Tested FreeBSD version: 6.1-RELEASE
Notes: functional - requires bce change of BCE_MAX_SEGMENTS for issue during certain network conditions.
ubench -c = 1212897 with 2 x Intel 5160 procs >How-To-Repeat: >Fix: >Release-Note: >Audit-Trail: >Unformatted: From owner-freebsd-amd64@FreeBSD.ORG Tue Jan 9 21:12:28 2007 Return-Path: X-Original-To: freebsd-amd64@freebsd.org Delivered-To: freebsd-amd64@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id ECCFC16A565; Tue, 9 Jan 2007 21:12:28 +0000 (UTC) (envelope-from sven@dmv.com) Received: from smtp-gw-cl-c.dmv.com (smtp-gw-cl-c.dmv.com [216.240.97.41]) by mx1.freebsd.org (Postfix) with ESMTP id A499213C442; Tue, 9 Jan 2007 21:12:28 +0000 (UTC) (envelope-from sven@dmv.com) Received: from mail-gw-cl-b.dmv.com (mail-gw-cl-b.dmv.com [216.240.97.39]) by smtp-gw-cl-c.dmv.com (8.12.10/8.12.10) with ESMTP id l09LCR0F022382; Tue, 9 Jan 2007 16:12:27 -0500 (EST) (envelope-from sven@dmv.com) Received: from lanshark.dmv.com (lanshark.dmv.com [216.240.97.46]) by mail-gw-cl-b.dmv.com (8.12.9/8.12.9) with ESMTP id l09LCP8s051708; Tue, 9 Jan 2007 16:12:25 -0500 (EST) (envelope-from sven@dmv.com) From: Sven Willenberger To: John Baldwin In-Reply-To: <200701091409.29828.jhb@freebsd.org> References: <1168211205.22629.6.camel@lanshark.dmv.com> <200701091150.15274.jhb@freebsd.org> <1168365209.29047.23.camel@lanshark.dmv.com> <200701091409.29828.jhb@freebsd.org> Content-Type: text/plain Date: Tue, 09 Jan 2007 16:18:38 -0500 Message-Id: <1168377518.29047.27.camel@lanshark.dmv.com> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.39 X-Scanned-By: MIMEDefang 2.48 on 216.240.97.39 Cc: stable@freebsd.org, freebsd-amd64@freebsd.org Subject: Re: Panic in 6.2-PRERELEASE with bge on amd64 X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Jan 2007 21:12:29 -0000 On Tue, 2007-01-09 at 14:09 -0500, 
John Baldwin wrote: > On Tuesday 09 January 2007 12:53, Sven Willenberger wrote: > > On Tue, 2007-01-09 at 11:50 -0500, John Baldwin wrote: > > > On Tuesday 09 January 2007 09:37, Sven Willenberger wrote: > > > > On Tue, 2007-01-09 at 12:50 +1100, Bruce Evans wrote: > > > > > On Mon, 8 Jan 2007, Sven Willenberger wrote: > > > > > > > > > > > On Mon, 2007-01-08 at 16:06 +1100, Bruce Evans wrote: > > > > > >> On Sun, 7 Jan 2007, Sven Willenberger wrote: > > > > > > > > > > >>> The short and dirty of the dump: > > > > > >>> ... > > > > > >>> --- trap 0xc, rip = 0xffffffff801d5f17, rsp = 0xffffffffb371ab50, > rbp > > > = 0xffffffffb371aba0 --- > > > > > >>> bge_rxeof() at bge_rxeof+0x3b7 > > > > > >> > > > > > >> What is the instruction here? > > > > > > > > > > > > I will do my best to ferret out the information you need. For the > > > > > > bge_rxeof() at bge_rxeof+0x3b7 line, the instruction is: > > > > > > > > > > > > 0xffffffff801d5f17 : mov %r15,0x28(%r14) > > > > > > ... > > > > > >> Looks like a null pointer panic anyway. I guess the instruction is > > > > > >> movl to/from 0x28(%reg) where %reg is a null pointer. > > > > > >> > > > > > > > > > > > > from the above lines, apparently %r14 is null then. > > > > > > > > > > Yes. It's a bit suprising that the access is a write. > > > > > > > > > > >>> ... > > > > > >>> #8 0xffffffff801db818 in bge_intr (xsc=0x0) > > > at /usr/src/sys/dev/bge/if_bge.c:2707 > > > > > >> > > > > > >> What is the statement here? It presumably follow a null pointer > and > > > only > > > > > >> the exprssion for the pointer is interesting. xsc is already null > but > > > > > >> that is probably a bug in gdb, or the result of excessive > optimization. > > > > > >> Compiling kernels with -O2 has little effect except to break > debugging. > > > > > > > > > > > > the block of code from if_bge.c: > > > > > > > > > > > > 2705 if (ifp->if_drv_flags & IFF_DRV_RUNNING) { > > > > > > 2706 /* Check RX return ring producer/consumer. 
*/
> > > > > > 2707         bge_rxeof(sc);
> > > > > > 2708
> > > > > > 2709         /* Check TX ring producer/consumer. */
> > > > > > 2710         bge_txeof(sc);
> > > > > > 2711     }
> > > > >
> > > > > Oops. I should have asked for the statement in bge_rxeof().
> > > >
> > > > #7 0xffffffff801d5f17 in bge_rxeof (sc=0xffffffff8836b000) at /usr/src/sys/dev/bge/if_bge.c:2528
> > > > 2528     m->m_pkthdr.len = m->m_len = cur_rx->bge_len - ETHER_CRC_LEN;
> > > >
> > > > (where m is defined as:
> > > > 2449     struct mbuf *m = NULL;
> > > > )
> > >
> > > It's assigned earlier in between those two places. Can you 'p rxidx' as well
> > > as 'p sc->bge_cdata.bge_rx_std_chain[rxidx]' and 'p
> > > sc->bge_cdata.bge_rx_jumbo_chain[rxidx]'? Also, are you using jumbo frames
> > > at all?
> > >
> >
> > (kgdb) p rxidx
> > $1 = 499
> > (kgdb) p sc->bge_cdata.bge_rx_std_chain[rxidx]
> > $2 = (struct mbuf *) 0xffffff0097a27900
> > (kgdb) p sc->bge_cdata.bge_rx_jumbo_chain[rxidx]
> > $3 = (struct mbuf *) 0x0
> >
> > And no, I am not using jumbo frames:
> > bge0: flags=8843 mtu 1500
> > options=1b
>
> Did you do a 'p m' to verify that m is NULL? If you can reproduce this, I'd
> add some KASSERT's where it fetches the mbuf out of the descriptor data to
> see if m is NULL.
>

At this spot, m is null:

(kgdb) p m
$3 = (struct mbuf *) 0x0

As far as adding some KASSERT's ... you have gone beyond my rudimentary
knowledge here as far as application goes.
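
For readers unfamiliar with KASSERT, here is a minimal userland sketch of the kind of check being suggested. It is an illustration only, not code from if_bge.c. In the FreeBSD kernel, KASSERT(exp, (fmt, args...)) panics with the formatted message when the expression is false and the kernel is built with "options INVARIANTS". The stand-in macro below prints and counts instead of panicking, and `fetch_rx_mbuf`, `rx_std_chain`, `RING_CNT` and the toy `struct mbuf` are hypothetical names chosen for the example.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Userland stand-in for the kernel's KASSERT(exp, (fmt, args...)).
 * A real kernel built with INVARIANTS would panic here; this version
 * prints the message and counts the failure so the flow can be observed.
 */
static int kassert_failures;

#define KASSERT(exp, msg) do {          \
    if (!(exp)) {                       \
        printf msg;                     \
        printf("\n");                   \
        kassert_failures++;             \
    }                                   \
} while (0)

#define RING_CNT 512                    /* hypothetical ring size */

struct mbuf { int m_len; };             /* toy stand-in, not the kernel struct */

/* Toy stand-in for sc->bge_cdata.bge_rx_std_chain[]. */
static struct mbuf *rx_std_chain[RING_CNT];

/*
 * Fetch and clear the mbuf stored at rxidx, asserting at the point of
 * the fetch that the slot was populated.
 */
static struct mbuf *
fetch_rx_mbuf(int rxidx)
{
    struct mbuf *m;

    m = rx_std_chain[rxidx];
    KASSERT(m != NULL, ("%s: NULL mbuf at rxidx %d", __func__, rxidx));
    rx_std_chain[rxidx] = NULL;
    return (m);
}
```

In a driver the same KASSERT line would sit directly after the assignment to m, so that a kernel built with INVARIANTS stops at the fetch rather than at the later dereference.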
From owner-freebsd-amd64@FreeBSD.ORG Wed Jan 10 02:28:41 2007 Return-Path: X-Original-To: freebsd-amd64@freebsd.org Delivered-To: freebsd-amd64@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 3C20916A415; Wed, 10 Jan 2007 02:28:41 +0000 (UTC) (envelope-from bde@zeta.org.au) Received: from mailout1.pacific.net.au (mailout1-3.pacific.net.au [61.8.2.210]) by mx1.freebsd.org (Postfix) with ESMTP id D265313C4A9; Wed, 10 Jan 2007 02:28:40 +0000 (UTC) (envelope-from bde@zeta.org.au) Received: from mailproxy1.pacific.net.au (mailproxy1.pacific.net.au [61.8.2.162]) by mailout1.pacific.net.au (Postfix) with ESMTP id 159EF5A3EBC; Wed, 10 Jan 2007 13:28:36 +1100 (EST) Received: from besplex.bde.org (katana.zip.com.au [61.8.7.246]) by mailproxy1.pacific.net.au (Postfix) with ESMTP id E21A38C29; Wed, 10 Jan 2007 13:28:33 +1100 (EST) Date: Wed, 10 Jan 2007 13:28:33 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: John Baldwin In-Reply-To: <200701091151.17166.jhb@freebsd.org> Message-ID: <20070110123340.E16378@besplex.bde.org> References: <1168211205.22629.6.camel@lanshark.dmv.com> <20070108154433.C75042@delplex.bde.org> <200701091151.17166.jhb@freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: rwatson@freebsd.org, freebsd-amd64@freebsd.org Subject: Re: Panic in 6.2-PRERELEASE with bge on amd64 X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Jan 2007 02:28:41 -0000 On Tue, 9 Jan 2007, John Baldwin wrote: > On Monday 08 January 2007 00:06, Bruce Evans wrote: >> Like most NIC drivers, bge unlocks and re-locks around its call to >> ether_input() in its interrupt handler. This isn't very safe, and it >> certainly causes panics for bge. 
I often see it panic when bringing
>> the interface down and up while input is arriving, on a non-SMP non-amd64
>> (actually i386) non-6.x (actually -current) system. Bringing the
>> interface down is probably the worst case. It creates a null pointer
>> for bge_intr() to follow.
>
> Why do you feel that it is unsafe to drop the lock around if_input()?

General principles. After dropping a lock, it is necessary to check for any
relevant state changes after reacquiring the lock. This is not easy to do.
bge_rxeof() has no explicit checking. It apparently depends on implicit
checking and/or no relevant state changes occurring. This is even less
easy to do. It seems to be possible for at least the huge state change of
the device being uninitialized to occur. Then it is surprising for
bge_rxeof() to get as far as it does before crashing. The critical things
that it does are (for the non-jumbo case):

% while(sc->bge_rx_saved_considx !=
%     sc->bge_ldata.bge_status_block->bge_idx[0].bge_rx_prod_idx) {

We get back here immediately after reacquiring the lock (if the indexes
don't match).

% ...
%     cur_rx =
%         &sc->bge_ldata.bge_rx_return_ring[sc->bge_rx_saved_considx];
%
%     rxidx = cur_rx->bge_idx;

cur_rx must be non-NULL to get this far. I don't understand its lifetime.

% ...
%     if (cur_rx->bge_flags & BGE_RXBDFLAG_JUMBO_RING) {
% ...
%     } else {
%         BGE_INC(sc->bge_std, BGE_STD_RX_RING_CNT);
%         bus_dmamap_sync(sc->bge_cdata.bge_mtag,
%             sc->bge_cdata.bge_rx_std_dmamap[rxidx],
%             BUS_DMASYNC_POSTREAD);
%         bus_dmamap_unload(sc->bge_cdata.bge_mtag,
%             sc->bge_cdata.bge_rx_std_dmamap[rxidx]);

I don't understand the lifetime of the dma data structures.

%         m = sc->bge_cdata.bge_rx_std_chain[rxidx];
%         sc->bge_cdata.bge_rx_std_chain[rxidx] = NULL;

All entries in bge_rx_std_chain are NULL after uninitialization, so m is
certain to be NULL here after uninitialization. Note that this can happen
even if the uninitialization routines wait for the hardware to finish using
the mbufs before freeing them.
The hardware may have used the old sc->bge_cdata.bge_rx_std_chain[rxidx]
with no problems. In fact, after uninitialization we can only get here in
two ways:
- the hardware used the mbuf and updated the status block to indicate the
  change, and (even if the uninitialization stopped the hardware correctly)
  the uninitialization didn't modify the status block enough. The use and
  update may occur either before bge_rxeof() is called or concurrently.
- the uninitialization modified the status block in a way that caused the
  problem. The only setting that prevents the loop proceeding is
  sc->bge_rx_saved_considx ==
  sc->bge_ldata.bge_status_block->bge_idx[0].bge_rx_prod_idx, and for
  setting both these indexes to work the hardware must be stopped so that
  it doesn't update the producer index.
In fact, the second way doesn't seem to happen -- bge initializes all the
indexes to 0 in its init routine, but doesn't seem to touch them in its
uninit routine.

% ...
%     }
% ...
%     BGE_UNLOCK(sc);
%     (*ifp->if_input)(ifp, m);
%     BGE_LOCK(sc);
% }

Stopping the loop by setting the indexes equal is an efficient way to abort
bge_rxeof(), but the async status update makes it fragile.

I think the current problem is not a full uninit but more subtle.
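
The general principle in this message (re-check relevant state after reacquiring a dropped lock) can be illustrated with a small userland sketch. This is not the bge driver and not a proposed fix. Every name in it (`struct softc` as laid out here, `rxeof`, `sc_stop`, `input_then_stop`, `RING_CNT`) is a hypothetical stand-in, and the "lock" is a plain flag so the example stays single-threaded and self-contained.

```c
#include <stdbool.h>
#include <stddef.h>

#define RING_CNT 8                      /* hypothetical, small for the demo */

struct pkt { int len; };                /* toy stand-in for struct mbuf */

struct softc {
    bool locked;                        /* toy stand-in for the driver mutex */
    bool running;                       /* plays the role of IFF_DRV_RUNNING */
    unsigned cons, prod;                /* consumer and producer indexes */
    struct pkt *ring[RING_CNT];
    void (*input)(struct softc *, struct pkt *);
};

static void sc_lock(struct softc *sc)   { sc->locked = true; }
static void sc_unlock(struct softc *sc) { sc->locked = false; }

/* Simulated uninit: mark the device stopped and empty every ring slot. */
static void
sc_stop(struct softc *sc)
{
    sc->running = false;
    for (int i = 0; i < RING_CNT; i++)
        sc->ring[i] = NULL;
}

/* Demo input routine that simulates "interface down" racing with input. */
static void
input_then_stop(struct softc *sc, struct pkt *p)
{
    (void)p;
    sc_lock(sc);
    sc_stop(sc);
    sc_unlock(sc);
}

/*
 * Receive loop, called with the lock held. The lock is dropped around
 * the input callback, so after reacquiring it the loop condition
 * re-reads 'running' and both indexes, and each slot is checked for
 * NULL before use. Returns the number of packets handed up.
 */
static int
rxeof(struct softc *sc)
{
    int delivered = 0;

    while (sc->running && sc->cons != sc->prod) {
        struct pkt *p = sc->ring[sc->cons];

        sc->ring[sc->cons] = NULL;
        sc->cons = (sc->cons + 1) % RING_CNT;
        if (p == NULL)                  /* slot was emptied behind our back */
            continue;
        sc_unlock(sc);
        sc->input(sc, p);               /* state may change while unlocked */
        sc_lock(sc);
        delivered++;
    }
    return (delivered);
}
```

With three packets queued and `input_then_stop` as the callback, the loop hands up the first packet, observes `running == false` after retaking the lock, and exits without touching the emptied slots.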
Bruce From owner-freebsd-amd64@FreeBSD.ORG Wed Jan 10 02:42:49 2007 Return-Path: X-Original-To: freebsd-amd64@freebsd.org Delivered-To: freebsd-amd64@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 3B3FD16A517; Wed, 10 Jan 2007 02:42:49 +0000 (UTC) (envelope-from bde@zeta.org.au) Received: from mailout1.pacific.net.au (mailout1-3.pacific.net.au [61.8.2.210]) by mx1.freebsd.org (Postfix) with ESMTP id 0206913C4C9; Wed, 10 Jan 2007 02:42:49 +0000 (UTC) (envelope-from bde@zeta.org.au) Received: from mailproxy2.pacific.net.au (mailproxy2.pacific.net.au [61.8.2.163]) by mailout1.pacific.net.au (Postfix) with ESMTP id BC25E3282F3; Wed, 10 Jan 2007 13:42:47 +1100 (EST) Received: from besplex.bde.org (katana.zip.com.au [61.8.7.246]) by mailproxy2.pacific.net.au (Postfix) with ESMTP id 9B79E2741B; Wed, 10 Jan 2007 13:42:46 +1100 (EST) Date: Wed, 10 Jan 2007 13:42:45 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: John Baldwin In-Reply-To: <200701091150.15274.jhb@freebsd.org> Message-ID: <20070110132839.X16378@besplex.bde.org> References: <1168211205.22629.6.camel@lanshark.dmv.com> <20070109124826.M79616@delplex.bde.org> <1168353425.29047.8.camel@lanshark.dmv.com> <200701091150.15274.jhb@freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: stable@freebsd.org, freebsd-amd64@freebsd.org Subject: Re: Panic in 6.2-PRERELEASE with bge on amd64 X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Jan 2007 02:42:49 -0000 On Tue, 9 Jan 2007, John Baldwin wrote: > On Tuesday 09 January 2007 09:37, Sven Willenberger wrote: >> On Tue, 2007-01-09 at 12:50 +1100, Bruce Evans wrote: >>> Oops. I should have asked for the statment in bge_rxeof(). 
>> >> #7 0xffffffff801d5f17 in bge_rxeof (sc=0xffffffff8836b000) > at /usr/src/sys/dev/bge/if_bge.c:2528 >> 2528 m->m_pkthdr.len = m->m_len = cur_rx->bge_len - > ETHER_CRC_LEN; >> >> (where m is defined as: >> 2449 struct mbuf *m = NULL; >> ) > > It's assigned earlier in between those two places. Its initialization here is just a style bug. > Can you 'p rxidx' as well > as 'p sc->bge_cdata.bge_rx_std_chain[rxidx]' and 'p > sc->bge_cdata.bge_rx_jumbo_chain[rxidx]'? Also, are you using jumbo frames > at all? Also look at nearby chain entries (especially at (rxidx - 1) mod 512)). I think the previous 255 entries and the rxidx one should be non-NULL since we should have refilled them as we used them (so the one at rxidx is least interesting since we certainly just refilled it), and the next 256 entries should be NULL since we bogusly only use half of the entries. If the problem is uninitialization, then I expect all 512 entries except the one just refilled at rxidx to be NULL. Bruce From owner-freebsd-amd64@FreeBSD.ORG Wed Jan 10 05:00:37 2007 Return-Path: X-Original-To: amd64@freebsd.org Delivered-To: freebsd-amd64@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id C1CBC16A403 for ; Wed, 10 Jan 2007 05:00:37 +0000 (UTC) (envelope-from kgunders@teamcool.net) Received: from koyukuk.teamcool.net (koyukuk.teamcool.net [209.161.34.19]) by mx1.freebsd.org (Postfix) with ESMTP id 87D7B13C45A for ; Wed, 10 Jan 2007 05:00:37 +0000 (UTC) (envelope-from kgunders@teamcool.net) Received: from koyukuk.teamcool.net (localhost [127.0.0.1]) by koyukuk.teamcool.net (TeamCool Rocks) with ESMTP id 8601EF877; Tue, 9 Jan 2007 22:00:36 -0700 (MST) Received: from shredder.teamcool.net (unknown [192.168.1.58]) by koyukuk.teamcool.net (TeamCool Rocks) with ESMTP id 4A45AF85F; Tue, 9 Jan 2007 22:00:36 -0700 (MST) Date: Tue, 9 Jan 2007 22:00:35 -0700 From: Ken Gunderson To: cokane@cokane.org Message-Id: 
<20070109220035.8c6da366.kgunders@teamcool.net> In-Reply-To: <346a80220701041445m26df386p84778e9ce574f02b@mail.gmail.com> References: <20061226104440.5d52417b.kgunders@teamcool.net> <20061226153416.a2aacc13.kgunders@teamcool.net> <346a80220701041445m26df386p84778e9ce574f02b@mail.gmail.com> X-Mailer: Sylpheed version 2.2.10 (GTK+ 2.10.6; i386-portbld-freebsd6.1) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV using ClamSMTP Cc: amd64@freebsd.org Subject: Re: >32GB memory with Xeon ? X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Jan 2007 05:00:37 -0000 On Thu, 4 Jan 2007 15:45:40 -0700 "Coleman Kane" wrote: > On 12/26/06, Ken Gunderson wrote: > > > > On Wed, 27 Dec 2006 05:08:36 +0900 > > Hiroharu Tamaru wrote: > > > > > At Tue, 26 Dec 2006 10:44:40 -0700, Ken Gunderson wrote: > > > > > > > > On Tue, 26 Dec 2006 23:07:58 +0900 > > > > Hiroharu Tamaru wrote: > > > > > > > > > Hello list, > > > > > > > > > > Is there any limit on the amount of memory that can be used > > > > > with FreeBSD 6.x/amd64 ? I found that FreeBSD/ia64 > > > > > currently has 2GB limit. I was wondering how amd64 is like. > [snip] > Hey all, > > When I first looked into the amd64 platform I did research and made sure to > stay far from the nForce offerings that were around. I found a laptop with a > VIA chipset that has been pretty good to me, and was pretty decently > supported back in the early days of amd64 support in FreeBSD. > > From what I can tell, the nForce offerings are still troublesome but there > are many patches to work-around their problems. The VIA offerings seem to be > much more compatible. > > -- > Coleman Kane > Be interesting to see what shakes out on the chipset front for AMD now that they've bought ATI. 
I wonder if they'll go back to making their own like the good ol' 8131/8311 days... -- Best regards, Ken Gunderson "The course of history shows that as a government grows, liberty decreases." (Thomas Jefferson) From owner-freebsd-amd64@FreeBSD.ORG Wed Jan 10 20:39:06 2007 Return-Path: X-Original-To: freebsd-amd64@freebsd.org Delivered-To: freebsd-amd64@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8D0C916A40F; Wed, 10 Jan 2007 20:39:06 +0000 (UTC) (envelope-from sven@dmv.com) Received: from smtp-gw-cl-c.dmv.com (smtp-gw-cl-c.dmv.com [216.240.97.41]) by mx1.freebsd.org (Postfix) with ESMTP id D9E8613C44C; Wed, 10 Jan 2007 20:39:05 +0000 (UTC) (envelope-from sven@dmv.com) Received: from mail-gw-cl-b.dmv.com (mail-gw-cl-b.dmv.com [216.240.97.39]) by smtp-gw-cl-c.dmv.com (8.12.10/8.12.10) with ESMTP id l0AKd10F053223; Wed, 10 Jan 2007 15:39:02 -0500 (EST) (envelope-from sven@dmv.com) Received: from [67.62.150.139] (static-67-62-150-139.dsl.cavtel.net [67.62.150.139]) (authenticated bits=0) by mail-gw-cl-b.dmv.com (8.12.9/8.12.9) with ESMTP id l0AKcv8s082002; Wed, 10 Jan 2007 15:39:00 -0500 (EST) (envelope-from sven@dmv.com) Message-ID: <45A54FC9.8040900@dmv.com> Date: Wed, 10 Jan 2007 15:42:49 -0500 From: Sven Willenberger User-Agent: Thunderbird 1.5.0.7 (X11/20060919) MIME-Version: 1.0 To: Bruce Evans References: <1168211205.22629.6.camel@lanshark.dmv.com> <20070109124826.M79616@delplex.bde.org> <1168353425.29047.8.camel@lanshark.dmv.com> <200701091150.15274.jhb@freebsd.org> <20070110132839.X16378@besplex.bde.org> In-Reply-To: <20070110132839.X16378@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.39 X-Scanned-By: MIMEDefang 2.48 on 216.240.97.39 Cc: stable@freebsd.org, freebsd-amd64@freebsd.org Subject: Re: Panic in 6.2-PRERELEASE with bge on amd64 X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: 
list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Jan 2007 20:39:06 -0000 Bruce Evans presumably uttered the following on 01/09/07 21:42: > On Tue, 9 Jan 2007, John Baldwin wrote: > >> On Tuesday 09 January 2007 09:37, Sven Willenberger wrote: >>> On Tue, 2007-01-09 at 12:50 +1100, Bruce Evans wrote: >>>> Oops. I should have asked for the statment in bge_rxeof(). >>> >>> #7 0xffffffff801d5f17 in bge_rxeof (sc=0xffffffff8836b000) >> at /usr/src/sys/dev/bge/if_bge.c:2528 >>> 2528 m->m_pkthdr.len = m->m_len = cur_rx->bge_len - >> ETHER_CRC_LEN; >>> >>> (where m is defined as: >>> 2449 struct mbuf *m = NULL; >>> ) >> >> It's assigned earlier in between those two places. > > Its initialization here is just a style bug. > >> Can you 'p rxidx' as well >> as 'p sc->bge_cdata.bge_rx_std_chain[rxidx]' and 'p >> sc->bge_cdata.bge_rx_jumbo_chain[rxidx]'? Also, are you using jumbo >> frames >> at all? > > Also look at nearby chain entries (especially at (rxidx - 1) mod 512)). > I think the previous 255 entries and the rxidx one should be > non-NULL since we should have refilled them as we used them (so the > one at rxidx is least interesting since we certainly just refilled > it), and the next 256 entries should be NULL since we bogusly only use > half of the entries. If the problem is uninitialization, then I expect > all 512 entries except the one just refilled at rxidx to be NULL. 
> > Bruce > _______________________________________________ (kgdb) p sc->bge_cdata.bge_rx_std_chain[rxidx] $1 = (struct mbuf *) 0xffffff0097a27900 (kgdb) p rxidx $2 = 499 since rxidx = 499, I assume you are most interested in 498: (kgdb) p sc->bge_cdata.bge_rx_std_chain[498] $3 = (struct mbuf *) 0xffffff00cf1b3100 for the sake of argument, 500 is null: (kgdb) p sc->bge_cdata.bge_rx_std_chain[500] $13 = (struct mbuf *) 0x0 the indexes with values basically are 243 through 499: (kgdb) p sc->bge_cdata.bge_rx_std_chain[241] $30 = (struct mbuf *) 0x0 (kgdb) p sc->bge_cdata.bge_rx_std_chain[242] $31 = (struct mbuf *) 0x0 (kgdb) p sc->bge_cdata.bge_rx_std_chain[243] $32 = (struct mbuf *) 0xffffff005d4ab700 (kgdb) p sc->bge_cdata.bge_rx_std_chain[244] $33 = (struct mbuf *) 0xffffff004f644b00 so it does not seem to be a problem with "uninitialization". From owner-freebsd-amd64@FreeBSD.ORG Thu Jan 11 14:18:32 2007 Return-Path: X-Original-To: freebsd-amd64@hub.freebsd.org Delivered-To: freebsd-amd64@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8C63816A417; Thu, 11 Jan 2007 14:18:32 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [69.147.83.40]) by mx1.freebsd.org (Postfix) with ESMTP id 64AA713C46C; Thu, 11 Jan 2007 14:18:32 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (linimon@localhost [127.0.0.1]) by freefall.freebsd.org (8.13.4/8.13.4) with ESMTP id l0BEIW4l093801; Thu, 11 Jan 2007 14:18:32 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.13.4/8.13.4/Submit) id l0BEIWDl093797; Thu, 11 Jan 2007 14:18:32 GMT (envelope-from linimon) Date: Thu, 11 Jan 2007 14:18:32 GMT From: Mark Linimon Message-Id: <200701111418.l0BEIWDl093797@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-amd64@FreeBSD.org, freebsd-www@FreeBSD.org Cc: Subject: Re: 
www/107716: Addition for "FreeBSD/amd64 Project -- motherboards " X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Jan 2007 14:18:32 -0000 Synopsis: Addition for "FreeBSD/amd64 Project -- motherboards " Responsible-Changed-From-To: freebsd-amd64->freebsd-www Responsible-Changed-By: linimon Responsible-Changed-When: Thu Jan 11 14:17:29 UTC 2007 Responsible-Changed-Why: This actually applies to the website (www), even though it is about amd64. The 'amd64' category is for PRs about problems running FreeBSD that only seem to affect amd64. Thanks. http://www.freebsd.org/cgi/query-pr.cgi?pr=107716 From owner-freebsd-amd64@FreeBSD.ORG Thu Jan 11 20:35:27 2007 Return-Path: X-Original-To: amd64@freebsd.org Delivered-To: freebsd-amd64@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 5736916A40F; Thu, 11 Jan 2007 20:35:27 +0000 (UTC) (envelope-from tinderbox@freebsd.org) Received: from smarthost2.sentex.ca (smarthost2.sentex.ca [205.211.164.50]) by mx1.freebsd.org (Postfix) with ESMTP id 0ACBC13C4B4; Thu, 11 Jan 2007 20:35:26 +0000 (UTC) (envelope-from tinderbox@freebsd.org) Received: from smtp1.sentex.ca (smtp1.sentex.ca [199.212.134.4]) by smarthost2.sentex.ca (8.13.8/8.13.8) with ESMTP id l0BKZQQi028273; Thu, 11 Jan 2007 15:35:26 -0500 (EST) (envelope-from tinderbox@freebsd.org) Received: from freebsd-current.sentex.ca (freebsd-current.sentex.ca [64.7.128.98]) by smtp1.sentex.ca (8.13.8/8.13.8) with ESMTP id l0BKZQlO075786; Thu, 11 Jan 2007 15:35:26 -0500 (EST) (envelope-from tinderbox@freebsd.org) Received: by freebsd-current.sentex.ca (Postfix, from userid 666) id F261473034; Thu, 11 Jan 2007 15:35:25 -0500 (EST) Sender: FreeBSD Tinderbox From: FreeBSD Tinderbox To: FreeBSD Tinderbox , , Precedence: bulk Message-Id: 
<20070111203525.F261473034@freebsd-current.sentex.ca> Date: Thu, 11 Jan 2007 15:35:25 -0500 (EST) X-Virus-Scanned: ClamAV version devel-20070102, clamav-milter version devel-111206 on clamscanner5 X-Virus-Status: Clean Cc: Subject: [head tinderbox] failure on amd64/amd64 X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Jan 2007 20:35:27 -0000 TB --- 2007-01-11 19:35:01 - tinderbox 2.3 running on freebsd-current.sentex.ca TB --- 2007-01-11 19:35:01 - starting HEAD tinderbox run for amd64/amd64 TB --- 2007-01-11 19:35:01 - cleaning the object tree TB --- 2007-01-11 19:35:48 - checking out the source tree TB --- 2007-01-11 19:35:48 - cd /tinderbox/HEAD/amd64/amd64 TB --- 2007-01-11 19:35:48 - /usr/bin/cvs -f -R -q -d/home/ncvs update -Pd -A src TB --- 2007-01-11 19:46:13 - building world (CFLAGS=-O2 -pipe) TB --- 2007-01-11 19:46:13 - cd /src TB --- 2007-01-11 19:46:13 - /usr/bin/make -B buildworld >>> World build started on Thu Jan 11 19:46:16 UTC 2007 >>> Rebuilding the temporary build tree >>> stage 1.1: legacy release compatibility shims >>> stage 1.2: bootstrap tools >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3: cross tools >>> stage 4.1: building includes >>> stage 4.2: building libraries >>> stage 4.3: make dependencies >>> stage 4.4: building everything [...] 
cc -O2 -pipe -c /src/usr.bin/gprof/printlist.c cc -O2 -pipe -c /src/usr.bin/gprof/kernel.c cc -O2 -pipe -o gprof gprof.o aout.o arcs.o dfn.o elf.o lookup.o hertz.o printgprof.o printlist.o kernel.o gzip -cn /src/usr.bin/gprof/gprof.1 > gprof.1.gz ===> usr.bin/head (all) cc -O2 -pipe -Wsystem-headers -Werror -Wall -Wno-format-y2k -W -Wno-unused-parameter -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wreturn-type -Wcast-qual -Wwrite-strings -Wswitch -Wshadow -Wcast-align -Wunused-parameter -Wchar-subscripts -Winline -Wnested-externs -Wredundant-decls -c /src/usr.bin/head/head.c /src/usr.bin/head/head.c: In function `head_bytes': /src/usr.bin/head/head.c:149: warning: comparison between signed and unsigned *** Error code 1 Stop in /src/usr.bin/head. *** Error code 1 Stop in /src/usr.bin. *** Error code 1 Stop in /src. *** Error code 1 Stop in /src. *** Error code 1 Stop in /src. TB --- 2007-01-11 20:35:25 - WARNING: /usr/bin/make returned exit code 1 TB --- 2007-01-11 20:35:25 - ERROR: failed to build world TB --- 2007-01-11 20:35:25 - tinderbox aborted TB --- 0.89 user 3.49 system 3624.20 real http://tinderbox.des.no/tinderbox-head-HEAD-amd64-amd64.full From owner-freebsd-amd64@FreeBSD.ORG Fri Jan 12 11:40:40 2007 Return-Path: X-Original-To: freebsd-amd64@freebsd.org Delivered-To: freebsd-amd64@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 0773D16A412 for ; Fri, 12 Jan 2007 11:40:40 +0000 (UTC) (envelope-from conrads@cox.net) Received: from eastrmmtai08.cox.net (eastrmmtai08.cox.net [68.230.240.51]) by mx1.freebsd.org (Postfix) with ESMTP id 8ACA913C441 for ; Fri, 12 Jan 2007 11:40:39 +0000 (UTC) (envelope-from conrads@cox.net) Received: from eastrmimpo01.cox.net ([68.1.16.119]) by eastrmmtao03.cox.net (InterMail vM.6.01.06.03 201-2131-130-104-20060516) with ESMTP id <20070112112529.DNWD28701.eastrmmtao03.cox.net@eastrmimpo01.cox.net>; Fri, 12 Jan 2007 06:25:29 -0500 Received: from 
serene.no-ip.org ([72.200.30.62]) by eastrmimpo01.cox.net with bizsmtp id ABQ61W0061LR1K40000000; Fri, 12 Jan 2007 06:24:06 -0500 Received: from serene.no-ip.org (localhost [127.0.0.1]) by serene.no-ip.org (8.13.8/8.13.8) with ESMTP id l0CBPT4Y059191; Fri, 12 Jan 2007 05:25:29 -0600 (CST) (envelope-from conrads@cox.net) Date: Fri, 12 Jan 2007 05:25:24 -0600 From: "Conrad J. Sabatier" To: "Fulvio Mariola" Message-ID: <20070112052524.471bbe49@serene.no-ip.org> In-Reply-To: References: X-Mailer: Claws Mail 2.7.0 (GTK+ 2.10.7; amd64-portbld-freebsd7.0) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Cc: freebsd-amd64@freebsd.org Subject: Re: FreeBSD 6.2 AMD64+NVIDIA X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Jan 2007 11:40:40 -0000 On Fri, 29 Dec 2006 17:02:50 +0100 "Fulvio Mariola" wrote: > I have installed FreeBSD 6.2 AMD64 to the notebook. The graphic-card > is NVIDIA GeFORCE Go 7600. > The NVIDIA vendor don't release driver for This architecture. > Can I try?? > I make the kernel or to return at ix86 architecture? > > Thanks for your attention. > Fulvio Mariola We need more amd64 users to register at the nVidia website and add their comments in the hardware forum (see http://forums.nvidia.com) to request a driver for FreeBSD/amd64. If you look at the BSD stats site on the CPU stats page (http://www.bsdstats.org/cpus.php), you'll see that the Athlon 64 is currently the most popular processor listed by far. We need to make nVidia acutely aware of this situation and urge them to give a higher priority to developing and releasing a driver for this platform. It is *long* overdue now. Thank you for your support. -- Conrad J. 
Sabatier "In Unix veritas" From owner-freebsd-amd64@FreeBSD.ORG Fri Jan 12 20:43:14 2007 Return-Path: X-Original-To: freebsd-amd64@freebsd.org Delivered-To: freebsd-amd64@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id B90A616A4B3; Fri, 12 Jan 2007 20:43:14 +0000 (UTC) (envelope-from peterjeremy@optushome.com.au) Received: from turion.vk2pj.dyndns.org (c220-239-3-125.belrs4.nsw.optusnet.com.au [220.239.3.125]) by mx1.freebsd.org (Postfix) with ESMTP id EB85613C458; Fri, 12 Jan 2007 20:43:13 +0000 (UTC) (envelope-from peterjeremy@optushome.com.au) Received: from turion.vk2pj.dyndns.org (localhost.vk2pj.dyndns.org [127.0.0.1]) by turion.vk2pj.dyndns.org (8.13.8/8.13.8) with ESMTP id l0CKh8aU011048; Sat, 13 Jan 2007 07:43:08 +1100 (EST) (envelope-from peter@turion.vk2pj.dyndns.org) Received: (from peter@localhost) by turion.vk2pj.dyndns.org (8.13.8/8.13.8/Submit) id l0CKh82L011047; Sat, 13 Jan 2007 07:43:08 +1100 (EST) (envelope-from peter) Date: Sat, 13 Jan 2007 07:43:08 +1100 From: Peter Jeremy To: "Conrad J. 
Sabatier" Message-ID: <20070112204308.GE842@turion.vk2pj.dyndns.org> References: <20070112052524.471bbe49@serene.no-ip.org> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="H+4ONPRPur6+Ovig" Content-Disposition: inline In-Reply-To: <20070112052524.471bbe49@serene.no-ip.org> X-PGP-Key: http://members.optusnet.com.au/peterjeremy/pubkey.asc User-Agent: Mutt/1.5.13 (2006-08-11) Cc: freebsd-amd64@freebsd.org Subject: Re: FreeBSD 6.2 AMD64+NVIDIA X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Jan 2007 20:43:14 -0000 --H+4ONPRPur6+Ovig Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, 2007-Jan-12 05:25:24 -0600, Conrad J. Sabatier wrote: >We need more amd64 users to register at the nVidia website and add their >comments in the hardware forum (see http://forums.nvidia.com) to request a >driver for FreeBSD/amd64. My understanding is that nVidia have stated that they will release a FreeBSD/amd64 driver once FreeBSD/amd64 supports some features that the driver needs. Looking back thru my archives, the blocking issue appears to be PAT (Page Attribute Table). jhb@ [copied] has done some work on this and (last mail I've got) was waiting for feedback from nVidia. 
See http://lists.freebsd.org/pipermail/freebsd-current/2005-November/057808.html http://lists.freebsd.org/pipermail/freebsd-amd64/2006-April/007932.html --=20 Peter Jeremy --H+4ONPRPur6+Ovig Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (FreeBSD) iD8DBQFFp/Lc/opHv/APuIcRAqP3AKC0CC7wpzgDwqF3uX57AGQj0ImQSwCgtJOc 2yK06twBIw0KeTkItgpc6lA= =b/0D -----END PGP SIGNATURE----- --H+4ONPRPur6+Ovig-- From owner-freebsd-amd64@FreeBSD.ORG Fri Jan 12 21:01:45 2007 Return-Path: X-Original-To: freebsd-amd64@freebsd.org Delivered-To: freebsd-amd64@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 2444F16A40F for ; Fri, 12 Jan 2007 21:01:45 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from server.baldwin.cx (66-23-211-162.clients.speedfactory.net [66.23.211.162]) by mx1.freebsd.org (Postfix) with ESMTP id C017A13C458 for ; Fri, 12 Jan 2007 21:01:44 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from zion.baldwin.cx (zion.baldwin.cx [192.168.0.7]) (authenticated bits=0) by server.baldwin.cx (8.13.6/8.13.6) with ESMTP id l0CL1avB009232; Fri, 12 Jan 2007 16:01:36 -0500 (EST) (envelope-from jhb@freebsd.org) From: John Baldwin To: Peter Jeremy Date: Fri, 12 Jan 2007 15:57:41 -0500 User-Agent: KMail/1.9.4 References: <20070112052524.471bbe49@serene.no-ip.org> <20070112204308.GE842@turion.vk2pj.dyndns.org> In-Reply-To: <20070112204308.GE842@turion.vk2pj.dyndns.org> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-15" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200701121557.42417.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (server.baldwin.cx [192.168.0.1]); Fri, 12 Jan 2007 16:01:36 -0500 (EST) X-Virus-Scanned: ClamAV 0.88.3/2437/Thu Jan 11 18:59:09 2007 on server.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-4.4 
required=4.2 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.1.3 X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on server.baldwin.cx Cc: freebsd-amd64@freebsd.org Subject: Re: FreeBSD 6.2 AMD64+NVIDIA X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Jan 2007 21:01:45 -0000 On Friday 12 January 2007 15:43, Peter Jeremy wrote: > On Fri, 2007-Jan-12 05:25:24 -0600, Conrad J. Sabatier wrote: > >We need more amd64 users to register at the nVidia website and add their > >comments in the hardware forum (see http://forums.nvidia.com) to request a > >driver for FreeBSD/amd64. > > My understanding is that nVidia have stated that they will release a > FreeBSD/amd64 driver once FreeBSD/amd64 supports some features that > the driver needs. Looking back thru my archives, the blocking issue > appears to be PAT (Page Attribute Table). jhb@ [copied] has done some > work on this and (last mail I've got) was waiting for feedback from > nVidia. There's more work to be done as well. An nvidia developer sent out an e-mail explaining exactly what they need a while back. 
-- John Baldwin From owner-freebsd-amd64@FreeBSD.ORG Sat Jan 13 01:30:13 2007 Return-Path: X-Original-To: freebsd-amd64@hub.freebsd.org Delivered-To: freebsd-amd64@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id BB17116A407 for ; Sat, 13 Jan 2007 01:30:13 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [69.147.83.40]) by mx1.freebsd.org (Postfix) with ESMTP id 9A13813C45B for ; Sat, 13 Jan 2007 01:30:13 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1]) by freefall.freebsd.org (8.13.4/8.13.4) with ESMTP id l0D1UDoU081523 for ; Sat, 13 Jan 2007 01:30:13 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.13.4/8.13.4/Submit) id l0D1UDcG081522; Sat, 13 Jan 2007 01:30:13 GMT (envelope-from gnats) Resent-Date: Sat, 13 Jan 2007 01:30:13 GMT Resent-Message-Id: <200701130130.l0D1UDcG081522@freefall.freebsd.org> Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer) Resent-To: freebsd-amd64@FreeBSD.org Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Gregory Hunt Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 3AD3016A40F for ; Sat, 13 Jan 2007 01:28:27 +0000 (UTC) (envelope-from nobody@FreeBSD.org) Received: from www.freebsd.org (www.freebsd.org [69.147.83.33]) by mx1.freebsd.org (Postfix) with ESMTP id 2902713C448 for ; Sat, 13 Jan 2007 01:28:27 +0000 (UTC) (envelope-from nobody@FreeBSD.org) Received: from www.freebsd.org (localhost [127.0.0.1]) by www.freebsd.org (8.13.1/8.13.1) with ESMTP id l0D1SQwq088353 for ; Sat, 13 Jan 2007 01:28:26 GMT (envelope-from nobody@www.freebsd.org) Received: (from nobody@localhost) by www.freebsd.org (8.13.1/8.13.1/Submit) id l0D1SQ7b088349; Sat, 13 Jan 2007 01:28:26 GMT (envelope-from nobody) Message-Id: 
<200701130128.l0D1SQ7b088349@www.freebsd.org> Date: Sat, 13 Jan 2007 01:28:26 GMT From: Gregory Hunt To: freebsd-gnats-submit@FreeBSD.org X-Send-Pr-Version: www-3.0 Cc: Subject: amd64/107858: amd64 motherboard project - none working sound and graphics X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Jan 2007 01:30:13 -0000 >Number: 107858 >Category: amd64 >Synopsis: amd64 motherboard project - none working sound and graphics >Confidential: no >Severity: non-critical >Priority: medium >Responsible: freebsd-amd64 >State: open >Quarter: >Keywords: >Date-Required: >Class: sw-bug >Submitter-Id: current-users >Arrival-Date: Sat Jan 13 01:30:12 GMT 2007 >Closed-Date: >Last-Modified: >Originator: Gregory Hunt >Release: RELENG-6 >Organization: >Environment: FreeBSD smirnoff 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE #2: Fri Jan 12 19:58:49 GMT 2007 root@smirnoff:/usr/obj/usr/src/sys/CEIBO amd64 Motherboard is an ASRock Conroe-945G-DVI >Description: The sound card is not recognised at all by snd_ich, and the patch for snd_hd no longer seems to work. The video is recognised only as a VGA device, with no agpgart device. There is also a minor problem: shutdown -p seems to suspend instead of powering down. 
$ pciconf -lv hostb0@pci0:0:0: class=0x060000 card=0x27701849 chip=0x27708086 rev=0x02 hdr=0x00 vendor = 'Intel Corporation' device = '82945 Series Memory Controller Hub (MCH)' class = bridge subclass = HOST-PCI none0@pci0:2:0: class=0x030000 card=0x27721849 chip=0x27728086 rev=0x02 hdr=0x00 vendor = 'Intel Corporation' device = 'Integrated Graphics Controller' class = display subclass = VGA none1@pci0:27:0: class=0x040300 card=0x08881849 chip=0x27d88086 rev=0x01 hdr=0x00 vendor = 'Intel Corporation' device = '82801G (ICH7 Family) High Definition Audio' class = multimedia pcib1@pci0:28:0: class=0x060400 card=0x00000040 chip=0x27d08086 rev=0x01 hdr=0x01 vendor = 'Intel Corporation' device = '82801G (ICH7 Family) PCI Express Root Port' class = bridge subclass = PCI-PCI pcib2@pci0:28:1: class=0x060400 card=0x00000040 chip=0x27d28086 rev=0x01 hdr=0x01 vendor = 'Intel Corporation' device = '82801G (ICH7 Family) PCI Express Root Port' class = bridge subclass = PCI-PCI uhci0@pci0:29:0: class=0x0c0300 card=0x27c81849 chip=0x27c88086 rev=0x01 hdr=0x00 vendor = 'Intel Corporation' device = '82801G (ICH7 Family) USB Universal Host Controller' class = serial bus subclass = USB uhci1@pci0:29:1: class=0x0c0300 card=0x27c91849 chip=0x27c98086 rev=0x01 hdr=0x00 vendor = 'Intel Corporation' device = '82801G (ICH7 Family) USB Universal Host Controller' class = serial bus subclass = USB uhci2@pci0:29:2: class=0x0c0300 card=0x27ca1849 chip=0x27ca8086 rev=0x01 hdr=0x00 vendor = 'Intel Corporation' device = '82801G (ICH7 Family) USB Universal Host Controller' class = serial bus subclass = USB uhci3@pci0:29:3: class=0x0c0300 card=0x27cb1849 chip=0x27cb8086 rev=0x01 hdr=0x00 vendor = 'Intel Corporation' device = '82801G (ICH7 Family) USB Universal Host Controller' class = serial bus subclass = USB ehci0@pci0:29:7: class=0x0c0320 card=0x27cc1849 chip=0x27cc8086 rev=0x01 hdr=0x00 vendor = 'Intel Corporation' device = '82801G (ICH7 Family) USB 2.0 Enhanced Host Controller' class = serial bus 
subclass = USB pcib3@pci0:30:0: class=0x060401 card=0x00000050 chip=0x244e8086 rev=0xe1 hdr=0x01 vendor = 'Intel Corporation' device = '82801BA/CA/DB/DBL/EB/ER/FB (ICH2/3/4/4/5/5/6), 6300ESB Hub Interface to PCI Bridge' class = bridge subclass = PCI-PCI isab0@pci0:31:0: class=0x060100 card=0x27b81849 chip=0x27b88086 rev=0x01 hdr=0x00 vendor = 'Intel Corporation' device = '82801GB/GR (ICH7 Family) LPC Interface Controller' class = bridge subclass = PCI-ISA atapci0@pci0:31:1: class=0x01018a card=0x27df1849 chip=0x27df8086 rev=0x01 hdr=0x00 vendor = 'Intel Corporation' device = '82801G (ICH7 Family) Ultra ATA Storage Controller' class = mass storage subclass = ATA atapci1@pci0:31:2: class=0x01018f card=0x27c01849 chip=0x27c08086 rev=0x01 hdr=0x00 vendor = 'Intel Corporation' device = '82801GB/GR/GH (ICH7 Family) Serial ATA Storage Controller' class = mass storage subclass = ATA none2@pci0:31:3: class=0x0c0500 card=0x27da1849 chip=0x27da8086 rev=0x01 hdr=0x00 vendor = 'Intel Corporation' device = '82801G (ICH7 Family) SMBus Controller' class = serial bus subclass = SMBus re0@pci1:0:0: class=0x020000 card=0x81681849 chip=0x816810ec rev=0x01 hdr=0x00 vendor = 'Realtek Semiconductor' class = network subclass = ethernet vr0@pci3:1:0: class=0x020000 card=0x14011186 chip=0x30651106 rev=0x42 hdr=0x00 vendor = 'VIA Technologies Inc' device = 'VT6102 Rhine II PCI Fast Ethernet Controller' class = network subclass = ethernet $ dmesg -a Copyright (c) 1992-2007 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD is a registered trademark of The FreeBSD Foundation. 
FreeBSD 6.2-PRERELEASE #2: Fri Jan 12 19:58:49 GMT 2007 root@smirnoff:/usr/obj/usr/src/sys/CEIBO Timecounter "i8254" frequency 1193182 Hz quality 0 CPU: Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz (2393.38-MHz K8-class CPU) Origin = "GenuineIntel" Id = 0x6f6 Stepping = 6 Features=0xbfebfbff Features2=0xe3bd,CX16,XTPR,> AMD Features=0x20000800 AMD Features2=0x1 Cores per package: 2 real memory = 1065025536 (1015 MB) avail memory = 1015099392 (968 MB) ACPI APIC Table: FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs cpu0 (BSP): APIC ID: 0 cpu1 (AP): APIC ID: 1 ioapic0 irqs 0-23 on motherboard kbd1 at kbdmux0 ath_hal: 0.9.17.2 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413) acpi0: on motherboard acpi_bus_number: can't get _ADR acpi_bus_number: can't get _ADR acpi_bus_number: can't get _ADR acpi_bus_number: can't get _ADR acpi0: Power Button (fixed) acpi_bus_number: can't get _ADR acpi_bus_number: can't get _ADR Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000 acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0 cpu0: on acpi0 acpi_perf0: on cpu0 cpu1: on acpi0 pcib0: port 0xcf8-0xcff on acpi0 pci0: on pcib0 pci0: at device 2.0 (no driver attached) pci0: at device 27.0 (no driver attached) pcib1: irq 16 at device 28.0 on pci0 pci2: on pcib1 pcib2: irq 17 at device 28.1 on pci0 pci1: on pcib2 re0: port 0xd800-0xd8ff mem 0xfeaff000-0xfeafffff irq 17 at device 0.0 on pci1 miibus0: on re0 rgephy0: on miibus0 rgephy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto re0: Ethernet address: 00:13:8f:dd:24:49 re0: [FAST] uhci0: port 0xc400-0xc41f irq 23 at device 29.0 on pci0 uhci0: [GIANT-LOCKED] usb0: on uhci0 usb0: USB revision 1.0 uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1 uhub0: 2 ports with 2 removable, self powered uhci1: port 0xc480-0xc49f irq 19 at device 29.1 on pci0 uhci1: [GIANT-LOCKED] usb1: on uhci1 usb1: USB revision 1.0 uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1 
uhub1: 2 ports with 2 removable, self powered uhci2: port 0xc800-0xc81f irq 18 at device 29.2 on pci0 uhci2: [GIANT-LOCKED] usb2: on uhci2 usb2: USB revision 1.0 uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1 uhub2: 2 ports with 2 removable, self powered uhci3: port 0xc880-0xc89f irq 16 at device 29.3 on pci0 uhci3: [GIANT-LOCKED] usb3: on uhci3 usb3: USB revision 1.0 uhub3: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1 uhub3: 2 ports with 2 removable, self powered ehci0: mem 0xfe937c00-0xfe937fff irq 23 at device 29.7 on pci0 ehci0: [GIANT-LOCKED] usb4: EHCI version 1.0 usb4: companion controllers, 2 ports each: usb0 usb1 usb2 usb3 usb4: on ehci0 usb4: USB revision 2.0 uhub4: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1 uhub4: 8 ports with 8 removable, self powered uhub5: NEC Corporation USB2.0 Hub Controller, class 9/0, rev 2.00/1.00, addr 2 uhub5: single transaction translator uhub5: 4 ports with 4 removable, self powered ugen0: Sony Ericsson SEMC DSS-20 SyncStation, rev 1.10/4.00, addr 3 pcib3: at device 30.0 on pci0 pci3: on pcib3 vr0: port 0xe800-0xe8ff mem 0xfebffc00-0xfebffcff irq 22 at device 1.0 on pci3 miibus1: on vr0 ukphy0: on miibus1 ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto vr0: Ethernet address: 00:50:ba:ef:f1:59 isab0: at device 31.0 on pci0 isa0: on isab0 atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 31.1 on pci0 ata0: on atapci0 ata1: on atapci0 atapci1: port 0xc080-0xc087,0xc000-0xc003,0xbc00-0xbc07,0xb880-0xb883,0xb800-0xb80f irq 19 at device 31.2 on pci0 atapci1: failed to enable memory mapping! 
ata2: on atapci1 ata3: on atapci1 pci0: at device 31.3 (no driver attached) acpi_button0: on acpi0 sio0: configured irq 3 not in bitmap of probed irqs 0 sio0: port may not be enabled sio0: port 0x2f8-0x2ff irq 3 flags 0x10 on acpi0 sio0: type 16550A fdc0: port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0 fdc0: [FAST] ppc0: port 0x378-0x37f,0x778-0x77f irq 7 drq 3 on acpi0 ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode ppc0: FIFO with 16/16/9 bytes threshold ppbus0: on ppc0 plip0: on ppbus0 lpt0: on ppbus0 lpt0: Interrupt-driven port ppi0: on ppbus0 atkbdc0: port 0x60,0x64 irq 1 on acpi0 atkbd0: irq 1 on atkbdc0 kbd0 at atkbd0 atkbd0: [GIANT-LOCKED] sio1: configured irq 4 not in bitmap of probed irqs 0 sio1: port may not be enabled sio1: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 on acpi0 sio1: type 16550A sc0: at flags 0x100 on isa0 sc0: VGA <16 virtual consoles, flags=0x300> vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 ums0: Logitech USB Receiver, rev 1.10/16.00, addr 2, iclass 3/1 ums0: 16 buttons and Z dir. Timecounters tick every 1.000 msec ad0: 152627MB at ata0-master UDMA100 ad4: 238475MB at ata2-master SATA150 SMP: AP CPU #1 Launched! >How-To-Repeat: Replicate hardware and run RELENG_6 >Fix: There seem to be plenty of patches about; perhaps it is just a case of plugging some of the chip identifiers into the driver. Hopefully this will be of some use. 
>Release-Note: >Audit-Trail: >Unformatted: From owner-freebsd-amd64@FreeBSD.ORG Sat Jan 13 07:19:34 2007 Return-Path: X-Original-To: freebsd-amd64@FreeBSD.org Delivered-To: freebsd-amd64@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 6AC4716A595; Sat, 13 Jan 2007 07:19:34 +0000 (UTC) (envelope-from bde@zeta.org.au) Received: from mailout1.pacific.net.au (mailout1-3.pacific.net.au [61.8.2.210]) by mx1.freebsd.org (Postfix) with ESMTP id 06D7613C44C; Sat, 13 Jan 2007 07:19:34 +0000 (UTC) (envelope-from bde@zeta.org.au) Received: from mailproxy2.pacific.net.au (mailproxy2.pacific.net.au [61.8.2.163]) by mailout1.pacific.net.au (Postfix) with ESMTP id 935EA5A0D55; Sat, 13 Jan 2007 18:19:32 +1100 (EST) Received: from katana.zip.com.au (katana.zip.com.au [61.8.7.246]) by mailproxy2.pacific.net.au (Postfix) with ESMTP id 4804E27408; Sat, 13 Jan 2007 18:19:31 +1100 (EST) Date: Sat, 13 Jan 2007 18:19:30 +1100 (EST) From: Bruce Evans X-X-Sender: bde@delplex.bde.org To: Sven Willenberger In-Reply-To: <45A54FC9.8040900@dmv.com> Message-ID: <20070113172849.E94785@delplex.bde.org> References: <1168211205.22629.6.camel@lanshark.dmv.com> <20070109124826.M79616@delplex.bde.org> <1168353425.29047.8.camel@lanshark.dmv.com> <200701091150.15274.jhb@freebsd.org> <20070110132839.X16378@besplex.bde.org> <45A54FC9.8040900@dmv.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: stable@FreeBSD.org, freebsd-amd64@FreeBSD.org, John Baldwin Subject: Re: Panic in 6.2-PRERELEASE with bge on amd64 X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Jan 2007 07:19:34 -0000 On Wed, 10 Jan 2007, Sven Willenberger wrote: > Bruce Evans presumably uttered the following on 01/09/07 21:42: >> Also look at nearby chain entries 
(especially at (rxidx - 1) mod 512)). >> I think the previous 255 entries and the rxidx one should be >> non-NULL since we should have refilled them as we used them (so the >> one at rxidx is least interesting since we certainly just refilled >> it), and the next 256 entries should be NULL since we bogusly only use >> half of the entries. If the problem is uninitialization, then I expect >> all 512 entries except the one just refilled at rxidx to be NULL. > (kgdb) p sc->bge_cdata.bge_rx_std_chain[rxidx] > $1 = (struct mbuf *) 0xffffff0097a27900 > (kgdb) p rxidx > $2 = 499 > > since rxidx = 499, I assume you are most interested in 498: > (kgdb) p sc->bge_cdata.bge_rx_std_chain[498] > $3 = (struct mbuf *) 0xffffff00cf1b3100 > > for the sake of argument, 500 is null: > (kgdb) p sc->bge_cdata.bge_rx_std_chain[500] > $13 = (struct mbuf *) 0x0 > > the indexes with values basically are 243 through 499: > (kgdb) p sc->bge_cdata.bge_rx_std_chain[241] > $30 = (struct mbuf *) 0x0 > (kgdb) p sc->bge_cdata.bge_rx_std_chain[242] > $31 = (struct mbuf *) 0x0 > (kgdb) p sc->bge_cdata.bge_rx_std_chain[243] > $32 = (struct mbuf *) 0xffffff005d4ab700 > (kgdb) p sc->bge_cdata.bge_rx_std_chain[244] > $33 = (struct mbuf *) 0xffffff004f644b00 > > so it does not seem to be a problem with "uninitialization". There are supposed to be only 256 nonzero entries (except briefly while one is being refreshed), but the above indicates that there are 257: #243 through #499 gives 257 nonzero entries. Everything indicates that entry #499 was null before it was refreshed, and that the loop in bge_rxeof() is trying to process a descriptor 1 after the last valid (previously handled) descriptor. I cannot see why it might do this. The next step might be to add active debugging code: - check that m != NULL when m is taken off the rx chain (before refreshing its entry), and panic if it is. - check that there are always BGE_SSLOTS (256) nonzero mbufs in the std rx chain. 
It would be interesting to know if they are always contiguous. They might not be since this depends on how the hardware uses them. Debugging is simpler if they are. - check that bge_rxeof() is not reentered. - check the rx producer index and related data before and after getting a null m. It can easily change while bge_rxeof() is running, so recording its value before and after might be useful. Bruce
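The second check in the list above (count the nonzero mbufs in the std rx chain and see whether they are contiguous) can be sketched in userland C. This is only a minimal sketch under stated assumptions, not driver code: RING_CNT, RING_SLOTS and the three function names are hypothetical, the ring is modelled as an array of void pointers rather than struct mbuf pointers, and only the sizes 512 and 256 and the indices 243 through 499 come from the thread.

```c
#include <assert.h>
#include <stddef.h>

#define RING_CNT   512	/* ring size, from "(rxidx - 1) mod 512" (assumed name) */
#define RING_SLOTS 256	/* expected filled entries, BGE_SSLOTS in the thread */

/* Count the non-NULL entries in the ring. */
static int
ring_count_filled(void *const ring[RING_CNT])
{
	int i, n = 0;

	for (i = 0; i < RING_CNT; i++)
		if (ring[i] != NULL)
			n++;
	return (n);
}

/*
 * Return 1 if the non-NULL entries form at most one run when the array
 * is viewed as a circle (an empty or full ring also counts), else 0.
 * A single run has exactly one filled-to-empty transition.
 */
static int
ring_is_contiguous(void *const ring[RING_CNT])
{
	int i, edges = 0;

	for (i = 0; i < RING_CNT; i++) {
		int cur = ring[i] != NULL;
		int next = ring[(i + 1) % RING_CNT] != NULL;

		if (cur && !next)
			edges++;
	}
	return (edges <= 1);
}

/* Test helper: set entries first..last (inclusive, wrapping) to p. */
static void
ring_fill_range(void *ring[RING_CNT], int first, int last, void *p)
{
	int i;

	assert(first >= 0 && first < RING_CNT);
	assert(last >= 0 && last < RING_CNT);
	assert(p != NULL);
	for (i = first; ; i = (i + 1) % RING_CNT) {
		ring[i] = p;
		if (i == last)
			break;
	}
}
```

Filling entries 243 through 499, as in the kgdb output quoted above, gives a count of 257, one more than RING_SLOTS, in a single contiguous run. In a real driver the equivalent check would have to run with the driver lock held, since the ring changes while bge_rxeof() is running.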