Date: Thu, 6 Mar 2008 08:44:10 -0500
From: John Baldwin <jhb@freebsd.org>
To: freebsd-hackers@freebsd.org
Cc: anholt@FreeBSD.org, Frédéric PRACA <frederic.praca@freebsd-fr.org>
Subject: Re: Kernel crash on Asus A7N8X-X
Message-ID: <200803060844.10772.jhb@freebsd.org>
In-Reply-To: <200803060831.27056.jhb@freebsd.org>
References: <1204671599.47cdd46f6b1e2@imp.free.fr> <200803060831.27056.jhb@freebsd.org>
On Thursday 06 March 2008 08:31:26 am John Baldwin wrote:
> On Tuesday 04 March 2008 05:59:59 pm Frédéric PRACA wrote:
> > Hello dear hackers,
> > I own a Asus A7N8X-X motherboard (NForce2 chipset) with a Radeon 9600
> > video card. After upgrading from 6.3 to 7.0, I launched xorg which
> > crashed the kernel. After looking in the kernel core dump, I found that
> > the agp_nvidia_flush_tlb function of /usr/src/sys/pci/agp_nvidia.c
> > crashed on the line 377. The loop fails from the beginning (when i==0).
> > I commented out the two last loops and it seems to work now but as I
> > didn't understand what is this code for, I'd like to have some
> > explanation about it and want to know if someone got the same problem.
>
> The Linux AGP driver has the same code. It appears to be forcing a read of
> the TLB registers to force prior writes to clear the TLB entries to flush
> perhaps? I'm not sure why you are getting a panic. What kind of fault did
> you get? (The original kernel panic messages would be needed.)

Actually, it looks like you have a 64MB aperture and with either a 32MB or
64MB aperture this loop runs off the end of the GATT (GATT has 16384 entries
* 4 bytes == 64k == 16 pages on x86) so if it dies before it starts the next
loop that might explain it. The patch below makes it walk the full GATT
reading the first word from each page to force a flush w/o walking off the
end of the GATT. Actually, this is what appears to have happened:

(gdb) set $start = 0xd4d05000        (ag_virtual)
(gdb) set $fva = 3570491392          (eva in trap_pfault() frame)
(gdb) p ($fva - $start) / 4
$2 = 17408

That's well over your current ag_entries of 16384.
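To make the arithmetic above easier to check, here is a small user-space sketch in C. It is not driver code: the helper names (gatt_pages, page_first_index, first_out_of_bounds, check_numbers) and the SKETCH_PAGE_SIZE constant are made up for illustration, and it assumes 4 KB pages and 32-bit GATT entries as described in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KB pages (x86) and 32-bit GATT entries, as in the message. */
#define SKETCH_PAGE_SIZE 4096u

/* Number of whole pages occupied by a GATT with `entries` 32-bit entries. */
static unsigned
gatt_pages(unsigned entries)
{
	return ((unsigned)(entries * sizeof(uint32_t) / SKETCH_PAGE_SIZE));
}

/* Index, in 32-bit words, of the first word of page `page`. */
static unsigned
page_first_index(unsigned page)
{
	return ((unsigned)(page * SKETCH_PAGE_SIZE / sizeof(uint32_t)));
}

/*
 * First loop iteration whose read lands at or past `entries`,
 * or `iterations` if every read stays inside the table.
 */
static unsigned
first_out_of_bounds(unsigned entries, unsigned iterations)
{
	unsigned i;

	for (i = 0; i < iterations; i++)
		if (page_first_index(i) >= entries)
			return (i);
	return (iterations);
}

/* Re-check the numbers quoted in the message. */
static void
check_numbers(void)
{
	unsigned entries = 16384;

	/* 16384 entries * 4 bytes == 64k == 16 pages. */
	assert(gatt_pages(entries) == 16);
	/* A fixed bound of 32 + 1 iterations first leaves the table at i == 16. */
	assert(first_out_of_bounds(entries, 32 + 1) == 16);
	/* A bound derived from the table size never leaves it. */
	assert(first_out_of_bounds(entries, gatt_pages(entries)) == 16);
	/* The faulting word index from gdb is the first word of page 17. */
	assert(page_first_index(17) == 17408);
}
```

Calling check_numbers() from a test program should pass silently. With these assumptions, a loop bounded by the table size stops after page 15, while the fixed 32 + 1 bound keeps reading for another 17 pages past the end of the table.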
Try this patch (note Linux's in-kernel agp driver has the same bug):

Index: agp_nvidia.c
===================================================================
RCS file: /host/cvs/usr/cvs/src/sys/dev/agp/agp_nvidia.c,v
retrieving revision 1.13
diff -u -r1.13 agp_nvidia.c
--- agp_nvidia.c	12 Nov 2007 21:51:37 -0000	1.13
+++ agp_nvidia.c	6 Mar 2008 13:37:43 -0000
@@ -347,7 +347,7 @@
 	struct agp_nvidia_softc *sc;
 	u_int32_t wbc_reg, temp;
 	volatile u_int32_t *ag_virtual;
-	int i;
+	int i, pages;
 
 	sc = (struct agp_nvidia_softc *)device_get_softc(dev);
 
@@ -373,9 +373,10 @@
 	ag_virtual = (volatile u_int32_t *)sc->gatt->ag_virtual;
 
 	/* Flush TLB entries. */
-	for(i = 0; i < 32 + 1; i++)
+	pages = sc->gatt->ag_entries * sizeof(u_int32_t) / PAGE_SIZE;
+	for(i = 0; i < pages; i++)
 		temp = ag_virtual[i * PAGE_SIZE / sizeof(u_int32_t)];
-	for(i = 0; i < 32 + 1; i++)
+	for(i = 0; i < pages; i++)
 		temp = ag_virtual[i * PAGE_SIZE / sizeof(u_int32_t)];
 
 	return (0);

-- 
John Baldwin