Date:      Mon, 27 Sep 2004 11:31:03 -0400
From:      John Baldwin <jhb@FreeBSD.org>
To:        Danny Braniss <danny@cs.huji.ac.il>
Cc:        hackers@FreeBSD.org
Subject:   Re: Dell gx280 and acpi problems
Message-ID:  <200409271131.03437.jhb@FreeBSD.org>
In-Reply-To: <20040927083420.B967043D1F@mx1.FreeBSD.org>
References:  <20040927083420.B967043D1F@mx1.FreeBSD.org>

On Monday 27 September 2004 04:34 am, Danny Braniss wrote:
> for the short verions goto the end.
>
> > On Thursday 23 September 2004 04:29 am, Danny Braniss wrote:
> > > > On Wednesday 22 September 2004 04:58 am, Danny Braniss wrote:
> > > > > could some acpi expert shed some light?
> > > > >
> > > > > > -current panics on boot with BIOS default settings (Suspend Mode is
> > > > > > S3) fix: set Power Management/Suspend Mode to S1 in BIOS
> > > > >
> > > > > > disabling ACPI on boot is not good, since this box has no PS/2, and
> > > > > > the USB keyboard/mouse don't work with ACPI off.
> > > > >
> > > > > the acpi dumps are available from:
> > > > > 	ftp://ftp.cs.huji.ac.il/users/danny/freebsd/gx280
> > > > >
> > > > > this is the panic:
> > > > >
> > > > >
> > > > > KDB: debugger backends: ddb
> > > > > KDB: current backend: ddb
> > > > > Copyright (c) 1992-2004 The FreeBSD Project.
> > > > > > Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> > > > > >         The Regents of the University of California. All rights reserved.
> > > > > > FreeBSD 5.3-BETA5 #14: Tue Sep 21 13:44:32 IDT 2004
> > > > >     danny@new-dev:/r+d/obj/new-dev/r+d/5.3/src/sys/HUJI
> > > > > Timecounter "i8254" frequency 1193182 Hz quality 0
> > > > > CPU: Intel(R) Pentium(R) 4 CPU 3.00GHz (2992.52-MHz 686-class CPU)
> > > > > >   Origin = "GenuineIntel"  Id = 0xf34  Stepping = 4
> > > > > >   Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
> > > > > > Hyperthreading: 2 logical CPUs
> > > > > > real memory  = 1063813120 (1014 MB)
> > > > > > avail memory = 1031565312 (983 MB)
> > > > > kernel trap 12 with interrupts disabled
> > > > >
> > > > >
> > > > > Fatal trap 12: page fault while in kernel mode
> > > > > > cpuid = 0; apic id = 00
> > > > > > fault virtual address   = 0x1c
> > > > > > fault code              = supervisor write, page not present
> > > > > > instruction pointer     = 0x8:0xc075dab5
> > > > > > stack pointer           = 0x10:0xc0c21be0
> > > > > > frame pointer           = 0x10:0xc0c21cac
> > > > > > code segment            = base 0x0, limit 0xfffff, type 0x1b
> > > > > >                         = DPL 0, pres 1, def32 1, gran 1
> > > > > > processor eflags        = interrupt enabled, resume, IOPL = 0
> > > > > > current process         = 0 ()
> > > > > [thread 0]
> > > > > Stopped at      vm_fault+0x1b1: lock cmpxchgl   %ecx,0x1c(%edx)
> > > > > db> trace
> > > > > vm_fault(c103a000,c1004000,1,0,c08e36c0) at vm_fault+0x1b1
> > > > > trap_pfault(c0c21d14,0,c1004c29) at trap_pfault+0x184
> > > > > trap(fffd0018,c1000010,c0c20010,c1004bfd,7) at trap+0x2f1
> > > > > calltrap() at calltrap+0x5
> > > > > > --- trap 0xc, eip = 0xc0a18574, esp = 0xc0c21d54, ebp = 0xc0c21d74 ---
> > > > > > madt_probe(c22264f0,c08bb1f0,c0c21d98,c05e8302,0) at madt_probe+0x174
> > > > > > apic_init(0,c1ec00,c1e000,0,c0441225) at apic_init+0x47
> > > > > mi_startup() at mi_startup+0x96
> > > > > begin() at begin+0x2c
> > > >
> > > > Can you do a 'gdb kernel.debug' and then do 'l madt_probe+0x174' and
> > > > e-mail the results?
> > >
> > > I think i'm doing something wrong :-), tip -38400 com1 works fine,
> > > Type '?' for a list of commands, 'help' for more detailed help.
> > > OK boot -d
> > > > /boot/kernel/acpi.ko text=0x3fa30 data=0x1be4+0x110c
> > > > syms=[0x4+0x72a0+0x4+0x9743]
> > > GDB: debug ports: sio
> > > GDB: current port: sio
> > > KDB: debugger backends: ddb gdb
> > > KDB: current backend: ddb
> > > KDB: enter: Boot flags requested debugger
> > > [thread 0]
> > > Stopped at      kdb_enter+0x2b: nop
> > > db> gdb
> > > Step to enter the remote GDB backend.
> > >
> > > backing out of tip via ~.
> > >
> > >
> > > shuttle-2# gdb -b 38400 kernel.debug
> > > GNU gdb 6.1.1 [FreeBSD]
> > > Copyright 2004 Free Software Foundation, Inc.
> > > GDB is free software, covered by the GNU General Public License, and
> > > you are welcome to change it and/or distribute copies of it under
> > > certain conditions. Type "show copying" to see the conditions.
> > > There is absolutely no warranty for GDB.  Type "show warranty" for
> > > details. This GDB was configured as "i386-marcel-freebsd"...
> > > Ready to go.  Enter 'tr' to connect to the remote target
> > > with /dev/cuaa0, 'tr /dev/cuaa1' to connect to a different port
> > > or 'trf portno' to connect to the remote target with the firewire
> > > interface.  portno defaults to 5556.
> > >
> > > Type 'getsyms' after connection to load kld symbols.
> > >
> > > If you're debugging a local system, you can use 'kldsyms' instead
> > > to load the kld symbols.  That's a less obnoxious interface.
> > > (gdb) tr /dev/cuaa0
> > > Ignoring packet error, continuing...
> > > Ignoring packet error, continuing...
> > > Ignoring packet error, continuing...
> > > Couldn't establish connection to remote target
> > > Malformed response to offset query, timeout
> > > (gdb)
> >
> > You don't have to do the gdb during the panic.  You just need access to
> > the kernel.debug corresponding to the kernel you are booting.  Is this a
> > custom kernel on the box or are you doing an install?  If you are doing
> > > an install, try disabling apic support by entering 'set
> > > hint.apic.0.disabled=1' at the loader prompt and install that way.  Then,
> > > once the box is running, build a debug kernel, reproduce the panic, get
> > > the instruction pointer address, and then fire up gdb on the kernel.debug
> > > file and do 'l *<value of instruction pointer>'.
>
> to get gdb talking i had to:
> db> gdb
> db> step
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address   = 0x1c
> fault code              = supervisor write, page not present
> instruction pointer     = 0x8:0xc07673b1
> stack pointer           = 0x10:0xc0c21be0
> frame pointer           = 0x10:0xc0c21cac
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = trace trap, interrupt enabled, resume, IOPL = 0
> current process         = 0 ()
> $T0b8:b17376c0;thread:0;#ad~
> 			   ^
>
> 			   |--- i typed to get out of db
>
> then
> (not clear from docs, maybe common sense, but you better be in boot/kernel)
> gdb kernel.debug
>
> (gdb) l madt_probe+0x174
> Junk at end of line specification.

You have to put a * in front of the address expression here, i.e.

'l *madt_probe+0x174'

> (gdb) bt
> #0  vm_fault (map=0xc103a000, vaddr=0xc1004000, fault_type=0x1,
> fault_flags=0x0) at atomic.h:154
> During symbol reading, Incomplete CFI data; unspecified registers at
> 0xc07673d5.
> #1  0xc07ce128 in trap_pfault (frame=0xc0c21d14, usermode=0x0,
> eva=0xc1004c29) at /r+d/5.3/src/sys/i386/i386/trap.c:716
> #2  0xc07cdd91 in trap (frame=
>       {tf_fs = 0xfffd0018, tf_es = 0xc1000010, tf_ds = 0xc0c20010, tf_edi =
> 0xc1004bfd, tf_esi = 0x7, tf_ebp = 0xc0c21d74, tf_isp = 0xc0c21d40, tf_ebx =
> 0x2, tf_edx = 0x12, tf_ecx = 0x4, tf_eax = 0x0, tf_trapno = 0xc, tf_err = 0x0,
> tf_eip = 0xc0a24574, tf_cs = 0x8, tf_eflags = 0x90093, tf_esp = 0xc00fec00,
> tf_ss = 0x1})
>     at /r+d/5.3/src/sys/i386/i386/trap.c:417
> #3  0xc07bc7aa in calltrap () at /r+d/5.3/src/sys/i386/i386/exception.s:140
> #4  0xfffd0018 in ?? ()
> #5  0xc1000010 in ?? ()
> #6  0xc0c20010 in ?? ()
> #7  0xc1004bfd in ?? ()
> #8  0x00000007 in ?? ()
> #9  0xc0c21d74 in ?? ()
> #10 0xc0c21d40 in ?? ()
> #11 0x00000002 in ?? ()
> #12 0x00000012 in ?? ()
> #13 0x00000004 in ?? ()
> #14 0x00000000 in ?? ()
> #15 0x0000000c in ?? ()
> #16 0x00000000 in ?? ()
> #17 0xc0a24574 in madt_probe () at
> /r+d/5.3/src/sys/modules/acpi/acpi/../../../i386/acpica/madt.c:258
> #18 0xc07c2757 in apic_init (dummy=0x0) at
> /r+d/5.3/src/sys/i386/i386/local_apic.c:564
> #19 0xc05f1bfe in mi_startup () at /r+d/5.3/src/sys/kern/init_main.c:210
> #20 0xc0441225 in begin () at /r+d/5.3/src/sys/i386/i386/locore.s:348
> (gdb) frame 17
> #17 0xc0a24574 in madt_probe () at
> /r+d/5.3/src/sys/modules/acpi/acpi/../../../i386/acpica/madt.c:258
> 258                     for (i = 0; i < count; i++)
> (gdb) l
> 253                                     printf("MADT: Failed to map RSDT\n");
> 254                             return (ENXIO);
> 255                     }
> 256                     count = (rsdt->Length - sizeof(ACPI_TABLE_HEADER)) /
> 257                         sizeof(UINT32);
> 258                     for (i = 0; i < count; i++)
> 259                             if (madt_probe_table(rsdt->TableOffsetEntry[i]))
> 260                                     break;
> 261                     madt_unmap_table(rsdt);
> 262             }
>
> the suspicious part:
>
> (gdb) p *rsdp
> $5 = {
>   Signature = "RSD PTR ",
>   Checksum = 0xa9,
>   OemId = "DELL  ",
>   Revision = 0x0,
>   RsdtPhysicalAddress = 0xfcbfd,
>   Length = 0xffffffff,
>   XsdtPhysicalAddress = 0xffffffffffffffff,
>   ExtendedChecksum = 0xff,
>   Reserved = "ÿÿÿ"
> }
> (gdb) p rsdt->Length
> Cannot access memory at address 0xc1004c01
> (gdb) p rsdp->RsdtPhysicalAddress
> $6 = 0xfcbfd
>
>
> rsdp seems to point to valid data, p->RsdtPhysicalAddress also, but
> rsdt->Length gives an gdb error, and in any case seems wrong (0xffffffff).
>
> so i hope all this helps someone,
>
> 	danny
> PS: i think i should change the subject to: 'debugging on the Bleeding
> Edge'

Ok, this is helpful.  How about first installing the box using safe mode,
because this will be a lot easier to debug if you can build custom kernels.
Next, add some printf's to dump out rsdt->Length in the madt_probe()
function.  Then boot that kernel over the serial console and mail the output
of your printf.

-- 
John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org


