Date: Sun, 31 Aug 2003 01:08:54 -0600 From: "Kenneth D. Merry" <ken@kdm.org> To: Poul-Henning Kamp <phk@phk.freebsd.dk> Cc: current@FreeBSD.org Subject: Re: need some debugging help Message-ID: <20030831070854.GA54748@panzer.kdm.org> In-Reply-To: <3821.1062228130@critter.freebsd.dk> References: <20030830040357.GA42770@panzer.kdm.org> <3821.1062228130@critter.freebsd.dk>
next in thread | previous in thread | raw e-mail | index | archive | help
--vkogqOf2sHV7VnPd Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Sat, Aug 30, 2003 at 09:22:10 +0200, Poul-Henning Kamp wrote: > In message <20030830040357.GA42770@panzer.kdm.org>, "Kenneth D. Merry" writes: > > >I think I have everything setup correctly, but I keep getting panics inside > >the GEOM code with these patches. (Memory modified after free.) I don't > >know whether I've just exposed some race condition, or whether I've done > >something wrong. > > Do you have any idea what goes on at/right before the panic ? > > Ie: has drives been created [disk_create()] or removed [disk_destroy()] > right before ? Well, the panic happens right after the probes complete for the various CAM devices in the system. I've got 4 SCSI disks in the system, so disk_create() has just been called 4 times. That's also the point where I fire off the task queue to create the sysctl variables for each device, though. (Right after the probe message is printed.) > My best shot, would be that disk_destroy() was called and something > somehow fiddled the related structures subsequently. > > You may want to set kern.geom.debugflags=N and see if that offers > any clues. > > N |= 1 topology events > N |= 2 bio processing (ie: many lines for each I/O) > N |= 4 access processing (open/close) I wouldn't expect that disk_destroy() is getting called, although I suppose that's possible. I had to make kern.geom.debugflags a tunable in order to get the debugging output. (The system panics before root gets mounted, so there's no way to set a sysctl variable.) Anyway, I got some debugging output, and I've attached dmesg output. Let me know whether anything in there looks suspicious or points to a possible problem. Thanks, Ken -- Kenneth Merry ken@kdm.org --vkogqOf2sHV7VnPd Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="session.log.geom_trace.20030831" Content-Transfer-Encoding: quoted-printable Copyright (c) 1992-2003 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD 5.1-CURRENT #8: Sun Aug 31 00:48:50 MDT 2003 ken@nargothrond.kdm.org:/usr/home/ken/perforce2/FreeBSD-ken/src/sys/i38= 6/compile/gondolin Preloaded elf kernel "/boot/kernel.test64/kernel" at 0xc05c8000. Preloaded elf module "/boot/kernel.test64/acpi.ko" at 0xc05c8274. Timecounter "i8254" frequency 1193182 Hz quality 0 CPU: Intel(R) Pentium(R) III CPU family 1266MHz (1266.07-MHz 686-class= CPU) Origin =3D "GenuineIntel" Id =3D 0x6b1 Stepping =3D 1 Features=3D0x383fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE= ,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE> real memory =3D 2684289024 (2559 MB) avail memory =3D 2607640576 (2486 MB) Programming 16 pins in IOAPIC #0 IOAPIC #0 intpin 2 -> irq 0 Programming 16 pins in IOAPIC #1 FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs cpu0 (BSP): apic id: 0, version: 0x00040011, at 0xfee00000 cpu1 (AP): apic id: 1, version: 0x00040011, at 0xfee00000 io0 (APIC): apic id: 4, version: 0x000f0011, at 0xfec00000 io1 (APIC): apic id: 5, version: 0x000f0011, at 0xfec01000 g_ignition g_modevent(MD, LOAD) g_post_event_x(0xc021e050, 0xc47896a0, 2 g_modevent(BSD, LOAD) g_post_event_x(0xc021e050, 0xc4789690, 2 g_modevent(MBR, LOAD) g_post_event_x(0xc021e050, 0xc4789680, 2 g_modevent(MBREXT, LOAD) g_post_event_x(0xc021e050, 0xc4789670, 2 g_modevent(CCD, LOAD) g_post_event_x(0xc021e050, 0xc4789660, 2 g_modevent(DEV, LOAD) g_post_event_x(0xc021e050, 0xc4789650, 2 g_modevent(DISK, LOAD) g_post_event_x(0xc021e050, 0xc4789620, 2 Pentium Pro MTRR support enabled ACPI-0660: *** Warning: Type override - [DEB_] had invalid type (Intege= r) for Scope operator, changed to (Scope) ACPI-0660: *** Warning: Type override - [MLIB] had invalid type (Intege= r) for Scope operator, changed to (Scope) ACPI-0660: *** Warning: Type override - [IO__] had invalid type (Intege= r) for Scope operator, changed to (Scope) ACPI-0660: *** Warning: Type override - [DATA] had invalid type (String= ) for Scope operator, changed to (Scope) ACPI-0660: *** Warning: Type override - [SIO_] had invalid type (String= ) for Scope operator, changed to (Scope) ACPI-0660: *** Warning: Type override - [SB__] had invalid type (String= ) for Scope operator, changed to (Scope) ACPI-0660: *** Warning: Type override - [PM__] had invalid type (String= ) for Scope operator, changed to (Scope) ACPI-0660: *** Warning: Type override - [ICNT] had invalid type (String= ) for Scope operator, changed to (Scope) ACPI-0660: *** Warning: Type override - [ACPI] had invalid type (String= ) for Scope operator, changed to (Scope) ACPI-0660: *** Warning: Type override - [OSB4] had invalid type (String= ) for Scope operator, changed to (Scope) ACPI-0660: *** Warning: Type override - [PM__] had invalid type (String= ) for Scope operator, changed to (Scope) ACPI-0660: *** Warning: Type override - [BIOS] had invalid type (Intege= r) for Scope operator, changed to (Scope) ACPI-0660: *** Warning: Type override - [CMOS] had invalid type (Intege= r) for Scope operator, changed to (Scope) npx0: <math processor> on motherboard npx0: INT 16 interface acpi0: <RCC RCCNILE > on motherboard acpi0: power button is handled as a fixed feature programming model. Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000 pcibios: BIOS version 2.10 Using $PIR table, 10 entries at 0xc00f52e0 acpi_timer0: <32-bit timer at 3.579545MHz> port 0x508-0x50b on acpi0 acpi_cpu0: <CPU> port 0x530-0x537 on acpi0 acpi_cpu1: <CPU> port 0x530-0x537 on acpi0 acpi_cpu2: <CPU> port 0x530-0x537 on acpi0 acpi_cpu3: <CPU> port 0x530-0x537 on acpi0 acpi_button0: <Sleep Button> on acpi0 pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0 pci0: <ACPI PCI bus> on pcib0 IOAPIC #1 intpin 10 -> irq 2 IOAPIC #1 intpin 11 -> irq 5 IOAPIC #1 intpin 15 -> irq 9 pcib1: <ACPI PCI-PCI bridge> at device 0.1 on pci0 pci1: <ACPI PCI bus> on pcib1 IOAPIC #1 intpin 14 -> irq 11 pci1: <display, VGA> at device 0.0 (no driver attached) ahc0: <Adaptec aic7899 Ultra160 SCSI adapter> port 0xd000-0xd0ff mem 0xfeaf= c000-0xfeafcfff irq 2 at device 5.0 on pci0 aic7899: Ultra160 Wide Channel A, SCSI Id=3D7, 32/253 SCBs ahc1: <Adaptec aic7899 Ultra160 SCSI adapter> port 0xd800-0xd8ff mem 0xfeaf= f000-0xfeafffff irq 5 at device 5.1 on pci0 aic7899: Ultra160 Wide Channel B, SCSI Id=3D7, 32/253 SCBs fxp0: <Intel 82559 Pro/100 Ethernet> port 0xd400-0xd43f mem 0xfe900000-0xfe= 9fffff,0xfeafd000-0xfeafdfff irq 9 at device 6.0 on pci0 fxp0: Ethernet address 00:30:48:21:bb:74 miibus0: <MII bus> on fxp0 inphy0: <i82555 10/100 media interface> on miibus0 inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto isab0: <PCI-ISA bridge> port 0x580-0x58f at device 15.0 on pci0 isa0: <ISA bus> on isab0 atapci0: <ServerWorks ROSB4 UDMA33 controller> port 0xffa0-0xffaf at device= 15.1 on pci0 ata0: at 0x1f0 irq 14 on atapci0 ata1: at 0x170 irq 15 on atapci0 pci0: <serial bus, USB> at device 15.2 (no driver attached) pcib2: <ACPI Host-PCI bridge> on acpi0 pci2: <ACPI PCI bus> on pcib2 IOAPIC #1 intpin 8 -> irq 16 ti0: <Netgear GA620 1000baseT Gigabit Ethernet> mem 0xfebfc000-0xfebfffff i= rq 16 at device 2.0 on pci2 ti0: Ethernet address: 00:02:e3:00:34:10 atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0 atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0 kbd0 at atkbd0 psm0: <PS/2 Mouse> irq 12 on atkbdc0 psm0: model Generic PS/2 mouse, device ID 0 fdc0: cmd 3 failed at out byte 1 of 3 sio0 port 0x3f8-0x3ff irq 4 on acpi0 sio0: type 16550A, console sio1 port 0x2f8-0x2ff irq 3 on acpi0 sio1: type 16550A ppc0 port 0x778-0x77f,0x378-0x37f irq 7 drq 3 on acpi0 ppc0: Generic chipset (ECP/PS2/NIBBLE) in COMPATIBLE mode ppc0: FIFO with 16/16/8 bytes threshold ppbus0: <Parallel port bus> on ppc0 plip0: <PLIP network interface> on ppbus0 lpt0: <Printer> on ppbus0 lpt0: Interrupt-driven port ppi0: <Parallel I/O> on ppbus0 fdc0: cmd 3 failed at out byte 1 of 3 orm0: <Option ROMs> at iomem 0xd1800-0xd27ff,0xc0000-0xcbfff on isa0 fdc0: <Enhanced floppy controller (i82077, NE72065 or clone)> at port 0x3f7= ,0x3f0-0x3f5 irq 6 drq 2 on isa0 fdc0: FIFO enabled, 8 bytes threshold fd0: <1440-KB 3.5" drive> on fdc0 drive 0 vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 sc0: <System console> at flags 0x100 on isa0 sc0: VGA <16 virtual consoles, flags=3D0x100> pmtimer0 on isa0 APIC_IO: Testing 8254 interrupt delivery APIC_IO: Broken MP table detected: 8254 is not connected to IOAPIC #0 intpi= n 2 APIC_IO: routing 8254 via 8259 and IOAPIC #0 intpin 0 Timecounters tick every 10.000 msec g_load_class(MD) g_load_class(BSD) g_load_class(MBR) g_load_class(MBREXT) g_load_class(CCD) g_load_class(DEV) g_load_class(DISK) acpi_cpu: throttling enabled, 16 steps (100% to 6.2%), currently 100.0% acd0: DVDR <SONY DVD RW DRU-500A> at ata0-master UDMA33 Waiting 2 seconds for SCSI devices to settle g_post_event_x(0xc021a9c0, 0xcb9d0450, 2 ref 0xcb9d0450 g_post_event_x(0xc021a9c0, 0xcb9d1850, 2 ref 0xcb9d1850 g_post_event_x(0xc021a9c0, 0xcbd75c50, 2 ref 0xcbd75c50 g_post_event_x(0xc021a9c0, 0xcbd76050, 2 ref 0xcbd76050 da0 at ahc0 bus 0 target 0 lun 0 da0: <SEAGATE ST318406LW 0108> Fixed Direct Access SCSI-3 device=20 da0: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing E= nabled da0: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C) da1 at ahc0 bus 0 target 1 lun 0 da1: <SEAGATE ST318406LW 0108> Fixed Direct Access SCSI-3 device=20 da1: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing E= nabled da1: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C) da2 at ahc0 bus 0 target 2 lun 0 da2: <IBM DDYS-T36950N S93E> Fixed Direct Access SCSI-3 device=20 da2: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing E= nabled da2: 35003MB (71687340 512 byte sectors: 255H 63S/T 4462C) da3 at ahc0 bus 0 target 3 lun 0 da3: <IBM DDYS-T36950N S93E> Fixed Direct Access SCSI-3 device=20 da3: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing E= nabled da3: 35003MB (71687340 512 byte sectors: 255H 63S/T 4462C) cd1 at ata0 bus 0 target 0 lun 0 cd1: <SONY DVD RW DRU-500A 1.0c> Removable CD-ROM SCSI-0 device=20 cd1: 33.000MB/s transfers cd1: cd present [278942 x 2048 byte records] g_post_event_x(0xc021ea70, 0xcbd4f100, 2 ref 0xcbd4f100 g_post_event_x(0xc021ea70, 0xcbd4f700, 2 ref 0xcbd4f700 g_post_event_x(0xc021ea70, 0xcbd4f280, 2 ref 0xcbd4f280 g_post_event_x(0xc021ea70, 0xcbd4ee80, 2 ref 0xcbd4ee80 dev_taste(DEV,da0) g_mbrext_taste(MBREXT,da0) mbr_taste(MBR,da0) g_access_rel(0xcb9a2e00(da0), 1, 0, 0) open delta:[r1w0e0] old:[r0w0e0] provider:[r0w0e0] 0xcbd4f100(da0) g_disk_access(da0, 1, 0, 0) SMP: AP CPU #1 Launched! bio_getattr(GEOM::fwsectors) bio_request(0xcbda3000) from 0xcb9a2e00(da0) to 0xcbd4f100(da0) cmd 8 g_io_deliver(0xcbda3000) from 0xcb9a2e00(da0) to 0xcbd4f100(da0) cmd 8 erro= r 0 off 0 len 4 bio_request(0xcbda2f30) from 0xcb9a2e00(da0) to 0xcbd4f100(da0) cmd 1 g_io_deliver(0xcbda2f30) from 0xcb9a2e00(da0) to 0xcbd4f100(da0) cmd 1 erro= r 0 off 0 len 512 g_slice_config(da0, 0, 0) g_slice_confcidg0( daat0 ,a h1c,1 0b)us g0_ stlaircge_ecto n4f ilgu(nd a00, c2d,0 :0 )<P LgE_XsTlOiRc eC_Dc-oRnOfMi gP(Xd-a80X,C S3 ,1 .00)0> gR_esmloivcaeb_lceo nCfDi-gR(OdMa 0S,C S0I,- 21 )dev igc_ep os tc_de0v:e n1t0_.x0(000xMcB0/2s1 etar7a0n,s f0exrcsb d(41f00.8000,0 M2Hz , orfeffs e0tx c1b5d)4f 0c8d00: gc_ds lpircees_ecnotn f[i2g7(5d6a800, x1 ,2 014)8=20 bgy_tsel irceec_ocrodnsf]ig (da0, 2, 1) g_slice_config(da0, 3, 1) g_access_rel(0xcb9a2e00(da0), -1, 0, 0) open delta:[r-1w0e0] old:[r1w0e0] provider:[r1w0e0] 0xcbd4f100(da0) g_disk_access(da0, -1, 0, 0) bsd_taste(BSD,da0) g_access_rel(0xcbda1400(da0), 1, 0, 0) open delta:[r1w0e0] old:[r0w0e0] provider:[r0w0e0] 0xcbd4f100(da0) g_disk_access(da0, 1, 0, 0) bio_getattr(MBR::type) bio_request(0xcbda6120) from 0xcbda1400(da0) to 0xcbd4f100(da0) cmd 8 g_io_deliver(0xcbda6120) from 0xcbda1400(da0) to 0xcbd4f100(da0) cmd 8 erro= r -3 off 0 len 4 bio_getattr(PC98::type) bio_request(0xcbda6090) from 0xcbda1400(da0) to 0xcbd4f100(da0) cmd 8 g_io_deliver(0xcbda6090) from 0xcbda1400(da0) to 0xcbd4f100(da0) cmd 8 erro= r -3 off 0 len 4 bio_request(0xcbda6000) from 0xcbda1400(da0) to 0xcbd4f100(da0) cmd 1 g_io_deliver(0xcbda6000) from 0xcbda1400(da0) to 0xcbd4f100(da0) cmd 1 erro= r 0 off 512 len 512 bio_request(0xcbda2e10) from 0xcbda1400(da0) to 0xcbd4f100(da0) cmd 1 g_io_deliver(0xcbda2e10) from 0xcbda1400(da0) to 0xcbd4f100(da0) cmd 1 erro= r 0 off 0 len 512 g_access_rel(0xcbda1400(da0), -1, 0, 0) open delta:[r-1w0e0] old:[r1w0e0] provider:[r1w0e0] 0xcbd4f100(da0) g_disk_access(da0, -1, 0, 0) g_slice_spoiled(0xcbda1400/da0) g_wither_geom(0xcbda1440(da0)) g_detach(0xcbda1400) g_destroy_consumer(0xcbda1400) g_destroy_geom(0xcbda1440(da0)) dev_taste(DEV,da1) g_mbrext_taste(MBREXT,da1) mbr_taste(MBR,da1) g_access_rel(0xcbd59000(da1), 1, 0, 0) open delta:[r1w0e0] old:[r0w0e0] provider:[r0w0e0] 0xcbd4f700(da1) g_disk_access(da1, 1, 0, 0) bio_getattr(GEOM::fwsectors) bio_request(0xcbda3ea0) from 0xcbd59000(da1) to 0xcbd4f700(da1) cmd 8 g_io_deliver(0xcbda3ea0) from 0xcbd59000(da1) to 0xcbd4f700(da1) cmd 8 erro= r 0 off 0 len 4 bio_request(0xcbda2cf0) from 0xcbd59000(da1) to 0xcbd4f700(da1) cmd 1 g_io_deliver(0xcbda2cf0) from 0xcbd59000(da1) to 0xcbd4f700(da1) cmd 1 erro= r 0 off 0 len 512 g_slice_config(da1, 0, 0) g_slice_config(da1, 1, 0) g_slice_config(da1, 2, 0) g_slice_config(da1, 3, 0) g_slice_config(da1, 0, 1) Memory modified after free 0xcbd4f600(124) panic: Most recently used by devbuf cpuid =3D 0; lapic.id =3D 00000000 Debugger("panic") Stopped at Debugger+0x55: xchgl %ebx,in_Debugger.0 db> `=08 =08trace Debugger(c03e9757,0,c03fb4e2,e5e45af0,100) at Debugger+0x55 panic(c03fb4e2,c03e7f5a,7c,c083ac14,c083ac00) at panic+0x15f mtrash_ctor(cbd4f600,80,0,57a,cbd4f600) at mtrash_ctor+0x5d uma_zalloc_arg(c083ac00,0,102,e5e45b58,e5e45b58) at uma_zalloc_arg+0x1e4 malloc(5a,c042fae0,102,1,c02756e4) at malloc+0xd3 g_new_providerf(cb9a2f00,cb9b0b90,e5e45bf8,1,1) at g_new_providerf+0xa3 g_slice_config(cb9a2f00,0,1,7e00,0) at g_slice_config+0x259 g_mbr_modify(cb9a2f00,cbd58000,cbd73000,123,0) at g_mbr_modify+0x247 g_mbr_taste(c0470580,cbd4f700,0,159,cbd4f780) at g_mbr_taste+0x1be g_new_provider_event(cbd4f700,0,c03e52b7,b3,66666667) at g_new_provider_eve= nt+0xad one_event(e5e45d0c,c021cd85,c0485fd4,0,4c) at one_event+0x218 g_run_events(c0485fd4,0,4c,c03d7a28,a) at g_run_events+0x15 g_event_procbody(0,e5e45d48,c03e7395,314,345f184d) at g_event_procbody+0x45 fork_exit(c021cd40,0,e5e45d48) at fork_exit+0xcf fork_trampoline() at fork_trampoline+0x8 --- trap 0x1, eip =3D 0, esp =3D 0xe5e45d7c, ebp =3D 0 --- db>=20 --vkogqOf2sHV7VnPd--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20030831070854.GA54748>