Date: Thu, 27 Feb 2003 03:44:11 +1100 (EST)
From: Bruce Evans <bde@zeta.org.au>
To: Dag-Erling Smorgrav <des@ofug.org>
Cc: current@FreeBSD.ORG, <sos@FreeBSD.ORG>
Subject: Re: ata dumps broken again
Message-ID: <20030227031649.T15538-100000@gamplex.bde.org>
In-Reply-To: <xzp7kbnz3d9.fsf@flood.ping.uio.no>

On Wed, 26 Feb 2003, Dag-Erling Smorgrav wrote:

> Top-of-tree -CURRENT:
>
> db> call doadump
> Dumping 639 MB
> ata1: resetting devices ..
> mi_switch(c4fad9ec,f,f,1c,5f74e) at mi_switch+0x21b
> ithread_schedule(c48fb380,1,c4faea50,e99cf84c,c025850c) at ithread_schedule+0xf6
> sched_ithd(f) at sched_ithd+0x38
> Xintr15() at Xintr15+0x6c
> --- interrupt, eip = 0xc017388b, esp = 0xe99cf830, ebp = 0xe99cf84c ---
> critical_exit(0,c489f900,c489f92c,e99cf884,c0128324) at critical_exit+0x2b
> DELAY(a,256c,82,40267d87,0) at DELAY+0x47
> ata_wait(c489f92c,40,0,0,0) at ata_wait+0x84
> ata_command(c489f92c,c6,0,0,10) at ata_command+0x2c5
> ad_reinit(c489f92c,c489f92c,ec) at ad_reinit+0x30
> ata_reinit(c489f900,c489f900,1,e99cf960,e99cf9a8) at ata_reinit+0x265
> addump(c48f3764,c02f67c0,0,18003c00,0,200) at addump+0xe8
> dumpsys(c02cee20,c02cee40,b,e99cf9f8,c016eec0) at dumpsys+0x28b
> doadump(0,0,0,0,0,0,0,0,0,0) at doadump+0x20
> db_fncall(0,0,e99cfaa8,e99cfa60,0) at db_fncall+0x7c
> db_command(c02a3380,c02a31a0,c029de74,c029de78,c028024d) at db_command+0xfb
> db_command_loop(0,0,e99cfc28,c02c1ec8,e99cfb4c) at db_command_loop+0x5c
> db_trap(c,0,1,10,e99cfbe0) at db_trap+0x5e
> kdb_trap(c,0,e99cfbe0) at kdb_trap+0xe6
> trap_fatal(e99cfbe0,c4,c4faea50,12ab9a0,0) at trap_fatal+0x1cc
> trap_pfault(e99cfbe0,0,c4) at trap_pfault+0x154
> trap(18,10,10,c7886300,c4caf500) at trap+0x38b
> calltrap() at calltrap+0x5
> --- trap 0xc, eip = 0xc01e94fb, esp = 0xe99cfc20, ebp = 0xe99cfc60 ---
> in6_pcbbind(c4bc1390,c7886300,c4faea50) at in6_pcbbind+0x1fb
> tcp6_usr_bind(c4caf500,c7886300,c4faea50) at tcp6_usr_bind+0x9f
> sobind(c4caf500,c7886300,c4faea50,c4caf500,e99cfd14) at sobind+0x16
> kern_bind(c4faea50,3,c7886300,c7886300,0) at kern_bind+0x70
> bind(c4faea50) at bind+0x30
> syscall(2f,2f,2f,804a3e0,0) at syscall+0x310
> Xint0x80_syscall() at Xint0x80_syscall+0x1d
> --- syscall (104), eip = 0x280b1a63, esp = 0xbfbffa2c, ebp = 0xbfbffa88 ---
> Context switches not allowed in the debugger.
>
> (kgdb) l *(ad_reinit+0x30)
> 0xc0133770 is in ad_reinit (../../../dev/ata/ata-disk.c:874).
> 869
> 870         /* reinit disk parameters */
> 871         ad_invalidatequeue(atadev->driver, NULL);
> 872         ata_command(atadev, ATA_C_SET_MULTI, 0,
> 873                     adp->transfersize / DEV_BSIZE, 0, ATA_WAIT_READY);
> 874         atadev->setmode(atadev, adp->device->mode);
> 875     }
> 876
> 877     void
> 878     ad_print(struct ad_softc *adp)
> (kgdb) l *(ata_command+0x2c5)
> 0xc01287a5 is in ata_command (../../../dev/ata/ata-all.c:1126).
> 1121            break;
> 1122
> 1123        case ATA_WAIT_READY:
> 1124            atadev->channel->active |= ATA_WAIT_READY;
> 1125            ATA_OUTB(atadev->channel->r_io, ATA_CMD, command);
> 1126            if (ata_wait(atadev, ATA_S_READY) < 0) {
> 1127                ata_prtdev(atadev, "timeout waiting for cmd=%02x s=%02x e=%02x\n",
> 1128                           command, atadev->channel->status,atadev->channel->error);
> 1129                error = -1;
> 1130            }

This seems to be caused by a known bug in ddb itself.  Try the following
fix.

%%%
Index: db_interface.c
===================================================================
RCS file: /home/ncvs/src/sys/i386/i386/db_interface.c,v
retrieving revision 1.70
diff -u -2 -r1.70 db_interface.c
--- db_interface.c	22 Feb 2003 23:41:27 -0000	1.70
+++ db_interface.c	23 Feb 2003 09:51:52 -0000
@@ -78,4 +78,5 @@
 kdb_trap(int type, int code, struct i386_saved_state *regs)
 {
+	u_int ef;
 	volatile int ddb_mode = !(boothowto & RB_GDB);
 
@@ -97,4 +98,8 @@
 	}
 
+	/* XXX is this correctly placed?  SMP stop/start doesn't seem to be. */
+	ef = read_eflags();
+	disable_intr();
+
 	switch (type) {
 	    case T_BPTFLT:	/* breakpoint */
@@ -217,4 +222,7 @@
 	regs->tf_cs = ddb_regs.tf_cs & 0xffff;
 	regs->tf_ds = ddb_regs.tf_ds & 0xffff;
+
+	write_eflags(ef);
+
 	return (1);
 }
%%%

The ata driver apparently wants to wait (without sleeping), but an
interrupt occurs and the scheduler wants to switch.  The patch fixes the
bug of letting interrupts occur within ddb when ddb is entered for most
fatal traps (entering ddb via a ddb trap doesn't have this bug).
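
The save/disable/restore pattern used by the patch can be modeled in user
space.  What follows is only a minimal sketch, not kernel code: the sim_*
names and the simulated flags word are assumptions made up for
illustration, standing in for read_eflags(), disable_intr() and
write_eflags().  Only the PSL_I value (0x200, the i386 interrupt-enable
flag) comes from the architecture.

```c
#include <stdint.h>

#define PSL_I 0x200u	/* interrupt-enable bit in i386 EFLAGS */

/* Simulated flags word; a hypothetical stand-in for the CPU's EFLAGS. */
uint32_t sim_eflags = PSL_I;
/* Counts simulated interrupts that were actually delivered. */
int sim_interrupts_taken = 0;

uint32_t sim_read_eflags(void) { return sim_eflags; }
void sim_write_eflags(uint32_t ef) { sim_eflags = ef; }
void sim_disable_intr(void) { sim_eflags &= ~PSL_I; }

/*
 * Model a device raising an interrupt.  It is delivered (returns 1)
 * only while the interrupt-enable bit is set; otherwise it is held off.
 */
int
sim_raise_interrupt(void)
{
	if (sim_eflags & PSL_I) {
		sim_interrupts_taken++;
		return (1);
	}
	return (0);
}

/*
 * Model of debugger entry: save the flags, disable interrupts, do the
 * work, then restore the saved flags.  Returns the number of interrupts
 * delivered while "inside the debugger".
 */
int
sim_debugger_entry(void)
{
	uint32_t ef;
	int before, delivered;

	ef = sim_read_eflags();
	before = sim_interrupts_taken;
	sim_disable_intr();

	/* A polling driver waits here while its device interrupts. */
	sim_raise_interrupt();
	delivered = sim_interrupts_taken - before;

	/* Restore the caller's state instead of enabling unconditionally. */
	sim_write_eflags(ef);
	return (delivered);
}
```

In this model, restoring the saved word rather than unconditionally
re-enabling means a caller that entered with interrupts already disabled
leaves with them still disabled.
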
The only obvious bug in the driver is the syntax error in the resetting
message.

I don't understand why the scheduler wants to switch.  kern_bind() holds
Giant, the interrupt is for ata, and the ata interrupt handler is not
INTR_MPSAFE, so it shouldn't be switched to.  Maybe the interrupt is
shared and is attached to an INTR_MPSAFE handler.  Any active interrupt
attached to an INTR_MPSAFE handler would cause this problem, but the
trace doesn't show any others.

Bruce

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message