Date: Wed, 05 Jul 1995 14:43:50 PDT
From: Voradesh Yenbut <yenbut@cs.washington.edu>
To: esser@zpr.uni-koeln.de (Stefan Esser)
Cc: Voradesh Yenbut <yenbut@cs.washington.edu>, hackers@freebsd.org
Subject: Re: One cause of 2.05R instability found
Message-ID: <199507052143.OAA20148@vetch.cs.washington.edu>
In-Reply-To: Your message of "Wed, 05 Jul 1995 17:52:34 +0200." <199507051552.AA06352@FileServ1.MI.Uni-Koeln.DE>

In message <199507051552.AA06352@FileServ1.MI.Uni-Koeln.DE>, Stefan Esser writes:
>Regarding problems with panics:
>
>	Fatal trap 12: page fault while in kernel mode
>
>Is this a single case ?

Yes, that was the single case of panic on my system.

>No, sorry, this statement isn't there (at ncr_complete+195) for sure ...

You are absolutely right.  That if statement isn't there in my kernel.

>Well, since there shouldn't have been any code generated
>before, there shouldn't be any difference ...

There are some differences.  One is that my system no longer crashes.
The others seem to be some shifts in the code.  More details of the code
changes are below.

>For further diagnosis, I need to know:
>
>Did you change the sources or use any NCR specific kernel
>config file options ?

I did make a trivial change in if_ed.c so that it identifies my NIC
board as 8216 instead of 8416.  At first I thought the problem was with
the modified ed driver, but changing it back to the unmodified version,
or to the version from the previous release, made no difference to the
crash.  I did not use any NCR specific kernel config file options.

>How did you identify the suspected error location in ncr.c ?

The instruction pointer at the crash points somewhere between the call
to ncb_profile() and printf(), so I simply looked for an if statement
between those locations without realizing that no code was generated
for "if (DEBUG_FLAGS & DEBUG_TINY)".  Since commenting it out made a
difference, I (incorrectly) presumed it must be present.

>;	ncb_profile (np, cp);
>	pushl %ecx
>	pushl 8(%ebp)
>	call _ncb_profile <ncr_complete+128>
>	addl $8,%esp
>
>;	if (DEBUG_FLAGS & DEBUG_TINY)
>;		printf ("CCB=%x STAT=%x/%x\n", (unsigned)cp & 0xfff,
>;			cp->host_status,cp->scsi_status);
>
>;	xp = cp->xfer;
>	movl 12(%ebp),%ecx
>	movl 452(%ecx),%edi
>
>;	cp->xfer = NULL;
>	movl $0,452(%ecx)

When the "if (DEBUG_FLAGS.." statement is actually commented out in the
source code, the "addl $8,%esp" line above moves to a location just
before "if (cp->parity_status", as shown below.  The code between the
old and the new locations of the addl is unchanged.

>All data structures should remain unchanged over the
>execution of ncr_complete(), since they are locked in a
>way that should also prevent simultaneous updates by the
>NCR ...
>
>	xp = cp->xfer;
>	cp->xfer = NULL;
>	tp = &np->target[xp->sc_link->target];
>	lp = tp->lp[xp->sc_link->lun];
>>><ncr_complete+189>
>>>	addl $8,%esp                 <<<<<< New location
>ncr_complete + 195:
>	if (cp->parity_status) {
>	...
>	{

The locations of instructions were also shifted.  For example,
ncr_complete is now at 0xf0168eb1 instead of at 0xf0168ec1.  There could
also be other changes that are not mentioned here.

>It might help to send a stack trace obtained using
>the kernel debugger ...

I am afraid that would be hard to do.  My system has 64 MB of memory and
each swap partition has only 48 MB.  Since the panic was in ncr.c, the
system was sometimes just stuck, unable to write anything to the disks.
If there is an easy way to get a dump (without changing the system
much), I might attempt it.

----
Voradesh Yenbut                   Phone: +1 206 685-0912
BOX 352350, U of Washington       FAX:   +1 206 543-2969
Seattle, WA 98195                 Email: yenbut@cs.washington.edu
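
The address arithmetic in the message (ncr_complete moving from 0xf0168ec1 to 0xf0168eb1, and locations written as ncr_complete+195) can be checked with a small sketch like the one below. It is only an illustration: the two addresses and the offset come from the message, while the function names symbol_shift, resolve and print_report are hypothetical helpers made up here, not part of ncr.c or of any kernel tool.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Addresses of ncr_complete as quoted in the message. */
#define NCR_COMPLETE_OLD 0xf0168ec1u  /* before commenting out the if */
#define NCR_COMPLETE_NEW 0xf0168eb1u  /* after commenting out the if  */

/* Hypothetical helper: signed number of bytes a symbol moved between builds. */
long long symbol_shift(uint32_t old_addr, uint32_t new_addr)
{
    return (long long)new_addr - (long long)old_addr;
}

/* Hypothetical helper: absolute address of "symbol+offset",
 * the form used in <ncr_complete+195>. */
uint32_t resolve(uint32_t symbol_addr, uint32_t offset)
{
    assert(offset <= UINT32_MAX - symbol_addr);  /* must not wrap around */
    return symbol_addr + offset;
}

/* Hypothetical helper: print the shift and where +195 lands in each build. */
void print_report(void)
{
    printf("shift: %lld bytes\n",
           symbol_shift(NCR_COMPLETE_OLD, NCR_COMPLETE_NEW));
    printf("old +195: 0x%08lx\n",
           (unsigned long)resolve(NCR_COMPLETE_OLD, 195));
    printf("new +195: 0x%08lx\n",
           (unsigned long)resolve(NCR_COMPLETE_NEW, 195));
}
```

Calling print_report() from a main() shows that the symbol starts 16 bytes lower in the new kernel, so the same symbol+offset refers to a different absolute address in each build. A fault address taken from one kernel therefore has to be resolved against the symbol table of that same kernel.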