From owner-freebsd-stable Wed Jan 3 16:04:38 2001
From owner-freebsd-stable@FreeBSD.ORG Wed Jan 3 16:04:31 2001
Return-Path:
Delivered-To: freebsd-stable@freebsd.org
Received: from wantadilla.lemis.com (wantadilla.lemis.com [192.109.197.80])
        by hub.freebsd.org (Postfix) with ESMTP id 311D437B400
        for ; Wed, 3 Jan 2001 16:04:29 -0800 (PST)
Received: by wantadilla.lemis.com (Postfix, from userid 1004)
        id 481ED6A911; Thu, 4 Jan 2001 10:34:26 +1030 (CST)
Date: Thu, 4 Jan 2001 10:34:26 +1030
From: Greg Lehey
To: Roman Shterenzon
Cc: Daniel Lang , freebsd-stable@freebsd.org
Subject: Re: Vinum saga continues
Message-ID: <20010104103426.C4336@wantadilla.lemis.com>
References: <20010103141514.A381@jamus.xpert.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
User-Agent: Mutt/1.2.5i
In-Reply-To: <20010103141514.A381@jamus.xpert.com>; from roman@jamus.xpert.com on Wed, Jan 03, 2001 at 02:15:14PM +0200
Organization: LEMIS, PO Box 460, Echunga SA 5153, Australia
Phone: +61-8-8388-8286
Fax: +61-8-8388-8725
Mobile: +61-418-838-708
WWW-Home-Page: http://www.lemis.com/~grog
X-PGP-Fingerprint: 6B 7B C3 8C 61 CD 54 AF 13 24 52 F8 6D A4 95 EF
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

[Format recovered--see http://www.lemis.com/email/email-format.html]

On Wednesday, 3 January 2001 at 14:15:14 +0200, Roman Shterenzon wrote:
> Hi,
>
> Attached is the most valuable information that was in my pr 22103.
> I've read the vinumdebug and the other guy's PR.
> I'm still not getting what is missing.
> You told the other guy to submit the backtrace, but it was in fact submitted!
> It's as well in my PR as well.
> Your responses are very brief - "please read vinumdebug", but in fact, if
> there's something that is missing, you can be more specific.

OK. I don't know what's so difficult about this, but here we go.

On the web page to which I refer, I say:

  If you need to contact me because of problems with Vinum, please
  send me a mail message with the following information:

  - What problems are you having?

You don't say this, but I suppose it's obvious.

  - Which version of FreeBSD are you running?

I can't find this in your report.

  - Have you made any changes to the system sources, including Vinum?

I can't find this in your report.

  - Supply the output of the vinum list command. If you can't start
    Vinum, supply the on-disk configuration, as described below. If
    you can't start Vinum, then (and only then) send a copy of the
    configuration file.

I can't find this in your report.

  - Supply an extract of the Vinum history file. Unless you have
    explicitly renamed it, it will be /var/log/vinum_history. This
    file can get very big; please limit it to the time around when
    you have the problems. Each line contains a timestamp at the
    beginning, so you will have no difficulty in establishing which
    data is of relevance.

I can't find this in your report.

  - Supply an extract of the file /var/log/messages. Restrict the
    extract to the same time frame as the history file. Again, each
    line contains a timestamp at the beginning, so you will have no
    difficulty in establishing which data is of relevance.

I can't find this in your report.

  - If you have a crash, please supply a backtrace from the dump
    analysis as discussed below under Kernel Panics. Please don't
    delete the crash dump; it may be needed for further analysis.

Basically, all I can see here is the backtrace, which is still wrapped
at 80 characters, despite all my requests. I've had to manually
reformat it to make it legible. Have you really read the web page?

> Alfred Perlstein looked at my PR once and he thinks that it's due to
> stack smashing.
> However, he wasn't able to find where it happens.
> It may be in fact interaction with some other driver, like you said, for
> example - fxp. This is why I submitted the dmesg output.

Please, only if I ask for it.

> #62 0xc023660b in trap (frame={tf_fs = 0xc0270010, tf_es = 0xc0150010, tf_ds = 0x680010, tf_edi = 0xc16e9588,
> tf_esi = 0xc16e9400, tf_ebp = 0xc02773b0, tf_isp = 0xc0277380, tf_ebx = 0xc208e340, tf_edx = 0x0,
> tf_ecx = 0x5610001, tf_eax = 0xff9773bf, tf_trapno = 0xc, tf_err = 0x2, tf_eip = 0xc150fc67, tf_cs = 0x8,
> tf_eflags = 0x10246, tf_esp = 0xc16e9588, tf_ss = 0xc14bd000}) at ../../i386/i386/trap.c:426
> #63 0xc150fc67 in complete_rqe () at /usr/src/sys/modules/vinum/../../dev/vinum/vinuminterrupt.c:199
> #64 0xc0178d6b in biodone (bp=0xc16e9588) at ../../kern/vfs_bio.c:2637
> #65 0xc0126bb9 in dadone (periph=0xc14ca700, done_ccb=0xc1808400) at ../../cam/scsi/scsi_da.c:1246
> #66 0xc0122aff in camisr (queue=0xc0298690) at ../../cam/cam_xpt.c:6319
> #67 0xc0122911 in swi_cambio () at ../../cam/cam_xpt.c:6222
> #68 0xc022d0e0 in splz_swi ()
> (kgdb) up 63
> #64 0xc0178d6b in biodone (bp=0xc16e9588) at ../../kern/vfs_bio.c:2637
> 2637 (*bp->b_iodone) (bp);
> (kgdb) print bp
> $1 = (struct buf *) 0xc16e9588
> (kgdb) print *bp->b_iodone
> $2 = {void ()} 0xc150f6ac
> (kgdb) down
> #63 0xc150fc67 in complete_rqe () at /usr/src/sys/modules/vinum/../../dev/vinum/vinuminterrupt.c:199
> 199 }
> (kgdb) list
> 194 VOL[rq->volplex.volno].active--; /* another request finished */
> 195 biodone(ubp); /* top level buffer completed */
> 196 freerq(rq); /* return the request storage */
> 197 }
> 198 }
> 199 }
> (kgdb) down
> #62 0xc023660b in trap (frame={tf_fs = 0xc0270010, tf_es = 0xc0150010, tf_ds = 0x680010, tf_edi = 0xc16e9588,
> tf_esi = 0xc16e9400, tf_ebp = 0xc02773b0, tf_isp = 0xc0277380, tf_ebx = 0xc208e340, tf_edx = 0x0,
> tf_ecx = 0x5610001, tf_eax = 0xff9773bf, tf_trapno = 0xc, tf_err = 0x2, tf_eip = 0xc150fc67, tf_cs = 0x8,
> tf_eflags = 0x10246, tf_esp = 0xc16e9588, tf_ss = 0xc14bd000}) at ../../i386/i386/trap.c:426
> 426 (void) trap_pfault(&frame, FALSE, eva);
> (kgdb) up 2
> #64 0xc0178d6b in biodone (bp=0xc16e9588) at ../../kern/vfs_bio.c:2637
> 2637 (*bp->b_iodone) (bp);
> (kgdb) up
> #65 0xc0126bb9 in dadone (periph=0xc14ca700, done_ccb=0xc1808400) at ../../cam/scsi/scsi_da.c:1246
> 1246 biodone(bp);
> (kgdb) print bp
> $3 = (struct buf *) 0xc16e9588
> (kgdb) print *bp
> b_flags = 0x204,
> b_qindex = 0x0,
> b_xflags = 0x0,
> b_lock = {
>   lk_interlock = {
>     lock_data = 0x0
>   },
>   lk_flags = 0x400,
>   lk_sharecount = 0x0,
>   lk_waitcount = 0x0,
>   lk_exclusivecount = 0x1,
>   lk_prio = 0x14,
>   lk_wmesg = 0xc0257a24 "bufwait",
>   lk_timo = 0x0,
>   lk_lockholder = 0x5
> },
> b_error = 0x0,
> b_bufsize = 0x2000,
> b_bcount = 0x2000,
> b_resid = 0x0,
> b_dev = 0xc15cd880,
> b_data = 0xcbdcc000 "jA\002",
> b_kvabase = 0x0,
> b_kvasize = 0x0,
> b_lblkno = 0x0,
> b_blkno = 0x2b08149,
> b_offset = 0x0,
> b_iodone = 0xc150f6ac ,

OK, this is *not* the buffer header corruption bug, but it's happening
in a very similar position. With the buffer header corruption, you
wouldn't have got as far as this, because b_iodone would be zeroed
out. I also can't see any other obvious damage to the buffer header.

What we need to do now is to find out where the trap occurred. That's
at line 199 of complete_rqe, which shows as the very end of the
function. Could you give me the following information from gdb,
please?

  (gdb) x/20i 0xc150fc60

Thanks
Greg
--
When replying to this message, please take care not to mutilate the
original text.
For more information, see http://www.lemis.com/email.html
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message
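
As a reading aid for frame #62 in the backtrace, here is a minimal sketch that decodes the two trap frame fields tf_trapno = 0xc and tf_err = 0x2. It assumes the usual i386 convention that trap number 12 is a page fault and that the error code follows the architectural x86 page-fault bit layout. The constant and function names are made up for this example and are not taken from the kernel sources or from the original message.

```python
# Sketch only: decode two fields of the i386 trap frame quoted in the
# backtrace.  All names here are local to this example (assumptions),
# not identifiers from the FreeBSD sources.

T_PAGEFLT = 12          # i386 trap number for a page fault (0xc)

# Architectural x86 page-fault error-code bits.
PF_PROTECTION = 0x1     # set: protection violation; clear: page not present
PF_WRITE = 0x2          # set: write access; clear: read access
PF_USER = 0x4           # set: fault taken in user mode; clear: supervisor


def describe_trap(trapno, err):
    """Return a one-line description of a trap number and error code."""
    if trapno != T_PAGEFLT:
        return "trap %#x: not a page fault, error code not decoded" % trapno
    cause = "protection violation" if err & PF_PROTECTION else "page not present"
    access = "write" if err & PF_WRITE else "read"
    mode = "user" if err & PF_USER else "supervisor"
    return "page fault: %s access in %s mode, %s" % (access, mode, cause)


if __name__ == "__main__":
    # Values copied from frame #62: tf_trapno = 0xc, tf_err = 0x2
    print(describe_trap(0xc, 0x2))
```

Under those assumptions the frame describes a kernel-mode write to a page that was not present. The faulting address itself is not part of the frame shown, so the sketch says nothing about which address was touched; the disassembly around tf_eip is still needed for that.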