From owner-freebsd-questions Thu Jan 20 22:35:55 2000
Delivered-To: freebsd-questions@freebsd.org
Received: from server.baldwin.cx (jobaldwi.campus.vt.edu [198.82.67.146])
    by hub.freebsd.org (Postfix) with ESMTP id 4D406151D3
    for ; Thu, 20 Jan 2000 22:35:48 -0800 (PST)
    (envelope-from jhb@FreeBSD.org)
Received: from john.baldwin.cx (john [10.0.0.2])
    by server.baldwin.cx (8.9.3/8.9.3) with ESMTP id BAA73206;
    Fri, 21 Jan 2000 01:35:33 -0500 (EST)
    (envelope-from jhb@FreeBSD.org)
Message-Id: <200001210635.BAA73206@server.baldwin.cx>
X-Mailer: XFMail 1.4.0 on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <20000121105518.N481@mojave.worldwide.lemis.com>
Date: Fri, 21 Jan 2000 01:35:33 -0500 (EST)
From: John Baldwin
To: Greg Lehey
Subject: Re: Recoverving/reviving a 'stale' subdisk under vinum
Cc: freebsd-questions@FreeBSD.org, cjclark@home.com
Sender: owner-freebsd-questions@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On 21-Jan-00 Greg Lehey wrote:
> On Thursday, 20 January 2000 at 19:15:43 -0500, Crist J. Clark wrote:
>> On Thu, Jan 20, 2000 at 01:56:07PM -0500, John H. Baldwin wrote:
>>> I've read the vinum(4) and vinum(8) manpages as well as the webpages at
>>> www.lemis.com/~grog/vinum.html, and while they are very good as far as
>>> setup and configuration info, I haven't been able to find a lot of info
>>> about recovering. I have a stale subdisk that I can't get to recover no
>>> matter how many different start commands I try. I've tried starting the
>>> volume, the plex, and the subdisk itself with no success.
>>>
>>> # vinum list
>>> Configuration summary
>>>
>>> Drives:         3 (4 configured)
>>> Volumes:        1 (4 configured)
>>> Plexes:         1 (8 configured)
>>> Subdisks:       3 (16 configured)
>>>
>>> D vinumdrive0        State: up      Device /dev/da1s1e  Avail: 0/8683 MB (0%)
>>> D vinumdrive1        State: up      Device /dev/da2s1e  Avail: 0/8683 MB (0%)
>>> D vinumdrive2        State: up      Device /dev/da3s1e  Avail: 0/8683 MB (0%)
>>>
>>> V ftp_mirror         State: up      Plexes: 1    Size: 25 GB
>>>
>>> P ftp_mirror.p0    S State: corrupt Subdisks: 3  Size: 25 GB
>>>
>>> S ftp_mirror.p0.s0   State: up      PO: 0 B      Size: 8683 MB
>>> S ftp_mirror.p0.s1   State: up      PO: 256 kB   Size: 8683 MB
>>> S ftp_mirror.p0.s2   State: stale   PO: 512 kB   Size: 8683 MB
>>>
>>> # vinum start ftp_mirror.p0.s2
>>> Can't start ftp_mirror.p0.s2: Device busy (16)
>
> Hmm. That shouldn't happen.

Well, that's comforting. :)

>> You have to 'stop' everything first. (I might be overkilling here,
>> but better safe...)
>
> No, that's not safe. That would mean taking down the volume.

Err, oops. I already did this and it worked. I've already fsck'd the
volume and have it in use right now.

> I haven't seen this before. How about the information I ask for in
> the web page?

Ok, here's what I do have, but I did fix it using the above hackishness,
so some of it may not apply.

# uname -a
FreeBSD raven.XXXXX 3.3-STABLE FreeBSD 3.3-STABLE #0: Mon Dec 6 16:25:01 EST 1999 root@snowcow.XXXXX:/usr/source/src/sys/compile/RAVEN i386

The output of 'vinum list' you already have above. Here's some of
vinum_history, although it doesn't include any of the return values, so
I don't think it will be of much use:

20 Jan 2000 12:39:55.489661 *** vinum started ***
20 Jan 2000 12:39:55.540632 start
20 Jan 2000 12:39:55.820518 *** Created devices ***
20 Jan 2000 12:40:12.649217 *** vinum started ***
20 Jan 2000 12:40:13.502406 help
20 Jan 2000 12:40:25.188145 ls
20 Jan 2000 13:10:31.321216 start
20 Jan 2000 13:10:47.978917 start ftp_mirror.p0.s2
20 Jan 2000 13:10:50.980012 stop

That is what I did when I first brought the machine back up.

20 Jan 2000 16:21:53.536302 *** vinum started ***
20 Jan 2000 16:21:53.537010 stop ftp_mirror.p0
20 Jan 2000 16:21:58.984393 *** vinum started ***
20 Jan 2000 16:21:58.985133 list
20 Jan 2000 16:22:06.561902 *** vinum started ***
20 Jan 2000 16:22:06.562622 stop ftp_mirror.p0.s2
20 Jan 2000 16:22:17.000952 *** vinum started ***
20 Jan 2000 16:22:17.005242 stop -f ftp_mirror.p0.s2
20 Jan 2000 16:22:21.145993 *** vinum started ***
20 Jan 2000 16:22:21.146744 list
20 Jan 2000 16:22:40.709634 *** vinum started ***
20 Jan 2000 16:22:40.710394 start ftp_mirror
20 Jan 2000 16:22:54.393075 *** vinum started ***
20 Jan 2000 16:22:54.393778 start ftp_mirror.p0.s0
20 Jan 2000 16:23:00.238272 *** vinum started ***
20 Jan 2000 16:23:00.239015 list
20 Jan 2000 16:23:09.552251 *** vinum started ***
20 Jan 2000 16:23:09.552963 start ftp_mirror.p0.s1
20 Jan 2000 16:23:16.193159 *** vinum started ***
20 Jan 2000 16:23:16.193896 start ftp_mirror.p0.s2

That is how I "fixed" it.
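
For anyone skimming the second history excerpt, the state-changing
commands can be separated from the "*** vinum started ***" markers and
the read-only 'list' calls with a few lines of script. This is only a
rough sketch in Python, assuming every history line has the form
"DD Mon YYYY HH:MM:SS.micros <entry>" as in the excerpt above; the
function name extract_commands is made up for illustration and is not
part of vinum. Note that Greg says above that stopping objects like this
is not safe, so treat the resulting list as a record of what was typed,
not as a recommended procedure.

```python
import re

# The second session, copied from the vinum_history excerpt above.
HISTORY = """\
20 Jan 2000 16:21:53.536302 *** vinum started ***
20 Jan 2000 16:21:53.537010 stop ftp_mirror.p0
20 Jan 2000 16:21:58.984393 *** vinum started ***
20 Jan 2000 16:21:58.985133 list
20 Jan 2000 16:22:06.561902 *** vinum started ***
20 Jan 2000 16:22:06.562622 stop ftp_mirror.p0.s2
20 Jan 2000 16:22:17.000952 *** vinum started ***
20 Jan 2000 16:22:17.005242 stop -f ftp_mirror.p0.s2
20 Jan 2000 16:22:21.145993 *** vinum started ***
20 Jan 2000 16:22:21.146744 list
20 Jan 2000 16:22:40.709634 *** vinum started ***
20 Jan 2000 16:22:40.710394 start ftp_mirror
20 Jan 2000 16:22:54.393075 *** vinum started ***
20 Jan 2000 16:22:54.393778 start ftp_mirror.p0.s0
20 Jan 2000 16:23:00.238272 *** vinum started ***
20 Jan 2000 16:23:00.239015 list
20 Jan 2000 16:23:09.552251 *** vinum started ***
20 Jan 2000 16:23:09.552963 start ftp_mirror.p0.s1
20 Jan 2000 16:23:16.193159 *** vinum started ***
20 Jan 2000 16:23:16.193896 start ftp_mirror.p0.s2
"""

# Assumed line layout: date, time with fractional seconds, then the entry.
LINE_RE = re.compile(r"^\d{1,2} \w{3} \d{4} \d{2}:\d{2}:\d{2}\.\d+ (.+)$")

# Entries that only display information and do not change object state.
READ_ONLY = {"list", "ls", "help"}


def extract_commands(text):
    """Return the state-changing vinum commands from a history excerpt."""
    commands = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not a history line in the assumed format
        entry = match.group(1)
        if entry.startswith("***"):
            continue  # session markers such as "*** vinum started ***"
        if entry.split()[0] in READ_ONLY:
            continue
        commands.append(entry)
    return commands


if __name__ == "__main__":
    for command in extract_commands(HISTORY):
        print("vinum " + command)
```

Run as a script, this prints the stop and start commands in the order
they were issued, each prefixed with "vinum".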

However, the drive seems to have fallen over again (*sigh*) with the
following kernel messages:

Jan 20 23:28:38 raven /kernel: (da2:ahc1:0:1:0): SCB 0x96 - timed out while idle, LASTPHASE == 0x1, SEQADDR == 0xa
Jan 20 23:28:38 raven /kernel: (da2:ahc1:0:1:0): Queuing a BDR SCB
Jan 20 23:28:38 raven /kernel: (da2:ahc1:0:1:0): Bus Device Reset Message Sent
Jan 20 23:28:38 raven /kernel: (da2:ahc1:0:1:0): no longer in timeout, status = 34b
Jan 20 23:28:38 raven /kernel: ahc1: Bus Device Reset on A:1. 1 SCBs aborted

Note that I didn't get this message until after the machine had been up
for a while; the kernel found the drive fine during boot:

ahc1: rev 0x00 int a irq 10 on pci2.9.0
ahc1: aic7890/91 Wide Channel A, SCSI Id=7, 16/255 SCBs
...
da2 at ahc1 bus 0 target 1 lun 0
da2: Fixed Direct Access SCSI-2 device
da2: 80.000MB/s transfers (40.000MHz, offset 15, 16bit), Tagged Queueing Enabled
da2: 8683MB (17783240 512 byte sectors: 255H 63S/T 1106C)

Hope this is enough info, and hope your stay in India is going well.

> Greg

-- 
John Baldwin -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.cslab.vt.edu/~jobaldwi/pgpkey.asc
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-questions" in the body of the message