Date:      Tue, 16 Dec 2003 21:26:14 +0100
From:      Joachim Dagerot <joachim@dagerot.nu>
To:        "Greg 'groggy' Lehey" <grog@freebsd.org>
Cc:        freebsd-questions@freebsd.org
Subject:   Re: (UPDATED DETAILS) Re: Vinum question "Incompatible sector sizes"
Message-ID:  <200312162026.hBGKQEq04292@thunder.trej.net>
In-Reply-To: <200312152337.hBFNbGq15378@thunder.trej.net>

A good and disciplined user would have waited for a knowledgeable
answer, but I can't rest until this is solved, so I tried to
follow the steps in "Replacing a failed Vinum drive"
and managed to get a drive up and running to replace the faulty one.

Now the current status of my vinum is:

*** Revive of raid.p0.s0 has stalled ***

And when I start it with "start raid.p0.s0" I get the message: 
"Can't start raid.p0.s0: Input/Output error (5)"
"*** Warning: configuration updates are disabled. ***"


I certainly hope I didn't increase the damage during this evening's
events, and that there is still some hope left.


//Joachim


------------------
 | I managed to access the /var/log/messages file. I have removed anything
 | that I know for sure you wouldn't be interested in. Please find the
 | other answers further down in the mail.
 | 
 | /var/log/messages:
 | Dec 13 09:00:00 big newsyslog[786]: logfile turned over due to size>100K
 | Dec 13 12:46:05 big kernel: ad4: soft error (ECC corrected) cmd=read fsbn 237565992 of 237565992-237566023
 | Dec 13 12:46:05 big kernel: ad4: hard error cmd=read fsbn 237565992 of 237565992-237566023 status=7f error=7f
 | Dec 13 12:46:05 big kernel: vinum: raid.p0.s1 is crashed by force
 | Dec 13 12:46:05 big kernel: vinum: raid.p0 is corrupt
 | Dec 13 12:46:05 big kernel: fatal:raid.p0.s1 read error, block 237565929 for 16384 bytes
 | Dec 13 12:46:05 big kernel: raid.p0.s1: user buffer block 475130592 for 16384 bytes
 | Dec 13 12:46:05 big kernel: ad4: soft error (ECC corrected) cmd=read fsbn 238506216 of 238506216-238506247
 | Dec 13 12:46:05 big kernel: ad4: hard error cmd=read fsbn 238506216 of 238506216-238506247 status=7f error=7f
 | Dec 13 12:46:05 big kernel: fatal:raid.p0.s1 read error, block 238506153 for 16384 bytes
 | Dec 13 12:46:05 big kernel: raid.p0.s1: user buffer block 477011872 for 16384 bytes
 | Dec 13 12:46:05 big kernel: ad4: soft error (ECC corrected) cmd=write fsbn 71
 | Dec 13 12:46:05 big kernel: ad4: hard error cmd=write fsbn 71 status=7f error=7f
 | Dec 13 12:46:05 big kernel: vinum: Can't write config to /dev/ad4s1e, error 5
 | Dec 13 12:46:05 big kernel: vinum: drive b is down
 | Dec 13 12:46:05 big kernel: ad4: soft error (ECC corrected) cmd=write fsbn 237565992 of 237565992-237566023
 | Dec 13 12:46:05 big kernel: ad4: hard error cmd=write fsbn 237565992 of 237565992-237566023 status=7f error=7f
 | Dec 13 12:46:05 big kernel: vinum: raid.p0.s1 is stale by force
 | Dec 13 12:46:05 big kernel: fatal :raid.p0.s1 write error, block 237565929 for 16384 bytes
 | Dec 13 12:46:05 big kernel: raid.p0.s1: user buffer block 475130592 for 16384 bytes
 | Dec 13 12:46:05 big kernel: ad4: soft error (ECC corrected) cmd=write fsbn 238506216 of 238506216-238506247
 | Dec 13 12:46:05 big kernel: ad4: hard error cmd=write fsbn 238506216 of 238506216-238506247 status=7f error=7f
 | Dec 13 12:46:05 big kernel: fatal :raid.p0.s1 write error, block 238506153 for 16384 bytes
 | Dec 13 12:46:05 big kernel: raid.p0.s1: user buffer block 477011872 for 16384 bytes
 | Dec 13 12:46:05 big kernel: ad4: soft error (ECC corrected) cmd=read fsbn 238506280 of 238506280-238506311
 | Dec 13 12:46:05 big kernel: ad4: hard error cmd=read fsbn 238506280 of 238506280-238506311 status=7f error=7f
 | Dec 13 12:46:05 big kernel: vinum: raid.p0.s1 is crashed by force
 | Dec 13 12:46:05 big kernel: fatal:raid.p0.s1 read error, block 238506217 for 16384 bytes
 | Dec 13 12:46:05 big kernel: raid.p0.s1: user buffer block 477011936 for 16384 bytes
 | Dec 13 12:46:05 big kernel: ad4: soft error (ECC corrected) cmd=write fsbn 238506280 of 238506280-238506311
 | Dec 13 12:46:05 big kernel: ad4: hard error cmd=write fsbn 238506280 of 238506280-238506311 status=7f error=7f
 | Dec 13 12:46:05 big kernel: vinum: raid.p0.s1 is stale by force
 | Dec 13 12:46:05 big kernel: fatal :raid.p0.s1 write error, block 238506217 for 16384 bytes
 | Dec 13 12:46:05 big kernel: raid.p0.s1: user buffer block 477011936 for 16384 bytes
 | Dec 13 12:46:05 big kernel: ad4: soft error (ECC corrected) cmd=read fsbn 1
 | Dec 13 12:46:05 big kernel: ad4: hard error cmd=read fsbn 1 status=7f error=7f
 | Dec 13 12:46:05 big kernel: ad4: soft error (ECC corrected) cmd=read fsbn 0
 | Dec 13 12:46:05 big kernel: ad4: hard error cmd=read fsbn 0 status=7f error=7f
 | Dec 13 12:46:05 big kernel: ad4: soft error (ECC corrected) cmd=read fsbn 64
 | Dec 13 12:46:05 big kernel: ad4: hard error cmd=read fsbn 64 status=7f error=7f
 | Dec 13 12:46:05 big kernel: ad4: soft error (ECC corrected) cmd=read fsbn 63
 | Dec 13 12:46:05 big kernel: ad4: hard error cmd=read fsbn 63 status=7f error=7f
 | Dec 13 12:46:05 big kernel: vinum: raid.p0.s0 is stale by force
 | Dec 13 12:46:05 big kernel: vinum: raid.p0 is faulty
 | Dec 13 12:46:05 big kernel: vinum: raid is down
 | … cut …
 | Dec 16 00:25:28 big kernel: atapci1: <Promise PDC20269 UDMA133 controller> port 0x1060-0x106f,0x1018-0x101b,0x1070-0x1077,0x101c-0x101f,0x1078-0x107f mem 0xe8004000-0xe8007fff irq 3 at device 17.0 on pci0
 | Dec 16 00:25:28 big kernel: ata2: at 0x1078 on atapci1
 | Dec 16 00:25:28 big kernel: ata3: at 0x1070 on atapci1
 | Dec 16 00:25:28 big kernel: fdc0: <Enhanced floppy controller (i82077, NE72065 or clone)> port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0
 | Dec 16 00:25:28 big kernel: fdc0: FIFO enabled, 8 bytes threshold
 | Dec 16 00:25:28 big kernel: fd0: <1440-KB 3.5" drive> on fdc0 drive 0
 | Dec 16 00:25:28 big kernel: sio0 port 0x3f8-0x3ff irq 4 on acpi0
 | Dec 16 00:25:28 big kernel: sio0: type 16550A
 | Dec 16 00:25:28 big kernel: atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
 | Dec 16 00:25:28 big kernel: atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
 | Dec 16 00:25:28 big kernel: kbd0 at atkbd0
 | Dec 16 00:25:28 big kernel: orm0: <Option ROMs> at iomem 0xe4000-0xeffff,0xe0000-0xe3fff,0xd0800-0xd2fff,0xd0000-0xd07ff on isa0
 | Dec 16 00:25:28 big kernel: pmtimer0 on isa0
 | Dec 16 00:25:28 big kernel: ppc0: parallel port not found.
 | Dec 16 00:25:28 big kernel: sc0: <System console> at flags 0x100 on isa0
 | Dec 16 00:25:28 big kernel: sc0: VGA <16 virtual consoles, flags=0x300>
 | Dec 16 00:25:28 big kernel: sio1: configured irq 3 not in bitmap of probed irqs 0
 | Dec 16 00:25:28 big kernel: sio1: port may not be enabled
 | Dec 16 00:25:28 big kernel: vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
 | Dec 16 00:25:28 big kernel: Timecounters tick every 10.000 msec
 | Dec 16 00:25:28 big kernel: ad0: 117800MB <IC35L120AVVA07-0> [239340/16/63] at ata0-master UDMA33
 | Dec 16 00:25:28 big kernel: ad1: 117800MB <IC35L120AVVA07-0> [239340/16/63] at ata0-slave UDMA33
 | Dec 16 00:25:28 big kernel: ad4: 117800MB <IC35L120AVV207-0> [239340/16/63] at ata2-master UDMA100
 | Dec 16 00:25:28 big kernel: ad5: 176700MB <IC35L180AVV207-1> [359010/16/63] at ata2-slave UDMA100
 | Dec 16 00:25:28 big kernel: ad6: 176700MB <IC35L180AVV207-1> [359010/16/63] at ata3-master UDMA100
 | Dec 16 00:25:28 big kernel: ad7: 19574MB <IBM-DPTA-372050> [39770/16/63] at ata3-slave UDMA66
 | Dec 16 00:25:28 big kernel: Mounting root from ufs:/dev/ad0s1a
 | Dec 16 00:25:28 big kernel: vinum: incompatible sector sizes.  raid.p0.s2 has 0 bytes, raid.p0 has 512 bytes.  Ignored.
 | 
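For anyone reading a wrapped and quoted log like the one above, a small script can undo the mail quoting and count the hard errors per device. This is only a minimal sketch: the function names (unwrap, hard_errors) are made up for illustration, and the patterns assume the syslog line shapes seen in this message, not every kernel message format.

```python
import re
from collections import Counter

# Leading mail-quote markers such as " | " or " |  | ".
QUOTE_PREFIX = re.compile(r"^\s*(\|\s?)+")
# A syslog entry starts with a timestamp like "Dec 13 12:46:05 ".
ENTRY_START = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d\d:\d\d:\d\d ")
# Shape of the ATA hard-error lines in this log (an assumption).
HARD_ERROR = re.compile(r"(\w+): hard error cmd=(\w+) fsbn (\d+)")


def unwrap(text):
    """Strip quote markers and join wrapped continuation lines."""
    entries = []
    for raw in text.splitlines():
        line = QUOTE_PREFIX.sub("", raw).strip()
        if not line:
            continue
        if ENTRY_START.match(line) or not entries:
            entries.append(line)          # a new log entry begins
        else:
            entries[-1] += " " + line     # continuation of the previous entry
    return entries


def hard_errors(text):
    """Count hard errors per (device, command) pair."""
    counts = Counter()
    for entry in unwrap(text):
        match = HARD_ERROR.search(entry)
        if match:
            device, cmd, _fsbn = match.groups()
            counts[(device, cmd)] += 1
    return counts
```

Feeding the quoted log text to hard_errors() gives a quick view of which disk is reporting errors and whether they are reads or writes.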
 | 
 | 
 | 
 | 
 | 
 | -------------------
 |  | First let me assure you that I blame neither you nor anyone
 |  | else involved in the vinum project. I am perfectly aware that
 |  | nothing beats a good tape when it comes to data recovery.
 |  | 
 |  | Thanks for taking an interest in my problem. I have tried to
 |  | write down all the information you requested:
 |  | 
 |  | > What problems are you having? 
 |  | One of my drives is flagged down. Vinum reports that drive as
 |  | "referenced". The other two drives in the RAID-5 are up.
 |  | According to "vinum list" my subdisks are:
 |  | s0 State: R 0%
 |  | s1 State: crashed
 |  | s2 State: stale
 |  | 
 |  | > Which version of FreeBSD are you running?
 |  | FreeBSD 5.1
 |  | 
 |  | > Have you made any changes to the system sources, including Vinum?
 |  | Nope
 |  | 
 |  | > Supply the output of the vinum list command. If you can't start
 |  | > Vinum, supply the on-disk configuration, as described below. If you
 |  | > can't start Vinum, then (and only then) send a copy of the
 |  | > configuration file.
 |  | (I have to copy this off the screen by hand:)
 |  | 2 drives:
 |  | D b	State: up	/dev/ad4s1e	A: 36/117796 MB (0%)
 |  | D a	State: up	/dev/ad1s1e	A: 36/117796 MB (0%)
 |  | D c	State: referenced	unknown	A: 0/0 MB
 |  | 
 |  | 1 volumes:
 |  | V raid	State: down	Plexes:	1	Size:	230GB
 |  | 
 |  | 1 plexes:
 |  | P raid.p0	R5 State: faulty	Subdisks:	3	Size:	230 GB
 |  | 
 |  | 3 subdisks:
 |  | S raid.p0.s0	State: R 0%	D: a	Size:	115GB
 |  | 		*** Start raid.p0.s0 with 'start' command ***
 |  | S raid.p0.s1	State: crashed	D: b	Size:	115GB
 |  | S raid.p0.s2	State: stale	D: c	Size:	115GB
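The listing above matters because a RAID-5 plex can normally serve data only while at most one of its subdisks is unavailable. The sketch below illustrates that rule against the three states shown. It is a simplification for illustration only: the function name raid5_readable is hypothetical, and Vinum's real state handling has more states and transitions than "up" versus everything else.

```python
def raid5_readable(subdisk_states):
    """Return (readable, not_up) for a RAID-5 plex.

    Simplified rule: the plex is readable only if at most one
    subdisk is in a state other than "up".
    """
    not_up = [name for name, state in subdisk_states.items() if state != "up"]
    return len(not_up) <= 1, not_up


# States copied from the "vinum list" output in this message.
states = {
    "raid.p0.s0": "R 0%",     # reviving, 0% done
    "raid.p0.s1": "crashed",
    "raid.p0.s2": "stale",
}
readable, not_up = raid5_readable(states)
```

Under this simplified rule none of the three subdisks counts as up, which is consistent with the plex being reported as faulty and the volume as down.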
 |  | 
 |  | > Supply an extract of the Vinum history file
 |  | Can't do that, can't run an editor now. (TMP drive read-only)
 |  | 
 |  | > Supply an extract of the file /var/log/messages
 |  | Can't do that, can't run an editor now. (TMP drive read-only)
 |  | 
 |  | > If you have a crash
 |  | No crash.
 |  | 
 |  | 
 |  | 
 |  | 
 |  | 
 |  | 
 |  | 
 |  | -------------------
 |  |  | [Format recovered--see http://www.lemis.com/email/email-format.html]
 |  |  | 
 |  |  | Wrapped log output.
 |  |  | 
 |  |  | On Monday, 15 December 2003 at 22:16:10 +0100, Joachim Dagerot wrote:
 |  |  | > I have a three disk IDE RAID-5 system using vinum. I do not have the
 |  |  | > root or the system disks there, but I do (did) have the /HOME on the
 |  |  | > RAID.
 |  |  | >
 |  |  | > Now one disk has broken down and I'm trying to replace it. However
 |  |  | > there is some problem when I'm booting: "init: /bin/sh on /etc/rc
 |  |  | > terminated abnormally, going to single user mode" I'm not really sure
 |  |  | > what this single user mode is, but I can't write to the /tmp/ disk for
 |  |  | > some reason. Is this normal behaviour?
 |  |  | 
 |  |  | No.
 |  |  | 
 |  |  | > During boot vinum reports the following:
 |  |  | >
 |  |  | > ###
 |  |  | > vinum: loaded
 |  |  | > vinum: reading configuration from /dev/ad1s1e
 |  |  | > vinum: updating configuration from /dev/ad4s1e
 |  |  | > vinum: incompatible sector sizes. raid.p0.s2 has 0 bytes, raid.p0 has 512 bytes. Ignored.
 |  |  | > ###
 |  |  | >
 |  |  | > The broken disk is /dev/ad5 and it's not completely replaced yet. I do
 |  |  | > have a bad feeling that both ad5 AND ad4 is broken, but I certainly
 |  |  | > hope this isn't the case. I'm using RAID instead of a backup system.
 |  |  | 
 |  |  | Hmm, that's not what it's designed for.
 |  |  | 
 |  |  | > All help here is much appreciated, I have pictures from my sons
 |  |  | > first year on this RAID volume...
 |  |  | 
 |  |  | Backups are always good.  But we can probably recover the data.  First
 |  |  | I need the information I ask for on
 |  |  | http://www.vinumvm.org/vinum/how-to-debug.html.
 |  |  | 
 |  |  | Greg
 |  |  | --
 |  |  | When replying to this message, please copy the original recipients.
 |  |  | If you don't, I may ignore the reply or reply to the original recipients.
 |  |  | For more information, see http://www.lemis.com/questions.html
 |  |  | See complete headers for address and phone numbers.


