Date: Mon, 27 Sep 2004 09:20:24 -0700
From: Dennis G Allard <allard@oceanpark.com>
To: freebsd-scsi@freebsd.org
Subject: AFACLI failover success (was: non-responding PERC with aaccli ...)
Message-ID: <41583DC8.7000301@oceanpark.com>

Thanks to everyone who replied.

The original procedure I outlined for using AFACLI to activate a
HOTSPARE worked, except that after performing
`container set global_failover (0,3,0)` I had to run
`controller rescan` before the failover kicked in. This was in spite
of the fact that `controller show automatic_failover` reported ENABLED.

Since I do not subscribe to this mailing list (or any mailing list,
preferring IETF standard newsgroup culture), I am sending this summary
as a stand-alone post. (It would be good if someone were to post a
follow-up to the original thread and include the following text to
complete that thread, thanks)...

DETAILS:

(1 of 4) ORIGINAL STATE (pre-failover):

AFA0> enclosure show slot
> Executing: enclosure show slot
>
> Enclosure
> ID (B:ID:L) Slot scsiId Insert  Status
> ----------- ---- ------ ------- ------------------------------------------
>  0  0:06:0   0   0:00:0     1   OK FAILED CRITICAL ACTIVATE
>  0  0:06:0   1   0:01:0     1   OK FAILED CRITICAL ACTIVATE
>  0  0:06:0   2   0:02:0     1   ERROR FAULTY FAILED CRITICAL ACTIVATE
>  0  0:06:0   3   0:03:0     1   OK UNCONFIG HOTSPARE ACTIVATE
>
> AFA0> disk list
> Executing: disk list
>
> B:ID:L  Device Type     Blocks    Bytes/Block Usage            Shared
> ------  --------------  --------- ----------- ---------------- ------
> 0:00:0   Disk            71132959  512         Initialized      NO
> 0:01:0   Disk            71132960  512         Initialized      NO
> 0:02:0   Disk            0         0           Offline          NO
> 0:03:0   Disk            71132960  512         Initialized      NO
>
> AFA0> container list
> Executing: container list
> Num          Total  Oth Chunk          Scsi   Partition
> Label Type   Size   Ctr Size   Usage   B:ID:L Offset:Size
> ----- ------ ------ --- ------ ------- ------ -------------
>  0    RAID-5 67.7GB       32KB Open    0:00:0 64.0KB:33.8GB
>  /dev/sda SEPT                         0:01:0 64.0KB:33.8GB
>                                        0:02:0 64.0KB!33.8GB
>
>
> AFA0>

(2 of 4) ACTIONS TAKEN:

    container set global_failover (0,3,0)
    controller rescan

Excerpts from actual session:

Note: [[my comments are in double square brackets like these]]

> AFA0> task list
> Executing: task list
>
> Controller Tasks
>
> TaskId Function Done%   Container State Specific1 Specific2
> ------ -------- ------- --------- ----- --------- ---------
>
> No tasks currently running on controller
>
> AFA0> container set global_failover (0,3,0)
> Executing: container set global_failover (BUS=0,ID=3,LUN=0)
>
> AFA0> task list
> Executing: task list
>
> Controller Tasks
>
> TaskId Function Done%   Container State Specific1 Specific2
> ------ -------- ------- --------- ----- --------- ---------
>
> No tasks currently running on controller   [[Hmmm - why not?]]
>
> AFA0>
> AFA0>
> AFA0> disk show space
> Executing: disk show space
>
> Scsi B:ID:L Usage      Size
> ----------- ---------- -------------
>   0:00:0    Container  64.0KB:33.8GB
>   0:00:0    Free       33.8GB:59.0KB
>   0:01:0    Container  64.0KB:33.8GB
>   0:01:0    Free       33.8GB:59.0KB
>   0:02:0    Dead       64.0KB:33.8GB
>   0:02:0    Free       33.8GB:59.0KB
>   0:03:0    Free       64.0KB:33.8GB
>
> AFA0>
> AFA0>
> AFA0> container list
> Executing: container list
> Num          Total  Oth Chunk          Scsi   Partition
> Label Type   Size   Ctr Size   Usage   B:ID:L Offset:Size
> ----- ------ ------ --- ------ ------- ------ -------------
>  0    RAID-5 67.7GB       32KB Open    0:00:0 64.0KB:33.8GB
>  /dev/sda SEPT                         0:01:0 64.0KB:33.8GB
>                                        0:02:0 64.0KB!33.8GB
>
>
> AFA0> enclosure show slot
> Executing: enclosure show slot
>
> Enclosure
> ID (B:ID:L) Slot scsiId Insert  Status
> ----------- ---- ------ ------- ------------------------------------------
>  0  0:06:0   0   0:00:0     1   OK FAILED CRITICAL ACTIVATE
>  0  0:06:0   1   0:01:0     1   OK FAILED CRITICAL ACTIVATE
>  0  0:06:0   2   0:02:0     1   ERROR FAULTY FAILED CRITICAL ACTIVATE
>  0  0:06:0   3   0:03:0     1   OK UNCONFIG HOTSPARE ACTIVATE
>
> AFA0>
> AFA0>
> AFA0>
> AFA0> container show failover
> Executing: container show failover
>
> Container Scsi B:ID:L
> --------- ----------------------------------
> GLOBAL    0:03:0
>  0        --- No Devices Assigned ---
>
> AFA0>
> AFA0>
> AFA0>
> AFA0> controller show automatic_failover
> Executing: controller show automatic_failover
> Automatic failover ENABLED   [[Well????]]
>
> AFA0>
> AFA0>
> AFA0>   [[I guessed to try...]]
> AFA0> controller rescan
> Executing: controller rescan
>
> AFA0>
> AFA0> task list
> Executing: task list
>
> Controller Tasks
>
> TaskId Function Done%   Container State Specific1 Specific2
> ------ -------- ------- --------- ----- --------- ---------
>  101   Rebuild    0.1%     00     RUN   00000000  00000000
>
> AFA0> task list
> Executing: task list
>
> Controller Tasks
>
> TaskId Function Done%   Container State Specific1 Specific2
> ------ -------- ------- --------- ----- --------- ---------
>  101   Rebuild    0.1%     00     RUN   00000000  00000000   [[Much Better!!!]]
>
> AFA0> task list
> Executing: task list
>
> Controller Tasks
>
> TaskId Function Done%   Container State Specific1 Specific2
> ------ -------- ------- --------- ----- --------- ---------
>  101   Rebuild    0.2%     00     RUN   00000000  00000000
>
> AFA0> task list
> Executing: task list
>
> Controller Tasks
>
> TaskId Function Done%   Container State Specific1 Specific2
> ------ -------- ------- --------- ----- --------- ---------
>  101   Rebuild    0.6%     00     RUN   00000000  00000000
>
> AFA0> task list
> Executing: task list
>
> Controller Tasks
>
> TaskId Function Done%   Container State Specific1 Specific2
> ------ -------- ------- --------- ----- --------- ---------
>  101   Rebuild    0.6%     00     RUN   00000000  00000000
>
> AFA0> task list
> Executing: task list
>
> Controller Tasks
>
> TaskId Function Done%   Container State Specific1 Specific2
> ------ -------- ------- --------- ----- --------- ---------
>  101   Rebuild    1.3%     00     RUN   00000000  00000000
>
> AFA0>
> AFA0>
> AFA0>
> AFA0> enclosure show slot
> Executing: enclosure show slot
>
> Enclosure
> ID (B:ID:L) Slot scsiId Insert  Status   [[note the REBUILD]]
> ----------- ---- ------ ------- ------------------------------------------
>  0  0:06:0   0   0:00:0     1   OK REBUILD FAILED CRITICAL ACTIVATE
>  0  0:06:0   1   0:01:0     1   OK REBUILD FAILED CRITICAL ACTIVATE
>  0  0:06:0   2   0:02:0     1   OK FAILED CRITICAL UNCONFIG ACTIVATE
>  0  0:06:0   3   0:03:0     1   OK REBUILD FAILED CRITICAL HOTSPARE
> ACTIVATE
>
> AFA0>
> AFA0>
> AFA0>
> AFA0> container list
> Executing: container list
> Num          Total  Oth Chunk          Scsi   Partition
> Label Type   Size   Ctr Size   Usage   B:ID:L Offset:Size
> ----- ------ ------ --- ------ ------- ------ -------------
>  0    RAID-5 67.7GB       32KB Open    0:00:0 64.0KB:33.8GB
>  /dev/sda SEPT                         0:01:0 64.0KB:33.8GB
>                                        0:03:0 64.0KB:33.8GB
>
>
> AFA0>
> AFA0>
> AFA0> disk list
> Executing: disk list
>
> B:ID:L  Device Type     Blocks    Bytes/Block Usage            Shared
> ------  --------------  --------- ----------- ---------------- ------
> 0:00:0   Disk            71132959  512         Initialized      NO
> 0:01:0   Disk            71132960  512         Initialized      NO
> 0:02:0   Disk            0         0           Offline          NO
> 0:03:0   Disk            71132960  512         Initialized      NO
>
> AFA0>
> AFA0>
> AFA0> disk show space
> Executing: disk show space
>
> Scsi B:ID:L Usage      Size
> ----------- ---------- -------------
>   0:00:0    Container  64.0KB:33.8GB
>   0:00:0    Free       33.8GB:59.0KB
>   0:01:0    Container  64.0KB:33.8GB
>   0:01:0    Free       33.8GB:59.0KB
>   0:03:0               64.0KB:33.8GB   [[0:02:0 is gone -- good]]
>   0:03:0    Free       33.8GB:59.0KB
>
> AFA0>
> AFA0>
> AFA0>   [[ultimately, the REBUILD took ~2.5 hours]]

(3 of 4) FINAL STATE (post-failover):

> afacli
> ---------------------------------------------------------------------------------------------------------------------------------------------
> DELL PowerEdge Expandable RAID Controller 2 Command Line Interface
> Copyright 1998-2000 Adaptec, Inc. All rights reserved
> ---------------------------------------------------------------------------------------------------------------------------------------------
>
> FASTCMD> open afa0
> Executing: open "afa0"
>
> AFA0> enclosure show slot
> Executing: enclosure show slot
>
> Enclosure
> ID (B:ID:L) Slot scsiId Insert  Status
> ----------- ---- ------ ------- ------------------------------------------
>  0  0:06:0   0   0:00:0     1   OK ACTIVATE
>  0  0:06:0   1   0:01:0     1   OK ACTIVATE
>  0  0:06:0   2   0:02:0     1   OK FAILED CRITICAL UNCONFIG ACTIVATE
>  0  0:06:0   3   0:03:0     1   OK HOTSPARE ACTIVATE   [[why still see 'HOTSPARE'?]]
>
> AFA0>
> AFA0>
> AFA0> disk list
> Executing: disk list
>
> B:ID:L  Device Type     Blocks    Bytes/Block Usage            Shared
> ------  --------------  --------- ----------- ---------------- ------
> 0:00:0   Disk            71132959  512         Initialized      NO
> 0:01:0   Disk            71132960  512         Initialized      NO
> 0:02:0   Disk            0         0           Offline          NO
> 0:03:0   Disk            71132960  512         Initialized      NO
>
> AFA0>
> AFA0>
> AFA0> container list
> Executing: container list
> Num          Total  Oth Chunk          Scsi   Partition
> Label Type   Size   Ctr Size   Usage   B:ID:L Offset:Size
> ----- ------ ------ --- ------ ------- ------ -------------
>  0    RAID-5 67.7GB       32KB Open    0:00:0 64.0KB:33.8GB
>  /dev/sda SEPT                         0:01:0 64.0KB:33.8GB
>                                        0:03:0 64.0KB:33.8GB
>
>
> AFA0>
> AFA0>
> AFA0> disk show space
> Executing: disk show space
>
> Scsi B:ID:L Usage      Size
> ----------- ---------- -------------
>   0:00:0    Container  64.0KB:33.8GB
>   0:00:0    Free       33.8GB:59.0KB
>   0:01:0    Container  64.0KB:33.8GB
>   0:01:0    Free       33.8GB:59.0KB
>   0:03:0    Container  64.0KB:33.8GB
>   0:03:0    Free       33.8GB:59.0KB
>
> AFA0>
> AFA0>
> AFA0> container show failover
> Executing: container show failover
>
> Container Scsi B:ID:L
> --------- ----------------------------------
> GLOBAL    0:03:0
>  0        --- No Devices Assigned ---
>
> AFA0>
> AFA0>
> AFA0> controller show automatic_failover
> Executing: controller show automatic_failover
> Automatic failover ENABLED
>
> AFA0>
> AFA0>
> AFA0>

(4 of 4) REMAINING QUESTIONS

A. Given that `controller show automatic_failover` reported ENABLED
   both before and after, why was it necessary for me to issue a
   `controller rescan` in order to make the rebuild task kick in?

B. Why does `enclosure show slot` still list drive (0,3,0) as having
   state label 'HOTSPARE'?

C. How do I get rid of the (0,2,0) drive? What I am going to try is:

       enclosure prepare slot 0 2
       <physically remove the drive>

-end of post-

Cheers,
Dennis

-- 
Dennis G. Allard                telephone: 1.310.399.4740
Ocean Park Software             http://oceanpark.com
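
Editorial sketch (not part of the original post): the session above
tracks the rebuild by re-running `task list` by hand. Assuming the
output of that command has been captured as text by some means, the
Python below shows one possible way to pull the task rows out of it.
The names `Task` and `parse_task_list` are hypothetical, the column
layout is inferred only from the transcript in this post, and the code
does not drive AFACLI itself.

```python
import re
from typing import List, NamedTuple


class Task(NamedTuple):
    """One row of `task list` output (hypothetical container type)."""
    task_id: int
    function: str
    done_percent: float
    container: str
    state: str


# Column order assumed from the transcript:
# TaskId Function Done% Container State Specific1 Specific2
_TASK_ROW = re.compile(
    r"^(\d+)\s+(\S+)\s+(\d+(?:\.\d+)?)%\s+(\S+)\s+(\S+)\s+\S+\s+\S+$"
)


def parse_task_list(text: str) -> List[Task]:
    """Return the task rows found in captured `task list` output.

    Leading "> " mail quoting and [[...]] comments are stripped first.
    Header rows, rule lines, and the "No tasks currently running"
    message do not match the row pattern and are skipped.
    """
    tasks = []
    for line in text.splitlines():
        line = re.sub(r"\[\[.*?\]\]", "", line)  # drop [[comments]]
        line = line.lstrip("> ").rstrip()        # drop quoting and padding
        match = _TASK_ROW.match(line)
        if match:
            task_id, function, done, container, state = match.groups()
            tasks.append(
                Task(int(task_id), function, float(done), container, state)
            )
    return tasks


SAMPLE = """\
> TaskId Function Done%   Container State Specific1 Specific2
> ------ -------- ------- --------- ----- --------- ---------
>  101   Rebuild    1.3%     00     RUN   00000000  00000000
"""

if __name__ == "__main__":
    for task in parse_task_list(SAMPLE):
        print(f"task {task.task_id}: {task.function} {task.done_percent}% ({task.state})")
```

An empty result corresponds to the "No tasks currently running on
controller" case seen before `controller rescan` was issued. Function
names containing spaces, if the tool ever prints any, would not match
this pattern.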