Date:      Sat, 19 Dec 2009 18:35:04 -0500
From:      Alexander Sack <pisymbol@gmail.com>
To:        Scott Long <scottl@samsco.org>
Cc:        freebsd-scsi@freebsd.org, freebsd-current@freebsd.org
Subject:   Re: aac(4) handling of probe when no devices are there
Message-ID:  <3c0b01820912191535j4e43f4f8i48ea5239add8fa0e@mail.gmail.com>
In-Reply-To: <3c0b01820912160910i35e12112s4d6412d6cb174f3b@mail.gmail.com>
References:  <3c0b01820912141347y366a7252y5d9711b1141b9b70@mail.gmail.com> <978BBD51-222D-42F0-9D3A-FFACCBCC886D@samsco.org> <3c0b01820912160910i35e12112s4d6412d6cb174f3b@mail.gmail.com>

On Wed, Dec 16, 2009 at 12:10 PM, Alexander Sack <pisymbol@gmail.com> wrote:
> On Tue, Dec 15, 2009 at 4:54 AM, Scott Long <scottl@samsco.org> wrote:
>> On Dec 14, 2009, at 2:47 PM, Alexander Sack wrote:
> Yes surely.  I think what might be happening is that after the
> INQUIRY fails, xpt_release_ccb() is called, which I think also checks
> whether any more CCBs should be sent to the device and sends them.
> Basically, the boot -v output shows that I am getting a
> CAM_SEL_TIMEOUT for each target and just hitting the default
> interrupt storm threshold of 500 on 6.1.

Sorry for the delay.  It's the holidays and there is a bunch of stuff
going on.

Alright, honestly, it looks like everything is FINE, apart from the
fact that camisr() is going to get called a lot given the number of
buses and targets on the system.  I instrumented aac_cam_action() and
it appears to be normal:

mfid0: <MFI Logical Disk> on mfi0
mfid0: 238418MB (488280064 sectors) RAID
aacd0: <RAID 5> on aac0
aacd0: 9533430MB (19524464640 sectors)
aacd1: <RAID 5> on aac1
aacd1: 9533430MB (19524464640 sectors)
GEOM: new disk mfid0
GEOM: new disk aacd0
GEOM: new disk aacd1
GEOM_LABEL: Label for provider mfid0 is label/disk0.
GEOM_LABEL: Label for provider aacd0 is label/disk1.
GEOM_LABEL: Label for provider aacd1 is label/disk2.
(probe5:aacp5:0:0:0): Request completed with CAM_REQ_CMP_ERR
(probe5:aacp5:0:0:0): Retrying Command
(probe2:aacp2:0:0:0): Request completed with CAM_REQ_CMP_ERR
(probe2:aacp2:0:0:0): Retrying Command
(probe5:aacp5:0:0:0): Request completed with CAM_REQ_CMP_ERR
(probe5:aacp5:0:0:0): Retrying Command
(probe2:aacp2:0:0:0): Request completed with CAM_REQ_CMP_ERR
(probe2:aacp2:0:0:0): Retrying Command
(probe5:aacp5:0:0:0): Request completed with CAM_REQ_CMP_ERR
(probe5:aacp5:0:0:0): Retrying Command
(probe2:aacp2:0:0:0): Request completed with CAM_REQ_CMP_ERR
(probe2:aacp2:0:0:0): Retrying Command
(probe0:aacp3:0:8:1): error 22
(probe0:aacp3:0:8:1): Unretryable Error
(probe5:aacp5:0:0:0): Request completed with CAM_REQ_CMP_ERR
(probe5:aacp5:0:0:0): Retrying Command
(probe4:aacp0:0:8:1): error 22
(probe4:aacp0:0:8:1): Unretryable Error
(probe2:aacp2:0:0:0): Request completed with CAM_REQ_CMP_ERR
(probe2:aacp2:0:0:0): Retrying Command
(probe0:aacp3:0:8:2): error 22
(probe0:aacp3:0:8:2): Unretryable Error
(probe5:aacp5:0:0:0): Request completed with CAM_REQ_CMP_ERR
(probe5:aacp5:0:0:0): error 5
(probe5:aacp5:0:0:0): Retries Exausted
(probe4:aacp0:0:8:2): error 22
(probe4:aacp0:0:8:2): Unretryable Error
(probe2:aacp2:0:0:0): Request completed with CAM_REQ_CMP_ERR
(probe2:aacp2:0:0:0): error 5
(probe2:aacp2:0:0:0): Retries Exausted
(probe0:aacp3:0:8:3): error 22
(probe0:aacp3:0:8:3): Unretryable Error
.
.
.
(probe0:aacp0:0:19:3): Unretryable Error
(probe2:aacp3:0:19:2): error 22
(probe2:aacp3:0:19:2): Unretryable Error
(probe1:aacp0:0:19:4): error 22
(probe1:aacp0:0:19:4): Unretryable Error
(probe2:aacp3:0:19:3): error 22
(probe2:aacp3:0:19:3): Unretryable Error
(probe0:aacp0:0:19:5): error 22
(probe0:aacp0:0:19:5): Unretryable Error
(probe2:aacp3:0:19:4): error 22
(probe2:aacp3:0:19:4): Unretryable Error
(probe1:aacp0:0:19:6): error 22
(probe1:aacp0:0:19:6): Unretryable Error
(probe2:aacp3:0:19:5): error 22
(probe2:aacp3:0:19:5): Unretryable Error
(probe0:aacp0:0:19:7): error 22
(probe0:aacp0:0:19:7): Unretryable Error
(probe2:aacp3:0:19:6): error 22
(probe2:aacp3:0:19:6): Unretryable Error
(probe2:aacp3:0:19:7): error 22
(probe2:aacp3:0:19:7): Unretryable Error
Interrupt storm detected on "swi2:"; throttling interrupt source

If you look at aac_cam_action()/aac_cam_complete() via KTR:

5:0:1 CAM_DEV_NOT_THERE
   144 5:0:1 op: 12

   143 4:17:0 0xa
   142 4:17:0 op: 12

   141 3:8:4 CAM_DEV_NOT_THERE
   140 3:8:4 op: 12

   139 0:8:3 CAM_DEV_NOT_THERE
   138 0:8:3 op: 12

   137 1:13:0 0xa
   136 1:13:0 op: 12

   135 2:0:0 op: 0

   134 4:16:0 0xa
   133 4:16:0 op: 12

   132 3:8:3 CAM_DEV_NOT_THERE
   131 3:8:3 op: 12

   130 5:0:0 op: 0

   129 2:0:0 0x84
   128 0:8:2 CAM_DEV_NOT_THERE
   127 1:12:0 0xa
   126 0:8:2 op: 12

   125 1:12:0 op: 12

   124 5:0:0 0x84
   123 4:15:0 0xa
   122 3:8:2 CAM_DEV_NOT_THERE
   121 2:0:0 op: 12

   120 4:15:0 op: 12

   119 3:8:2 op: 12

   118 5:0:0 op: 12

   117 2:0:0 0x84
   116 0:8:1 CAM_DEV_NOT_THERE
   115 1:11:0 0xa
   114 5:0:0 0x84
   113 4:14:0 0xa
   112 3:8:1 CAM_DEV_NOT_THERE
   111 0:8:1 op: 12

   110 1:11:0 op: 12

   109 4:14:0 op: 12

   108 3:8:1 op: 12

   107 2:0:0 op: 12

   106 2:0:0 0x84
   105 1:10:0 0xa
   104 5:0:0 op: 12

   103 4:13:0 0xa
   102 4:13:0 op: 12

   101 5:0:0 0x84
   100 4:12:0 0xa
    99 0:8:0 op: 0

    98 1:10:0 op: 12

    97 4:12:0 op: 12

    96 2:0:0 op: 12

    95 3:8:0 op: 0

    94 2:0:0 0x84
    93 1:9:0 0xa
    92 5:0:0 op: 12

    91 4:11:0 0xa
    90 4:11:0 op: 12

    89 5:0:0 0x84
    88 4:10:0 0xa
    87 0:8:0 op: 12

    86 1:9:0 op: 12

    85 4:10:0 op: 12

    84 3:8:0 op: 12

    83 2:0:0 op: 12

    82 0:8:0 op: 1a

    81 5:0:0 op: 12

    80 2:0:0 0x84
    79 1:8:0 0xa
    78 4:9:0 0xa
    77 4:9:0 op: 12

    76 5:0:0 0x84
    75 4:8:0 0xa
    74 3:8:0 op: 1a

    73 1:8:0 op: 12

    72 0:8:0 op: 12

    71 4:8:0 op: 12

    70 3:8:0 op: 12

    69 1:7:0 0xa
    68 1:7:0 op: 12

    67 0:7:0 0xa
    66 0:7:0 op: 12

    65 4:7:0 0xa
    64 4:7:0 op: 12

    63 3:7:0 0xa
    62 3:7:0 op: 12

    61 1:6:0 0xa
    60 1:6:0 op: 12

    59 0:6:0 0xa
    58 0:6:0 op: 12

    57 4:6:0 0xa
    56 4:6:0 op: 12

    55 3:6:0 0xa
    54 3:6:0 op: 12

    53 1:5:0 0xa
    52 1:5:0 op: 12

    51 0:5:0 0xa
    50 0:5:0 op: 12

    49 4:5:0 0xa
    48 4:5:0 op: 12

    47 3:5:0 0xa
    46 3:5:0 op: 12

    45 1:4:0 0xa
    44 1:4:0 op: 12

    43 0:4:0 0xa
    42 0:4:0 op: 12

    41 4:4:0 0xa
    40 2:0:0 op: 12

    39 5:0:0 op: 12

    38 4:4:0 op: 12

    37 3:4:0 0xa
    36 3:4:0 op: 12

    35 1:3:0 0xa
    34 1:3:0 op: 12

    33 0:3:0 0xa
    32 5:0:0 op: 12

    31 2:0:0 op: 12

    30 0:3:0 op: 12

    29 4:3:0 0xa
    28 4:3:0 op: 12

    27 3:3:0 0xa
    26 3:3:0 op: 12

    25 1:2:0 0xa
    24 1:2:0 op: 12

    23 0:2:0 0xa
    22 0:2:0 op: 12

    21 4:2:0 0xa
    20 4:2:0 op: 12

    19 3:2:0 0xa
    18 3:2:0 op: 12

    17 1:1:0 0xa
    16 0:1:0 0xa
    15 4:1:0 0xa
    14 3:1:0 0xa
    13 1:1:0 op: 12

    12 4:1:0 op: 12

    11 3:1:0 op: 12

    10 0:1:0 op: 12

     9 1:0:0 0xa
     8 4:0:0 0xa
     7 3:0:0 0xa
     6 0:0:0 0xa
     5 5:0:0 op: 12

     4 4:0:0 op: 12

     3 3:0:0 op: 12

     2 2:0:0 op: 12

     1 1:0:0 op: 12

     0 0:0:0 op: 12
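
For anyone decoding the trace above, here is a minimal sketch that tallies it. It assumes each entry is either "index bus:target:lun op: <hex opcode>" (a request) or "index bus:target:lun <status>" (a completion). The numeric status values and the 0x80 flag are taken from memory of sys/cam/cam.h and should be checked against your tree; the opcode names are the standard SCSI ones. The function names are made up for illustration.

```python
import re
from collections import Counter

# Assumed values from FreeBSD's sys/cam/cam.h; verify against your source tree.
CAM_STATUS_MASK = 0x3F
CAM_AUTOSNS_VALID = 0x80
CAM_STATUS_NAMES = {
    0x04: "CAM_REQ_CMP_ERR",
    0x08: "CAM_DEV_NOT_THERE",
    0x0A: "CAM_SEL_TIMEOUT",
}

# Standard SCSI opcodes that appear in the trace (printed in hex).
SCSI_OPCODES = {
    0x00: "TEST UNIT READY",
    0x12: "INQUIRY",
    0x1A: "MODE SENSE(6)",
}

# Optional KTR index, then bus:target:lun, then either "op: NN" or a status.
ENTRY_RE = re.compile(r"^\s*(?:\d+\s+)?(\d+):(\d+):(\d+)\s+(op:\s*)?(\S+)\s*$")


def decode_status(token):
    """Turn a raw status such as '0x84' into a symbolic name."""
    if not token.lower().startswith("0x"):
        return token  # already symbolic, e.g. CAM_DEV_NOT_THERE
    raw = int(token, 16)
    base = raw & CAM_STATUS_MASK
    name = CAM_STATUS_NAMES.get(base, hex(base))
    if raw & CAM_AUTOSNS_VALID:
        name += "|CAM_AUTOSNS_VALID"
    return name


def tally_trace(text):
    """Count requests by opcode and completions by status."""
    requests, completions = Counter(), Counter()
    for line in text.splitlines():
        m = ENTRY_RE.match(line)
        if m is None:
            continue  # blank or unrecognised line
        _bus, _target, _lun, is_op, value = m.groups()
        if is_op:
            opcode = int(value, 16)
            requests[SCSI_OPCODES.get(opcode, hex(opcode))] += 1
        else:
            completions[decode_status(value)] += 1
    return requests, completions
```

Under these assumptions, the 0xa completions decode to CAM_SEL_TIMEOUT and the 0x84 completions on 2:0:0 and 5:0:0 decode to CAM_REQ_CMP_ERR with the autosense flag set, which lines up with the CAM_REQ_CMP_ERR messages for aacp2 and aacp5 in the boot output.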

At a limit of 500 on 6.1, it was easy to hit this threshold.  I
assumed before that the CAM_REQ_CMP_ERRs were CAM_SEL_TIMEOUTs and
confused the issue.  Sorry for that, it's been a long week dealing
with several issues.

With enough cards, you will even hit the 1000 limit in CURRENT (though
I admit it's not really that serious).  Right?
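
A back-of-the-envelope sketch of the scaling argument: it multiplies out an upper bound on probe completions for a full scan and compares that to a threshold. The inputs (6 buses, 20 targets, 8 LUNs) are illustrative guesses read off the log, not measured counts, and treating the threshold as a plain count of back-to-back completions is a simplification of how the kernel's storm detection actually works. Both function names are hypothetical.

```python
def estimate_probe_completions(buses, targets_per_bus, luns_per_target,
                               attempts_per_lun=1):
    """Upper bound on completions if every bus:target:lun is probed."""
    return buses * targets_per_bus * luns_per_target * attempts_per_lun


def reaches_threshold(completions, threshold):
    """Simplified check: does the burst reach the storm threshold?"""
    return completions >= threshold


# Illustrative inputs only: aacp0..aacp5, targets 0..19, LUNs 0..7.
upper_bound = estimate_probe_completions(6, 20, 8)
print(upper_bound, reaches_threshold(upper_bound, 500),
      reaches_threshold(upper_bound, 1000))
```

With these guessed inputs the bound is 960, which is past 500 but short of 1000; one more bus of the same shape would push it over 1000. Real scans skip LUNs on targets that time out at LUN 0, so the true number is lower than this bound.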

-aps


