Date: Tue, 15 Jan 2008 15:12:34 -0500
From: "Robin Blanchard" <robin.blanchard@itos.uga.edu>
To: "Bob Hetzel" <beh@case.edu>
Cc: freebsd-scsi@freebsd.org
Subject: RE: LTO-3 / scsi woes
Message-ID: <1C28E42139C61D4BB4F418A19EAA2E3507B169@MAIL.itos.uga.edu>
In-Reply-To: <478D0FB7.8060404@case.edu>
References: <478D04A4.9000103@case.edu> <1C28E42139C61D4BB4F418A19EAA2E3507B168@MAIL.itos.uga.edu> <478D0FB7.8060404@case.edu>
By "bypass" I mean I plugged the LTO drive directly to the controller
rather than in-line through the library. I am only using sym0 (the LVD
channel). These are the devices in question:

<QUALSTAR RLS-8204-20 006D> at scbus0 target 0 lun 0 (ch0,pass0)
<IBM ULTRIUM-TD3 73P5> at scbus0 target 1 lun 0 (sa0,pass1)

Thanks for your help....

> -----Original Message-----
> From: Bob Hetzel [mailto:beh@case.edu]
> Sent: Tuesday, January 15, 2008 2:56 PM
> To: Robin Blanchard
> Subject: Re: [Bacula-users] LTO-3 / scsi woes
>
> Robin,
>
> The Freebsd log says the controller is operating at Fast-80... that
> doesn't sound good. I also noted that it's a dual channel with the
> other channel operating SE (single ended which is "high voltage" scsi
> as opposed to LVD or low voltage.
>
> Also, when you say "bypassed" can you clarify?
>
> Bob
>
> Robin Blanchard wrote:
> > Bob,
> >
> > Thanks for the length reply and suggestions. I've swapped terminators
> > (I've got a U160 and a U320, both with indicator lights -- both
> > indicate 'green'), as well as cables, cards, and even the drive itself
> > (we have two of the same library, each with a single drive in each).
> > The "closest" I've come thus far is to bypass the library/exchanger,
> > and connect only the LTO-3 drive; but having to set the speed to
> > 20 MB/s. Using an adaptec card, I get absolutely nowhere at all, hence
> > the current use of the LSI card (which, yes, is LVD/SE). I just
> > installed FBSD (as opposed to RHEL5) to see if I could glean anything
> > else useful. The attached dmesg is FBSD 6.2-STABLE with the LSI (sym)
> > card, and the library/drive both set in the BIOS to defaults/auto.
> >
> >
> > With an adaptec 2940U2W, I get nothing but garbage:
> >
> > <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
> > (probe0:ahc0:0:0:3): SCB 0xe - timed out
> > sg[0] - Addr 0x37d084 : Length 36
> > (probe0:ahc0:0:0:3): Other SCB Timeout
> > ahc0: Timedout SCBs already complete. Interrupts may not be functioning.
> > ahc0: Recovery Initiated
> > >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
> > ahc0: Dumping Card State in Data-in phase, at SEQADDR 0xa0
> > Card was paused
> > ACCUM = 0x40, SINDEX = 0xa, DINDEX = 0xe4, ARG_2 = 0x0
> > HCNT = 0x0 SCBPTR = 0x0
> > SCSISIGI[0x54]:(BSYI|ATNI|IOI) ERROR[0x0] SCSIBUSL[0x0]
> > LASTPHASE[0x40]:(IOI) SCSISEQ[0x12]:(ENAUTOATNP|ENRSELI)
> > SBLKCTL[0xa]:(SELWIDE|SELBUSB) SCSIRATE[0x93]:(SINGLE_EDGE|WIDEXFER)
> > SEQCTL[0x10]:(FASTMODE) SEQ_FLAGS[0x20]:(DPHASE)
> > SSTAT0[0x5]:(DMADONE|SDONE)
> > SSTAT1[0x2]:(PHASECHG) SSTAT2[0x0] SSTAT3[0x0] SIMODE0[0x8]:(ENSWRAP)
> > SIMODE1[0xac]:(ENSCSIPERR|ENBUSFREE|ENSCSIRST|ENSELTIMO)
> > SXFRCTL0[0x88]:(SPIOEN|DFON) DFCNTRL[0x0]
> > DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL)
> > STACK: 0x0 0x167 0x17d 0x84
> > SCB count = 20
> > Kernel NEXTQSCB = 7
> > Card NEXTQSCB = 14
> > QINFIFO entries: 14
> > Waiting Queue entries:
> > Disconnected Queue entries:
> > QOUTFIFO entries:
> > Sequencer Free SCB List: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
> > 20 21 22 23 24 25 26 27 28 29 30 31
> > Sequencer SCB Info:
> > 0 SCB_CONTROL[0x40]:(DISCENB) SCB_SCSIID[0x17] SCB_LUN[0x1]
> > SCB_TAG[0x1]
> > 1 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 2 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 3 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 4 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 5 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 6 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 7 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 8 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 9 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 10 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 11 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 12 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 13 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 14 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 15 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 16 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 17 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 18 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 19 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 20 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 21 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 22 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 23 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 24 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 25 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 26 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 27 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 28 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 29 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 30 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 31 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > Pending list:
> > 14 SCB_CONTROL[0x40]:(DISCENB) SCB_SCSIID[0x7] SCB_LUN[0x3]
> > 1 SCB_CONTROL[0x40]:(DISCENB) SCB_SCSIID[0x17] SCB_LUN[0x1]
> > Kernel Free SCB list: 15 16 17 18 19 0 2 3 4 5 6 8 9 13 12 11 10
> > Untagged Q(0): 14
> > Untagged Q(1): 1
> >
> >
> >
> >> -----Original Message-----
> >> From: Bob Hetzel [mailto:beh@case.edu]
> >> Sent: Tuesday, January 15, 2008 2:08 PM
> >> To: Robin Blanchard
> >> Cc: Allan Black
> >> Subject: Re: [Bacula-users] LTO-3 / scsi woes
> >>
> >> Robin,
> >>
> >> Just a few additions to what Allan said... termination is not simply
> >> one size fits all. On most controllers the internal and external
> >> connector are considered part of the same bus. You need to always
> >> terminate both ends of the bus. If you don't use a connector the
> >> controller generally terminates that "stub" for you. If not, the
> >> controller is considered to be in the middle of the bus, if memory
> >> serves, and therefore you'd need to terminate both ends with an actual
> >> physical terminator or using the terminator built into the device at
> >> the end if it has one (many devices no longer come with that option
> >> and I don't think any automatically terminate).
> >>
> >> In any case, you need to read the manual for whatever controller
> >> you're using and do what it says to do. You also need to look into
> >> the instructions about SCSI target ID's too. If you're using cheap
> >> internal connectors that aren't keyed there's also a chance you've got
> >> one backward. You also need to make sure your terminator is good for
> >> LVD (Ultra 160) devices. I suspect you're only using external devices.
> >> Most external terminators have a light and perhaps it'll have info on
> >> it. Some change the light color depending on what it detects on the
> >> bus. If the light isn't lit it may not be working properly or you may
> >> have a goofy termpower setting somewhere.
> >>
> >> I can't seem to google the LSI controller, are you sure it's LVD?
> >>
> >> Also there are cable length restrictions of around 25 feet total.
> >> Additionally, many times the more devices you use the shorter the
> >> cable you can use (each device has wiring inside it and with every
> >> change in cabling you add noise).
> >>
> >> If it's an IBM drive you can download IBM diags and extensively test
> >> it as well as communications to it. Likewise for HP and probably
> >> other drives.
> >>
> >> You also may need to see if the firmware is current on the controller
> >> as sometimes peripherals are shipped with bugs that get fixed in
> >> software later (blame the marketing guys for pushing production
> >> schedules up). It's also likely there's a firmware update available
> >> for the drive but be careful about doing that when you have
> >> communications problems as you could render the drive useless if
> >> corrupted firmware gets loaded into it.
> >>
> >> Bob
> >>
> >>> Message: 2
> >>> Date: Tue, 15 Jan 2008 00:10:22 +0000
> >>> From: Allan Black <Allan.Black@btconnect.com>
> >>> Subject: Re: [Bacula-users] LTO-3 / scsi woes
> >>> To: Robin Blanchard <robin.blanchard@itos.uga.edu>
> >>> Cc: uganet@listserv.uga.edu, bacula
> >>> <bacula-users@lists.sourceforge.net>
> >>> Message-ID: <478BF9EE.2020803@btconnect.com>
> >>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> >>>
> >>> Robin Blanchard wrote:
> >>>>> I've been around the block with LSI and with adaptec: tried an LSI
> >>>>> SYMC101, an adaptec 2940U2W and a 39160. I've removed the
> >>>>> library/exchange from the equation, using only the LTO-3 drive
> >>>>> (and have actually swapped that drive out for another as well),
> >>>>> swapped SCSI cables, and terminators, and still am getting scsi
> >>>>> errors. Anyone got any tips/ideas here ?
> >>> To be honest, this looks like a termination problem (or, at least, a
> >>> bus problem of some sort). However, since you appear to have swapped
> >>> out every piece of hardware, the only thing left seems to be the
> >>> configuration ....
> >>> Both the external and the internal segments of the SCSI bus need to
> >>> be properly terminated. If not, errors will occur on the bus and the
> >>> HBA will step back the clock speed in an attempt to make the bus work
> >>> reliably. So far so good - you are seeing parity errors and the HBA
> >>> is stepping the speed back from 160 MB/s to 40 MB/s.
> >>>
> >>> It looks as if there is either insufficient termination, or too much
> >>> termination, on the bus - too much termination can be as bad as no
> >>> termination.
> >>>
> >>> There should be configuration options in the HBA's BIOS to set
> >>> termination of the internal and external segments at the HBA. Usually
> >>> termination of the external and internal segments are configured
> >>> independently. They can usually be set to auto (which is usually the
> >>> default and the manufacturer's recommended setting), or explicitly on
> >>> or off. Check you do not have termination switched on at the HBA
> >>> *and* a terminator at the end of the cable.
> >>>
> >>> Is the LTO3 drive internal or external, BTW? The internal segment of
> >>> the bus needs to follow the same rules as the external segment, but
> >>> is usually more difficult to get right - termination can occur at the
> >>> HBA, the drive(s) or the end of the cable. Normally for an Ultra 160
> >>> SCSI bus, the internal segment should have termination switched off
> >>> (or set to auto) at the HBA, the devices on the bus should be
> >>> unterminated and the ribbon cable should be terminated.
> >>>
> >>> Similarly (but less likely), check there is not an option in the LTO3
> >>> drive to terminate the bus. If the drive is terminating the bus,
> >>> *and* there is a terminator screwed onto the back, this will add up
> >>> to too much termination. Like I said, this is unlikely. If anything,
> >>> if the drive can terminate the bus, it will most probably be
> >>> automatic.
> >>>
> >>> If you have a mixture of wide and narrow devices on the bus
> >>> (particularly the internal segment) termination can get a bit
> >>> tricky :-)
> >>>
> >>> I have never used any of the 3 HBAs you mention at the start of the
> >>> email, but I have used a 2940N and a 29160N. A couple of points - the
> >>> 2940N did not (as far as I can remember) have an "auto" setting for
> >>> bus termination at the HBA; it could only be set to "on" or "off".
> >>> The 2940UW may be the same.
> >>> Also, the 29160 BIOS tests the bus segments and reports if it detects
> >>> a termination problem. The 39160 is, I believe, similar to the 29000
> >>> series, being mainly a dual-channel version. It may tell you of a
> >>> termination problem if you look carefully at the BIOS output (and try
> >>> not to blink at the wrong time in case you miss it :-)
> >>>
> >>> Depending on how the HBA works, it is possible that a termination
> >>> error on one segment will affect both segments. If, for example, you
> >>> have an external device attached, but no internal devices, then the
> >>> HBA should have termination *on* for the internal segment, and *off*
> >>> for the external segment. [Or "auto", of course.]
> >>>
> >>> Allan
> >>>