Date: Wed, 8 Jan 2014 06:44:39 +0000
From: Ben Laurie <ben@links.org>
To: "Justin T. Gibbs" <gibbs@scsiguy.com>
Cc: freebsd-scsi@freebsd.org
Subject: Re: Dropped interrupts
Message-ID: <CAG5KPzxqxmzfnAiUyVyApMs5XqKpH0F0sbTxy1h=4cL=rEROpQ@mail.gmail.com>
In-Reply-To: <84D23688-DDC6-421E-9D21-3DA646229038@scsiguy.com>
References: <CAG5KPzxUAnORPSyJDrSdBBNE7MBi-dD=M6=E1rMG%2Bc8rn4deUQ@mail.gmail.com>
 <AE41EB93-26BD-4BD0-9913-7CD0A5B6E1A4@scsiguy.com>
 <CAG5KPzwR5=WNKc5hck8F7CtCtk3mivwYFRFeCJT_zWdnetW=3w@mail.gmail.com>
 <84D23688-DDC6-421E-9D21-3DA646229038@scsiguy.com>

On 7 January 2014 18:11, Justin T. Gibbs <gibbs@scsiguy.com> wrote:
> On Jan 7, 2014, at 12:36 AM, Ben Laurie <ben@links.org> wrote:
>
>> Attached.
>>
>> On 7 January 2014 05:46, Justin T. Gibbs <gibbs@scsiguy.com> wrote:
>>> On Jan 6, 2014, at 3:01 PM, Ben Laurie <ben@links.org> wrote:
>>>
>>>> Not subscribed to the list, so please cc on replies.
>>>>
>>>> I'm using Bacula with an LTO-2 SCSI drive.
>>>>
>>>> With increasing frequency lately, I've been getting errors like this
>>>> from bacula:
>>>>
>>>> backup-sd JobId 13092: Error: block.c:608 Write error at 23:6772 on
>>>> device "Ultrium" (/dev/nsa0). ERR=Operation not permitted.
>>>>
>>>> Associated with this, I see in dmesg:
>>>>
>>>> ahc0: Recovery Initiated
>>>>
>>>> [a lot of dump info, including…]
>>>
>>> If you provide the dump info, I may be able to tell you why recovery is starting.
>>>
>>> The dmesg information from a boot of the system would be good to have too.
>>>
>>> --
>>> Justin
>
> The target is keeping us in command phase for some reason. No parity or other
> errors are being reported. My guess is that the tape drive does not like the command
> that was issued for some reason.
>
> Attached are two totally untested/uncompiled changes for you to try out. The first
> should give more information about the command that timed out so we can better
> determine if it is well formed. The second is an attempted fix for spurious
> “Interrupts may not be functioning” warnings. Can you attempt to replicate this
> again with these changes?

Rebuilding now - you had a ; missing in the patch :-)

Of course, now I've done this, it'll not fail for a month (it's been
failing multiple times per day recently, but on average it's a lot
rarer than that!).

Will let you know when I get a fresh failure.