Date:      Thu, 9 Jan 2014 07:37:02 +0000
From:      Ben Laurie <ben@links.org>
To:        "Justin T. Gibbs" <gibbs@scsiguy.com>
Cc:        freebsd-scsi@freebsd.org
Subject:   Re: Dropped interrupts
Message-ID:  <CAG5KPzyPXLEOAFKTAW%2BJaai=XbT3mi5yGHs1LnUbMcD-LfQ1YQ@mail.gmail.com>
In-Reply-To: <CAG5KPzxqxmzfnAiUyVyApMs5XqKpH0F0sbTxy1h=4cL=rEROpQ@mail.gmail.com>
References:  <CAG5KPzxUAnORPSyJDrSdBBNE7MBi-dD=M6=E1rMG%2Bc8rn4deUQ@mail.gmail.com> <AE41EB93-26BD-4BD0-9913-7CD0A5B6E1A4@scsiguy.com> <CAG5KPzwR5=WNKc5hck8F7CtCtk3mivwYFRFeCJT_zWdnetW=3w@mail.gmail.com> <84D23688-DDC6-421E-9D21-3DA646229038@scsiguy.com> <CAG5KPzxqxmzfnAiUyVyApMs5XqKpH0F0sbTxy1h=4cL=rEROpQ@mail.gmail.com>

On 8 January 2014 06:44, Ben Laurie <ben@links.org> wrote:
> On 7 January 2014 18:11, Justin T. Gibbs <gibbs@scsiguy.com> wrote:
>> On Jan 7, 2014, at 12:36 AM, Ben Laurie <ben@links.org> wrote:
>>
>>> Attached.
>>>
>>> On 7 January 2014 05:46, Justin T. Gibbs <gibbs@scsiguy.com> wrote:
>>>> On Jan 6, 2014, at 3:01 PM, Ben Laurie <ben@links.org> wrote:
>>>>
>>>>> Not subscribed to the list, so please cc on replies.
>>>>>
>>>>> I'm using Bacula with an LTO-2 SCSI drive.
>>>>>
>>>>> With increasing frequency lately, I've been getting errors like this
>>>>> from bacula:
>>>>>
>>>>> backup-sd JobId 13092: Error: block.c:608 Write error at 23:6772 on
>>>>> device "Ultrium" (/dev/nsa0). ERR=Operation not permitted.
>>>>>
>>>>> Associated with this, I see in dmesg:
>>>>>
>>>>> ahc0: Recovery Initiated
>>>>>
>>>>> [a lot of dump info, including...]
>>>>
>>>> If you provide the dump info, I may be able to tell you why recovery is starting.
>>>>
>>>> The dmesg information from a boot of the system would be good to have too.
>>>>
>>>> --
>>>> Justin
>>
>> The target is keeping us in command phase for some reason.  No parity or other
>> errors are being reported.  My guess is that the tape drive does not like the command
>> that was issued for some reason.
>>
>> Attached are two totally untested/uncompiled changes for you to try out.  The first
>> should give more information about the command that timed out so we can better
>> determine if it is well formed.  The second is an attempted fix for spurious
>> "Interrupts may not be functioning" warnings.  Can you attempt to replicate this
>> again with these changes?
>
> Rebuilding now - you had a ; missing in the patch :-)
>
> Of course, now I've done this, it'll not fail for a month (it's been
> failing multiple times per day recently, but on average it's a lot
> rarer than that!).
>
> Will let you know when I get a fresh failure.

As predicted, it has now done 3 complete tapes with no problems, and
is on the fourth.


