Date:      Wed, 8 Jan 2014 06:44:39 +0000
From:      Ben Laurie <ben@links.org>
To:        "Justin T. Gibbs" <gibbs@scsiguy.com>
Cc:        freebsd-scsi@freebsd.org
Subject:   Re: Dropped interrupts
Message-ID:  <CAG5KPzxqxmzfnAiUyVyApMs5XqKpH0F0sbTxy1h=4cL=rEROpQ@mail.gmail.com>
In-Reply-To: <84D23688-DDC6-421E-9D21-3DA646229038@scsiguy.com>
References:  <CAG5KPzxUAnORPSyJDrSdBBNE7MBi-dD=M6=E1rMG%2Bc8rn4deUQ@mail.gmail.com> <AE41EB93-26BD-4BD0-9913-7CD0A5B6E1A4@scsiguy.com> <CAG5KPzwR5=WNKc5hck8F7CtCtk3mivwYFRFeCJT_zWdnetW=3w@mail.gmail.com> <84D23688-DDC6-421E-9D21-3DA646229038@scsiguy.com>

On 7 January 2014 18:11, Justin T. Gibbs <gibbs@scsiguy.com> wrote:
> On Jan 7, 2014, at 12:36 AM, Ben Laurie <ben@links.org> wrote:
>
>> Attached.
>>
>> On 7 January 2014 05:46, Justin T. Gibbs <gibbs@scsiguy.com> wrote:
>>> On Jan 6, 2014, at 3:01 PM, Ben Laurie <ben@links.org> wrote:
>>>
>>>> Not subscribed to the list, so please cc on replies.
>>>>
>>>> I'm using Bacula with an LTO-2 SCSI drive.
>>>>
>>>> With increasing frequency lately, I've been getting errors like this
>>>> from bacula:
>>>>
>>>> backup-sd JobId 13092: Error: block.c:608 Write error at 23:6772 on
>>>> device "Ultrium" (/dev/nsa0). ERR=Operation not permitted.
>>>>
>>>> Associated with this, I see in dmesg:
>>>>
>>>> ahc0: Recovery Initiated
>>>>
>>>> [a lot of dump info, including...]
>>>
>>> If you provide the dump info, I may be able to tell you why recovery is starting.
>>>
>>> The dmesg information from a boot of the system would be good to have too.
>>>
>>> --
>>> Justin
>
> The target is keeping us in command phase for some reason.  No parity or other
> errors are being reported.  My guess is that the tape drive does not like the command
> that was issued for some reason.
>
> Attached are two totally untested/uncompiled changes for you to try out.  The first
> should give more information about the command that timed out so we can better
> determine if it is well formed.  The second is an attempted fix for spurious
> "Interrupts may not be functioning" warnings.  Can you attempt to replicate this
> again with these changes?

Rebuilding now - you had a ; missing in the patch :-)

Of course, now I've done this, it'll not fail for a month (it's been
failing multiple times per day recently, but on average it's a lot
rarer than that!).

Will let you know when I get a fresh failure.


