Date:      Fri, 05 Jun 1998 17:23:12 -0400 (EDT)
From:      Simon Shapiro <shimon@simon-shapiro.org>
To:        Greg Lehey <grog@lemis.com>
Cc:        Mike Smith <mike@smith.net.au>, Karl Pielorz <kpielorz@tdx.co.uk>, tcobb <tcobb@staff.circle.net>, "freebsd-current@freebsd.org" <freebsd-current@FreeBSD.ORG>, Michael Hancock <michaelh@cet.co.jp>
Subject:   Re: DPT driver fails and panics with Degraded Array
Message-ID:  <XFMail.980605172312.shimon@simon-shapiro.org>
In-Reply-To: <19980605093046.J768@freebie.lemis.com>

On 05-Jun-98 Greg Lehey wrote:
> On Thu,  4 June 1998 at 12:00:46 -0400, Simon Shapiro wrote:
>>
>> On 03-Jun-98 Greg Lehey wrote:
>>> Why would a driver call biodone on a buffer that doesn't belong to it?
>>
>> The block belongs to it. Only it gets marked as done somehow.
> 
> That in itself is normal enough.  How come it's not busy?

I dunno.  In the driver, when biodone needs to be called, I enter a
critical section, move the block to the proper queue, call biodone, clear
some bits, and release the critical section.  Maybe something times out
some blocks?  I tried for a while to track it down, but found nothing
interesting.
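
A minimal user-space sketch of the completion sequence described above may
make the ordering easier to discuss.  It is not the DPT driver's code.
Every name in it (mock_buf, MOCK_B_BUSY, MOCK_B_DONE, DRV_ACTIVE,
mock_crit_enter, mock_crit_exit, mock_biodone, drv_complete, completed_q)
is a hypothetical stand-in, and the critical section is modeled as a simple
depth counter rather than a real interrupt mask.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-ins; none of these are real kernel or DPT definitions. */
#define MOCK_B_BUSY 0x01u   /* buffer is owned by an in-flight I/O */
#define MOCK_B_DONE 0x02u   /* I/O has completed */
#define DRV_ACTIVE  0x01u   /* hypothetical driver-private state bit */

struct mock_buf {
    unsigned b_flags;        /* generic buffer flags */
    unsigned drv_state;      /* driver-private bits, kept apart from b_flags */
    struct mock_buf *next;   /* link for the completed queue */
};

static int crit_depth;                 /* models the critical section */
static struct mock_buf *completed_q;   /* hypothetical "completed" queue */

static void mock_crit_enter(void) { crit_depth++; }
static void mock_crit_exit(void)  { assert(crit_depth > 0); crit_depth--; }

/* Stand-in for biodone(): here it only marks the buffer as done. */
static void mock_biodone(struct mock_buf *bp)
{
    bp->b_flags |= MOCK_B_DONE;
}

/* Completion path, in the order given in the message. */
static void drv_complete(struct mock_buf *bp)
{
    mock_crit_enter();
    bp->next = completed_q;         /* move the block to the proper queue */
    completed_q = bp;
    mock_biodone(bp);               /* signal completion */
    bp->drv_state &= ~DRV_ACTIVE;   /* clear driver-private bits */
    mock_crit_exit();
}
```

In this model nothing outside drv_complete() touches the buffer, so it
cannot reproduce the failure; it only pins down the order of operations
that a second actor (a timeout, for instance) would have to interleave with.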

...

> I don't know the driver, but I'm surprised you need to maintain
> separate information.  I'd use the state in the bp->b_flags.

I do not replicate b_flags.  I do maintain some other state bits for the
DPT state machine.

>> Since the greatest sensitivity was in the st.c, and st.c is new in CAM, I
>> basically dropped the ball.  Especially when I did not have this problem in
>> 3.0, from very early on.
> 
> I haven't seen a driver called st.c in CAM.  They've changed the
> names, and the tape driver is now called scsi_sa.c.  st.c is the old
> tape driver.  How do you determine "greatest sensitivity"?

If I run (including in 3.0, and SMP) two cpio sessions to two tape drives,
the system panics.  I can access multiple disks, or multiple CD-ROMs
without error, but it is easiest to induce an error with two tape drives.

> In any case, I can't see how a different driver can influence things.
> Heavy tape I/O may help the problem to show itself, but I can't think
> it's in any way to blame.

Next time I am running multiple tape drives, I will write down the failure
mode.  But things happen like this: while one tape is rewinding, the other
stops writing because it suddenly ``is'' at EOT.  Stuff like that.
Please do not go chasing code, as this is a horrible way to describe a
problem.  I'll post more specifics at a later date.

...

>> Are you using two tape drives, and write to both concurrently, using 64k
>> blocks?
> 
> Occasionally.

Without failure?  That's good.

>> Are you running disk I/O at 1500-1900 operations per second?  Is the
>> SCSI controller you use capable of causing biodone to be called
>> within less than 1us from the driver being called?
> 
> Well, I suppose each of the controllers could generate a number of
> interrupts per second, so sooner or later that scenario would arise.
> But as I said above, there's nothing to point to the st driver except
> it's the new kid on the block.  What you have said points fairly and
> squarely to the DPT driver as the culprit.

I fail to see how.  Read my comments carefully.  I am not of the opinion
that the tape driver is at fault.  I simply say that I observe the failure
most dramatically when using DAT drives as destination.

For example, last time I tried, I could not write tapes with any blocking
factor other than 512 bytes, and still be able to read the tape correctly. 
When writing to disks, this restriction does not apply.  Since the code in
the DPT driver is the same, regardless of the nature of the target (or its
address), I naively assumed the DPT driver is not the culprit.

> OK.  What happens if you analyse the buffer header before calling
> biodone and just ignore it if it's not busy?

I dunno.  Excellent suggestion.  I'll try that.  Anyone willing to test
that?
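
For anyone who wants to try it, the suggested check could look roughly like
the user-space sketch below.  It is not a patch against the driver.  The
names (mock_buf, MOCK_B_BUSY, MOCK_B_DONE, mock_biodone,
drv_biodone_guarded, skipped_completions) are hypothetical stand-ins, and
the extra test for an already completed buffer is an assumption that goes
beyond the suggestion as quoted.

```c
#include <assert.h>
#include <stdbool.h>

/* User-space stand-ins; not the real kernel flag values. */
#define MOCK_B_BUSY 0x01u   /* buffer is owned by an in-flight I/O */
#define MOCK_B_DONE 0x02u   /* I/O has completed */

struct mock_buf {
    unsigned b_flags;
};

static unsigned skipped_completions;   /* how often the guard fired */

/* Stand-in for biodone(); the assert models the "not busy" panic. */
static void mock_biodone(struct mock_buf *bp)
{
    assert(bp->b_flags & MOCK_B_BUSY);
    bp->b_flags |= MOCK_B_DONE;
}

/*
 * Inspect the buffer header first and ignore the buffer if it is not busy.
 * Returns true if completion was delivered, false if it was skipped.
 */
static bool drv_biodone_guarded(struct mock_buf *bp)
{
    if ((bp->b_flags & MOCK_B_BUSY) == 0 ||   /* the suggested check */
        (bp->b_flags & MOCK_B_DONE) != 0) {   /* assumption: also skip repeats */
        skipped_completions++;
        return false;
    }
    mock_biodone(bp);
    return true;
}
```

Counting the skipped buffers, rather than dropping them silently, would
show how often the unexpected state occurs and under which load.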

Simon


---


Sincerely Yours, 

Simon Shapiro                                           Shimon@Simon-Shapiro.ORG
                                                        770.265.7340

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message


