Date: Mon, 13 Oct 1997 15:01:04 -0700 (PDT)
From: Simon Shapiro <Shimon@i-Connect.Net>
To: (Curt Welch) <curt@kcwc.com>
Cc: scsi@FreeBSD.ORG, Jaye Mathisen <mrcpu@cdsnet.net>
Subject: Re: Still having some amusing DPT problems.
Message-ID: <XFMail.971013150104.Shimon@i-Connect.Net>
In-Reply-To: <9710131940.AA12601@mail.kcwc.com>
Hi Curt Welch;

On 13-Oct-97 you wrote:
> > Invariably, after 20-30 hours of getting hammered, the
> > box dies with "DPT: Undocumented error", and drops into
> > DDB. If I take out DDB, I get a kazillion messages about
> > stale transactions that aren't really dead, and a bunch
> > of other errors.
If you remove a species you upset the delicate balance of nature :-)
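Instead, change the lines:

        default:
                Debugger("DPT: Undocumented Error");

With:

        default:
                printf("DPT: Undocumented Error %x\n",
                       ccb->status_packet.hba_stat);
                Debugger("DPT: Undocumented Error");

at the end of dpt_process_completion(). Debugger() takes a plain message
string, not a printf-style format, so the status is printed separately.
This will give us a clue as to what the DPT is moaning about :-)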
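To try the idea outside the kernel, here is a minimal user-space sketch of a completion switch whose default case reports the unrecognized status byte. Only the hba_stat field name comes from the snippet above; the struct layout, the two status constants and the function name are stand-ins invented for illustration, not the driver's actual definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical status codes; the real driver's values may differ. */
enum { HBA_NO_ERROR = 0x00, HBA_SEL_TIMEOUT = 0x01 };

/* Hypothetical, cut-down status packet carrying only the field we need. */
struct status_packet {
    unsigned char hba_stat;
};

/*
 * Write a one-line description of the HBA status into buf.
 * Returns 1 if the status is recognized, 0 if it fell through to default.
 */
static int describe_hba_status(const struct status_packet *sp,
                               char *buf, size_t len)
{
    assert(sp != NULL && buf != NULL && len > 0);

    switch (sp->hba_stat) {
    case HBA_NO_ERROR:
        snprintf(buf, len, "DPT: no error");
        return 1;
    case HBA_SEL_TIMEOUT:
        snprintf(buf, len, "DPT: selection timeout");
        return 1;
    default:
        /* Include the raw value so the unknown status can be identified. */
        snprintf(buf, len, "DPT: Undocumented Error %x",
                 (unsigned)sp->hba_stat);
        return 0;
    }
}
```

The point is only that the default branch carries the raw status value along with the message, so the next crash report says which code was seen.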
> I had similar symptoms. I'm running the DPT on a busy
> usenet news server. It would never run for more than a day
> or two without crashing. Messages about stale transactions
> were one of the problems I saw. I never used DDB. I've
> never seen the "Undocumented error" diagnostic.
>
> However, once I upgraded to version 1.2.3 of the DPT driver,
> my problems went away (2 months of no problems). But what
> happened with that release is that the disk performance
> improved enough to allow my news server to keep up with the
> news for the first time. This means that every 5 minutes
> or so, the incoming feeds complete and the disks get to
> "catch up" - i.e. flush their cache. Before version 1.2.3
> the news server could never keep up and was therefore busy
> non-stop 24 hours a day.
1.2.4 made some performance improvement and fixed some bugs. It is a
(proud) fact that FreeBSD allows very heavy loads to be imposed on the
system. Let's debug these...
> Are you using the DPT_HANDLE_TIMEOUTS? If you are getting messages
> about stale transactions I think you must have it on. Try turning
> it off. As I understand it, this makes the DPT driver check
> for transactions that take too long to complete. But "Too long"
> is just a calculated value that could well be too short for a very
> heavily loaded system. This was one of the problems I was
> having. The DPT driver was aborting transactions that took too
> long (>20 seconds or so), when it shouldn't have. If it had
> waited a bit longer, the transactions would have finished fine on
> their own. These long waits are a side effect of a large cache on
> the controller. I had the full 64Meg cache on mine.
The ``Undocumented Error'' should not crop up, regardless of load.
The long delays are not exactly a result of the large cache. They are
despite the large cache. There is a starvation condition with certain
disks. The only time a large cache slows things down is in flushing. The
bigger the cache, the more there is to flush and the longer the wait.
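As a back-of-the-envelope illustration of the flushing point, the sketch below divides the cache size by a sustained drain rate. The drain rate is an assumed input chosen for illustration, not a measured figure for any DPT controller or disk, and the function name is made up.

```c
#include <assert.h>

/*
 * Rough worst-case time, in seconds, to flush a completely dirty cache
 * of cache_mb megabytes when the disks absorb writes at
 * drain_mb_per_sec megabytes per second (an assumed, illustrative rate).
 */
static double flush_seconds(double cache_mb, double drain_mb_per_sec)
{
    assert(cache_mb >= 0.0 && drain_mb_per_sec > 0.0);
    return cache_mb / drain_mb_per_sec;
}
```

With a full 64 MB cache and a hypothetical drain rate of 2 MB/s, flush_seconds(64.0, 2.0) comes to 32 seconds, which is longer than the roughly 20 second limit mentioned in the quoted text.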
> If you are having real problems with lost transactions, then you
> might have to leave the option on, and instead try changing
> how it calculates the timeout values.
>
> The only options I'm using are:
>
> DPT_MEASURE_PERFORMANCE
> DPT_TIMEOUT_FACTOR=4
Change this factor to increase the timeout.
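To make the effect of the factor concrete, here is a minimal sketch that assumes the factor simply multiplies a base timeout. The driver's real calculation is not shown in this thread, so the formula, the function names and the millisecond units are all assumptions; only the DPT_TIMEOUT_FACTOR name and the value 4 come from the options quoted above.

```c
#ifndef DPT_TIMEOUT_FACTOR
#define DPT_TIMEOUT_FACTOR 4    /* value from the options quoted above */
#endif

/* Hypothetical: scale a base timeout (milliseconds) by the factor. */
static unsigned long scaled_timeout_ms(unsigned long base_ms)
{
    return base_ms * DPT_TIMEOUT_FACTOR;
}

/*
 * Hypothetical staleness check: a transaction counts as stale only
 * once it has been outstanding longer than the scaled timeout.
 */
static int is_stale(unsigned long elapsed_ms, unsigned long base_ms)
{
    return elapsed_ms > scaled_timeout_ms(base_ms);
}
```

Under these assumptions a 5000 ms base timeout becomes 20000 ms with a factor of 4, so a transaction that has been outstanding for 15000 ms is left alone while one at 25000 ms is flagged.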
>
>
> Simon of course can correct anything wrong I might have
> said above....
>
> --
>
> Curt Welch
> http://CurtWelch.Com/
>
> (Just trying to take some of the load off of Simon after all
> the time he spent helping get my system stable...)
>
>
>
---
Sincerely Yours,
Simon Shapiro Atlas Telecom
Senior Architect 14355 SW Allen Blvd., Suite 130 Beaverton OR 97005
Shimon@i-Connect.Net Voice: 503.799.2313
