Date:      Thu, 28 Dec 1995 23:05:50 +0100
From:      se@zpr.uni-koeln.de (Stefan Esser)
To:        "Rodney W. Grimes" <rgrimes@gndrsh.aac.dev.com>
Cc:        CVS-committers@freefall.freebsd.org, cvs-sys@freefall.freebsd.org, Andrew Russell <arussell@bga.com>, Dmitry Kohmanyuk <dk@dog.farm.org>, Joakim Henriksson <murduth@ludd.luth.se>, Karl Wiebe <karl@hopf.dnai.com>, Rich Beerman <rbeer@jaguar.cris.com>
Subject:   Re: cvs commit: src/sys/pci ncr.c
Message-ID:  <199512282205.AA06029@Sysiphos>
In-Reply-To: "Rodney W. Grimes" <rgrimes@GndRsh.aac.dev.com> "Re: cvs commit: src/sys/pci ncr.c" (Dec 28, 12:50)

On Dec 28, 12:50, "Rodney W. Grimes" wrote:
} Subject: Re: cvs commit: src/sys/pci ncr.c
} > 
} > se          95/12/28 05:04:05
} > 
} >   Modified:    sys/pci   ncr.c
} >   Log:
} >   Preserve SIGP bit when clearing INTF condition.
} 
} Can you expand upon the ramifications of this fix?  Ie, how does the
} problem it fix manifest itself, symptoms, etc.

This is supposed to fix the timeouts (which eventually lead to bus resets)
observed on a few systems over the last few months, e.g.:

% ncr0: SCSI phase error fixup: CCB already dequeued (0xf06bdc00)
% ncr0:2: ERROR (80:100) (e-a9-23) (e0/13) @ (1214:0e000000).
%   script cmd = c0000001
%   reg:     da 10 00 13 47 e0 03 1f 00 0e 82 a9 80 00 01 00.
% ncr0: handshake timeout

I've never had it happen on my system, but Gerard Roudier managed to 
reproduce the problem under Linux (when doing the Linux port :) and 
suggested a fix, which made his system work reliably.

In a way, I'm surprised this fix makes any difference at all, but I've 
got to believe it ...
(It's kind of hard to believe, since the NCR is polled once a second, 
and SIGP is set to 1 on these occasions. For this reason, even if it was 
in fact possible to reset SIGP, that should have hardly any effect. And I 
observed neither a sleep of a few seconds nor the corresponding console 
message written by the timeout handler.)

The interrupt register (sist) contains a number of status bits, and 
writing a 1 to some bit acknowledges recognition of the corresponding 
interrupt condition. It now seems that SIGP (which makes the NCR start 
execution if set) can be reset by writing a 0 into its bit position.
I don't have the NCR manual here right now, so I can't check whether 
this is in fact documented behaviour, but the patch seems to fix the 
problem.

The previous code assumed that writing 0 bits to any of the registers 
was a NOP, but it might in fact be true that the SIGP bit is special 
and reacts not only to a 1 being written (as documented), but also 
to a 0 ...

I'm sure that this change can't break anything, since writing a 1 to 
SIGP is allowed at any time. It will just wake up the NCR if it was 
sleeping, and if nothing is to be done, it will go to sleep again.
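
To illustrate the suspected mechanism, here is a minimal sketch in C. It
is a toy model and not the driver code: the struct, the function names and
the bit masks are assumptions made for this example, and the real bit
positions have to be taken from the NCR manual. The model assumes that
INTF is acknowledged by writing a 1 to it, while SIGP simply takes
whatever value is written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit masks, chosen for illustration only. */
#define SIGP 0x20u
#define INTF 0x04u

/* Toy model of the status register (an assumption, not the real chip). */
struct fake_reg {
    uint8_t value;
};

/* Model of a register write:
 *  - INTF is cleared when a 1 is written to it, a 0 leaves it alone;
 *  - SIGP is assumed to be plain read/write, so a 0 clears it.
 */
void fake_reg_write(struct fake_reg *r, uint8_t w)
{
    assert(r != NULL);
    uint8_t v = r->value;
    if (w & INTF)
        v = (uint8_t)(v & ~INTF);            /* acknowledge INTF */
    v = (uint8_t)((v & ~SIGP) | (w & SIGP)); /* SIGP follows the write */
    r->value = v;
}

/* Old behaviour: write only INTF, assuming that 0 bits are a NOP. */
void ack_intf_old(struct fake_reg *r)
{
    fake_reg_write(r, INTF);
}

/* Fixed behaviour: write back the current SIGP state along with INTF. */
void ack_intf_preserving_sigp(struct fake_reg *r)
{
    fake_reg_write(r, (uint8_t)((r->value & SIGP) | INTF));
}
```

Under these assumptions, calling ack_intf_old() while SIGP is pending 
drops the wakeup request, whereas ack_intf_preserving_sigp() clears INTF 
and leaves SIGP as it was.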

People who might see an improvement are:

 Andrew Russell <arussell@bga.com>
 David Greenman <davidg@root.com>
 Dmitry Kohmanyuk <dk@dog.farm.org>
 Joakim Henriksson <murduth@ludd.luth.se>
 Karl Wiebe <karl@hopf.dnai.com>
 Rich Beerman <rbeer@jaguar.cris.com>
 Satoshi Asami <asami@cs.berkeley.edu>

Some reported only single failures, and I'm not sure whether those 
were caused by transient effects.

I'm CCing this message to the above list of people, and I'd like to hear 
whether the problem still existed with a recent version of the NCR driver, 
and whether the fix helps them ...

(David reported a single failure, and I suppose it didn't repeat? And 
Satoshi reported timeouts with fsck. In most cases the problems were solved 
by disabling tags or upgrading the drive's firmware ...)

Regards, STefan

-- 
 Stefan Esser, Zentrum fuer Paralleles Rechnen		Tel:	+49 221 4706021
 Universitaet zu Koeln, Weyertal 80, 50931 Koeln	FAX:	+49 221 4705160
 ==============================================================================
 http://www.zpr.uni-koeln.de/~se			  <se@ZPR.Uni-Koeln.DE>
