Date: Tue, 09 Sep 1997 18:40:51 -0500
From: Doug Ledford <dledford@dialnet.net>
To: "Justin T. Gibbs" <gibbs@plutotech.com>
Cc: Doug Ledford <dledford@dialnet.net>, "Daniel M. Eischen" <deischen@iworks.InterWorks.org>, aic7xxx@freebsd.org
Subject: Re: Interesting anomoly with a 2940UW
Message-ID: <199709092340.SAA09203@dledford.dialnet.net>
In-Reply-To: Your message of "Tue, 09 Sep 1997 16:39:22 MDT." <199709092242.QAA25554@pluto.plutotech.com>

> >Wouldn't matter.  If we did pause things here, then when we unpaused them,
> >the QOUTCNT register would get incremented as we are writing CLRCMDINT to
> >CLRINT, then we would check QOUTCNT again, it would be non-zero, so we would
> >re-run the loop, and we would re-write the CMDOUTCNT variable again.
>
> Sure, but what causes high interrupt latency, Doug?  Other interrupt
> handlers running, or your interrupt handler taking a long time, is what
> causes it.  In FreeBSD, your interrupt handler can be interrupted by any
> non masked interrupts which means, during that window, you could easily be
> diverted from running your loop perhaps long enough for multiple commands
> to pile up which might just be long enough for you to overflow the qoutfifo
> before your interrupt handler is resumed and you can complete your work.
> And, since the sequencer might have stuffed multiple commands into the
> QOUTFIFO since you last read it, the variance in what you write to
> CMDOUTCNT and how full the fifo is could be quite large.  For example:
>
>   qoutcnt <- QOUTCNT == 5
>   Process 5 commands in interrupt handler   10 Commands complete in sequencer
>   Set CMDOUTCNT to 0                        should be 10
>
>   qoutcnt <- QOUTCNT == 10
>   Process 10 commands                       8 Commands complete in sequencer
>   OVERFLOW!!!!

As I think you noticed later on in the email, this is moot since we have
interrupts disabled as we are grabbing the QOUTFIFO entries and putting them
on our internal completion queue and we aren't doing completion processing
here, so we should be well outrunning the sequencer.  Also, since we loop
until QOUTCNT goes to 0, we know that we've grabbed everything with only a
*very* small window for one command to complete after we check (and according
to the comments in the aic7xxx.c file, Dan structured the isr in an attempt to
defeat this window, so maybe that doesn't even exist).

> Okay.  I'll tell you again.  It's inefficient code.
> In the example you cite, you're only able to fill the QOUTFIFO 56 times
> after performing how many transactions???  Probably a few hundred thousand
> on a busy news server if not more.
>
> I never said that you don't have high interrupt latency.  What I said was
> that I don't have high interrupt latency, but of course, I don't run
> Linux.  In my system, the hardware interrupt handler for the aic7xxx card
> simply removes the entry from the QOUTFIFO, sets a few status bits in
> the generic SCSI structure associated with this transaction and queues it
> to a software interrupt handler.

Which is exactly what we do in that particular section of code; we do the
call to scsi_done later on, which is where our latency comes from.  So, yes,
the code is inefficient, but in a best case scenario, we should pause, write,
unpause only once per interrupt.  In a worst case scenario we would pause,
write, unpause twice in an interrupt (because a command completed while we
were reading the qoutfifo).  Without the benefit of a PCI bus analyzer, it
would seem to me that this is better than the lazy updates to the CMDOUTCNT
register in the fashion that you use them.  However, I would agree with lazy
updates if they were done something like this (something I thought about in
between the time I wrote you and you responded):

    p->cmdoutcnt += qoutcnt;
    .... do stuff ....
    if ((p->flags & PAGE_ENABLED) && (p->cmdoutcnt > (p->qfullcnt >> 1)))
    {
      outb(0, p->base + CMDOUTCNT);
    }

At least this way, with our high latency, we wouldn't risk spin locking on
only a few commands (unless the fifo depth was very small).  Instead, we would
update the variable once we got half way full each time, and that would leave
half of the real depth as an effective, always correct space count.

> It's a shame that Linux doesn't offer a decent software interrupt strategy,
> but that's not my problem.
> You should still be able to get decent latency for setting the CMDOUTCNT
> back to 0 if you clear the QOUTFIFO first, putting entries into a list,
> setting CMDOUTCNT to zero, then processing the entries on the list.  You
> are probably getting into your interrupt handler plenty fast, but getting
> crushed by the overhead of generic SCSI processing at interrupt time.

Getting in, getting out.  It doesn't matter.  If our interrupt handler gets in
plenty fast to grab things, but then gets delayed in our tail end execution,
then we still block the sequencer until we can get out and get re-entered.

> Wow.  I never knew that you used to run your interrupt handler with all
> other interrupts disabled.  Don't your network servers drop packets like
> crazy when you do this?

Ummm...there are assorted problems under very high load, but fortunately in my
case anyway, the network card I use has a rather large rx ring buffer that is
accessed via DMA, so it tends to survive (or if it does drop packets, it
doesn't say anything).  However, the change I mentioned in regards to enabling
interrupts specifically during the completion processing has a good deal of
impact on that situation.

>
> >The second reason I wrote it that way is because of this.  Let's say your
> >code answers an interrupt with two commands on the QOUTFIFO, and
> >p->cmdoutcnt == 12, then cmdoutcnt will get incremented to 14 while the
> >QOUTFIFO goes to zero.  Now, if the next interrupt has a high latency, then
> >you may end up using that spin lock far before you ever reach the QOUTFIFO
> >depth since you didn't update the CMDOUTCNT variable during the last isr.
> >So, which is more inefficient, allowing a high latency interrupt to block
> >with only a command or two complete, or writing out the actual CMDOUTCNT on
> >each interrupt routine when we are already writing to the card?  Keep in
> >mind the interrupt latency that we see sometimes.
>
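
The scenario quoted just above (count at 14 with an empty QOUTFIFO) can be put into numbers with a small model.  This is only a sketch under stated assumptions: the struct, the helper names `headroom` and `isr_tail`, and the FIFO depth of 16 used in the note below are hypothetical and do not come from either driver.  It only counts how many completions the sequencer could still post before it would spin on CMDOUTCNT.

```c
#include <assert.h>

/* Hypothetical model of the CMDOUTCNT handshake; none of this is driver code. */
struct cmdout_model {
    unsigned qfullcnt;   /* QOUTFIFO depth; the caller picks a value */
    unsigned cmdoutcnt;  /* completions counted since the host last wrote 0 */
};

/* Completions the sequencer may still post before it spins on CMDOUTCNT. */
static unsigned headroom(const struct cmdout_model *m)
{
    assert(m->cmdoutcnt <= m->qfullcnt);
    return m->qfullcnt - m->cmdoutcnt;
}

/*
 * Tail of one ISR pass that has already drained the FIFO.  The host writes 0
 * to CMDOUTCNT only when the running count exceeds `threshold`:
 *   threshold == 0             -> rewrite whenever anything completed
 *   threshold == qfullcnt >> 1 -> the "half way full" variant
 *   threshold >= qfullcnt      -> never rewrite (worst case for a lazy update)
 */
static void isr_tail(struct cmdout_model *m, unsigned threshold)
{
    if (m->cmdoutcnt > threshold)
        m->cmdoutcnt = 0;   /* stands in for outb(0, p->base + CMDOUTCNT) */
}
```

With an assumed depth of 16 and a count of 14 after the interrupt, a threshold that is never reached leaves a headroom of 2, while a threshold of 0 or of 8 (16 >> 1) resets the count and leaves the full 16.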
> I'm fully aware that CMDOUTCNT does not directly track the current state
> of the FIFO.  I wanted a lazy update as it means I only have to do a single
> write which can be done with AAP.  In order for your algorithm to work, you
> have to perform a read and a write with the sequencer paused and having
> looked at what this does with a PCI bus analyzer, it's simply not worth
> it.

Says who?  When we go through that code the sequencer is in one of three
states.  One, running.  Two, spin locked for the CMDOUTCNT variable.  Three,
paused for some other INT condition (seqint, scsiint).

If we are running and we write a 0 to CMDOUTCNT, then we've got from the time
we write until after we've written to CLRINT for another command to complete.
If we hit the race window you are talking about, then we should re-run our
loop as we read the QOUTCNT register, see we have another command, re-run the
loop, re-write the CMDOUTCNT variable, race fixed because we simply wrote a 0
over a 0 while also emptying the QOUTFIFO.  If we are spin locking, then when
we write the variable and unpause, we end up nearly immediately writing to
QOUTFIFO in the sequencer, we catch that in QOUTCNT (since there is a delay as
we write to CLRINT) and we re-run the loop.  If we are paused for a seqint or
scsiint, then we don't unpause, we aren't near a command completion, race
window doesn't exist.

Now, without a PCI analyzer to guide me on this, I could be wrong, but it
seems to me that as small as the race window is that you pointed out in the
sequencer, if we hit that race window, the extra check of the actual QOUTCNT
register a few lines later after having written to CLRINT should catch that
race.  The only way for it to miss is if we are able to complete the

    unpause_sequencer();
    outb(CLRCMDINT, p->base + CLRINT);
    interrupts_cleard++;
    inb(p->base + QOUTCNT);

faster than the sequencer can do a

    mov QOUTFIFO, SCB_TAG;

This is if we happened to pause the sequencer right after the

    inc CMDOUTCNT;

statement.
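
The re-check argument above can be sketched as a toy model of the loop.  Everything here is an assumption made for illustration: `struct fake_card`, its `late` field and `drain_qoutfifo` are hypothetical names, plain variables stand in for the card registers, and the model only covers the case where the late completion reaches the QOUTFIFO before QOUTCNT is read again.  It does not model the miss case described above, where the host finishes the re-read faster than the sequencer posts the entry.

```c
#include <assert.h>

/* Hypothetical stand-in for the card; real code would use inb()/outb(). */
struct fake_card {
    unsigned qoutcnt;    /* entries waiting in the QOUTFIFO */
    unsigned cmdoutcnt;  /* sequencer's completion counter */
    unsigned late;       /* completions that land inside the race window */
};

/* Returns how many times the loop body (pause, write, unpause) ran. */
static unsigned drain_qoutfifo(struct fake_card *c, unsigned qfullcnt)
{
    unsigned passes = 0;

    assert(c->qoutcnt <= qfullcnt);
    while (c->qoutcnt != 0) {     /* stands in for inb(p->base + QOUTCNT) */
        c->qoutcnt = 0;           /* pull every entry off the FIFO */
        c->cmdoutcnt = 0;         /* pause, write 0 to CMDOUTCNT, unpause */
        if (c->late != 0) {       /* a command completes before the re-check */
            c->late--;
            c->cmdoutcnt++;
            c->qoutcnt++;
        }
        passes++;                 /* CLRCMDINT written to CLRINT, then re-check */
    }
    return passes;
}
```

In this model a card with no late completion is drained in one pass, and one late completion costs exactly one more pass, after which both counters are back at zero.  That matches the "once per interrupt, twice in the worst case" estimate given earlier in the message.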

The other possible race is if the sequencer is spin locked, but then it does
the inc after we have written to CMDOUTCNT, so that isn't really a race at
all.  That's why I don't bother to re-read the QOUTCNT register, because if it
isn't 0, then we are going to re-run the loop anyway.

> >Also, who's to say the
> >reason you don't see messages about the QOUTCNT isn't due to this very
> >condition instead of interrupt latency?  A better test to see if this
> >algorithm does what you want would be not to check and print a message about
> >the QOUTFIFO depth, but check to see if your sequencer is spin locking on
> >CMDOUTCNT and holding up the bus.
>
> Actually, I incremented a count in sequencer scratch ram for every time I
> hit the lock.  Either every time I went to look it had wrapped to 0 or my
> lock was never hit.  As I said before, you are probably getting into
> your interrupt handler plenty fast, it's just that your interrupt handler
> runs for a long time before you go back and clean out the queue.

That's good for BSD, but I suspect that if you checked that lock under Linux,
it would be incrementing.  Our basic flow of the isr is like this:

    handle cmdcmplt interrupts
    handle seqint
    handle scsiint
    (sequencer should be unpaused at this point)
    enable interrupts again
    run completion processing
    exit isr

While we are doing the completion processing, the kernel won't allow our isr
to be re-entrant, so that's the cause of our latency, but regardless, it's
still an occasionally long time before we get around to re-entering ourselves
and cleaning the queue out again.

--
*****************************************************************************
* Doug Ledford                      *   Unix, Novell, Dos, Windows 3.x,     *
* dledford@dialnet.net    873-DIAL  *   WfW, Windows 95 & NT Technician     *
*   PPP access $14.95/month         *****************************************
*   Springfield, MO and surrounding * Usenet news, e-mail and shell account.*
*   communities.  Sign-up online at * Web page creation and hosting, other  *
*   873-9000 V.34                   * services available, call for info.    *
*****************************************************************************