Date: Thu, 20 Aug 2009 12:04:13 +0930
From: "Daniel O'Connor" <doconnor@gsoft.com.au>
To: freebsd-stable@freebsd.org
Subject: Blocked process
Message-ID: <200908201204.24914.doconnor@gsoft.com.au>

Hi,

We have several systems doing data acquisition, and I had originally thought we were seeing the interrupt handler for our PCI card not being called quickly enough. However, I misread the diagnostics :)

The digitised data is fed into a FIFO, and when it is part full (32 kbytes) an interrupt is generated. The IRQ routine reads 32 kbyte chunks into a kernel buffer (4 Mbyte) until the part-full flag goes away. If the FIFO full flag is seen (it is latched by the hardware) then acquisition is halted.

The problem now appears to be that the userland process that reads data out of the kernel is being stalled for over 4 seconds. This process reads from the kernel, does some minor processing, and then writes the data out to a child process which does some more work on it.

I ran 'ps -xaulwww' in a loop every second to see what ELSE was using the CPU when it was stalled, and found that my script itself stalled for 7 seconds.

I tried increasing the buffer inside the kernel (to 8 Mbyte), which seemed to have no effect. However, renice'ing the process from -5 to -20 has greatly reduced the frequency of occurrence.

WRT the buffer size: I would expect that increasing it further would reduce the problem, but since I have only increased it to ~4 seconds' worth and the stall is longer than that, I see no effect.

Given that renice'ing has an effect, it seems to be a scheduler problem. I don't see how it can be something to do with the motherboard stalling the whole system, because then the FIFO full error would occur; I only see the kernel buffer filling up.

One other possibility would be something holding a lock for too long that blocks both the DAQ readout process and ps. However, I am not sure how I would find out what.

Unfortunately the system is in Finland and I'm in Australia, so I can't sit at the console :( I am hoping to be able to replicate the HW & SW locally at some stage but haven't been able to yet.

Any help appreciated, thanks!

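For what it's worth, the buffer arithmetic above can be sanity-checked with a rough sketch like the one below. It assumes my "~4 seconds' worth" estimate for the enlarged 8 Mbyte buffer is about right (which implies roughly 2 Mbyte/s of data) and that a Mbyte means 2^20 bytes. The function names are made up for illustration and nothing here comes from the actual driver.

```python
import math

MIB = 1024 * 1024  # assumption: "Mbyte" means 2**20 bytes


def implied_rate(buffer_bytes, seconds_of_data):
    """Data rate (bytes/s) implied by a buffer that holds N seconds of data."""
    return buffer_bytes / seconds_of_data


def seconds_covered(buffer_bytes, rate_bytes_per_s):
    """How long the reader can stall before the buffer fills."""
    return buffer_bytes / rate_bytes_per_s


def buffer_for_stall(rate_bytes_per_s, stall_seconds, margin=1.0):
    """Buffer size (bytes) needed to ride out a stall, with optional margin."""
    return math.ceil(rate_bytes_per_s * stall_seconds * margin)


if __name__ == "__main__":
    # Assumed figure: 8 Mbyte is roughly 4 seconds of data.
    rate = implied_rate(8 * MIB, 4.0)
    print("implied rate (Mbyte/s):", rate / MIB)
    print("4 Mbyte buffer covers (s):", seconds_covered(4 * MIB, rate))
    # The ps loop stalled for 7 seconds, so use that as the worst case seen.
    print("buffer for a 7 s stall (Mbyte):", buffer_for_stall(rate, 7.0) / MIB)
```

If those assumptions hold, neither buffer size covers a 7 second stall, which would be consistent with seeing no improvement from the larger buffer.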
-- 
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
"The nice thing about standards is that there
are so many of them to choose from."
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C
