Date: Fri, 17 May 1996 19:36:49 -0400 (EDT)
From: Brian Tao <taob@io.org>
To: FREEBSD-HACKERS-L <freebsd-hackers@freebsd.org>
Subject: Slow tty updates and high load, but idle CPU
Message-ID: <Pine.NEB.3.92.960517185251.6632A-100000@zap.io.org>

I've got an odd problem with the 2.1.0-RELEASE shell servers here. I'm not sure where to begin, except to say that screen refreshes become very slow all of a sudden. When I say "slow", I mean that the screen is redrawn in "chunks". For example, if I hit ^L in a Pine window to refresh the screen, the screen clears, and groups of 5 or 6 lines will reappear, exactly one second apart.

This "chunky updating" appears to happen in blocks of 600 characters or so. It is not oriented on line boundaries, since a refresh chunk can stop in the middle of a line. Each chunk is redrawn at line speed (e.g., instantaneously if I'm on the local Ethernet, slightly slower over ISDN). Then there is a one-second pause, followed by the next chunk. My 140x66 xterm window takes 16 seconds to redraw in this fashion.

I've determined that it isn't a tty setting problem, or the tty driver sending out nulls. It isn't a problem with the network, because it happens right on the console as well. It happens to all users on the system simultaneously, and persists until a reboot. It seems to occur after about 14 days of uptime.

I believe it is related to a tty-related call, but I haven't tracked down which one yet. As I mentioned, Pine has this problem. In fact, any program that needs to open the tty will have this problem (including more, less, vi, trn, tin, top, lynx, irc, etc.). If I run a command such as "ps -auxww" or "ls -l /dev" or the csh "history" command, the output is displayed at full speed.

The "cat" command is also affected. Whereas "ls -l /dev" is displayed at full speed, "ls -l /dev | cat" is chunky. However, if I log in to the affected server without a tty (i.e., "rsh server /bin/sh -i") and do a "ls -l /dev | cat", it proceeds at full speed. So it appears that the kernel tty interface code may have something to do with this.

I also noticed that the affected server will suddenly report a high load average.
For example:

    # uptime
    5:21PM up 15 days, 17:15, 39 users, load averages: 6.77, 5.27, 2.70

Normally the load never rises above 1.00 even with double the number of users. When I check with ps, there are only one or two processes currently in a run state (one being 'ps'). vmstat and top report over 90% CPU idle time. xperfmon++ does not indicate abnormal network, disk, I/O, interrupt, or swap usage. In fact, everything looks perfectly normal when compared to another one of our shell servers that doesn't have the problem but has an uptime of only 6 days.

This problem has yet to occur on our other servers, which also run 2.1.0R but don't handle interactive logins.

It's as if the tty drivers do a sleep(1) between buffer writes, once a certain number of bytes have been output over the uptime of the server. Has anyone else seen these problems?

--
Brian Tao (BT300, taob@io.org, taob@ican.net)
Systems and Network Administrator, Internet Canada Corp.
"Though this be madness, yet there is method in't"
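
As a rough cross-check of the figures in the message, and one way to look for the stalls it describes, here is a minimal Python sketch. It is not from the original post: the function names, the 64-byte write size, and the 0.5-second threshold are hypothetical choices. The estimate assumes about one byte per character cell with no escape sequences, and the probe assumes the stall shows up as a write() call that blocks.

```python
import math
import os
import time


def estimate_redraw_seconds(cols, rows, chunk_bytes=600, pause_s=1.0):
    """Estimate a full-screen redraw time if output pauses once per chunk.

    Assumption: one byte per character cell, no escape sequences.
    """
    total_bytes = cols * rows
    chunks = math.ceil(total_bytes / chunk_bytes)
    return chunks * pause_s


def find_write_stalls(fd, data, piece=64, threshold_s=0.5):
    """Write data to fd in small pieces and time each write() call.

    Returns a list of (byte_offset, seconds) for every write that took
    at least threshold_s seconds to return.
    """
    stalls = []
    offset = 0
    while offset < len(data):
        start = time.monotonic()
        written = os.write(fd, data[offset:offset + piece])
        elapsed = time.monotonic() - start
        if elapsed >= threshold_s:
            stalls.append((offset, elapsed))
        offset += written  # advance by what the kernel actually accepted
    return stalls
```

With the reported numbers, estimate_redraw_seconds(140, 66) gives 16.0 (9240 bytes in 16 chunks of up to 600 bytes), which agrees with the 16-second redraw. Calling find_write_stalls(1, b"x" * 9240) from a login on an affected tty might then show a stall roughly every 600 bytes if the description holds, while a healthy tty would be expected to return an empty list.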