From owner-freebsd-hackers Fri May 17 16:37:59 1996
Return-Path: owner-hackers
Received: (from root@localhost) by freefall.freebsd.org (8.7.3/8.7.3) id QAA00407 for hackers-outgoing; Fri, 17 May 1996 16:37:59 -0700 (PDT)
Received: from io.org (io.org [198.133.36.1]) by freefall.freebsd.org (8.7.3/8.7.3) with SMTP id QAA00402 for ; Fri, 17 May 1996 16:37:57 -0700 (PDT)
Received: from zap.io.org (taob@zap.io.org [198.133.36.81]) by io.org (8.6.12/8.6.12) with SMTP id TAA13823 for ; Fri, 17 May 1996 19:37:51 -0400
Date: Fri, 17 May 1996 19:36:49 -0400 (EDT)
From: Brian Tao
To: FREEBSD-HACKERS-L
Subject: Slow tty updates and high load, but idle CPU
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-hackers@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

I've got an odd problem with the 2.1.0-RELEASE shell servers here.
I'm not sure where to begin, except to say that screen refreshes
become very slow all of a sudden.

When I say "slow", I mean that the screen is redrawn in "chunks".
For example, if I hit ^L in a Pine window to refresh the screen, the
screen clears, and groups of 5 or 6 lines will reappear, exactly one
second apart. This "chunky updating" appears to happen with
600-character blocks or so. It is not oriented on line boundaries,
since a refresh chunk can stop in the middle of a line. Each chunk
will be redrawn at line speed (e.g., instantaneously if I'm on the
local Ethernet, slightly slower over ISDN). Then there will be a
one-second pause, followed by the next chunk. My 140x66 xterm window
takes 16 seconds to redraw in this fashion.

I've determined that it isn't a tty setting problem, or the tty
driver sending out nulls. It isn't a problem with the network,
because it happens right on the console as well. It will happen to
all users on the system simultaneously, and persist until a reboot.
It seems to occur after about 14 days of uptime.

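As a rough sanity check on the numbers above, the 16-second redraw is about what one would expect if a full 140x66 screen goes out in 600-byte chunks with a one-second stall per chunk. The sketch below only does that arithmetic. The function name `redraw_estimate` is made up for illustration, and it assumes roughly one byte per screen cell, ignoring cursor-addressing escape sequences and any blank regions the program skips.

```python
import math


def redraw_estimate(cols, rows, chunk_bytes, pause_s):
    """Estimate (chunks, seconds) for a full-screen redraw sent in fixed-size chunks.

    Assumption: one byte per screen cell, no escape-sequence overhead.
    """
    screen_bytes = cols * rows
    # A partial final chunk still counts as one chunk.
    chunks = math.ceil(screen_bytes / chunk_bytes)
    # Assumption: each chunk costs one pause; line-speed transfer time is ignored.
    return chunks, chunks * pause_s


chunks, seconds = redraw_estimate(140, 66, 600, 1.0)
print(chunks, seconds)  # prints: 16 16.0
```

140 x 66 is 9240 cells, which at 600 bytes per chunk rounds up to 16 chunks, so the observed delay looks like a fixed stall per output block rather than a throughput limit.
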
I believe it is related to a tty-related call, but I haven't
tracked down which yet. As I mentioned, Pine has this problem. In
fact, any program that needs to open the tty will have this problem
(including more, less, vi, trn, tin, top, lynx, irc, etc.). If I run
a command such as "ps -auxww" or "ls -l /dev" or the csh "history"
command, the output is displayed at full speed.

The "cat" command is also affected. Whereas "ls -l /dev" is
displayed at full speed, "ls -l /dev | cat" is chunky. However, if I
log in to the affected server without a tty (i.e., "rsh server /bin/sh
-i") and do a "ls -l /dev | cat", it proceeds at full speed. So it
appears that the kernel tty interface code may have something to do
with this.

I also noticed that the affected server will suddenly report a
high load average. For example:

    # uptime
     5:21PM  up 15 days, 17:15, 39 users, load averages: 6.77, 5.27, 2.70

Normally the load never rises above 1.00 even with double the
number of users. When I check with ps, there are only one or two
processes currently in a run state (one being 'ps'). vmstat and top
report over 90% CPU idle time. xperfmon++ does not indicate abnormal
network, disk, I/O, interrupt, or swap usage. In fact, everything
looks perfectly normal when compared to another one of our shell
servers that doesn't have the problem, but with an uptime of only 6
days. This problem has yet to occur on our other servers, which are
also running 2.1.0R but don't handle interactive logins.

It's as if the tty drivers do a sleep(1) between writing out each
buffer, after a certain number of bytes have been output over the
uptime of the server. Has anyone else seen these problems?
--
Brian Tao (BT300, taob@io.org, taob@ican.net)
Systems and Network Administrator, Internet Canada Corp.
"Though this be madness, yet there is method in't"