From owner-freebsd-current Tue Apr 23 16:27:05 1996
Return-Path: owner-current
Received: (from root@localhost)
          by freefall.freebsd.org (8.7.3/8.7.3) id QAA16300
          for current-outgoing; Tue, 23 Apr 1996 16:27:05 -0700 (PDT)
Received: from rah.star-gate.com (rah.star-gate.com [204.188.121.18])
          by freefall.freebsd.org (8.7.3/8.7.3) with SMTP id QAA16274
          Tue, 23 Apr 1996 16:26:50 -0700 (PDT)
Received: from rah.star-gate.com (localhost.star-gate.com [127.0.0.1])
          by rah.star-gate.com (8.6.12/8.6.12) with ESMTP id QAA00460;
          Tue, 23 Apr 1996 16:25:20 -0700
Message-Id: <199604232325.QAA00460@rah.star-gate.com>
X-Mailer: exmh version 1.6.5 12/11/95
To: "Marc G. Fournier"
cc: current@FreeBSD.ORG, hackers@FreeBSD.ORG
Subject: Re: Intelligent Debugging Tools...
In-reply-to: Your message of "Tue, 23 Apr 1996 18:00:06 EDT."
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Tue, 23 Apr 1996 16:25:19 -0700
From: "Amancio Hasty Jr."
Sender: owner-current@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

Wow,

A bunch of stuff to try out:

1. swap out the memory
2. try a different scsi controller
3. check out the scsi cables
4. if the scsi bus does not have an active terminator try to get one
5. make sure that one of the drives is sending the termination power
6. check termination on all drives
7. make sure that the scsi drives are good. This is a tough one: my
   last scsi drive loved to crash my system with symptoms very similar
   to yours, and to say the least it was a mess. Cured the problem by
   junking the drive. Since you mentioned swap out problems I would
   concentrate on the drives on which you have a swap partition.
8. Compile a kernel with kgdb so that when the system crashes you can
   hopefully pop into the debugger and send us a stack trace.
9. See if you can get hold of another vga card, something like an ISA
   et4000 based one.

	Phew, good luck,
	Amancio

>>> "Marc G. Fournier" said:
> On Tue, 23 Apr 1996, Amancio Hasty Jr. wrote:
>
> > Hi,
> >
> > It will help if you post your hardware configuration. A few months
> > ago there was a nasty PCI interaction in the kernel which caused
> > my system to crash so glad that whatever it was is gone 8)
> >
>
> dmesg:
> FreeBSD 2.1-STABLE #0: Tue Apr 23 10:33:56 EDT 1996
>     scrappy@ki.net:/usr/src/sys/compile/kinet
> CPU: i486 DX4 (486-class CPU)
>   Origin = "GenuineIntel"  Id = 0x480  Stepping=0
>   Features=0x3
> real memory  = 16777216 (16384K bytes)
> avail memory = 14835712 (14488K bytes)
> Probing for devices on PCI bus 0:
> chip0 rev 49 on pci0:5
> ncr0 rev 2 int a irq 12 on pci0:11
> (ncr0:0:0): "QUANTUM FIREBALL1280S 630C" type 0 fixed SCSI 2
> sd0(ncr0:0:0): Direct-Access
> sd0(ncr0:0:0): FAST SCSI-2 100ns (10 Mb/sec) offset 8.
>         1222MB (2503872 512 byte sectors)
> (ncr0:1:0): "QUANTUM LPS340S 020B" type 0 fixed SCSI 2
> sd1(ncr0:1:0): Direct-Access
> sd1(ncr0:1:0): FAST SCSI-2 100ns (10 Mb/sec) offset 8.
>         327MB (670506 512 byte sectors)
> (ncr0:2:0): "QUANTUM LP240S GM240S01X 4.6" type 0 fixed SCSI 2
> sd2(ncr0:2:0): Direct-Access
> sd2(ncr0:2:0): FAST SCSI-2 100ns (10 Mb/sec) offset 8.
>         234MB (479350 512 byte sectors)
> (ncr0:3:0): "CONNER CFP1060S 1.05GB 243F" type 0 fixed SCSI 2
> sd3(ncr0:3:0): Direct-Access
> sd3(ncr0:3:0): FAST SCSI-2 100ns (10 Mb/sec) offset 8.
>         1013MB (2074880 512 byte sectors)
> vga0 rev 0 on pci0:15
> Probing for devices on the ISA bus:
> sc0 at 0x60-0x6f irq 1 on motherboard
> sc0: VGA color <16 virtual consoles, flags=0x0>
> ed0 at 0x280-0x29f irq 5 maddr 0xd8000 msize 16384 on isa
> ed0: address 00:00:c0:86:44:79, type WD8013EPC (16 bit)
> sio0 at 0x3f8-0x3ff irq 4 on isa
> sio0: type 16550A
> sio1 not probed due to I/O address conflict with sio0 at 0x3f8
> fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa
> fdc0: NEC 72065B
> fd0: 1.44MB 3.5in
> npx0 on motherboard
> npx0: INT 16 interface
> sctarg0(noadapter::): Processor Target
>
>
> 	The NCR controller is one of the ASUS SC-200 controllers,
> the vga0 device is an ATI Mach64 4MB PCI, and the sio1 conflict is
> a misconfiguration on my part in this newest kernel that I have to
> fix.
>
> 	Oh, the motherboard is an ACER AP43 with a 486DX4-100
> CPU, and sio[01] are both onboard serial.
>
> 	The memory is one 16Meg SIMM, and pstat shows:
>
> Device      1K-blocks     Used    Avail Capacity  Type
> /dev/sd0b       51200     6880    44256    13%    Interleaved
> /dev/sd1b       32768     6936    25768    21%    Interleaved
> /dev/sd2b       32768     6960    25744    21%    Interleaved
> /dev/sd3b      102400     6904    95432     7%    Interleaved
> Total          218880    27680   191200    13%
>
> 	Any other information that may be pertinent? helpful?
>
> 	I'm running just about everything on this machine...named,
> innd, YP, nfs-server and PPP:
>
>   PID  TT  STAT      TIME COMMAND
>     0  ??  DLs    0:00.00 (swapper)
>     1  ??  IWs    0:00.40 /sbin/init --
>     2  ??  DL     0:17.60 (pagedaemon)
>     3  ??  DL     0:05.23 (vmdaemon)
>     4  ??  DL     0:12.49 (update)
>    78  ??  Ss     0:00.45 routed -q
>   102  ??  Ss     0:03.46 syslogd
>   105  ??  SWs    0:07.72 named
>   110  ??  IWs    0:00.14 portmap
>   113  ??  IWs    0:05.73 ypserv
>   116  ??  IWs    0:00.02 yppasswdd -s -f
>   120  ??  Ss     0:00.74 rwhod
>   124  ??  IWs    0:00.09 mountd
>   126  ??  IWs    0:00.02 nfsd: master (nfsd)
>   128  ??  S      0:47.11 nfsd: server (nfsd)
>   129  ??  IW     0:05.28 nfsd: server (nfsd)
>   130  ??  IW     0:00.85 nfsd: server (nfsd)
>   131  ??  IW     0:00.25 nfsd: server (nfsd)
>   138  ??  Ss     0:00.90 inetd
>   145  ??  Ss     0:00.62 cron
>   149  ??  IWs    0:00.82 (sendmail)
>   184  ??  Ss     0:06.29 /usr/httpd/bin/httpd -f /usr/httpd/conf/httpd.conf (h
>   187  ??  IW     0:00.84 /usr/httpd/bin/httpd -f /usr/httpd/conf/httpd.conf (h
>   188  ??  IW     0:00.84 /usr/httpd/bin/httpd -f /usr/httpd/conf/httpd.conf (h
>   208  ??  DNs   12:39.57 /news/admin/etc/innd -p4 -i0
>   218  ??  IWs    0:00.01 /usr/local/lib/pg95/bin/postmaster -S (postgres)
>   583  ??  IWN    0:00.91 -204.17.53.78 LIST
>   860  ??  IWs    0:01.28 SCREEN -R (screen-3.7.1)
>   999  ??  IW     0:00.09 rshd
>  1000  ??  IW     0:01.11 /etc/rimapd
>  1293  ??  IW     0:00.27 /usr/httpd/bin/httpd -f /usr/httpd/conf/httpd.conf (h
>  1495  ??  DN     0:09.59 /news/bin/overchan
>  1496  ??  SN     0:09.38 /usr/local/bin/perl /news/stats/bin/flowsum.channel
>  1551  ??  IW     0:00.18 (ftpd)
>  1553  ??  IW     0:00.20 (ftpd)
>  1686  ??  IWN    0:00.25 sh -c \n^IBATCHFILE=${HOST}.nntp\n^ILOCK=${LOCKS}/LOC
>  1719  ??  SN     0:06.78 innxmit -a -t300 -T1800 news.trends.ca /news/spool/ou
>  1792  ??  IW     0:00.10 (sendmail)
>  1808  ??  IW     0:00.06 (sendmail)
>  1824  ??  IW     0:00.22 (ftpd)
>   863  p0  IWs+   0:00.58 -bin/tcsh
>   911  p1  IWs    0:00.65 -bin/tcsh
>   921  p1  IW+    0:00.31 vi filter.c
>   220  v0  IWs    0:00.84 -tcsh (tcsh)
>   627  v0  IW+    0:02.82 rlogin freebsd
>   628  v0  IW+    0:04.03 rlogin freebsd
>   824  v1  IWs    0:00.62 -tcsh (tcsh)
>   859  v1  IW+    0:00.06 screen -R (screen-3.7.1)
>   222  v2  IWs    0:00.66 -tcsh (tcsh)
>  1786  v2  S      0:01.69 -su (tcsh)
>  1826  v2  R+     0:00.07 ps -ax
>   223  v3  IWs+   0:00.04 /usr/libexec/getty Pc ttyv3
>   224  d0  IWs+   0:00.23 /bin/sh /usr/local/lib/ppp/13
>  1647  d0  S+     0:03.58 /usr/sbin/ppp -direct adrenlin
>
>
> 	My two most recent panics had to do with vm_page_alloc(), which
> I *think* have to do with swap -or- RAM, and pmap_zero_page(), both of
> which I've been told no one else is experiencing (and I've gone through
> the GNaTs database for anything similar to no avail), which I believe.
>
> 	And I have no problems believing that it *may* be a hardware
> problem, but what would be nice is some non-"trial and error" method of
> narrowing down the problem. Some way of having the panic that
> vm_page_alloc() produces send out an error message that states *where*
> the panic occurred... i.e. in RAM or in swap space, or as a result of
> either.
>
> 	It's difficult to go to the accounting department and ask for more
> RAM because "that might fix the problem" :(
>
> Marc G. Fournier                                  scrappy@ki.net
> Systems Administrator @ ki.net               scrappy@freebsd.org
>
>
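
A side note on the pstat listing quoted above: the figures can be cross-checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of the original exchange. The numbers are copied from the listing; the reading of the columns (Capacity as Used divided by Used plus Avail, and the Total row as the sum of Used and Avail) is an assumption inferred from the numbers themselves, not taken from pstat documentation. The names `swap` and `capacity` are made up for illustration.

```python
# Swap figures copied from the pstat listing quoted above, in 1K blocks:
# device -> (1K-blocks, Used, Avail).
# ASSUMPTION: the column interpretation below is inferred from the data,
# not from pstat documentation.
swap = {
    "/dev/sd0b": (51200, 6880, 44256),
    "/dev/sd1b": (32768, 6936, 25768),
    "/dev/sd2b": (32768, 6960, 25744),
    "/dev/sd3b": (102400, 6904, 95432),
}


def capacity(used, avail):
    """Percent of usable swap in use, rounded to the nearest integer."""
    return round(100 * used / (used + avail))


total_used = sum(used for _, used, _ in swap.values())
total_avail = sum(avail for _, _, avail in swap.values())

for dev, (blocks, used, avail) in swap.items():
    # Blocks that appear in neither the Used nor the Avail column.
    unaccounted = blocks - used - avail
    print(f"{dev}: {capacity(used, avail)}% used, "
          f"{unaccounted} blocks in neither column")

print(f"Total: {total_used + total_avail} usable, {total_used} used, "
      f"{capacity(total_used, total_avail)}% used")
```

Under that reading the listing is internally consistent, and overall swap use is low at the moment the listing was taken. That hints, but does not prove, that the panics are not simply a matter of swap space running out.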