From owner-freebsd-hackers Thu Apr  6 23:09:26 2000
Delivered-To: freebsd-hackers@freebsd.org
Received: from account.abs.net (account.abs.net [207.114.5.70])
        by hub.freebsd.org (Postfix) with ESMTP
        id 5551037B963; Thu, 6 Apr 2000 23:08:49 -0700 (PDT)
        (envelope-from howardl@account.abs.net)
Received: (from howardl@localhost)
        by account.abs.net (8.9.3/8.9.3+RBL+DUL+RSS+ORBS) id BAA14434;
        Fri, 7 Apr 2000 01:57:29 -0400 (EDT)
        (envelope-from howardl)
From: Howard Leadmon
Message-Id: <200004070557.BAA14434@account.abs.net>
Subject: Re: Troubles with network & buffers.. Any Ideas??
In-Reply-To: <3.0.5.32.20000327160242.02248880@marble.sentex.ca>
        from Mike Tancsa at "Mar 27, 2000 04:02:42 pm"
To: Mike Tancsa
Date: Fri, 7 Apr 2000 01:57:29 -0400 (EDT)
Cc: freebsd-stable@freebsd.org, freebsd-hackers@freebsd.org
X-Mailer: ELM [version 2.4ME+ PL72 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> > Thanks for the fast reply.. :)
> >
> > Needless to say the machine has been rebooted since the last time it
> > died, but I'll try and keep an eye on it and next time it locks if I can
> > get to the console I'll see if I can grab a snapshot at that time. If a
> > current running snapshot is of any use just let me know..
>
> Perhaps just set up a cron job...
>
> vmstat -m >> /var/log/vm.out
>
> But have a look at your busier times to see if the High Use is getting
> dangerously close to the limit. Do you have a lot of aliased IPs on this
> box by any chance? As it's an IRC server, it's no doubt subject to various
> attacks. Check to see if there is any ICMP funny business being blasted at
> you, like a few million ICMP redirects. ipfw and sysctl can be your friend
> here.
>
>         ---Mike

Hello Mike,

Sorry for the long delay on this; I had a million things pop up at work
and was so tired by day's end that I just crashed.
To shed some more light on the above, I have done a few things over the
past couple of weeks to gather more information on what is happening. As
for ICMP issues, I not only have ICMP limited to 32K max on my Cisco
router using CAR, but also have "options ICMP_BANDLIM" defined in the
kernel and enabled in my rc.conf, just to be sure I am not getting
hammered in that regard.

Here is what I have done. Hopefully it provides some useful information,
and if not, I tried.. :)

First, as mentioned previously, I had an Intel EEpro card in the box
running to my Cisco Catalyst switch. When everything fell apart and I
lost connectivity, I saw the following on the console:

fxp0: device timeout
syslogd: sendto: No buffer space available

Here is some of the requested debugging information:

ifconfig:

fxp0: flags=8843 mtu 1500
        inet 207.114.4.35 netmask 0xfffffff0 broadcast 207.114.4.47
        inet 207.114.4.36 netmask 0xffffffff broadcast 207.114.4.36
        inet 207.114.4.45 netmask 0xffffffff broadcast 207.114.4.45
        inet 207.114.4.46 netmask 0xffffffff broadcast 207.114.4.46
        ether 00:a0:c9:c7:fb:ff
        media: autoselect (100baseTX ) status: active
        supported media: autoselect 100baseTX 100baseTX 10baseT/UTP 10baseT/UTP

netstat -m:

403/21472/81920 mbufs in use (current/peak/max):
        259 mbufs allocated to data
        144 mbufs allocated to packet headers
124/10652/20480 mbuf clusters in use (current/peak/max)
23988 Kbytes allocated to network (1% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

vmstat -m:

Memory statistics by bucket size
Size   In Use   Free   Requests  HighWater  Couldfree
  16      286    994    4581005          0       1280
  32      179  36685     518690          0        640
  64    14436   4380    1373135          0        320
 128     1096     88       9883          0        160
 256    13335  30569     317127          0         80
 512       18      6      74806          0         40
  1K      107    949      12272          0         20
  2K       12      6      18478          0         10
  4K       13      2      98260          0          5
  8K        2      2     384331          0          5
 16K        8      0    2689062          0          5
 32K        3      0    1321506          0          5
 64K        3      0          3          0          5
128K        3      0          3          0          5
256K        1      0          1          0          5

Memory usage type by bucket size
Size  Type(s)
  16  MD disk, kld, proc-args, atexit, temp, sysctl, bus, rman, soname, pcb,
      mount, vnodes, ether_multi, routetbl, p1003.1b, devbuf, isa_devlist,
      atkbddev
  32  kld, sigio, proc-args, temp, pgrp, proc, subproc, sysctl, bus,
      eventhandler, SWAP, pcb, cluster_save buffer, vnodes, BPF, ifaddr,
      ether_multi, routetbl, in_multi, tseg_qent, devbuf
  64  file, proc-args, lockf, temp, session, subproc, bus, eventhandler,
      rman, pcb, vfscache, cluster_save buffer, vnodes, ifaddr, ether_multi,
      routetbl, isadev, AD driver
 128  ppbusdev, kld, timecounter, dev_t, proc-args, zombie, temp, cred, bus,
      ttys, soname, vfscache, cluster_save buffer, mount, vnodes, ifaddr,
      routetbl, ZONE, devbuf
 256  file desc, proc-args, temp, subproc, bus, ttys, vnodes, ifaddr,
      routetbl, NFS daemon, FFS node, devbuf
 512  kld, file desc, temp, bus, ioctlops, ptys, BIO buffer, mount,
      UFS mount, ATA generic, devbuf, isa_devlist
  1K  MD disk, kld, file desc, temp, proc, bus, ioctlops, BIO buffer,
      NQNFS Lease, AD driver, devbuf, isa_devlist
  2K  file desc, temp, bus, pcb, BIO buffer, UFS mount, devbuf
  4K  kld, file desc, temp, proc, devbuf, memdesc
  8K  kld, file desc, temp, UFS mount
 16K  file desc, temp, devbuf
 32K  file desc, temp, devbuf, mbuf
 64K  ISOFS mount, NFS hash, UFS ihash
128K  temp, vfscache, VM pgdata
256K  SWAP

Memory statistics by type                                   Type  Kern
               Type  InUse MemUse HighUse   Limit Requests Limit Limit Size(s)
            MD disk      2     2K      2K  64194K        2     0     0 16,1K
           ppbusdev      3     1K      1K  64194K        3     0     0 128
        ISOFS mount      1    64K     64K  64194K        1     0     0 64K
                kld     10    11K     16K  64194K       53     0     0 16,32,128,512,1K,4K,8K
        timecounter     10     2K      2K  64194K       10     0     0 128
              dev_t    540    68K     68K  64194K      540     0     0 128
          file desc     35    46K     60K  64194K     6800     0     0 256,512,1K,2K,4K,8K,16K,32K
               file    109     7K    283K  64194K  1135473     0     0 64
              sigio      1     1K      1K  64194K        1     0     0 32
          proc-args     23     1K      2K  64194K     5559     0     0 16,32,64,128,256
             zombie      0     0K      1K  64194K     6753     0     0 128
             atexit      1     1K      1K  64194K        1     0     0 16
              lockf      1     1K      1K  64194K       23     0     0 64
               temp    177    82K    115K  64194K  4596159     0     0 16,32,64,128,256,512,1K,2K,4K,8K,16K,32K,128K
               pgrp     22     1K      1K  64194K     1233     0     0 32
            session     20     2K      2K  64194K      949     0     0 64
               proc      7    10K     10K  64194K       11     0     0 32,1K,4K
            subproc     72     7K     10K  64194K    14795     0     0 32,64,256
               cred      9     2K      2K  64194K     1082     0     0 128
             sysctl      0     0K      1K  64194K      646     0     0 16,32
                bus    358    29K     29K  64194K      476     0     0 16,32,64,128,256,512,1K,2K
       eventhandler     11     1K      1K  64194K       11     0     0 32,64
               SWAP      2   141K    141K  64194K        2     0     0 32,256K
           ioctlops      0     0K      1K  64194K        5     0     0 512,1K
               rman     50     3K      3K  64194K       79     0     0 16,64
               ttys    410    53K     63K  64194K     1229     0     0 128,256
               ptys      3     2K      2K  64194K        3     0     0 512
             soname      1     1K      1K  64194K  3940726     0     0 16,128
                pcb     45     5K     20K  64194K   639691     0     0 16,32,64,2K
         BIO buffer    100   102K   1048K  64194K     9950     0     0 512,1K,2K
           vfscache  14044  1007K   1007K  64194K    17344     0     0 64,128,128K
cluster_save buffer      0     0K      1K  64194K      694     0     0 32,64,128
              mount      4     2K      2K  64194K        6     0     0 16,128,512
             vnodes     24     6K      6K  64194K      327     0     0 16,32,64,128,256
                BPF      3     1K      1K  64194K        3     0     0 32
             ifaddr     15     2K      2K  64194K       15     0     0 32,64,128,256
        ether_multi      7     1K      1K  64194K        7     0     0 16,32,64
           routetbl     61     9K  10295K  64194K   585667     0     0 16,32,64,128,256
           in_multi      2     1K      1K  64194K        2     0     0 32
          tseg_qent      0     0K      5K  64194K   212819     0     0 32
         NFS daemon      1     1K      1K  64194K        1     0     0 256
        NQNFS Lease      1     1K      1K  64194K        1     0     0 1K
           NFS hash      1    64K     64K  64194K        1     0     0 64K
           p1003.1b      1     1K      1K  64194K        1     0     0 16
           FFS node  13187  3297K   3297K  64194K    14242     0     0 256
          UFS ihash      1    64K     64K  64194K        1     0     0 64K
          UFS mount      9    20K     20K  64194K        9     0     0 512,2K,8K
          VM pgdata      1   128K    128K  64194K        1     0     0 128K
               ZONE     18     3K      3K  64194K       18     0     0 128
             isadev     11     1K      1K  64194K       11     0     0 64
        ATA generic      0     1K      1K  64194K        1     0     0 512
          AD driver      2     2K      2K  64194K   204988     0     0 64,1K
             devbuf     82   207K    207K  64194K      114     0     0 16,32,128,256,512,1K,2K,4K,16K,32K
               mbuf      1    28K     28K  64194K        1     0     0 32K
            memdesc      1     4K      4K  64194K        1     0     0 4K
        isa_devlist      0     0K      2K  64194K       19     0     0 16,512,1K
           atkbddev      2     1K      1K  64194K        2     0     0 16

Memory Totals:  In Use    Free    Requests
                 5472K  10077K    11398562

To complicate things further, I replaced the EEpro card with a DEC
21143-based board using the dc driver, and with that card the machine dies a
little less often, but when it does the machine usually hangs hard or
reboots. Catching the console before it is totally dead, I can see the
following message scrolling on the screen:

dc0: watchdog timeout

That is different from the error with the EEpro card, but still network
related, so I dumped the same information again for comparison:

ifconfig:

dc0: flags=8843 mtu 1500
        inet 207.114.4.35 netmask 0xfffffff0 broadcast 207.114.4.47
        inet 207.114.4.36 netmask 0xffffffff broadcast 207.114.4.36
        inet 207.114.4.45 netmask 0xffffffff broadcast 207.114.4.45
        inet 207.114.4.46 netmask 0xffffffff broadcast 207.114.4.46
        ether 00:c0:f0:3b:a7:eb
        media: autoselect (100baseTX ) status: active
        supported media: autoselect 100baseTX 100baseTX 10baseT/UTP 10baseT/UTP none

netstat -m:

7526/15744/81920 mbufs in use (current/peak/max):
        6064 mbufs allocated to data
        1462 mbufs allocated to packet headers
3948/7874/20480 mbuf clusters in use (current/peak/max)
17716 Kbytes allocated to network (49% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

vmstat -m:

Memory statistics by bucket size
Size   In Use   Free   Requests  HighWater  Couldfree
  16      296    984    4706660          0       1280
  32     3254  21706     599940          0        640
  64    14657   3199    1669347          0        320
 128     1099     53      15587          0        160
 256    16537  14743     357609          0         80
 512       14      2      30928          0         40
  1K       33    743      13704          0         20
  2K       13      5      40824          0         10
  4K       13      2     348612          0          5
  8K        2      4    1255714          0          5
 16K       10      0    2479452          0          5
 32K        1      0    1485462          0          5
 64K        4      0          4          0          5
128K        3      0          3          0          5
256K        1      0          1          0          5

Memory usage type by bucket size
Size  Type(s)
  16  MD disk, kld, proc-args, atexit, temp, sysctl, bus, rman, soname, pcb,
      mount, vnodes, ether_multi, routetbl, p1003.1b, devbuf, isa_devlist,
      atkbddev
  32  kld, sigio, proc-args, temp, pgrp, proc, subproc, sysctl, bus,
      eventhandler, SWAP, pcb, cluster_save buffer, vnodes, BPF, ifaddr,
      ether_multi, routetbl, in_multi, tseg_qent, newblk, bmsafemap,
      indirdep, freefrag, freefile, diradd, dirrem, devbuf
  64  file, proc-args, lockf, temp, session, subproc, bus, eventhandler,
      rman, pcb, vfscache, cluster_save buffer, vnodes, ifaddr, ether_multi,
      routetbl, pagedep, allocdirect, allocindir, isadev, AD driver
 128  ppbusdev, kld, timecounter, dev_t, proc-args, zombie, temp, cred, bus,
      ttys, soname, vfscache, cluster_save buffer, mount, vnodes, ifaddr,
      routetbl, inodedep, freeblks, ZONE, devbuf
 256  file desc, proc-args, temp, subproc, bus, ttys, vnodes, ifaddr,
      routetbl, NFS daemon, newblk, FFS node, devbuf
 512  kld, file desc, temp, bus, ioctlops, ptys, BIO buffer, mount,
      UFS mount, ATA generic, devbuf, isa_devlist
  1K  MD disk, kld, file desc, temp, proc, bus, ioctlops, BIO buffer,
      NQNFS Lease, AD driver, devbuf, isa_devlist
  2K  file desc, temp, bus, pcb, BIO buffer, UFS mount, devbuf
  4K  kld, file desc, temp, proc, devbuf, memdesc
  8K  kld, file desc, temp, indirdep, UFS mount
 16K  file desc, temp, pagedep, devbuf
 32K  temp, mbuf
 64K  ISOFS mount, NFS hash, inodedep, UFS ihash
128K  temp, vfscache, VM pgdata
256K  SWAP

Memory statistics by type                                   Type  Kern
               Type  InUse MemUse HighUse   Limit Requests Limit Limit Size(s)
            MD disk      2     2K      2K  64189K        2     0     0 16,1K
           ppbusdev      3     1K      1K  64189K        3     0     0 128
        ISOFS mount      1    64K     64K  64189K        1     0     0 64K
                kld     10    11K     16K  64189K       53     0     0 16,32,128,512,1K,4K,8K
        timecounter     10     2K      2K  64189K       10     0     0 128
              dev_t    540    68K     68K  64189K      540     0     0 128
          file desc     37    30K     36K  64189K     7923     0     0 256,512,1K,2K,4K,8K,16K
               file    174    11K    208K  64189K  1332169     0     0 64
              sigio      1     1K      1K  64189K        1     0     0 32
          proc-args     24     2K      2K  64189K     6468     0     0 16,32,64,128,256
             zombie      0     0K      1K  64189K     7875     0     0 128
             atexit      1     1K      1K  64189K        1     0     0 16
              lockf      1     1K      1K  64189K        3     0     0 64
               temp    169   113K    138K  64189K  5648678     0     0 16,32,64,128,256,512,1K,2K,4K,8K,16K,32K,128K
               pgrp     23     1K      1K  64189K     1400     0     0 32
            session     21     2K      2K  64189K     1115     0     0 64
               proc      7    10K     10K  64189K        7     0     0 32,1K,4K
            subproc     77     7K      9K  64189K    17254     0     0 32,64,256
               cred      9     2K      2K  64189K     1264     0     0 128
             sysctl      0     0K      1K  64189K      738     0     0 16,32
                bus    367    31K     31K  64189K      503     0     0 16,32,64,128,256,512,1K,2K
       eventhandler     11     1K      1K  64189K       11     0     0 32,64
               SWAP      2   141K    141K  64189K        2     0     0 32,256K
           ioctlops      0     0K      1K  64189K        5     0     0 512,1K
               rman     50     3K      3K  64189K       79     0     0 16,64
               ttys    410    53K     58K  64189K     1307     0     0 128,256
               ptys      2     1K      1K  64189K        2     0     0 512
             soname      1     1K      1K  64189K  3967614     0     0 16,128
                pcb     50     5K     19K  64189K   738447     0     0 16,32,64,2K
         BIO buffer     26    28K    769K  64189K    12308     0     0 512,1K,2K
           vfscache  14194  1016K   1016K  64189K    18278     0     0 64,128,128K
cluster_save buffer      0     0K      1K  64189K      981     0     0 32,64,128
              mount      4     2K      2K  64189K        6     0     0 16,128,512
             vnodes     24     6K      6K  64189K      327     0     0 16,32,64,128,256
                BPF      3     1K      1K  64189K        3     0     0 32
             ifaddr     16     2K      2K  64189K       16     0     0 32,64,128,256
        ether_multi      7     1K      1K  64189K        7     0     0 16,32,64
           routetbl   6193   871K   6957K  64189K   663076     0     0 16,32,64,128,256
           in_multi      2     1K      1K  64189K        2     0     0 32
          tseg_qent      0     0K      2K  64189K   220376     0     0 32
         NFS daemon      1     1K      1K  64189K        1     0     0 256
        NQNFS Lease      1     1K      1K  64189K        1     0     0 1K
           NFS hash      1    64K     64K  64189K        1     0     0 64K
           p1003.1b      1     1K      1K  64189K        1     0     0 16
            pagedep      2    17K     17K  64189K       32     0     0 64,16K
           inodedep      4    65K     68K  64189K     2813     0     0 128,64K
             newblk      1     1K      1K  64189K    23834     0     0 32,256
          bmsafemap      3     1K      1K  64189K     4690     0     0 32
        allocdirect      1     1K      2K  64189K     8555     0     0 64
           indirdep      1     1K     25K  64189K     2822     0     0 32,8K
         allocindir      1     1K     26K  64189K    15278     0     0 64
           freefrag      0     0K      4K  64189K     3464     0     0 32
           freeblks      0     0K      4K  64189K     1520     0     0 128
           freefile      0     0K      1K  64189K       40     0     0 32
             diradd      2     1K      1K  64189K       61     0     0 32
             dirrem      0     0K      1K  64189K       64     0     0 32
           FFS node  13320  3330K   3331K  64189K    14373     0     0 256
          UFS ihash      1    64K     64K  64189K        1     0     0 64K
          UFS mount      9    20K     20K  64189K        9     0     0 512,2K,8K
          VM pgdata      1   128K    128K  64189K        1     0     0 128K
               ZONE     18     3K      3K  64189K       18     0     0 128
             isadev     11     1K      1K  64189K       11     0     0 64
        ATA generic      0     1K      1K  64189K        1     0     0 512
          AD driver      1     1K      2K  64189K   277266     0     0 64,1K
             devbuf     81   175K    175K  64189K      113     0     0 16,32,128,256,512,1K,2K,4K,16K
               mbuf      1    28K     28K  64189K        1     0     0 32K
            memdesc      1     4K      4K  64189K        1     0     0 4K
        isa_devlist      0     0K      2K  64189K       18     0     0 16,512,1K
           atkbddev      2     1K      1K  64189K        2     0     0 16

Memory Totals:  In Use    Free    Requests
                 6372K   5380K    13003847

All of the above stats were taken while the network card was spitting out errors
prior to performing a reboot, which brings the box back online. I also
tried unplugging the NIC and plugging it back in, without any change.
Over time I have also replaced everything in the box except the case,
but the problem still persists; in fact, I took the old hardware and
built a different machine from it that works fine. So something related
to the heavy use by the IRC programs is killing this thing almost daily,
and I am at a loss as to what.

If you or anyone here on the list has any ideas, I would sure love to
hear them, as it would be nice to get to the bottom of this issue...

---
Howard Leadmon - howardl@abs.net - http://www.abs.net
ABSnet Internet Services - Phone: 410-361-8160 - FAX: 410-361-8162

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message
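
Mike's advice quoted at the top of this message was to log "vmstat -m"
periodically and watch whether HighUse creeps toward Limit. As a rough
illustration of that check, here is a minimal Python sketch that pulls
HighUse/Limit ratios out of the "Memory statistics by type" table and
peak/max ratios out of "netstat -m". It assumes the column layout pasted
in this message; the names high_use_fractions and mbuf_peaks are made up
for this example and are not part of any FreeBSD tool.

```python
import re

# One row of the "Memory statistics by type" table from `vmstat -m`,
# assuming the layout pasted in this message. The type name may contain
# spaces, so it is matched lazily up to the first numeric column.
TYPE_ROW = re.compile(
    r"^\s*(?P<type>.+?)\s+(?P<inuse>\d+)\s+(?P<memuse>\d+)K\s+"
    r"(?P<highuse>\d+)K\s+(?P<limit>\d+)K\s+(?P<requests>\d+)\s+"
    r"\d+\s+\d+\s+\S+\s*$"
)

# "current/peak/max" counters from `netstat -m`.
MBUF_LINE = re.compile(r"(\d+)/(\d+)/(\d+) (mbufs|mbuf clusters) in use")


def high_use_fractions(vmstat_text):
    """Return (type, HighUse KB, Limit KB, HighUse/Limit), largest first."""
    rows = []
    for line in vmstat_text.splitlines():
        m = TYPE_ROW.match(line)
        if m and int(m.group("limit")) > 0:
            high = int(m.group("highuse"))
            limit = int(m.group("limit"))
            rows.append((m.group("type"), high, limit, high / limit))
    return sorted(rows, key=lambda r: r[3], reverse=True)


def mbuf_peaks(netstat_text):
    """Return {pool name: peak/max} for the mbuf and mbuf cluster pools."""
    return {
        m.group(4): int(m.group(2)) / int(m.group(3))
        for m in MBUF_LINE.finditer(netstat_text)
    }


if __name__ == "__main__":
    # Three rows and two counters copied from the first (fxp0) dump.
    sample_vmstat = (
        "routetbl 61 9K 10295K 64194K 585667 0 0 16,32,64,128,256\n"
        "BIO buffer 100 102K 1048K 64194K 9950 0 0 512,1K,2K\n"
        "FFS node 13187 3297K 3297K 64194K 14242 0 0 256\n"
    )
    sample_netstat = (
        "403/21472/81920 mbufs in use (current/peak/max):\n"
        "124/10652/20480 mbuf clusters in use (current/peak/max)\n"
    )
    for name, high, limit, frac in high_use_fractions(sample_vmstat):
        print(f"{name:12s} HighUse {high:6d}K of {limit}K ({frac:.1%})")
    for pool, frac in mbuf_peaks(sample_netstat).items():
        print(f"{pool:14s} peak at {frac:.1%} of max")
```

Run against the first dump, this puts routetbl at roughly 16% of its
limit and the mbuf cluster peak at about 52% of max, so by this measure
neither pool had hit its ceiling when the snapshot was taken. A cron job
appending "vmstat -m" output to a log, as Mike suggested, would supply
the input for tracking these ratios over time.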