Date: Mon, 31 Jul 2000 08:41:14 -0700 (PDT)
From: Matt Dillon <dillon@earth.backplane.com>
To: gallatin@FreeBSD.ORG
Cc: freebsd-stable@FreeBSD.ORG
Subject: Re: NFS server running out of bufs & locking up
Message-ID: <200007311541.IAA89271@earth.backplane.com>
References: <14721.48065.766815.376959@grasshopper.cs.duke.edu>

I've only got weekends available but I will try to look into it.

-Matt

:I have an NFS server which I updated to RELENG_4 (as of Jul 13th) from
:4.0-current (as of Jul 13th 1999).  Since the upgrade, it has locked
:up 3 times; it had been up for 180+ days prior to the upgrade.
:
:The machine serves a large (64GB) volume striped across 4 ATA drives
:with CCD mounted with soft updates.  When it locks up, it is getting
:beaten upon by a compute farm of 50+ Solaris boxes running NFS over
:TCP (via 100Mb ethernet).
:
:When it locks, most procs are waiting in biofre, and the nfsd's are
:waiting on inode.  I've been unable to get a dump, the most I have
:is ps from ddb. (appended below).  It's somewhat interesting that
:3 of the nfsds are waiting on the same inode
:
:Stopped at siointr1+0xb1: jmp siointr1+0x1a0
:db> ps
:  pid     proc     addr  uid  ppid  pgrp    flag stat wmesg     wchan cmd
:  494 d28fb2a0 d292a000    0   120   120  002004    3 biofre c02aa6d8 amd
:  493 d0ad9c20 d2869000    0   373   492  004006    3 biofre c02aa6d8 grep
:  473 d0ad7380 d28df000 1597   320   473  004006    3 biofre c02aa6d8 netdump_server
:  460 d28fbe00 d28fc000 1597   458   460  004086    3 ttyin  c1729630 tcsh
:  458 d0ad76c0 d28c8000    0   205   205  000084    3 select c02be9ec sshd1
:  394 d28fb440 d290f000    1     1   394  000104    3 biofre c02aa6d8 portmap
:  373 d0ad6ea0 d28eb000    0   285   373 2004086    3 opause d28eb108 tcsh
:  320 d0ad7040 d28e4000 1597   317   320 2004086    3 opause d28e4108 tcsh
:  317 d0ad71e0 d28e9000    0   205   205  000084    3 select c02be9ec sshd1
:  285 d0ad7520 d28db000 1387   283   285 2004086    3 opause d28db108 tcsh
:  283 d0ad7860 d28c4000    0   205   205  000084    3 select c02be9ec sshd1
:  260 d0ad7a00 d28bf000 1387   259   260  004106    3 biofre c02aa6d8 systat
:  259 d0ad83c0 d2899000 1387   233   259  004186    3 select c02be9ec xterm
:  233 d0ada440 d284e000 1387   230   233  004006    3 inode  c16f8000 tcsh
:  230 d0ad8220 d289d000    0   205   205  000004    3 biofre c02aa6d8 sshd1
:  223 d0ada5e0 d284b000    0     1   223  004006    3 biofre c02aa6d8 getty
:  218 d0ad7ba0 d28bd000    0     1   218  000084    3 sbwait d0668acc zhm
:  205 d0ad7d40 d28b1000    0     1   205  000084    3 select c02be9ec sshd1
:  147 d0ad8080 d28a1000    0     1   147 2000184    3 pause  d28a1108 sendmail
:  144 d0ad7ee0 d28a4000    0     1   144  000084    3 nanslp c02aa580 cron
:  142 d0ad9a80 d286c000    0     1   142  000084    3 select c02be9ec inetd
:  120 d0ad8be0 d2889000    0     1   120  000084    3 select c02be9ec amd
:  115 d0ad8560 d2895000    0     1   110  000084    3 nfsidl c02c0d4c nfsiod
:  114 d0ad8700 d2892000    0     1   110  000084    3 nfsidl c02c0d48 nfsiod
:  113 d0ad88a0 d288f000    0     1   110  000084    3 nfsidl c02c0d44 nfsiod
:  112 d0ad8a40 d288c000    0     1   110  000084    3 nfsidl c02c0d40 nfsiod
:  108 d0ad8d80 d2886000    0     1   108  000084    3 select c02be9ec rpc.statd
:  105 d0ad8f20 d2882000    0   100   100  000004    3 inode  c16c3400 nfsd
:  104 d0ad90c0 d287f000    0   100   100  000004    3 inode  c16c3400 nfsd
:  103 d0ad9260 d287c000    0   100   100  000004    3 inode  c16c3400 nfsd
:  102 d0ad9400 d2878000    0   100   100  000004    3 biofre c02aa6d8 nfsd
:  100 d0ad95a0 d2875000    0     1   100  000084    3 accept d06663f6 nfsd
:   98 d0ad9740 d2872000    0     1    98  000084    3 select c02be9ec mountd
:   92 d0ad98e0 d286f000    0     1    92  000084    3 select c02be9ec ypbind
:   87 d0ad9dc0 d2866000    0     1    87  000084    3 select c02be9ec ntpd
:   80 d0ad9f60 d285c000    0     1    80  000084    3 select c02be9ec syslogd
:   33 d0ada100 d2858000    0     1    33 2000084    3 pause  d2858108 adjkerntz
:   25 d0ada2a0 d2855000    0     1    25  000084    3 mfsidl d0ad3d00 mount_mfs
:    5 d0ada780 d0ae7000    0     0     0  000204    3 biofre c02aa6d8 syncer
:    4 d0ada920 d0ae5000    0     0     0  100204    3 psleep c02aa6a8 bufdaemon
:    3 d0adaac0 d0ae3000    0     0     0  000204    3 psleep c02b5fa0 vmdaemon
:    2 d0adac60 d0ae1000    0     0     0  100204    3 psleep c029c8b8 pagedaemon
:    1 d0adae00 d0adf000    0     0     1  004284    3 wait   d0adae00 init
:    0 c02bdd80 c0322000    0     0     0  000204    3 sched  c02bdd80 swapper
:
:
:About 30 seconds before this lockup, I was looking at how much buf
:space is available:
:
:#sysctl -a | grep buf
:kern.ipc.maxsockbuf: 262144
:kern.ipc.sockbuf_waste_factor: 8
:kern.ipc.mbuf_wait: 32
:kern.ipc.nmbufs: 10240
:vfs.nfs.bufpackets: 0
:vfs.numdirtybuffers: 18
:vfs.hidirtybuffers: 796
:vfs.numfreebuffers: 3083
:vfs.lofreebuffers: 177
:vfs.hifreebuffers: 354
:vfs.runningbufspace: 32768
:vfs.maxbufspace: 50872320
:vfs.hibufspace: 50216960
:vfs.lobufspace: 50151424
:vfs.bufspace: 50151424
:vfs.maxmallocbufspace: 2510848
:vfs.bufmallocspace: 4096
:vfs.getnewbufcalls: 512584
:vfs.getnewbufrestarts: 0
:vfs.bufdefragcnt: 0
:vfs.buffreekvacnt: 0
:vfs.bufreusecnt: 3061
:vfs.reassignbufcalls: 426411
:vfs.reassignbufloops: 0
:vfs.reassignbufsortgood: 144583
:vfs.reassignbufsortbad: 4776
:vfs.reassignbufmethod: 1
:vfs.aio.max_buf_aio: 16
:vfs.aio.num_buf_aio: 0
:debug.bpf_bufsize: 4096
:debug.bpf_maxbufsize: 524288
:machdep.msgbuf:
:machdep.msgbuf_clear: 0
:
:
:I have appended my config file & boot messages.
:
:Thanks for any help you can give,
:
:Drew
:
:------------------------------------------------------------------------------
:Andrew Gallatin, Sr Systems Programmer  http://www.cs.duke.edu/~gallatin
:Duke University                         Email: gallatin@cs.duke.edu
:Department of Computer Science          Phone: (919) 660-6590
:
:
:Copyright (c) 1992-2000 The FreeBSD Project.
:Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
:        The Regents of the University of California. All rights reserved.
:FreeBSD 4.0-STABLE #0: Thu Jul 13 12:11:33 EDT 2000
:    gallatin@grits.cs.duke.edu:/usr/src/sys/compile/NFSSERVER
:Timecounter "i8254" frequency 1193182 Hz
:Timecounter "TSC" frequency 451024727 Hz
:CPU: Pentium II/Pentium II Xeon/Celeron (451.02-MHz 686-class CPU)
:  Origin = "GenuineIntel" Id = 0x652 Stepping = 2
:  Features=0x183f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR>
:real memory = 402640896 (393204K bytes)
:avail memory = 388145152 (379048K bytes)
:Preloaded elf kernel "kernel" at 0xc030f000.
:ccd0-3: Concatenated disk drivers
:Pentium Pro MTRR support enabled
:npx0: <math processor> on motherboard
:npx0: INT 16 interface
:pcib0: <Intel 82443BX (440 BX) host to PCI bridge> on motherboard
:pci0: <PCI bus> on pcib0
:pcib1: <Intel 82443BX (440 BX) PCI-PCI (AGP) bridge> at device 1.0 on pci0
:pci1: <PCI bus> on pcib1
:isab0: <Intel 82371AB PCI to ISA bridge> at device 4.0 on pci0
:isa0: <ISA bus> on isab0
:atapci0: <Intel PIIX4 ATA33 controller> port 0xd800-0xd80f at device 4.1 on pci0
:ata0: at 0x1f0 irq 14 on atapci0
:pci0: <Intel 82371AB/EB (PIIX4) USB controller> at 4.2
:chip1: <Intel 82371AB Power management controller> port 0xe800-0xe80f at device 4.3 on pci0
:atapci1: <Promise ATA33 controller> port 0xa800-0xa81f,0xb004-0xb007,0xb400-0xb407,0xb804-0xb807,0xd000-0xd007 irq 12 at device 9.0 on pci0
:ata2: at 0xd000 on atapci1
:ata3: at 0xb400 on atapci1
:fxp0: <Intel Pro 10/100B/100+ Ethernet> port 0xa400-0xa41f mem 0xe2000000-0xe20fffff,0xe3000000-0xe3000fff irq 10 at device 10.0 on pci0
:fxp0: Ethernet address 00:a0:c9:e7:95:bb
:atapci2: <Promise ATA33 controller> port 0x8800-0x881f,0x9004-0x9007,0x9400-0x9407,0x9804-0x9807,0xa000-0xa007 irq 11 at device 12.0 on pci0
:ata4: at 0xa000 on atapci2
:ata5: at 0x9400 on atapci2
:fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
:fdc0: FIFO enabled, 8 bytes threshold
:fd0: <1440-KB 3.5" drive> on fdc0 drive 0
:atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
:sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
:sio0: type 16550A, console
:sio1 at port 0x2f8-0x2ff irq 3 on isa0
:sio1: type 16550A
:ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
:ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
:ppc0: FIFO with 16/16/9 bytes threshold
:ppi0: <Parallel I/O> on ppbus0
:lpt0: <Printer> on ppbus0
:lpt0: Interrupt-driven port
:plip0: <PLIP network interface> on ppbus0
:ad0: 4892MB <QUANTUM FIREBALL EX5.1A> [10602/15/63] at ata0-master using UDMA33
:ad1: 16479MB <Maxtor 91728D8> [33483/16/63] at ata2-master using UDMA33
:ad2: 16479MB <Maxtor 91728D8> [33483/16/63] at ata3-master using UDMA33
:ad3: 16479MB <Maxtor 91728D8> [33483/16/63] at ata4-master using UDMA33
:ad4: 16479MB <Maxtor 91728D8> [33483/16/63] at ata5-master using UDMA33
:Mounting root from ufs:/dev/ad0s1a
:WARNING: / was not properly dismounted
:
:
:## NFSSERVER
:machine         i386
:cpu             I686_CPU
:ident           NFSSERVER
:maxusers        128
:
:#makeoptions    DEBUG=-g                #Build kernel with gdb(1) debug symbols
:
:options         INET                    #InterNETworking
:options         FFS                     #Berkeley Fast Filesystem
:options         FFS_ROOT                #FFS usable as root device [keep this!]
:options         SOFTUPDATES             #Enable FFS soft updates support
:options         MFS                     #Memory Filesystem
:options         NFS                     #Network Filesystem
:options         MSDOSFS                 #MSDOS Filesystem
:options         CD9660                  #ISO 9660 Filesystem
:options         CD9660_ROOT             #CD-ROM usable as root, CD9660 required
:options         PROCFS                  #Process filesystem
:options         COMPAT_43               #Compatible with BSD 4.3 [KEEP THIS!]
:options         SCSI_DELAY=1500         #Delay (in ms) before probing SCSI
:options         UCONSOLE                #Allow users to grab the console
:options         USERCONFIG              #boot -c editor
:options         VISUAL_USERCONFIG       #visual boot -c editor
:options         KTRACE                  #ktrace(1) support
:options         SYSVSHM                 #SYSV-style shared memory
:options         SYSVMSG                 #SYSV-style message queues
:options         SYSVSEM                 #SYSV-style semaphores
:options         P1003_1B                #Posix P1003_1B real-time extensions
:options         _KPOSIX_PRIORITY_SCHEDULING
:options         ICMP_BANDLIM            #Rate limit bad replies
:options         KBD_INSTALL_CDEV        # install a CDEV entry in /dev
:
:# To make an SMP kernel, the next two are needed
:#options        SMP                     # Symmetric MultiProcessor Kernel
:#options        APIC_IO                 # Symmetric (APIC) I/O
:# Optionally these may need tweaked, (defaults shown):
:#options        NCPU=2                  # number of CPUs
:#options        NBUS=4                  # number of busses
:#options        NAPIC=1                 # number of IO APICs
:#options        NINTR=24                # number of INTs
:
:device          isa
:device          pci
:
:# Floppy drives
:device          fdc0    at isa? port IO_FD1 irq 6 drq 2
:device          fd0     at fdc0 drive 0
:
:# ATA and ATAPI devices
:device          ata0    at isa? port IO_WD1 irq 14
:device          ata1    at isa? port IO_WD2 irq 15
:device          ata
:device          atadisk                 # ATA disk drives
:device          atapicd                 # ATAPI CDROM drives
:
:
:# SCSI Controllers
:#device         ahb             # EISA AHA1742 family
:device          ahc             # AHA2940 and onboard AIC7xxx devices
:#device         amd             # AMD 53C974 (Teckram DC-390(T))
:#device         dpt             # DPT Smartcache - See LINT for options!
:#device         isp             # Qlogic family
:#device         ncr             # NCR/Symbios Logic
:device          sym             # NCR/Symbios Logic (newer chipsets)
:options         SYM_SETUP_LP_PROBE_MAP=0x40
:                                # Allow ncr to attach legacy NCR devices when
:                                # both sym and ncr are configured
:
:#device         adv0    at isa?
:#device         adw
:#device         bt0     at isa?
:#device         aha0    at isa?
:#device         aic0    at isa?
:
:# SCSI peripherals
:device          scbus           # SCSI bus (required)
:device          da              # Direct Access (disks)
:device          sa              # Sequential Access (tape etc)
:device          cd              # CD
:device          pass            # Passthrough device (direct SCSI access)
:
:# atkbdc0 controls both the keyboard and the PS/2 mouse
:device          atkbdc0 at isa? port IO_KBD
:device          atkbd0  at atkbdc? irq 1 flags 0x1
:device          psm0    at atkbdc? irq 12
:
:device          vga0    at isa?
:
:# splash screen/screen saver
:pseudo-device   splash
:
:# syscons is the default console driver, resembling an SCO console
:device          sc0     at isa? flags 0x100
:
:# Floating point support - do not disable.
:device          npx0    at nexus? port IO_NPX irq 13
:
:# Serial (COM) ports
:device          sio0    at isa? port IO_COM1 flags 0x10 irq 4
:device          sio1    at isa? port IO_COM2 irq 3
:
:# Parallel port
:device          ppc0    at isa? irq 7
:device          ppbus           # Parallel port bus (required)
:device          lpt             # Printer
:device          plip            # TCP/IP over parallel
:device          ppi             # Parallel port interface device
:device          vpo             # Requires scbus and da
:
:
:# PCI Ethernet NICs.
:device          de              # DEC/Intel DC21x4x (``Tulip'')
:device          fxp             # Intel EtherExpress PRO/100B (82557, 82558)
:
:# Pseudo devices - the number indicates how many units to allocated.
:pseudo-device   loop            # Network loopback
:pseudo-device   ether           # Ethernet support
:pseudo-device   pty             # Pseudo-ttys (telnet etc)
:
:# The `bpf' pseudo-device enables the Berkeley Packet Filter.
:# Be aware of the administrative consequences of enabling this!
:pseudo-device   bpf             #Berkeley packet filter
:
:pseudo-device   ccd     4       #Concatenated disk driver
:
:# Size of the kernel message buffer. Should be N * pagesize.
:options         MSGBUF_SIZE=40960
:
:#
:# Enable the kernel debugger.
:#
:options         DDB
:
:#
:# Don't drop into DDB for a panic. Intended for unattended operation
:# where you may want to drop to DDB from the console, but still want
:# the machine to recover from a panic
:#
:options         DDB_UNATTENDED
:
:# Options for serial drivers that support consoles (only for sio now):
:options         BREAK_TO_DEBUGGER       #a BREAK on a comconsole goes to
:                                        #DDB, if available.


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message
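
The quoted report observes that most processes sleep in biofre while three nfsds wait on the same inode. One way to check that kind of claim against a ddb ps listing is to tally the wmesg and wchan columns. The following is a minimal Python sketch, assuming the eleven-column layout shown in the listing (pid proc addr uid ppid pgrp flag stat wmesg wchan cmd). The function name `tally_wait_channels` and the choice of sample rows are illustrative assumptions, not anything from the thread.

```python
from collections import Counter

# A subset of rows copied from the ddb "ps" listing in the report.
# Assumed columns: pid proc addr uid ppid pgrp flag stat wmesg wchan cmd
PS_ROWS = """\
494 d28fb2a0 d292a000 0 120 120 002004 3 biofre c02aa6d8 amd
233 d0ada440 d284e000 1387 230 233 004006 3 inode c16f8000 tcsh
105 d0ad8f20 d2882000 0 100 100 000004 3 inode c16c3400 nfsd
104 d0ad90c0 d287f000 0 100 100 000004 3 inode c16c3400 nfsd
103 d0ad9260 d287c000 0 100 100 000004 3 inode c16c3400 nfsd
102 d0ad9400 d2878000 0 100 100 000004 3 biofre c02aa6d8 nfsd
5 d0ada780 d0ae7000 0 0 0 000204 3 biofre c02aa6d8 syncer
"""


def tally_wait_channels(text):
    """Count processes per (wmesg, wchan) pair in ddb ps output.

    Hypothetical helper: lines that do not split into exactly
    eleven whitespace-separated fields are skipped.
    """
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 11:
            continue
        wmesg, wchan = fields[8], fields[9]
        counts[(wmesg, wchan)] += 1
    return counts


counts = tally_wait_channels(PS_ROWS)
for (wmesg, wchan), n in counts.most_common():
    print(f"{n:3d}  {wmesg:8s} {wchan}")
```

Pairs that share both a wmesg and a wchan are sleeping on the same kernel address, which is how the three nfsds on c16c3400 stand out from the tcsh that is waiting on a different inode.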