Date: Thu, 16 Mar 2000 00:51:08 -0600 From: Douglas Swarin <doug@staff.texas.net> To: freebsd-hackers@freebsd.org Subject: NFS Panic Problem Message-ID: <20000316005107.A2883@staff.texas.net>
next in thread | raw e-mail | index | archive | help
Recently one of the FreeBSD machines where I work has been crashing on a semi-regular basis, once or twice a day. The dmesg for the machine is at the bottom of this post. These crashes started very recently, less than a week ago. Before that, the machine had been very reliable (several 100 day uptimes). The machine used to be running FreeBSD 3.1-STABLE as of mid-April 1999. Since I know many NFS bugs have been fixed since then, the box was on Tuesday upgraded to 3.4-STABLE (a completely fresh installation). This, however, did not fix the panics. I believe the problem to be related to one of these two PRs: [1998/06/23] kern/7028 http://www.freebsd.org/cgi/query-pr.cgi?pr=7028 panic in vinvalbuf when appending/looking at tail of NFS file [2000/03/08] misc/17272 http://www.freebsd.org/cgi/query-pr.cgi?pr=17272 deleting a file that a program has open causes vinvalbuf: flush failed Basically, it's: panic: vinvalbuf: flush failed And appears to be triggered by a 'tail -f' on a growing, very large log file over NFS. The NFS host on the other end is running Solaris 2.6 on a sparc. The actual mount is kind of weird; it is indirected through a different NFS mount off a NetApp through a symlink (the NetApp-mounted FS is basically a symlink farm with a few real directories). Basically: netapp:/home on /home sun:/logs on /sun/logs /home/logs@ -> /sun/logs and we are doing 'tail -f /home/logs/largelogfile' (there are good historical reasons for this setup) We have made no significant changes to the other machines in this setup, although the logfile in question has been growing in size over time. We rotate the logfile on the Sun daily as well. No executable files for the BSD machine are stored on the Sun. I have compiled a debug kernel and will provide a traceback and/or dump to anyone who is interested once it happens again. If I find a way to reliably reproduce it, I will post that too. For the meantime, are there any quick patches or other solutions I could use? Thanks in advance for your time and advice, Doug Below is dmesg: Copyright (c) 1992-1999 FreeBSD Inc. Copyright (c) 1982, 1986, 1989, 1991, 1993 The Regents of the University of California. All rights reserved. FreeBSD 3.4-STABLE #2: Tue Mar 14 23:21:39 CST 2000 doug@xxx:/usr/src/sys/compile/XXX Timecounter "i8254" frequency 1193182 Hz Timecounter "TSC" frequency 347664663 Hz CPU: Pentium II/Xeon/Celeron (347.66-MHz 686-class CPU) Origin = "GenuineIntel" Id = 0x652 Stepping = 2 Features=0x183fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR> real memory = 536870912 (524288K bytes) avail memory = 519360512 (507188K bytes) Preloaded elf kernel "kernel" at 0xc0309000. Preloaded userconfig_script "/boot/kernel.conf" at 0xc030909c. Pentium Pro MTRR support enabled Probing for devices on PCI bus 0: chip0: <Intel 82443BX host to PCI bridge> rev 0x03 on pci0.0.0 chip1: <Intel 82443BX host to AGP bridge> rev 0x03 on pci0.1.0 chip2: <PCI to PCI bridge (vendor=1011 device=0024)> rev 0x03 on pci0.2.0 chip3: <Intel 82371AB PCI to ISA bridge> rev 0x02 on pci0.7.0 chip4: <Intel 82371AB Power management controller> rev 0x02 on pci0.7.3 fxp0: <Intel EtherExpress Pro 10/100B Ethernet> rev 0x05 int a irq 14 on pci0.8.0 fxp0: Ethernet address 00:90:27:45:ee:ae Probing for devices on PCI bus 1: vga0: <ATI model 4744 graphics accelerator> rev 0x5c on pci1.0.0 Probing for devices on PCI bus 2: ahc0: <Adaptec aic7890/91 Ultra2 SCSI adapter> rev 0x00 int a irq 11 on pci2.4.0 ahc0: aic7890/91 Wide Channel A, SCSI Id=7, 16/255 SCBs ahc1: <Adaptec aic7860 SCSI adapter> rev 0x03 int a irq 11 on pci2.6.0 ahc1: aic7860 Single Channel A, SCSI Id=7, 3/255 SCBs Probing for devices on the ISA bus: sc0 on isa sc0: VGA color <16 virtual consoles, flags=0x0> atkbdc0 at 0x60-0x6f on motherboard atkbd0 irq 1 on isa psm0 not found sio0 at 0x3f8-0x3ff irq 4 flags 0x30 on isa sio0: type 16550A, console sio1 at 0x2f8-0x2ff irq 3 on isa sio1: type 16550A fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa fdc0: FIFO enabled, 8 bytes threshold fd0: 1.44MB 3.5in ppc0 at 0x378 irq 7 on isa ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode ppc0: FIFO with 16/16/8 bytes threshold lpt0: <generic printer> on ppbus 0 lpt0: Interrupt-driven port ppi0: <generic parallel i/o> on ppbus 0 lppps0: <Pulse per second Timing Interface> on ppbus 0 plip0: <PLIP network interface> on ppbus 0 vga0 at 0x3b0-0x3df maddr 0xa0000 msize 131072 on isa npx0 on motherboard npx0: INT 16 interface Waiting 8 seconds for SCSI devices to settle chcd0 at ahc1 bus 0 target 5 lun 0 cd0: <NEC CD-ROM DRIVE:465 1.03> Removable CD-ROM SCSI-2 device cd0: 20.000MB/s transfers (20.000MHz, offset 15) cd0: Attempt to query device size failed: NOT READY, Medium not present da1 at ahc0 bus 0 target 1 lun 0 da1: <IBM DDRS-39130D DC1B> Fixed Direct Access SCSI-2 device da1: 40.000MB/s transfers (20.000MHz, offset 15, 16bit), Tagged Queueing Enabled da1: 8715MB (17850000 512 byte sectors: 255H 63S/T 1111C) da0 at ahc0 bus 0 target 0 lun 0 da0: <IBM DDRS-39130D DC1B> Fixed Direct Access SCSI-2 device da0: 40.000MB/s transfers (20.000MHz, offset 15, 16bit), Tagged Queueing Enabled da0: 8715MB (17850000 512 byte sectors: 255H 63S/T 1111C) changing root device to da0s1a To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-hackers" in the body of the message
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20000316005107.A2883>