Date: Sun, 29 Nov 1998 05:00:02 -0800 (PST)
From: toasty@dragondata.com
To: freebsd-bugs@FreeBSD.ORG
Subject: kern/8834: NFS can corrupt local file cache
Message-ID: <199811291300.FAA27870@freefall.freebsd.org>
The following reply was made to PR kern/8834; it has been noted by GNATS.
From: toasty@dragondata.com
To: FreeBSD-gnats-submit@FreeBSD.ORG
Cc:
Subject: kern/8834: NFS can corrupt local file cache
Date: Tue, 24 Nov 1998 04:06:11 -0600 (CST)
>Number: 8834
>Category: kern
>Synopsis: NFS can corrupt local file cache
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Tue Nov 24 02:10:01 PST 1998
>Last-Modified:
>Originator: Kevin Day
>Organization:
DragonData Internet Services
>Release: FreeBSD 2.2.7-STABLE i386
>Environment:
2.2.5 NFS server, 2.2.7 NFS client.
dmesg from client:
Copyright (c) 1992-1998 FreeBSD Inc.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California. All rights reserved.
FreeBSD 2.2.7-RELEASE #0: Thu Jul 30 16:42:02 CDT 1998
root@shell1.dragondata.com:/usr/src/sys/compile/SHELL1
CPU: Pentium II (quarter-micron) (398.27-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0x651 Stepping=1
Features=0x183f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,<b16>,<b17>,MMX,<b24>>
real memory = 402653184 (393216K bytes)
avail memory = 391720960 (382540K bytes)
Probing for devices on PCI bus 0:
chip0 <generic PCI bridge (vendor=8086 device=7190 subclass=0)> rev 2 on pci0:0:0
chip1 <generic PCI bridge (vendor=8086 device=7191 subclass=4)> rev 2 on pci0:1:0
chip2 <Intel 82371AB PCI-ISA bridge> rev 2 on pci0:7:0
chip3 <Intel 82371AB IDE interface> rev 1 on pci0:7:1
chip4 <Intel 82371AB USB interface> rev 1 int d irq 9 on pci0:7:2
chip5 <Intel 82371AB Power management controller> rev 2 on pci0:7:3
de0 <Digital 21140A Fast Ethernet> rev 34 int a irq 11 on pci0:14:0
de0: 21140A [10-100Mb/s] pass 2.2
de0: address 00:40:05:43:a3:a3
de1 <Digital 21140A Fast Ethernet> rev 34 int a irq 10 on pci0:15:0
de1: 21140A [10-100Mb/s] pass 2.2
de1: address 00:40:05:42:dd:26
Probing for devices on PCI bus 1:
vga0 <VGA-compatible display device> rev 92 on pci1:0:0
Probing for devices on the ISA bus:
sc0 at 0x60-0x6f irq 1 on motherboard
sc0: VGA color <16 virtual consoles, flags=0x0>
sio0 at 0x3f8-0x3ff irq 4 on isa
sio0: type 16550A
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1 not found at 0x2f8
lpt0 at 0x378-0x37f irq 7 on isa
lpt0: Interrupt-driven port
lp0: TCP/IP capable interface
lpt1 not found at 0xffffffff
psm0 not found at 0x60
fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa
fdc0: FIFO enabled, 8 bytes threshold
fd0: 1.44MB 3.5in
wdc0 at 0x1f0-0x1f7 irq 14 on isa
wdc0: unit 0 (wd0): <Maxtor 91152D8>
wd0: 8063MB (16514064 sectors), 16383 cyls, 16 heads, 63 S/T, 512 B/S
wdc1 at 0x170-0x177 irq 15 on isa
wdc1: unit 0 (atapi): <NEC CD-ROM DRIVE:28C/3.02>, removable, dma, iordy
wcd0: 2412/5512Kb/sec, 128Kb cache, audio play, 256 volume levels, ejectable tray
wcd0: no disc inside, unlocked
npx0 flags 0x1 on motherboard
npx0: INT 16 interface
de0: enabling Full Duplex 100baseTX port
de1: enabling 100baseTX port
>Description:
Heavy NFS activity on the client can corrupt local files that are in use.
>How-To-Repeat:
dd if=/mnt/nfsserver/verylargefile of=/tmp/somefile bs=32k &
Then I ran a kernel config, followed by a compile.
vers.c started getting bits and pieces of 'verylargefile' mixed in with it.
Until I rebooted, every time vers.c was recreated by config, it had the same
garbage in it.
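One way to watch for this kind of corruption without waiting for vers.c to
go bad is sketched below. This is not the procedure used in this report; it
is a minimal sketch, assuming Python is available on the client. The file
name cache_check.dat and all helper names are hypothetical. While the dd
above is running, it repeatedly rewrites a local file with a known pattern
and compares the checksum of what is read back against what was written.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=32 * 1024):
    """Return the SHA-256 hex digest of the file at path, read in 32k chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_pattern(path, size=1 << 20):
    """Write a recognizable repeating pattern; return the digest of what was written."""
    block = b"LOCAL-FILE-PATTERN\n"
    data = (block * (size // len(block) + 1))[:size]
    with open(path, "wb") as f:
        f.write(data)
    return hashlib.sha256(data).hexdigest()


def check_local_file(path, rounds=10):
    """Rewrite path several times; return the rounds whose read-back digest differs."""
    bad_rounds = []
    for i in range(rounds):
        expected = write_pattern(path)
        if sha256_of(path) != expected:
            bad_rounds.append(i)
    return bad_rounds


if __name__ == "__main__":
    # Hypothetical local (non-NFS) path; run while the NFS dd is in progress.
    target = os.path.join(tempfile.gettempdir(), "cache_check.dat")
    print("corrupted rounds:", check_local_file(target))
```

An empty list means no mismatch was seen in that run. It does not show that
the problem is absent, since the corruption may depend on timing and on
memory pressure.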
As a test, I tried copying the corrupted vers.c back to the nfs server.
/kernel: short receive (0/4) from nfs server home.internal:/home
/kernel: nfs server home.internal:/home: not responding
last message repeated 61 times
/kernel: short receive (0/4) from nfs server home.internal:/home
/kernel: nfs server home.internal:/home: not responding
last message repeated 25 times
/kernel: /kernel: vm_fault: pager input (probably hardware) error, PID 15822 failure
last message repeated 33297 times
last message repeated 128988 times
last message repeated 632240 times
last message repeated 639222 times
/kernel: pid 1328 (inetd), uid 0, was killed: exceeded maximum CPU limit
/kernel: pid 15822 (cp), uid 0: exited on signal 11 (core dumped)
At this point, the system locked up... ddb showed it in nfs_bwrite, or some
function under it, apparently stuck in a loop. It wouldn't do a core dump.
Also, if this helps, since it seems somewhat related: if an executable is run
from an NFS mount and gets killed for exceeding its CPU limit, it also starts
producing the never-ending "vm_fault: pager input (probably hardware) error"
message over and over again.
Why inetd died there makes no sense; it has no CPU limit.
>Fix:
>Audit-Trail:
>Unformatted:
Kevin Day
To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-bugs" in the body of the message
