Date: Fri, 3 Dec 1999 11:25:18 -0600
From: Dan Nelson <dnelson@emsphone.com>
To: current@FreeBSD.org
Subject: NFS client zeroing out blocks on write?
Message-ID: <19991203112518.A43843@dan.emsphone.com>
--UlVJffcvxoiEqYs2
Content-Type: text/plain; charset=us-ascii

I just upgraded a server from 2.2.8 to -current (991201 kernel) and am
seeing some NFS corruption. It looks like byte ranges are getting
zeroed out by the client (or not getting written at all, and the server
at the other end is filling with zeros?). I've seen it while writing to
both a Solaris 2.6 server (NFSv3) and a Netware NFS server (NFSv2), so
I'm pretty sure it's the client causing the problem.

Details:

4.0-CURRENT FreeBSD 4.0-CURRENT #2: Thu Dec 2 17:07:57 CST 1999
CPU: Dual Pentium III/Xeon 600 Mhz
RAM: 256MB
NIC: fxp0, full-duplex 100mbit
NFS mount point: /mnt/filesystem/u01, mounting a Solaris 2.6 box also
  with a 100mbit full-duplex net connection, 8K NFS blocksize, UDP, via
  amd.

My testcase is a 7-gig text file that I'm copying around with the
following commands:

$ cd /net/remotesystem/u01
$ split -b 1000000000 /u01/bigfile.txt file

creating seven 1-gig files fileaa .. fileag (running at a nice rate of
5-6 MB/sec :). I then run "blankcheck" (attached) to scan the file for
runs of zeroes, and get the following:

$ for i in filea{a,b,c,d,e,f} ; do echo $i ; ./blankcheck < $i ; done
fileaa
fileab
168173568-168179199(5632)
384966656-384972287(5632)
385753088-385758719(5632)
( snip 156 lines just like the above, all ranges 5632 bytes in size )
464068608-464074239(5632)
464723968-464729599(5632)
465248256-465253887(5632)
fileac
203448320-203451391(3072)
filead
fileae
372097024-372103167(6144)
fileaf
561774592-561778175(3584)
$

All the zeroed out blocks start on an 8k NFS boundary, and I have
verified that the rest of the 8k block has the correct data in it. Each
corrupted block is always a multiple of 512 bytes long (so far
multiples are 6, 7, 11, and 12).

On this example run, each file either has no corruption at all, or has
corruption with all the zeroed out ranges the same size. Dunno if this
matters, but it's interesting.
If I run without nfsiod, or copy from a remote NFS mount to a remote
NFS mount, the corruption goes way down but still happens. I got only
one corrupted block in my 7-gig test run in each of those test cases.

I'm afraid I don't know much about the internal workings of NFS, so I'm
hoping my description is enough to pinpoint the problem.

--
	Dan Nelson
	dnelson@emsphone.com

--UlVJffcvxoiEqYs2
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename=dmesg

Copyright (c) 1992-1999 The FreeBSD Project.
Copyright (c) 1982, 1986, 1989, 1991, 1993
	The Regents of the University of California. All rights reserved.
FreeBSD 4.0-CURRENT #2: Thu Dec 2 17:07:57 CST 1999
    zsh@emssrv7.emsphone.com:/usr/src/sys/compile/EMSSRV7
Timecounter "i8254" frequency 1193182 Hz
CPU: Pentium III/Xeon (596.92-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0x673 Stepping = 3
  Features=0x383fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,XMM>
real memory = 268427264 (262136K bytes)
avail memory = 257163264 (251136K bytes)
Programming 24 pins in IOAPIC #0
FreeBSD/SMP: Multiprocessor motherboard
 cpu0 (BSP): apic id: 1, version: 0x00040011, at 0xfee00000
 cpu1 (AP): apic id: 0, version: 0x00040011, at 0xfee00000
 io0 (APIC): apic id: 2, version: 0x00170011, at 0xfec00000
Preloaded elf kernel "kernel" at 0xc0303000.
VESA: v2.0, 2048k memory, flags:0x0, mode table:0xc02af1c2 (1000022)
VESA: ATI MACH64
Pentium Pro MTRR support enabled
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <Intel 82443BX (440 BX) host to PCI bridge> on motherboard
pci0: <PCI bus> on pcib0
pcib1: <Intel 82443BX (440 BX) PCI-PCI (AGP) bridge> at device 1.0 on pci0
pci1: <PCI bus> on pcib1
vga-pci0: <ATI model 4757 graphics accelerator> at device 0.0 on pci1
pcib2: <PCI to PCI bridge (vendor=1011 device=0024)> at device 2.0 on pci0
pci2: <PCI bus> on pcib2
ahc0: <Adaptec 2944 Ultra SCSI adapter> irq 21 at device 9.0 on pci2
ahc0: aic7880 Wide Channel A, SCSI Id=7, 16/255 SCBs
ahc1: <Adaptec aic7890/91 Ultra2 SCSI adapter> irq 16 at device 11.0 on pci2
ahc1: aic7890/91 Wide Channel A, SCSI Id=7, 16/255 SCBs
isab0: <Intel 82371AB PCI to ISA bridge> at device 7.0 on pci0
isa0: <ISA bus> on isab0
chip1: <Intel PIIX4 IDE controller> at device 7.1 on pci0
pci0: UHCI USB controller (vendor=0x8086, dev=0x7112) at 7.2 irq 19
Timecounter "PIIX" frequency 3579545 Hz
intpm0: <Intel 82371AB Power management controller> at device 7.3 on pci0
intpm0: I/O mapped 850
intpm0: intr IRQ 9 enabled revision 0
smbus0: <System Management Bus> on intsmb0
smb0: <SMBus general purpose I/O> on smbus0
intpm0: PM I/O mapped 800
fxp0: <Intel EtherExpress Pro 10/100B Ethernet> irq 18 at device 14.0 on pci0
fxp0: Ethernet address 00:90:27:dc:44:eb
fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
wdc1 at port 0x170-0x177 irq 15 on isa0
wdc1: unit 0 (atapi): <SAMSUNG SC-140B/ d005>, removable, intr, dma, iordis
wcd0: drive speed 6875KB/sec, 128KB cache
wcd0: supported read types: CD-R, CD-RW, CD-DA, packet track
wcd0: Audio: play, 255 volume levels
wcd0: Mechanism: ejectable tray
wcd0: Medium: no/blank disc inside, unlocked
atkbdc0: <keyboard controller (i8042)> at port 0x60-0x6f on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: model IntelliMouse, device ID 3
vga0: <Generic ISA VGA> at port 0x3b0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: <System console> on isa0
sc0: VGA <16 virtual consoles, flags=0x200>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
sio2: not probed (disabled)
sio3: not probed (disabled)
ppc0 at port 0x378-0x37f irq 7 flags 0x40 on isa0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/8 bytes threshold
plip0: <PLIP network interface> on ppbus 0
lpt0: <generic printer> on ppbus 0
lpt0: Interrupt-driven port
ppi0: <generic parallel i/o> on ppbus 0
APIC_IO: Testing 8254 interrupt delivery
APIC_IO: routing 8254 via pin 2
Waiting 3 seconds for SCSI devices to settle
SMP: AP CPU #1 Launched!
Mounting root from ufda0 at ahc0 bus 0 target 0 lun 0
da0: <StorComp RAID-7 7.02> Fixed Direct Access SCSI-2 device
da0: 20.000MB/s transfers (10.000MHz, offset 8, 16bit), Tagged Queueing Enabled
da0: 51200MB (104857600 512 byte sectors: 255H 63S/T 6527C)
da1 at ahc0 bus 0 target 1 lun 0
da1: <StorComp RAID-7 7.02> Fixed Direct Access SCSI-2 device
da1: 20.000MB/s transfers (10.000MHz, offset 8, 16bit), Tagged Queueing Enabled
da1: 44032MB (90177536 512 byte sectors: 255H 63S/T 5613C)
s:/dev/da2s2a
da2 at ahc1 bus 0 target 0 lun 0
da2: <IBM DNES-309170W SA60> Fixed Direct Access SCSI-3 device
da2: 80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da2: 8683MB (17783301 512 byte sectors: 255H 63S/T 1106C)

--UlVJffcvxoiEqYs2
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename=EMSSRV7

#
# GENERIC -- Generic machine with WD/AHx/NCR/BTx family disks
#
# For more information on this file, please read the handbook section on
# Kernel Configuration Files:
#
#    http://www.freebsd.org/handbook/kernelconfig-config.html
#
# The handbook is also available locally in /usr/share/doc/handbook
# if you've installed the doc distribution, otherwise always see the
# FreeBSD World Wide Web server (http://www.FreeBSD.ORG/) for the
# latest information.
#
# An exhaustive list of options and more detailed explanations of the
# device lines is also present in the ./LINT configuration file. If you are
# in doubt as to the purpose or necessity of a line, check first in LINT.
#
# $FreeBSD: src/sys/i386/conf/GENERIC,v 1.199 1999/11/01 04:02:56 peter Exp $

machine         i386
#cpu            I386_CPU
#cpu            I486_CPU
#cpu            I586_CPU
cpu             I686_CPU
ident           GENERIC
maxusers        100

#makeoptions    DEBUG=-g                #Build kernel with gdb(1) debug symbols
makeoptions     COPTFLAGS="-O -pipe"    # use some optimizations

options         MATH_EMULATE            #Support for x87 emulation
options         INET                    #InterNETworking
options         FFS                     #Berkeley Fast Filesystem
options         FFS_ROOT                #FFS usable as root device [keep this!]
options         MFS                     #Memory Filesystem
options         MFS_ROOT                #MFS usable as root device, "MFS" req'ed
options         NFS                     #Network Filesystem
options         NFS_ROOT                #NFS usable as root device, "NFS" req'ed
options         MSDOSFS                 #MSDOS Filesystem
options         CD9660                  #ISO 9660 Filesystem
options         CD9660_ROOT             #CD-ROM usable as root. "CD9660" req'ed
options         PROCFS                  #Process filesystem
options         COMPAT_43               #Compatible with BSD 4.3 [KEEP THIS!]
options         SCSI_DELAY=3000         #Be optimistic about Joe SCSI device
options         UCONSOLE                #Allow users to grab the console
options         USERCONFIG              #boot -c editor
options         VISUAL_USERCONFIG       #visual boot -c editor
options         KTRACE                  #ktrace(1) syscall trace support
options         SYSVSHM                 #SYSV-style shared memory
options         SYSVMSG                 #SYSV-style message queues
options         SYSVSEM                 #SYSV-style semaphores
options         SOFTUPDATES

# To make an SMP kernel, the next two are needed
options         SMP                     # Symmetric MultiProcessor Kernel
options         APIC_IO                 # Symmetric (APIC) I/O
# Optionally these may need tweaked, (defaults shown):
#options        NCPU=2                  # number of CPUs
#options        NBUS=4                  # number of busses
#options        NAPIC=1                 # number of IO APICs
#options        NINTR=24                # number of INTs

controller      isa0
controller      pnp0                    # PnP support for ISA
controller      eisa0
controller      pci0

# Floppy drives
controller      fdc0 at isa? port IO_FD1 irq 6 drq 2
device          fd0 at fdc0 drive 0
device          fd1 at fdc0 drive 1

options         IDE_DELAY=3000          # Be optimistic about Joe IDE device

# IDE controller and disks
controller      wdc0 at isa? port IO_WD1 irq 14
device          wd0 at wdc0 drive 0
device          wd1 at wdc0 drive 1
controller      wdc1 at isa? port IO_WD2 irq 15
device          wd2 at wdc1 drive 0
device          wd3 at wdc1 drive 1

# ATAPI devices on wdc?
device          wcd0                    #IDE CD-ROM
device          wfd0                    #IDE Floppy (e.g. LS-120)
device          wst0                    #IDE Tape (e.g. Travan)

# SCSI Controllers
# A single entry for any of these controllers (ncr, ahb, ahc) is
# sufficient for any number of installed devices.
controller      ahc0                    # AHA2940 and onboard AIC7xxx devices
#controller     adv0 at isa? port ? irq ?
#controller     adw0
#controller     bt0 at isa? port ? irq ?
#controller     aha0 at isa? port ? irq ?

# SCSI peripherals
# Only one of each of these is needed, they are dynamically allocated.
controller      scbus0                  # SCSI bus (required)
device          da0                     # Direct Access (disks)
device          sa0                     # Sequential Access (tape etc)
device          cd0                     # CD
device          pass0                   # Passthrough device (direct SCSI access)

# atkbdc0 controls both the keyboard and the PS/2 mouse
controller      atkbdc0 at isa? port IO_KBD
device          atkbd0 at atkbdc? irq 1
device          psm0 at atkbdc? irq 12

device          vga0 at isa? port ? conflicts
options         VGA_WIDTH90             # support 90 column modes
options         VESA

# splash screen/screen saver
pseudo-device   splash

# syscons is the default console driver, resembling an SCO console
device          sc0 at isa?

# Floating point support - do not disable.
device          npx0 at nexus? port IO_NPX irq 13

# Power management support (see LINT for more options)
device          apm0 at nexus? disable flags 0x31 # Advanced Power Management

# PCCARD (PCMCIA) support
#controller     card0
#device         pcic0 at isa?
#device         pcic1 at isa?

# Serial (COM) ports
device          sio0 at isa? port IO_COM1 flags 0x10 irq 4
device          sio1 at isa? port IO_COM2 irq 3
device          sio2 at isa? disable port IO_COM3 irq 5
device          sio3 at isa? disable port IO_COM4 irq 9

# Parallel port
device          ppc0 at isa? port? flags 0x40 irq 7
controller      ppbus0                  # Parallel port bus (required)
device          lpt0                    # Printer
device          plip0                   # TCP/IP over parallel
device          ppi0                    # Parallel port interface device
#controller     vpo0                    # Requires scbus and da0

# PCI Ethernet NICs.
device          fxp0                    # Intel EtherExpress PRO/100B (82557, 82558)

# Pseudo devices - the number indicates how many units to allocated.
pseudo-device   loop                    # Network loopback
pseudo-device   ether                   # Ethernet support
pseudo-device   sl 1                    # Kernel SLIP
pseudo-device   ppp 1                   # Kernel PPP
pseudo-device   tun                     # Packet tunnel.
pseudo-device   pty                     # Pseudo-ttys (telnet etc)
pseudo-device   gzip                    # Exec gzipped a.out's

# The `bpf' pseudo-device enables the Berkeley Packet Filter.
# Be aware of the administrative consequences of enabling this!
pseudo-device   bpf                     #Berkeley packet filter

# USB support
#controller     uhci0                   # UHCI PCI->USB interface
#controller     ohci0                   # OHCI PCI->USB interface
#controller     usb0                    # USB Bus (required)
#device         ugen0                   # Generic
#device         uhid0                   # "Human Interface Devices"
#device         ukbd0                   # Keyboard
#device         ulpt0                   # Printer
#controller     umass0                  # Disks/Mass storage - Requires scbus and da0
#device         ums0                    # Mouse

# System Management Bus support provided by the 'smbus' device.
controller      smbus0
controller      intpm0
controller      alpm0
device          smb0 at smbus?

options         MSGBUF_SIZE=32768
options         INCLUDE_CONFIG_FILE     # Include this file in kernel

--UlVJffcvxoiEqYs2
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="blankcheck.c"

#include <stdio.h>

#define BS (65536)

/* print the byte ranges and sizes of any runs of zeros in a datafile
   more than 5 bytes long */

int main (void)
{
    char buf[BS];
    long long offset = 0;   /* file offset of the byte being examined */
    long long first = -1;   /* start of the current zero run, -1 if none */
    int size;
    int bufoff = 0;

    setvbuf (stdin, NULL, _IOFBF, BS);
    while ((size = fread (buf, 1, BS, stdin)))
    {
        for (bufoff = 0; bufoff < size; bufoff++, offset++)
        {
            if (buf[bufoff] == 0)
            {
                if (first == -1)
                    first = offset;
            }
            else
            {
                if (first != -1)
                {
                    if (offset - first > 5)
                        printf ("%lld-%lld(%lld) \n", first, offset - 1, offset - first);
                    first = -1;
                }
            }
        }
    }
    /* A zero run that reaches end of file. Here offset is one past the
       last byte read, so the run covers first..offset-1, the same as in
       the loop above. */
    if (first != -1)
    {
        if (offset - first > 5)
            printf ("%lld-%lld(%lld) \n", first, offset - 1, offset - first);
        first = -1;
    }
    return 0;
}

--UlVJffcvxoiEqYs2--

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message