From owner-freebsd-bugs Sun Jan 17 19:40:08 1999
Return-Path:
Received: (from majordom@localhost)
    by hub.freebsd.org (8.8.8/8.8.8) id TAA01451
    for freebsd-bugs-outgoing; Sun, 17 Jan 1999 19:40:08 -0800 (PST)
    (envelope-from owner-freebsd-bugs@FreeBSD.ORG)
Received: from freefall.freebsd.org (freefall.FreeBSD.ORG [204.216.27.21])
    by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id TAA01370
    for ; Sun, 17 Jan 1999 19:40:01 -0800 (PST)
    (envelope-from gnats@FreeBSD.org)
Received: (from gnats@localhost)
    by freefall.freebsd.org (8.8.8/8.8.5) id TAA07317;
    Sun, 17 Jan 1999 19:40:01 -0800 (PST)
Received: from limbic.gc2.kloepfer.org (limbic.gc2.kloepfer.org [206.225.39.2])
    by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id TAA00642
    for ; Sun, 17 Jan 1999 19:34:09 -0800 (PST)
    (envelope-from gil@gc2.kloepfer.org)
Received: (from gil@localhost)
    by limbic.gc2.kloepfer.org (8.9.1/8.9.1) id VAA09847;
    Sun, 17 Jan 1999 21:34:02 -0600 (CST)
Message-Id: <199901180334.VAA09847@limbic.gc2.kloepfer.org>
Date: Sun, 17 Jan 1999 21:34:02 -0600 (CST)
From: fgil@gc2.kloepfer.org
Reply-To: fgil@gc2.kloepfer.org
To: FreeBSD-gnats-submit@FreeBSD.ORG
Cc: fgil@limbic.gc2.kloepfer.org
X-Send-Pr-Version: 3.2
Subject: kern/9548: UNION fs corrupts data and has undefined getpages VOP
Sender: owner-freebsd-bugs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

>Number:         9548
>Category:       kern
>Synopsis:       UNION fs corrupts data and has undefined getpages VOP
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Jan 17 19:40:00 PST 1999
>Closed-Date:
>Last-Modified:
>Originator:     Gil Kloepfer Jr.
>Release:        FreeBSD 3.0-RELEASE i386
>Organization:
None
>Environment:

Probably the easiest and the most complete description comes from
the kernel itself...

avail memory = 46227456 (45144K bytes)
Bad BIOS32 Service Directory!
Probing for devices on PCI bus 0:
chip0: rev 0x02 on pci0.0.0
chip1: rev 0x02 on pci0.7.0
ide_pci0: rev 0x02 on pci0.7.1
vga0: rev 0x00 int a irq 5 on pci0.18.0
Probing for devices on the ISA bus:
sc0 at 0x60-0x6f irq 1 on motherboard
sc0: VGA color <16 virtual consoles, flags=0x0>
ed0 at 0x300-0x31f irq 9 maddr 0xc8000 msize 16384 on isa
ed0: address 00:00:c0:49:93:da, type SMC8216/SMC8216C (16 bit)
sio0 at 0x3f8-0x3ff irq 4 flags 0x10 on isa
sio0: type 16550A
sio1 at 0x2f8-0x2ff irq 3 on isa
sio1: type 16550A
lpt0 at 0x378-0x37f irq 7 on isa
lpt0: Interrupt-driven port
lp0: TCP/IP capable interface
psm0 at 0x60-0x64 irq 12 on motherboard
psm0: model Generic PS/2 mouse, device ID 0
fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa
fdc0: FIFO enabled, 8 bytes threshold
fd0: 1.44MB 3.5in
wdc0 at 0x1f0-0x1f7 irq 14 on isa
wdc0: unit 0 (wd0):
wd0: 1549MB (3173184 sectors), 3148 cyls, 16 heads, 63 S/T, 512 B/S
wdc0: unit 1 (wd1):
wd1: 406MB (832608 sectors), 826 cyls, 16 heads, 63 S/T, 512 B/S
wdc1 at 0x170-0x177 irq 15 on isa
wdc1: unit 0 (atapi): , removable, dma, iordis
wcd0: 1033Kb/sec, 128Kb cache, audio play, 256 volume levels, ejectable tray
wcd0: no disc inside, unlocked
npx0 on motherboard
npx0: INT 16 interface
Intel Pentium F00F detected, installing workaround

>Description:

1. UNION filesystem corrupts data, and
2. Reports that there is a stale getpages routine (this is a
   diagnostic from /sys/vm/vnode_pager.c to show that an EOPNOTSUPP
   was returned when getpages was called, meaning that in
   union_vnops.c there was no getpages VOP implemented).  Exact
   message from kernel is:

   vnode_pager: *** WARNING *** stale FS getpages

It appears that the data corruption in (1) is caused during an mmap
operation on a file copied to the upper layer of the union filesystem.
For example, in the steps outlined in How-To-Repeat, a file can be
copied around with /bin/cp (which uses stdio), but if a
"/usr/bin/cmp -l" is performed on the file (which uses mmap), or if an
execute is attempted on an executable file, the file on the union
filesystem becomes corrupt (basically filled with 0x00).

I originally discovered all this because I wanted to keep the kernel
sources on a CD, but mount some writable disk space on top in order to
do a kernel build, thus avoiding the need to keep the kernel (and
other) sources on disk.

>How-To-Repeat:

two filesystems, /fs1 and /fs2

mkdir /fs1/lower
mkdir /fs2/upper
mount -t union /fs2/upper /fs1/lower
cd /fs1/lower    # really the union filesystem at this point
cp /etc/termcap .
cmp -l termcap /etc/termcap
# data in /fs2/upper/termcap is now corrupt

-- another example --

mkdir /fs1/lower
mkdir /fs2/upper
mount -t union /fs2/upper /fs1/lower
cd /fs1/lower    # really the union filesystem at this point
cp /bin/cat .
./cat /etc/termcap >/dev/null
# will report "wrong architecture" because the file will become
# filled with zeros

-- example where it works correctly --

mkdir /fs1/lower
mkdir /fs2/upper
mount -t union /fs2/upper /fs1/lower
cd /fs1/lower    # really the union filesystem at this point
cp /etc/termcap .
cat termcap >/tmp/termcap
cd /tmp
cmp -l termcap /etc/termcap
# file compare is good, because /tmp/termcap has not been corrupted

# NOTE: now umount the union filesystem, and do the following:
cd /fs2/upper
cmp -l termcap /etc/termcap
# What will happen now is that cmp will core-dump (Segmentation fault)
# and the kernel will report:
# vm_fault: pager read error, pid 298 (cmp)

>Fix:

I have tried without any luck to find out exactly what is happening in
#2, and why this behavior occurs.  I don't know enough about the
workings of the kernel to understand what may be wrong.
(I did learn some about what a vnode is, however...:)

For #1, I applied the following changes to
/sys/miscfs/union/union_vnops.c as per the recommendations in
vnode_pager.c.  However, I am not sure if this is the correct fix.

(remove leading tab from context diff to use it)

	*** union_vnops.c.ORIG	Sat Jan  9 18:04:08 1999
	--- union_vnops.c	Sun Jan 17 21:27:28 1999
	***************
	*** 67,72 ****
	--- 67,73 ----
	  static void union_fixup __P((struct union_node *un, struct proc *p));
	  static int union_fsync __P((struct vop_fsync_args *ap));
	  static int union_getattr __P((struct vop_getattr_args *ap));
	+ static int union_getpages __P((struct vop_getpages_args *ap));
	  static int union_inactive __P((struct vop_inactive_args *ap));
	  static int union_ioctl __P((struct vop_ioctl_args *ap));
	  static int union_islocked __P((struct vop_islocked_args *ap));
	***************
	*** 99,104 ****
	--- 100,107 ----
	  static int union_whiteout __P((struct vop_whiteout_args *ap));
	  static int union_write __P((struct vop_read_args *ap));
	  
	+ extern int vnode_pager_generic_getpages __P((struct vnode *, vm_page_t *, int, int));
	+ 
	  static void
	  union_fixup(un, p)
	  	struct union_node *un;
	***************
	*** 1750,1755 ****
	--- 1753,1773 ----
	  
	  	return (error);
	  }
	+ 
	+ /*
	+  * XXX - This getpages function is copied from the one used in mfs.
	+  * There really needs to be a fs-device-specific default getpages
	+  * vop function written...
	+  */
	+ 
	+ static int
	+ union_getpages(ap)
	+ 	struct vop_getpages_args *ap;
	+ {
	+ 	return(vnode_pager_generic_getpages(ap->a_vp, ap->a_m, ap->a_count, ap->a_reqpage));
	+ }
	+ 
	+ 
	  /*
	   * Global vfs data structures
	   */
	***************
	*** 1764,1769 ****
	--- 1782,1788 ----
	  	{ &vop_create_desc,		(vop_t *) union_create },
	  	{ &vop_fsync_desc,		(vop_t *) union_fsync },
	  	{ &vop_getattr_desc,		(vop_t *) union_getattr },
	+ 	{ &vop_getpages_desc,		(vop_t *) union_getpages },
	  	{ &vop_inactive_desc,		(vop_t *) union_inactive },
	  	{ &vop_ioctl_desc,		(vop_t *) union_ioctl },
	  	{ &vop_islocked_desc,		(vop_t *) union_islocked },

>Release-Note:
>Audit-Trail:
>Unformatted:

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-bugs" in the body of the message
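
The report attributes the corruption to the mmap path rather than the read path. A minimal user-space sketch of that comparison follows. It is only an illustration, not part of the PR: the helper name read_vs_mmap_mismatches is hypothetical, and it assumes a POSIX system with mmap(2). It reads a file once with read(2) and once through a shared read-only mapping, then counts the bytes that differ. On a kernel showing the behavior described above, touching the mapping might fault (compare the "pager read error" message) instead of returning a count.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from the PR: count the bytes that
 * differ between what read(2) returns for a file and what an mmap(2)
 * of the same file shows.  Returns 0 when both paths agree, a positive
 * count when they differ, and -1 on error or for an empty file.
 */
long
read_vs_mmap_mismatches(const char *path)
{
	struct stat st;
	char *via_read = NULL;
	char *via_mmap = MAP_FAILED;
	long mismatches = -1;
	off_t done = 0;
	int fd;

	assert(path != NULL);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return (-1);
	if (fstat(fd, &st) < 0 || st.st_size == 0)
		goto out;	/* nothing to map for an empty file */

	/* Path 1: ordinary read(2) into a heap buffer. */
	via_read = malloc((size_t)st.st_size);
	if (via_read == NULL)
		goto out;
	while (done < st.st_size) {
		ssize_t n = read(fd, via_read + done,
		    (size_t)(st.st_size - done));
		if (n <= 0)
			goto out;
		done += n;
	}

	/* Path 2: the same bytes seen through the VM system. */
	via_mmap = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED,
	    fd, 0);
	if (via_mmap == MAP_FAILED)
		goto out;

	mismatches = 0;
	for (off_t i = 0; i < st.st_size; i++)
		if (via_read[i] != via_mmap[i])
			mismatches++;
out:
	if (via_mmap != MAP_FAILED)
		munmap(via_mmap, (size_t)st.st_size);
	free(via_read);
	close(fd);
	return (mismatches);
}
```

Called on a file inside the union mount right after the cp step (for example the copied termcap), a nonzero result would be consistent with the diagnosis in the Description. A result of 0 on an ordinary filesystem is the expected baseline.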