From: Gerrit Kühn
Date: Sun, 9 Feb 2003 18:24:37 +0100
To: Andre Guibert de Bruet
Cc: Vallo Kallaste, Attila Nagy, current@FreeBSD.ORG
Subject: Re: Does bg fsck have problems with large filesystems?
Message-ID: <20030209172437.GA59271@pmp.uni-hannover.de>
In-Reply-To: <20030128173142.GF78630@pmp.uni-hannover.de>
References: <20030127174127.GD71664@pmp.uni-hannover.de> <20030128125432.GB4813@tiiu.internal> <20030128110546.L66869@alpha.siliconlandmark.com> <20030128173142.GF78630@pmp.uni-hannover.de>

On Tue, Jan 28, 2003 at 06:31:42PM +0100, Gerrit Kühn wrote:

> > I've been trying to reproduce this bug on my desktop. This machine has 2
> > 80gb disks, one of which is dedicated with one slice. So far, after 8 hard
> > resets, I haven't had any problem with either the machine or bgfsck
> > hanging.

> I'll try to reproduce the thing on my machine as soon as possible.
> Perhaps it was just because it was Monday, who knows...

Meanwhile I found out that my problem is 100% reproducible.
My file systems look like this:

Filesystem  1K-blocks    Used    Avail Capacity  Mounted on
/dev/ad0s1a    257838   67338   169874    28%    /
devfs               1       1        0   100%    /dev
/dev/ad0s1g  57467672       2 52870258     0%    /export
/dev/ad0s1f   4125838       4  3795768     0%    /tmp
/dev/ad0s1e  12383502 1336152 10056670    12%    /usr
/dev/ad0s1d   4125838    3458  3792314     0%    /var

When booting with non-clean filesystems, bgfsck runs quickly over a, d, e
and f. However, on g it keeps running forever. I can't kill the fsck
processes and I can't access g, though the rest of the system seems to be
usable as usual. Here is the output of ps axl:

  UID   PID  PPID CPU PRI NI   VSZ   RSS MWCHAN STAT TT      TIME COMMAND
    0     0     0   0 -16  0     0    12 sched  DLs  ??   0:00.00 (swapper)
    0     1     0   0   8  0   712   392 wait   ILs  ??   0:00.01 /sbin/init -
    0     2     0   0  -8  0     0    12 g_even DL   ??   0:00.02 (g_event)
    0     3     0   0  -8  0     0    12 g_up   DL   ??   0:00.09 (g_up)
    0     4     0   0  -8  0     0    12 g_down DL   ??   0:00.19 (g_down)
    0     5     0   0 -84  0     0    12 actask IL   ??   0:00.00 (acpi_task0
    0     6     0   0 -84  0     0    12 actask IL   ??   0:00.00 (acpi_task1
    0     7     0   0 -84  0     0    12 actask IL   ??   0:00.00 (acpi_task2
    0     8     0   0 -16  0     0    12 psleep DL   ??   0:00.00 (pagedaemon
    0     9     0   0  20  0     0    12 psleep DL   ??   0:00.00 (vmdaemon)
    0    10     0   0 -16  0     0    12 ktrace DL   ??   0:00.00 (ktrace)
    0    11     0 110 -16  0     0    12 -      RL   ??   2:20.07 (idle)
    0    12     0   0 -48  0     0    12 -      WL   ??   0:00.12 (swi6: tty:
    0    14     0   0 -44  0     0    12 -      WL   ??   0:00.00 (swi1: net)
    0    15     0   0  76  0     0    12 sleep  DL   ??   0:00.05 (random)
    0    19     0   0 -28  0     0    12 -      WL   ??   0:00.00 (swi5: acpi
    0    22     0   0 -64  0     0    12 -      WL   ??   0:00.28 (irq14: ata
    0    24     0   0 -68  0     0    12 -      WL   ??   0:00.00 (irq11: rl0
    0    25     0   0   8  0     0    12 usbevt DL   ??   0:00.00 (usb0)
    0    26     0   0   8  0     0    12 usbtsk DL   ??   0:00.00 (usbtask)
    0    27     0   0   8  0     0    12 usbevt DL   ??   0:00.00 (usb1)
    0    28     0   5 -68  0     0    12 -      WL   ??   0:00.00 (irq12: fwo
    0    29     0   0 -64  0     0    12 -      WL   ??   0:00.00 (irq6: fdc0
    0    32     0   0 -60  0     0    12 -      WL   ??   0:00.00 (irq7: ppc0
    0    33     0   0 -60  0     0    12 -      WL   ??   0:00.02 (irq1: atkb
    0    36     0  34 171  0     0    12 pgzero DL   ??   0:00.47 (pagezero)
    0    37     0   2  -4  0     0    12 snaplk DL   ??   0:00.24 (bufdaemon)
    0    38     0   0  20  0     0    12 syncer DL   ??   0:00.01 (syncer)
    0    39     0   0  -4  0     0    12 vlruwt DL   ??   0:00.00 (vnlru)
    0    40     0   0   8  0     0    12 nfsidl IL   ??   0:00.00 (nfsiod 0)
    0    41     0   0   8  0     0    12 nfsidl IL   ??   0:00.00 (nfsiod 1)
    0    42     0   0   8  0     0    12 nfsidl IL   ??   0:00.00 (nfsiod 2)
    0    43     0   0   8  0     0    12 nfsidl IL   ??   0:00.00 (nfsiod 3)
    0   246     1   0  96  0  1172   736 select Ss   ??   0:00.03 /usr/sbin/sy
    0   267     1   0  96  0  1372  1016 select Ss   ??   0:00.03 /usr/sbin/rp
    0   350     1 155 115  0  1220   992 select Is   ??   0:00.01 /usr/sbin/mo
    0   353     1 112 110  0  1168   876 select Is   ??   0:00.13 nfsd: master
    0   355   353 155   4  0  1128   748 nfsd   I    ??   0:00.00 nfsd: server
    0   356   353 155   4  0  1128   748 nfsd   I    ??   0:00.00 nfsd: server
    0   357   353 155   4  0  1128   748 nfsd   I    ??   0:00.00 nfsd: server
    0   358   353 155   4  0  1128   748 nfsd   I    ??   0:00.00 nfsd: server
    0   374     1   0  96  0  1144   680 select Ss   ??   0:00.00 /usr/sbin/us
    0   394     1 154 115  0  1196   808 select Is   ??   0:00.01 /usr/sbin/lp
    0   454     1 153 115  0  3092  2200 select Is   ??   0:00.63 /usr/sbin/ss
    0   460     1   0  96  0  3092  2544 select Ss   ??   0:00.01 sendmail: ac
   25   463     1 153  20  0  2992  2500 pause  Is   ??   0:00.00 sendmail: Qu
    0   512     1   0   8  0  1236   956 nanslp Ss   ??   0:00.01 /usr/sbin/cr
    0   522     1   0   8  0  1532  1236 wait   Is   v0   0:00.05 login [pam]
    0   530   522   0  20  0  1504  1092 pause  S    v0   0:00.06 -csh (csh)
    0   544   530   0  96  0   664   444 -      R+   v0   0:00.00 ps axl
    0   523     1   0   5  0  1184   864 ttyin  Is+  v1   0:00.01 /usr/libexec
    0   524     1   0   5  0  1184   864 ttyin  Is+  v2   0:00.01 /usr/libexec
    0   525     1   0   5  0  1184   864 ttyin  Is+  v3   0:00.01 /usr/libexec
    0   526     1   0   5  0  1184   864 ttyin  Is+  v4   0:00.01 /usr/libexec
    0   527     1   0   5  0  1184   864 ttyin  Is+  v5   0:00.01 /usr/libexec
    0   528     1   0   5  0  1184   864 ttyin  Is+  v6   0:00.01 /usr/libexec
    0   529     1   0   5  0  1184   864 ttyin  Is+  v7   0:00.01 /usr/libexec
    0   516     1 153   8  4   248   140 wait   IN   con- 0:00.01 fsck -B -p
    0   517     1 153  -8  0  1112   564 piperd I    con- 0:00.00 logger -p da
    0   521   516   2  -8  4   632   376 getbuf DN   con- 0:01.19 fsck_ufs -p

After turning off bgfsck in rc.conf the system rebooted using fgfsck without
further problems. Here is the dmesg from that boot:

Copyright (c) 1992-2003 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 5.0-RELEASE #1: Mon Jan 27 17:43:54 CET 2003
    root@comet.pmp.uni-hannover.de:/usr/obj/usr/src/sys/COMET
Preloaded elf kernel "/boot/kernel/kernel" at 0xc0503000.
Preloaded elf module "/boot/kernel/acpi.ko" at 0xc05030a8.
Timecounter "i8254" frequency 1193182 Hz
Timecounter "TSC" frequency 930689508 Hz
CPU: VIA C3 Samuel 2 (930.69-MHz 686-class CPU)
  Origin = "CentaurHauls"  Id = 0x67a  Stepping = 10
  Features=0x803035
real memory  = 125763584 (119 MB)
avail memory = 116744192 (111 MB)
Initializing GEOMetry subsystem
npx0: on motherboard
npx0: INT 16 interface
acpi0: on motherboard
ACPI-0625: *** Info: GPE Block0 defined as GPE0 to GPE15
Using $PIR table, 6 entries at 0xc00fdcf0
acpi0: power button is handled as a fixed feature programming model.
Timecounter "ACPI-safe" frequency 3579545 Hz
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x4008-0x400b on acpi0
acpi_cpu0: on acpi0
acpi_button0: on acpi0
acpi_button1: on acpi0
pcib0: port 0x6000-0x607f,0x5000-0x500f,0x4080-0x40ff,0x4000-0x407f,0xcf8-0xcff on acpi0
pci0: on pcib0
agp0: mem 0xea000000-0xea3fffff at device 0.0 on pci0
pcib1: at device 1.0 on pci0
pci1: on pcib1
pci1: at device 0.0 (no driver attached)
isab0: at device 7.0 on pci0
isa0: on isab0
atapci0: port 0xd000-0xd00f at device 7.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
uhci0: port 0xd400-0xd41f irq 11 at device 7.2 on pci0
usb0: on uhci0
usb0: USB revision 1.0
uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
ugen0: American Power Conversion Back-UPS 500 FW: 6.2.I USB FW: c1, rev 1.10/1.00, addr 2
uhci1: port 0xd800-0xd81f irq 11 at device 7.3 on pci0
usb1: on uhci1
usb1: USB revision 1.0
uhub1: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
pci0: at device 7.4 (no driver attached)
pcm0: port 0xe400-0xe403,0xe000-0xe003,0xdc00-0xdcff irq 12 at device 7.5 on pci0
fwohci0: port 0xe800-0xe87f mem 0xea400000-0xea4007ff irq 12 at device 11.0 on pci0
fwohci0: PCI bus latency was changing to 250.
fwohci0: OHCI version 1.0 (ROM=1)
fwohci0: No. of Isochronous channel is 8.
fwohci0: EUI64 00:30:1b:ab:00:00:55:b8
fwohci0: Phy 1394a available S400, 3 ports.
fwohci0: Link S400, max_rec 2048 bytes.
firewire0: on fwohci0
rl0: port 0xec00-0xecff mem 0xea401000-0xea4010ff irq 11 at device 12.0 on pci0
rl0: Realtek 8139B detected. Warning, this may be unstable in autoselect mode
rl0: Ethernet address: 00:30:1b:ab:55:54
miibus0: on rl0
rlphy0: on miibus0
rlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fdc0: port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
sio0 port 0x3f8-0x3ff irq 4 on acpi0
sio0: type 16550A
ppc0 port 0x778-0x77b,0x378-0x37f irq 7 drq 3 on acpi0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/8 bytes threshold
plip0: on ppbus0
lpt0: on ppbus0
lpt0: Interrupt-driven port
ppi0: on ppbus0
atkbdc0: port 0x64,0x60 irq 1 on acpi0
atkbd0: flags 0x1 irq 1 on atkbdc0
kbd0 at atkbd0
orm0: