From: Frode Nordahl
Date: Thu, 29 Jan 2004 10:51:30 +0100
To: rory+FreeBSD@TrueStep.com
cc: freebsd-current@freebsd.org
Subject: Re: rpc.lockd(8) seg faults on 5.2-RELEASE
In-Reply-To: <3DC16400-517B-11D8-9CB2-0005028F6AEB@TrueStep.com>
References: <3DC16400-517B-11D8-9CB2-0005028F6AEB@TrueStep.com>

Hello,

I see the same problem here; I have posted a backtrace with debug symbols in PR bin/61718. It seldom crashes with a segfault, though. More often it just stops responding and starts to eat 30-50% CPU, so I have to kill it and restart it. (I have a small Perl script that does this for me.) It may be that it would crash if I let it run longer, but it causes severe havoc on my network, so I have to restart it as soon as possible to make things work again.

The server hosts home directories for about 400k users.
Client servers are Linux mail (POP/IMAP) front ends accessing ~/mbox or IMAP folders, web servers hosting home pages, etc.

I'm having a hard time reproducing the error, but I have set up a crash box and I'm trying as hard as I can to make it crash on demand. No luck yet :-/

When the problem first occurs, I often have to restart it several times before things start working again. This is probably because the client retries whatever action made the server crash in the first place.

Regards,
Frode

On Jan 28, 2004, at 11:17, Rory Arms wrote:

> -current developers,
>
> I've noticed, since upgrading from 5.1-RELEASE-p11 to 5.2-RELEASE a
> few weeks ago, rpc.lockd has been crashing repeatedly, though
> randomly. All the clients are MacOS X 10.3 machines and one 10.2.
> Though, I think it wasn't till 10.3, that client NFS locking was
> finally supported. It looks like they request locks for certain
> operations, such as when reading the address book file. It definitely
> had something to do with the new version, as it started occurring
> after the upgrade, with no other changes on the network.
>
> Here are the machine's specs:
>
> > cat /var/run/dmesg.boot
> Copyright (c) 1992-2004 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993,
> 1994
> The Regents of the University of California. All rights
> reserved.
> FreeBSD 5.2-RELEASE #3: Mon Jan 12 13:56:46 EST 2004
> Preloaded elf kernel "/boot/kernel/kernel" at 0xc083b000.
> Preloaded elf module "/boot/kernel/acpi.ko" at 0xc083b244.
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: Pentium II/Pentium II Xeon/Celeron (375.04-MHz 686-class CPU)
> Origin = "GenuineIntel" Id = 0x652 Stepping = 2
>
> Features=0x183fbff ,MCA,CMOV,PAT,PSE36,MMX,FXSR>
> real memory = 536739840 (511 MB)
> avail memory = 511721472 (488 MB)
> ACPI APIC Table:
> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
> cpu0 (BSP): APIC ID: 0
> cpu1 (AP): APIC ID: 1
> ioapic0 irqs 0-23 on motherboard
> Pentium Pro MTRR support enabled
> npx0: [FAST]
> npx0: on motherboard
> npx0: INT 16 interface
> acpi0: on motherboard
> acpi0: Overriding SCI Interrupt from IRQ 9 to IRQ 20
> pcibios: BIOS version 2.10
> acpi0: Power Button (fixed)
> Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000
> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
> acpi_cpu0: on acpi0
> acpi_cpu1: on acpi0
> acpi_cpu1: Failed to attach throttling P_CNT
> pcib0: port 0xcf8-0xcff on acpi0
> pci0: on pcib0
> pcib1: at device 1.0 on pci0
> pci1: on pcib1
> isab0: at device 7.0 on pci0
> isa0: on isab0
> atapci0: port 0xffa0-0xffaf at device
> 7.1 on pci0
> ata0: at 0x1f0 irq 14 on atapci0
> ata0: [MPSAFE]
> ata1: at 0x170 irq 15 on atapci0
> ata1: [MPSAFE]
> pci0: at device 7.2 (no driver attached)
> pci0: at device 7.3 (no driver attached)
> pcib2: at device 16.0 on pci0
> pci2: on pcib2
> pcib2: slot 5 INTA is routed to irq 17
> fxp0: port 0xdf00-0xdf3f mem
> 0xfd500000-0xfd5fffff,0xfd6ff000-0xfd6fffff irq 17 at device 5.0 on
> pci2
> fxp0: Ethernet address 00:90:27:ee:02:97
> miibus0: on fxp0
> inphy0: on miibus0
> inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> fxp1: port 0xef40-0xef5f mem
> 0xfea00000-0xfeafffff,0xffaff000-0xffafffff irq 19 at device 17.0 on
> pci0
> fxp1: Ethernet address 00:e0:81:10:22:27
> miibus1: on fxp1
> inphy1: on miibus1
> inphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> ahc0: port 0xe400-0xe4ff mem
> 0xfebee000-0xfebeefff irq 16 at device 18.0 on pci0
> aic7895C: Ultra Wide Channel A, SCSI Id=7, 32/253 SCBs
> ahc1: port 0xe800-0xe8ff mem
> 0xfebef000-0xfebeffff irq 16 at device 18.1 on pci0
> aic7895C: Ultra Wide Channel B, SCSI Id=7, 32/253 SCBs
> ahc2: port 0xe000-0xe0ff
> mem 0xfebed000-0xfebedfff irq 16 at device 19.0 on pci0
> aic7850: Single Channel A, SCSI Id=7, 3/253 SCBs
> pci0: at device 20.0 (no driver attached)
> acpi_button0: on acpi0
> atkbdc0: port 0x64,0x60 irq 1 on acpi0
> atkbd0: flags 0x1 irq 1 on atkbdc0
> kbd0 at atkbd0
> fdc0: cmd 3 failed at out byte 1 of 3
> sio0 port 0x3f8-0x3ff irq 4 on acpi0
> sio0: type 16550A
> sio1 port 0x2f8-0x2ff irq 3 on acpi0
> sio1: type 16550A
> fdc0: cmd 3 failed at out byte 1 of 3
> orm0: