From: Chris
To: "FreeBSD Stable"
Date: Wed, 22 Nov 2006 05:49:12 +0000
Subject: sshfs/nfs cause server lockup
Message-ID: <3aaaa3a0611212149u21146180ra84503472a0336e3@mail.gmail.com>

On a few occasions, on different remote servers, I have had NFS cause servers to stop responding, so I stopped using it. All of the servers were running either 6.0-RELEASE, 6.1-RELEASE or 6-STABLE.

We recently discovered sshfs, which supports cross-platform mounting. The server is Linux, and I mounted it on a FreeBSD 6.1-RELEASE box that is up to date on the security branch. It worked fine for around 5 to 6 days. There were some problems with sshfs not picking up files that had been updated, but that wasn't compromising the stability of the FreeBSD server; I just remounted to keep up to date.

Then today the Linux server had network problems, so the sshfs mounts timed out. There are 2 dirs I mount. The first mounted fine, a bit slow but it connected, but when I ran the command to mount the 2nd dir the server stopped responding. My 2nd ssh terminal was still alive, so I tried to run top to see if sshfs was hanging or something, but when I hit enter top didn't run and the 2nd terminal froze. Note that neither terminal timed out, and an ircd running on the server also did not time out, but the box wasn't accepting any new requests. It was responding to pings fine.

I have a remote reboot facility on the box but no local access and no KVM/serial console facility; this is the case for all of my servers. I initially tried a soft reboot, which uses ctrl-alt-delete, but the pings kept replying, so I could see the reboot wasn't initiated, indicating some kind of console lockup as well. I then did a hard reboot, which brought the server back. All logs stopped when the first lockup occurred, so no errors etc. were recorded; bear in mind I have no local access to this machine.

It does appear that 6.x has some kind of serious remote mounting bug, because I never had these NFS problems in FreeBSD 5.x. I would be interested in any thoughts as to what could help me. I have rebooted the server now with the MPSAFE network stack disabled to see if this will help. It is using a GENERIC kernel with the following changes: options directio, polling, no adaptive mutexes, adaptive giant; IPv6 and NFS disabled. dmesg output below.
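
As a rough sketch only, the kernel changes and the MPSAFE setting described above would look something like the following. The option names are assumed from the shorthand in the list (they are standard FreeBSD 6.x option names, but the actual HEAVEN config file is not reproduced here), and the loader tunable is an assumed method of disabling the MPSAFE network stack; building with `options NET_WITH_GIANT` is the other usual way.

```
# Kernel config HEAVEN: GENERIC plus the changes listed above.
# Option names are assumptions based on the shorthand; the real file is not shown.
options 	DIRECTIO		# "directio"
options 	DEVICE_POLLING		# "polling"
options 	NO_ADAPTIVE_MUTEXES	# "no adaptive mutexes"
options 	ADAPTIVE_GIANT		# "adaptive giant"
#options 	INET6			# IPv6 disabled
#options 	NFSCLIENT		# NFS disabled
#options 	NFSSERVER
#options 	NFS_ROOT

# /boot/loader.conf (assumed way the MPSAFE network stack was turned off)
debug.mpsafenet="0"
```

With either method the kernel prints the "MPSAFE network stack disabled" warning at boot, which is the line to look for in the dmesg output.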

I left in the part of the reboot showing vnodes because it also looks suspicious that it took so long to sync the disks. This was from a working reboot; the remote reboot was of course an improper shutdown. The HD is SATA2 but dmesg shows it as ATA33.

Syncing disks, vnodes remaining...3 3 1 0 2 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 0 0 0 done
All buffers synced.
Copyright (c) 1992-2006 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD 6.1-RELEASE-p10 #1: Sat Nov 11 23:02:09 GMT 2006
    admin@heaven.chrysalisnet.org:/usr/obj/usr/src/sys/HEAVEN
WARNING: MPSAFE network stack disabled, expect reduced performance.
ACPI APIC Table:
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) 64 Processor 3800+ (2410.95-MHz 686-class CPU)
  Origin = "AuthenticAMD" Id = 0x40ff2 Stepping = 2
  Features=0x78bfbff
  Features2=0x2001
  AMD Features=0xea500800
  AMD Features2=0x1d,,CR8>
real memory = 939261952 (895 MB)
avail memory = 909828096 (867 MB)
ioapic0 irqs 0-23 on motherboard
kbd1 at kbdmux0
acpi0: on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x508-0x50b on acpi0
cpu0: on acpi0
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
pci0: at device 0.0 (no driver attached)
isab0: at device 1.0 on pci0
isa0: on isab0
pci0: at device 1.1 (no driver attached)
pci0: at device 1.2 (no driver attached)
pci0: at device 1.3 (no driver attached)
ohci0: mem 0xdfe7f000-0xdfe7ffff irq 21 at device 2.0 on pci0
ohci0: [GIANT-LOCKED]
usb0: OHCI version 1.0, legacy support
usb0: SMM does not respond, resetting
usb0: on ohci0
usb0: USB revision 1.0
uhub0: nVidia OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 10 ports with 10 removable, self powered
ehci0: mem 0xdfe7ec00-0xdfe7ecff irq 22 at device 2.1 on pci0
ehci0: [GIANT-LOCKED]
usb1: EHCI version 1.0
usb1: companion controller, 10 ports each: usb0
usb1: on ehci0
usb1: USB revision 2.0
uhub1: nVidia EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub1: 10 ports with 10 removable, self powered
pcib1: at device 4.0 on pci0
pci1: on pcib1
fxp0: port 0xec00-0xec3f mem 0xdffff000-0xdfffffff,0xdffc0000-0xdffdffff irq 16 at device 6.0 on pci1
miibus0: on fxp0
inphy0: on miibus0
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fxp0: Ethernet address: 00:02:b3:bf:b5:c9
fxp0: [GIANT-LOCKED]
pci0: at device 5.0 (no driver attached)
atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 6.0 on pci0
ata0: on atapci0
ata1: on atapci0
atapci1: port 0xd480-0xd487,0xd400-0xd403,0xd080-0xd087,0xd000-0xd003,0xcc00-0xcc0f mem 0xdfe7d000-0xdfe7dfff irq 20 at device 8.0 on pci0
ata2: on atapci1
ata3: on atapci1
atapci2: port 0xc880-0xc887,0xc800-0xc803,0xc480-0xc487,0xc400-0xc403,0xc080-0xc08f mem 0xdfe7c000-0xdfe7cfff irq 21 at device 8.1 on pci0
ata4: on atapci2
ata5: on atapci2
pcib2: at device 9.0 on pci0
pci2: on pcib2
pcib3: at device 11.0 on pci0
pci3: on pcib3
pcib4: at device 12.0 on pci0
pci4: on pcib4
pci0: at device 13.0 (no driver attached)
acpi_button0: on acpi0
fdc0: port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FAST]
ppc0: port 0x378-0x37f irq 7 on acpi0
ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode
ppbus0: on ppc0
plip0: on ppbus0
lpt0: on ppbus0
lpt0: Interrupt-driven port
ppi0: on ppbus0
atkbdc0: port 0x60,0x64 irq 1 on acpi0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
pmtimer0 on isa0
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounter "TSC" frequency 2410945801 Hz quality 800
Timecounters tick every 1.000 msec
ad4: 238475MB at ata2-master UDMA33
Trying to mount root from ufs:/dev/ad4s1a

Regards

Chris