From owner-freebsd-bugs Fri Apr 7 7:20:13 2000
Delivered-To: freebsd-bugs@freebsd.org
Received: from freefall.freebsd.org (freefall.FreeBSD.ORG [204.216.27.21])
        by hub.freebsd.org (Postfix) with ESMTP id CD9FE37BC0B
        for ; Fri, 7 Apr 2000 07:20:07 -0700 (PDT)
        (envelope-from gnats@FreeBSD.org)
Received: (from gnats@localhost)
        by freefall.freebsd.org (8.9.3/8.9.2) id HAA52336;
        Fri, 7 Apr 2000 07:20:02 -0700 (PDT)
        (envelope-from gnats@FreeBSD.org)
Received: from mail.cellport.com (mail.cellport.com [205.240.1.197])
        by hub.freebsd.org (Postfix) with ESMTP id 9695137BA52
        for ; Fri, 7 Apr 2000 07:15:39 -0700 (PDT)
        (envelope-from durian@cellport.com)
Received: from lostwax.cellport.com (lostwax.cellport.com [10.10.11.3])
        by mail.cellport.com (8.9.3/8.9.3) with ESMTP id HAA49212
        for ; Fri, 7 Apr 2000 07:15:24 -0700 (MST)
Received: (from durian@localhost)
        by lostwax.cellport.com (8.9.3/8.9.3) id IAA30401;
        Fri, 7 Apr 2000 08:15:24 -0600 (MDT)
        (envelope-from durian@mail-i.cellport.com)
Message-Id: <200004071415.IAA30401@lostwax.cellport.com>
Date: Fri, 7 Apr 2000 08:15:24 -0600 (MDT)
From: durian@cellport.com
Reply-To: durian@cellport.com
To: FreeBSD-gnats-submit@freebsd.org
X-Send-Pr-Version: 3.2
Subject: kern/17844: amd wedges every morning (nfs related?)
Sender: owner-freebsd-bugs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

>Number: 17844
>Category: kern
>Synopsis: Amd wedges every morning since I've upgraded to 4.0
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Fri Apr 7 07:20:01 PDT 2000
>Closed-Date:
>Last-Modified:
>Originator: Mike Durian
>Release: FreeBSD 4.0-STABLE i386
>Organization: CellPort Systems
>Environment:

Copyright (c) 1992-2000 The FreeBSD Project.
Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California. All rights reserved.
FreeBSD 4.0-STABLE #1: Fri Mar 31 17:47:19 MST 2000
    root@lostwax.cellport.com:/usr/obj/usr/src/sys/LOSTWAX
Timecounter "i8254" frequency 1193182 Hz
CPU: Pentium II/Pentium II Xeon/Celeron (451.03-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0x652 Stepping = 2
  Features=0x183f9ff
real memory = 134152192 (131008K bytes)
avail memory = 127074304 (124096K bytes)
Preloaded elf kernel "kernel" at 0xc034f000.
Pentium Pro MTRR support enabled
md0: Malloc disk
npx0: on motherboard
npx0: INT 16 interface
pcib0: on motherboard
pci0: on pcib0
pcib1: at device 1.0 on pci0
pci1: on pcib1
pci1: at 0.0
isab0: at device 7.0 on pci0
isa0: on isab0
atapci0: port 0x10a0-0x10af at device 7.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
uhci0: port 0x1080-0x109f irq 9 at device 7.2 on pci0
usb0: on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
chip1: port 0x2180-0x218f at device 7.3 on pci0
xl0: <3Com 3c905B-TX Fast Etherlink XL> port 0x1000-0x107f mem 0xf4000000-0xf400007f irq 11 at device 16.0 on pci0
xl0: Ethernet address: 00:00:39:a1:0a:0e
miibus0: on xl0
xlphy0: <3Com internal media interface> on miibus0
xlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
xl0: supplying EUI64: 00:00:39:ff:fe:a1:0a:0e
fdc0: at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: at port 0x60,0x64 on isa0
atkbd0: irq 1 on atkbdc0
psm0: irq 12 on atkbdc0
psm0: model Generic PS/2 mouse, device ID 0
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: on isa0
sc0: VGA <16 virtual consoles, flags=0x200>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
ppc0: at port 0x378-0x37f irq 7 on isa0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/8 bytes threshold
ppi0: on ppbus0
lpt0: on ppbus0
lpt0: Interrupt-driven port
plip0: on ppbus0
pca0 at port 0x40 on isa0
pcm0: at port 0x534-0x537,0x388-0x38b,0x220-0x22f irq 5 drq 1,0 on isa0
unknown0: on isa0
unknown1: at port 0x120-0x127 on isa0
ad0: 8063MB [16383/16/63] at ata0-master using UDMA33
acd0: CDROM at ata1-master using PIO4
Mounting root from ufs:/dev/ad0s3a

>Description:

Every morning since I upgraded to 4.0 (cvsupped on the 31st), amd is
in a wedged state. When I try to access my home directory via /u/durian
(where /u is a symbolic link to /home/fserver - a directory managed by
amd), amd reports that /u/durian isn't responding. The actual NFS
directory containing the homes is still mounted and I can access my home
directory, just not via amd. If I kill amd and restart it, things are
good to go again.

I think this is related to NFS, as I also get messages saying the NFS
server wasn't responding during the night:

Apr 6 22:38:51 lostwax /kernel: nfs server fserver:/export/home/fserver: not responding
Apr 6 22:47:12 lostwax /kernel: receive error 60 from nfs server fserver:/export/home/fserver
Apr 7 05:59:30 lostwax /kernel: nfs server fserver:/export/home/fserver: is alive again
Apr 7 06:11:02 lostwax /kernel: receive error 60 from nfs server fserver:/export/home/fserver

I don't know what is going on at 6:00am, but there is a cron job running
at 22:00 that does backups - including rdumps. So my gut feeling is that
NFS goes out to lunch while the rdump is running (a separate bug, I
suppose) and this confuses amd. NFS eventually comes back, but not in
time for amd.

I do get a number of warnings from amd at start-up. Perhaps that is
the problem.
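The symptom described above (the amd-managed path hangs while the
underlying NFS mount still answers) could be checked each morning with a
small probe. What follows is only a sketch under assumptions: it uses
Python, which is not part of the setup described here, the function name
path_responds and the 5-second limit are made up for illustration, and
the paths to test are taken from the command line rather than guessed.

```python
import subprocess
import sys


def path_responds(path, timeout_s=5.0):
    """Return True only if stat(path) succeeds within timeout_s seconds.

    The stat runs in a child process so that a hung automount point or
    NFS mount blocks the child instead of this script.
    (path_responds and timeout_s are hypothetical names for this sketch.)
    """
    child = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.stat(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        child.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Best effort only: a child stuck on a non-interruptible mount may
        # not die, so we deliberately do not wait for it after the kill.
        child.kill()
        return False
    # A non-zero exit status means stat failed (for example, no such path).
    return child.returncode == 0


if __name__ == "__main__":
    # Report each path given on the command line; with no arguments,
    # nothing is printed.
    for p in sys.argv[1:]:
        state = "responds" if path_responds(p) else "NOT responding"
        print(f"{p}: {state}")
```

Run with /u/durian and the directly mounted NFS directory as arguments,
it would show whether only the amd-managed path is stuck.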
Here are our amd_flags:

amd_enable="YES"
amd_flags="-a /export \
        -d cellport.com \
        -c 1800 \
        -l syslog \
        -r \
        -x fatal,error,user \
        /home /etc/amd.home \
        /amd /etc/amd.amd

/etc/amd.home:

# amd.home
/defaults -type:=nfs;rfs:=${autodir}${path};rhost:=${key};\
        fs:=${autodir}${path};\
        opts:=rw,grpid,timeo=10,retrans=5,nosuid,intr,\
        utimeout=1200,rsize=8192,wsize=8192

fserver host==${key};type:=link || \
        ;

lostwax host==${key};fs:=/usr/home;type:=link || \
        rhost:=${key};rfs:=/usr/home ;

/etc/amd.amd:

# amd.amd
/defaults -type:=nfs;rfs:=${autodir}${path};rhost:=${key};\
        fs:=${autodir}${path};\
        opts:=rw,grpid,timeo=10,retrans=5,nosuid,intr,\
        utimeout=1200,rsize=8192,wsize=8192

applix host==lostwax;fs:=/disk2/applix;type:=link || \
        rhost:=lostwax;rfs:=/disk2/applix ;

distfiles host==fserver;fs:=/usr/ports/distfiles;type:=link || \
        rhost:=fserver;rfs:=/usr/ports/distfiles ;

src host==fserver;fs:=/usr/local/src;type:=link || \
        rhost:=fserver;rfs:=/usr/local/src ;

doc host==fserver;fs:=/usr/local/www/doc;type:=link || \
        rhost:=fserver;rfs:=/usr/local/www/doc ;

Please let me know if you would like system configuration details
on our server. Lostwax is my desktop machine.

>How-To-Repeat:

I'm not sure how easily this can be reproduced outside our environment,
though it is consistent here. I guess you need to set up NFS, amd and
some backup cron job much like what we have here.

>Fix:

No known fix.

>Release-Note:
>Audit-Trail:
>Unformatted:

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-bugs" in the body of the message