Date:      Fri, 7 Apr 2000 08:15:24 -0600 (MDT)
From:      durian@cellport.com
To:        FreeBSD-gnats-submit@freebsd.org
Subject:   kern/17844: amd wedges every morning (nfs related?)
Message-ID:  <200004071415.IAA30401@lostwax.cellport.com>

>Number:         17844
>Category:       kern
>Synopsis:       Amd wedges every morning since I've upgraded to 4.0
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Apr  7 07:20:01 PDT 2000
>Closed-Date:
>Last-Modified:
>Originator:     Mike Durian
>Release:        FreeBSD 4.0-STABLE i386
>Organization:
CellPort Systems
>Environment:

Copyright (c) 1992-2000 The FreeBSD Project.
Copyright (c) 1982, 1986, 1989, 1991, 1993
	The Regents of the University of California. All rights reserved.
FreeBSD 4.0-STABLE #1: Fri Mar 31 17:47:19 MST 2000
    root@lostwax.cellport.com:/usr/obj/usr/src/sys/LOSTWAX
Timecounter "i8254"  frequency 1193182 Hz
CPU: Pentium II/Pentium II Xeon/Celeron (451.03-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x652  Stepping = 2
  Features=0x183f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR>
real memory  = 134152192 (131008K bytes)
avail memory = 127074304 (124096K bytes)
Preloaded elf kernel "kernel" at 0xc034f000.
Pentium Pro MTRR support enabled
md0: Malloc disk
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <Intel 82443BX (440 BX) host to PCI bridge> on motherboard
pci0: <PCI bus> on pcib0
pcib1: <Intel 82443BX (440 BX) PCI-PCI (AGP) bridge> at device 1.0 on pci0
pci1: <PCI bus> on pcib1
pci1: <ATI Mach64-GB graphics accelerator> at 0.0
isab0: <Intel 82371AB PCI to ISA bridge> at device 7.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel PIIX4 ATA33 controller> port 0x10a0-0x10af at device 7.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
uhci0: <Intel 82371AB/EB (PIIX4) USB controller> port 0x1080-0x109f irq 9 at device 7.2 on pci0
usb0: <Intel 82371AB/EB (PIIX4) USB controller> on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
chip1: <Intel 82371AB Power management controller> port 0x2180-0x218f at device 7.3 on pci0
xl0: <3Com 3c905B-TX Fast Etherlink XL> port 0x1000-0x107f mem 0xf4000000-0xf400007f irq 11 at device 16.0 on pci0
xl0: Ethernet address: 00:00:39:a1:0a:0e
miibus0: <MII bus> on xl0
xlphy0: <3Com internal media interface> on miibus0
xlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
xl0: supplying EUI64: 00:00:39:ff:fe:a1:0a:0e
fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: model Generic PS/2 mouse, device ID 0
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: <System console> on isa0
sc0: VGA <16 virtual consoles, flags=0x200>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/8 bytes threshold
ppi0: <Parallel I/O> on ppbus0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
plip0: <PLIP network interface> on ppbus0
pca0 at port 0x40 on isa0
pcm0: <CS423x-PCI> at port 0x534-0x537,0x388-0x38b,0x220-0x22f irq 5 drq 1,0 on isa0
unknown0: <DISABLED> on isa0
unknown1: <CTRL> at port 0x120-0x127 on isa0
ad0: 8063MB <WDC AC28400R> [16383/16/63] at ata0-master using UDMA33
acd0: CDROM <LTN302> at ata1-master using PIO4
Mounting root from ufs:/dev/ad0s3a

>Description:

	Every morning since I upgraded to 4.0 (cvsupped on the 31st),
	amd is in a wedged state.  When I try to access my home directory
	via /u/durian (where /u is a symbolic link to /home/fserver - a
	directory managed by amd) amd reports that /u/durian isn't
	responding.  The actual NFS directory containing the homes is
	still mounted and I can access my home directory.  Just not via
	amd.  If I kill amd and restart it, things are good to go again.

	I think this is related to NFS as I also get messages saying
	the NFS server wasn't responding during the night:

Apr  6 22:38:51 lostwax /kernel: nfs server fserver:/export/home/fserver: not responding
Apr  6 22:47:12 lostwax /kernel: receive error 60 from nfs server fserver:/export/home/fserver
Apr  7 05:59:30 lostwax /kernel: nfs server fserver:/export/home/fserver: is alive again
Apr  7 06:11:02 lostwax /kernel: receive error 60 from nfs server fserver:/export/home/fserver

	I don't know what is going on at 6:00am, but there is a cron job
	running at 22:00 that does backups - including rdumps.

	So my gut feeling is that NFS goes out to lunch when the rdump
	is going on (a separate bug I suppose) and this confuses amd.
	NFS eventually comes back, but not in time for amd.
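
	To make the timing concrete, here is a rough sketch that
	reads the four kernel messages quoted above and works out how
	long the server was reported as unreachable and how long after
	the backup job that began.  The year (2000, from the
	Arrival-Date) and the assumption that the cron job started at
	exactly 22:00 on Apr 6 are mine; the helper names are made up
	for illustration.  Error 60 on FreeBSD is ETIMEDOUT
	("Operation timed out").

```python
from datetime import datetime

# Kernel messages copied from this report, with the wrapped lines rejoined.
LOG = """\
Apr  6 22:38:51 lostwax /kernel: nfs server fserver:/export/home/fserver: not responding
Apr  6 22:47:12 lostwax /kernel: receive error 60 from nfs server fserver:/export/home/fserver
Apr  7 05:59:30 lostwax /kernel: nfs server fserver:/export/home/fserver: is alive again
Apr  7 06:11:02 lostwax /kernel: receive error 60 from nfs server fserver:/export/home/fserver
"""

YEAR = 2000  # assumption: syslog omits the year; taken from the Arrival-Date
BACKUP_START = datetime(YEAR, 4, 6, 22, 0, 0)  # assumption: cron fired at 22:00 sharp


def parse_time(line):
    """Return the timestamp at the start of one syslog line."""
    month, day, clock = line.split()[:3]
    return datetime.strptime(f"{YEAR} {month} {day} {clock}", "%Y %b %d %H:%M:%S")


def outage_windows(log):
    """Pair each 'not responding' message with the next 'is alive again'."""
    windows, start = [], None
    for line in log.splitlines():
        if line.endswith("not responding"):
            start = parse_time(line)
        elif line.endswith("is alive again") and start is not None:
            windows.append((start, parse_time(line)))
            start = None
    return windows


for began, ended in outage_windows(LOG):
    print("outage began    :", began)
    print("outage ended    :", ended)
    print("outage length   :", ended - began)
    print("after backup job:", began - BACKUP_START)
```

	With these messages the single outage window runs from
	22:38:51 to 05:59:30, a little over seven hours, and starts
	about 39 minutes after the assumed backup start.  That fits
	the rdump theory but does not prove it, and it says nothing
	about the second error 60 at 06:11:02.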

	I do have a number of warnings from amd at start-up.  Perhaps that
	is the problem.  Here are our amd_flags:

	amd_enable="YES"
	amd_flags="-a /export           \
	        -d cellport.com         \
	        -c 1800                 \
	        -l syslog               \
	        -r                      \
	        -x fatal,error,user     \
	        /home /etc/amd.home     \
	        /amd /etc/amd.amd"

	/etc/amd.home:
	# amd.home

	/defaults       -type:=nfs;rfs:=${autodir}${path};rhost:=${key};\
	                fs:=${autodir}${path};\
	                opts:=rw,grpid,timeo=10,retrans=5,nosuid,intr,\
	                utimeout=1200,rsize=8192,wsize=8192
        
	fserver         host==${key};type:=link || \
	                        ;
	lostwax         host==${key};fs:=/usr/home;type:=link || \
	                rhost:=${key};rfs:=/usr/home
	                        ;


	/etc/amd.amd:
	# amd.amd

	/defaults	-type:=nfs;rfs:=${autodir}${path};rhost:=${key};\
			fs:=${autodir}${path};\
			opts:=rw,grpid,timeo=10,retrans=5,nosuid,intr,\
			utimeout=1200,rsize=8192,wsize=8192

	applix		host==lostwax;fs:=/disk2/applix;type:=link || \
			rhost:=lostwax;rfs:=/disk2/applix
				;
	distfiles	host==fserver;fs:=/usr/ports/distfiles;type:=link || \
			rhost:=fserver;rfs:=/usr/ports/distfiles
				;
	src		host==fserver;fs:=/usr/local/src;type:=link || \
			rhost:=fserver;rfs:=/usr/local/src
				;
	doc		host==fserver;fs:=/usr/local/www/doc;type:=link || \
			rhost:=fserver;rfs:=/usr/local/www/doc
				;


	Please let me know if you would like system configuration
	details on our server.  Lostwax is my desktop machine.


>How-To-Repeat:

	I'm not sure how easily this can be reproduced outside our
	environment, though it is consistent here.  I guess you need to
	set up NFS, amd and some backup cron job much like what we
	have here.

>Fix:

	No known fix.


>Release-Note:
>Audit-Trail:
>Unformatted:

