From owner-freebsd-current Thu May 21 20:26:23 1998
Return-Path:
Received: (from majordom@localhost)
          by hub.freebsd.org (8.8.8/8.8.8) id UAA14450
          for freebsd-current-outgoing; Thu, 21 May 1998 20:26:23 -0700 (PDT)
          (envelope-from owner-freebsd-current@FreeBSD.ORG)
Received: from freebie.lemis.com (freebie.lemis.com [139.130.136.133])
          by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id UAA14435
          for ; Thu, 21 May 1998 20:26:13 -0700 (PDT)
          (envelope-from grog@lemis.com)
Received: (from grog@localhost)
          by freebie.lemis.com (8.8.8/8.8.7) id MAA01643;
          Fri, 22 May 1998 12:56:11 +0930 (CST)
          (envelope-from grog)
Message-ID: <19980522125610.B27201@freebie.lemis.com>
Date: Fri, 22 May 1998 12:56:10 +0930
From: Greg Lehey
To: FreeBSD current users
Subject: NFS server mount problems in -current?
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 0.91.1i
WWW-Home-Page: http://www.lemis.com/~grog
Organization: LEMIS, PO Box 460, Echunga SA 5153, Australia
Phone: +61-8-8388-8286
Fax: +61-8-8388-8725
Mobile: +61-41-739-7062
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

I've had a lot of trouble mounting file systems on a system (freebie)
running a kernel supped on Sunday our time.  I have two other machines
in the network, liberty and razzia, running 2.2.6-RELEASE and
3.0-CURRENT respectively.  In each case, they hang during startup at
the mountall -t nfs -a stage.  I've done some investigation and
discovered:

1.  If I comment out the mountall, I can often mount the file systems
    individually from the command line.
    This doesn't always work:

    === root@liberty (/dev/ttyp0) / 15 -> grep freebie /etc/fstab |cut -f 1|sed 's:^:mount :'|sh -v
    mount freebie:/
    mount freebie:/home
    mount freebie:/usr
    mount freebie:/S
    (hang)

    After reboot:

    === root@liberty (/dev/ttyp0) / 2 -> grep freebie /etc/fstab |cut -f 1|sed 's:^:mount :;s:$:; sleep 10:'|sh -v
    mount freebie:/; sleep 10

    The mount process is unstoppable, and after such a hang occurs I
    can't log in to the machine any more: login hangs after entering
    the name (doesn't prompt for password), telnet hangs before the
    login: prompt.

2.  tcpdump shows that the requestor machine is refusing the reply
    from freebie:

    12:00:04.268191 liberty.lemis.com.4e30802f > freebie.lemis.com.nfs: 120 getattr [|nfs]
    12:00:04.268587 freebie.lemis.com.nfs > liberty.lemis.com.4e30802f: reply ok 112
    12:00:04.269174 liberty.lemis.com > freebie.lemis.com: icmp: liberty.lemis.com udp port 1021 unreachable
    12:00:18.165813 liberty.lemis.com.who > 192.109.197.255.who: udp 84
    12:00:36.290613 liberty.lemis.com.4e30802f > freebie.lemis.com.nfs: 120 getattr [|nfs]
    12:00:36.291149 freebie.lemis.com.nfs > liberty.lemis.com.4e30802f: reply ok 112
    12:00:36.291836 liberty.lemis.com > freebie.lemis.com: icmp: liberty.lemis.com udp port 1021 unreachable

    This seems strange, because it happens on two different machines
    in the same manner, and their software hasn't changed.  In
    addition, I can mount from another machine (allegro, running
    2.2.2) with no problems.  About the only thing that looks strange
    is the port number coming from liberty (4e30802f), but it's the
    same with allegro, and things look OK there:

    12:45:31.466896 liberty.lemis.com.4ed6d030 > allegro.lemis.com.nfs: 124 access [|nfs]
    12:45:31.467609 allegro.lemis.com.nfs > liberty.lemis.com.4ed6d030: reply ok 120 access [|nfs]
    12:45:31.468904 liberty.lemis.com.4ed6d031 > allegro.lemis.com.nfs: 128 lookup [|nfs]
    12:45:31.469697 allegro.lemis.com.nfs > liberty.lemis.com.4ed6d031: reply ok 236 lookup [|nfs]
    ...
    12:45:35.213746 liberty.lemis.com.4ed6d059 > allegro.lemis.com.nfs: 92 fsstat [|nfs]
    12:45:35.214456 allegro.lemis.com.nfs > liberty.lemis.com.4ed6d059: reply ok 168 fsstat [|nfs]
    12:45:35.215555 liberty.lemis.com.4ed6d05a > allegro.lemis.com.nfs: 92 fsinfo [|nfs]
    12:45:35.216188 allegro.lemis.com.nfs > liberty.lemis.com.4ed6d05a: reply ok 164 fsinfo [|nfs]
    12:45:35.216931 liberty.lemis.com.4ed6d05b > allegro.lemis.com.nfs: 92 fsstat [|nfs]
    12:45:35.217607 allegro.lemis.com.nfs > liberty.lemis.com.4ed6d05b: reply ok 168 fsstat [|nfs]
    12:46:00.851969 liberty.lemis.com.4ed6d001 > freebie.lemis.com.nfs: 120 getattr [|nfs]
    12:46:00.852394 freebie.lemis.com.nfs > liberty.lemis.com.4ed6d001: reply ok 112
    12:46:00.853041 liberty.lemis.com > freebie.lemis.com: icmp: liberty.lemis.com udp port 1022 unreachable

3.  As the examples above show, the problem is not 100% reproducible.
    The first example failed on the fourth file system, so I put in a
    sleep in case there was some timing problem, but the subsequent
    ones failed on the first file system.

I can't recall seeing anything in -current about this.  I'm currently
supping again, and will report if nobody else comes up with some
bright ideas.

Greg
--
See complete headers for address and phone numbers
finger grog@lemis.com for PGP public key
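
For anyone who wants to repeat the one-at-a-time mount test from item 1,
the grep/cut/sed pipeline can be wrapped in a small shell function.  This
is only a sketch: the name nfs_mount_cmds and its arguments are made up
for illustration, and it just prints the commands instead of running
them, so a hanging mount can't take the shell with it until you decide to
pipe the output to sh.

```shell
#!/bin/sh
# Print one "mount" command per fstab entry served by a given NFS server.
# Hypothetical helper: the name and arguments are illustrative only.
#   $1  server name as it appears in the first fstab field (e.g. freebie)
#   $2  seconds to sleep after each mount (default 0 = no sleep)
#   $3  fstab path (default /etc/fstab)
nfs_mount_cmds() {
    server=$1
    delay=${2:-0}
    fstab=${3:-/etc/fstab}
    awk -v server="$server" -v delay="$delay" '
        # Match only entries whose first field starts with "server:".
        # Commented-out entries begin with "#" and so never match.
        index($1, server ":") == 1 {
            if (delay > 0)
                printf "mount %s; sleep %d\n", $1, delay
            else
                printf "mount %s\n", $1
        }
    ' "$fstab"
}
```

Running `nfs_mount_cmds freebie 10 | sh -v` should then behave like the
second pipeline in item 1, assuming the fstab entries are written as
server:/path in the first field.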