From owner-freebsd-stable Sun Mar 12 15:33:56 2000
Delivered-To: freebsd-stable@freebsd.org
Received: from everest.overx.com (everest.overx.com [63.93.29.10])
	by hub.freebsd.org (Postfix) with ESMTP id 47D3237BC18
	for ; Sun, 12 Mar 2000 15:33:48 -0800 (PST)
	(envelope-from dayton@overx.com)
Received: from polo.overx.com (polo.overx.com [63.93.29.12])
	by everest.overx.com (Postfix) with ESMTP
	id 23D7E2031; Sun, 12 Mar 2000 17:33:47 -0600 (CST)
Received: by polo.overx.com (Postfix, from userid 1001)
	id 846DF3F2F; Sun, 12 Mar 2000 17:34:23 -0600 (CST)
From: Soren Dayton
Reply-To: dayton+freebsd-stable@overx.com
To: "Charles N. Owens"
Cc: freebsd-stable@FreeBSD.ORG, Mike Squires , kingsled@enc.edu
Subject: Re: samba 2.0.6 crashing -stable
References: <868zzr6c15.fsf@polo.overx.com> <38C90BB4.3535B7AE@enc.edu>
Date: 12 Mar 2000 17:34:23 -0600
In-Reply-To: "Charles N. Owens"'s message of "Fri, 10 Mar 2000 09:50:28 -0500"
Message-ID: <86snxvvk3k.fsf@polo.overx.com>
Lines: 106
User-Agent: Gnus/5.070099 (Pterodactyl Gnus v0.99) XEmacs/21.1 (Bryce Canyon)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

"Charles N. Owens" writes:

> Soren Dayton wrote:
>
> > Hi,
> > I've been having a recurring problem with samba on my freebsd
> > machines.  It began when I was running 3.2, and it appears to still be
> > around with -stable.  (and samba 2.0.3, and now 2.0.6)
> > Basically, when copying large files, the FreeBSD file server appears
> > to lose all of its network connectivity.  It cannot be pinged, etc.
> > You also cannot log in as root because it hangs after you type in
> > `root' on console.  (which might be consistent with all networking
> > failing.)  I think, but have not verified this, that anything that does
> > not use NIS or some such thing will continue to work.
>
> Some questions:
>
> Is root a local or NIS user?

local.
But the machine is an NIS server (but this should have no bearing on
your question)

> How long does this connectivity loss last?  Does it go away by itself or
> do you have to force it somehow (e.g. reset)?

It doesn't seem to ever come back.  I've given it about 10-15 minutes.

> Where is the NIS server (ypserv) that you're binding to in this setup?  On
> the same box or on a different box?

Same box.  It also serves NFS, samba, and mail for a cluster of ~10 machines.

> I have a 3.2-stable box (PII-400, 128MB RAM) currently running samba 2.0.5
> and I'm wondering if I'm seeing the same thing.  When trying to access
> files via samba I'll encounter 30+ second delays for no apparent reason.
> I've not noticed that all net traffic stops but I'm going to start looking.

I also saw this with 3.2 (release) with samba 2.0.3.  Pl

> Are you thinking that some kind of subtle (or not so subtle) samba problem
> is somehow jamming the network interface... causing ypbind to cough...
> then while it is trying to sort things out (rebinding perhaps) samba is
> hung waiting for any pending user database lookups to complete?

I don't know.  When there's already a root shell on console I find
that I can't ping anyone on the network (note that if the shell is
there, I can do plenty of things, as long as I don't go through the
network stuff.  For example:

# ls

works, but with `-l' it doesn't.)

Also, the samba connection eats itself before other connections do.
So, for example, there will be no more progress on the copying, while a
shell (through ssh) will still work for another couple of seconds.

Also, a place that I worked at a year ago had a problem in which their
NFS servers (FreeBSD 3.1 at the time, I think) would eat themselves
under very high stress.  I don't know what the nic was.

> If this is what you're thinking then we have at least two paths to
> pursue:
>
>    * Find root cause
>    * Optimize NIS implementation to minimize effects of problem
>      (perhaps by making client a slave server to keep the client
>      binding and lookup traffic local)

> As far as the root cause is concerned I've got less to say.  Interestingly
> enough, I _also_ am using an Intel Pro 10/100 nic !  That makes three of
> us, doesn't it?  Following Mike Squires' lead, here's my fxp0 stats (from
> netstat -i)
>
> Name  Mtu   Network  Address            Ipkts     Ierrs  Opkts     Oerrs  Coll
> fxp0  1500           00.a0.c9.d6.54.7d  64352010  19     88144552  0      0

Name  Mtu   Network  Address            Ipkts    Ierrs  Opkts    Oerrs  Coll
fxp0  1500           00:90:27:8a:4c:36  4618995  31     6471420  0      1869745

(yes, I know.  I should worry about those collisions.  But my network
is getting switched soon...)

You know, I just started thinking about this, and you know what I
found?  When I induce the failure, there are a bunch of collisions.
Is there any chance that collisions are handled badly somewhere, either
in the card or in the driver?  And that it only shows up on networks
that are a little overloaded (like mine right now)?

> Similar pattern as far as errors go.  This may be completely meaningless,
> though.  I think some of the discussion between you two wasn't cc:'d to
> this list, so you may have covered this... but do you suspect the problem
> is somehow related to the nic?

Perhaps.  I'm sort of in the dark.

> I'll let you know what happens after I make the samba server a NIS slave
> server.

Thanks.  I'm utterly bewildered about what's happening.  And I have no
idea how I could help.

Soren
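
To put the two fxp0 rows quoted above in proportion, here is a minimal
Python sketch that turns a link-level `netstat -i` row into ratios.  It
is only an illustration: the names IfStats, parse_link_row, and ratio are
hypothetical helpers, and the parser assumes the Network column is empty,
as it is in the rows quoted in this message.

```python
from dataclasses import dataclass


@dataclass
class IfStats:
    """Counters from one link-level row of `netstat -i` (hypothetical helper)."""
    name: str
    mtu: int
    address: str
    ipkts: int
    ierrs: int
    opkts: int
    oerrs: int
    coll: int


def parse_link_row(row: str) -> IfStats:
    """Parse a row such as 'fxp0 1500 00:90:27:8a:4c:36 4618995 31 6471420 0 1869745'.

    Assumption: the Network column is empty, so the row has exactly 8 fields.
    """
    fields = row.split()
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}: {row!r}")
    name, mtu, address = fields[0], int(fields[1]), fields[2]
    ipkts, ierrs, opkts, oerrs, coll = (int(f) for f in fields[3:])
    return IfStats(name, mtu, address, ipkts, ierrs, opkts, oerrs, coll)


def ratio(part: int, whole: int) -> float:
    """Return part/whole, or 0.0 when the denominator is zero."""
    return part / whole if whole else 0.0


if __name__ == "__main__":
    rows = [
        "fxp0 1500 00.a0.c9.d6.54.7d 64352010 19 88144552 0 0",
        "fxp0 1500 00:90:27:8a:4c:36 4618995 31 6471420 0 1869745",
    ]
    for row in rows:
        s = parse_link_row(row)
        print(
            f"{s.address}: input errors per input packet "
            f"{ratio(s.ierrs, s.ipkts):.6%}, "
            f"collisions per output packet {ratio(s.coll, s.opkts):.2%}"
        )
```

For the second row this works out to roughly 0.29 collisions per output
packet, against none on the first row, while the input error ratios on
both interfaces are tiny.  That is consistent with the observation about
collisions, but it does not by itself show where they are mishandled.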