Date: Thu, 9 Nov 2006 18:17:06 +0100 (CET)
From: Oliver Fromme <olli@lurza.secnetix.de>
To: freebsd-stable@FreeBSD.ORG
Subject: Trouble: NFS via TCP
Message-ID: <200611091717.kA9HH631005085@lurza.secnetix.de>

Hi,

I've got a very weird problem with NFS mounts on a RELENG_6 machine
(a.k.a. 6.2-PRERELEASE, sources synced yesterday, November 8th).
It's an HP Proliant DL360 G4 (G4p to be exact), but that shouldn't
matter. I've been banging my head on the table for several hours,
but I can't find the source of the problem. :-(

What I'm trying to do should be very simple: mounting an NFS
directory via TCP (instead of UDP, which is the default), like this:

   # mount_nfs -T -3 -R 3 -i -s -o ro 127.0.0.1:/localdisk /nfs/test

Symptom: As soon as I use the -T option (TCP) with the mount
command, it simply hangs forever. If I use the intr/soft flags,
I can Ctrl-C it after a while, and the mount indeed appears in the
output from "mount", but any command that tries to access it
(e.g. ls(1)) also hangs. Even umount(8) hangs.

More observations:

 - UDP works perfectly fine. No problems at all.
 - Other TCP connections besides NFS (e.g. ssh) work fine.
 - IPF is present, but disabled (ipf -D).
 - IPFW only contains the default "allow any to any" rule.
 - The interface doesn't matter. Mounting from localhost
   (via lo0) has the same problem as via a real NIC.
 - I first observed the problem on RELENG_6 of 2006-10-19
   (but it could be much older, because I haven't tried
   NFS-via-TCP on this machine before). Then I updated to
   2006-11-08, no change.
 - SMP or UP kernel doesn't make a difference.
 - No special compiler flags, make.conf is empty.
 - Kernel config is GENERIC with a few additions for more
   shared memory and semaphores (so Squid and PostgreSQL
   are happy) and some other unrelated details.
 - No suspicious things in dmesg. Kernel prints nothing
   during the mount attempts.
 - Output from rpcinfo -p looks good.
 - tcpdump shows that the TCP connection is immediately
   shut down: After connecting successfully, it sends a
   FIN, then reconnects, etc. ad infinitum. Meanwhile
   vfs.nfs.reconnects increases slowly.
 - On a different machine (different hardware, but same
   RELENG_6 and very similar kernel config), the problem
   does *NOT* occur. I compared sysctl variables relevant
   to nfs, rpc and tcp, and they're all the same. Also,
   rpcinfo -p is the same.

Now I'm running out of ideas ... Obviously there must be
something special with that machine, because it works fine on
a different machine, but I'm not able to find out what it is.

I even considered putting a few printf() calls into some places
in sys/nfsclient/nfs_socket.c to find out what's going on, but
I'm not sure if that makes sense and whether it will give any
useful results.

Any hints and ideas will be greatly appreciated.

Best regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing
Services with a focus on FreeBSD: http://www.secnetix.de/bsd
Any opinions expressed in this message may be personal to the author
and may not necessarily reflect the opinions of secnetix in any way.

"And believe me, as a C++ programmer, I don't hesitate to question
the decisions of language designers. After a decent amount of C++
exposure, Python's flaws seem ridiculously small."
        -- Ville Vainio
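
The tcpdump observation in the message (connect, then an immediate
FIN, then reconnect) could be cross-checked without involving the NFS
client at all, using a plain TCP probe against the server port. The
following is only a minimal sketch, not something from the original
message. It assumes the NFS server listens on TCP port 2049 (the usual
NFS port, which the message does not state), and the function name
probe_tcp and its return labels are hypothetical, chosen for
illustration.

```python
import socket


def probe_tcp(host, port, wait=2.0):
    """Connect to host:port and report what happens within `wait` seconds.

    Returns one of (labels are illustrative, not from the original message):
      "no-connect"  the TCP connection could not be established
      "closed"      the peer closed or reset the connection within `wait`
      "data"        the peer sent at least one byte
      "open"        the connection was still idle and open after `wait`
    """
    try:
        sock = socket.create_connection((host, port), timeout=wait)
    except OSError:
        return "no-connect"
    with sock:
        sock.settimeout(wait)
        try:
            data = sock.recv(1)
        except socket.timeout:
            # Nothing arrived and no EOF was seen: the peer kept the socket open.
            return "open"
        except OSError:
            # For example a connection reset by the peer.
            return "closed"
        # An empty read means the peer sent FIN (orderly shutdown).
        return "closed" if data == b"" else "data"


# Example call (port 2049 is an assumption, see above):
#   probe_tcp("127.0.0.1", 2049)
```

If such a probe reported "open" while mount_nfs -T still hung, that
would hint that the early FIN comes from the client side rather than
from the server. A "closed" result would point the other way. Either
reading is only a hint, since an idle probe does not speak RPC.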