Date:      Thu, 21 Nov 2019 19:10:23 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 242146] nfs root mount may loop endlessly without hint on console
Message-ID:  <bug-242146-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=242146

            Bug ID: 242146
           Summary: nfs root mount may loop endlessly without hint on
                    console
           Product: Base System
           Version: CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: bz@FreeBSD.org
                CC: rmacklem@FreeBSD.org

If we are trying to mount the root file system over NFS and cannot establish a
connection, we never give up.  TCP/2049 packets arrive at the server and an RST
comes back.

The reason appears to be in newnfs_request(), which seems to jump back to
tryagain; at least that is my guess for the loop, as I couldn't spot it earlier
and socreate() is part of the loop:

   1599 XXX-BZ socreate:553
   1600 XXX-BZ tcp_usr_attach:155 fff 4
   1601 XXX-BZ tcp_usr_attach:161
   1602 XXX-BZ tcp_usr_attach:171 error 0
   1603 XXX-BZ socreate:553
   1604 XXX-BZ tcp_usr_attach:155 fff 5
   1605 XXX-BZ tcp_usr_attach:161
   1606 XXX-BZ tcp_usr_attach:171 error 0
   1607 XXX-BZ socreate:553
   1608 XXX-BZ tcp_usr_attach:155 fff 6
   1609 XXX-BZ tcp_usr_attach:161
   1610 XXX-BZ tcp_usr_attach:171 error 0
   1611 XXX-BZ socreate:553
   1612 XXX-BZ tcp_usr_attach:155 fff 7
   1613 XXX-BZ tcp_usr_attach:161
   1614 XXX-BZ tcp_usr_attach:171 error 0

I added a panic that fires once I pass through there 15 times, to get a
backtrace.

panic() at panic+0x43/frame 0xfffffe00acf58b70
tcp_usr_attach() at tcp_usr_attach+0x2b7/frame 0xfffffe00acf58be0
socreate() at socreate+0x1ce/frame 0xfffffe00acf58c30
__rpc_nconf2socket() at __rpc_nconf2socket+0x3f/frame 0xfffffe00acf58c60
clnt_reconnect_call() at clnt_reconnect_call+0x3b6/frame 0xfffffe00acf58d10
newnfs_request() at newnfs_request+0x90b/frame 0xfffffe00acf58e80
nfsrpc_getattrnovp() at nfsrpc_getattrnovp+0xeb/frame 0xfffffe00acf59020
mountnfs() at mountnfs+0x6b6/frame 0xfffffe00acf591c0
nfs_mount() at nfs_mount+0x11d3/frame 0xfffffe00acf59500
vfs_mount_sigdefer() at vfs_mount_sigdefer+0x24/frame 0xfffffe00acf59520
vfs_domount() at vfs_domount+0x7f9/frame 0xfffffe00acf59750
vfs_donmount() at vfs_donmount+0x911/frame 0xfffffe00acf597f0
kernel_mount() at kernel_mount+0x57/frame 0xfffffe00acf59840
parse_mount() at parse_mount+0x4a1/frame 0xfffffe00acf59990
vfs_mountroot() at vfs_mountroot+0x53b/frame 0xfffffe00acf59b10
start_init() at start_init+0x28/frame 0xfffffe00acf59bb0


I would suggest that we really time out and return an error after <n> retries,
and possibly reboot or fail mountroot or whatever is appropriate, rather than
being stuck in a loop without letting the user know that "NFS server not
reachable: connection refused".
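
To illustrate the suggestion, here is a minimal userland sketch of a bounded
retry that reports each failure.  It is not the kernel code path and not a
proposed patch: connect_bounded(), connect_fn and always_refused() are
hypothetical names made up for this example, and the stub merely stands in for
a server that answers every connection attempt with an RST.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback: one connection attempt; returns 0 or an errno value. */
typedef int (*connect_fn)(void *arg);

/*
 * Try to connect at most max_attempts times.  Every failure is reported,
 * and the last error is returned to the caller instead of looping forever.
 */
static int
connect_bounded(connect_fn try_connect, void *arg, int max_attempts)
{
	int error = 0;

	assert(try_connect != NULL && max_attempts > 0);
	for (int attempt = 1; attempt <= max_attempts; attempt++) {
		error = try_connect(arg);
		if (error == 0)
			return (0);
		/* Tell the user what is going on rather than retrying silently. */
		fprintf(stderr,
		    "NFS server not reachable: %s (attempt %d of %d)\n",
		    strerror(error), attempt, max_attempts);
	}
	return (error);
}

/* Stub for a server that refuses every attempt; counts how often it is called. */
static int
always_refused(void *arg)
{
	int *calls = arg;

	(*calls)++;
	return (ECONNREFUSED);
}
```

Calling connect_bounded(always_refused, &calls, 3) with calls initialised to 0
makes three attempts, prints one line per failure, and returns ECONNREFUSED,
so the caller can decide whether to fail mountroot or reboot.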

-- 
You are receiving this mail because:
You are the assignee for the bug.


