From: Daniel Feenberg
To: freebsd-questions@freebsd.org
Date: Fri, 11 Dec 2009 12:23:54 -0500 (EST)
Subject: Diskless boot fails when network card is reset before NFS root mount

Our dozen diskless FreeBSD 7.0 machines are all able to diskless boot just fine.
However, when we tried to set up a FreeBSD 8.0 root for them to boot from, the boot process loaded all the devices and then failed right after the line "NFS ROOT: ..."

We boot using pxeboot. pxeboot mounts our NFS root and runs the loader from /boot under it. After the beastie screen, the loader starts the kernel. All this works fine under both 7 and 8.

On both 7 and 8, we see messages of the form "em0: link state changed to down" and "em0: link state changed to up". They appear right before or after NFS ROOT. Then, on version 8, we see error messages about /devfs not being found, and eventually /sbin/init not being found.

We surmise that the kernel resets the Ethernet interface right before re-mounting the NFS root (note that the NFS root was already mounted once, before the beastie screen). On 8, the interface reset somehow interferes with the NFS mount, resulting in no root FS.

This problem seems to be referred to here:

http://lists.freebsd.org/pipermail/freebsd-net/2009-January/020666.html

Have others seen this issue? Is it a known bug? Is there a workaround or a fix? It seems to us, not being kernel hackers, a particularly difficult problem to get a handle on: execution is controlled by the kernel at that point, so there is no loader script or rc script into which one could insert debugging print statements.

A complete description of our diskless boot procedure, which has worked well on several prior versions of FreeBSD, is given at:

http://www.nber.org/sys-admin/FreeBSD-diskless.html

The network card is an Intel Pro/1000, which is well supported by FreeBSD.

 - Alex Aminoff
   BaseSpace.net
   National Bureau of Economic Research (nber.org)

 - Daniel Feenberg
   feenberg@nber.org
   National Bureau of Economic Research (nber.org)