From owner-freebsd-hackers@FreeBSD.ORG Thu Oct 15 21:16:30 2009
Message-ID: <4AD79126.8020104@cs.duke.edu>
Date: Thu, 15 Oct 2009 17:16:22 -0400
From: Andrew Gallatin
To: freebsd-hackers@freebsd.org
Subject: namei (via firmware_get(9)) from taskq in 7.x
List-Id: Technical Discussions relating to FreeBSD

Hi,

I'm trying to re-initialize a NIC which uses firmware(9) after a
hardware fault.
As part of the process, I need to re-load the firmware using
firmware_get(). If the firmware kld is not resident, then the
machine will panic like this:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x20
fault code              = supervisor read data, page not present
instruction pointer     = 0x8:0xffffffff805b05d4
stack pointer           = 0x10:0xffffff8000080460
frame pointer           = 0x10:0xffffff8000080510
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 21 (swi5: +)
[thread pid 21 tid 100021 ]
Stopped at      namei+0x174:    movq    0x20(%rbx),%rax
db> bt
Tracing pid 21 tid 100021 td 0xffffff00013c3ae0
namei() at namei+0x174
vn_open_cred() at vn_open_cred+0x3a4
linker_load_module() at linker_load_module+0x1f2
linker_reference_module() at linker_reference_module+0xae
firmware_get() at firmware_get+0x136
mxge_load_firmware() at mxge_load_firmware+0x2d
mxge_watchdog_task() at mxge_watchdog_task+0x2f6
taskqueue_run() at taskqueue_run+0x9d
ithread_loop() at ithread_loop+0x17d
fork_exit() at fork_exit+0x11f
fork_trampoline() at fork_trampoline+0xe

Looking at it in gdb, it seems like the problem is that namei is
trying to use ndp->ni_cnd.cn_thread->td_proc->p_fd->fd_cdir
which is null in this context.

Can somebody tell me what kernel context it is safe to call
firmware_get() (and hence namei) from? Is there a safe way to do it
from a taskq?

FWIW, this seems to work fine (even from a callout context) in 8
and higher. It is only 7 and earlier where I'm having this problem.

Thanks,

Drew
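
As a reading aid for the diagnosis above, here is a minimal user-space sketch of the pointer chain td_proc->p_fd->fd_cdir. It is not kernel code: the mock_* structs are simplified stand-ins invented for illustration, with only the one field each that the chain needs, and mock_thread_has_cwd() is a hypothetical helper, not a FreeBSD API. It only shows that a path lookup relying on this chain needs every link to be set, which may not hold for a thread whose process has no file descriptor table.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Mock, heavily simplified stand-ins for the kernel structures named in
 * the message. The real structures have many more fields and different
 * layouts; these exist only to model the pointer chain.
 */
struct mock_vnode { int unused; };
struct mock_filedesc { struct mock_vnode *fd_cdir; };   /* current directory */
struct mock_proc { struct mock_filedesc *p_fd; };       /* fd table, may be NULL */
struct mock_thread { struct mock_proc *td_proc; };

/*
 * Hypothetical helper: returns true only if every link from the thread
 * down to fd_cdir is non-NULL, i.e. a lookup that starts from the
 * current directory would have something to start from.
 */
static bool
mock_thread_has_cwd(const struct mock_thread *td)
{
	if (td == NULL || td->td_proc == NULL)
		return (false);
	if (td->td_proc->p_fd == NULL)
		return (false);
	return (td->td_proc->p_fd->fd_cdir != NULL);
}
```

In this model, a thread whose mock_proc has p_fd == NULL fails the check, which corresponds to the kind of context the backtrace above appears to be running in. Whether a given real kernel context satisfies the equivalent condition is exactly the open question of this message.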