From: Matthew Dillon
Date: Fri, 28 Nov 2003 12:57:30 -0800 (PST)
To: Soren Schmidt
cc: hackers@freebsd.org
Subject: Problems with use of M_NOWAIT in ATA
Message-Id: <200311282057.hASKvUOM002734@apollo.backplane.com>
References: <200311262104.hAQL4ICN024652@spider.deepcore.dk>

Soren, while fixing some issues in DFly related to the ATA driver I found a serious problem in your driver... actually, it appears to be in ata-ng -stable and -current as well. The problem is that you are using M_NOWAIT all over the place. M_NOWAIT allows malloc() to fail if/when resources are temporarily exhausted. In DFly this can occur more often because M_NOWAIT causes malloc() to return NULL if it cannot lock the kernel_map, but I believe malloc() can return NULL in -current and -stable as well (e.g. if the VM page free list is empty).
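To make the failure mode concrete, here is a small user-space model of a non-blocking allocation. It is only a sketch and not kernel code: `model_vm`, `model_malloc_nowait()`, `model_free()` and `model_alloc_request()` are hypothetical names invented for illustration, and the "free page" counter is a stand-in for whatever resource is temporarily exhausted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of the VM state: how many allocations can still succeed. */
struct model_vm {
    int free_pages;
};

/* Hypothetical stand-in for a driver request structure. */
struct model_request {
    const void *data;
    size_t len;
};

/*
 * Models a non-blocking allocation: it never sleeps, so when resources
 * are exhausted the only thing it can do is return NULL.
 */
void *model_malloc_nowait(struct model_vm *vm, size_t size)
{
    void *ptr;

    if (vm->free_pages == 0)
        return NULL;            /* temporarily out of memory */
    ptr = malloc(size);
    if (ptr != NULL)
        vm->free_pages--;
    return ptr;
}

/* Returns the memory and makes one "page" available again. */
void model_free(struct model_vm *vm, void *ptr)
{
    assert(ptr != NULL);
    free(ptr);
    vm->free_pages++;
}

/*
 * Allocates a request without blocking. A NULL return means the I/O was
 * NOT issued; a caller that ignores this silently loses the write.
 */
struct model_request *model_alloc_request(struct model_vm *vm,
                                          const void *data, size_t len)
{
    struct model_request *rq = model_malloc_nowait(vm, sizeof(*rq));

    if (rq == NULL)
        return NULL;
    rq->data = data;
    rq->len = len;
    return rq;
}
```

In this model, a second call to `model_alloc_request()` fails while the only "page" is in use and succeeds again once `model_free()` has run. The point is that every non-blocking allocation site needs a recovery path for the NULL case.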
Also, at least in the 4.8 ATA driver, the driver tries to revert to PIO mode if it cannot allocate a DMA buffer. This is bad because PIO might not necessarily work. It just isn't appropriate to revert to PIO mode under any circumstances other than a hardware DMA failure.

In -current you are using bus DMA tags and I don't know whether that solves the issue, but even in -current you are allocating request structures with M_NOWAIT (in ata_alloc_request()) and that is just plain not right. You are virtually guaranteed to corrupt any filesystem trying to issue a write for which ata_alloc_request() fails. It *CANNOT* fail.

I know the problem is complicated somewhat by the fact that you cannot simply replace that code with a blocking wait without creating a deadlock situation. What you really need to do is create a request pipeline and block up-front to wait for an I/O operation to complete which frees up an existing request (e.g. before getting the tag). If you do not want to block in ad_start() you can place the original I/O buffer on a queue and reissue it when requests/DMA memory are freed up by completing I/Os.

I am going to be hacking this up in DFly and I'll email you a patch set when I'm done. This is just a heads-up that there are some serious issues in the ATA driver code related to memory allocation.

-Matt
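The queue-and-reissue alternative mentioned above can be sketched in user space roughly as follows. This is a minimal model under stated assumptions, not the ATA driver's code and not the promised patch: `POOL_SIZE`, `io_buf`, `request`, `req_pool`, `pool_init()`, `pool_start()` and `pool_complete()` are all hypothetical names, requests are preallocated so the start path never allocates, and locking and interrupt context are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 2                 /* hypothetical number of preallocated requests */

struct io_buf {                     /* stand-in for the upper layer's I/O buffer */
    int id;
    struct io_buf *next;            /* link used only while the buffer is deferred */
};

struct request {                    /* stand-in for a driver request structure */
    struct io_buf *buf;             /* buffer being serviced, NULL when free */
    struct request *next_free;      /* link used only while on the free list */
};

struct req_pool {
    struct request slots[POOL_SIZE];
    struct request *free_list;
    struct io_buf *defer_head;      /* FIFO of buffers waiting for a request */
    struct io_buf *defer_tail;
    int in_flight;
};

/* Put every preallocated request on the free list. */
void pool_init(struct req_pool *p)
{
    p->free_list = NULL;
    p->defer_head = p->defer_tail = NULL;
    p->in_flight = 0;
    for (int i = 0; i < POOL_SIZE; i++) {
        p->slots[i].buf = NULL;
        p->slots[i].next_free = p->free_list;
        p->free_list = &p->slots[i];
    }
}

/*
 * Start an I/O without allocating or blocking. Returns the request if the
 * I/O was issued now, or NULL if the buffer was queued for later. Either
 * way the buffer is not lost.
 */
struct request *pool_start(struct req_pool *p, struct io_buf *bp)
{
    struct request *rq = p->free_list;

    if (rq == NULL) {               /* no request available: defer the buffer */
        bp->next = NULL;
        if (p->defer_tail != NULL)
            p->defer_tail->next = bp;
        else
            p->defer_head = bp;
        p->defer_tail = bp;
        return NULL;
    }
    p->free_list = rq->next_free;
    rq->next_free = NULL;
    rq->buf = bp;
    p->in_flight++;
    return rq;
}

/*
 * Complete an I/O: release its request, then reissue the oldest deferred
 * buffer if there is one. Returns the reissued request, or NULL.
 */
struct request *pool_complete(struct req_pool *p, struct request *rq)
{
    struct io_buf *bp;

    assert(rq->buf != NULL && p->in_flight > 0);
    rq->buf = NULL;
    rq->next_free = p->free_list;
    p->free_list = rq;
    p->in_flight--;

    bp = p->defer_head;
    if (bp == NULL)
        return NULL;
    p->defer_head = bp->next;
    if (p->defer_head == NULL)
        p->defer_tail = NULL;
    bp->next = NULL;
    return pool_start(p, bp);
}
```

With `POOL_SIZE` set to 2, a third `pool_start()` call returns NULL and parks its buffer on the deferred queue; the next `pool_complete()` hands the freed request straight to that buffer. The design choice being illustrated is that exhaustion turns into a delay rather than a failed write.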