From: Matthew Dillon
Date: Sun, 18 Jan 2004 14:38:26 -0800 (PST)
To: Scott Long, freebsd-hackers@freebsd.org, Paul Twohey, scsi@freebsd.org
Subject: Re: [CHECKER] bugs in FreeBSD
Message-Id: <200401182238.i0IMcQYZ097543@apollo.backplane.com>

Well, this is fun. There are over 460 files in the 5.x source tree (360 in DFly) that make calls to malloc(... M_NOWAIT), and so far about 80% of the calls that I've reviewed generate inappropriate side effects when/if a failure occurs. CAM is the biggest violator... it even has a few panic() conditionals if a malloc(... M_NOWAIT) fails. Not Fun!

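
To make "inappropriate side effects" concrete, here is a rough userland sketch of the pattern, not code from either tree. Every name in it (sk_alloc, SK_NOWAIT, sk_dev, sk_fail_nowait, the two sk_queue_* functions) is hypothetical and only stands in for a kernel-style non-blocking allocation; it assumes nothing about the real malloc(9) internals. The broken variant commits state before the allocation that can fail, while the safer variant allocates first and touches state only on success.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userland stand-in; NOT the kernel's malloc(9) interface. */
#define SK_NOWAIT 0x1        /* caller cannot sleep; allocation may fail */

static int sk_fail_nowait;   /* knob: force SK_NOWAIT allocations to fail */

void *
sk_alloc(size_t size, int flags)
{
	if ((flags & SK_NOWAIT) && sk_fail_nowait)
		return (NULL);
	return (malloc(size));
}

struct sk_dev {
	int   nqueued;           /* requests accepted so far */
	void *last_req;          /* most recently accepted request */
};

/* Broken: state is changed before the allocation that can fail. */
int
sk_queue_broken(struct sk_dev *dev, size_t len)
{
	void *req;

	assert(dev != NULL);
	dev->nqueued++;                  /* side effect happens first */
	req = sk_alloc(len, SK_NOWAIT);
	if (req == NULL)
		return (ENOMEM);         /* nqueued now counts a phantom request */
	free(dev->last_req);
	dev->last_req = req;
	return (0);
}

/* Safer: allocate first, commit state only after success. */
int
sk_queue_safe(struct sk_dev *dev, size_t len)
{
	void *req;

	assert(dev != NULL);
	req = sk_alloc(len, SK_NOWAIT);
	if (req == NULL)
		return (ENOMEM);         /* dev is left untouched */
	free(dev->last_req);
	dev->last_req = req;
	dev->nqueued++;
	return (0);
}
```

With sk_fail_nowait set to 1, sk_queue_safe() returns ENOMEM and leaves the counter alone, whereas sk_queue_broken() returns the same error but has already bumped nqueued. That mismatch between the error return and the state left behind is the kind of side effect being described.
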
The only reason it works at all is because M_NOWAIT actually does appear to allow malloc() to block in a number of situations (such as on VM object and map mutexes), and M_NOWAIT triggers VM_ALLOC_INTERRUPT which allows kmem_malloc() to dig into the free page reserve. So in 5.x M_NOWAIT allocations will actually work most of the time... well, at least until something exhausts the free page reserve at just the wrong time, which is quite possible to do considering how much code is being allowed to dig into the reserve now.

M_NOWAIT is being used pretty much as if it were M_WAITOK|M_USE_RESERVE most of the time, especially considering the side effect situation when such allocations fail. I don't think M_WAITOK|M_USE_RESERVE would be any less reliable, actually. It looks like the whole paradigm has shifted away from the original definition of M_NOWAIT to something that is more like a cross between M_NOWAIT, M_WAITOK, and M_USE_RESERVE.

This creates a conundrum for me. In DFly M_NOWAIT really means M_NOWAIT, so I am going to have to do something about all the improper M_NOWAIT use in the source base. I'm amazed we haven't had more crashes, but even in DFly M_NOWAIT failures due to, e.g., not being able to get the kernel_map lock non-blocking, do not occur all that often.

-Matt