Date: Sun, 18 Jan 2004 13:57:23 -0800 (PST)
From: Matthew Dillon
Message-Id: <200401182157.i0ILvNQe097287@apollo.backplane.com>
To: Scott Long
cc: freebsd-hackers@freebsd.org
cc: Paul Twohey
cc: Ruslan Ermilov
cc: scsi@freebsd.org
Subject: Re: [CHECKER] bugs in FreeBSD

:>    I know cam uses some helper threads so I am not entirely sure about
:>    the context the cam_sim_alloc() calls are being made in, but if they
:>    do not create I/O stalls for already-operational SCSI devices then I
:>    am inclined (in DFly anyway) to simply make the malloc in
:>    cam_sim_alloc() M_WAITOK.
:>
:>					-Matt
:>					Matthew Dillon
:>
:
:In the 4.x case, so long as the driver doesn't do an splcam() or somehow
:block hardware interrupts before calling cam_sim_alloc() you are
:probably fine.  For 5.x, you might run into Giant problems.
:
:Scott

    Well, I don't see how an spl or Giant could possibly have anything to
    do with memory deadlocks.  Both are dropped when a thread blocks, so
    the worst that happens is that you add some latency.  The culprit is
    almost guaranteed to be blocking in the interrupt threads themselves,
    or blocking in serialized multi-device-handling threads such as some
    of CAM's helper threads.  Blocking in either could deadlock the system
    in a low-memory situation.

    But what people seem to have done, using M_NOWAIT with very little
    regard for the side effects that occur when malloc() might then fail,
    is not the right solution.  If the CAM code cannot use a blocking
    malloc for a critical structure allocation then it certainly can't use
    a non-blocking malloc that might then fail as a workaround!  Some other
    solution is needed for those situations (something like the MPIPE
    solution I came up with to guarantee the availability of I/O request
    structures in interrupt service routines).

    What it comes down to for cam_sim_alloc() is, again, the context in
    which it is called.  Can it be called from a serialized CAM thread or
    an interrupt thread in a way that could potentially block I/O
    operations for devices other than the one trying to attach?  If so,
    then there's a real problem that needs to be solved.  If not, then
    M_WAITOK can be safely used in this particular situation and the NULL
    case no longer needs to be worried about.

					-Matt
					Matthew Dillon
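
    To illustrate the kind of alternative mentioned above for contexts
    that can neither block nor tolerate a failed allocation, here is a
    rough userland sketch of a preallocated request pool.  It is NOT the
    MPIPE interface and not kernel code: the names io_req, req_pool and
    req_pool_*() are hypothetical, calloc() merely stands in for a
    blocking allocation made at attach time, and locking is omitted.
    The only point is that all memory is obtained up front, in a context
    that may sleep, so the hot path never calls the allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical request structure standing in for a per-I/O control block. */
struct io_req {
	struct io_req	*next;	/* free-list link, used only while pooled */
	int		 tag;
};

/* Hypothetical fixed-size pool; all memory is obtained once, up front. */
struct req_pool {
	struct io_req	*slab;		/* backing array from init time */
	struct io_req	*free_head;	/* singly linked free list */
	size_t		 nfree;		/* entries currently available */
};

/*
 * Set up the pool.  Assumed to run in a context that may sleep, which
 * is where a kernel would make its one blocking allocation.
 * Returns 0 on success, -1 on failure.
 */
int
req_pool_init(struct req_pool *p, size_t count)
{
	size_t i;

	p->slab = NULL;
	p->free_head = NULL;
	p->nfree = 0;
	if (count == 0)
		return (-1);
	p->slab = calloc(count, sizeof(*p->slab));
	if (p->slab == NULL)
		return (-1);
	for (i = 0; i < count; i++) {
		p->slab[i].next = p->free_head;
		p->free_head = &p->slab[i];
	}
	p->nfree = count;
	return (0);
}

/*
 * Take a request from the pool.  Never allocates and never blocks.
 * NULL means the pool is exhausted; the caller is expected to defer
 * the request until one is returned, not to treat it as a hard error.
 */
struct io_req *
req_pool_get(struct req_pool *p)
{
	struct io_req *r = p->free_head;

	if (r != NULL) {
		p->free_head = r->next;
		r->next = NULL;
		p->nfree--;
	}
	return (r);
}

/* Return a request to the pool once its I/O has completed. */
void
req_pool_put(struct req_pool *p, struct io_req *r)
{
	assert(r != NULL);
	r->next = p->free_head;
	p->free_head = r;
	p->nfree++;
}

/* Release the backing memory; no requests may be outstanding. */
void
req_pool_destroy(struct req_pool *p)
{
	free(p->slab);
	p->slab = NULL;
	p->free_head = NULL;
	p->nfree = 0;
}
```

    A driver-like caller would run req_pool_init() once while attaching,
    call req_pool_get()/req_pool_put() from its completion path, and
    queue work when the pool is empty.  Sizing the pool and protecting
    it against concurrent access are left out of this sketch.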