From: Lev Serebryakov
Organization: FreeBSD Project
Date: Sun, 9 Jan 2011 01:06:01 +0300
To: Kostik Belousov
Cc: freebsd-stable@freebsd.org
Reply-To: lev@FreeBSD.org
Subject: Re: 8.2-PRERELEASE: live deadlock, almost all processes in "pfault" state
Message-ID: <959936032.20110109010601@serebryakov.spb.ru>
In-Reply-To: <20110108202028.GY12599@deviant.kiev.zoral.com.ua>

Hello, Kostik.

You wrote 8 January 2011, 23:20:28:

>> >> And, if it is "classic deadlock" is here any "classical" solution to
>> >> it?
>> > Do not allocate during bio processing.
>> So, if GEOM need some cache, it needs pre-allocate it and implements
>> custom allocator over allocated chunk? :(
>>
>> And what is "bio processing" in this context? geom_raid5 puts all
> bio processing == whole time needed to finish pageout. Pageout is
> often performed to clean the page to lower the page shortage.
> If pageout requires more free pages to finish during the shortage,
> then we get the deadlock.

  OK, and Transmission mmap()s files on geom_raid5, so when these pages
are paged out, geom_raid5 asks for more pages, and there are no free
ones... I see. It seems that the M_NOWAIT flag should help, if
geom_raid5 can live with failed mallocs...

> Also, it seems that you allocate not only bios (small objects, not
> every request cause page allocation), but also the huge buffers, that
> require free pages each time.

  Yes, in the worst case RAID5 needs a lot of additional memory to
perform a simple write. If it is a lone write (geom_raid5 waits some
time for writes to adjacent areas, but not forever), geom_raid5 needs
to read (number of disks - 1) x (size of write) bytes of data to
recalculate the checksum, and it needs buffers for this data. The worst
case for a 5-disk RAID5 and a 128 KiB write is 4 x 128 KiB = 512 KiB of
buffers, for a single 128 KiB write.

  And I don't understand how to avoid the deadlock here :( Maybe
preallocate some memory at start (these 512 KiB) and try to use it when
malloc() fails... I need to look at how raid3 and vinum/raid5 live with
that situation.

--
// Black Lion AKA Lev Serebryakov
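
The fallback idea from the message (preallocate the worst-case 512 KiB at start and use it only when a normal allocation fails) can be sketched in userland C roughly as follows. This is only an illustration under stated assumptions, not geom_raid5 code: every name here (worst_case_bytes, struct reserve, scratch_get, scratch_put, alloc_fail) is hypothetical, and the try_alloc callback merely stands in for a non-sleeping allocation such as a kernel malloc(9) call with M_NOWAIT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* All names below are hypothetical; this is not geom_raid5 code. */

/* Worst case for a lone write: (ndisks - 1) * write_size bytes. */
static size_t
worst_case_bytes(size_t ndisks, size_t write_size)
{
	assert(ndisks >= 2);
	return ((ndisks - 1) * write_size);
}

/* One emergency buffer, allocated once at start. */
struct reserve {
	void	*buf;
	size_t	 size;
	int	 busy;	/* non-zero while the reserve is handed out */
};

static int
reserve_init(struct reserve *r, size_t ndisks, size_t write_size)
{
	r->size = worst_case_bytes(ndisks, write_size);
	r->buf = malloc(r->size);
	r->busy = 0;
	return (r->buf != NULL ? 0 : -1);
}

/* Stub that simulates a memory shortage: the allocation always fails. */
static void *
alloc_fail(size_t len)
{
	(void)len;
	return (NULL);
}

/*
 * Try the ordinary allocator first (assumed not to sleep); fall back
 * to the reserve only if that fails. Returns NULL when the reserve is
 * already in use or too small, so the caller has to queue the request
 * and retry later instead of waiting for free pages.
 */
static void *
scratch_get(struct reserve *r, size_t len, void *(*try_alloc)(size_t))
{
	void *p;

	p = try_alloc(len);
	if (p != NULL)
		return (p);
	if (!r->busy && len <= r->size) {
		r->busy = 1;
		return (r->buf);
	}
	return (NULL);
}

/* Release a buffer obtained from scratch_get(). */
static void
scratch_put(struct reserve *r, void *p)
{
	if (p == r->buf)
		r->busy = 0;
	else
		free(p);
}
```

With a single reserve, only one write at a time can make progress during a shortage; the others get NULL and must be deferred. Whether that is acceptable for geom_raid5, and how raid3 or vinum/raid5 handle the same case, is exactly the open question in the message.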