Date:      Sun, 9 Jan 2011 01:06:01 +0300
From:      Lev Serebryakov <lev@FreeBSD.org>
To:        Kostik Belousov <kostikbel@gmail.com>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: 8.2-PRERELEASE: live deadlock, almost all processes in "pfault" state
Message-ID:  <959936032.20110109010601@serebryakov.spb.ru>
In-Reply-To: <20110108202028.GY12599@deviant.kiev.zoral.com.ua>
References:  <204344488.20110108214457@serebryakov.spb.ru> <20110108190232.GU12599@deviant.kiev.zoral.com.ua> <1792026896.20110108222909@serebryakov.spb.ru> <20110108195613.GW12599@deviant.kiev.zoral.com.ua> <1544327450.20110108231021@serebryakov.spb.ru> <20110108202028.GY12599@deviant.kiev.zoral.com.ua>

Hello, Kostik.
You wrote on 8 January 2011 at 23:20:28:

>> >>   And, if it is "classic deadlock" is here any "classical" solution to
>> >> it?
>> > Do not allocate during bio processing.
>>  So, if GEOM need some cache, it needs pre-allocate it and implements
>> custom allocator over allocated chunk? :(
>>
>>  And what is "bio processing" in this context? geom_raid5 puts all
> bio processing == whole time needed to finish pageout. Pageout is
> often performed to clean the page to lower the page shortage.
> If pageout requires more free pages to finish during the shortage,
> then we get the deadlock.
  Ok, and transmission mmap()s files on geom_raid5, so when these pages
are paged out, geom_raid5 asks for other pages, and there are no
free ones... I see. It seems that the M_NOWAIT flag should help, if
geom_raid5 could live with failed mallocs...

> Also, it seems that you allocate not only bios (small objects, not
> every request cause page allocation), but also the huge buffers, that
> require free pages each time.
  Yes, in the worst case RAID5 needs a lot of additional memory to
 perform a simple write. If it is a lone write (geom_raid5 waits some
 time for writes in adjacent areas, but not forever), geom_raid5 needs
 to read (number of disks - 1) x (size of write) bytes of data to
 re-calculate the checksum. And it needs buffers for this data. The
 worst case for a 5-disk RAID5 and a 128KiB write will be
 4 x 128KiB = 512KiB of buffers. For one 128KiB write. And I don't
 understand how to avoid deadlock here :(
 Maybe preallocate some memory at start (these 512KiB) and try to
 use it when malloc() fails...

  I need to look at how raid3 and vinum/raid5 live with that situation.

-- 
// Black Lion AKA Lev Serebryakov <lev@FreeBSD.org>



