Date: Thu, 29 Sep 2005 00:31:02 -0700 (PDT)
From: Don Lewis <truckman@FreeBSD.org>
To: phk@phk.freebsd.dk
Cc: current@FreeBSD.org
Subject: Re: stress test deadlock involving vm and/or geom
Message-ID: <200509290731.j8T7V2OR007019@gw.catspoiler.org>
In-Reply-To: <6785.1127975825@critter.freebsd.dk>
On 29 Sep, Poul-Henning Kamp wrote:
> In message <200509290214.j8T2EogR006609@gw.catspoiler.org>, Don Lewis writes:
>
>>Also, g_io_check() can only return ENOMEM if that is the provider error
>>status, but the provider seems to be happy:
>
> ENOMEM can also be returned by the code in the GEOM class.
Ok, that was obscure enough.
It looks like the
	uma_zalloc(biozone, M_NOWAIT | M_ZERO)
call in g_clone_bio() is the likely culprit. Lots of allocation
failures for this zone:
(kgdb) print *biozone
$2 = {uz_name = 0xc086938c "g_bio", uz_lock = 0xc10295a8, uz_keg = 0xc10295a0,
  uz_link = {le_next = 0x0, le_prev = 0xc10295d8}, uz_full_bucket = {
    lh_first = 0x0}, uz_free_bucket = {lh_first = 0x0}, uz_ctor = 0,
  uz_dtor = 0, uz_init = 0, uz_fini = 0, uz_allocs = 22457718,
  uz_frees = 22456210, uz_fails = 6603729, uz_fills = 1, uz_count = 128,
  uz_cpu = {{uc_freebucket = 0x0, uc_allocbucket = 0x0, uc_allocs = 0,
      uc_frees = 0}}}
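For reference, the allocation in question has roughly this shape
(paraphrased from memory from sys/geom/geom_io.c, not a verbatim copy).
With M_NOWAIT the allocator returns NULL instead of sleeping, each such
failure bumps uz_fails, and the NULL is what ultimately surfaces as
ENOMEM:

struct bio *
g_clone_bio(struct bio *bp)
{
	struct bio *bp2;

	/* M_NOWAIT: fail immediately rather than sleep for memory. */
	bp2 = uma_zalloc(biozone, M_NOWAIT | M_ZERO);
	if (bp2 != NULL) {
		bp2->bio_parent = bp;
		bp2->bio_cmd = bp->bio_cmd;
		bp2->bio_offset = bp->bio_offset;
		bp2->bio_length = bp->bio_length;
		bp2->bio_data = bp->bio_data;
		bp->bio_children++;
	}
	/* A NULL return here propagates back up as ENOMEM. */
	return (bp2);
}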
Any easy way to figure out why the memory allocation is repeatedly
failing?
Here's the output of vmstat -z:
ITEM               SIZE     LIMIT     USED    FREE   REQUESTS
UMA Kegs:           140,        0,      62,     10,        62
UMA Zones:          120,        0,      62,     28,        62
UMA Slabs:           64,        0,    2169,    132,     51890
UMA RCntSlabs:      104,        0,     104,    266,     37549
UMA Hash:           128,        0,       1,     29,         3
16 Bucket:           76,        0,      12,     38,        45
32 Bucket:          140,        0,      30,     26,        69
64 Bucket:          268,        0,      14,     14,        72
128 Bucket:         524,        0,    2409,    244,     53217
VM OBJECT:          132,        0,    5249,  17255,   2323529
MAP:                192,        0,       7,     33,         7
KMAP ENTRY:          68,    56616,     867,   1317,    797654
MAP ENTRY:           68,        0,    1881,   3271,    615413
PV ENTRY:            24,  1494950,  186575,  68335,  77309488
DP fakepg:           72,        0,       0,      0,         0
mt_zone:             64,        0,     194,    101,       194
16:                  16,        0,    2357,    282,   2618883
32:                  32,        0,   36199,  19962,   9905089
64:                  64,        0,    5020,   1883,   3448783
128:                128,        0,   29552,   4078,   5969903
256:                256,        0,    2556,    309,   6529179
512:                512,        0,      71,     33,      5456
1024:              1024,        0,     139,     65,     67843
2048:              2048,        0,    3123,     57,    775663
4096:              4096,        0,     216,     97,     22551
Files:               72,        0,     151,    326,   2846128
PROC:               524,        0,     143,    130,     14242
THREAD:             372,        0,     273,     77,      5308
KSEGRP:              88,        0,     273,     47,       707
UPCALL:              44,        0,       0,    156,       866
VMSPACE:            300,        0,      92,    142,     14191
mbuf_packet:        256,        0,      66,     51,  82560936
mbuf:               256,        0,       3,    540, 231234746
mbuf_cluster:      2048,    25600,     117,     91,  12760461
ACL UMA zone:       388,        0,       0,      0,         0
g_bio:              132,        0,    1508,      0,  22457718
ata_request:        200,        0,       0,      0,         0
ata_composite:      192,        0,       0,      0,         0
VNODE:              272,        0,   29373,   6523,   4183368
VNODEPOLL:           76,        0,       0,      0,         0
S VFS Cache:         68,        0,   11060,   2828,    844011
L VFS Cache:        291,        0,       0,      0,         0
NAMEI:             1024,        0,      24,     52,   7276418
NFSMOUNT:           480,        0,       2,     14,         2
NFSNODE:            424,        0,      25,    902,      7316
DIRHASH:           1024,        0,     491,     97,     56851
PIPE:               408,        0,       4,     59,      4381
KNOTE:               68,        0,       0,    112,        80
socket:             356,    25608,      66,     55,    807098
unpcb:              140,    25620,      17,     39,       217
udpcb:              180,    25608,      25,     41,    805728
inpcb:              180,    25608,      24,     64,      1152
tcpcb:              460,    25600,      24,     32,      1152
tcptw:               48,     5148,       0,    156,       237
syncache:           100,    15366,       0,     78,       348
hostcache:           76,    15400,       1,     99,         2
tcpreass:            20,     1690,       0,      0,         0
sackhole:            20,        0,       0,      0,         0
ripcb:              180,    25608,       0,      0,         0
rtentry:            132,        0,      14,     44,        14
SWAPMETA:           276,   121576,    2620,   5542,    354141
FFS inode:          132,        0,   29314,   4123,   4175811
FFS1 dinode:        128,        0,     183,   1497,     13155
FFS2 dinode:        256,        0,   29124,   3441,   4162649
It looks like the problem might be caused by having too much I/O in
progress.
(kgdb) print runningbufspace
$3 = 1585152
(kgdb) print lorunningspace
$4 = 524288
(kgdb) print hirunningspace
$5 = 1048576
so runningbufspace >> hirunningspace (1585152 vs. 1048576, about 1.5x
the limit).
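For context, the throttle that normally keeps runningbufspace near
hirunningspace looks roughly like this (again paraphrased from memory
from sys/kern/vfs_bio.c of this vintage, not a verbatim copy);
bufwrite() calls it, so ordinary writers sleep in "wdrain" once too
much write I/O is in flight:

/*
 * Paraphrase of the runningbufspace throttle in vfs_bio.c.  Writers
 * block here until in-flight write I/O drains below hirunningspace;
 * the wakeup comes from the I/O completion path.
 */
static __inline void
waitrunningbufspace(void)
{
	mtx_lock(&rbreqlock);
	while (runningbufspace > hirunningspace) {
		++runningbufreq;
		msleep(&runningbufreq, &rbreqlock, PVM, "wdrain", 0);
	}
	mtx_unlock(&rbreqlock);
}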
The bufdaemon and the syncer both skip that throttle, so they ignore
the runningbufspace limit entirely. Maybe they should obey the limit,
but each be guaranteed a minimum quota so they can still make forward
progress; something like the sketch below.
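A minimal sketch of that idea, with invented names (RUNNINGBUFQUOTA and
the exempt argument are hypothetical, not existing kernel symbols):

/*
 * Hypothetical variant: instead of exempting bufdaemon and the syncer
 * from the limit entirely, let an exempt thread push runningbufspace
 * past hirunningspace only by a private quota.
 */
#define	RUNNINGBUFQUOTA	(hirunningspace / 4)	/* invented tunable */

static __inline void
waitrunningbufspace_quota(int exempt)
{
	long limit;

	/* Exempt daemons get hirunningspace plus a private quota. */
	limit = hirunningspace + (exempt ? RUNNINGBUFQUOTA : 0);
	mtx_lock(&rbreqlock);
	while (runningbufspace > limit) {
		++runningbufreq;
		msleep(&runningbufreq, &rbreqlock, PVM, "wdrain", 0);
	}
	mtx_unlock(&rbreqlock);
}

The point is just that an exempt daemon could still flush buffers while
everyone else is blocked, without being able to grow runningbufspace
without bound.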
