Date: Wed, 01 Oct 2003 19:09:07 +0000 From: "Vinod R. Kashyap" <vinodrk@hotmail.com> To: freebsd-fs@freebsd.org, fs@freebsd.org Subject: clean/dirty queue mixup Message-ID: <BAY2-F17yupGKy3hrbn00004847@hotmail.com>
next in thread | raw e-mail | index | archive | help
[-- Attachment #1 --] I am running some file I/O (multiple 'cp -R x/* y') on a kernel built out of FreeBSD 4.8 RELEASE sources (with SMP and APIC_IO turned on), when I see the following panic, anywhere between 0.5 - 15 hours after start: panic: softdep_deallocate_dependencies: dangling deps mp_lock = 01000001; cpuid = 1; lapic_id = 06000000 (the I/O is running on a system with a dual Xeon motherboard) I recompiled the kernel with INVARIANTS & INVARIANT_SUPPORT on, and now see the following panic: panic: getnewbuf: inconsistant queue 2 bp 0xcc94b564 mp_lock = 01000001; cpuid = 1; lapic_id = 06000000 In getnewbuf (in kern/vfs_bio.c), bufqueues[QUEUE_EMPTYKVA] seems to have run out of buffers, and so, a buffer is gotten out of bufqueues[QUEUE_CLEAN]. The problem is that this buffer has its b_qindex set to 3 (QUEUE_DIRTY), instead of 2 (QUEUE_CLEAN). It also has b_flags set to B_DELWRI, and b_xflags set to BX_VNDIRTY. bp->b_dep is non-NULL, and points to a work item with wk_type=0x5 (D_INDIRDEP) and wk_state=0x8001 (VONWORKLIST | ATTACHED). I noticed that at least the first 5 buffers in the CLEAN queue had their b_qindex set to 3 (DIRTY). I noticed something that could potentially be causing this (I might be way off here...): When a low level device driver calls biodone(), if bp->b_dep is non-NULL, (*bioops.io_complete)/softdep_disk_write_complete gets called, which, in case the work item is of type D_INDIRDEP, unconditionally puts it back into the b_dep queue, and marks it dirty. On return from this call, it looks like biodone goes ahead and frees the buffer with the dirty flags still set, and b_dep non-NULL. Is this the cause by any chance? Can anyone please help? I have attached some more information from gdb. I can send the core-dump to anyone who might be interested. Thanks, Vinod. _________________________________________________________________ Contact brides & grooms FREE! http://www.shaadi.com/ptnr.php?ptnr=hmltag Only on www.shaadi.com. Register now! [-- Attachment #2 --] FreeBSD4# gdb -k /kernel GNU gdb 4.18 (FreeBSD) Copyright 1998 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-unknown-freebsd"...Deprecated bfd_read called at /usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 2627 in elfstab_build_psymtabs Deprecated bfd_read called at /usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 933 in fill_symbuf (kgdb) exec-file kernel.3 (kgdb) core-file vmcore.3 SMP 2 cpus IdlePTD at phsyical address 0x0055f000 initial pcb at physical address 0x00487ee0 panicstr: (fmt null) panic messages: --- panic: getnewbuf: inconsistant queue 2 bp 0xcc94b564 mp_lock = 01000001; cpuid = 1; lapic.id = 06000000 panic: (fmt null) mp_lock = 01000002; cpuid = 1; lapic.id = 06000000 boot() called on cpu#1 Uptime: 13h25m14s dumping to dev #ad/0x20001, offset 1015968 dump ata0: resetting devices .. done 511 510 509 508 507 506 505 504 503 502 501 500 499 498 497 496 495 494 493 492 491 490 489 488 487 486 485 484 483 482 481 480 479 478 477 476 475 474 473 472 471 470 469 468 467 466 465 464 463 462 461 460 459 458 457 456 455 454 453 452 451 450 449 448 447 446 445 444 443 442 441 440 439 438 437 436 435 434 433 432 431 430 429 428 427 426 425 424 423 422 421 420 419 418 417 416 415 414 413 412 411 410 409 408 407 406 405 404 403 402 401 400 399 398 397 396 395 394 393 392 391 390 389 388 387 386 385 384 383 382 381 380 379 378 377 376 375 374 373 372 371 370 369 368 367 366 365 364 363 362 361 360 359 358 357 356 355 354 353 352 351 350 349 348 347 346 345 344 343 342 341 340 339 338 337 336 335 334 333 332 331 330 329 328 327 326 325 324 323 322 321 320 319 318 317 316 315 314 313 312 311 310 309 308 307 306 305 304 303 302 301 300 299 298 297 296 295 294 293 292 291 290 289 288 287 286 285 284 283 282 281 280 279 278 277 276 275 274 273 272 271 270 269 268 267 266 265 264 263 262 261 260 259 258 257 256 255 254 253 252 251 250 249 248 247 246 245 244 243 242 241 240 239 238 237 236 235 234 233 232 231 230 229 228 227 226 225 224 223 222 221 220 219 218 217 216 215 214 213 212 211 210 209 208 207 206 205 204 203 202 201 200 199 198 197 196 195 194 193 192 191 190 189 188 187 186 185 184 183 182 181 180 179 178 177 176 175 174 173 172 171 170 169 168 167 166 165 164 163 162 161 160 159 158 157 156 155 154 153 152 151 150 149 148 147 146 145 144 143 142 141 140 139 138 137 136 135 134 133 132 131 130 129 128 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112 111 110 109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 --- #0 dumpsys () at ../../kern/kern_shutdown.c:487 487 if (dumping++) { (kgdb) where #0 dumpsys () at ../../kern/kern_shutdown.c:487 #1 0xc0219757 in boot (howto=260) at ../../kern/kern_shutdown.c:316 #2 0xc0219bc9 in panic (fmt=0x0) at ../../kern/kern_shutdown.c:595 #3 0xc014d735 in db_fncall (dummy1=0, dummy2=0, dummy3=0, dummy4=0xdb4a2b1c "`µIÀ &GÀdÄ\001hÉ 7Àí\0377ÀþÔ\001k`µIÀ\r") at ../../ddb/db_command.c:534 #4 0xc014d559 in db_command (last_cmdp=0xc042c424, cmd_table=0xc042c264, aux_cmd_tablep=0xc046fb98) at ../../ddb/db_command.c:333 #5 0xc014d61e in db_command_loop () at ../../ddb/db_command.c:457 #6 0xc014f75b in db_trap (type=3, code=0) at ../../ddb/db_trap.c:71 #7 0xc03823a1 in kdb_trap (type=3, code=0, regs=0xdb4a2c2c) at ../../i386/i386/db_interface.c:158 #8 0xc0399098 in trap (frame={tf_fs = 24, tf_es = 16, tf_ds = -615907312, tf_edi = 2, tf_esi = 256, tf_ebp = -615895948, tf_isp = -615895976, tf_ebx = -1069628416, tf_edx = -1744946277, tf_ecx = 0, tf_eax = 18, tf_trapno = 3, tf_err = 0, tf_eip = -1070061984, tf_cs = 8, tf_eflags = 582, tf_esp = -1069445309, tf_ss = -1069644758}) at ../../i386/i386/trap.c:592 #9 0xc0382660 in Debugger (msg=0xc03e842a "panic") at machine/cpufunc.h:67 #10 0xc0219bc0 in panic (fmt=0xc03ec400 "getnewbuf: inconsistant queue %d bp %p") at ../../kern/kern_shutdown.c:593 #11 0xc0240dd4 in getnewbuf (slpflag=0, slptimeo=0, size=2048, maxsize=16384) at ../../kern/vfs_bio.c:1593 #12 0xc0241a0a in getblk (vp=0xdb7a3c80, blkno=0, size=2048, slpflag=0, slptimeo=0) at ../../kern/vfs_bio.c:2239 #13 0xc031f95a in ffs_balloc (ap=0xdb4a2e20) at ../../ufs/ffs/ffs_balloc.c:172 #14 0xc0329431 in ffs_write (ap=0xdb4a2e60) at vnode_if.h:1056 #15 0xc024ff88 in vn_write (fp=0xc23b8640, uio=0xdb4a2ed0, cred=0xc256bb00, flags=0, p=0xdb3a9100) at vnode_if.h:363 #16 0xc0228dad in dofilewrite (p=0xdb3a9100, fp=0xc23b8640, fd=4, buf=0x28058000, nbyte=34, offset=-1, flags=0) at ../../sys/file.h:163 #17 0xc0228c66 in write (p=0xdb3a9100, uap=0xdb4a2f80) at ../../kern/sys_generic.c:329 ---Type <return> to continue, or q <return> to quit--- #18 0xc0399b11 in syscall2 (frame={tf_fs = 671416367, tf_es = 671416367, tf_ds = -1078001617, tf_edi = 671449088, tf_esi = 671449088, tf_ebp = -1077937612, tf_isp = -615895084, tf_ebx = 34, tf_edx = 134674508, tf_ecx = 0, tf_eax = 4, tf_trapno = 7, tf_err = 2, tf_eip = 134561984, tf_cs = 31, tf_eflags = 659, tf_esp = -1077937672, tf_ss = 47}) at ../../i386/i386/trap.c:1175 #19 0xc03832fb in Xint0x80_syscall () cannot read proc at 0 (kgdb) f 11 #11 0xc0240dd4 in getnewbuf (slpflag=0, slptimeo=0, size=2048, maxsize=16384) at ../../kern/vfs_bio.c:1593 1593 KASSERT(bp->b_qindex == qindex, ("getnewbuf: inconsistant queue %d bp %p", qindex, bp)); (kgdb) p bp $1 = (struct buf *) 0xcc94b564 (kgdb) p qindex $2 = 2 (kgdb) p nqindex $3 = 2 (kgdb) p bp->b_qindex $4 = 3 (kgdb) p bp->b_flags $5 = 537006752 (kgdb) p/x bp->b_flags $6 = 0x200212a0 (kgdb) p bp->b_dep $7 = {lh_first = 0xc2682420} (kgdb) x/4xw bp->b_dep.lh_first 0xc2682420: 0x00000000 0xcc94b6a4 0x80010005 0x00000000 (kgdb) p bp->b_xflags $8 = 1 '\001' (kgdb) p/x bp->b_xflags $9 = 0x1 (kgdb) p nbp $10 = (struct buf *) 0xcc8aaee0 (kgdb) p nbp->q_index There is no member named q_index. (kgdb) p nbp->b_qindex $11 = 3 (kgdb) p bufqueues[0] $12 = {tqh_first = 0x0, tqh_last = 0xc044cd20} (kgdb) p bufqueues[1] $13 = {tqh_first = 0x0, tqh_last = 0xc044cd28} (kgdb) p bufqueues[2] $14 = {tqh_first = 0xcc94b564, tqh_last = 0xcc93f1a0} (kgdb) p bufqueues[3] $15 = {tqh_first = 0xcc82b300, tqh_last = 0xcc91c7dc} (kgdb) p bufqueues[4] $16 = {tqh_first = 0x0, tqh_last = 0xc044cd40} (kgdb) p bufqueues[5] $17 = {tqh_first = 0xcc95e628, tqh_last = 0xcc961dfc} (kgdb) p bufqueues[5].tqh_first->b_freelist $18 = {tqe_next = 0xcc95e774, tqe_prev = 0xc044cd48} (kgdb) p bufqueues[5].tqh_first->b_freelist->tqe_next->b_freelist $19 = {tqe_next = 0xcc95e8c0, tqe_prev = 0xcc95e638} (kgdb) p bufqueues[5].tqh_first->b_freelist.b_qindex There is no member named b_qindex. (kgdb) p bufqueues[5].tqh_first->b_freelist->tqe_next->b_qindex $20 = 5 (kgdb) p bufqueues[5].tqh_first->b_freelist->tqe_next->b_freelist->tqe_next->b_qindex $21 = 5 (kgdb) p bufqueues[2].tqh_first->b_freelist $22 = {tqe_next = 0xcc8aaee0, tqe_prev = 0xc044cd30} (kgdb) p bufqueues[2].tqh_first->b_freelist->tqe_next->b_qindex $23 = 3 (kgdb) p bufqueues[2].tqh_first->b_freelist->tqe_next->b_freelist->tqe_next->b_qindex $24 = 3 (kgdb) p bufqueues[2].tqh_first->b_freelist->tqe_next->b_freelist->tqe_next->b_freelist->tqe_next->b_qindex $25 = 3 (kgdb) p bufqueues[2].tqh_first->b_freelist->tqe_next->b_freelist->tqe_next->b_freelist->tqe_next->b_flags $26 = 537002656 (kgdb) p bufqueues[2].tqh_first->b_freelist->tqe_next->b_freelist->tqe_next->b_flags $27 = 70304 (kgdb) p bufqueues[3].tqh_first->b_freelist $28 = {tqe_next = 0xcc8e9194, tqe_prev = 0xc044cd38} (kgdb) p bufqueues[3].tqh_first->b_freelist->tqe_next->b_qindex $29 = 3 (kgdb) p bufqueues[3].tqh_first->b_freelist->tqe_next->b_freelist->tqe_next->b_flags $30 = 537002144 (kgdb) p bufqueues[3].tqh_first->b_freelist->tqe_next->b_freelist->tqe_next->b_qindex $31 = 3 (kgdb) p bufqueues[2].tqh_first->b_freelist->tqe_next->b_freelist->tqe_next->b_freelist->tqe_next->b_qindex $32 = 3 (kgdb) p bp->b_freelist $33 = {tqe_next = 0xcc8aaee0, tqe_prev = 0xc044cd30} (kgdb) p bufqueues[5] $34 = {tqh_first = 0xcc95e628, tqh_last = 0xcc961dfc} (kgdb) p bufqueues[5].tqh_first->b_freelist->tqe_next->b_freelist->tqe_next->b_freelist->tqe_next $35 = (struct buf *) 0xcc95ea0c (kgdb) p bp->b_act $36 = {tqe_next = 0x0, tqe_prev = 0xc24a4008} (kgdb) p nbp->b_act $37 = {tqe_next = 0x0, tqe_prev = 0xc24a4008} (kgdb) p bufqueues[5].tqh_first->b_freelist->tqe_next->b_freelist->tqe_next->b_freelist->tqe_next->b_freelist->tqe_next $38 = (struct buf *) 0xcc95eb58 (kgdb) p bufqueues[5].tqh_first->b_freelist->tqe_next->b_freelist->tqe_next->b_freelist->tqe_next->b_freelist->tqe_next->b_freelist->tqe_next $39 = (struct buf *) 0xcc95eca4 (kgdb) p defrag $40 = 0 (kgdb) p bufspace $41 = 63275008 (kgdb) p lobufspace $42 = 63275008 (kgdb) p/x $43 = 0x3c58000 (kgdb) p nbp->b_dep.lh_first $44 = (struct worklist *) 0xc25e17e0 (kgdb) p/4wx nbp->b_dep.lh_first Size letters are meaningless in "print" command. (kgdb) x/4wx nbp->b_dep.lh_first 0xc25e17e0: 0x00000000 0xcc8ab020 0x80010005 0x00000000 (kgdb) p/x numdirtybuffers $45 = 0x26a (kgdb) p bp->b_vp $46 = (struct vnode *) 0xdbcf1200 (kgdb) p/x bp->b_vp->v_flag $47 = 0x202000 (kgdb) p bp-b_vp->v_dirtyblkhd No symbol "b_vp" in current context. (kgdb) p bp->b_vp->v_dirtyblkhd $48 = {tqh_first = 0xcc94b564, tqh_last = 0xcc94b56c} (kgdb) p bp->b_vp->v_dirtyblkhd->tqh_first $49 = (struct buf *) 0xcc94b564 (kgdb) p bp $50 = (struct buf *) 0xcc94b564 (kgdb) p bp->b_lblkno $51 = -12 (kgdb) p bp->b_vnbufs $52 = {tqe_next = 0x0, tqe_prev = 0xdbcf1234} (kgdb) p/x reassignbufsortgood $53 = 0x61459c (kgdb) p/x reassignbufmethod $54 = 0x1 (kgdb) p bp->b_vp->v_dirtyblkhd->tqh_last $55 = (struct buf **) 0xcc94b56c (kgdb) p bp->b_vp->v_dirtyblkhd->tqh_first->vn_bufs There is no member named vn_bufs. (kgdb) p bp->b_vp->v_dirtyblkhd->tqh_first->b_vnbufs $56 = {tqe_next = 0x0, tqe_prev = 0xdbcf1234} (kgdb) p/x mp_lock $57 = 0x1000002 (kgdb) p bp->b_dev $58 = 0xc2398880 (kgdb) p bp->b_dev->si_drv1 $59 = (void *) 0xc25c5000 (kgdb) p/4wx 0xc25c5000 Size letters are meaningless in "print" command. (kgdb) x/10wx 0xc25c5000 0xc25c5000: 0xc24c6c00 0xc2398980 0xc24a4000 0xc24a4430 0xc25c5010: 0x00000000 0x00000000 0xc2582880 0xc2398980 0xc25c5020: 0xc25cc000 0x00000000 (kgdb) x/40wx 0xc24a4410 0xc24a4410: 0xc25c3000 0xc25c4c00 0xc25c4800 0xc25c4400 0xc24a4420: 0xc25c4000 0xc25c5c00 0xc25c5800 0xc25c5400 0xc24a4430: 0x12a19eb0 0x00004c01 0x000000ff 0x0000003f 0xc24a4440: 0x00000000 0x00000000 0x00000fff 0xc24c6c00 0xc24a4450: 0x00000000 0x00000000 0x00000000 0x00000000 0xc24a4460: 0x00000000 0x00000000 0x00000000 0x00000000 0xc24a4470: 0x12a19eb0 0x00004c01 0x000000ff 0x0000003f 0xc24a4480: 0x00000001 0x00000002 0x00000fff 0xc24c6b80 0xc24a4490: 0x00000000 0x00000000 0x00000000 0x00000000 0xc24a44a0: 0x00000000 0x00000000 0x00000000 0x00000000 (kgdb) p/x bp->b_error $60 = 0x0 (kgdb) p/x bp->b_act $61 = {tqe_next = 0x0, tqe_prev = 0xc24a4008} (kgdb) x/40wx 0xc24a4000 0xc24a4000: 0xc2598c00 0xc25bc20c 0x00000000 0xc24a4008 0xc24a4010: 0x00b86b67 0x00000000 0x00000000 0x00000000 0xc24a4020: 0xc24a401c 0xc25be400 0xc25b460c 0x00000000 0xc24a4030: 0xc24a402c 0xc24a6c00 0xc24a6800 0xc24a6400 0xc24a4040: 0xc24a6000 0xc2587c00 0xc2587800 0xc2587400 0xc24a4050: 0xc2587000 0xc2588c00 0xc2588800 0xc2588400 0xc24a4060: 0xc2588000 0xc2589c00 0xc2589800 0xc2589400 0xc24a4070: 0xc2589000 0xc258ac00 0xc258a800 0xc258a400 0xc24a4080: 0xc258a000 0xc258bc00 0xc258b800 0xc258b400 0xc24a4090: 0xc258b000 0xc258cc00 0xc258c800 0xc258c400 (kgdb) p bp->b_dep $62 = {lh_first = 0xc2682420} (kgdb) p bp->b_dep.lh_first $63 = (struct worklist *) 0xc2682420 (kgdb) p nbp->b_dep.lh_first $64 = (struct worklist *) 0xc25e17e0 (kgdb) x/4wx 0xc2682420 0xc2682420: 0x00000000 0xcc94b6a4 0x80010005 0x00000000 (kgdb) p bp->b_npages $65 = 4 (kgdb) p stat_indir_blk_ptrs $66 = 115289 (kgdb) p bp->b_kvasize $67 = 16384 (kgdb) p mp_lock $68 = 16777218 (kgdb) p/x mp_lock $69 = 0x1000002 (kgdb) p/x cpuid No symbol "cpuid" in current context. (kgdb) p/x lapic_id Cannot access memory at address 0xffbff020. (kgdb) x/4wx lapic_id Cannot access memory at address 0xffbff020. (kgdb) p/x cpu_id $70 = 0xf27 (kgdb)
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?BAY2-F17yupGKy3hrbn00004847>
