Date: Fri, 16 Jul 2004 20:32:05 -0400
From: Brian Fundakowski Feldman <green@FreeBSD.org>
To: hackers@FreeBSD.org
Cc: alc@FreeBSD.org
Subject: crash via vm_page_sleep_if_busy() and contigmalloc
Message-ID: <20040717003205.GM1626@green.homeunix.org>

Anyone VM-y enough to be up to the task: please take a look at this current vm_contig.c code and the crash that I have. This crash is not common -- this is the first time I've seen it -- but the problem certainly doesn't seem unique.

What seems to happen is that vm_page_sleep_if_busy() is called from a place that expects that a page may go away, but it does not really realize this. If it has to sleep because the page is busy, it will afterward happily dereference m->object which may now be NULL or belong to something else, and unlock its mutex (which may be locked).

It seems that this is a generic problem that needs to be solved by not dicking around with vm_object inside vm_page_sleep_if_busy(): pass it in locked all of the time, return it unlocked all of the time if the page queue mutex was relinquished. Also, assumptions should be removed from other callers of vm_page_sleep_if_busy() such that they know the object may not exist after return, so if the page queue lock is gone then the object is gone and it must not reference it anymore.

Essentially every bit of code that calls vm_page_sleep_if_busy() without explicit knowledge of the backing object is in violation of this. As such, I think callers need to either lock the vm_object in every case before locking the page queues, or if they hold the page queues' mutex, do a trylock before trying to call vm_page_sleep_if_busy(), and be able to handle both of the locks being relinquished on a return of TRUE.

Comments?

-- 
Brian Fundakowski Feldman \'[ FreeBSD ]''''''''''\
<> green@FreeBSD.org \ The Power to Serve! \
Opinions expressed are my own.
\,,,,,,,,,,,,,,,,,,,,,,\

Attachment: crash.typescript

Script started on Fri Jul 16 19:22:54 2004
You have mail.
bfeldman# gdb53 -k kernel.debug vmcore.15
GNU gdb 5.3 (FreeBSD)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-portbld-freebsd5.2"...set
panic: lockmgr: locking against myself
panic messages:
---
panic: lockmgr: locking against myself
Uptime: 6h11m56s
Dumping 510 MB
16 32 48 64 80 96 112 128 144 160 176 192 208 224 240 256 272 288 304 320 336 352 368 384 400 416 432 448 464 480 496
---
#0 doadump () at ../../../kern/kern_shutdown.c:236
236 dumping++;
(kgdb) set pagination off
(kgdb) bt
#0 doadump () at ../../../kern/kern_shutdown.c:236
#1 0xc04fb30b in boot (howto=260) at ../../../kern/kern_shutdown.c:370
#2 0xc04fb5c3 in panic (fmt=0xc06776a0 "lockmgr: locking against myself") at ../../../kern/kern_shutdown.c:548
#3 0xc04f11b8 in lockmgr (lkp=0xc1982e0c, flags=2, interlkp=0x1000000, td=0xc219e2c0) at ../../../kern/kern_lock.c:437
#4 0xc05ffdd0 in _vm_map_lock_read (map=0x0, file=0xc068c92e "../../../vm/vm_map.c", line=2935) at machine/pcpu.h:156
#5 0xc0602e10 in vm_map_lookup (var_map=0xde778b24, vaddr=0, fault_typea=1 '\001', out_entry=0xde778b28, object=0x0, pindex=0x0, out_prot=0x0, wired=0xde778b00) at ../../../vm/vm_map.c:2935
#6 0xc05fbed9 in vm_fault (map=0xc1982dd0, vaddr=0, fault_type=1 '\001', fault_flags=0) at ../../../vm/vm_fault.c:232
#7 0xc063c49b in trap_pfault (frame=0xde778bec, usermode=0, eva=0) at ../../../i386/i386/trap.c:710
#8 0xc063c1b1 in trap (frame={tf_fs = -1038548968, tf_es = 16, tf_ds = -562626544, tf_edi = -1066871035, tf_esi = 431, tf_ebp = -562590664, tf_isp = -562590696, tf_ebx = 0, tf_edx = -1038490944, tf_ecx = 2, tf_eax = -1038490944, tf_trapno = 12, tf_err = 0, tf_eip = -1068550176, tf_cs = 8, tf_eflags = 78470, tf_esp = -1048169872, tf_ss = 1}) at ../../../i386/i386/trap.c:420
#9 0xc04f37e0 in _mtx_lock_flags (m=0x0, opts=0, file=0xc068d705 "../../../vm/vm_page.c", line=431) at ../../../kern/kern_mutex.c:246
#10 0xc0607638 in vm_page_sleep_if_busy (m=0xc1863270, also_m_busy=1, msg=0xc068d4a6 "madvpo") at ../../../vm/vm_page.c:431
#11 0xc0605ceb in vm_object_madvise (object=0xc22097bc, pindex=6116, count=0, advise=5) at ../../../vm/vm_object.c:1130
#12 0xc060118d in vm_map_madvise (map=0xc1982dd0, start=169009152, end=172679168, behav=5) at ../../../vm/vm_map.c:1546
#13 0xc0603c5a in madvise (td=0xc1982dd0, uap=0x0) at ../../../vm/vm_mmap.c:676
#14 0xc063ca6b in syscall (frame={tf_fs = 47, tf_es = 22151215, tf_ds = -1078001617, tf_edi = 3670016, tf_esi = 896, tf_ebp = -1077942168, tf_isp = -562590348, tf_ebx = 674006828, tf_edx = 811073536, tf_ecx = 7993, tf_eax = 75, tf_trapno = 12, tf_err = 2, tf_eip = 673541319, tf_cs = 31, tf_eflags = 12946, tf_esp = -1077942212, tf_ss = 47}) at ../../../i386/i386/trap.c:1004
(kgdb) p/x allproc.lh_first->p_threads->tqh_first->td_pcb->pcb_ebp
$1 = 0xdf6e8904
(kgdb) frame 0xdf6e8904
#0 0x00000000 in ?? ()
(kgdb) down
Bottom (i.e., innermost) frame selected; you cannot go down.
(kgdb) up
#1 0xc06e72cc in sysctl__kern_shutdown_children ()
(kgdb)
#2 0xc050157e in mi_switch (flags=0) at ../../../kern/kern_synch.c:352
352 sched_switch(td);
(kgdb)
#3 0xc0515992 in sleepq_switch (wchan=0x0) at ../../../kern/subr_sleepqueue.c:374
374 mi_switch(SW_VOL);
(kgdb)
#4 0xc0515b43 in sleepq_wait (wchan=0xcbe2d488) at ../../../kern/subr_sleepqueue.c:478
478 sleepq_switch(wchan);
(kgdb)
#5 0xc050125a in msleep (ident=0xcbe2d488, mtx=0xc070f980, priority=68, wmesg=0xc068c150 "swwrt", timo=0) at ../../../kern/kern_synch.c:243
243 sleepq_wait(ident);
(kgdb)
#6 0xc053c587 in bwait (bp=0xcbe2d488, pri=68 'D', wchan=0xc068c150 "swwrt") at ../../../kern/vfs_bio.c:3766
3766 msleep(bp, &bdonelock, pri, wchan, 0);
(kgdb)
#7 0xc05fa3c2 in swap_pager_putpages (object=0xc22097bc, m=0xdf6e8a6c, count=1, sync=1, rtvals=0xdf6e8a20) at ../../../vm/swap_pager.c:1372
1372 bwait(bp, PVM, "swwrt");
(kgdb)
#8 0xc060a6f2 in vm_pageout_flush (mc=0xdf6e8a6c, count=1, flags=1) at ../../../vm/vm_pager.h:139
139 (*pagertab[object->type]->pgo_putpages)
(kgdb)
#9 0xc0609332 in vm_contig_launder_page (m=0xc18636f0) at ../../../vm/vm_contig.c:121
warning: Source file is more recent than executable.

121 vm_pageout_flush(&m_tmp, 1, VM_PAGER_PUT_SYNC);
(kgdb)
#10 0xc0609c6e in vm_page_alloc_contig (npages=44, low=0, high=4294967295, alignment=4, boundary=0) at ../../../vm/vm_contig.c:447
447 if (vm_contig_launder_page(m) != 0)
(kgdb)
#11 0xc0609f6d in contigmalloc (size=180224, type=0xc06bbc80, flags=258, low=0, high=4294967295, alignment=4, boundary=0) at ../../../vm/vm_contig.c:546
546 pages = vm_page_alloc_contig(npgs, low, high,
(kgdb)
#12 0xc062d137 in bus_dmamem_alloc (dmat=0xc1cffc00, vaddr=0xc1d17554, flags=0, mapp=0x0) at ../../../i386/i386/busdma_machdep.c:430
430 *vaddr = contigmalloc(dmat->maxsize, M_DEVBUF, mflags,
(kgdb)
#13 0xc2251898 in ?? ()
(kgdb)
#14 0xc2251634 in ?? ()
(kgdb)
#15 0xc050d46c in device_attach (dev=0xc1d17550) at device_if.h:39
39 KOBJOPLOOKUP(((kobj_t)dev)->ops,device_attach);
(kgdb)
#16 0xc050d40c in device_probe_and_attach (dev=0xc1bd8400) at ../../../kern/subr_bus.c:1684
1684 error = device_attach(dev);
(kgdb)
#17 0xc0464785 in cardbus_driver_added (cbdev=0xc1a49e00, driver=0xc226a3e8) at ../../../dev/cardbus/cardbus.c:278
278 if (device_probe_and_attach(dev) != 0)
(kgdb)
#18 0xc050c403 in devclass_add_driver (dc=0xc19709c0, driver=0xc226a3e8) at bus_if.h:71
71 ((bus_driver_added_t *) _m)(_dev, _driver);
(kgdb)
#19 0xc050ec92 in driver_module_handler (mod=0xc1cbc640, what=-1037655064, arg=0xc226a45c) at ../../../kern/subr_bus.c:2545
2545 error = devclass_add_driver(bus_devclass, driver);
(kgdb)
#20 0xc04f312e in module_register_init (arg=0xc226a470) at ../../../kern/kern_module.c:108
108 error = MOD_EVENT(mod, MOD_LOAD);
(kgdb)
#21 0xc04ee5e1 in linker_file_sysinit (lf=0xc1c8fe00) at ../../../kern/kern_linker.c:193
193 (*((*sipp)->func)) ((*sipp)->udata);
(kgdb)
#22 0xc04ee895 in linker_load_file (filename=0xc1d30180 "/home/green/prism54-driver/pff/if_pff.ko", result=0xdf6e8cb0) at ../../../kern/kern_linker.c:358
358 linker_file_sysinit(lf);
(kgdb)
#23 0xc04f08e7 in linker_load_module (kldname=0xc1d30180 "/home/green/prism54-driver/pff/if_pff.ko", modname=0x0, parent=0x0, verinfo=0x0, lfpp=0xdf6e8cdc) at ../../../kern/kern_linker.c:1673
1673 error = linker_load_file(pathname, &lfdep);
(kgdb)
#24 0xc04ef297 in kldload (td=0xc1cb59a0, uap=0x0) at ../../../kern/kern_linker.c:776
776 error = linker_load_module(kldname, modname, NULL, NULL, &lf);
(kgdb)
#25 0xc063ca6b in syscall (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 1, tf_esi = -1077942056, tf_ebp = -1077942136, tf_isp = -546402956, tf_ebx = 0, tf_edx = -1, tf_ecx = -1077941787, tf_eax = 304, tf_trapno = 12, tf_err = 2, tf_eip = 671870951, tf_cs = 31, tf_eflags = 658, tf_esp = -1077942180, tf_ss = 47}) at ../../../i386/i386/trap.c:1004
1004 error = (*callp->sy_call)(td, args);
(kgdb)
Initial frame selected; you cannot go up.
(kgdb)
Initial frame selected; you cannot go up.
(kgdb)
Initial frame selected; you cannot go up.
(kgdb)
Initial frame selected; you cannot go up.
(kgdb) bfeldman# exit

Script done on Fri Jul 16 19:23:44 2004
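
The calling convention proposed in the message (trylock the object while holding the page queue lock, pass it in locked, and treat a TRUE return as "both locks are gone, do not touch the object") can be sketched in user-space C. This is only a toy, single-threaded model under stated assumptions: toy_lock, toy_object, toy_page, toy_page_sleep_if_busy() and toy_caller() are hypothetical names, not FreeBSD kernel interfaces, and the sleep is simulated by clearing the page's object pointer to mimic the page going away.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical single-threaded stand-in for a kernel mutex. */
struct toy_lock { bool held; };

static bool toy_trylock(struct toy_lock *l)
{
    if (l->held)
        return false;
    l->held = true;
    return true;
}

static void toy_unlock(struct toy_lock *l)
{
    assert(l->held);
    l->held = false;
}

struct toy_object { struct toy_lock lock; };

struct toy_page {
    struct toy_object *object; /* may change or become NULL during a sleep */
    bool busy;
};

/* Hypothetical stand-in for the page queue mutex. */
static struct toy_lock page_queue_lock;

/*
 * Assumed contract: the caller holds page_queue_lock and obj->lock.
 * Returns false with both locks still held if the page is not busy.
 * Returns true with both locks released if it had to sleep.
 * It uses the object passed in and never reads m->object itself.
 */
static bool toy_page_sleep_if_busy(struct toy_page *m, struct toy_object *obj)
{
    assert(page_queue_lock.held && obj->lock.held);
    if (!m->busy)
        return false;
    toy_unlock(&obj->lock);
    toy_unlock(&page_queue_lock);
    /* Simulated sleep: meanwhile the page is freed from its object. */
    m->object = NULL;
    m->busy = false;
    return true;
}

enum toy_result { TOY_DONE, TOY_RESTART, TOY_BACKOFF };

/* A caller that starts from the page queues without knowing the object. */
static enum toy_result toy_caller(struct toy_page *m)
{
    struct toy_object *obj;

    if (!toy_trylock(&page_queue_lock))
        return TOY_BACKOFF;
    obj = m->object;                 /* snapshot under the page queue lock */
    if (obj == NULL) {
        toy_unlock(&page_queue_lock);
        return TOY_RESTART;
    }
    if (!toy_trylock(&obj->lock)) {  /* wrong lock order, so only try */
        toy_unlock(&page_queue_lock);
        return TOY_BACKOFF;
    }
    if (toy_page_sleep_if_busy(m, obj))
        return TOY_RESTART;          /* both locks gone; obj must not be used */
    /* ... work on the page with both locks held ... */
    toy_unlock(&obj->lock);
    toy_unlock(&page_queue_lock);
    return TOY_DONE;
}
```

On TOY_RESTART the caller is expected to look the page up again from scratch rather than reuse m->object or the saved obj pointer, which is the assumption the message asks callers to stop making.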