Date:      Fri, 16 Jul 2004 20:32:05 -0400
From:      Brian Fundakowski Feldman <green@FreeBSD.org>
To:        hackers@FreeBSD.org
Cc:        alc@FreeBSD.org
Subject:   crash via vm_page_sleep_if_busy() and contigmalloc
Message-ID:  <20040717003205.GM1626@green.homeunix.org>

--FN+gV9K+162wdwwF
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Anyone VM-y enough to be up to the task: please take a look at this
current vm_contig.c code and the crash that I have.

This crash is not common -- this is the first time I've seen it --
but the problem certainly doesn't seem unique.

What seems to happen is that vm_page_sleep_if_busy() is called from
a place that expects that the page may go away, but the function itself
does not account for this.  If it has to sleep because the page is busy,
it will afterward happily dereference m->object, which may now be NULL or
belong to something else, and unlock its mutex (which may be locked).
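
A minimal userland sketch of the hazard described above. The toy_page and
toy_object types and the toy_sleep() helper are hypothetical stand-ins, not
the real kernel structures, and the "sleep" is modeled as another thread
freeing the page:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel types; NOT the real FreeBSD definitions. */
struct toy_object {
	int locked;
};

struct toy_page {
	struct toy_object *object;	/* may be cleared while we sleep */
	bool busy;
};

/* Models what another thread may do while we are asleep: free the page. */
static void
toy_sleep(struct toy_page *m)
{
	m->object = NULL;
	m->busy = false;
}

/*
 * Hazardous pattern: m->object is read again after the sleep.
 * Returns the pointer the function would go on to use, so the caller
 * can observe that it is no longer the object it started with.
 */
static struct toy_object *
hazard_object_after_sleep(struct toy_page *m)
{
	if (!m->busy)
		return (m->object);
	toy_sleep(m);
	return (m->object);	/* re-read after sleep: may now be NULL */
}
```

In this model a busy page yields NULL after the sleep, which is the kind of
pointer that would then be handed to the mutex code.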

It seems that this is a generic problem that needs to be solved by not
dicking around with vm_object inside vm_page_sleep_if_busy(): pass the
object in locked every time, and return it unlocked whenever the page
queue mutex was relinquished.  Also, callers of vm_page_sleep_if_busy()
should stop assuming that the object still exists after return: if the
page queue lock was dropped, the object must be treated as gone and the
caller must not reference it anymore.
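
The contract proposed above could look roughly like this. It is a
single-threaded userland model, assuming hypothetical toy_lock, toy_object
and toy_page types and a boolean in place of a real mutex; it is not the
actual vm_page.c code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy lock: a flag is enough for a single-threaded model. */
struct toy_lock {
	bool held;
};

static void
toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void
toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

struct toy_object {
	struct toy_lock mtx;
};

struct toy_page {
	struct toy_object *object;
	bool busy;
};

/* Stand-in for the page queue mutex. */
static struct toy_lock toy_page_queue_mtx;

/*
 * Sketch of the proposed contract:
 *  - the caller holds obj->mtx and the page queue lock on entry;
 *  - page not busy: return false, both locks still held;
 *  - page busy: drop both locks, sleep, return true with NEITHER lock
 *    held, and never look at m->object after the sleep.
 */
static bool
toy_page_sleep_if_busy(struct toy_page *m, struct toy_object *obj)
{
	assert(obj->mtx.held && toy_page_queue_mtx.held);
	if (!m->busy)
		return (false);
	toy_lock_release(&obj->mtx);
	toy_lock_release(&toy_page_queue_mtx);
	/* The sleep is modeled as someone else freeing the page. */
	m->object = NULL;
	m->busy = false;
	return (true);
}
```

The point of passing the object in explicitly is that the function never
has to trust m->object once it has slept.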

Essentially every bit of code that calls vm_page_sleep_if_busy() without
explicit knowledge of the backing object is in violation of this.  As
such, I think callers need to either lock the vm_object in every case
before locking the page queues or, if they already hold the page queues'
mutex, do a trylock on the object before calling vm_page_sleep_if_busy(),
and be able to handle both locks having been relinquished on a return of
TRUE.
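
For a caller that already holds the page queue lock and only has a page
pointer, the trylock variant might be sketched as follows. Again this is a
single-threaded userland model; toy_trylock(), toy_examine_page() and the
result codes are hypothetical names, and the sleep is modeled as the page
being freed:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_lock {
	bool held;
};

/* Non-blocking acquire: fails instead of waiting. */
static bool
toy_trylock(struct toy_lock *l)
{
	if (l->held)
		return (false);
	l->held = true;
	return (true);
}

static void
toy_unlock(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

struct toy_object {
	struct toy_lock mtx;
};

struct toy_page {
	struct toy_object *object;
	bool busy;
};

static struct toy_lock toy_page_queue_mtx;

enum toy_result { TOY_READY, TOY_RESTART, TOY_SKIP };

/* Sleeping helper: returns true with both locks dropped if it slept. */
static bool
toy_sleep_if_busy(struct toy_page *m, struct toy_object *obj)
{
	if (!m->busy)
		return (false);
	toy_unlock(&obj->mtx);
	toy_unlock(&toy_page_queue_mtx);
	m->object = NULL;	/* page freed while we slept */
	m->busy = false;
	return (true);
}

/* Called with the page queue lock held and only a page pointer in hand. */
static enum toy_result
toy_examine_page(struct toy_page *m)
{
	struct toy_object *obj = m->object;

	assert(toy_page_queue_mtx.held);
	if (obj == NULL || !toy_trylock(&obj->mtx))
		return (TOY_SKIP);	/* must not block on the object here */
	if (toy_sleep_if_busy(m, obj))
		return (TOY_RESTART);	/* both locks gone; do not touch m or obj */
	/* Not busy: both locks are held, so the page could be worked on here. */
	toy_unlock(&obj->mtx);
	return (TOY_READY);
}
```

On TOY_RESTART the caller would have to retake the page queue lock and
start its scan over rather than continue with the old page pointer.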

Comments?

-- 
Brian Fundakowski Feldman                           \'[ FreeBSD ]''''''''''\
  <> green@FreeBSD.org                               \  The Power to Serve! \
 Opinions expressed are my own.                       \,,,,,,,,,,,,,,,,,,,,,,\

--FN+gV9K+162wdwwF
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="crash.typescript"
Content-Transfer-Encoding: 7bit

Script started on Fri Jul 16 19:22:54 2004
You have mail.
bfeldman# gdb53 -k kernel.debug vmcore.15
GNU gdb 5.3 (FreeBSD)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-portbld-freebsd5.2"...set
panic: lockmgr: locking against myself
panic messages:
---
 panic: lockmgr: locking against myself
Uptime: 6h11m56s
Dumping 510 MB
 16 32 48 64 80 96 112 128 144 160 176 192 208 224 240 256 272 288 304 320 336 352 368 384 400 416 432 448 464 480 496
---
#0  doadump () at ../../../kern/kern_shutdown.c:236
236		dumping++;
(kgdb) set pagination off
(kgdb) bt
#0  doadump () at ../../../kern/kern_shutdown.c:236
#1  0xc04fb30b in boot (howto=260) at ../../../kern/kern_shutdown.c:370
#2  0xc04fb5c3 in panic (fmt=0xc06776a0 "lockmgr: locking against myself") at ../../../kern/kern_shutdown.c:548
#3  0xc04f11b8 in lockmgr (lkp=0xc1982e0c, flags=2, interlkp=0x1000000, td=0xc219e2c0) at ../../../kern/kern_lock.c:437
#4  0xc05ffdd0 in _vm_map_lock_read (map=0x0, file=0xc068c92e "../../../vm/vm_map.c", line=2935) at machine/pcpu.h:156
#5  0xc0602e10 in vm_map_lookup (var_map=0xde778b24, vaddr=0, fault_typea=1 '\001', out_entry=0xde778b28, object=0x0, pindex=0x0, out_prot=0x0, wired=0xde778b00) at ../../../vm/vm_map.c:2935
#6  0xc05fbed9 in vm_fault (map=0xc1982dd0, vaddr=0, fault_type=1 '\001', fault_flags=0) at ../../../vm/vm_fault.c:232
#7  0xc063c49b in trap_pfault (frame=0xde778bec, usermode=0, eva=0) at ../../../i386/i386/trap.c:710
#8  0xc063c1b1 in trap (frame={tf_fs = -1038548968, tf_es = 16, tf_ds = -562626544, tf_edi = -1066871035, tf_esi = 431, tf_ebp = -562590664, tf_isp = -562590696, tf_ebx = 0, tf_edx = -1038490944, tf_ecx = 2, tf_eax = -1038490944, tf_trapno = 12, tf_err = 0, tf_eip = -1068550176, tf_cs = 8, tf_eflags = 78470, tf_esp = -1048169872, tf_ss = 1}) at ../../../i386/i386/trap.c:420
#9  0xc04f37e0 in _mtx_lock_flags (m=0x0, opts=0, file=0xc068d705 "../../../vm/vm_page.c", line=431) at ../../../kern/kern_mutex.c:246
#10 0xc0607638 in vm_page_sleep_if_busy (m=0xc1863270, also_m_busy=1, msg=0xc068d4a6 "madvpo") at ../../../vm/vm_page.c:431
#11 0xc0605ceb in vm_object_madvise (object=0xc22097bc, pindex=6116, count=0, advise=5) at ../../../vm/vm_object.c:1130
#12 0xc060118d in vm_map_madvise (map=0xc1982dd0, start=169009152, end=172679168, behav=5) at ../../../vm/vm_map.c:1546
#13 0xc0603c5a in madvise (td=0xc1982dd0, uap=0x0) at ../../../vm/vm_mmap.c:676
#14 0xc063ca6b in syscall (frame={tf_fs = 47, tf_es = 22151215, tf_ds = -1078001617, tf_edi = 3670016, tf_esi = 896, tf_ebp = -1077942168, tf_isp = -562590348, tf_ebx = 674006828, tf_edx = 811073536, tf_ecx = 7993, tf_eax = 75, tf_trapno = 12, tf_err = 2, tf_eip = 673541319, tf_cs = 31, tf_eflags = 12946, tf_esp = -1077942212, tf_ss = 47}) at ../../../i386/i386/trap.c:1004
(kgdb) p/x allproc.lh_first->p_threads->tqh_first->td_pcb->pcb_ebp
$1 = 0xdf6e8904
(kgdb) frame 0xdf6e8904
#0  0x00000000 in ?? ()
(kgdb) down
Bottom (i.e., innermost) frame selected; you cannot go down.
(kgdb) up
#1  0xc06e72cc in sysctl__kern_shutdown_children ()
(kgdb)
#2  0xc050157e in mi_switch (flags=0) at ../../../kern/kern_synch.c:352
352		sched_switch(td);
(kgdb)
#3  0xc0515992 in sleepq_switch (wchan=0x0) at ../../../kern/subr_sleepqueue.c:374
374		mi_switch(SW_VOL);
(kgdb)
#4  0xc0515b43 in sleepq_wait (wchan=0xcbe2d488) at ../../../kern/subr_sleepqueue.c:478
478		sleepq_switch(wchan);
(kgdb)
#5  0xc050125a in msleep (ident=0xcbe2d488, mtx=0xc070f980, priority=68, wmesg=0xc068c150 "swwrt", timo=0) at ../../../kern/kern_synch.c:243
243			sleepq_wait(ident);
(kgdb)
#6  0xc053c587 in bwait (bp=0xcbe2d488, pri=68 'D', wchan=0xc068c150 "swwrt") at ../../../kern/vfs_bio.c:3766
3766			msleep(bp, &bdonelock, pri, wchan, 0);
(kgdb)
#7  0xc05fa3c2 in swap_pager_putpages (object=0xc22097bc, m=0xdf6e8a6c, count=1, sync=1, rtvals=0xdf6e8a20) at ../../../vm/swap_pager.c:1372
1372			bwait(bp, PVM, "swwrt");
(kgdb)
#8  0xc060a6f2 in vm_pageout_flush (mc=0xdf6e8a6c, count=1, flags=1) at ../../../vm/vm_pager.h:139
139		(*pagertab[object->type]->pgo_putpages)
(kgdb)
#9  0xc0609332 in vm_contig_launder_page (m=0xc18636f0) at ../../../vm/vm_contig.c:121
warning: Source file is more recent than executable.

121				vm_pageout_flush(&m_tmp, 1, VM_PAGER_PUT_SYNC);
(kgdb)
#10 0xc0609c6e in vm_page_alloc_contig (npages=44, low=0, high=4294967295, alignment=4, boundary=0) at ../../../vm/vm_contig.c:447
447						if (vm_contig_launder_page(m) != 0)
(kgdb)
#11 0xc0609f6d in contigmalloc (size=180224, type=0xc06bbc80, flags=258, low=0, high=4294967295, alignment=4, boundary=0) at ../../../vm/vm_contig.c:546
546			pages = vm_page_alloc_contig(npgs, low, high,
(kgdb)
#12 0xc062d137 in bus_dmamem_alloc (dmat=0xc1cffc00, vaddr=0xc1d17554, flags=0, mapp=0x0) at ../../../i386/i386/busdma_machdep.c:430
430			*vaddr = contigmalloc(dmat->maxsize, M_DEVBUF, mflags,
(kgdb)
#13 0xc2251898 in ?? ()
(kgdb)
#14 0xc2251634 in ?? ()
(kgdb)
#15 0xc050d46c in device_attach (dev=0xc1d17550) at device_if.h:39
39		KOBJOPLOOKUP(((kobj_t)dev)->ops,device_attach);
(kgdb)
#16 0xc050d40c in device_probe_and_attach (dev=0xc1bd8400) at ../../../kern/subr_bus.c:1684
1684		error = device_attach(dev);
(kgdb)
#17 0xc0464785 in cardbus_driver_added (cbdev=0xc1a49e00, driver=0xc226a3e8) at ../../../dev/cardbus/cardbus.c:278
278			if (device_probe_and_attach(dev) != 0)
(kgdb)
#18 0xc050c403 in devclass_add_driver (dc=0xc19709c0, driver=0xc226a3e8) at bus_if.h:71
71		((bus_driver_added_t *) _m)(_dev, _driver);
(kgdb)
#19 0xc050ec92 in driver_module_handler (mod=0xc1cbc640, what=-1037655064, arg=0xc226a45c) at ../../../kern/subr_bus.c:2545
2545			error = devclass_add_driver(bus_devclass, driver);
(kgdb)
#20 0xc04f312e in module_register_init (arg=0xc226a470) at ../../../kern/kern_module.c:108
108		error = MOD_EVENT(mod, MOD_LOAD);
(kgdb)
#21 0xc04ee5e1 in linker_file_sysinit (lf=0xc1c8fe00) at ../../../kern/kern_linker.c:193
193			(*((*sipp)->func)) ((*sipp)->udata);
(kgdb)
#22 0xc04ee895 in linker_load_file (filename=0xc1d30180 "/home/green/prism54-driver/pff/if_pff.ko", result=0xdf6e8cb0) at ../../../kern/kern_linker.c:358
358				linker_file_sysinit(lf);
(kgdb)
#23 0xc04f08e7 in linker_load_module (kldname=0xc1d30180 "/home/green/prism54-driver/pff/if_pff.ko", modname=0x0, parent=0x0, verinfo=0x0, lfpp=0xdf6e8cdc) at ../../../kern/kern_linker.c:1673
1673			error = linker_load_file(pathname, &lfdep);
(kgdb)
#24 0xc04ef297 in kldload (td=0xc1cb59a0, uap=0x0) at ../../../kern/kern_linker.c:776
776		error = linker_load_module(kldname, modname, NULL, NULL, &lf);
(kgdb)
#25 0xc063ca6b in syscall (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 1, tf_esi = -1077942056, tf_ebp = -1077942136, tf_isp = -546402956, tf_ebx = 0, tf_edx = -1, tf_ecx = -1077941787, tf_eax = 304, tf_trapno = 12, tf_err = 2, tf_eip = 671870951, tf_cs = 31, tf_eflags = 658, tf_esp = -1077942180, tf_ss = 47}) at ../../../i386/i386/trap.c:1004
1004			error = (*callp->sy_call)(td, args);
(kgdb)
Initial frame selected; you cannot go up.
(kgdb)
Initial frame selected; you cannot go up.
(kgdb)
Initial frame selected; you cannot go up.
(kgdb)
Initial frame selected; you cannot go up.
(kgdb) bfeldman# exit

Script done on Fri Jul 16 19:23:44 2004

--FN+gV9K+162wdwwF--


