Date:      Tue, 27 Nov 2012 20:36:44 +0100
From:      Andre Oppermann <andre@freebsd.org>
To:        Alan Cox <alc@rice.edu>
Cc:        Konstantin Belousov <kostikbel@gmail.com>, alc@freebsd.org, freebsd-current@freebsd.org
Subject:   Re: panic: vm_object_madvise: page 0xfffffe0413c58630 is fictitious
Message-ID:  <50B5164C.9040206@freebsd.org>
In-Reply-To: <50B51194.2020209@rice.edu>
References:  <50B4A374.5040705@freebsd.org> <20121127150625.GJ3013@kib.kiev.ua> <50B4ED86.40308@rice.edu> <50B50196.4080804@freebsd.org> <50B5061E.90805@rice.edu> <50B509CC.5090508@freebsd.org> <50B51194.2020209@rice.edu>

On 27.11.2012 20:16, Alan Cox wrote:
> On 11/27/2012 12:43, Andre Oppermann wrote:
>> On 27.11.2012 19:27, Alan Cox wrote:
>>> On 11/27/2012 12:08, Andre Oppermann wrote:
>>>> On 27.11.2012 17:42, Alan Cox wrote:
>>>>> On 11/27/2012 09:06, Konstantin Belousov wrote:
>>>>>> On Tue, Nov 27, 2012 at 12:26:44PM +0100, Andre Oppermann wrote:
>>>>>>> FreeBSD bbb.ccc 10.0-CURRENT FreeBSD 10.0-CURRENT #0:
>>>>>>> Fri Nov 23 17:00:40 CET 2012
>>>>>>> aaa@bbb.ccc:/usr/obj/usr/src/head/sys/GENERIC  amd64
>>>>>>>
>>>>>>> #0  doadump (textdump=-2014022336) at pcpu.h:229
>>>>>>> #1  0xffffffff8033e2d2 in db_fncall (dummy1=<value optimized out>, dummy2=<value optimized out>,
>>>>>>>         dummy3=<value optimized out>, dummy4=<value optimized out>)
>>>>>>>         at /usr/src/head/sys/ddb/db_command.c:578
>>>>>>> #2  0xffffffff8033e074 in db_command (last_cmdp=<value optimized out>,
>>>>>>>         cmd_table=<value optimized out>, dopager=1)
>>>>>>>         at /usr/src/head/sys/ddb/db_command.c:449
>>>>>>> #3  0xffffffff8033dd62 in db_command_loop ()
>>>>>>>         at /usr/src/head/sys/ddb/db_command.c:502
>>>>>>> #4  0xffffffff80340690 in db_trap (type=<value optimized out>, code=0)
>>>>>>>         at /usr/src/head/sys/ddb/db_main.c:231
>>>>>>> #5  0xffffffff808b375e in kdb_trap (type=3, code=0, tf=<value optimized out>)
>>>>>>>         at /usr/src/head/sys/kern/subr_kdb.c:654
>>>>>>> #6  0xffffffff80bfc71a in trap (frame=0xffffff8487f478a0)
>>>>>>>         at /usr/src/head/sys/amd64/amd64/trap.c:579
>>>>>>> #7  0xffffffff80be65b2 in calltrap () at /tmp/exception-3nQ6Cf.s:179
>>>>>>> #8  0xffffffff808b2f5e in kdb_enter (why=0xffffffff80e5e23b "panic", msg=<value optimized out>)
>>>>>>>         at cpufunc.h:63
>>>>>>> #9  0xffffffff8088086f in panic (fmt=<value optimized out>)
>>>>>>>         at /usr/src/head/sys/kern/kern_shutdown.c:628
>>>>>>> #10 0xffffffff80adea4a in vm_object_madvise (object=<value optimized out>,
>>>>>>>         pindex=<value optimized out>, end=8952, advise=<value optimized out>)
>>>>>>>         at /usr/src/head/sys/vm/vm_object.c:1101
>>>>>>> #11 0xffffffff80ad759a in vm_map_madvise (map=0xfffffe0018260188, start=<value optimized out>,
>>>>>>>         end=<value optimized out>, behav=5)
>>>>>>>         at /usr/src/head/sys/vm/vm_map.c:2140
>>>>>>> #12 0xffffffff80adbd8d in sys_madvise (td=<value optimized out>, uap=<value optimized out>)
>>>>>>>         at /usr/src/head/sys/vm/vm_mmap.c:752
>>>>>>> #13 0xffffffff80bfd3a5 in amd64_syscall (td=0xfffffe0018230000, traced=0)
>>>>>>>         at subr_syscall.c:135
>>>>>>> #14 0xffffffff80be689b in Xfast_syscall () at /tmp/exception-3nQ6Cf.s:329
>>>>>>> #15 0x00000000016f3bfa in ?? ()
>>>>>> I think this is an omission in the check for the object types. BTW, this
>>>>>> pattern already repeats in several places. I thought about adding either
>>>>>> a new pager method, like boolean_t vm_pager_is_pageable(), or just a
>>>>>> flags field in struct vm_pager to classify the vm objects.
>>>>>
>>>>>
>>>>> A fictitious page should always have a non-zero wire count.  In fact, it
>>>>> should always be one and never change.  (See vm_page_unwire().)  In
>>>>> vm_object_madvise(), there is a check against the page's wire count that
>>>>> precedes the KASSERT().  This check should prevent the KASSERT() from
>>>>> being reached for the various device-backed object types.  So, something
>>>>> else has gone wrong here, or rather something has gone wrong elsewhere
>>>>> that caused the KASSERT() failure here.
>>>>>
>>>>> Andre, can we see the contents of the offending struct vm_page and also
>>>>> the struct vm_object to which the offending page belongs?  Also, are
>>>>> you running a kernel with any experimental zero-copy send support?
>>>>
>>>> No experimental zero-copy support, or anything else, just a stock
>>>> GENERIC kernel.
>>>>
>>>> (kgdb) frame 11
>>>> #11 0xffffffff80ad759a in vm_map_madvise (map=0xfffffe0018260188, start=<value optimized out>,
>>>>       end=<value optimized out>, behav=5) at /usr/src/head/sys/vm/vm_map.c:2140
>>>> 2140            vm_object_madvise(current->object.vm_object, pstart,
>>>> (kgdb) p *map
>>>> $1 = {header = {prev = 0xfffffe025631c438, next = 0xfffffe0248f119d8,
>>>> left = 0x0, right = 0x0,
>>>>       start = 4096, end = 140737488355328, avail_ssize = 0, adj_free =
>>>> 0, max_free = 0, object = {
>>>>         vm_object = 0x0, sub_map = 0x0}, offset = 0, eflags = 0,
>>>> protection = 0 '\0',
>>>>       max_protection = 0 '\0', inheritance = 0 '\0', read_ahead = 0
>>>> '\0', wired_count = 0,
>>>>       next_read = 0, cred = 0x0}, lock = {lock_object = {
>>>>         lo_name = 0xffffffff80e66905 "vm map (user)", lo_flags =
>>>> 36896768, lo_data = 0,
>>>>         lo_witness = 0xffffff80006c9700}, sx_lock = 17}, system_mtx =
>>>> {lock_object = {
>>>>         lo_name = 0xffffffff80e668d7 "vm map (system)", lo_flags =
>>>> 21168128, lo_data = 0,
>>>>         lo_witness = 0xffffff80006c9500}, mtx_lock = 4}, nentries = 32,
>>>> size = 64647168,
>>>>     timestamp = 52, needs_wakeup = 0 '\0', system_map = 0 '\0', flags =
>>>> 0 '\0',
>>>>     root = 0xfffffe02560a6258, pmap = 0xfffffe00182602b8, busy = 0}
>>>> (kgdb) p* map->pmap
>>>> $6 = {pm_mtx = {lock_object = {lo_name = 0xffffffff80e66934 "pmap",
>>>> lo_flags = 21168128,
>>>>         lo_data = 0, lo_witness = 0xffffff80006c9900}, mtx_lock = 4},
>>>> pm_pml4 = 0xfffffe0256458000,
>>>>     pm_pvchunk = {tqh_first = 0xfffffe0256142000, tqh_last =
>>>> 0xfffffe025644c008}, pm_active = {
>>>>       __bits = {1}}, pm_stats = {resident_count = 12683, wired_count
>>>> = 0},
>>>>     pm_root = 0xfffffe041289e040}
>>>> (kgdb) p* map->root
>>>> $7 = {prev = 0xfffffe0018ed0708, next = 0xfffffe02560a6870, left =
>>>> 0xfffffe0018ed0708,
>>>>     right = 0xfffffe02560a6870, start = 34393292800, end = 34431041536,
>>>> avail_ssize = 0,
>>>>     adj_free = 140703057047552, max_free = 140703057047552, object = {
>>>>       vm_object = 0xfffffe0256484570, sub_map = 0xfffffe0256484570},
>>>> offset = 1810432, eflags = 0,
>>>>     protection = 3 '\003', max_protection = 7 '\a', inheritance = 1
>>>> '\001', read_ahead = 15 '\017',
>>>>     wired_count = 0, next_read = 0, cred = 0x0}
>>>>
>>>> (kgdb) p *current
>>>> $2 = {prev = 0xfffffe025631c438, next = 0xfffffe0248f119d8, left =
>>>> 0x0, right = 0x0, start = 4096,
>>>>     end = 140737488355328, avail_ssize = 0, adj_free = 0, max_free = 0,
>>>> object = {vm_object = 0x0,
>>>>       sub_map = 0x0}, offset = 0, eflags = 0, protection = 0 '\0',
>>>> max_protection = 0 '\0',
>>>>     inheritance = 0 '\0', read_ahead = 0 '\0', wired_count = 0,
>>>> next_read = 0, cred = 0x0}
>>>>
>>>> (kgdb) p *entry
>>>> $3 = {prev = 0xfffffe0018ed0708, next = 0xfffffe02560a6870, left =
>>>> 0xfffffe0018ed0708,
>>>>     right = 0xfffffe02560a6870, start = 34393292800, end = 34431041536,
>>>> avail_ssize = 0,
>>>>     adj_free = 140703057047552, max_free = 140703057047552, object = {
>>>>       vm_object = 0xfffffe0256484570, sub_map = 0xfffffe0256484570},
>>>> offset = 1810432, eflags = 0,
>>>>     protection = 3 '\003', max_protection = 7 '\a', inheritance = 1
>>>> '\001', read_ahead = 15 '\017',
>>>>     wired_count = 0, next_read = 0, cred = 0x0}
>>>
>>>
>>> The following tells us that this is an OBJT_DEFAULT (i.e., anonymous)
>>> memory object.  Such objects should never contain fictitious pages.  Can
>>> you please print the contents of the offending struct vm_page using the
>>> address from the panic message?
>>
>> (kgdb) p (struct vm_page)*0xfffffe0413c58630
>> $12 = {pageq = {tqe_next = 0xfffffe04127fbcc8, tqe_prev = 0xfffffe0412658358},
>>    listq = {tqe_next = 0xfffffe0413c586a8, tqe_prev = 0xfffffe0413c585c8},
>>    left = 0xfffffe0413c585b8, right = 0xfffffe0413c586a8,
>>    object = 0xfffffe0256484570, pindex = 8868, phys_addr = 10744668160,
>>    md = {pv_list = {tqh_first = 0xfffffe025654d9a0, tqh_last = 0xfe025654d9a8}, pat_mode = 6},
>>    queue = 1 '\001', segind = 4 '\004', hold_count = 0, order = 13 '\r', pool = 0 '\0', cow = 0,
>>    wire_count = 0, aflags = 1 '\001', oflags = 0 '\0', flags = 65535, act_count = 5 '\005',
>>    busy = 0 '\0', valid = 255 'ÿ', dirty = 0 '\0'}
>>
>
> Except for the value of the "flags" field, this looks like a perfectly
> ordinary page of physical memory.  In other words, this is not a
> fictitious page.  Moreover, there is nothing inconsistent about the
> other fields.
>
> A "flags" field value of 65535 should be an impossibility.  We only
> define flags for 9 of the 16 bits in the field.
>
> Is there any chance you're loading and using a kernel module that is
> older than your kernel?
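
[Illustration only, not kernel code.]  The reasoning quoted above can be
sketched in a few lines of C.  Everything named here is hypothetical: the
mask PAGE_FLAGS_DEFINED assumes that the nine defined flags occupy the low
nine bits of the 16-bit field, and PAGE_FLAG_FICTITIOUS is a stand-in bit
chosen for the example.  The thread gives neither value.  The sketch shows
two things: a flags value of 65535 has bits set outside any nine-bit mask,
and a page with wire_count == 0 gets past the wire-count check, so a set
"fictitious" bit would then trip the assertion.

```c
#include <stdint.h>

/* ASSUMPTION: the nine defined flags sit in the low nine bits. */
#define PAGE_FLAGS_DEFINED   0x01ffu
/* ASSUMPTION: hypothetical bit standing in for the "fictitious" flag. */
#define PAGE_FLAG_FICTITIOUS 0x0004u

/* Return the bits of a 16-bit flags value that no defined flag covers. */
uint16_t
undefined_flag_bits(uint16_t flags)
{
	return ((uint16_t)(flags & (uint16_t)~PAGE_FLAGS_DEFINED));
}

/*
 * Model the order of checks described in the thread: wired pages are
 * skipped first, and only unwired pages reach the "not fictitious"
 * assertion.  Returns 1 if that assertion would fail, 0 otherwise.
 */
int
assertion_would_fire(unsigned int wire_count, uint16_t flags)
{
	if (wire_count != 0)
		return (0);	/* skipped before the assertion */
	return ((flags & PAGE_FLAG_FICTITIOUS) != 0);
}
```

With the values from the dump (wire_count = 0, flags = 65535),
undefined_flag_bits() returns 0xfe00 under the assumed mask and
assertion_would_fire() returns 1.  That fits the reading that the flags
field itself is corrupt, not that the page is really fictitious.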

No kernel modules are loaded at all.  This is the first time I have ever
hit this panic.  Maybe it's just a Heisenbug.  IIRC I don't have ECC in
this machine.

-- 
Andre