From: Andre Oppermann
Date: Tue, 27 Nov 2012 20:36:44 +0100
To: Alan Cox
Cc: Konstantin Belousov, alc@freebsd.org, freebsd-current@freebsd.org
Subject: Re: panic: vm_object_madvise: page 0xfffffe0413c58630 is fictitious
Message-ID: <50B5164C.9040206@freebsd.org>
In-Reply-To: <50B51194.2020209@rice.edu>
List-Id: Discussions about the use of FreeBSD-current

On 27.11.2012 20:16, Alan Cox wrote:
> On 11/27/2012 12:43, Andre Oppermann wrote:
>> On 27.11.2012 19:27, Alan Cox wrote:
>>> On 11/27/2012 12:08, Andre Oppermann wrote:
>>>> On 27.11.2012 17:42, Alan Cox wrote:
>>>>> On 11/27/2012 09:06, Konstantin Belousov wrote:
>>>>>> On Tue, Nov 27, 2012 at 12:26:44PM +0100, Andre Oppermann wrote:
>>>>>>> FreeBSD bbb.ccc 10.0-CURRENT FreeBSD 10.0-CURRENT #0:
>>>>>>> Fri Nov 23 17:00:40 CET 2012
>>>>>>> aaa@bbb.ccc:/usr/obj/usr/src/head/sys/GENERIC amd64
>>>>>>>
>>>>>>> #0 doadump (textdump=-2014022336) at pcpu.h:229
>>>>>>> #1 0xffffffff8033e2d2 in db_fncall (dummy1=<value optimized out>,
>>>>>>>     dummy2=<value optimized out>, dummy3=<value optimized out>,
>>>>>>>     dummy4=<value optimized out>)
>>>>>>>     at /usr/src/head/sys/ddb/db_command.c:578
>>>>>>> #2 0xffffffff8033e074 in db_command (last_cmdp=<value optimized out>,
>>>>>>>     cmd_table=<value optimized out>, dopager=1)
>>>>>>>     at /usr/src/head/sys/ddb/db_command.c:449
>>>>>>> #3 0xffffffff8033dd62 in db_command_loop ()
>>>>>>>     at /usr/src/head/sys/ddb/db_command.c:502
>>>>>>> #4 0xffffffff80340690 in db_trap (type=<value optimized out>, code=0)
>>>>>>>     at /usr/src/head/sys/ddb/db_main.c:231
>>>>>>> #5 0xffffffff808b375e in kdb_trap (type=3, code=0,
>>>>>>>     tf=<value optimized out>)
>>>>>>>     at /usr/src/head/sys/kern/subr_kdb.c:654
>>>>>>> #6 0xffffffff80bfc71a in trap (frame=0xffffff8487f478a0)
>>>>>>>     at /usr/src/head/sys/amd64/amd64/trap.c:579
>>>>>>> #7 0xffffffff80be65b2 in calltrap () at /tmp/exception-3nQ6Cf.s:179
>>>>>>> #8 0xffffffff808b2f5e in kdb_enter (why=0xffffffff80e5e23b "panic",
>>>>>>>     msg=<value optimized out>) at cpufunc.h:63
>>>>>>> #9 0xffffffff8088086f in panic (fmt=<value optimized out>)
>>>>>>>     at /usr/src/head/sys/kern/kern_shutdown.c:628
>>>>>>> #10 0xffffffff80adea4a in vm_object_madvise (
>>>>>>>     object=<value optimized out>, pindex=<value optimized out>,
>>>>>>>     end=8952, advise=<value optimized out>)
>>>>>>>     at /usr/src/head/sys/vm/vm_object.c:1101
>>>>>>> #11 0xffffffff80ad759a in vm_map_madvise (map=0xfffffe0018260188,
>>>>>>>     start=<value optimized out>, end=<value optimized out>, behav=5)
>>>>>>>     at /usr/src/head/sys/vm/vm_map.c:2140
>>>>>>> #12 0xffffffff80adbd8d in sys_madvise (td=<value optimized out>,
>>>>>>>     uap=<value optimized out>)
>>>>>>>     at /usr/src/head/sys/vm/vm_mmap.c:752
>>>>>>> #13 0xffffffff80bfd3a5 in amd64_syscall (td=0xfffffe0018230000,
>>>>>>>     traced=0) at subr_syscall.c:135
>>>>>>> #14 0xffffffff80be689b in Xfast_syscall ()
>>>>>>>     at /tmp/exception-3nQ6Cf.s:329
>>>>>>> #15 0x00000000016f3bfa in ?? ()
>>>>>> I think this is an omission in the check for the object types. BTW,
>>>>>> this pattern already repeats in several places. I thought about
>>>>>> adding either a new pager method, like boolean_t
>>>>>> vm_pager_is_pageable(), or just a flags field in the struct
>>>>>> vm_pager to classify the vm objects.
>>>>>
>>>>>
>>>>> A fictitious page should always have a non-zero wire count. In
>>>>> fact, it should always be one and never change. (See
>>>>> vm_page_unwire().) In vm_object_madvise(), there is a check against
>>>>> the page's wire count that precedes the KASSERT(). This check should
>>>>> prevent the KASSERT() from being reached for the various
>>>>> device-backed object types. So, something else has gone wrong here,
>>>>> or rather something has gone wrong elsewhere that caused the
>>>>> KASSERT() failure here.
>>>>>
>>>>> Andre, can we see the contents of the offending struct vm_page and
>>>>> also the struct vm_object to which the offending page belongs? Also,
>>>>> are you running a kernel with any experimental zero-copy send
>>>>> support?
>>>>
>>>> No experimental zero-copy support, or anything else, just a stock
>>>> GENERIC kernel.
>>>>
>>>> (kgdb) frame 11
>>>> #11 0xffffffff80ad759a in vm_map_madvise (map=0xfffffe0018260188,
>>>>     start=<value optimized out>, end=<value optimized out>, behav=5)
>>>>     at /usr/src/head/sys/vm/vm_map.c:2140
>>>> 2140    vm_object_madvise(current->object.vm_object, pstart,
>>>> (kgdb) p *map
>>>> $1 = {header = {prev = 0xfffffe025631c438, next = 0xfffffe0248f119d8,
>>>>     left = 0x0, right = 0x0, start = 4096, end = 140737488355328,
>>>>     avail_ssize = 0, adj_free = 0, max_free = 0,
>>>>     object = {vm_object = 0x0, sub_map = 0x0}, offset = 0, eflags = 0,
>>>>     protection = 0 '\0', max_protection = 0 '\0', inheritance = 0 '\0',
>>>>     read_ahead = 0 '\0', wired_count = 0, next_read = 0, cred = 0x0},
>>>>   lock = {lock_object = {lo_name = 0xffffffff80e66905 "vm map (user)",
>>>>       lo_flags = 36896768, lo_data = 0,
>>>>       lo_witness = 0xffffff80006c9700}, sx_lock = 17},
>>>>   system_mtx = {lock_object = {
>>>>       lo_name = 0xffffffff80e668d7 "vm map (system)",
>>>>       lo_flags = 21168128, lo_data = 0,
>>>>       lo_witness = 0xffffff80006c9500}, mtx_lock = 4},
>>>>   nentries = 32, size = 64647168, timestamp = 52,
>>>>   needs_wakeup = 0 '\0', system_map = 0 '\0', flags = 0 '\0',
>>>>   root = 0xfffffe02560a6258, pmap = 0xfffffe00182602b8, busy = 0}
>>>> (kgdb) p* map->pmap
>>>> $6 = {pm_mtx = {lock_object = {lo_name = 0xffffffff80e66934 "pmap",
>>>>       lo_flags = 21168128, lo_data = 0,
>>>>       lo_witness = 0xffffff80006c9900}, mtx_lock = 4},
>>>>   pm_pml4 = 0xfffffe0256458000,
>>>>   pm_pvchunk = {tqh_first = 0xfffffe0256142000,
>>>>     tqh_last = 0xfffffe025644c008},
>>>>   pm_active = {__bits = {1}},
>>>>   pm_stats = {resident_count = 12683, wired_count = 0},
>>>>   pm_root = 0xfffffe041289e040}
>>>> (kgdb) p* map->root
>>>> $7 = {prev = 0xfffffe0018ed0708, next = 0xfffffe02560a6870,
>>>>   left = 0xfffffe0018ed0708, right = 0xfffffe02560a6870,
>>>>   start = 34393292800, end = 34431041536, avail_ssize = 0,
>>>>   adj_free = 140703057047552, max_free = 140703057047552,
>>>>   object = {vm_object = 0xfffffe0256484570,
>>>>     sub_map = 0xfffffe0256484570},
>>>>   offset = 1810432, eflags = 0, protection = 3 '\003',
>>>>   max_protection = 7 '\a', inheritance = 1 '\001',
>>>>   read_ahead = 15 '\017', wired_count = 0, next_read = 0, cred = 0x0}
>>>>
>>>> (kgdb) p *current
>>>> $2 = {prev = 0xfffffe025631c438, next = 0xfffffe0248f119d8,
>>>>   left = 0x0, right = 0x0, start = 4096, end = 140737488355328,
>>>>   avail_ssize = 0, adj_free = 0, max_free = 0,
>>>>   object = {vm_object = 0x0, sub_map = 0x0}, offset = 0, eflags = 0,
>>>>   protection = 0 '\0', max_protection = 0 '\0', inheritance = 0 '\0',
>>>>   read_ahead = 0 '\0', wired_count = 0, next_read = 0, cred = 0x0}
>>>>
>>>> (kgdb) p *entry
>>>> $3 = {prev = 0xfffffe0018ed0708, next = 0xfffffe02560a6870,
>>>>   left = 0xfffffe0018ed0708, right = 0xfffffe02560a6870,
>>>>   start = 34393292800, end = 34431041536, avail_ssize = 0,
>>>>   adj_free = 140703057047552, max_free = 140703057047552,
>>>>   object = {vm_object = 0xfffffe0256484570,
>>>>     sub_map = 0xfffffe0256484570},
>>>>   offset = 1810432, eflags = 0, protection = 3 '\003',
>>>>   max_protection = 7 '\a', inheritance = 1 '\001',
>>>>   read_ahead = 15 '\017', wired_count = 0, next_read = 0, cred = 0x0}
>>>
>>>
>>> The following tells us that this is an OBJT_DEFAULT (i.e., anonymous)
>>> memory object. Such objects should never contain fictitious pages. Can
>>> you please print the contents of the offending struct vm_page using the
>>> address from the panic message?
>>
>> (kgdb) p (struct vm_page)*0xfffffe0413c58630
>> $12 = {pageq = {tqe_next = 0xfffffe04127fbcc8,
>>     tqe_prev = 0xfffffe0412658358},
>>   listq = {tqe_next = 0xfffffe0413c586a8,
>>     tqe_prev = 0xfffffe0413c585c8},
>>   left = 0xfffffe0413c585b8, right = 0xfffffe0413c586a8,
>>   object = 0xfffffe0256484570, pindex = 8868,
>>   phys_addr = 10744668160,
>>   md = {pv_list = {tqh_first = 0xfffffe025654d9a0,
>>       tqh_last = 0xfe025654d9a8}, pat_mode = 6},
>>   queue = 1 '\001', segind = 4 '\004', hold_count = 0,
>>   order = 13 '\r', pool = 0 '\0', cow = 0, wire_count = 0,
>>   aflags = 1 '\001', oflags = 0 '\0', flags = 65535,
>>   act_count = 5 '\005', busy = 0 '\0', valid = 255 'ÿ',
>>   dirty = 0 '\0'}
>>
>
> Except for the value of the "flags" field, this looks like a perfectly
> ordinary page of physical memory. In other words, this is not a
> fictitious page. Moreover, there is nothing inconsistent about the
> other fields.
>
> A "flags" field value of 65535 should be an impossibility. We only
> define flags for 9 of the 16 bits in the field.
>
> Is there any chance you're loading and using a kernel module that is
> older than your kernel?

No kernel modules loaded at all.

It's the first time I've ever gotten this panic. Maybe it's just a
Heisenbug. IIRC I don't have ECC in this machine.

-- 
Andre
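
A quick bit-arithmetic check of the "flags = 65535" observation quoted above may make the reasoning easier to follow. The sketch below is illustrative only: DEFINED_MASK and HYPOTHETICAL_FICTITIOUS are assumed placeholders (nine low bits, one arbitrary bit), not the real PG_* definitions from the kernel headers. It shows that a 16-bit field holding 65535 has every bit set, including bits that no flag is defined for, so any single-bit test on it, such as the one behind the "is fictitious" assertion, would succeed regardless of what the page really is.

```python
# Sketch only: the mask and bit position below are illustrative assumptions,
# not the actual PG_* values from the FreeBSD sources.
FLAGS_WIDTH = 16                    # the thread describes a 16-bit "flags" field
DEFINED_MASK = (1 << 9) - 1         # assumption: the 9 defined flags sit in the low 9 bits
HYPOTHETICAL_FICTITIOUS = 1 << 2    # assumption: one bit marks a fictitious page


def undefined_bits(flags: int) -> int:
    """Return the bits of `flags` that lie outside the defined mask."""
    return flags & ~DEFINED_MASK & ((1 << FLAGS_WIDTH) - 1)


def looks_fictitious(flags: int) -> bool:
    """Single-bit test of the kind an assertion on the flags field would make."""
    return bool(flags & HYPOTHETICAL_FICTITIOUS)


observed = 65535                    # "flags" value from the vm_page dump
print(hex(observed))                # every bit of the 16-bit field is set
print(hex(undefined_bits(observed)))  # non-zero: bits set that no flag defines
print(looks_fictitious(observed))   # a single-bit test passes on an all-ones field
```

Under these assumptions the page only appears fictitious because the whole field is saturated. Since the dump also shows wire_count = 0, the wire-count check that Alan describes as preceding the KASSERT() would presumably not have skipped this page.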