From: Ewart Tempest
To: "freebsd-hackers@freebsd.org"
Cc: Tony Lanza, Ewart Tempest
Date: Mon, 16 Apr 2012 15:08:25 -0400
Subject: Corrupted pmap pm_vlist - pmap_remove_pte()

In FreeBSD 6.*, we have been seeing crashes in pmap_remove_pages() that only
seem to occur in scaling scenarios:

    #ifdef PMAP_REMOVE_PAGES_CURPROC_ONLY
            pte = vtopte(pv->pv_va);
    #else
            pte = pmap_pte(pmap, pv->pv_va);
    #endif
            tpte = *pte;        <===== page fault here

We suspect that the pmap's pm_pvlist is being corrupted. With that in mind,
I have a question about the following logic in pmap_remove_pte() (see the
in-line comment):

    static int
    pmap_remove_pte(pmap_t pmap, pt_entry_t *ptq, vm_offset_t va, pd_entry_t ptepde)
    {
            pt_entry_t oldpte;
            vm_page_t m;

            PMAP_LOCK_ASSERT(pmap, MA_OWNED);
            oldpte = pte_load_clear(ptq);
            if (oldpte & PG_W)
                    pmap->pm_stats.wired_count -= 1;
            /*
             * Machines that don't support invlpg, also don't support
             * PG_G.
             */
            if (oldpte & PG_G)
                    pmap_invalidate_page(kernel_pmap, va);
            pmap->pm_stats.resident_count -= 1;
            if (oldpte & PG_MANAGED) {
                    m = PHYS_TO_VM_PAGE(oldpte & PG_FRAME);
                    if (oldpte & PG_M) {
    #if defined(PMAP_DIAGNOSTIC)
                            if (pmap_nw_modified((pt_entry_t) oldpte)) {
                                    printf(
            "pmap_remove: modified page not writable: va: 0x%lx, pte: 0x%lx\n",
                                        va, oldpte);
                            }
    #endif
                            if (pmap_track_modified(va))
                                    vm_page_dirty(m);
                    }
                    if (oldpte & PG_A)
                            vm_page_flag_set(m, PG_REFERENCED);
                    pmap_remove_entry(pmap, m, va);
            }
            return (pmap_unuse_pt(pmap, va, ptepde));
            <===== *** Under what circumstances is it valid to free the
                   page but not remove it from the pmap's pm_pvlist? Even
                   the code comment for pmap_unuse_pt() commences "After
                   removing a page table entry ...". ***
    }

If the tail end of the above function is changed as follows:

                    pmap_remove_entry(pmap, m, va);
                    return (pmap_unuse_pt(pmap, va, ptepde));
            }
            return (0);

then we no longer see any crashes, but is it the right thing to do?

Ewart
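
The behavioural difference between the two versions of the tail of pmap_remove_pte() can be sketched with a small user-space model. This is only an illustration, not kernel code: every `toy_` name is hypothetical, `pv_entries` stands in for the length of pm_pvlist, and `pt_wire_count` assumes that pmap_unuse_pt() drops one reference on the page table page, which the posted code does not show. The model only shows which counters each version touches for a managed and an unmanaged PTE; it does not settle which behaviour is correct.

```c
#include <assert.h>

/* Hypothetical stand-in for the PG_MANAGED bit; not the kernel's value. */
#define TOY_PG_MANAGED 0x1u

/* Hypothetical model of the per-pmap state relevant to the question. */
struct toy_pmap {
    int pv_entries;     /* models the number of entries on pm_pvlist */
    int pt_wire_count;  /* ASSUMED: references held on the page table page */
};

/* Models pmap_remove_entry(): takes one pv entry off the list. */
static void toy_remove_entry(struct toy_pmap *p)
{
    assert(p->pv_entries > 0);
    p->pv_entries--;
}

/*
 * Models pmap_unuse_pt() under the assumption stated above: drops one
 * reference and returns 1 when the page table page would be freed.
 */
static int toy_unuse_pt(struct toy_pmap *p)
{
    assert(p->pt_wire_count > 0);
    return (--p->pt_wire_count == 0);
}

/* Tail of pmap_remove_pte() as posted: always calls the unuse step. */
static int toy_remove_pte_original(struct toy_pmap *p, unsigned pte)
{
    if (pte & TOY_PG_MANAGED)
        toy_remove_entry(p);
    return toy_unuse_pt(p);
}

/* Tail as proposed: the unuse step runs only for managed pages. */
static int toy_remove_pte_proposed(struct toy_pmap *p, unsigned pte)
{
    if (pte & TOY_PG_MANAGED) {
        toy_remove_entry(p);
        return toy_unuse_pt(p);
    }
    return 0;
}
```

In this model, removing one managed and one unmanaged mapping from a pmap that starts with `pv_entries = 1` and `pt_wire_count = 2` leaves `pt_wire_count` at 0 with the original tail and at 1 with the proposed tail. In other words, the proposed change skips the page table page accounting for every PTE that lacks PG_MANAGED, so whether it is safe depends on whether such mappings hold a reference on the page table page.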