From owner-freebsd-hackers@FreeBSD.ORG Tue Apr 17 09:48:30 2012
Date: Tue, 17 Apr 2012 12:48:20 +0300
From: Konstantin Belousov
To: Ewart Tempest
Cc: alc@freebsd.org, "freebsd-hackers@freebsd.org", Tony Lanza
Subject: Re: Corrupted pmap pm_vlist - pmap_remove_pte()
Message-ID: <20120417094820.GK2358@deviant.kiev.zoral.com.ua>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="nO+apMLAy/edto0B"
Content-Disposition: inline
User-Agent: Mutt/1.4.2.3i
Precedence: list
List-Id: Technical Discussions relating to FreeBSD

--nO+apMLAy/edto0B
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Mon, Apr 16, 2012 at 03:08:25PM -0400, Ewart Tempest wrote:
> In FreeBSD 6.*, we have been seeing crashes in pmap_remove_pages() that only seem to occur in scaling scenarios:
>
> 2564 #ifdef PMAP_REMOVE_PAGES_CURPROC_ONLY
> 2565                 pte = vtopte(pv->pv_va);
> 2566 #else
> 2567                 pte = pmap_pte(pmap, pv->pv_va);
> 2568 #endif
> 2569                 tpte = *pte;    <===================== page fault here
>
> The suspicion is that the pmap's pm_pvlist list is getting corrupted. To this end, I have a question on the following logic in pmap_remove_pte() (see in-line comment):
>
> 1533 static int
> 1534 pmap_remove_pte(pmap_t pmap, pt_entry_t *ptq, vm_offset_t va, pd_entry_t ptepde)
> 1535 {
> 1536         pt_entry_t oldpte;
> 1537         vm_page_t m;
> 1538
> 1539         PMAP_LOCK_ASSERT(pmap, MA_OWNED);
> 1540         oldpte = pte_load_clear(ptq);
> 1541         if (oldpte & PG_W)
> 1542                 pmap->pm_stats.wired_count -= 1;
> 1543         /*
> 1544          * Machines that don't support invlpg, also don't support
> 1545          * PG_G.
> 1546          */
> 1547         if (oldpte & PG_G)
> 1548                 pmap_invalidate_page(kernel_pmap, va);
> 1549         pmap->pm_stats.resident_count -= 1;
> 1550         if (oldpte & PG_MANAGED) {
> 1551                 m = PHYS_TO_VM_PAGE(oldpte & PG_FRAME);
> 1552                 if (oldpte & PG_M) {
> 1553 #if defined(PMAP_DIAGNOSTIC)
> 1554                         if (pmap_nw_modified((pt_entry_t) oldpte)) {
> 1555                                 printf(
> 1556         "pmap_remove: modified page not writable: va: 0x%lx, pte: 0x%lx\n",
> 1557                                     va, oldpte);
> 1558                         }
> 1559 #endif
> 1560                         if (pmap_track_modified(va))
> 1561                                 vm_page_dirty(m);
> 1562                 }
> 1563                 if (oldpte & PG_A)
> 1564                         vm_page_flag_set(m, PG_REFERENCED);
> 1565                 pmap_remove_entry(pmap, m, va);
> 1566         }
> 1567         return (pmap_unuse_pt(pmap, va, ptepde));    <======= *** under what circumstances is it valid to free the page but not remove it from the pmap's pm_vlist? Even the code comment for pmap_unuse_pt() commences "After removing a page table entry ... ". ***

It is valid not to remove a pv_entry when no pv_entry exists for the
mapping. A pv_entry is created only if the page is managed; see the
pmap_enter() code. The block above the return is executed when the page
is managed, or at least when pmap thinks so.

The HEAD code will panic in pmap_pvh_free() if pmap_pvh_remove() cannot
find the pv entry for the given page and the given pmap/va.

> 1568 }
>
> If the tail end of the above function is changed as follows:
>
> 1565                 pmap_remove_entry(pmap, m, va);
> 1565.5               return (pmap_unuse_pt(pmap, va, ptepde));
> 1566         }
> 1567         return (0);
>
> Then we don't see any crashes ... but is it the right thing to do?

It should not be. Try testing this with some unmanaged mapping, such as
/dev/mem pages mapped into the address space of the exiting process.

I am too new to know about any nuances of the RELENG_6 code.
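
The concern about unmanaged mappings can be illustrated with a toy
userland model. This is only a sketch under stated assumptions, not
kernel code: every toy_* name is hypothetical, and the model assumes
that each user mapping holds one reference on its page table page while
only managed mappings have a pv entry. Under those assumptions, the
proposed change would leave a reference behind for each unmanaged
mapping that is removed.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not the FreeBSD pmap structures. */
struct toy_ptpage {
	int wire_count;		/* live mappings in this page table page */
};

struct toy_pmap {
	int pv_entries;		/* pv entries on the pmap's list */
	struct toy_ptpage pt;
};

/* Every mapping references the page table page; only managed ones get a pv entry. */
static void
toy_enter(struct toy_pmap *pm, bool managed)
{
	pm->pt.wire_count++;
	if (managed)
		pm->pv_entries++;
}

/* Stand-in for pmap_unuse_pt(): drop one reference, return 1 on the last one. */
static int
toy_unuse_pt(struct toy_pmap *pm)
{
	assert(pm->pt.wire_count > 0);
	return (--pm->pt.wire_count == 0);
}

/*
 * Tail of the removal path. With patched == false the reference is always
 * dropped (as in the quoted code); with patched == true it is dropped only
 * for managed mappings (as in the proposed change).
 */
static int
toy_remove(struct toy_pmap *pm, bool managed, bool patched)
{
	if (managed) {
		pm->pv_entries--;
		return (toy_unuse_pt(pm));
	}
	/* Unmanaged mapping: there is no pv entry to remove. */
	return (patched ? 0 : toy_unuse_pt(pm));
}

/* Map one managed and one unmanaged page, remove both, report leftover references. */
static int
toy_leaked_refs(bool patched)
{
	struct toy_pmap pm = { 0, { 0 } };

	toy_enter(&pm, true);
	toy_enter(&pm, false);
	toy_remove(&pm, true, patched);
	toy_remove(&pm, false, patched);
	return (pm.pt.wire_count);
}
```

In this model toy_leaked_refs(false) leaves no references behind, while
toy_leaked_refs(true) leaves one, which is the kind of imbalance the
/dev/mem test suggested above would be expected to expose.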

--nO+apMLAy/edto0B
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (FreeBSD)

iEYEARECAAYFAk+NPGQACgkQC3+MBN1Mb4jKjwCfZdJAjdcJ2d7SFP/NdhiTAZc/
kf4AnAmlE3ir/1n6zUackWmq8k5OTp81
=Gcsz
-----END PGP SIGNATURE-----

--nO+apMLAy/edto0B--