From owner-freebsd-current@freebsd.org Thu Feb 18 15:13:31 2016 Return-Path: Delivered-To: freebsd-current@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E2D3AAAC615 for ; Thu, 18 Feb 2016 15:13:31 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 718251AF8; Thu, 18 Feb 2016 15:13:31 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u1IFDLrl020395 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Thu, 18 Feb 2016 17:13:21 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u1IFDLrl020395 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u1IFDL6a020394; Thu, 18 Feb 2016 17:13:21 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 18 Feb 2016 17:13:21 +0200 From: Konstantin Belousov To: Andriy Gapon Cc: alc@FreeBSD.org, FreeBSD Current Subject: Re: Memory modified after free in "MAP ENTRY" zone (vm_map_entry_t->read_ahead) Message-ID: <20160218151321.GR91220@kib.kiev.ua> References: <56BBAB6E.5050601@FreeBSD.org> <56C08AAA.5050206@FreeBSD.org> <56C1953F.60604@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <56C1953F.60604@FreeBSD.org> User-Agent: Mutt/1.5.24 (2015-08-30) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 18 Feb 2016 15:13:32 -0000 On Mon, Feb 15, 2016 at 11:07:11AM +0200, Andriy Gapon wrote: > On 15/02/2016 00:27, Alan Cox wrote: > > > > On Sun, Feb 14, 2016 at 8:09 AM, Andriy Gapon > > wrote: > > > > On 10/02/2016 23:28, Andriy Gapon wrote: > > > > > > Over a span of approximately 3 weeks I have got two slightly different panics of > > > the same kind. The affected system is a several months old amd64 head. > > > > I added a small assertion and it got tripped: > [snip] > > > > So, it seems that the code allows modification of read_ahead field of an entry > > without holding a map lock. I'd guess that that should be mostly harmless as > > read_ahead value is not critical, but it is still not nice. > > > > Not intentionally. The nearby code to the read_ahead assignment expects the map > > to be locked, so I wouldn't categorize this as mostly harmless. > > > > In general, you shouldn't get to the read_ahead assignment without the map being > > locked, and almost all of the code paths that unlock the map jump to RetryFault > > shortly thereafter, so it's hard to imagine how the map lock wouldn't be > > reacquired. I speculate that the root cause of your panic is a case where > > vm_pager_get_pages() fails, and in such a case we might loop around, descending > > to the next backing object without reacquiring the map lock. In other words, > > your above assertion failure is happening on the next iteration after an initial > > vm_pager_get_pages() failure, and we're about to do another vm_pager_get_pages() > > call on a different object. > > > > In summary, I have no trouble believing that the code that handles a failure by > > vm_pager_get_pages() in vm_fault() is not quite right, and I think that's what's > > biting you. > > Alan, > > thank you very much for the very insightful analysis. > Indeed, I see that in this case the object chain is default -> swap -> swap. I > am not sure how such chain was created. It seems that going default -> swap is > not a problem as the map lock is not dropped in this case. But going swap -> > swap the way you described (pager error, e.g. the page is just not found) has > exactly the problem that you suggested. > So this is arguably a fallout from r188331. The following is somewhat non-insistent attempt to fix the problem. diff --git a/sys/vm/vm_fault.c b/sys/vm/vm_fault.c index a7e3d37..cddf1eb 100644 --- a/sys/vm/vm_fault.c +++ b/sys/vm/vm_fault.c @@ -291,7 +291,8 @@ vm_fault_hold(vm_map_t map, vm_offset_t vaddr, vm_prot_t fault_type, struct faultstate fs; struct vnode *vp; vm_page_t m; - int ahead, behind, cluster_offset, error, locked; + int ahead, behind, cluster_offset, error, locked, rv; + u_char behavior; hardfault = 0; growstack = TRUE; @@ -550,9 +551,18 @@ readrest: * zero-filled pages. */ if (fs.object->type != OBJT_DEFAULT) { - int rv; - u_char behavior = vm_map_entry_behavior(fs.entry); - + if (!fs.lookup_still_valid) { + locked = vm_map_trylock_read(fs.map); + if (locked) + fs.lookup_still_valid = TRUE; + if (!locked || fs.map->timestamp != + map_generation) { + release_page(&fs); + unlock_and_deallocate(&fs); + goto RetryFault; + } + } + behavior = vm_map_entry_behavior(fs.entry); era = fs.entry->read_ahead; if (behavior == MAP_ENTRY_BEHAV_RANDOM || P_KILLED(curproc)) { @@ -603,6 +613,7 @@ readrest: } ahead = ulmin(ahead, atop(fs.entry->end - vaddr) - 1); if (era != nera) + /* XXX only read-lock on map */ fs.entry->read_ahead = nera; /*