Date: Thu, 11 Nov 2004 19:38:22 +0300 (MSK)
From: Igor Sysoev <is@rambler-co.ru>
To: Uwe Doering <gemini@geminix.org>
Cc: stable@freebsd.org
Subject: Re: vnode_pager_putpages errors and DOS?
Message-ID: <20041111192947.A41088@is.park.rambler.ru>
In-Reply-To: <20041111190413.F41088@is.park.rambler.ru>
References: <Pine.NEB.3.96L.1041009150440.93055O-100000@fledge.watson.org>
    <4168578F.7060706@geminix.org>
    <20041103191641.K63546@is.park.rambler.ru>
    <4189666A.9020500@geminix.org>
    <20041104124616.S92154@is.park.rambler.ru>
    <418BEBC2.3020304@geminix.org>
    <20041111190413.F41088@is.park.rambler.ru>

On Thu, 11 Nov 2004, Igor Sysoev wrote:

> On Fri, 5 Nov 2004, Uwe Doering wrote:
>
> > Igor Sysoev wrote:
> > > [...]
> > > I've tried your patch from the second email (it requires including
> > > <sys/conf.h> for devsw and D_DISK): the system also became unresponsive.
> > >
> > > The main problem is that I could not kill the offending process - it
> > > was stuck in the biowr state.
> >
> > In the meantime I've investigated this further. The two patches I
> > provided so far certainly have their merits, since they deal with some
> > unwanted side effects. However, I found that the root cause for the
> > eventual system lock-up lies elsewhere.
> >
> > In an earlier email I already pointed out that function
> > vnode_pager_generic_putpages() actually doesn't care whether the write
> > operation failed or not. It always returns VM_PAGER_OK.
> >
> > Now, in case the write operation succeeds the file system code takes
> > care that the formerly dirty pages associated with the i/o buffer get
> > marked clean. On the other hand, if the write attempt fails, for
> > instance in an out-of-disk-space situation, the pages are left dirty.
> > At this point the syncer enters an infinite loop, trying to flush the
> > same dirty pages to disk over and over again.
> >
> > The fix is actually quite simple. In case of a write error we have to
> > make sure ourselves that the associated pages get marked clean. We do
> > this by returning VM_PAGER_BAD instead of VM_PAGER_OK. These two result
> > codes are functionally identical, with the exception that VM_PAGER_BAD
> > additionally marks the respective page clean. For the details, please
> > have a look at the caller function vm_pageout_flush() in 'vm_pageout.c'.
> >
> > What this modification means is that in case of a write error the
> > affected pages remain intact in memory until they get recycled, but we
> > lose their contents as far as the copy on disk is concerned. I believe
> > this is acceptable (and possibly even originally intended) because
> > giving up on syncing is about the best thing we can do in this
> > situation, anyway. And it is certainly a much better choice than
> > halting the whole system due to an infinite loop.
> >
> > I've attached an updated version of the patch for 'vnode_pager.c'. On
> > my test system it resolved the issue. Please let us know whether it
> > works for you as well.
>
> Sorry for the late response: I was ill and had no access to the test machine.
> I applied the patch to a clean 4.10. The result is the same: the process
> could not be killed, file system access is very limited, and the system
> became unresponsive.

Sorry, I applied the patch but forgot to rebuild the kernel :).
It seems that the patch resolves the problem: the program exits and the
system keeps working. I ran it several times. I will also run buildworld
on this system to make sure that the program did not affect the VM.

Igor Sysoev
http://sysoev.ru/en/
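
[Editor's illustration, not part of the original message or of the
attached patch.] The behaviour Uwe describes can be sketched as a small
user-space toy model in C. Everything prefixed toy_/TOY_ is a
hypothetical stand-in invented for this sketch; it is not the code from
'vnode_pager.c' or 'vm_pageout.c'. The model assumes only what the quoted
text states: a successful write marks the page clean, a failed write
leaves it dirty, and a "bad" result makes the caller mark the page clean.
The pass limit exists only so that the unpatched case terminates here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two result codes named in the message. */
enum toy_pager_result { TOY_PAGER_OK, TOY_PAGER_BAD };

/* A single page, reduced to its dirty flag. */
struct toy_page { bool dirty; };

/*
 * Toy "putpages". With report_errors == false it mimics the unpatched
 * behaviour described above (always OK); with report_errors == true it
 * mimics the patched behaviour (BAD on a failed write).
 */
static enum toy_pager_result
toy_putpages(struct toy_page *pg, bool write_fails, bool report_errors)
{
    if (!write_fails) {
        pg->dirty = false;      /* successful write: page becomes clean */
        return TOY_PAGER_OK;
    }
    /* Failed write (e.g. out of disk space): the page stays dirty. */
    return report_errors ? TOY_PAGER_BAD : TOY_PAGER_OK;
}

/* Toy flush: identical handling for both codes, except BAD cleans the page. */
static void
toy_flush(struct toy_page *pg, bool write_fails, bool report_errors)
{
    if (toy_putpages(pg, write_fails, report_errors) == TOY_PAGER_BAD)
        pg->dirty = false;
}

/*
 * Toy syncer: flush one dirty page until it is clean or max_passes is
 * reached. Returns the number of passes used.
 */
int
toy_syncer_passes(bool write_fails, bool report_errors, int max_passes)
{
    struct toy_page pg = { .dirty = true };
    int passes = 0;

    assert(max_passes > 0);
    while (pg.dirty && passes < max_passes) {
        toy_flush(&pg, write_fails, report_errors);
        passes++;
    }
    return passes;
}
```

In this model, a failing write without error reporting uses up every
allowed pass, which corresponds to the endless syncer loop; with error
reporting, the page is given up after a single pass, at the cost of
losing its contents on disk.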