Date:      Thu, 11 Nov 2004 19:38:22 +0300 (MSK)
From:      Igor Sysoev <is@rambler-co.ru>
To:        Uwe Doering <gemini@geminix.org>
Cc:        stable@freebsd.org
Subject:   Re: vnode_pager_putpages errors and DOS?
Message-ID:  <20041111192947.A41088@is.park.rambler.ru>
In-Reply-To: <20041111190413.F41088@is.park.rambler.ru>
References:  <Pine.NEB.3.96L.1041009150440.93055O-100000@fledge.watson.org> <4168578F.7060706@geminix.org> <20041103191641.K63546@is.park.rambler.ru> <4189666A.9020500@geminix.org> <20041104124616.S92154@is.park.rambler.ru> <418BEBC2.3020304@geminix.org> <20041111190413.F41088@is.park.rambler.ru>

On Thu, 11 Nov 2004, Igor Sysoev wrote:

> On Fri, 5 Nov 2004, Uwe Doering wrote:
>
> > Igor Sysoev wrote:
> > > [...]
> > > I've tried your patch from the second email (it requires including
> > > <sys/conf.h> for devsw and D_DISK): the system also became unresponsive.
> > >
> > > The main problem is that I could not kill the offending process - it
> > > is stuck in the biowr state.
> >
> > In the meantime I've investigated this further.  The two patches I
> > provided so far certainly have their merits, since they deal with some
> > unwanted side effects.  However, I found that the root cause for the
> > eventual system lock-up lies elsewhere.
> >
> > In an earlier email I already pointed out that function
> > vnode_pager_generic_putpages() actually doesn't care whether the write
> > operation failed or not.  It always returns VM_PAGER_OK.
> >
> > Now, in case the write operation succeeds the file system code takes
> > care that the formerly dirty pages associated with the i/o buffer get
> > marked clean.  On the other hand, if the write attempt fails, for
> > instance in an out-of-disk-space situation, the pages are left dirty.
> > At this point the syncer enters an infinite loop, trying to flush the
> > same dirty pages to disk over and over again.
> >
> > The fix is actually quite simple.  In case of a write error we have to
> > make sure ourselves that the associated pages get marked clean.  We do
> > this by returning VM_PAGER_BAD instead of VM_PAGER_OK.  These two result
> > codes are functionally identical, with the exception that VM_PAGER_BAD
> > additionally marks the respective page clean.  For the details, please
> > have a look at the caller function vm_pageout_flush() in 'vm_pageout.c'.
> >
> > What this modification means is that in case of a write error the
> > affected pages remain intact in memory until they get recycled, but we
> > lose their contents as far as the copy on disk is concerned.  I believe
> > this is acceptable (and possibly even originally intended) because
> > giving up on syncing is about the best thing we can do in this
> > situation, anyway.  And it is certainly a much better choice than
> > halting the whole system due to an infinite loop.
> >
> > I've attached an updated version of the patch for 'vnode_pager.c'.  On
> > my test system it resolved the issue.  Please let us know whether it
> > works for you as well.
>
> Sorry for the late response: I was ill and had no access to the test machine.
> I applied the patch to a clean 4.10. The result is the same: the process
> could not be killed, file system access is very limited, and the system
> became unresponsive.

Sorry, I applied the patch, but forgot to rebuild the kernel :).

It seems that the patch resolves the problem - the program exits and the
system keeps working.  I ran it several times.  I will also run buildworld on
this system to make sure that the program did not affect the VM.


Igor Sysoev
http://sysoev.ru/en/


