Date: Thu, 4 Nov 2004 12:52:48 +0300 (MSK)
From: Igor Sysoev <is@rambler-co.ru>
To: Uwe Doering <gemini@geminix.org>
Cc: stable@freebsd.org
Subject: Re: vnode_pager_putpages errors and DOS?
Message-ID: <20041104124616.S92154@is.park.rambler.ru>
In-Reply-To: <4189666A.9020500@geminix.org>
References: <Pine.NEB.3.96L.1041009150440.93055O-100000@fledge.watson.org> <4168578F.7060706@geminix.org> <20041103191641.K63546@is.park.rambler.ru> <4189666A.9020500@geminix.org>
On Thu, 4 Nov 2004, Uwe Doering wrote:

> Igor Sysoev wrote:
> > On Sat, 9 Oct 2004, Uwe Doering wrote:
> >>[...]
> >>I wonder whether the unresponsiveness is actually just the result of the
> >>kernel spending most of the time in printf(), generating warning
> >>messages. vnode_pager_generic_putpages() doesn't return any error in
> >>case of a write failure, so the caller (syncer in this case) isn't aware
> >>that the paging out failed, that is, it is supposed to carry on as if
> >>nothing happened.
> >>
> >>So how about limiting the number of warnings to one per second? UFS has
> >>similar code in order to curb "file system full" and the like. Please
> >>consider trying the attached patch, which applies cleanly to 4-STABLE.
> >>It won't make the actual application causing these errors any happier,
> >>but it may eliminate the DoS aspect of the issue.
> >
> > I have just tried your patch. To test I ran the program from
> > http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/67919
> >
> > The patch allows me to login on machine while the system reports about
> > "vnode_pager_putpages: I/O error 28". However, the file system access is
> > very limited and after some time the system became unresponsible.
>
> Limited file system access is to be expected, since
> vnode_pager_putpages() keeps the number of dirty buffers
> ('numdirtybuffers') near its upper limit ('hidirtybuffers'). However,
> the unresponsiveness may be caused by another shortcoming I found in the
> meantime.
>
> When 'numdirtybuffers' is greater or equal 'hidirtybuffers', function
> bwillwrite() will block until 'numdirtybuffers' drops below some
> threshold value. bwillwrite() gets called in a number of places that
> deal with writing data to disk.
>
> Two of these places are dofilewrite() (which is in turn called by
> write() and pwrite()) and writev(). There, bwillwrite() gets called if
> the file descriptor is of type DTYPE_VNODE. Now, this unfortunately
> doesn't take into account that ttys, including pseudo ttys, and even
> /dev/null and friends, are character device nodes and therefore vnodes
> as well, but have nothing to do with writing data to disk. That is, in
> case of heavy disk write activity, write attempts to these device nodes
> get blocked, too! With the consequence that the system appears to
> become unresponsive at the shell prompt, or reacts very sporadic. Even
> daemonized processes that happen to log data to /dev/null (on stdout &
> stderr, for example) will block.
>
> What we need here is an additional test that makes sure that in case of
> a character device bwillwrite() gets called only if the device is in
> fact a disk. Please consider trying out the attached patch. It will
> not reduce the heavy disk activity (which is, after all, legitimate),
> but it is supposed to enable you to operate the system at shell level
> and kill the offending process, or do whatever is necessary to resolve
> the problem.

I've tried the patch from your second email (it requires including
<sys/conf.h> for devsw and D_DISK): the system also became unresponsive.

The main problem is that I could not kill the offending process: it was
stuck in the biowr state.


Igor Sysoev
http://sysoev.ru/en/
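
The one-warning-per-second throttling proposed in the quoted message can be illustrated in user space. This is only a sketch and not the attached patch, which is not reproduced in this message: warn_allowed() and report_write_error() are hypothetical names, and the clock value is passed in by the caller, where kernel code would presumably read its own seconds counter.

```c
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper (not from the patch): allow at most one warning
 * per distinct second. *last holds the second in which the previous
 * warning was emitted.
 */
int
warn_allowed(time_t *last, time_t now)
{
	if (*last == now)
		return (0);	/* already warned during this second */
	*last = now;
	return (1);
}

/*
 * Hypothetical stand-in for the failing write path: report the error
 * only if the rate limit permits it. Returns 1 if a line was printed,
 * 0 if the warning was suppressed. The write itself still fails either
 * way; only the message volume is limited.
 */
int
report_write_error(time_t *last, time_t now, int error)
{
	if (!warn_allowed(last, now))
		return (0);
	fprintf(stderr, "vnode_pager_putpages: I/O error %d\n", error);
	return (1);
}
```

Calling report_write_error() repeatedly with the same `now` value prints a single line; the next line appears only once `now` advances. As noted in the thread, this curbs the console output but does nothing for the writes that keep failing.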