From owner-freebsd-hackers  Thu Aug 13 13:02:07 1998
Message-Id: <199808131959.MAA00604@dingo.cdrom.com>
X-Mailer: exmh version 2.0zeta 7/24/97
To: peter@sirius.com
cc: mrcpu@internetcds.com (Jaye Mathisen), hackers@FreeBSD.ORG,
        stable@FreeBSD.ORG
Subject: Re: vmopar state in 2.2.7? 
In-reply-to: Your message of "Thu, 13 Aug 1998 11:34:00 PDT."
             <199808131834.LAA14961@staff.sirius.com> 
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Thu, 13 Aug 1998 12:59:45 -0700
From: Mike Smith <mike@smith.net.au>
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> 
> We worked around a similar problem (processes left immortal, here in
> the context of several processes [httpd] writing to the same NFS mounted 
> file [http log file]) by adjusting the timeout value from 0 (never) to
> 2 * hz (2 seconds). Details are posted as follow-up to kern/4588 in
> FreeBSD.org's gnats problem report database.

As this is an NFS-related issue, you should follow this up with 
Poul-Henning Kamp (phk@freebsd.org).  I understand he's working on NFS 
amongst other things at the moment (I know he's working on FreeBSD for 
us, as he keeps sending us invoices... 8).

> It looks like other parts of the kernel (here the vm subsystem) suffer
> similar problems. It appears to me that an overly optimistic use of
> tsleep(), with both interrupts disabled and the time-out set to infinity,
> leaves immortal yet paralyzed processes around.

I don't think you mean interrupts disabled; splvm() only raises the 
priority level to mask vm-related interrupts.

> From /usr/src/sys/vm/vm_object.c (a second, similar occurrence around
> line 1261):
> 
>    1218      /*
>    1219      * The busy flags are only cleared at
>    1220      * interrupt -- minimize the spl transitions
>    1221      */
>    1222      if ((p->flags & PG_BUSY) || p->busy) {
>    1223               s = splvm();
>    1224               if ((p->flags & PG_BUSY) || p->busy) {
>    1225                       p->flags |= PG_WANTED;
>    1226                       tsleep(p, PVM, "vmopar", 0);
>    1227                       splx(s);
>    1228                       goto again;
>    1229               }
>    1230               splx(s);
>    1231      }
> 
> The code in line 1224 checks a condition to see whether somebody else
> is already performing an operation on object p; in this case it wants
> to ensure that a wakeup() for the following tsleep() is delivered by
> setting a flag in line 1225.
> 
> But what ensures that the world did not change between lines 1224 and
> 1225? Could the wakeup() happen after 1224 has determined to issue
> the tsleep() but before the flagging in 1225 was registered? Then it
> would be missed. Is this a race condition biting heavily hit machines?

It shouldn't.  The splvm() call should mask vm-related activities from 
its return through to the call to tsleep() (where the mask is saved and 
the mask for the new context is restored).  There is a risk that the 
assumption in the comment is invalid; you would want to look for any 
likely operations involving PG_BUSY.
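
To make that argument concrete, here is a rough userland model of the 
window between lines 1224 and 1225 (it is not kernel code).  Only 
PG_BUSY and PG_WANTED come from the source; page_model, io_done_model, 
maybe_interrupt and sleeper_gets_stuck are names made up for 
illustration, the flag values are arbitrary, and the model assumes the 
completion side clears PG_BUSY and wakes the sleeper only when 
PG_WANTED is set.

```c
/* Userland model only; every name except PG_BUSY/PG_WANTED is hypothetical. */
#include <assert.h>
#include <stdbool.h>

#define PG_BUSY   0x01          /* values are arbitrary in this model */
#define PG_WANTED 0x02

struct page_model {
    int  flags;
    bool irq_pending;           /* an I/O completion is waiting to be delivered */
    bool vm_masked;             /* stands in for "we are at splvm()" */
    bool woken;                 /* a wakeup was issued for this page */
};

/* Assumed completion path: clear PG_BUSY, wake waiters only if PG_WANTED. */
static void io_done_model(struct page_model *p)
{
    p->flags &= ~PG_BUSY;
    if (p->flags & PG_WANTED) {
        p->flags &= ~PG_WANTED;
        p->woken = true;
    }
    p->irq_pending = false;
}

/* Called at each point where an interrupt could be taken. */
static void maybe_interrupt(struct page_model *p)
{
    if (p->irq_pending && !p->vm_masked)
        io_done_model(p);
}

/* Returns true if the sleeper ends up asleep with no wakeup coming. */
bool sleeper_gets_stuck(bool use_spl)
{
    struct page_model p = { PG_BUSY, true, false, false };

    p.vm_masked = use_spl;          /* s = splvm(); */
    if (p.flags & PG_BUSY) {        /* line 1224: test */
        maybe_interrupt(&p);        /* the window in question */
        p.flags |= PG_WANTED;       /* line 1225: ask for a wakeup */
        p.vm_masked = false;        /* tsleep(): mask no longer applies */
        maybe_interrupt(&p);
        return !p.woken;
    }
    return false;
}
```

With the mask in place the completion is held off until PG_WANTED is 
set, so the wakeup is delivered; without it the completion slips into 
the window and the sleeper is stranded.  If the real hang looks like 
the second case, something is clearing the busy state from a context 
that splvm() does not block.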

To track this one further, you would want to look at the code that's 
responsible for dealing with pages with PG_WANTED set, and work out 
why it's never satisfying this request (or if it is, why it's not 
waking the above caller up).
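
As a hedged illustration of what to hunt for, the sketch below 
contrasts a release path that honours PG_WANTED with one that forgets 
it.  This is a userland model, not the 2.2 code: page_unbusy_ok, 
page_io_finish_buggy, sleeper_stranded and wakeup_model are 
hypothetical names, and the flag values are arbitrary.  The last 
function is the invariant I would check: PG_WANTED still set on a page 
that is neither PG_BUSY nor busy means somebody missed a wakeup.

```c
/* Userland model only; function names here are hypothetical. */
#include <assert.h>
#include <stdbool.h>

#define PG_BUSY   0x01          /* values are arbitrary in this model */
#define PG_WANTED 0x02

struct page_model {
    int flags;
    int busy;                   /* count of in-progress operations */
    int wakeups;                /* how many wakeups were issued */
};

/* Stand-in for wakeup(p): just count the call. */
static void wakeup_model(struct page_model *p)
{
    p->wakeups++;
}

/* A release path that honours PG_WANTED. */
void page_unbusy_ok(struct page_model *p)
{
    p->flags &= ~PG_BUSY;
    if (p->flags & PG_WANTED) {
        p->flags &= ~PG_WANTED;
        wakeup_model(p);
    }
}

/* The kind of path to look for: drops the busy count, never checks PG_WANTED. */
void page_io_finish_buggy(struct page_model *p)
{
    if (p->busy > 0)
        p->busy--;
    /* missing: if busy reached 0 and PG_WANTED is set, clear it and wake up */
}

/* True if someone asked for a wakeup on a page that is no longer busy. */
bool sleeper_stranded(const struct page_model *p)
{
    return (p->flags & PG_WANTED) && !(p->flags & PG_BUSY) && p->busy == 0;
}
```

A check like sleeper_stranded() dropped into a debugging kernel at the 
places where the busy state changes would be one way to catch the 
offending path in the act.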

> Try changing lines 1226 and 1261 to something like:
> 	tsleep(p, PVM, "vmopar", 5 * hz);
...
> This function would return "EWOULDBLOCK" after the time-out expires then, 
> no clue what that will do to your system or apps ;) -- I would expect the
> blocked process to go away within 5 seconds...

I don't have 2.2 sources to hand, and the above is now just a call to 
vm_page_sleep(), but if the timeout expires, the entire operation is 
retried, so it should be harmless (although it is masking a legitimate 
bug).
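
A small userland model of why the timeout is harmless, under the 
assumption that every wakeup is lost and each sleep ends by timing 
out.  tsleep_model, wait_until_unbusy and the clears_after field are 
made up for illustration; only PG_BUSY, EWOULDBLOCK and the "goto 
again" shape come from the discussion above.

```c
/* Userland model only; tsleep_model and wait_until_unbusy are hypothetical. */
#include <assert.h>
#include <errno.h>

#define PG_BUSY 0x01            /* value is arbitrary in this model */

struct page_model {
    int flags;
    int clears_after;           /* PG_BUSY clears during this many-th sleep */
};

/* Stand-in for tsleep(p, PVM, "vmopar", 5 * hz) when no wakeup ever arrives:
 * the sleep always ends by timing out. */
static int tsleep_model(struct page_model *p)
{
    if (p->clears_after > 0 && --p->clears_after == 0)
        p->flags &= ~PG_BUSY;   /* the page became free while we slept */
    return EWOULDBLOCK;
}

/* Same shape as the quoted code: re-test the condition after every sleep.
 * Returns how many timeouts were taken before the page was usable. */
int wait_until_unbusy(struct page_model *p)
{
    int timeouts = 0;

again:
    if (p->flags & PG_BUSY) {
        if (tsleep_model(p) == EWOULDBLOCK)
            timeouts++;
        goto again;             /* a timeout just means "look again" */
    }
    return timeouts;
}
```

Because the return value only leads back to the test, a timeout can 
never let the caller proceed on a page that is still busy.  The flip 
side is that a page which genuinely never becomes unbusy still leaves 
the process looping, just waking every few seconds instead of sleeping 
forever.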

This might be a candidate for a bandaid patch for 2.2 systems, as 2.2 
goes into life-support mode.

BTW, thanks for looking at this at all, and thanks for making your 
findings generally known.  If you can roll a patch and put it out for 
general testing, we'd be very interested in hearing about the feedback 
you get.

-- 
\\  Sometimes you're ahead,       \\  Mike Smith
\\  sometimes you're behind.      \\  mike@smith.net.au
\\  The race is long, and in the  \\  msmith@freebsd.org
\\  end it's only with yourself.  \\  msmith@cdrom.com

