From owner-freebsd-hackers Tue Aug 12 20:13:34 1997
Return-Path:
Received: (from root@localhost)
          by hub.freebsd.org (8.8.5/8.8.5) id UAA14707
          for hackers-outgoing; Tue, 12 Aug 1997 20:13:34 -0700 (PDT)
Received: from alpo.whistle.com (alpo.whistle.com [207.76.204.38])
          by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id UAA14649;
          Tue, 12 Aug 1997 20:13:10 -0700 (PDT)
Received: (from daemon@localhost)
          by alpo.whistle.com (8.8.5/8.8.5) id UAA02073;
          Tue, 12 Aug 1997 20:08:10 -0700 (PDT)
Received: from current1.whistle.com(207.76.205.22) via SMTP by alpo.whistle.com,
          id smtpd002070; Wed Aug 13 03:08:01 1997
Message-ID: <33F12483.2781E494@whistle.com>
Date: Tue, 12 Aug 1997 20:05:39 -0700
From: Julian Elischer
Organization: Whistle Communications
X-Mailer: Mozilla 3.0Gold (X11; I; FreeBSD 2.2-CURRENT i386)
MIME-Version: 1.0
To: Michael Smith
CC: julian@FreeBSD.ORG, hackers@FreeBSD.ORG
Subject: Re: 2.2.2+ crash.. more info
References: <199708130234.MAA11390@genesis.atrad.adelaide.edu.au>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-hackers@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

Michael Smith wrote:
>
> Julian Elischer stands accused of saying:
> > Michael Smith wrote:
> > >
> > > Julian Elischer stands accused of saying:
> > > >
> > > > We have several hundred Bsd machines here.. we see this one
> > > > enough for
> > > > me to recognise it..
> > the fact that the process got put on a sleep queue while it was
> > on the runnable queue. suggests that maybe an interrupt driver
> > ran 'tsleep' while curproc had the value of this process in it..

stupidly I have a big clue right in front of me that I didn't mention..
the process in question has a wait channel of 'swread'
that's in the VM code..

in fact, in:

static int
swap_pager_getpages(object, m, count, reqpage)
...
        /*
         * wait for the sync I/O to complete
         */
        s = splbio();
        while ((bp->b_flags & B_DONE) == 0) {
                if (tsleep(bp, PVM, "swread", hz*20)) {
                        printf("swap_pager: indefinite wait buffer: device: %d, blkno: %d, size: %d\n",
                            bp->b_dev, bp->b_blkno, bp->b_bcount);
                }
        }

what I wonder about is:
does this ever get run in the context of another process?
who runs this? etc.

I think (after discussion with john) that the quick test will be to
add code to tsleep to check if the process being slept is still on the
run queue..
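
The check proposed above could look roughly like the following. This is only
a userland model of the idea, not kernel code: struct proc, struct runq,
runq_insert(), runq_remove(), on_runq() and tsleep_check() are all hypothetical
stand-ins, and the model assumes a process is unlinked from the run queue
before it runs, so that null queue links mean "not on the run queue".

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified process states for this model; not the kernel's definitions. */
enum proc_state { ST_RUN, ST_SLEEP };

/* Hypothetical, stripped-down process record. */
struct proc {
    int pid;
    enum proc_state state;
    struct proc *rq_next;   /* run-queue links; both NULL when off the queue */
    struct proc *rq_prev;
    const void *wchan;      /* wait channel recorded when the process sleeps */
    const char *wmesg;      /* wait message, e.g. "swread" */
};

/* Circular doubly linked run queue with a sentinel head (an assumption). */
struct runq {
    struct proc head;
};

void runq_init(struct runq *q)
{
    q->head.rq_next = &q->head;
    q->head.rq_prev = &q->head;
}

/* Append p to the tail of the run queue and mark it runnable. */
void runq_insert(struct runq *q, struct proc *p)
{
    assert(p->rq_next == NULL && p->rq_prev == NULL);
    p->rq_prev = q->head.rq_prev;
    p->rq_next = &q->head;
    q->head.rq_prev->rq_next = p;
    q->head.rq_prev = p;
    p->state = ST_RUN;
}

/* Unlink p and clear its links so on_runq() reports it as off the queue. */
void runq_remove(struct proc *p)
{
    p->rq_prev->rq_next = p->rq_next;
    p->rq_next->rq_prev = p->rq_prev;
    p->rq_next = NULL;
    p->rq_prev = NULL;
}

int on_runq(const struct proc *p)
{
    return p->rq_next != NULL;
}

/*
 * Model of the sanity check at the top of a sleep routine.
 * Returns 0 and marks the process asleep if it is off the run queue;
 * returns -1 and leaves it untouched if it is still queued. A kernel
 * version would more likely panic or log at this point than return.
 */
int tsleep_check(struct proc *p, const void *wchan, const char *wmesg)
{
    if (on_runq(p)) {
        fprintf(stderr, "tsleep: pid %d still on run queue (wmesg \"%s\")\n",
            p->pid, wmesg);
        return -1;
    }
    p->wchan = wchan;
    p->wmesg = wmesg;
    p->state = ST_SLEEP;
    return 0;
}
```

In this model a process that has been passed to runq_insert() and not yet
removed makes tsleep_check() fail, which is the condition the quick test is
meant to catch; printing the pid and wait message at that point would show
which sleep call was involved.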