Date: 27 May 2006 02:25:26 +0200
From: "Arno J. Klaassen" <arno@heho.snv.jussieu.fr>
To: freebsd-current@freebsd.org
Subject: indefinite wait buffer
Message-ID: <wphd3c48ll.fsf@heho.labo>

Hello,

we use FreeBSD, amongst others, for scientific calculations, and ran into the 'indefinite wait buffer' problem on ordinary swap/dump devices. The swap overhead is justified because it lets us treat larger data sets while the processes remain, roughly speaking, CPU-bound rather than I/O(swap)-bound.

On recent RELENG_6, however, this fails: reliably on cheap ATA devices, and fairly easily on SCSI devices as well, although those seem to survive 'a couple of' 'indefinite wait buffer' warnings.

I tested today on an amd64 notebook with 1G physmem and 4G swap on an off-the-shelf ATA disk. I wrote the following code :

    #include <stdio.h>
    #include <stdlib.h>

    /* assumed definition: the argument is a size in megabytes */
    #define M_SIZE (1024UL * 1024UL)

    int
    main (int argc, char **argv)
    {
            unsigned long maxpage;
            int iter = 0;
            int * base;

            if (argc < 2) {
                    fprintf (stderr, "usage: %s megabytes\n", argv[0]);
                    exit (1);
            }

            /* FreeBSD malloc(3) options: A = abort on problems, J = junk-fill */
            _malloc_options = "AJ";
            maxpage = strtoul(argv[1], (char **)NULL, 10) * M_SIZE;
            fprintf (stderr, "Allocing %lu Bytes\n", maxpage);
            base = (int *)(malloc (maxpage));
            if (base == NULL) {
                    fprintf (stderr, "Jammer\n");   /* Dutch: "too bad" */
                    exit (1);
            }

            /* touch every word of the allocation, forever */
            while (0 == 0) {
                    int * ptr = base;
                    unsigned long i;

                    for (i = 0; i < maxpage / sizeof(int); i++) {
                            *(ptr++) += 1;
                    }
                    fprintf (stderr, "Loop <%d> done\n", iter);
                    iter++;
            }
            exit (0);
    }

Calling this (on RELENG_6) with an argument between 1024 and 1500 deadlocks the notebook within a few minutes; after reboot there is an 'indefinite wait buffer' message in /var/log/messages.

After some fiddling I came to the attached amateurish patch : swap_pager.c has a (heuristic, I suppose) timeout of 20 seconds for a msleep call; I replaced this with a timeout based on a presupposed, pessimistic minimal throughput for the swap device multiplied by the minimum of swap size and physmem, with a floor of 60 seconds.

With this patch I can run the above code without deadlock, even with 4096 (Meg) as argument.

Two remarks :

 1 I abuse linux.ko/linprocfs to get my code initialised correctly.

 2 This is no solution to the real 'indefinite wait buffer' problem, since at shutdown it still panics with 'swap_pager_force_pagein: read from swap failed', but at least it keeps the system functional while working.

I hope someone can comment on this idea.

Best regards,

Arno

Content-Type: text/x-patch
Content-Disposition: attachment; filename=swap.patch

Index: sys/vm/swap_pager.c
===================================================================
RCS file: /home/ncvs/src/sys/vm/swap_pager.c,v
retrieving revision 1.273.2.2
diff -r1.273.2.2 swap_pager.c
285a286,288
> static int MB_per_sec = 1; /* be pessimist, might be sysctl'ed */
> static int timo_secs = 60; /* for msleep() in swap_pager_getpages() */
>
401a405
>
604a609
>
1104c1109
< if (msleep(mreq, &vm_page_queue_mtx, PSWP, "swread", hz*20)) {
---
> if (msleep(mreq, &vm_page_queue_mtx, PSWP, "swread", hz*timo_secs)) {
1106,1107c1111,1114
< "swap_pager: indefinite wait buffer: bufobj: %p, blkno: %jd, size: %ld\n",
< bp->b_bufobj, (intmax_t)bp->b_blkno, bp->b_bcount);
---
> "swap_pager: wait buffer timeout (%d secs): bufobj: %p, blkno: %jd, size: %ld\n",
> timo_secs, bp->b_bufobj, (intmax_t)bp->b_blkno, bp->b_bcount);
> /* wait & pray ... respect mutex */
> msleep(mreq, &vm_page_queue_mtx, PSWP, "swread", 0);
2240a2248
> int timo_secs_swap, timo_secs_physmem;
2248a2257,2265
>
> timo_secs_swap = MB_per_sec * (*total * PAGE_SIZE) / (1024*1024);
> timo_secs_physmem = MB_per_sec * (physmem * PAGE_SIZE) / (1024*1024);
> timo_secs = min(timo_secs_swap, timo_secs_physmem);
>
> if (timo_secs < 60) timo_secs=60;
> printf("ARNO timo_secs = <%d>.\n", timo_secs);
> printf("ARNO timo_secs_swap = <%d> timo_secs_physmem <%d>.\n",
> timo_secs_swap, timo_secs_physmem);