From owner-freebsd-hackers Tue Dec 14 17:59:21 1999
Delivered-To: freebsd-hackers@freebsd.org
Received: from screech.weirdnoise.com (209-128-78-198.bayarea.net [209.128.78.198])
	by hub.freebsd.org (Postfix) with ESMTP id B8B6115395
	for ; Tue, 14 Dec 1999 17:59:16 -0800 (PST)
	(envelope-from edhall@screech.weirdnoise.com)
Received: from screech.weirdnoise.com (localhost [127.0.0.1])
	by screech.weirdnoise.com (8.9.3/8.8.7) with ESMTP id RAA15697;
	Tue, 14 Dec 1999 17:59:39 -0800
Message-Id: <199912150159.RAA15697@screech.weirdnoise.com>
X-Mailer: exmh version 2.0.2
To: freebsd-hackers@FreeBSD.ORG
Cc: edhall@screech.weirdnoise.com
Subject: VM Scan Rate: Speed Kills on 3.3
Mime-Version: 1.0
Content-Type: multipart/mixed ;
	boundary="==_Exmh_1410642600"
Date: Tue, 14 Dec 1999 17:59:39 -0800
From: Ed Hall
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

This is a multipart MIME message.

--==_Exmh_1410642600
Content-Type: text/plain; charset=us-ascii

Under certain circumstances the VM scan rate can spike into the
millions/sec (as reported by vmstat) followed quickly by a system
lockup (an endless loop in vm_pageout()), suggesting that the page
queue has been tied in a loop. This effect was observed in a program
ported from Solaris that updated a large file by mmap()'ing small
parts of it. Although using read()/write() eliminates the problem
(and with a sizable increase in performance as well), there may be
other triggers for this bug.

(A side comment: although using mmap() for file updates in FreeBSD
applications seems to perform quite poorly when compared to
read()/write(), this is not the case on some other systems, such as
Solaris. Also, there may be cases where the shared memory semantics
of mmap() are important to an application such that conversion to
read()/write() is not possible.)

I've attached a small test program that provokes the same behavior as
my (old) application did.
It accepts two arguments: a file size in MB, and an update size in KB.
Make the first twice the size of physical memory and the second a
small multiple of the page size (see example run in second
attachment). In my case there was .5GB of physical memory on an
N440BX (500MHz P-III) server w/IDE disk.

		-Ed Hall
		Technical Yahoo, Yahoo! Inc.

--==_Exmh_1410642600
Content-Type: text/plain ; name="vmtest1.txt"; charset=us-ascii
Content-Description: vmtest1.txt
Content-Disposition: attachment; filename="vmtest1.txt"

A sample stack trace:

siointr1(c617e000,e8e33f44,c01c1057,0,80000010) at siointr1+0xb9
siointr(0) at siointr+0x12
Xfastintr4(3425000,80000000,c0131c28,8000,c0140f94) at Xfastintr4+0x17
vm_pageout_scan(80000000,c0208404,e8e33fb0,c0131c5a,e8e2bdff) at vm_pageout_scan+0x173
vm_pageout(e8e2bdff,c01ecf38,c0208404,c0278ff4,c01c0e2a) at vm_pageout+0x226
kproc_start(c0208404) at kproc_start+0x32
fork_trampoline(0,0,c040c000,0,c040c008) at fork_trampoline+0x3a

Here is the standard output leading up to the tragedy; the perfect
interleave of program output with "vmstat 1" output is pure luck:

% vmstat 1 &
. . .
% ./a.out 2048 8
00000000
 1 0 0 2100992416600 17886 0 0 0 19 0 4 241 17929 30 47 47 7
00010000
 1 0 0 2100992343144 19333 0 0 0 0 0 0 236 19357 24 57 43 0
00020000
 1 0 0 2100992272096 19419 0 0 0 0 0 4 242 19443 25 59 41 0
00030000
 1 0 0 2100992203096 19661 0 0 0 0 0 0 236 19685 23 55 45 0
00040000
 1 0 0 2100992135896 19661 0 0 0 0 0 0 241 19685 25 65 35 0
00050000
 1 0 0 2100992 70704 19871 0 0 0 0 0 0 237 19895 24 64 36 0
00060000
 1 0 0 3840 18296 18339 0 0 0 0 121030 0 237 18364 25 52 48 0
00070000
 1 2 1 2101332 1492 5963 0 3 89 36 1001341 84 341 5987 129 15 75 10

and then dialtone.

--==_Exmh_1410642600
Content-Type: text/plain; name="vmtest1.c"; charset=us-ascii
Content-Description: vmtest1.c
Content-Disposition: attachment; filename="vmtest1.c"

#include <sys/types.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    off_t filesize;
    size_t maplen;
    off_t randoffset;
    void *maploc;
    void *memorybuf;
    int fd;
    int i;
    int r;

    if (argc != 3)
        fprintf(stderr, "usage: %s filesize-MB mapsize-KB\n", *argv), exit(1);

    filesize = (off_t)1048576 * (off_t)atoi(argv[1]);
    maplen = (size_t)1024 * (size_t)atoi(argv[2]);

    /* Need at least two map-sized blocks, or the modulus below is zero. */
    if (maplen == 0 || filesize / (off_t)maplen < 2)
        fprintf(stderr, "file must hold at least two blocks\n"), exit(1);

    if ((memorybuf = calloc(maplen, 1)) == NULL)
        perror("calloc() failed"), exit(1);

    if ((fd = open("test.gomi", O_RDWR|O_CREAT, 0666)) < 0)
        perror("can't create/open test.gomi"), exit(1);

    /* Extend the file to its full size without writing any data. */
    if (ftruncate(fd, filesize) < 0)
        perror("ftruncate failed"), exit(1);

    for (i = 0;; i++) {
        /* Pick a random block; maplen must be a multiple of the page
           size so that the mmap() offset is page-aligned. */
        r = random() % (int)((filesize / (off_t)maplen) - 1);
        randoffset = (off_t)r * (off_t)maplen;
        if (randoffset + (off_t)maplen > filesize)
            continue;

        if ((maploc = mmap(NULL, maplen, PROT_READ|PROT_WRITE, MAP_SHARED,
                           fd, randoffset)) == MAP_FAILED)
            perror("mmap failed"), exit(1);

        /* Copy the block out, bump its first word, and copy it back. */
        memcpy(memorybuf, maploc, maplen);
        ((int *)memorybuf)[0]++;
        memcpy(maploc, memorybuf, maplen);

        if (munmap(maploc, maplen) < 0)
            perror("munmap failed"), exit(1);

        /* Progress marker every 10000 updates. */
        if (i%10000 == 0)
            fprintf(stderr, "%08d\n", i);
    }
}

--==_Exmh_1410642600--

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message
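
The message says that switching the update from mmap() to read()/write() avoids the lockup, but it does not show that version. The following is only a rough sketch of what such an update step might look like, not the author's actual code. The helper name update_block() is hypothetical, and it assumes pread()/pwrite() are available on the target system; where they are not, lseek() followed by read()/write() would do the same job.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of vmtest1.c): read len bytes at
 * offset into buf, increment the leading int, and write the block
 * back, using pread()/pwrite() in place of mmap()/munmap().
 * Returns 0 on success, -1 on an error or a short transfer.
 */
static int
update_block(int fd, off_t offset, size_t len, void *buf)
{
    /* The block must be large enough to hold the counter word. */
    assert(len >= sizeof(int));

    if (pread(fd, buf, len, offset) != (ssize_t)len)
        return -1;

    ((int *)buf)[0]++;

    if (pwrite(fd, buf, len, offset) != (ssize_t)len)
        return -1;

    return 0;
}
```

In the test program's loop, a call such as update_block(fd, randoffset, maplen, memorybuf) would stand in for the mmap()/memcpy()/munmap() sequence. Unlike mmap(), this path does not need the offset to be page-aligned.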