From owner-freebsd-ports Thu Jul 18 14:50:30 1996
Return-Path: owner-ports
Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id OAA21130 for ports-outgoing; Thu, 18 Jul 1996 14:50:30 -0700 (PDT)
Received: from orion.webspan.net (root@orion.webspan.net [206.154.70.41]) by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id OAA21119 for ; Thu, 18 Jul 1996 14:50:26 -0700 (PDT)
Received: from localhost (gpalmer@localhost [127.0.0.1]) by orion.webspan.net (8.7.5/8.6.12) with SMTP id RAA13840; Thu, 18 Jul 1996 17:43:15 -0400 (EDT)
X-Authentication-Warning: orion.webspan.net: Host gpalmer@localhost [127.0.0.1] didn't use HELO protocol
To: roberto@keltia.freenix.fr (Ollivier Robert)
cc: charnier@xp11.frmug.org (Philippe Charnier), ports@freebsd.org
From: "Gary Palmer"
Subject: Re: new patch for inn-1.4u4
In-reply-to: Your message of "Thu, 18 Jul 1996 23:22:02 +0200." <199607182122.XAA17886@keltia.freenix.fr>
Date: Thu, 18 Jul 1996 17:43:14 -0400
Message-ID: <13835.837726194@orion.webspan.net>
Sender: owner-ports@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Ollivier Robert wrote in message ID
<199607182122.XAA17886@keltia.freenix.fr>:
> According to Philippe Charnier:
> > with the old port (inn1.4sec), I had problems with my active file which is
> > 4096 bytes long and filled up with ascii 0. I wonder if this is due to a
> > problem using MMAP instead of READ in files/config.data (see ACT_STYLE). I
>
> Typical of this setting. Try READ. The VM is so good that it doesn't make a
> significant difference... John has tried several times to nail this one but
> it still raises its ugly head from time to time.

David found a problem in vfs_bio.c which MAY affect this. It's to do with
how the pages of a process are run down. No-one is 100% sure if it fixes the
problem tho. It'd be interesting to hear if a 2.1.5-RELEASE kernel STILL
has INN MMAP problems. The log message is enclosed.
If you read it, it may affect the INN case as inn forks a lot of readers,
sharing the active file with MMAP. As the message itself says, there may
STILL be other causes of this problem :(

Gary
--
Gary Palmer                                          FreeBSD Core Team Member
FreeBSD: Turning PC's into workstations. See http://www.FreeBSD.ORG/ for info

davidg      96/06/29 22:17:09

Modified:    sys/kern  vfs_bio.c
Log:
Fixed a major bug that caused various pmap related panics, hangs, and
reboots.

The i386 pmap module uses a special area of kernel virtual memory for
mapping of page table pages when it needs to modify another process's
virtual address space. It's called the 'alternate page table map'. There
is only one of them and it's expected that only one process will be using
it at once and that the operation is atomic.

When the merged VM/buffer cache was implemented over a year ago, it became
necessary to run down VM pages at I/O completion. The unfortunate and
unforeseen side effect of this is that pmap functions are now called at bio
interrupt time. If there happened to be a process using the alternate page
table map when this I/O completion occurred, it was possible for a different
process's address space to be switched into the alternate page table map -
leaving the current pmap process with the wrong address space mapped when
the interrupt completed. This resulted in BAD things happening like pages
being mapped or removed from the wrong address space, etc.

Since a very common case of a process modifying another process's address
space is during fork when the kernel stack is inserted, one of the most
common manifestations of this bug was the kernel stack not being mapped
properly, resulting in a silent hang or reboot. This made it VERY difficult
to troubleshoot this bug (I've been trying to figure out the cause of this
for >6 months).

Fortunately, the set of conditions that must be true before this problem
occurs is sufficiently rare that most people never saw the bug occur.
As I/O rates increase, however, so does the frequency of the crashes. This
problem used to kill wcarchive about every 10 days, but in more recent times
when the traffic exceeded 100GB/day, the machine could barely manage 6 hours
of uptime.

The fix is to make certain that no process has the pages involved in the
I/O mapped before the I/O is started. The pages are made busy, so no process
will be able to map them, either, until the I/O has finished. This
side-steps the issue by still allowing the pmap functions to be called at
interrupt time, but also assuring that the alternate page table map won't be
switched.

Unfortunately, this appears to not be the only cause of this problem. :-(

Reviewed by:    dyson

Revision  Changes    Path
1.94      +2 -2      src/sys/kern/vfs_bio.c

davidg      96/06/29 22:23:43

Branch:      sys/kern  RELENG_2_1_0
Modified:    sys/kern  vfs_bio.c
Log:
Brought in fix from rev 1.94: make sure pages involved in I/O are not
mapped into any processes. Also set b_pages[]=0 when the pages are
removed from the buffer - this is to satisfy my paranoia by making sure
that nothing bogusly messes with the pages after the buffer has been
reclaimed. A similar change was made in the main branch some time ago.

Revision  Changes    Path
1.46.4.9  +3 -2      src/sys/kern/vfs_bio.c
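
The race described in the rev 1.94 log message can be illustrated with a toy
model. This is only a sketch in Python, not kernel code: the class and
function names (AltPageTableMap, fork_insert_kernel_stack,
buggy_io_completion, start_io_with_fix) and the one-slot model are all
hypothetical, invented for illustration, and the real pmap and vfs_bio.c code
is far more involved. It assumes the "interrupt" lands exactly between
pointing the alternate map at the child and writing through it.

```python
class AltPageTableMap:
    """Hypothetical stand-in for the single, shared alternate page table map.

    It can point at only one address space at a time.
    """

    def __init__(self):
        self.mapped_space = None

    def switch_to(self, space):
        self.mapped_space = space


def fork_insert_kernel_stack(alt_map, child, io_completion=None):
    """Model a parent inserting a page into `child` through the alternate map.

    `io_completion`, if given, is called in the middle of the operation to
    stand in for a bio interrupt arriving at the worst possible moment.
    Returns the address space that actually receives the page.
    """
    alt_map.switch_to(child)        # step 1: point the window at the child
    if io_completion is not None:
        io_completion(alt_map)      # "interrupt" fires between the two steps
    return alt_map.mapped_space     # step 2: write through the window


def buggy_io_completion(pages_mapped_in):
    """Pre-fix behaviour: run down pages at interrupt time, switching the
    alternate map to each address space that still maps them."""

    def handler(alt_map):
        for space in pages_mapped_in:
            alt_map.switch_to(space)  # clobbers the interrupted operation
    return handler


def start_io_with_fix(pages_mapped_in):
    """Post-fix behaviour as the log describes it: before the I/O starts,
    make sure no process has the pages mapped. The same completion handler
    still runs at interrupt time, but it has nothing left to switch to.

    Note: this empties the caller's list in place.
    """
    pages_mapped_in.clear()
    return buggy_io_completion(pages_mapped_in)
```

In this model, calling fork_insert_kernel_stack with a handler built by
buggy_io_completion(["reader"]) returns "reader" rather than the child, which
corresponds to the kernel stack landing in the wrong address space. With a
handler built by start_io_with_fix(["reader"]) the page goes to the child, as
it does when no interrupt arrives at all.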