From: steve@Watt.COM (Steve Watt)
Date: Sun, 20 Mar 2005 22:55:23 -0800
To: matthew@digitalstratum.com
cc: hackers@freebsd.org
Subject: Re: Causing a process switch to test a theory.
Message-Id: <200503210655.j2L6tNoH063116@wattres.watt.com>
In-Reply-To: <423DE326.9000203@digitalstratum.com>
Organization: Watt Consultants
List-Id: Technical Discussions relating to FreeBSD

In <423DE326.9000203@digitalstratum.com>, matthew@digitalstratum.com wrote:
>I think I have found a possible bug in Apache's logging when using their
>"reliable pipe" feature, but I'd like to test it prior to submitting a
>bug report (or possibly a patch.)
>Of course I posted a message on the
>Apache development forum before posting here, but I have had no response
>from that group.

And based on your description, I think I agree.

>My understanding of PIPE_BUF is that it is the largest amount of data
>the kernel will guarantee to be atomic when writing to a pipe.  Thus if
>more than one process is writing to the same pipe, and more than
>PIPE_BUF bytes need to be written, there is the chance of the data
>being interleaved due to a context switch during write(), or between
>multiple calls to write() in order to write all required data.

Yes, that is exactly the POSIX semantic for PIPE_BUF.  There are a lot
of tricky details in there that are not fully obvious, and it's now
been long enough (6 years) that I've forgotten the exact details.
There are some weird interactions between O_NONBLOCK and PIPE_BUF,
but it looks like some of them have been ironed out in recent versions
of the standard.

>I've been reading the Apache source code to try and determine if
>PIPE_BUF is taken into consideration while logging entries to a pipe.
>What appears to happen is that if a single log entry is more than 512
>bytes, it is simply written to the pipe without regard to PIPE_BUF.

Then there is definitely a risk of interleaving.  This is basically
a race condition -- if you're lucky the log reader can cope, but that
depends greatly on the guts of the logger.

>In this situation (each child logging its own entries) it seems there
>is the possibility that a child could be preempted during its call to
>write() when trying to write more than PIPE_BUF bytes of log data.  What
>I'd like to do is create a test where I would be making requests to
>Apache that would cause log entries longer than PIPE_BUF in length, then
>be able to show the interleaving of log entries due to the PIPE_BUF
>limit being exceeded.
I would guess that the easiest way to run into this is to cause lots
of processes to write larger blocks to the same pipe (are they all
really writing to the exact same pipe?  If not, no problem!) at the
same time.  An SMP box might be able to tickle this one better.

>Under the conditions such that cls->log_fd is a pipe (inherited from the
>parent), len > PIPE_BUF, and there are multiple child processes all
>logging entries with this code.

Assuming they're all writing to the same log_fd, then you might have
a problem.

>Knowing if Apache could possibly write interleaved logs when writing to
>a pipe is critical to a program I'm developing which receives log
>entries from Apache via a pipe.

That's another layer of indirection, though.  If all of the children
have separate pipes to the parent, and then the parent logs to your
program, all should be fine.

But at the kernel level, yes, writes longer than PIPE_BUF might get
interleaved.  The longer the write, the higher the probability, so
for your test, if you can generate writes of, say, 10K bytes over and
over, you can probably trip it.

-- 
Steve Watt KD6GGD  PP-ASEL-IA          ICBM: 121W 56' 57.8" / 37N 20' 14.9"
 Internet: steve @ Watt.COM                         Whois: SW32
   Free time?  There's no such thing.  It just comes in varying prices...
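
The test suggested above (many processes writing large blocks to one
shared pipe) can be tried without Apache at all.  What follows is only
a rough sketch under stated assumptions, not Apache's logging code: the
function name run_pipe_test and its parameters are made up for
illustration, Python is used purely for brevity, and each writer fills
its records with a single letter so a damaged record is easy to spot.
Whether the large-record case actually shows interleaving depends on
the kernel, the load, and luck.

```python
import os
import select


def run_pipe_test(writers, records_per_writer, record_len):
    """Fork `writers` children that share one pipe; return (total, damaged).

    Hypothetical helper for illustration. Assumes writers <= 26 and
    record_len >= 2 (one or more letters plus a trailing newline).
    """
    read_fd, write_fd = os.pipe()
    pids = []
    for i in range(writers):
        pid = os.fork()
        if pid == 0:
            # Child: every record is one repeated letter plus a newline.
            try:
                os.close(read_fd)
                record = bytes([ord("A") + i]) * (record_len - 1) + b"\n"
                for _ in range(records_per_writer):
                    data = record
                    while data:  # loop in case of a short write
                        written = os.write(write_fd, data)
                        data = data[written:]
            finally:
                os._exit(0)
        pids.append(pid)

    # Parent keeps only the read end, so EOF arrives once all children exit.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 65536)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    for pid in pids:
        os.waitpid(pid, 0)

    # Split on newlines; the final element is the empty tail after the last one.
    lines = b"".join(chunks).split(b"\n")[:-1]
    damaged = sum(
        1 for line in lines
        if len(line) != record_len - 1 or len(set(line)) != 1
    )
    return len(lines), damaged


if __name__ == "__main__":
    # Compare records of exactly PIPE_BUF bytes with 10K-byte records.
    for size in (select.PIPE_BUF, 10 * 1024):
        total, damaged = run_pipe_test(8, 500, size)
        print(f"record_len={size}: {damaged} of {total} records damaged")
```

Records of PIPE_BUF bytes or fewer should never come out damaged, since
that is the atomicity guarantee discussed above.  For the 10K-byte
records a nonzero count demonstrates interleaving, while a zero count
proves nothing; more writers, more records, or an SMP box may be needed
to trip it.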