Message-Id: <200611190742.HAA04640@sopwith.solgatos.com>
To: freebsd-questions@freebsd.org
In-reply-to: Your message of "Sat, 18 Nov 2006 20:02:48 CST." <20061119020247.GB15898@dan.emsphone.com>
Date: Sat, 18 Nov 2006 23:42:31 +0000
From: Dieter
Subject: Re: TCP parameters and interpreting tcpdump output

Dan writes:

Dan> A shrinking window and no packet loss is an indication that the program
Dan> the socket is connected to isn't reading data fast enough. If you're
Dan> locally gzipping the output of a remote backup, for example, you'll see
Dan> this.

Just a tight loop reading the socket and writing to stdout, which is
directed into a file on disk.
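
For illustration only, here is a minimal sketch of the kind of loop described
above. It is not the actual program from this thread: the function name
copy_stream, the 64 KiB read size, and the choice of Python are all
assumptions made for the example.

```python
import socket


def copy_stream(sock: socket.socket, write, bufsize: int = 65536) -> int:
    """Read from sock until EOF, hand each chunk to write(), return bytes copied.

    `write` is any callable that accepts a bytes object, for example
    sys.stdout.buffer.write when stdout is redirected to a file.
    """
    total = 0
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:
            # An empty read means the peer closed the connection.
            return total
        # If the destination stalls, this call blocks and nothing reads the socket.
        write(chunk)
        total += len(chunk)
```

With a connected socket this would be called as
copy_stream(sock, sys.stdout.buffer.write). While write() is blocked the loop
is not calling recv(), so the kernel's receive buffer fills and the advertised
TCP window shrinks, which is the symptom Dan describes.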

Dan> The completely duplicated data packets from the sender, even before any
Dan> perceived packet loss, are troubling. Either the sender decided to
Dan> resend that data on its own, or the packet was duplicated by a router
Dan> or switch in transit. Dumps of the same stream from both sender and
Dan> receiver would help, as would enabling rfc 1323 extensions on both
Dan> systems (which will put a timestamp value on each packet and enable
Dan> SACK. It's enabled by default on FreeBSD).

No router or switch, just a piece of wire.

net.inet.tcp.rfc1323: 1

Bill writes:

Bill> My guess would be that your process blocked on stdout.

Bill> You don't mention what you're doing with stdout from the program, are
Bill> you just letting it scroll on the terminal, or redirecting it to a file?

Just redirected to a file. FFS, soft updates, 7200 rpm SATA drive with
the disk's write cache turned off. Input data rate is less than
20 M bits/sec. I can write to the disk at approx 6 M Bytes/sec sustained.
(or 10x that with disk write cache turned on, but I don't like trashed
filesystems after the machine goes down hard) The machine and the disk
are plenty fast enough, AMD64, 2 GB main memory. CPU is 90-something
percent idle.

Sometimes it works fine for extended periods, 30-40 minutes. Other times
the src box reports thousands of network errors. So far I haven't figured
out what the difference is between the working tests and the failing
tests. The crontab directory is empty, so it shouldn't be cron jobs.

> As an experiment, try running the process and redirecting
> stdout to /dev/null -- if it doesn't exhibit the problem, then you
> need to look at where you're actually storing the data and speed that
> part up.

I've thought of trying /dev/null but haven't yet. It might provide a clue.

I would expect that the filesystem should be buffering the write from
short term disk latency. Surely FreeBSD 6.0 provides the classic Unix
write-behind?
The disk activity LED flashes constantly, so it doesn't appear to be
saving up disk writes and then doing a bunch at once.

> Is the data coming in at a fairly constant rate?

Yes.

> you've got plenty of RAM

The machine has 2 GB. I wonder if the process is getting its fair share?
I have been observing other problems where disk activity to one disk will
make an unrelated process reading data from a different disk *very*
unresponsive.
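
To put numbers on the rates quoted in this message, the arithmetic can be
sketched as below. The 20 M bits/sec and 6 M Bytes/sec figures come from the
message itself. The 64 KiB socket buffer is purely an assumed value for
illustration (the real receive buffer size is not stated here), and "M" is
taken to mean 10**6.

```python
MEGA = 1_000_000  # assumption: "M" in the message means 10**6


def bits_to_bytes_per_sec(bits_per_sec: float) -> float:
    """Convert a rate in bits/sec to bytes/sec."""
    return bits_per_sec / 8


def seconds_to_fill(buffer_bytes: float, rate_bytes_per_sec: float) -> float:
    """Time for a steady input stream to fill a buffer that nobody is draining."""
    return buffer_bytes / rate_bytes_per_sec


# Upper bound on the input rate from the message: 20 M bits/sec.
input_rate = bits_to_bytes_per_sec(20 * MEGA)

# Sustained disk write rate from the message: approx 6 M Bytes/sec.
disk_rate = 6 * MEGA

# How many times faster the disk is than the incoming data, on average.
headroom = disk_rate / input_rate

# Hypothetical 64 KiB receive buffer: how long a reader stall it can absorb.
stall_budget = seconds_to_fill(64 * 1024, input_rate)

print(f"input rate:   {input_rate / MEGA:.1f} MB/s")
print(f"headroom:     {headroom:.1f}x")
print(f"stall budget: {stall_budget * 1000:.0f} ms")
```

On average the disk has roughly 2.4 times the required throughput, so
sustained bandwidth is not the limit. Under the assumed buffer size, though, a
pause of only about 26 ms in the reading process would be enough to fill the
buffer and close the window, so the thing to look for is latency spikes rather
than average rate.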