From: Mikhail Teterin
Organization: Virtual Estates, Inc.
To: Matthew Dillon
Cc: alc@freebsd.org, stable@freebsd.org
Date: Tue, 21 Mar 2006 20:45:39 -0500
Subject: Re: more weird bugs with mmap-ing via NFS

On Tuesday 21 March 2006 20:09, Matthew Dillon wrote:
> If the network bandwidth is still going full bore then the program is
> doing something. NFS retries would not account for it. A simple
> test for that would be to ^Z the program once it gets into this state
> and see if the network bandwidth goes to zero.

Pressing ^Z moves the process' state from ``nfs'' to ``STOP'' according to
top(1), but the shell does not give the prompt back for many minutes. Only
when it does, does the bandwidth go down to negligible amounts.

> So if we assume that packets aren't being lost, then the question
> becomes: what is the program doing that is causing the network
> bandwidth to go nuts?

You have the program's source... I run it simply as:

	mzip -g -v -b 16k -w /meow/tmp/db.dmp /backup/tmp/db.dmp.gz.part

/meow is local, /backup is mounted this way:

	mount_nfs -r 5120 -w 5120 -ointr pandora:/backup /backup

> ktrace on the program would tell us if read() or write() or ftruncate()
> were causing an issue.

According to `kdump -l', which I launched in parallel to the ktrace-ed mzip,
the last syscall is madvise. But that returns long before the bandwidth
shoots up...

> 'vmstat 1' while the program is running would tell us if VM faults
> are creating an issue.

Just like `systat -vm', `vmstat 1' hangs -- and stalls everything else for
many minutes. Maybe this is a hint of too much faulting?

> 50 lines of output from something like this after the program has gotten
> into its weird state might give us a clue:
>	tcpdump -s 4096 -n -i -l port 2049

Now I am thoroughly confused, the lines are very repetitive:

	tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
	listening on em0, link-type EN10MB (Ethernet), capture size 4096 bytes
	20:41:55.788436 IP 172.21.128.43.2049 > 172.21.130.86.1445243414: reply ok 60
	20:41:55.788502 IP 172.21.130.86.1445243415 > 172.21.128.43.2049: 1472 write fh 1090,6005/15141914 5120 (5120) bytes @ 4943872
	20:41:55.788811 IP 172.21.128.43.2049 > 172.21.130.86.1445243415: reply ok 60 write ERROR: Permission denied
	20:41:55.788872 IP 172.21.130.86.1445243416 > 172.21.128.43.2049: 1472 write fh 1090,6005/15141914 5120 (5120) bytes @ 4947968
	[...]

The only reason for "permission denied" I know of is the firewall, but
neither the server nor the client even has ipfw loaded...

Yours,

	-mi
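As a side check on the trace above, the write offsets can be compared mechanically. What follows is only a minimal sketch in Python; it is not part of mzip or of any NFS tooling. The regular expression is an assumption about how tcpdump prints NFS write requests, inferred solely from the two write lines shown, and the names parse_writes and overlaps are made up for illustration.

```python
import re

# The two write requests copied verbatim from the tcpdump output above.
SAMPLE = """\
20:41:55.788502 IP 172.21.130.86.1445243415 > 172.21.128.43.2049: 1472 write fh 1090,6005/15141914 5120 (5120) bytes @ 4943872
20:41:55.788872 IP 172.21.130.86.1445243416 > 172.21.128.43.2049: 1472 write fh 1090,6005/15141914 5120 (5120) bytes @ 4947968
"""

# Assumed line shape: "write fh <handle> <len> (<len>) bytes @ <offset>".
# Reply lines ("write ERROR: ...") do not contain "write fh" and are skipped.
WRITE_RE = re.compile(r"write fh \S+ (\d+) \(\d+\) bytes @ (\d+)")


def parse_writes(text):
    """Return (offset, length) pairs for every write request line in text."""
    writes = []
    for line in text.splitlines():
        match = WRITE_RE.search(line)
        if match:
            length = int(match.group(1))
            offset = int(match.group(2))
            writes.append((offset, length))
    return writes


def overlaps(writes):
    """For each pair of consecutive writes, return (stride, overlap) in bytes.

    stride  = distance between the two starting offsets
    overlap = bytes of the first write that the second write covers again
    """
    result = []
    for (off_a, len_a), (off_b, _len_b) in zip(writes, writes[1:]):
        stride = off_b - off_a
        overlap = max(0, off_a + len_a - off_b)
        result.append((stride, overlap))
    return result


if __name__ == "__main__":
    for stride, overlap in overlaps(parse_writes(SAMPLE)):
        print(f"stride {stride} bytes, overlap {overlap} bytes")
```

On those two lines it reports a stride of 4096 bytes and an overlap of 1024 bytes, i.e. each 5120-byte write starts 4096 bytes after the previous one. Whether that pattern has anything to do with the "Permission denied" replies is not something the trace alone establishes.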