From: Dieter
To: freebsd-questions@freebsd.org
Subject: Re: processes not getting fair share of available disk I/O (was: Re: TCP parameters and interpreting tcpdump output )
Date: Wed, 22 Nov 2006 11:02:54 +0000
Message-Id: <200611221902.TAA14770@sopwith.solgatos.com>
In-reply-to: Your message of "Wed, 22 Nov 2006 11:52:38 EST." <20061122165238.GA37819@xor.obsecurity.org>

In message <20061122165238.GA37819@xor.obsecurity.org>, Kris Kennaway writes:

> > > I'm surprised that you're seeing that much of a "hang".
> > > Even if the disks are busy, the system should slow down all disk
> > > processes equally, so no one process "blocks", but they're all a
> > > little slower.
> >
> > I collected a bit of data:
> >
> > While copying a large file from disk1 to disk2,
> >
> > time ls on a small directory on disk3 (not cached in memory)
> >
> > real    0m0.032s
> > user    0m0.000s
> > sys     0m0.003s
> >
> > time ls on a small directory on disk2
> >
> > real    4m51.911s
> > user    0m0.000s
> > sys     0m0.002s
> >
> > I expect access to a busy disk to take longer, but 5 minutes is
> > a bit much.  And that's the root directory of the filesystem,
> > it didn't have to follow a long chain of directories to get there.
> >
> > Sometimes I see long delays when accessing disk3, but it is
> > behaving at the moment.
>
> ls still has to acquire a number of locks in order to be sure that the
> contents of the directory aren't changing.  If there are lots of other
> processes all competing for these locks, it will be slow.  It looks
> like that's the case on your system, although details of your workload
> have been trimmed from your email.

In telnet window 1:

	cd /disk1/
	cp -ip very_big_file /disk2/bar/	(the workload)

In telnet window 2:

	time ls /disk3/foo1/	(make sure time and ls are cached in memory)
	time ls /disk3/foo2/	(see timing numbers above)
	time ls /disk2/		(see timing numbers above)

The /disk2/ directory is small; it contains only 3 directories and .snap.

Would the cp into /disk2/bar/ lock the /disk2/ directory?

IIRC the cp was still running after the ls completed.

Other than the cp and ls, the machine should have been idle.
None of the three disks have /, /usr, /var, /home or similar
filesystems likely to have stray I/O.  The crontab directory is empty.
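
The two-window test above can be wrapped in a small script, which makes it easier to see whether the delay on /disk2/ lasts for the whole copy or only part of it. This is only a minimal sketch, not a diagnosis: the function names elapsed_ls and sample_ls are made up for illustration, and date +%s gives one-second resolution, which is enough to catch multi-minute stalls but not the 0.032s case.

```sh
#!/bin/sh
# Sketch: time a directory listing repeatedly while a large copy runs
# in another window. Function names are hypothetical helpers.

# elapsed_ls DIR: print the wall-clock seconds taken by "ls DIR".
elapsed_ls() {
    start=$(date +%s)
    ls "$1" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# sample_ls DIR COUNT INTERVAL: print COUNT lines of the form
# "HH:MM:SS DIR SECONDS", sleeping INTERVAL seconds between samples.
sample_ls() {
    i=0
    while [ "$i" -lt "$2" ]; do
        printf '%s %s %s\n' "$(date +%H:%M:%S)" "$1" "$(elapsed_ls "$1")"
        i=$((i + 1))
        sleep "$3"
    done
}
```

With the cp running in window 1, something like `sample_ls /disk2/ 10 5` in window 2 would print ten timings five seconds apart, and the same call against /disk3/foo2/ gives a baseline from the disk that is not involved in the copy. Repeat listings of the same directory may be served from cache, so only the first sample is directly comparable to the numbers above.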