From: Charles Sprickman
Date: Fri, 7 Jul 2006 22:56:47 -0400 (EDT)
To: freebsd-stable@freebsd.org
Subject: 6.1 quota issues

Hello all,

I'm in the process of rolling out a new shell server and for numerous reasons have decided 6.x is the best fit (jail improvements, SMP improvements, 3Ware driver, pf). The shell server is within a jail, and the uids there are unique so that quotas remain sane. There are about 5000 active accounts using about 40GB of a 210GB partition. The quota.user file is about 4GB.

I just started work on getting quotas set up for everyone after rsyncing all the homedirs over from the old server. At first all seemed well, then I ran into a few issues on subsequent rsyncs.
I had people with large (1GB+) homedirs and quotas in the 1GB-4GB range, and as rsync was chowning the files to the users it was throwing "quota exceeded" errors. Here's a brief example that illustrates what I was seeing:

    root@beta[/home/staff/micro/tmp]# quota micro
    Disk quotas for user micro (uid 5315):
         Filesystem   usage   quota   limit   grace   files   quota   limit   grace
                  / 1630026 3000000 3100000           13393       0       0
    root@beta[/home/staff/micro/tmp]# chown micro index.html
    chown: index.html: Disc quota exceeded
    root@beta[/home/staff/micro/tmp]#

In the past, when I've seen inconsistencies indicating that I needed a manual run of quotacheck, they would show up in the output of the quota command; i.e., "quota" would show the user had more usage than "du" would indicate. The above example is a bit odd: "quota" shows that he's well within his limits, but the kernel thinks otherwise.

Thinking it would be a good idea to stop the jails, turn off quotas, umount the partition, fsck it, mount it and then run quotacheck, I found more problems. My first run of quotacheck ran for a few minutes, reported many inconsistencies and then sat there for quite some time before spitting this out:

    quotacheck: /jails/quota.user: seek failed: Invalid argument

Trying again, it reported the same inconsistencies, then sat there for more than an hour taking up all the available CPU on the box until I killed it. The mtime on quota.user had not changed during the run. Running it yet again now gives me this:

    /jails: fixed: inodes 27 -> 0 blocks 156 -> 0
    quotacheck: /jails/quota.user: seek failed: Invalid argument

    THE FOLLOWING FILE SYSTEM HAD AN UNEXPECTED INCONSISTENCY:
        /dev/twed0s1g (/jails)

For now I can live without quotas, but if there's anything I can test from -stable that might address this, I'd like to try it. I'd say this thing is still a good month from going live, since we have a lot of dependency mess on the old box to clean up before cutting over.

Any ideas what's going on here?
Is this related to the large number of users and the size of the partition? I've seen some of the discussions about snapshots + quotas, but that seems like an entirely different issue. For the time being I've killed "background_fsck" and "check_quotas" in rc.conf, and I'll avoid dumping that fs with the snapshot flag.

What other information can I provide to help better define where this bug lives?

Thanks,

Charles
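
A rough sketch of the arithmetic behind the 4GB quota.user figure, in case it helps narrow things down. It rests on two assumptions that are not confirmed anywhere in this message: that quota.user is a flat array of fixed-size records indexed by uid, and that each record is 32 bytes (eight 32-bit fields, as in the older 32-bit struct dqblk; worth checking against sys/ufs/ufs/quota.h on the box). The helper names are made up for illustration. Under those assumptions the file size tracks the highest uid rather than the number of accounts, and an offset held in a signed 32-bit value would go negative for very high uids.

```python
# Assumption: quota.user is a flat array of fixed-size records indexed by uid.
# Assumption: each record is 32 bytes (eight 32-bit fields); verify against
# sys/ufs/ufs/quota.h on the actual system before trusting these numbers.
DQBLK_SIZE = 32


def record_offset(uid: int) -> int:
    """Byte offset of a uid's record under the flat-array assumption."""
    return uid * DQBLK_SIZE


def highest_uid_for_size(file_size: int) -> int:
    """Largest uid whose record fits entirely within a file of this size."""
    return file_size // DQBLK_SIZE - 1


def as_signed_32(value: int) -> int:
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


if __name__ == "__main__":
    four_gb = 4 * 1024**3
    top_uid = highest_uid_for_size(four_gb)
    # Offset of the record for uid 5315 from the example above.
    print("uid 5315 offset:", record_offset(5315))
    # Highest uid that a ~4GB file would imply.
    print("uid implied by 4GB file:", top_uid)
    # What that offset looks like if squeezed into a signed 32-bit value.
    print("as signed 32-bit:", as_signed_32(record_offset(top_uid)))
```

If the assumptions hold, a 4GB file implies a uid far above anything 5000 accounts should need, so it may be worth looking for files owned by an unexpectedly high uid. Whether a negative offset is what actually produces the "seek failed: Invalid argument" message is a guess, not something verified here.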