From: Damian Danielecki
To: freebsd-stable@freebsd.org
Date: Sat, 17 May 2014 13:55:00 +0200
Subject: [9.3 PRE] filesystem full even if there are free descriptors and disk space

I am receiving errors like this on a heavily loaded FreeBSD 9.3-PRERELEASE NFS server:

May 17 05:01:02 nfsd kernel: pid 4173 (nfsd), uid 0 inumber 185391682 on /exports: filesystem full

The filesystem is newly created and the server is newly installed with a custom minimalist kernel and world. I am an experienced user. There are many free inodes and plenty of disk space:

# df -i
Filesystem           1K-blocks       Used      Avail Capacity    iused     ifree %iused  Mounted on
/dev/mirror/gm0s1a   473048844    3449532  431755408     1%     306673  60848397    1%   /
devfs                        1          1          0   100%          0         0  100%   /dev
/dev/da0p1          3783567612 1772595572 1708286632    51%   20964037 468035769    4%   /exports

# df -h
Filesystem            Size    Used   Avail Capacity  Mounted on
/dev/mirror/gm0s1a    451G    3.3G    411G     1%    /
devfs                 1.0k    1.0k      0B   100%    /dev
/dev/da0p1            3.5T    1.7T    1.6T    51%    /exports

# uname -a
FreeBSD nfsd.xxx.pl 9.3-PRERELEASE FreeBSD 9.3-PRERELEASE #5: Fri May 16 15:41:36 CEST 2014 root@nfsd.xxx.pl:/usr/obj/usr/src/sys/FREEBSD9 amd64

The filesystem is clean and not fragmented. fsck was definitely run on the unmounted filesystem, and I still see these errors after fsck.

# fsck -t ufs -y /dev/da0p1
** /dev/da0p1
** Last Mounted on /exports
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
20964037 files, 443148893 used, 502743010 free (396842 frags, 62793271 blocks, 0.0% fragmentation)

These are my filesystem params.

# tunefs -p /exports
tunefs: POSIX.1e ACLs: (-a)                                disabled
tunefs: NFSv4 ACLs: (-N)                                   disabled
tunefs: MAC multilabel: (-l)                               disabled
tunefs: soft updates: (-n)                                 enabled
tunefs: soft update journaling: (-j)                       disabled
tunefs: gjournal: (-J)                                     disabled
tunefs: trim: (-t)                                         disabled
tunefs: maximum blocks per file in a cylinder group: (-e)  4096
tunefs: average file size: (-f)                            16384
tunefs: average number of files in a directory: (-s)       64
tunefs: minimum percentage of free space: (-m)             8%
tunefs: space to hold for metadata blocks: (-k)            6408
tunefs: optimization preference: (-o)                      time
tunefs: volume label: (-L)

My mount flags are:

/dev/da0p1 on /exports (ufs, NFS exported, local, noatime, nosuid, with quotas, soft-updates)

I found the file related to the sample inode; it is a user's backup.

  File: "/exports/(...)/xxx.tar.gz"
  Size: 53706752     FileType: Regular File
  Mode: (0640/-rw-r-----)   Uid: ( 6225/ (6225))   Gid: ( 1676/ (1676))
Device: 0,78   Inode: 185391682   Links: 1
Access: Sat May 17 05:00:57 2014
Modify: Sat May 17 05:01:02 2014
Change: Sat May 17 05:01:48 2014

I am unable to decompress it; it is actually damaged:

# cp xxx.tar.gz /otherfilesystem/
# cd /otherfilesystem && gzip -d xxx.tar.gz
gzip: xxx.tar.gz: unexpected end of file
gzip: xxx.tar.gz: uncompress failed

I have rerun this user's backup and I can reproduce the problem right now. The same backup file of the same user now has a different inode, but the problem is the same:

pid 4173 (nfsd), uid 0 inumber 185391652 on /exports: filesystem full

# stat -x xxx.tar.gz
  File: "xxx.tar.gz"
  Size: 55902208     FileType: Regular File
  Mode: (0640/-rw-r-----)   Uid: ( 6225/ (6225))   Gid: ( 1676/ (1676))
Device: 0,78   Inode: 185391652   Links: 1
Access: Sat May 17 12:51:47 2014
Modify: Sat May 17 12:51:51 2014
Change: Sat May 17 12:52:52 2014

Of course I am able to create a new big file from /dev/random; there is free space. I easily created a single 30GB file via NFS.
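
As a cross-check of the free-space claim, the fsck, df -i and tunefs -p figures quoted in this message can be reconciled with a few lines of shell arithmetic. This is only a sketch: the variable names are made up for illustration, and it assumes that the minfree reserve is taken as an integer percentage of the total data fragments, which is an assumption about how df arrives at "Avail" rather than something stated in the output.

```shell
#!/bin/sh
# Figures copied from the fsck, df -i and tunefs -p output in this message.
used_frags=443148893      # fsck: "443148893 used"
free_frags=502743010      # fsck: "502743010 free"
loose_frags=396842        # fsck: "396842 frags"
free_blocks=62793271      # fsck: "62793271 blocks"
total_kb=3783567612       # df -i: 1K-blocks for /dev/da0p1
minfree_pct=8             # tunefs: minimum percentage of free space

# Derived values (names are illustrative, not taken from any tool).
total_frags=$((used_frags + free_frags))
frag_kb=$((total_kb / total_frags))
frags_per_block=$(( (free_frags - loose_frags) / free_blocks ))

# Assumption: the reserve is minfree percent of all data fragments,
# rounded down, and is subtracted from the free fragments.
reserve_frags=$((total_frags * minfree_pct / 100))
used_kb=$((used_frags * frag_kb))
avail_kb=$(( (free_frags - reserve_frags) * frag_kb ))

echo "fragment size:   ${frag_kb} KB"          # 4
echo "frags per block: ${frags_per_block}"     # 8
echo "used:            ${used_kb} KB"          # 1772595572, same as df Used
echo "avail:           ${avail_kb} KB"         # 1708286632, same as df Avail
```

Under that assumption the numbers agree exactly with the df output, so roughly 1.6T remains available even after the 8% reserve is set aside.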

I have also computed the MD5 sum of a newly generated 1GB file many times via NFS, and it is always the same. I checked this to be sure that NFS transmission is valid and that the da0 device is working properly.

I have added some primitive debug output to the kernel sources, but this is a production environment, so I will be able to reboot the server only at night. Right now it is impossible.

/usr/src/sys/ufs # grep -R 'filesystem full' *
ffs/ffs_balloc.c: ffs_fserr(fs, ip->i_number, "filesystem full line 320");
ffs/ffs_balloc.c: ffs_fserr(fs, ip->i_number, "filesystem full line 397");
ffs/ffs_balloc.c: ffs_fserr(fs, ip->i_number, "filesystem full line 882");
ffs/ffs_balloc.c: ffs_fserr(fs, ip->i_number, "filesystem full line 960");
ffs/ffs_alloc.c: ffs_fserr(fs, ip->i_number, "filesystem full line 227");
ffs/ffs_alloc.c: ffs_fserr(fs, ip->i_number, "filesystem full line 438");

Any help will be appreciated! This should be corrected before 9.3-RELEASE.

DD
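
For anyone who wants to repeat the checksum test described in this message, a minimal sketch follows. It is not the exact procedure that was used: TARGET_DIR, SIZE_MB and PASSES are hypothetical knobs, the defaults are deliberately small, and /dev/urandom stands in for /dev/random. Pointing TARGET_DIR at an NFS mount of /exports and setting SIZE_MB=1024 would approximate the 1GB test.

```shell
#!/bin/sh
# Sketch: write a random file once, checksum it repeatedly, and count
# how many passes disagree with the first digest.
# TARGET_DIR, SIZE_MB and PASSES are hypothetical settings for illustration.
TARGET_DIR=${TARGET_DIR:-${TMPDIR:-/tmp}}
SIZE_MB=${SIZE_MB:-8}
PASSES=${PASSES:-5}

# FreeBSD ships md5(1); fall back to md5sum on systems without it.
digest() {
    if command -v md5 >/dev/null 2>&1; then
        md5 -q "$1"
    else
        md5sum "$1" | cut -d ' ' -f 1
    fi
}

testfile="$TARGET_DIR/md5check.$$"
dd if=/dev/urandom of="$testfile" bs=1048576 count="$SIZE_MB" 2>/dev/null

first=$(digest "$testfile")
mismatches=0
i=1
while [ "$i" -le "$PASSES" ]; do
    # Any digest that differs from the first pass counts as a mismatch.
    [ "$(digest "$testfile")" = "$first" ] || mismatches=$((mismatches + 1))
    i=$((i + 1))
done

rm -f "$testfile"
echo "passes=$PASSES mismatches=$mismatches"
```

One caveat: repeated reads from the same client may be served from its cache, so a run like this says more about the data path if the file is written on one machine and checksummed on another.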