Date: Sun, 18 Aug 2013 14:25:12 GMT From: Konstantin <konstantin.debalyuk@kaspersky.com> To: freebsd-gnats-submit@FreeBSD.org Subject: kern/181375: disk free space blackhole magic Message-ID: <201308181425.r7IEPC8s034739@oldred.freebsd.org> Resent-Message-ID: <201308181430.r7IEU0hD034307@freefall.freebsd.org>
next in thread | raw e-mail | index | archive | help
>Number: 181375 >Category: kern >Synopsis: disk free space blackhole magic >Confidential: no >Severity: non-critical >Priority: low >Responsible: freebsd-bugs >State: open >Quarter: >Keywords: >Date-Required: >Class: sw-bug >Submitter-Id: current-users >Arrival-Date: Sun Aug 18 14:30:00 UTC 2013 >Closed-Date: >Last-Modified: >Originator: Konstantin >Release: FreeBSD 8.3 - FreeBSD 8.4 >Organization: Kaspersky Lab >Environment: FreeBSD 8.3-RELEASE-p5 FreeBSD 8.3-RELEASE-p5 #1: Thu Dec 6 17:13:30 UTC 2012 root@tb.kaspersky-labs.com:/usr/obj/usr/src/sys/GENERIC amd64 FreeBSD test 8.4-RELEASE FreeBSD 8.4-RELEASE #0 r251259: Sun Jun 2 21:26:57 UTC 2013 root@bake.isc.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64 >Description: FreeBSD 8.3 - FreeBSD 8.4 amd64 SMP soft updates noatime A daemon running as unprivileged user writes a log. Newsyslog rotates this log every hour and compress it by gzip. Some time later (about 1-2 weeks on our real systems) filesystem gets full. df and du show different results. Kill processes ('/etc/rc.d/jail stop' to be honest) doesn't help. Finally, it is impossible to umount this partition. We get "Device busy" error, but fstat and/or lsof doesn't show something. We know about "open file deletion" issue and other reasons why df and du shows different results, but it looks like it is not our case. Our real systems use jail environment, but it is possible to repeat this strange behavior without jail. We could not repeat this on non-SMP system, but it is not heavily tested. Below are more details: # tunefs -p /dev/mfid0p4 tunefs: POSIX.1e ACLs: (-a) disabled tunefs: NFSv4 ACLs: (-N) disabled tunefs: MAC multilabel: (-l) disabled tunefs: soft updates: (-n) enabled tunefs: gjournal: (-J) disabled tunefs: trim: (-t) disabled tunefs: maximum blocks per file in a cylinder group: (-e) 2048 tunefs: average file size: (-f) 16384 tunefs: average number of files in a directory: (-s) 64 tunefs: minimum percentage of free space: (-m) 8% tunefs: optimization preference: (-o) time tunefs: volume label: (-L) We tested with disabled Soft Updates. It did not help. /etc/fstat: /dev/mfid0p4 /jail ufs rw,noatime 2 2 We tested without noatime options. It did not help. jail newsyslog.conf: /var/log/urep.txt kuser:logv 664 24 * 1 Z /var/run/kur.service.pid real logs: -rw-rw-r-- 1 1001 1002 17361916663 Aug 16 18:59 urep.txt -rw-rw-r-- 1 1001 1002 5801630304 Aug 16 18:00 urep.txt.0.gz -rw-rw-r-- 1 1001 1002 4950129578 Aug 16 17:00 urep.txt.1.gz -rw-rw-r-- 1 1001 1002 2333474330 Aug 16 08:00 urep.txt.10.gz -rw-rw-r-- 1 1001 1002 1154592135 Aug 16 07:00 urep.txt.11.gz -rw-rw-r-- 1 1001 1002 783671738 Aug 16 06:00 urep.txt.12.gz -rw-rw-r-- 1 1001 1002 641988029 Aug 16 05:00 urep.txt.13.gz -rw-rw-r-- 1 1001 1002 711639597 Aug 16 04:00 urep.txt.14.gz -rw-rw-r-- 1 1001 1002 858166595 Aug 16 03:00 urep.txt.15.gz -rw-rw-r-- 1 1001 1002 1129264281 Aug 16 02:00 urep.txt.16.gz -rw-rw-r-- 1 1001 1002 1825284327 Aug 16 01:00 urep.txt.17.gz -rw-rw-r-- 1 1001 1002 2336803342 Aug 16 00:00 urep.txt.18.gz -rw-rw-r-- 1 1001 1002 2412378597 Aug 15 23:00 urep.txt.19.gz -rw-rw-r-- 1 1001 1002 5808607921 Aug 16 16:00 urep.txt.2.gz -rw-rw-r-- 1 1001 1002 2751801103 Aug 15 22:00 urep.txt.20.gz -rw-rw-r-- 1 1001 1002 2546710751 Aug 15 21:00 urep.txt.21.gz -rw-rw-r-- 1 1001 1002 2540838045 Aug 15 20:00 urep.txt.22.gz -rw-rw-r-- 1 1001 1002 3015737372 Aug 15 19:00 urep.txt.23.gz -rw-rw-r-- 1 1001 1002 3251842089 Aug 15 18:00 urep.txt.24.gz -rw-rw-r-- 1 1001 1002 5239690207 Aug 16 15:00 urep.txt.3.gz -rw-rw-r-- 1 1001 1002 6059472499 Aug 16 14:00 urep.txt.4.gz -rw-rw-r-- 1 1001 1002 6249745715 Aug 16 13:00 urep.txt.5.gz -rw-rw-r-- 1 1001 1002 6314110754 Aug 16 12:00 urep.txt.6.gz -rw-rw-r-- 1 1001 1002 6142097019 Aug 16 11:00 urep.txt.7.gz -rw-rw-r-- 1 1001 1002 5700280821 Aug 16 10:00 urep.txt.8.gz -rw-rw-r-- 1 1001 1002 3843021410 Aug 16 09:00 urep.txt.9.gz # dmesg pid 81826 (ksn_urlrepd), uid 1001 inumber 15002792 on /jail: filesystem full pid 81826 (ksn_urlrepd), uid 1001 inumber 15002792 on /jail: filesystem full pid 81826 (ksn_urlrepd), uid 1001 inumber 15002792 on /jail: filesystem full pid 81826 (ksn_urlrepd), uid 1001 inumber 15002792 on /jail: filesystem full # df -h Filesystem Size Used Avail Capacity Mounted on /dev/mfid0p2 9.7G 1.1G 8.0G 12% / devfs 1.0k 1.0k 0B 100% /dev /dev/mfid0p4 256G 238G -2.9G 101% /jail devfs 1.0k 1.0k 0B 100% /jail/jail0/dev du show different free space: # du -h -d 1 / | grep jail 124G /jail Ok, try to stop jail and check free space again. # /etc/rc.d/jail stop Stopping jails: k.kaspersky-labs.com. Nothing has changed. du and df show the same values. Ok, try to umount filesystem. # umount /dev/mfid0p4 umount: unmount of /jail failed: Device busy Are you sure, really? # sync # fstat | grep jail Nobody use /jail! # umount /dev/mfid0p4 umount: unmount of /jail failed: Device busy We were trying to sync/fstat/umount several times without success. We were waiting several hours. Nothing has changed. Ok, finally try to umount -f # umount -f /dev/mfid0p4 Yeah! # fsck /dev/mfid0p4 ** /dev/mfid0p4 ** Last Mounted on /jail ** Phase 1 - Check Blocks and Sizes ** Phase 2 - Check Pathnames ** Phase 3 - Check Connectivity ** Phase 4 - Check Reference Counts ** Phase 5 - Check Cyl groups 25147 files, 65589771 used, 68699872 free (12208 frags, 8585958 blocks, 0.0% fragmentation) ***** FILE SYSTEM IS CLEAN ***** There are no any errors. Mount it back and check free space. # mount /dev/mfid0p4 # df -h Filesystem Size Used Avail Capacity Mounted on /dev/mfid0p2 9.7G 1.1G 8.0G 12% / devfs 1.0k 1.0k 0B 100% /dev /dev/mfid0p4 256G 125G 110G 53% /jail # du -h -d 1 / | grep jail 125G /jail Yeah! >How-To-Repeat: 1. Create a fs partition and mount it somewhere. It is better to create small size partition. We will use /data for example. 2. Add a local unprivileged account. We will use account named 'logv' and the same group name for example. Create some subdirectory in the /data and give it 'rwx' access rights for our account. We will use /data/logv. 3. Add to /etc/newsyslog.conf: /data/logv/test.log logv:logv 664 24 * 1 NZ i.e. rotate log every hour, owner and group for the archive file is our account, no need to send signal, compress log with gzip, keep a 24 gzipped copies. 4. Start with privilege of logv account simple script: while [ 1 -ne 0 ] ; do echo 111111111111111111111111 >> /data/logv/test.log; done Wait from several hours to several days. Waiting time depend on power of the system and the partition size. You can check filesystem free size by 'df -h | grep data' and 'du -h -d 1 / | grep data' commands. When you see a big difference between du and df free space time has come. Kill the script, comment out rotation of log in the /etc/newsyslog.conf, sync, check by fstat and/or lsof usage of our partition, kill all if need, try to umount the partition. >Fix: Not a real fix: 'umount -f' >Release-Note: >Audit-Trail: >Unformatted:
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201308181425.r7IEPC8s034739>