Date: Sun, 19 May 2013 19:54:59 -0700
From: Dennis Glatting <dg@pki2.com>
To: Paul Kraus <paul@kraus-haus.org>
Cc: Tijl Coosemans <tijl@coosemans.org>, freebsd-questions@freebsd.org
Subject: Re: More than 32 CPUs under 8.4-P
Message-ID: <1369018499.16472.65.camel@btw.pki2.com>
In-Reply-To: <1369014335.16472.60.camel@btw.pki2.com>
References: <1368897188.16472.19.camel@btw.pki2.com>
 <51989FDA.5070302@coosemans.org>
 <1368978686.16472.25.camel@btw.pki2.com>
 <B06924FB-141E-421B-96E0-CEFE37C277A5@kraus-haus.org>
 <1369014335.16472.60.camel@btw.pki2.com>
Minutes after I typed that message, the 2x16 system panicked with the
following backtrace:

  kdb_backtrace
  panic
  vdev_deadman
  vdev_deadman
  vdev_deadman
  spa_deadman
  softclock
  intr_event_execute_handlers
  ithread_loop
  fork_exit
  fork_trampoline

I had just created a memory disk when that happened:

root@iirc:~ # mdconfig -a -t swap -s 1g -u 1
root@iirc:~ # newfs -U /dev/md1
root@iirc:~ # mount /dev/md1 /mnt
root@iirc:~ # cp -p procstat kgdb /mnt
root@iirc:~ # cd /rescue/
root@iirc:/rescue # cp -p * /mnt
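The point of the memory disk is to stage statically linked tools
somewhere that does not depend on ZFS before I/O wedges, since once
the hang hits nothing else will run. For anyone who wants to do the
same, a rough sketch of the staging as a script (the /root/static-tools
path is just an example -- /rescue is statically linked in the base
system, but procstat and kgdb are not, so I copy static builds I made
separately):

  #!/bin/sh
  # stage-rescue.sh -- stage statically linked tools on a swap-backed
  # memory disk so they remain runnable if ZFS I/O hangs.
  md=$(mdconfig -a -t swap -s 1g)   # prints the allocated unit, e.g. md1
  newfs -U /dev/"${md}" > /dev/null
  mount /dev/"${md}" /mnt
  cp -p /rescue/* /mnt              # /rescue binaries are static
  # procstat and kgdb in base are dynamically linked; copy static
  # builds kept elsewhere (example path):
  cp -p /root/static-tools/procstat /root/static-tools/kgdb /mnt
  echo "static tools staged on /mnt (${md})"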
On Sun, 2013-05-19 at 18:45 -0700, Dennis Glatting wrote:
> On Sun, 2013-05-19 at 16:28 -0400, Paul Kraus wrote:
> > On May 19, 2013, at 11:51 AM, Dennis Glatting <freebsd@pki2.com>
> > wrote:
> > 
> > > ZFS hangs on multi-socket systems (Tyan, Supermicro) under 9.1.
> > > ZFS does not hang under 8.4. This (and one other 4-socket
> > > system) is a production system.
> > 
> > Can you be more specific? I have been running 9.0 and 9.1 systems
> > with multiple CPUs, all on ZFS, with no (CPU-related*) issues.
> 
> I have (down to) ten FreeBSD/ZFS systems. Five of them have multiple
> sockets populated. All use AMD CPUs of the 6200 series. Two of the
> multi-socket systems are simply workstations and don't do much file
> I/O, so I have yet to see them fault.
> 
> The remaining three perform significant I/O on (simultaneous) files
> in the 1-8TB range, including sorting, compression, backup, etc.
> (ZFS compression is enabled on some data sets, as is dedup on a few
> minor ones). I also serve iSCSI and NFS from one of these systems.
> 
> Simply put, if I run 9.1 on those three busy systems, ZFS will
> eventually hang under load (within ten hours to a few days), whereas
> it does not under 8.3/8.4. Two of those systems are 4x16 cores, one
> is 2x16, and two are 2x8 cores. Multiple simultaneous pbzip2 runs on
> individual 2-5TB ASCII files generally cause a hang within 10-20
> hours.
> 
> "Hang" means the system is alive and on the network but disk I/O has
> stopped. Run any command and it will produce no output and never
> return to the command prompt; the only exceptions are statically
> linked executables on a memory volume. This includes "reboot," which
> never actually reboots.
> 
> The volumes where work is performed are typically 12-33TB RAIDZ2
> volumes. For example:
> 
> root@mc:~ # zpool list disk-1
> NAME     SIZE  ALLOC   FREE   CAP  DEDUP  HEALTH  ALTROOT
> disk-1  16.2T  5.86T  10.4T   36%  1.32x  ONLINE  -
> 
> root@mc:~ # zpool status disk-1
>   pool: disk-1
>  state: ONLINE
>   scan: scrub repaired 0 in 21h53m with 0 errors on Mon Apr 29 01:52:55 2013
> config:
> 
>         NAME        STATE     READ WRITE CKSUM
>         disk-1      ONLINE       0     0     0
>           raidz2-0  ONLINE       0     0     0
>             da2     ONLINE       0     0     0
>             da3     ONLINE       0     0     0
>             da4     ONLINE       0     0     0
>             da7     ONLINE       0     0     0
>             da5     ONLINE       0     0     0
>             da6     ONLINE       0     0     0
>         cache
>           da0       ONLINE       0     0     0
> 
> errors: No known data errors
> 
> > * I say no CPU-related issues because I have run into SATA timeout
> > issues with an external SATA enclosure holding 4 drives (I know,
> > SATA port expanders are evil, but this is my best option here).
> > Sometimes the zpool hangs hard, sometimes it just becomes
> > unresponsive for a while. My "fix", such as it is, is to tune the
> > ZFS per-vdev queue depth as follows:
> > 
> > vfs.zfs.vdev.min_pending="3"
> > vfs.zfs.vdev.max_pending="5"
> 
> I've not tried those. Currently, these are mine:
> 
> vfs.zfs.write_limit_override="1G"
> vfs.zfs.arc_max="8G"
> vfs.zfs.txg.timeout=15
> vfs.zfs.cache_flush_disable=1
> 
> # Recommended from the net
> # April, 2013
> vfs.zfs.l2arc_norw=0            # Default is 1
> vfs.zfs.l2arc_feed_again=0      # Default is 1
> vfs.zfs.l2arc_noprefetch=0      # Default is 0
> vfs.zfs.l2arc_feed_min_ms=1000  # Default is 200
> 
> > The defaults are 5 and 10 respectively, and when I run with the
> > defaults I hit the timeout issues, but only under very heavy I/O
> > load. I only generate such load when migrating large amounts of
> > data, which thankfully does not happen all that often.
> 
> Two days ago, when the 9.1 system hung, I was able to run a static
> procstat, and it inadvertently(?) printed on the console that da0
> was not responding. Unfortunately I didn't have a static camcontrol
> ready, so I was unable to query the drive.
> 
> That said, according to the criteria at
> https://wiki.freebsd.org/AvgZfsDeadlockDebug that hang isn't a true
> ZFS problem, yet hung it was.
> 
> I have since (today) updated the firmware of most of the devices in
> that system, and it is currently running some tasks. Most of the
> disks in that system are Seagate, but the un-updated devices include
> three WD disks (the RAID1 OS pair and a swap disk) -- un-updated
> because I haven't been able to figure out WD's firmware download --
> and an SSD whose manufacturer indicates the firmware diff is minor,
> though I plan to go back and flash it anyway.
> 
> If my 4x16 system ever finishes its current jobs I will update its
> devices' firmware too, but it is an 8.4-P system and doesn't give me
> any trouble. Another 4x16 system gave me ZFS trouble under 9.1, but
> since I downgraded it to 8.4-P it has been rock solid for the past
> 22 days, often under heavy load.

-- 
Dennis Glatting <dg@pki2.com>
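P.S. For the archives: the tunables quoted above are read at boot
from /boot/loader.conf (a few may also be adjustable at runtime
through sysctl(8); I have not checked which ones are on 9.1). A
minimal sketch combining my current values with Paul's suggested
queue depths -- treat the comments as my reading of them, not gospel:

  # /boot/loader.conf -- ZFS tuning, values from this thread
  vfs.zfs.write_limit_override="1G"  # cap on per-txg write throttle
  vfs.zfs.arc_max="8G"               # cap on ARC size
  vfs.zfs.txg.timeout="15"           # seconds between txg syncs
  vfs.zfs.cache_flush_disable="1"    # skip cache flushes to disks

  # Paul's per-vdev queue depths (defaults are 5 and 10):
  vfs.zfs.vdev.min_pending="3"
  vfs.zfs.vdev.max_pending="5"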