Date: Sun, 2 Dec 2007 13:33:09 +0100
From: Johan Ström <johan@stromnet.se>
To: m.rebele@web.de
Cc: freebsd-current@freebsd.org
Subject: Re: 7.0-Beta 3: zfs makes system reboot
Message-ID: <CAAD8692-4EBA-4BEF-9523-721EFFC5643E@stromnet.se>
In-Reply-To: <475039D5.4020204@web.de>
References: <475039D5.4020204@web.de>
On Nov 30, 2007, at 17:27, Michael Rebele wrote:

> Hello,
>
> I have been testing ZFS since 7.0-Beta 1.
> At first I only had access to a 32-bit machine (P4/3 GHz with 2 GB RAM,
> two disks for RAID 1 and two disks for a ZFS RAID 0).
>
> While running iozone with the following call:
>
> iozone -R -a -z -b file.wks -g 4G -f testile
>
> (inspired by Dominic Kay from Sun, see
> http://blogs.sun.com/dom/entry/zfs_v_vxfs_iozone for details)
>
> the well-known "kmem_malloc" error occurred and stopped the system:
>
> panic: kmem_malloc(131072): kmem_map too small: 398491648 total allocated
> cpuid=1
>
> I tried several optimizations as suggested in the ZFS Tuning Guide and in
> several postings on this list. The problem stayed essentially the same:
> depending on the configuration (raising the vm.kmem_ sizes, only KVA_PAGES,
> or both) it either stopped with a "kmem_malloc" panic or rebooted without
> warning, but it never completed the benchmark. With more memory in
> vm.kmem_size and vm.kmem_size_max the problem just showed up later.
>
> But OK, the main target for ZFS is amd64, not i386. I now have access to an
> Intel Woodcrest system, a Xeon 5160 with 4 GB RAM and a single disk. It uses
> UFS for the system and home, plus one ZFS pool just for data (for the iozone
> benchmark). It runs a vanilla kernel that I haven't touched. I tested both
> the default settings of Beta 3 and the tuning tips from the Tuning Guide.
> It shows the same behaviour as the 32-bit machine, with one major
> difference: it always reboots. There is no kmem_malloc error message (which
> on the 32-bit machine made the system hang).
>
> The problem is the "-z" option of the iozone benchmark. Without it the
> benchmark works (on both the i386 and the amd64 machine). This option makes
> iozone test small record sizes for large files. On a UFS filesystem iozone
> works with "-z", so it looks to me like a problem with ZFS.
>
> Here is some more information (from the amd64 system):
>
> 1. The captured iozone output
>
> [root@zfs /tank/iozone]# iozone -R -a -z -b filez-512M.wks -g 4G -f testile
> ...
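
(A side note for anyone who wants to reproduce the tuning mentioned above:
the knobs from the ZFS Tuning Guide come down to a couple of loader
tunables plus, on i386, a kernel option. A minimal sketch, with values that
are only illustrative and not a recommendation:

    # /boot/loader.conf (example values only, scale them to the machine's RAM)
    vm.kmem_size="512M"
    vm.kmem_size_max="512M"

    # i386 kernel config only, enlarges the kernel virtual address space;
    # requires rebuilding the kernel:
    options KVA_PAGES=512

KVA_PAGES does not exist on amd64, so there only the loader tunables are
relevant.)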
For the record, I can reproduce the same thing on amd64 FreeBSD RELENG_7
(installed from Beta 3 two days ago). It's a Core 2 Duo box with 2 GB of
memory and two SATA drives in a zpool mirror, no special tweaking
whatsoever yet.

The panic was "page fault, supervisor read instruction, page not present",
so not the (apparently) usual kmem_malloc one? So I doubt the other patch
that Alexandre linked to would help?

iozone got to:

    Run began: Sun Dec  2 13:11:53 2007

    Excel chart generation enabled
    Auto Mode
    Cross over of record size disabled.
    Using maximum file size of 4194304 kilobytes.
    Command line used: iozone -R -a -z -b file.wks -g 4G -f testile
    Output is in Kbytes/sec
    Time Resolution = 0.000001 seconds.
    Processor cache size set to 1024 Kbytes.
    Processor cache line size set to 32 bytes.
    File stride size set to 17 * record size.

                                                         random   random     bkwd   record   stride
          KB  reclen    write  rewrite     read   reread     read    write     read  rewrite     read   fwrite frewrite    fread  freread
          64       4   122584   489126   969761  1210227  1033216   503814   769584   516414   877797   291206   460591   703068   735831
          64       8   204474   735831  1452528  1518251  1279447   799377  1255511   752329  1460430   372410   727850  1087638  1279447
    ......
      131072       4    65734    71698  1011780   970967   755928     5479  1008858   494172   931232    65869    68155   906746   910950
      131072       8    79507    74422  1699148  1710185  1350184    10907  1612344   929991  1372725    34699    74782  1407638  1429434
      131072      16    82479    74279  2411000  2426173  2095714    25327  2299061  1608974  2038950    71102    69200  1887231  1893067
      131072      32    75268    73077  3276650  3326454  2954789    70573  3195793  2697621  2987611

and then it died.

No cores were dumped, however. I'm running swap on a gmirror, and if I
recall correctly at least 6.x couldn't dump to a gmirror, so I guess 7.x
can't either. Although the dump message DID say it dumped memory (and it
did say "Dump complete"), savecore didn't find any dumps at boot.

The box didn't do anything else during this test and isn't running any
apps yet. I hadn't encountered the problem before, but then again I've
only been playing with it for two days without any real hard test (just
scp'ed about 50 gigs of data to it, but that's it).

--
Johan Ström
Stromnet
johan@stromnet.se
http://www.stromnet.se/
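
PS. About the missing crash dump: a workaround I have seen suggested for
swap on gmirror (untested here, and the device name below is only an
example) is to point dumpon at the plain swap partition of one of the
disks behind the mirror, so savecore has something it can read at boot:

    # /etc/rc.conf (use the swap partition on one of the real disks
    # behind the mirror; ad4s1b is just a placeholder)
    dumpdev="/dev/ad4s1b"
    dumpdir="/var/crash"

    # or switch the dump device by hand until the next reboot:
    dumpon /dev/ad4s1b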