From: Thomas Vogt
Date: Mon, 10 Mar 2008 15:24:49 +0100
Message-ID: <47D544B1.6070806@bsdunix.ch>
To: current@freebsd.org
Cc: freebsd-fs@FreeBSD.org
Subject: vm_thread_new: kstack allocation failed with many ZFS FS and NFSD

Hi List(s),

I am trying to simulate a realistic workload for our environment in my lab.
The idea was to create 10k+ ZFS filesystems with several thousand files on each and then measure daily workload performance. 10k filesystems may sound silly, but if you need an individual quota for every user on a system, 5-10k filesystems are not unusual for ZFS.

My script to create the ZFS filesystems:

    #!/bin/sh
    i=0
    while [ $i != 10000 ]; do
        zfs create tank/script$i
        i=`expr $i + 1`
    done

My script stopped after creating ~4850 filesystems with:

    vm_thread_new: kstack allocation failed
    vm_thread_new: kstack allocation failed
    vm_thread_new: kstack allocation failed
    vm_thread_new: kstack allocation failed
    vm_thread_new: kstack allocation failed
    vm_thread_new: kstack allocation failed

Of course I blamed my script first, but the problem did not disappear after a reboot. I also tried creating just 1k filesystems at a time instead of 10k. I ran my script 5 times to create 5k filesystems, but in the last run I got:

    Cannot fork: Cannot allocate memory

on the shell and "vm_thread_new: kstack allocation failed" in syslog. Again only ~4850 filesystems were created, the same behavior as the first time when I tried to create 10k.

Another problem occurs during boot. ZFS tries to mount the ~4850 filesystems (which takes a while) and 256 NFS daemons are started. After the machine is up I receive "vm_thread_new: kstack allocation failed" messages. I cannot log in via ssh or run any command from the console. The problem seems to disappear if I disable nfs_server_enable. Is there something I can do? vm.kmem_size_max is already set to 1.5GB.

My second issue looks ZFS-only. Sometimes ZFS mounts nothing after a reboot, and sometimes it mounts only about 1k filesystems instead of all ~4850. I can't see any error message.
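Since the partial-mount problem above produces no error message, one way to quantify it after each reboot is to ask ZFS which datasets it considers mounted. The following is only a sketch: the helper names unmounted_datasets and mount_summary are made up for illustration, and it assumes that "zfs list -H -o name,mounted" prints tab-separated name and yes/no columns, which should be checked against the ZFS version shipped with FreeBSD 7.0.

```sh
#!/bin/sh
# Sketch only: both helpers read tab-separated "name<TAB>mounted" lines on
# stdin (the assumed output format of: zfs list -H -o name,mounted).
# The function names are hypothetical and not part of ZFS.

# Print the names of datasets whose "mounted" column is "no".
unmounted_datasets() {
    awk -F '\t' '$2 == "no" { print $1 }'
}

# Print a one-line summary: how many of the listed datasets are mounted.
mount_summary() {
    awk -F '\t' '
        { total++ }
        $2 == "yes" { mounted++ }
        END { printf "%d of %d mounted\n", mounted, total }
    '
}
```

A possible use after a reboot would be "zfs list -H -t filesystem -o name,mounted | mount_summary", followed by the same pipeline into unmounted_datasets to see which filesystems were skipped.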

Hardware:
    2x Intel Quadcore 53310
    Memory: 8GB
    Motherboard: Intel 5000VSA
    Boot Disk: 1x SATA
    ZFS Storage: LSI SAS 3081E-R Controller
    OS: FreeBSD 7.0 amd64

My rc.conf:

    ifconfig_em0="DHCP"
    nfs_server_enable="YES"
    nfs_server_flags="-u -t -n 256"
    rpcbind_enable="YES"
    sshd_enable="YES"
    zfs_enable="YES"

loader.conf:

    vm.kmem_size_max="1500M"
    vm.kmem_size="1500M"

As far as I know, a kmem_size above 1.5GB is not supported even if the machine has enough memory left; I had some kernel panics with kmem_size > 1.5GB.

It's a test system, so I can run any patch or debug option people need.

dmesg without ZFS:

FreeBSD 7.0-RELEASE #1: Fri Mar 7 13:21:33 UTC 2008
    root@netappkiller.labor.solnet.ch:/usr/obj/usr/src/sys/STORAGE
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(R) CPU E5310 @ 1.60GHz (1603.91-MHz K8-class CPU)
  Origin = "GenuineIntel" Id = 0x6f7 Stepping = 7
  Features=0xbfebfbff
  Features2=0x4e33d
  AMD Features=0x20100800
  AMD Features2=0x1
  Cores per package: 4
usable memory = 8574955520 (8177 MB)
avail memory = 8261730304 (7879 MB)
ACPI APIC Table:
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
 cpu0 (BSP): APIC ID: 0
 cpu1 (AP): APIC ID: 1
 cpu2 (AP): APIC ID: 2
 cpu3 (AP): APIC ID: 3
ioapic0 irqs 0-23 on motherboard
ioapic1 irqs 24-47 on motherboard
lapic0: Forcing LINT1 to edge trigger
kbd1 at kbdmux0
acpi0: on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
acpi0: reservation of 0, a0000 (3) failed
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
acpi_hpet0: iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 900
cpu0: on acpi0
device_attach: acpi_perf0 attach returned 6
device_attach: acpi_perf0 attach returned 6
p4tcc0: on cpu0
cpu1: on acpi0
device_attach: acpi_perf1 attach returned 6
device_attach: acpi_perf1 attach returned 6
p4tcc1: on cpu1
cpu2: on acpi0
device_attach: acpi_perf2 attach returned 6
device_attach: acpi_perf2 attach returned 6
p4tcc2: on cpu2
cpu3: on acpi0
device_attach: acpi_perf3 attach returned 6
device_attach: acpi_perf3 attach returned 6
p4tcc3: on cpu3
acpi_button0: on acpi0
pcib0: port 0xca2,0xca3,0xcf8-0xcff on acpi0
pci0: on pcib0
pcib1: at device 2.0 on pci0
pci1: on pcib1
pcib2: irq 16 at device 0.0 on pci1
pci2: on pcib2
pcib3: irq 16 at device 0.0 on pci2
pci3: on pcib3
pcib4: irq 17 at device 1.0 on pci2
pci4: on pcib4
mpt0: port 0x3000-0x30ff mem 0xb8910000-0xb8913fff,0xb8900000-0xb890ffff irq 17 at device 0.0 on pci4
mpt0: [ITHREAD]
mpt0: MPI Version=1.5.16.0
mpt0: mpt_cam_event: 0x16
mpt0: mpt_cam_event: 0x16
mpt0: mpt_cam_event: 0x16
mpt0: mpt_cam_event: 0x12
mpt0: mpt_cam_event: 0x16
mpt0: mpt_cam_event: 0x16
mpt0: mpt_cam_event: 0x12
mpt0: mpt_cam_event: 0x13
mpt0: mpt_cam_event: 0x12
mpt0: mpt_cam_event: 0x12
mpt0: mpt_cam_event: 0x12
mpt0: mpt_cam_event: 0x12
mpt0: mpt_cam_event: 0x12
mpt0: mpt_cam_event: 0x12
mpt0: mpt_cam_event: 0x16
pcib5: irq 18 at device 2.0 on pci2
pci5: on pcib5
em0: port 0x2020-0x203f mem 0xb8820000-0xb883ffff,0xb8400000-0xb87fffff irq 18 at device 0.0 on pci5
em0: Using MSI interrupt
em0: Ethernet address: 00:15:17:44:df:1c
em0: [FILTER]
em1: port 0x2000-0x201f mem 0xb8800000-0xb881ffff,0xb8000000-0xb83fffff irq 19 at device 0.1 on pci5
em1: Using MSI interrupt
em1: Ethernet address: 00:15:17:44:df:1d
em1: [FILTER]
pcib6: at device 0.3 on pci1
pci6: on pcib6
pcib7: at device 3.0 on pci0
pci7: on pcib7
pci0: at device 8.0 (no driver attached)
pcib8: irq 16 at device 28.0 on pci0
pci8: on pcib8
uhci0: port 0x40a0-0x40bf irq 23 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
uhci0: [ITHREAD]
usb0: on uhci0
usb0: USB revision 1.0
uhub0: on usb0
uhub0: 2 ports with 2 removable, self powered
uhci1: port 0x4080-0x409f irq 22 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
uhci1: [ITHREAD]
usb1: on uhci1
usb1: USB revision 1.0
uhub1: on usb1
uhub1: 2 ports with 2 removable, self powered
uhci2: port 0x4060-0x407f irq 23 at device 29.2 on pci0
uhci2: [GIANT-LOCKED]
uhci2: [ITHREAD]
usb2: on uhci2
usb2: USB revision 1.0
uhub2: on usb2
uhub2: 2 ports with 2 removable, self powered
uhci3: port 0x4040-0x405f irq 22 at device 29.3 on pci0
uhci3: [GIANT-LOCKED]
uhci3: [ITHREAD]
usb3: on uhci3
usb3: USB revision 1.0
uhub3: on usb3
uhub3: 2 ports with 2 removable, self powered
ehci0: mem 0xb8c00400-0xb8c007ff irq 23 at device 29.7 on pci0
ehci0: [GIANT-LOCKED]
ehci0: [ITHREAD]
usb4: EHCI version 1.0
usb4: companion controllers, 2 ports each: usb0 usb1 usb2 usb3
usb4: on ehci0
usb4: USB revision 2.0
uhub4: on usb4
uhub4: 8 ports with 8 removable, self powered
pcib9: at device 30.0 on pci0
pci9: on pcib9
vgapci0: port 0x1000-0x10ff mem 0xb0000000-0xb7ffffff,0xb8b00000-0xb8b0ffff irq 17 at device 12.0 on pci9
isab0: at device 31.0 on pci0
isa0: on isab0
atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x40c0-0x40cf irq 20 at device 31.1 on pci0
ata0: on atapci0
ata0: [ITHREAD]
ata1: on atapci0
ata1: [ITHREAD]
atapci1: port 0x40d8-0x40df,0x40f4-0x40f7,0x40d0-0x40d7,0x40f0-0x40f3,0x4020-0x403f mem 0xb8c00000-0xb8c003ff irq 20 at device 31.2 on pci0
atapci1: [ITHREAD]
atapci1: AHCI Version 01.10 controller with 6 ports detected
ata2: on atapci1
ata2: [ITHREAD]
ata3: on atapci1
ata3: [ITHREAD]
ata4: on atapci1
ata4: [ITHREAD]
ata5: on atapci1
ata5: [ITHREAD]
ata6: on atapci1
ata6: [ITHREAD]
ata7: on atapci1
ata7: [ITHREAD]
pci0: at device 31.3 (no driver attached)
atkbdc0: port 0x60,0x64 irq 1 on acpi0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
psm0: irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: [ITHREAD]
psm0: model IntelliMouse, device ID 3
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
sio0: [FILTER]
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
sio1: [FILTER]
orm0: at iomem 0xc0000-0xc8fff,0xc9000-0xcafff,0xd1000-0xd1fff,0xd2000-0xd2fff on isa0
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounters tick every 1.000 msec
acd0: DVDROM at ata0-master UDMA33
ad4: 715404MB at ata2-master SATA300
ad6: 715404MB at ata3-master SATA300
ad8: 715404MB at ata4-master SATA300
ad10: 715404MB at ata5-master SATA300
ad12: 381554MB at ata6-master SATA300
da0 at mpt0 bus 0 target 2 lun 0
da0: Fixed Direct Access SCSI-5 device
da0: 300.000MB/s transfers
da0: Command Queueing Enabled
da0: 715404MB (1465149168 512 byte sectors: 255H 63S/T 91201C)
da1 at mpt0 bus 0 target 3 lun 0
da1: Fixed Direct Access SCSI-5 device
da1: 300.000MB/s transfers
da1: Command Queueing Enabled
da1: 715404MB (1465149168 512 byte sectors: 255H 63S/T 91201C)
da2 at mpt0 bus 0 target 4 lun 0
da2: Fixed Direct Access SCSI-5 device
da2: 300.000MB/s transfers
da2: Command Queueing Enabled
da2: 715404MB (1465149168 512 byte sectors: 255H 63S/T 91201C)
da3 at mpt0 bus 0 target 5 lun 0
da3: Fixed Direct Access SCSI-5 device
da3: 300.000MB/s transfers
da3: Command Queueing Enabled
da3: 715404MB (1465149168 512 byte sectors: 255H 63S/T 91201C)
da4 at mpt0 bus 0 target 6 lun 0
da4: Fixed Direct Access SCSI-5 device
da4: 300.000MB/s transfers
da4: Command Queueing Enabled
da4: 715404MB (1465149168 512 byte sectors: 255H 63S/T 91201C)
da5 at mpt0 bus 0 target 7 lun 0
da5: Fixed Direct Access SCSI-5 device
da5: 300.000MB/s transfers
da5: Command Queueing Enabled
da5: 715404MB (1465149168 512 byte sectors: 255H 63S/T 91201C)
da6 at mpt0 bus 0 target 8 lun 0
da6: Fixed Direct Access SCSI-5 device
da6: 300.000MB/s transfers
da6: Command Queueing Enabled
da6: 715404MB (1465149168 512 byte sectors: 255H 63S/T 91201C)
lapic1: Forcing LINT1 to edge trigger
SMP: AP CPU #1 Launched!
lapic2: Forcing LINT1 to edge trigger
SMP: AP CPU #2 Launched!
lapic3: Forcing LINT1 to edge trigger
SMP: AP CPU #3 Launched!
Trying to mount root from ufs:/dev/ad12s1a