From: "Philip M. Gollucci"
Organization: P6M7G8 Inc.
Date: Wed, 28 Mar 2012 03:09:11 +0000
To: "Philip M. Gollucci"
Cc: questions@freebsd.org, current@freebsd.org
Subject: Re: freebsd 9.0-release + zfs + mysqld(percona) = kernel: swap zone exhausted, increase kern.maxswzone
Message-ID: <4F7280D7.8050409@p6m7g8.com>
In-Reply-To: <4F7126BA.1000202@p6m7g8.com>
References: <4F708223.4060803@p6m7g8.com> <4F7126BA.1000202@p6m7g8.com>

On 03/27/12 02:32, Philip M. Gollucci wrote:
> Some other tuning updates
>
> $ zfs set zfs:zfs_nocacheflush = 1
> $ sysctl vfs.zfs.prefetch_disable=1
>
> $ cat /etc/my.cnf
> skip-innodb-doublewrite
> innodb_flush_log_at_trx_commit=2
>
>
> $ zfs set primarycache=metadata zmysqlD
> $ zfs set atime=off zmysqlD
> $ zfs set recordsize=16k zmysqlD
>
> but not on zmysqlL
>
> my next plan is to turn off tmpfs and use ZVOL swaps then to simply use
> just zroot/tmp as a normal dir.
>
> after that I'll drastically increase maxswzone.
>
> still hoping someone has already done this.

None of that made a difference; however, I haven't tried the ZVOL swaps
yet b/c they're quite new and this is, after all, production eventually.

So I've been reading up on maxswzone. It seems to me that nobody really
understands it. Fortunately it isn't used very much.

The default works out to roughly 7.7GB of swap from 32MB, okay fine.
If I double it, that should give me 15.4GB from 64MB (still not enough).
If I 16x it, that should give me roughly 123GB from 512MB. That's more
than my physical RAM + swap. Oh well.

I've seen John Baldwin write on lists:
 o) you have another problem if the default isn't enough
 o) when it panics I pick up the crash dump swap info and do
    #blocks in use*totalswblocks/maxswzone
 o) setting it higher claims wired memory which can't be reused.

tuning(7) is from the 4.x days and is useless here.

Something that's really confusing me is whether the output from
$ vmstat -z |grep solaris
is relevant, or the size of my swap itself, or whether by upping
maxswzone I'm taking away too much from zfs in the long run.

So, tracing this below:

kern.maxswzone="536870912" # = 16*(32*1024*1024)
vm.stats.vm.v_page_count: 24411488
n=12205744 ### n = cnt.v_page_count / 2;

	if (maxswzone && n > maxswzone / sizeof(struct swblock))
		n = maxswzone / sizeof(struct swblock);

struct swblock {
	struct swblock	*swb_hnext;
	vm_object_t	swb_object;
	vm_pindex_t	swb_index;
	int		swb_count;
	daddr_t		swb_pages[SWAP_META_PAGES];
};

If this is >43.98 bytes then the conditional is true; however it's not,
b/c the printf() message isn't written out below.

	if (n2 != n)
		printf("Swap zone entries reduced from %d to %d.\n",

which means the initial allocation succeeds with n=12205744 and not
maxswzone.

ITEM         SIZE    LIMIT   USED   FREE    REQ   FAIL  SLEEP
SWAPMETA:     288, 1864135,     0,     0,     0,     0,     0

So I'm more than a little perplexed by these sizes/limits, and that none
of it is used on a system that's running out of it.

subr_param.c:
---------------
long	maxswzone;			/* max swmeta KVA storage */
SYSCTL_LONG(_kern, OID_AUTO, maxswzone, CTLFLAG_RDTUN, &maxswzone, 0,
    "Maximum memory for swap metadata");

#ifdef VM_SWZONE_SIZE_MAX
	maxswzone = VM_SWZONE_SIZE_MAX;
#endif
	TUNABLE_LONG_FETCH("kern.maxswzone", &maxswzone);

param.h:
--------
/*
 * Ceiling on amount of swblock kva space, can be changed via
 * the kern.maxswzone /boot/loader.conf variable.
 */
#ifndef VM_SWZONE_SIZE_MAX
#define VM_SWZONE_SIZE_MAX	(32 * 1024 * 1024)
#endif

swap_pager.c:
--------------
void
swap_pager_swap_init(void)
{
	int n, n2;

	//comments skipped

	nsw_cluster_max = min((MAXPHYS/PAGE_SIZE), MAX_PAGEOUT_CLUSTER);

	mtx_lock(&pbuf_mtx);
	nsw_rcount = (nswbuf + 1) / 2;
	nsw_wcount_sync = (nswbuf + 3) / 4;
	nsw_wcount_async = 4;
	nsw_wcount_async_max = nsw_wcount_async;
	mtx_unlock(&pbuf_mtx);

	/*
	 * Initialize our zone.  Right now I'm just guessing on the number
	 * we need based on the number of pages in the system.  Each swblock
	 * can hold 16 pages, so this is probably overkill.  This reservation
	 * is typically limited to around 32MB by default.
	 */
	n = cnt.v_page_count / 2;
	if (maxswzone && n > maxswzone / sizeof(struct swblock))
		n = maxswzone / sizeof(struct swblock);
	n2 = n;
	swap_zone = uma_zcreate("SWAPMETA", sizeof(struct swblock), NULL, NULL,
	    NULL, NULL, UMA_ALIGN_PTR, UMA_ZONE_NOFREE | UMA_ZONE_VM);
	if (swap_zone == NULL)
		panic("failed to create swap_zone.");
	do {
		if (uma_zone_set_obj(swap_zone, &swap_zone_obj, n))
			break;
		/*
		 * if the allocation failed, try a zone two thirds the
		 * size of the previous attempt.
		 */
		n -= ((n + 2) / 3);
	} while (n > 0);
	if (n2 != n)
		printf("Swap zone entries reduced from %d to %d.\n", n2, n);
	n2 = n;

	/*
	 * Initialize our meta-data hash table.  The swapper does not need to
	 * be quite as efficient as the VM system, so we do not use an
	 * oversized hash table.
	 *
	 *	n:		size of hash table, must be power of 2
	 *	swhash_mask:	hash table index mask
	 */
	for (n = 1; n < n2 / 8; n *= 2)
		;
	swhash = malloc(sizeof(struct swblock *) * n, M_VMPGDATA,
	    M_WAITOK | M_ZERO);
	swhash_mask = n - 1;
	mtx_init(&swhash_mtx, "swap_pager swhash", NULL, MTX_DEF);
}

-- 
------------------------------------------------------------------------
1024D/DB9B8C1C B90B FBC3 A3A1 C71A 8E70 3F8C 75B8 8FFB DB9B 8C1C
Philip M. Gollucci (pgollucci@p6m7g8.com) c: 703.336.9354
Member, Apache Software Foundation
Committer, FreeBSD Foundation
Consultant, P6M7G8 Inc.
Director Operations, Ridecharge Inc.

Work like you don't need the money,
love like you'll never get hurt,
and dance like nobody's watching.