To: freebsd-questions@freebsd.org
From: Adam Strohl
Subject: HAST Error: G_GATE_CMD_START failed: Cannot allocate memory.
Message-ID: <79cd5422-9a89-bc32-a5f4-a4f7180641ee@ateamsystems.com>
Date: Tue, 4 Dec 2018 10:50:42 -0800

Hello Everyone,

We've been successfully using HAST (with ZFS and NFS on top) for years on a number of different deployments for our customers. Typically the setup is two HAST volumes with a ZFS mirror running on top of them.

Today, under FreeBSD 11.2-RELEASE-p4, we went to set up an instance using 4 disks in a raidz1, and ran into an issue as soon as any significant writing occurred. Two of the 4 HAST volumes immediately "crashed" and failed to report a status in the output of "hastctl status":

    Name     Status    Role     Components
    zhsubd0  complete  primary  /dev/ada0p4  nas2
    zhsubd1  -         init     /dev/ada1p4  nas2
    zhsubd2  complete  primary  /dev/ada2p4  nas2
    zhsubd3  -         init     /dev/ada3p4  nas2

On further inspection, syslog showed:

    Dec 4 18:30:14 nas1 hastd[1475]: [zhsubd1] (primary) G_GATE_CMD_START failed: Cannot allocate memory.
    Dec 4 18:30:14 nas1 devd: Processing event '!system=DEVFS subsystem=CDEV type=DESTROY cdev=hast/zhsubd1'
    Dec 4 18:30:14 nas1 devd: Processing event '!system=GEOM subsystem=DEV type=DESTROY cdev=hast/zhsubd1'
    Dec 4 18:30:14 nas1 hastd[578]: [zhsubd1] (primary) Worker process exited ungracefully (pid=1475, exitcode=71).
    Dec 4 18:30:14 nas1 hastd[578]: [zhsubd1] (primary) Changing resource role back to init.

The only trace of this error I can find is in this mailing list entry:

https://lists.freebsd.org/pipermail/freebsd-current/2015-May/055750.html

However, we're not running a custom kernel. This seems to be something specific to running more than 2 volumes.

Does anyone have any insight into what limit is being hit and how to fix it? I can't find much documentation on MAXPHYS and what it does (or did).

I would be grateful for any assistance. Please let me know if there is a better place to post this, or if anyone needs more details.

Thank you!

--
Adam Strohl
http://www.ateamsystems.com/