From owner-freebsd-net@freebsd.org Thu Sep 6 13:15:34 2018
From: "Kristof Provost"
To: "Bjoern A. Zeeb"
Cc: "FreeBSD Net"
Subject: vnet shutdown / ifnet_departure_event
Date: Thu, 06 Sep 2018 15:15:30 +0200
Message-ID: <3F130DF9-40CA-45C9-944E-91E1CA2BF445@FreeBSD.org>

Hi Bjoern,

I’m running into an issue with vnet shutdown.
It manifests consistently with pfsync, but if I understand the problem fully it’s not really related to pfsync.

The issue is that we end up with a use-after-free of the struct ifnet of the pfsync interface. When the jail shuts down the pfsync interface is destroyed, but because this is during vnet shutdown we skip a lot of the cleanup, including the `EVENTHANDLER_INVOKE(ifnet_departure_event, ifp);` call. As a result, pf doesn’t get notified that the interface went away, so it keeps its struct pfi_kif for that interface, which it tries to clean up when we get round to doing the vnet shutdown for pf. At that point it tries to clear the if_pf_kif and pfg_pf_kif pointers, for an ifp which has already been freed.

Invoking the event handler from the ‘if (shutdown)’ code in if_detach_internal() fixes the problem, but I’m not totally confident that won’t have any unexpected side effects.

Best regards,
Kristof

From owner-freebsd-net@freebsd.org Thu Sep 6 13:17:34 2018
Subject: Re: iSCSI issues after upgrading to 11.2 x64 RELEASE
To: Eugene Grosbein, Ryan Moeller, freebsd-net@freebsd.org
References: <541494c3-d275-dee2-ff5e-8b276ef8d9d6@gmail.com> <0C437CBE-E525-4277-9315-6205206CDBB7@ixsystems.com>
From: Kaya Saman
Message-ID: <351b8574-d936-7efc-782c-bd50e28fa784@optiplex-networks.com>
Date: Thu, 6 Sep 2018 14:17:20 +0100

So, the system semi hung again.... ctrl + alt + F(n) was the only thing that was working. :-(

On 9/4/18 6:08 PM, Eugene Grosbein wrote:
> 04.09.2018 23:57, Ryan Moeller wrote:
>
>>> The NIC's are Intel based using igb kernel driver:
>>>
>>> igb0: flags=8843 metric 0 mtu 9000
>>> options=6403bb
>> I see your MTU is 9000, and as described by the other thread you linked to, there are issues with 9k jumbo cluster allocation.
>> Some detailed notes are here, but the quick summary is: set MTU < 4096
>> https://gist.github.com/freqlabs/eba9b755f17a223260246becfbb150a1

Yes, MTU 9000, though it seems the 9k issues are related to FreeBSD only?? My other OSes (OpenBSD and Linux based) seem to be able to handle the setting fine, as I haven't experienced any issues with them. However, their driver implementation or handling of things may be quite different, so I cannot form a direct comparison.

Taking your advice and reading through the link, I reset the MTU to 4000 after the 'hang' mentioned above. So far no issues:

24652/3428/28080 mbufs in use (current/cache/total)
0/1358/1358/1525810 mbuf clusters in use (current/cache/total/max)
0/1081 mbuf+clusters out of packet secondary zone in use (current/cache)
24648/129/24777/762905 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/226045 9k jumbo clusters in use (current/cache/total/max)
0/0/0/127150 16k jumbo clusters in use (current/cache/total/max)
104755K/4089K/108844K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0 sendfile syscalls
0 sendfile syscalls completed without I/O request
0 requests for I/O initiated by sendfile
0 pages read by sendfile as part of a request
0 pages were valid at time of a sendfile request
0 pages were requested for read ahead by applications
0 pages were read ahead by sendfile
0 times sendfile encountered an already busy page
0 requests for sfbufs denied
0 requests for sfbufs delayed

>>
>>> Can anyone suggest anything to stop my system from completely locking up and becoming unresponsive?
>>> At the moment I'm not sure if switching to 'Stable' or 'Current' branches is a good solution?
>> The problem has been mitigated for a while on 12-CURRENT, so that might be worth trying. Otherwise I’ve been hoping a committer will put this fix in 11-STABLE, but in the meantime you could manually apply the patch:
>> https://reviews.freebsd.org/D16534
> Intel NIC users also should be aware of chip hardware problems while dealing with 9k MTU, like documented here:
> https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/i218-i219-ethernet-connection-spec-update.pdf
>
> In short, Intel does not recommend MTU over 8500.

That's really interesting!

The card in the system is one of these 4 port ones:
https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/ethernet-controller-i350-datasheet.pdf

For now I'll keep the MTU at 4000; then, when 12 becomes RELEASE, I'll try cranking it up again to see if the problem has been fixed. However, I set up a cron job to mail me the output of 'netstat -m' so I can keep track of the mbufs, though it's probably going to be more useful at full whack (meaning 9k) than now, when it seems the issue has been temporarily alleviated....

Kaya
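
The 'netstat -m' monitoring mentioned above could be sketched roughly as follows. This is only an illustration, not the script actually used: the function names (parse_pressure_counters, has_pressure) and the SAMPLE constant are made up here, and the parser assumes the "requests for ... denied/delayed" line layout seen in the output quoted in this thread, which may differ between FreeBSD versions.

```python
import re

# Matches lines such as "0/0/0 requests for jumbo clusters denied (4k/9k/16k)"
# and "0 requests for sfbufs denied". The layout is assumed from the
# 'netstat -m' output quoted in this thread.
COUNTER_RE = re.compile(
    r"^(\d+(?:/\d+)*) requests for (.+?) (denied|delayed)(?: \(([^)]*)\))?$"
)

# A few lines copied from the output posted above, used as sample input.
SAMPLE = """\
0/0/0/226045 9k jumbo clusters in use (current/cache/total/max)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0 requests for sfbufs denied
"""


def parse_pressure_counters(netstat_output):
    """Return {(resource, kind): {label: count}} for denied/delayed lines."""
    counters = {}
    for line in netstat_output.splitlines():
        match = COUNTER_RE.match(line.strip())
        if not match:
            continue  # usage lines ("... in use") and sendfile stats are skipped
        values = [int(v) for v in match.group(1).split("/")]
        # Lines without a "(a/b/c)" suffix carry a single value.
        labels = match.group(4).split("/") if match.group(4) else ["total"]
        counters[(match.group(2), match.group(3))] = dict(zip(labels, values))
    return counters


def has_pressure(counters):
    """True if any denied or delayed counter is non-zero."""
    return any(v > 0 for per_line in counters.values() for v in per_line.values())
```

A cron job could feed the real 'netstat -m' output into parse_pressure_counters() and only send mail when has_pressure() returns True, so that the 9k counters stand out once the MTU goes back up.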