From: Gerrit Kühn <gerrit.kuehn@aei.mpg.de>
To: freebsd-stable@freebsd.org
Date: Mon, 26 Nov 2018 09:46:48 +0100
Subject: high cpu irq load and slow boot after update from 10.4 to 11.2
Message-Id: <20181126094648.510fc7f7b773bfdac546d037@aei.mpg.de>
Organization: Max Planck Gesellschaft

Hi all,

A couple of weeks ago, I updated an older storage server (2 CPUs with 4 cores each, 48GB RAM, 36x4GB HDDs, 3 LSI-based mps controllers) from 10.4 to 11.2.

The first thing I noticed was that booting takes much longer now. The system probes each of the 36 HDDs attached to the mps controllers very slowly and multiple times: I can see the light of each disk blinking, and it takes seconds to move on to the next disk. The whole process takes several minutes, whereas it was much faster before.

A nastier issue appears after a couple of weeks of operation (so far, roughly between 15 and 30 days): suddenly there is a very high irq load on one of the CPU cores (cpu:timer), causing high system and CPU load. top easily shows a load average over 10, whereas it was always below 1 before. I cannot identify any process or device as the culprit.

At first I thought this problem could only be cleared by rebooting, but now I managed to get rid of it (at least for some time; I don't know if or when it will be back) while checking out the latest source in the background. I had actually intended to fiddle with some kernel settings, but the issue was suddenly gone after persisting throughout the weekend.

Looking around, I found a couple of vaguely similar reports (like https://lists.freebsd.org/pipermail/freebsd-current/2017-January/064419.html), but these all appear to be fixed by now. I have a couple of other storage machines (mostly mps-based, but always with slightly different hardware) that show no such issue after updating to 11.2.

Any ideas?


cu
  Gerrit