From owner-freebsd-stable@freebsd.org Sun Sep 18 09:43:14 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 11351BDFB56 for ; Sun, 18 Sep 2016 09:43:14 +0000 (UTC) (envelope-from w8hdkim@gmail.com) Received: from mail-vk0-x22a.google.com (mail-vk0-x22a.google.com [IPv6:2607:f8b0:400c:c05::22a]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id BCD53967 for ; Sun, 18 Sep 2016 09:43:13 +0000 (UTC) (envelope-from w8hdkim@gmail.com) Received: by mail-vk0-x22a.google.com with SMTP id 192so85230744vkl.0 for ; Sun, 18 Sep 2016 02:43:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:from:date:message-id:subject:to; bh=35BYWpOIlew+5O8Ad+v4c1WvmYDmZtRsbV/Wf8q7pWw=; b=ahKUDnZ+tJ+/F7xDsQdvHBc9GFFxt+pNJ2V0Olj5V8aZ46R/h2dPIkrUHHqUlQhN/Y 9X0AoGsBScIfjHMesEAAVH4m6sVhTBOQAZDhed6JAmd4sfOm+yCuTXUiKfNubw/wf0z8 B/Mfks5kBg8On/NXdmpWwv5IdTGPHd0GG0VpUcgyWyEA+MorbSx1XeQtmMN4hjhxXVp3 Rcz4CtU/SIezndP1cVsA7C5Vgfz2UNw8QCCNx119lmKZJUVT+r7eZatqsqY9eBZNH/W5 mhRnJdhkA+lZ+ID4vMKKNp5EkVlU8egygYnwuP4KqjK4/WlYmimzHNFuaNkL4kW0He96 orkg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=35BYWpOIlew+5O8Ad+v4c1WvmYDmZtRsbV/Wf8q7pWw=; b=jpMhlIFmp5VJ7q9oIoB0J5es+2SSUeqMVPj0Shuxfpbna1HhG51DZH7VrgQsMyHQha sf2UdAA+yalBuFyXN+sMe/O2sqgfp4AxUBxVAyvHUIXxLwO78n0VA0uyYOfwP5ec+yJu DqViOHIhCWwdD5+prdy6pA+E6L1zcsT/5QMucEJULaw0UUaNuMU6If5vePUsnzXm+5TN ec7WXQBiH5430D+Az54oGyGr9DnScokVJvCWlDKl3uPNRH5MxEndEq40i8QjbG/i/o0L maGN7oJtKUt86cv4LZE3fRjT+YR3BOd/IPEZnaSseYtWiQTiaJGQUcsPRBAUgzMNBhFu qkfA== X-Gm-Message-State: 
AE9vXwM5mY14I5GYXjJKa/aOSDAyBF3DWsqWpObGuwpeXHAU2JB7Gy3dO3o7p7ILHagNnCCEyVdo0oigWKvxdQ== X-Received: by 10.31.172.209 with SMTP id v200mr10891844vke.55.1474191792519; Sun, 18 Sep 2016 02:43:12 -0700 (PDT) MIME-Version: 1.0 Received: by 10.176.82.54 with HTTP; Sun, 18 Sep 2016 02:43:12 -0700 (PDT) From: Kim Culhan Date: Sun, 18 Sep 2016 05:43:12 -0400 Message-ID: Subject: freebsd-update to FreeBSD 11.0-RC3 then kernel compile fails In function `iflib_legacy_setup' To: freebsd-stable@freebsd.org Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.23 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 18 Sep 2016 09:43:14 -0000 Used freebsd-update from 11.0-RC1 to 11.0-RC3 and kernel compile failed: linking kernel.full iflib.o: In function `iflib_legacy_setup': /usr/src/sys/amd64/compile/hyster3/../../../net/iflib.c:4457: undefined reference to `taskqgroup_attach' kernel config was GENERIC with added: > device pf > device pflog > device pfsync > > options ALTQ > options ALTQ_CBQ > options ALTQ_RED > options ALTQ_RIO > options ALTQ_CODEL > options ALTQ_HFSC > options ALTQ_FAIRQ > options ALTQ_CDNR > options ALTQ_PRIQ Any help greatly appreciated. 
thanks -kim From owner-freebsd-stable@freebsd.org Sun Sep 18 13:09:21 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C1E06BDECB0 for ; Sun, 18 Sep 2016 13:09:21 +0000 (UTC) (envelope-from ubm.freebsd@googlemail.com) Received: from mail-wm0-x243.google.com (mail-wm0-x243.google.com [IPv6:2a00:1450:400c:c09::243]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 600201108 for ; Sun, 18 Sep 2016 13:09:21 +0000 (UTC) (envelope-from ubm.freebsd@googlemail.com) Received: by mail-wm0-x243.google.com with SMTP id l132so9608719wmf.1 for ; Sun, 18 Sep 2016 06:09:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=20120113; h=from:date:to:subject:message-id:mime-version :content-transfer-encoding; bh=I9SbEv8bJI3Wzp4S5PhEKY2AZZiobTt4W/xjx/SO67U=; b=ovc7j1Og6yw3U/qvc6pWxL8AIMlam5o8NgOIuZqGYoGpWkZImVyBOgQ3cOReDMwuk1 aiynHWzOEyQBELiBBkzWvTc42US6V/wrw8jwfObG4MnOqbUkby71pGGX8+feX0tLVoxJ hQtK4z24ua6D3Yk02Unj0R3vCZvFTw5WPc/w3k44+wWlMC4irf8wB/t+mn5tSM/ebMWi anqTcpp9RTXIwEspXqTHdiIpmzG4+oFw1QtcGqzEg+6OIFEsv5VR9V/AjlmIyUh+4st7 4FR4JH7s26kdQr5YgAETeaiHTC2xoRTM47Zs7ElngSX94K6zujYQtA6c6aJPulDxJX35 MPCg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:date:to:subject:message-id:mime-version :content-transfer-encoding; bh=I9SbEv8bJI3Wzp4S5PhEKY2AZZiobTt4W/xjx/SO67U=; b=Bf4I4gzeIrP4m7k+XzUZ2IAUBQvWCazJ/WbjY/2AcX1jisQTkGghLk3R8boHMYDwQv kWkUWCcW1Rt3aFQ3XZnRrfIJbZo3ICC+7KbQn8t1b/l/n/cA/0Gbz+waJ6Uk7ASL4/I9 FXNgAh2A/Gyyr+DO5Vyhgb8npz/gEJdM2i7vDHeai2+fn0vDuDZ+S+R/45TrUT8jC44U 1Vzjfbejbh4f41ZXdtGSSuUFLvA0bCtkJtbLtO8USu1DhCVs9olauZ3LP9cMWy3b+14B 
X7F3bNy0rGrqZqj7Q1lHPwEZXyZKkMyd04noxphtPahKYfVTmrADiDNbZo7jXIV2AOqF Tqlg== X-Gm-Message-State: AE9vXwMpNhQUHzZxHBuJMh7a6wDkUgvh2xMEfN6As1+GqccAEoA08tvHfOJ8nogC7FWP3A== X-Received: by 10.194.109.229 with SMTP id hv5mr22264895wjb.131.1474204159599; Sun, 18 Sep 2016 06:09:19 -0700 (PDT) Received: from ubm.strangled.net (ipb21a85d1.dynamic.kabel-deutschland.de. [178.26.133.209]) by smtp.gmail.com with ESMTPSA id lj2sm17815660wjc.38.2016.09.18.06.09.18 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sun, 18 Sep 2016 06:09:19 -0700 (PDT) From: Marc UBM Bocklet X-Google-Original-From: Marc "UBM" Bocklet Date: Sun, 18 Sep 2016 15:09:17 +0200 To: freebsd-stable Subject: zfs resilver keeps restarting Message-Id: <20160918150917.09f9448464d84d4e50808707@gmail.com> X-Mailer: Sylpheed 3.5.1 (GTK+ 2.24.29; amd64-portbld-freebsd11.0) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 18 Sep 2016 13:09:21 -0000 Hi all, due to two bad cables, I had two drives drop from my striped raidz2 pool (built on top of geli encrypted drives). I replaced one of the drives before I realized that the cabling was at fault - that's the drive which is being replaced in the ouput of zpool status below. I have just installed the new cables and all sata errors are gone. However, the resilver of the pool keeps restarting. 

I see no errors in /var/log/messages, but zpool history -i says:

2016-09-18.14:56:21 [txg:1219501] scan setup func=2 mintxg=3 maxtxg=1219391
2016-09-18.14:56:51 [txg:1219505] scan done complete=0
2016-09-18.14:56:51 [txg:1219505] scan setup func=2 mintxg=3 maxtxg=1219391
2016-09-18.14:57:20 [txg:1219509] scan done complete=0
2016-09-18.14:57:20 [txg:1219509] scan setup func=2 mintxg=3 maxtxg=1219391
2016-09-18.14:57:49 [txg:1219513] scan done complete=0
2016-09-18.14:57:49 [txg:1219513] scan setup func=2 mintxg=3 maxtxg=1219391
2016-09-18.14:58:19 [txg:1219517] scan done complete=0
2016-09-18.14:58:19 [txg:1219517] scan setup func=2 mintxg=3 maxtxg=1219391
2016-09-18.14:58:45 [txg:1219521] scan done complete=0
2016-09-18.14:58:45 [txg:1219521] scan setup func=2 mintxg=3 maxtxg=1219391

I assume that "scan done complete=0" means that the resilver didn't
finish?

pool layout is the following:

  pool: pool
 state: DEGRADED
status: One or more devices is currently being resilvered. The pool
        will continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sun Sep 18 14:51:39 2016
        235G scanned out of 9.81T at 830M/s, 3h21m to go
        13.2M resilvered, 2.34% done
config:

        NAME                        STATE     READ WRITE CKSUM
        pool                        DEGRADED     0     0     0
          raidz2-0                  ONLINE       0     0     0
            da6.eli                 ONLINE       0     0     0
            da7.eli                 ONLINE       0     0     0
            ada1.eli                ONLINE       0     0     0
            ada2.eli                ONLINE       0     0     0
            da10.eli                ONLINE       0     0     2
            da11.eli                ONLINE       0     0     0
            da12.eli                ONLINE       0     0     0
            da13.eli                ONLINE       0     0     0
          raidz2-1                  DEGRADED     0     0     0
            da0.eli                 ONLINE       0     0     0
            da1.eli                 ONLINE       0     0     0
            da2.eli                 ONLINE       0     0     1  (resilvering)
            replacing-3             DEGRADED     0     0     1
              10699825708166646100  UNAVAIL      0     0     0  was /dev/da3.eli
              da4.eli               ONLINE       0     0     0  (resilvering)
            da3.eli                 ONLINE       0     0     0
            da5.eli                 ONLINE       0     0     0
            da8.eli                 ONLINE       0     0     0
            da9.eli                 ONLINE       0     0     0

errors: No known data errors

system is
FreeBSD xxx 10.1-BETA1 FreeBSD 10.1-BETA1 #27 r271633:
Mon Sep 15 22:34:05 CEST 2014
root@xxx:/usr/obj/usr/src/sys/xxx amd64

controller is
SAS2116 PCI-Express Fusion-MPT SAS-2 [Meteor]

Drives are connected via four four-port sata cables.

Should I upgrade to 10.3-release or did I make some sort of
configuration error / overlook something?

Thanks in advance!

Cheers, Marc From owner-freebsd-stable@freebsd.org Sun Sep 18 15:22:25 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1B883BDF090 for ; Sun, 18 Sep 2016 15:22:25 +0000 (UTC) (envelope-from w8hdkim@gmail.com) Received: from mail-yb0-x235.google.com (mail-yb0-x235.google.com [IPv6:2607:f8b0:4002:c09::235]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id CD5DF372 for ; Sun, 18 Sep 2016 15:22:24 +0000 (UTC) (envelope-from w8hdkim@gmail.com) Received: by mail-yb0-x235.google.com with SMTP id x93so68681047ybh.1 for ; Sun, 18 Sep 2016 08:22:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:from:date:message-id:subject:to; bh=rbpFOF9vdxUylOfMAEVJIUA5pQHa54vJB+b2HpjBR/o=; b=Wl1GMsn6OCXqACyiZgiUOxpTtQSbB4Q/Xegw1lrKNppDXLv92AoUQG5XslGDhFauIA 836M2+HYvigaYfX86jhndtq5uebsKsQreu/avaN+nAIQrQwlLfeGDOnLSq4E62ulpCWU HkESguqrJKvqCpjl4KSiMHJSXuqgIZ+jfhCcdQQDoAUAynda0wSZ39U3DXAypVhcH6yq DWPZnZSsL9uG1sCkqanBRzIv6kE/2vjxS5/oU5YY1RYactXXUDC0jOtbdBzTgmKU49H0 M019Y0CFkPMUYqF2ePUuAQyeFtqOnIlNNv2Y7bhc2ta0AG+MQwWM3rAROWXCLbXxGJ2x crFA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=rbpFOF9vdxUylOfMAEVJIUA5pQHa54vJB+b2HpjBR/o=; b=hQe5aXxxpBStCbb0XUc/3iNI957MycnyVSE5uZJbLJdpc+/L4ERky3rt/ttyOimsfk 6s2p98I+bOwZVSpEKBa+ilnjFlYitTCjXCLxbclEFn6jO/zgxLxgu24ChXmGrbksw6xA H1IZRlBNMtpohsHRzylKetqmedI6EpPDlo596ELKH97/P7e7Ly0WT/aODB+B45cdoITU MjFO+cJ+4NC0Z5B44rk5YZ0fsp2Xr7OSFfhti6zhRFbqBQ3XV/v3/AnJ7Q1jkoM0ORlV ZOJBhZPYII3zSzNnD/7Z3g0asZPzOey9+MccH7fRmFptYV/HdUVt9tSn4cznumPcraRd SN/g== X-Gm-Message-State: 
AE9vXwN5Gi83XR0wk3SQldJY5tQ3SCuAPUiv3orVJiNFGv8i2ZVTezjW8FNgLtp/7JPmqttlbO1f8UCuxzVPcg== X-Received: by 10.37.171.201 with SMTP id v67mr1481019ybi.177.1474212143880; Sun, 18 Sep 2016 08:22:23 -0700 (PDT) MIME-Version: 1.0 Received: by 10.176.82.54 with HTTP; Sun, 18 Sep 2016 08:22:23 -0700 (PDT) From: Kim Culhan Date: Sun, 18 Sep 2016 11:22:23 -0400 Message-ID: Subject: RE: working now: freebsd-update to FreeBSD 11.0-RC3 then kernel compile fails In function `iflib_legacy_setup' To: freebsd-stable@freebsd.org Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.23 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 18 Sep 2016 15:22:25 -0000 Attempted to isolate the problem, compiled a GENERIC kernel with nothing added, with no problem. Then with only pf added and then with pf and altq added and still no compile problem. Did not reboot the machine any more between these trials so I think the update process was good, do not know what the difference is. Sorry for the noise. 
thanks -kim From owner-freebsd-stable@freebsd.org Sun Sep 18 16:05:53 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D01B5BDFD11 for ; Sun, 18 Sep 2016 16:05:53 +0000 (UTC) (envelope-from asomers@gmail.com) Received: from mail-oi0-x232.google.com (mail-oi0-x232.google.com [IPv6:2607:f8b0:4003:c06::232]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 966BB1D70 for ; Sun, 18 Sep 2016 16:05:53 +0000 (UTC) (envelope-from asomers@gmail.com) Received: by mail-oi0-x232.google.com with SMTP id a62so31778733oib.1 for ; Sun, 18 Sep 2016 09:05:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:cc; bh=ONKRZH4OBX8uXIkOcBIfYmMuUV3qSfvOJc2xS/3ALzk=; b=zqFWTzBbepkbrS/DlmqSo5+yQ6mhmJJm4ZqYrYJmasVaW3eZMmvcs/Y8KkeRWsocBp 7iax7o+fPH+LVdRnB3mAGb9OFJV1OOWgIhWWrmoPmjv73V3SGFYwJuy3Hg7zw0dlO/Hl i6sSEM3y8GfELlEPyaNKqEWpYe5AEhaj3YbSVv8nhPf9CcBOYYL7u1hw8t2xRR+3o0am EOqKbaU+hGftiifZ+vd5k9kNwiEr855v8M3VVRFR/aMG+arv+IwoVvASUNetrsksu34t QsE+V3k9ou9iOOlfYelgFZrH+OgV1XO8+DNPdhkX/7XJvWsTju5XFLG2QHT5bQJihgvA KOWQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:sender:in-reply-to:references:from :date:message-id:subject:to:cc; bh=ONKRZH4OBX8uXIkOcBIfYmMuUV3qSfvOJc2xS/3ALzk=; b=epr09876HPXAwot5vy6WYZEX/T5Qlo8Qf/nRkGSOxN2lpUiKIpwmZgY1hQqEEbfrHS cZ5ueS6OlbqABpkyppXQeAgF2HD5tkFi5wPbEmev5gvwRFXYNmSxB1daPFGMN/CaOOzj zORB3+wvyUlOoWcsqomJqN+uWdhk5dafmdAqkD1bVDky0CoNMu14eeEJmyWcA6yo1IGb AliOLcDrgfrXDXaAGztYa2JhfcJGSa3e2IzEOEb5/TKZxOVMV0rXSiiEQg2egATef8zY 
Kugto8/r/omvBiY0/2d3FxjYtyPZ86QI2LhJYhJKtX5wYMy4K9l89oERs5nNvLUucwU5 797w== X-Gm-Message-State: AE9vXwMTOeRQHqdxgjX1tW3oJcP1DELGXfbvmS+BbX0QJj/XAcCodkt4E1RlD+KGFM54P6AywP1c22xIIcdBMg== X-Received: by 10.202.97.2 with SMTP id v2mr25460213oib.157.1474214752854; Sun, 18 Sep 2016 09:05:52 -0700 (PDT) MIME-Version: 1.0 Sender: asomers@gmail.com Received: by 10.202.71.11 with HTTP; Sun, 18 Sep 2016 09:05:52 -0700 (PDT) In-Reply-To: <20160918150917.09f9448464d84d4e50808707@gmail.com> References: <20160918150917.09f9448464d84d4e50808707@gmail.com> From: Alan Somers Date: Sun, 18 Sep 2016 10:05:52 -0600 X-Google-Sender-Auth: CFF30-DRyyLLtwmZYEa6_hfAxlM Message-ID: Subject: Re: zfs resilver keeps restarting To: Marc UBM Bocklet Cc: freebsd-stable Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 18 Sep 2016 16:05:53 -0000 On Sun, Sep 18, 2016 at 7:09 AM, Marc UBM Bocklet via freebsd-stable wrote: > > Hi all, > > due to two bad cables, I had two drives drop from my striped raidz2 > pool (built on top of geli encrypted drives). I replaced one of the > drives before I realized that the cabling was at fault - that's the > drive which is being replaced in the ouput of zpool status below. > > I have just installed the new cables and all sata errors are gone. > However, the resilver of the pool keeps restarting. 
>
> I see no errors in /var/log/messages, but zpool history -i says:
>
> 2016-09-18.14:56:21 [txg:1219501] scan setup func=2 mintxg=3 maxtxg=1219391
> 2016-09-18.14:56:51 [txg:1219505] scan done complete=0
> 2016-09-18.14:56:51 [txg:1219505] scan setup func=2 mintxg=3 maxtxg=1219391
> 2016-09-18.14:57:20 [txg:1219509] scan done complete=0
> 2016-09-18.14:57:20 [txg:1219509] scan setup func=2 mintxg=3 maxtxg=1219391
> 2016-09-18.14:57:49 [txg:1219513] scan done complete=0
> 2016-09-18.14:57:49 [txg:1219513] scan setup func=2 mintxg=3 maxtxg=1219391
> 2016-09-18.14:58:19 [txg:1219517] scan done complete=0
> 2016-09-18.14:58:19 [txg:1219517] scan setup func=2 mintxg=3 maxtxg=1219391
> 2016-09-18.14:58:45 [txg:1219521] scan done complete=0
> 2016-09-18.14:58:45 [txg:1219521] scan setup func=2 mintxg=3 maxtxg=1219391
>
> I assume that "scan done complete=0" means that the resilver didn't
> finish?
>
> pool layout is the following:
>
>   pool: pool
>  state: DEGRADED
> status: One or more devices is currently being resilvered. The pool
>         will continue to function, possibly in a degraded state.
> action: Wait for the resilver to complete.
>   scan: resilver in progress since Sun Sep 18 14:51:39 2016
>         235G scanned out of 9.81T at 830M/s, 3h21m to go
>         13.2M resilvered, 2.34% done
> config:
>
>         NAME                        STATE     READ WRITE CKSUM
>         pool                        DEGRADED     0     0     0
>           raidz2-0                  ONLINE       0     0     0
>             da6.eli                 ONLINE       0     0     0
>             da7.eli                 ONLINE       0     0     0
>             ada1.eli                ONLINE       0     0     0
>             ada2.eli                ONLINE       0     0     0
>             da10.eli                ONLINE       0     0     2
>             da11.eli                ONLINE       0     0     0
>             da12.eli                ONLINE       0     0     0
>             da13.eli                ONLINE       0     0     0
>           raidz2-1                  DEGRADED     0     0     0
>             da0.eli                 ONLINE       0     0     0
>             da1.eli                 ONLINE       0     0     0
>             da2.eli                 ONLINE       0     0     1  (resilvering)
>             replacing-3             DEGRADED     0     0     1
>               10699825708166646100  UNAVAIL      0     0     0  was /dev/da3.eli
>               da4.eli               ONLINE       0     0     0  (resilvering)
>             da3.eli                 ONLINE       0     0     0
>             da5.eli                 ONLINE       0     0     0
>             da8.eli                 ONLINE       0     0     0
>             da9.eli                 ONLINE       0     0     0
>
> errors: No known data errors
>
> system is
> FreeBSD xxx 10.1-BETA1 FreeBSD 10.1-BETA1 #27 r271633:
> Mon Sep 15 22:34:05 CEST 2014
> root@xxx:/usr/obj/usr/src/sys/xxx amd64
>
> controller is
> SAS2116 PCI-Express Fusion-MPT SAS-2 [Meteor]
>
> Drives are connected via four four-port sata cables.
>
> Should I upgrade to 10.3-release or did I make some sort of
> configuration error / overlook something?
>
> Thanks in advance!
>
> Cheers,
> Marc

Resilver will start over anytime there's new damage. In your case,
with two failed drives, resilver should've begun after you replaced
the first drive, and restarted after you replaced the second. Have
you seen it restart more than that? If so, keep an eye on the error
counters in "zpool status"; they might give you a clue. You could
also raise the loglevel of devd to "info" in /etc/syslog.conf and see
what gets logged to /var/log/devd.log. That will tell you if drives
are dropping out and automatically rejoining the pool, for example.

-Alan From owner-freebsd-stable@freebsd.org Sun Sep 18 16:22:45 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2E5FFBDF059 for ; Sun, 18 Sep 2016 16:22:45 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id E9371671; Sun, 18 Sep 2016 16:22:44 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from slw by zxy.spb.ru with local (Exim 4.86 (FreeBSD)) (envelope-from ) id 1blerZ-000A4R-D2; Sun, 18 Sep 2016 19:22:41 +0300 Date: Sun, 18 Sep 2016 19:22:41 +0300 From: Slawa Olhovchenkov To: John Baldwin Cc: freebsd-stable@freebsd.org Subject: Re: nginx and FreeBSD11 Message-ID: <20160918162241.GE2960@zxy.spb.ru> References: <20160907191348.GD22212@zxy.spb.ru> <20160915144103.GB2960@zxy.spb.ru> <1823460.vTm8IvUQsF@ralph.baldwin.cx> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1823460.vTm8IvUQsF@ralph.baldwin.cx> User-Agent: Mutt/1.5.24 (2015-08-30) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 18 Sep 2016 16:22:45 -0000 On Thu, Sep 15, 2016 at 10:28:11AM -0700, John Baldwin wrote: > On Thursday, September 15, 2016 05:41:03 PM Slawa Olhovchenkov wrote: > > On Wed, Sep 07, 2016 at 10:13:48PM +0300, Slawa Olhovchenkov wrote: > > > > > I am have strange issuse with nginx on FreeBSD11. > > > I am have FreeBSD11 instaled over STABLE-10. 
> > > nginx build for FreeBSD10 and run w/o recompile work fine. > > > nginx build for FreeBSD11 crushed inside rbtree lookups: next node > > > totaly craped. > > > > > > I am see next potential cause: > > > > > > 1) clang 3.8 code generation issuse > > > 2) system library issuse > > > > > > may be i am miss something? > > > > > > How to find real cause? > > > > I find real cause and this like show-stopper for RELEASE. > > I am use nginx with AIO and AIO from one nginx process corrupt memory > > from other nginx process. Yes, this is cross-process memory > > corruption. > > > > Last case, core dumped proccess with pid 1060 at 15:45:14. > > Corruped memory at 0x860697000. > > I am know about good memory at 0x86067f800. > > Dumping (form core) this region to file and analyze by hexdump I am > > found start of corrupt region -- offset 0000c8c0 from 0x86067f800. > > 0x86067f800+0xc8c0 = 0x86068c0c0 > > > > I am preliminary enabled debuggin of AIO started operation to nginx > > error log (memory address, file name, offset and size of transfer). > > > > grep -i 86068c0c0 error.log near 15:45:14 give target file. > > grep ce949665cbcd.hls error.log near 15:45:14 give next result: > > > > 2016/09/15 15:45:13 [notice] 1055#0: *11659936 AIO_RD 000000082065DB60 start 000000086068C0C0 561b0 2646736 ce949665cbcd.hls > > 2016/09/15 15:45:14 [notice] 1060#0: *10998125 AIO_RD 000000081F1FFB60 start 000000086FF2C0C0 6cdf0 140016832 ce949665cbcd.hls > > 2016/09/15 15:45:14 [notice] 1055#0: *11659936 AIO_RD 00000008216B6B60 start 000000086472B7C0 7ff70 2999424 ce949665cbcd.hls > > Does nginx only use AIO for regular files or does it also use it with sockets? 
>
> You can try using this patch as a diagnostic (you will need to
> run with INVARIANTS enabled, or at least enabled for vfs_aio.c):
>
> Index: vfs_aio.c
> ===================================================================
> --- vfs_aio.c (revision 305811)
> +++ vfs_aio.c (working copy)
> @@ -787,6 +787,8 @@ aio_process_rw(struct kaiocb *job)
>           * aio_aqueue() acquires a reference to the file that is
>           * released in aio_free_entry().
>           */
> +        KASSERT(curproc->p_vmspace == job->userproc->p_vmspace,
> +            ("%s: vmspace mismatch", __func__));
>          if (cb->aio_lio_opcode == LIO_READ) {
>                  auio.uio_rw = UIO_READ;
>                  if (auio.uio_resid == 0)
> @@ -1054,6 +1056,8 @@ aio_switch_vmspace(struct kaiocb *job)
>  {
>
>          vmspace_switch_aio(job->userproc->p_vmspace);
> +        KASSERT(curproc->p_vmspace == job->userproc->p_vmspace,
> +            ("%s: vmspace mismatch", __func__));
>  }
>
> If this panics, then vmspace_switch_aio() is not working for
> some reason.

I tried the following DTrace script:

====
#pragma D option dynvarsize=64m

int req[struct vmspace *, void *];
self int trace;

syscall:freebsd:aio_read:entry
{
        this->aio = *(struct aiocb *)copyin(arg0, sizeof(struct aiocb));
        req[curthread->td_proc->p_vmspace, this->aio.aio_buf] = curthread->td_proc->p_pid;
}

fbt:kernel:aio_process_rw:entry
{
        self->job = args[0];
        self->trace = 1;
}

fbt:kernel:aio_process_rw:return
/self->trace/
{
        req[self->job->userproc->p_vmspace, self->job->uaiocb.aio_buf] = 0;
        self->job = 0;
        self->trace = 0;
}

fbt:kernel:vn_io_fault:entry
/self->trace && !req[curthread->td_proc->p_vmspace, args[1]->uio_iov[0].iov_base]/
{
        this->buf = args[1]->uio_iov[0].iov_base;
        printf("%Y vn_io_fault %p:%p pid %d\n", walltimestamp, curthread->td_proc->p_vmspace, this->buf, req[curthread->td_proc->p_vmspace, this->buf]);
}
===

It did not print any messages around the time of the nginx core dump.
What can I check next? Maybe the context/address space switch for the
kernel process?

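For anyone repeating the log cross-check described earlier in this thread (corrupted address -> AIO_RD line -> owning pid), a small script along these lines might save some grepping. It is only a sketch: the field layout of the AIO_RD debug lines (pid, aiocb pointer, buffer address, length in hex, file offset in decimal, file name) is an assumption inferred from the three sample lines quoted above, and the names parse_aio_reads and reads_covering are made up for illustration.

```python
import re

# Assumed layout of the nginx debug lines, inferred from the samples in
# this thread (not from nginx documentation):
#   <date> <time> [notice] <pid>#0: *<conn> AIO_RD <aiocb> start <buf> <len-hex> <offset> <file>
AIO_RD = re.compile(
    r"^(?P<date>\S+) (?P<time>\S+) \[notice\] (?P<pid>\d+)#\d+: \*\d+ AIO_RD "
    r"(?P<aiocb>[0-9A-Fa-f]+) start (?P<buf>[0-9A-Fa-f]+) "
    r"(?P<length>[0-9A-Fa-f]+) (?P<offset>\d+) (?P<file>\S+)$"
)


def parse_aio_reads(lines):
    """Return one dict per AIO_RD line; lines that do not match are skipped."""
    reads = []
    for line in lines:
        m = AIO_RD.match(line.strip())
        if m is None:
            continue
        reads.append({
            "when": m["date"] + " " + m["time"],
            "pid": int(m["pid"]),
            "buf": int(m["buf"], 16),        # user buffer address
            "length": int(m["length"], 16),  # assumed: transfer size, hex
            "offset": int(m["offset"]),      # assumed: file offset, decimal
            "file": m["file"],
        })
    return reads


def reads_covering(reads, addr):
    """Reads whose [buf, buf + length) range contains addr."""
    return [r for r in reads if r["buf"] <= addr < r["buf"] + r["length"]]


# The three log lines quoted in this thread.
SAMPLE = """\
2016/09/15 15:45:13 [notice] 1055#0: *11659936 AIO_RD 000000082065DB60 start 000000086068C0C0 561b0 2646736 ce949665cbcd.hls
2016/09/15 15:45:14 [notice] 1060#0: *10998125 AIO_RD 000000081F1FFB60 start 000000086FF2C0C0 6cdf0 140016832 ce949665cbcd.hls
2016/09/15 15:45:14 [notice] 1055#0: *11659936 AIO_RD 00000008216B6B60 start 000000086472B7C0 7ff70 2999424 ce949665cbcd.hls
""".splitlines()

if __name__ == "__main__":
    # Known-good base plus the offset where the hexdump starts to differ.
    corrupt_start = 0x86067F800 + 0xC8C0
    for r in reads_covering(parse_aio_reads(SAMPLE), corrupt_start):
        print("%s pid %d buf %#x len %#x file %s"
              % (r["when"], r["pid"], r["buf"], r["length"], r["file"]))
```

With the sample lines, the only read covering the start of the corrupted region is the one issued by pid 1055, while the core came from pid 1060. That matches the cross-process picture described above, under the stated assumptions about the log format.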
From owner-freebsd-stable@freebsd.org Sun Sep 18 16:43:04 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D5BC9BDF6D6 for ; Sun, 18 Sep 2016 16:43:04 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 963DB11FF; Sun, 18 Sep 2016 16:43:04 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from slw by zxy.spb.ru with local (Exim 4.86 (FreeBSD)) (envelope-from ) id 1blfBB-000AXS-AF; Sun, 18 Sep 2016 19:42:57 +0300 Date: Sun, 18 Sep 2016 19:42:57 +0300 From: Slawa Olhovchenkov To: hiren panchasara Cc: jch@FreeBSD.org, Konstantin Belousov , freebsd-stable@FreeBSD.org Subject: Re: 11.0 stuck on high network load Message-ID: <20160918164257.GF2960@zxy.spb.ru> References: <20160904215739.GC22212@zxy.spb.ru> <20160905014612.GA42393@strugglingcoder.info> <20160914213503.GJ2840@zxy.spb.ru> <20160915085938.GN38409@kib.kiev.ua> <20160915090633.GS2840@zxy.spb.ru> <20160916181839.GC2960@zxy.spb.ru> <20160916183053.GL9397@strugglingcoder.info> <20160916190330.GG2840@zxy.spb.ru> <20160916191155.GM9397@strugglingcoder.info> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160916191155.GM9397@strugglingcoder.info> User-Agent: Mutt/1.5.24 (2015-08-30) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 18 Sep 2016 16:43:04 -0000 On Fri, Sep 16, 2016 at 12:11:55PM -0700, 
hiren panchasara wrote:

> + jch@
> On 09/16/16 at 10:03P, Slawa Olhovchenkov wrote:
> > On Fri, Sep 16, 2016 at 11:30:53AM -0700, hiren panchasara wrote:
> >
> > > On 09/16/16 at 09:18P, Slawa Olhovchenkov wrote:
> > > > On Thu, Sep 15, 2016 at 12:06:33PM +0300, Slawa Olhovchenkov wrote:
> > > >
> > > > > On Thu, Sep 15, 2016 at 11:59:38AM +0300, Konstantin Belousov wrote:
> > > > >
> > > > > > On Thu, Sep 15, 2016 at 12:35:04AM +0300, Slawa Olhovchenkov wrote:
> > > > > > > On Sun, Sep 04, 2016 at 06:46:12PM -0700, hiren panchasara wrote:
> > > > > > >
> > > > > > > > On 09/05/16 at 12:57P, Slawa Olhovchenkov wrote:
> > > > > > > > > I am try using 11.0 on Dual E5-2620 (no X2APIC).
> > > > > > > > > Under high network load and may be addtional conditional system go to
> > > > > > > > > unresponsible state -- no reaction to network and console (USB IPMI
> > > > > > > > > emulation). INVARIANTS give to high overhad. Is this exist some way to
> > > > > > > > > debug this?
> > > > > > > >
> > > > > > > > Can you panic it from console to get to db> to get backtrace and other
> > > > > > > > info when it goes unresponsive?
> > > > > > >
> > > > > > > ipmi console don't respond (chassis power diag don't react)
> > > > > > > login on sol console stuck on *tcp.
> > > > > >
> > > > > > Is 'login' you reference is the ipmi client state, or you mean login(1)
> > > > > > on the wedged host ?
> > > > >
> > > > > on the wedged host
> > > > >
> > > > > > If BMC stops responding simultaneously with the host, I would suspect
> > > > > > the hardware platform issues instead of a software problem. Do you have
> > > > > > dedicated LAN port for BMC ?
> > > > >
> > > > > Yes.
> > > > > But BMC emulate USB keyboard and this is may be lock inside USB
> > > > > system.
> > > > > "ipmi console don't respond" must be read as "ipmi console runnnig and
> > > > > attached but system don't react to keypress on this console".
> > > > > at the sime moment system respon to `enter` on ipmi sol console, but
> > > > > after enter `root` stuck in login in the '*tcp' state (I think this is
> > > > > NIS related).
> > > >
> > > > ~^B don't break to debuger.
> > > > But I can login to sol console.
> > >
> > > You can probably:
> > > debug.kdb.enter: set to enter the debugger
> > >
> > > or force a panic and get vmcore:
> > > debug.kdb.panic: set to panic the kernel
> >
> > I am reset this host.
> > PMC samples collected and decoded:
> >
> > @ CPU_CLK_UNHALTED_CORE [4653445 samples]
> >
> > 51.86% [2413083] lock_delay @ /boot/kernel.VSTREAM/kernel
> >  100.0% [2413083] __rw_wlock_hard
> >   100.0% [2413083] tcp_tw_2msl_scan
> >    99.99% [2412958] pfslowtimo
> >     100.0% [2412958] softclock_call_cc
> >      100.0% [2412958] softclock
> >       100.0% [2412958] intr_event_execute_handlers
> >        100.0% [2412958] ithread_loop
> >         100.0% [2412958] fork_exit
> >    00.01% [125] tcp_twstart
> >     100.0% [125] tcp_do_segment
> >      100.0% [125] tcp_input
> >       100.0% [125] ip_input
> >        100.0% [125] swi_net
> >         100.0% [125] intr_event_execute_handlers
> >          100.0% [125] ithread_loop
> >           100.0% [125] fork_exit
> >
> > 09.43% [438774] _rw_runlock_cookie @ /boot/kernel.VSTREAM/kernel
> >  100.0% [438774] tcp_tw_2msl_scan
> >   99.99% [438735] pfslowtimo
> >    100.0% [438735] softclock_call_cc
> >     100.0% [438735] softclock
> >      100.0% [438735] intr_event_execute_handlers
> >       100.0% [438735] ithread_loop
> >        100.0% [438735] fork_exit
> >   00.01% [39] tcp_twstart
> >    100.0% [39] tcp_do_segment
> >     100.0% [39] tcp_input
> >      100.0% [39] ip_input
> >       100.0% [39] swi_net
> >        100.0% [39] intr_event_execute_handlers
> >         100.0% [39] ithread_loop
> >          100.0% [39] fork_exit
> >
> > 08.57% [398970] __rw_wlock_hard @ /boot/kernel.VSTREAM/kernel
> >  100.0% [398970] tcp_tw_2msl_scan
> >   99.99% [398940] pfslowtimo
> >    100.0% [398940] softclock_call_cc
> >     100.0% [398940] softclock
> >      100.0% [398940] intr_event_execute_handlers
> >       100.0% [398940] ithread_loop
> >        100.0% [398940] fork_exit
> >   00.01% [30] tcp_twstart
> >    100.0% [30] tcp_do_segment
> >     100.0% [30] tcp_input
> >      100.0% [30] ip_input
> >       100.0% [30] swi_net
> >        100.0% [30] intr_event_execute_handlers
> >         100.0% [30] ithread_loop
> >          100.0% [30] fork_exit
> >
> > 05.79% [269224] __rw_try_rlock @ /boot/kernel.VSTREAM/kernel
> >  100.0% [269224] tcp_tw_2msl_scan
> >   99.99% [269203] pfslowtimo
> >    100.0% [269203] softclock_call_cc
> >     100.0% [269203] softclock
> >      100.0% [269203] intr_event_execute_handlers
> >       100.0% [269203] ithread_loop
> >        100.0% [269203] fork_exit
> >   00.01% [21] tcp_twstart
> >    100.0% [21] tcp_do_segment
> >     100.0% [21] tcp_input
> >      100.0% [21] ip_input
> >       100.0% [21] swi_net
> >        100.0% [21] intr_event_execute_handlers
> >         100.0% [21] ithread_loop
> >          100.0% [21] fork_exit
> >
> > 05.35% [249141] _rw_wlock_cookie @ /boot/kernel.VSTREAM/kernel
> >  99.76% [248543] tcp_tw_2msl_scan
> >   99.99% [248528] pfslowtimo
> >    100.0% [248528] softclock_call_cc
> >     100.0% [248528] softclock
> >      100.0% [248528] intr_event_execute_handlers
> >       100.0% [248528] ithread_loop
> >        100.0% [248528] fork_exit
> >   00.01% [15] tcp_twstart
> >    100.0% [15] tcp_do_segment
> >     100.0% [15] tcp_input
> >      100.0% [15] ip_input
> >       100.0% [15] swi_net
> >        100.0% [15] intr_event_execute_handlers
> >         100.0% [15] ithread_loop
> >          100.0% [15] fork_exit
> >  00.24% [598] pfslowtimo
> >   100.0% [598] softclock_call_cc
> >    100.0% [598] softclock
> >     100.0% [598] intr_event_execute_handlers
> >      100.0% [598] ithread_loop
> >       100.0% [598] fork_exit
> >
>
> As I suspected, this looks like a hang trying to lock V_tcbinfo.
>
> I'm ccing Julien here who worked on WLOCK -> RLOCK transition to improve
> performance for short-lived connections. I am not too sure if thats the
> problem but looks in similar area so he may be able to provide some
> insights.

I am point to tcp_tw_2msl_scan.
I am expect traveling by list V_twq_2msl.
But I see only endless attempts to lock the first element of this list. Is this correct?

From owner-freebsd-stable@freebsd.org Sun Sep 18 16:46:40 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8E12CBDF7D7 for ; Sun, 18 Sep 2016 16:46:40 +0000 (UTC) (envelope-from ubm.freebsd@googlemail.com) Received: from mail-wm0-x230.google.com (mail-wm0-x230.google.com [IPv6:2a00:1450:400c:c09::230]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 2C5AB13AB for ; Sun, 18 Sep 2016 16:46:40 +0000 (UTC) (envelope-from ubm.freebsd@googlemail.com) Received: by mail-wm0-x230.google.com with SMTP id l68so21615428wml.1 for ; Sun, 18 Sep 2016 09:46:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=20120113; h=from:date:to:subject:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=Wc80ysSA0OXp4VIHJYv9VkxrXzj4P3RMdyu9ULSCDfU=; b=YIYweVefFw/ZMXnLsUCV8P++aQMZ2FzIccVlR9txQ0gJXa+q2oQ0CZXMtE2Moxl7qC LwRU9bDaflKQuT5DNqtmTG2U1iEgzYJuwKOAzld9OC9e7dxYYYD9smFroytWeWIiluWm nIWPI25McYk5fOjFcXEAisrCEf9Gdn/StZ1P+epy+486sSre5vO/XmRpI2MQM+iHUCN6 RKB5tp1I46RMLxwIfUSn9tTKd25YV8JW97CGijmrLWcUqBoTUSwDLTXgBMSJQ4qERAv0 AA/id4x70Ks58kCzyhOt5WyBtX2fJZA5/fiIPrdOx4hCAsiGMHXuBp/ipRXwmAtw8hGQ 4avw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:date:to:subject:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Wc80ysSA0OXp4VIHJYv9VkxrXzj4P3RMdyu9ULSCDfU=; b=BoyHpNS4PWz1EzuEQzf0PfTjJY0YKQeU/8k7OZGj2u82rqowfIG6GRZfDJmd5SFoaq Z+JkY/VSt04HfD15VQStecK6QRJseFs4xP5gmmfUdV16eq2YalilSQrrpqaFHA08HyMn fZX3Ho0GMZy1JcBiZS2O7CT83v19+juAdNIfp3uQEW4NzW/F2fSofhUKyZuTSn2xHB6/
boRI2nyVTg7CBVk9uFM4x4ZaLIOc5lunq7fpVdSjSiive7Z0jNhyQCum4NqfdVNENv3s Sn7nDhDoZvt5+R/sOFMdQtH/7srMM/T12kOBQc2AZ38Bo6jGcbayVqYYd3O1+WyCGqlN H9nw== X-Gm-Message-State: AE9vXwMNYkSl+NeNCTYcggwhEHVD735lfIx+FBrQFPYS2DF6wjiW/EUHeFI36zrRziLcLg== X-Received: by 10.194.135.76 with SMTP id pq12mr19307645wjb.114.1474217198368; Sun, 18 Sep 2016 09:46:38 -0700 (PDT) Received: from ubm.strangled.net (ipb21a85d1.dynamic.kabel-deutschland.de. [178.26.133.209]) by smtp.gmail.com with ESMTPSA id r2sm11005151wmf.14.2016.09.18.09.46.37 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sun, 18 Sep 2016 09:46:37 -0700 (PDT) From: Marc UBM Bocklet X-Google-Original-From: Marc "UBM" Bocklet Date: Sun, 18 Sep 2016 18:46:36 +0200 To: freebsd-stable Subject: Re: zfs resilver keeps restarting Message-Id: <20160918184636.5861562661d4376e845ac75d@gmail.com> In-Reply-To: References: <20160918150917.09f9448464d84d4e50808707@gmail.com> X-Mailer: Sylpheed 3.5.1 (GTK+ 2.24.29; amd64-portbld-freebsd11.0) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 18 Sep 2016 16:46:40 -0000 On Sun, 18 Sep 2016 10:05:52 -0600 Alan Somers wrote: > On Sun, Sep 18, 2016 at 7:09 AM, Marc UBM Bocklet via freebsd-stable > wrote: > > > > Hi all, > > > > due to two bad cables, I had two drives drop from my striped raidz2 > > pool (built on top of geli encrypted drives). I replaced one of the > > drives before I realized that the cabling was at fault - that's the > > drive which is being replaced in the ouput of zpool status below. > > > > I have just installed the new cables and all sata errors are gone. > > However, the resilver of the pool keeps restarting. 
> >
> > I see no errors in /var/log/messages, but zpool history -i says:
> >
> > 2016-09-18.14:56:21 [txg:1219501] scan setup func=2 mintxg=3 maxtxg=1219391
> > 2016-09-18.14:56:51 [txg:1219505] scan done complete=0
> > 2016-09-18.14:56:51 [txg:1219505] scan setup func=2 mintxg=3 maxtxg=1219391
> > 2016-09-18.14:57:20 [txg:1219509] scan done complete=0
> > 2016-09-18.14:57:20 [txg:1219509] scan setup func=2 mintxg=3 maxtxg=1219391
> > 2016-09-18.14:57:49 [txg:1219513] scan done complete=0
> > 2016-09-18.14:57:49 [txg:1219513] scan setup func=2 mintxg=3 maxtxg=1219391
> > 2016-09-18.14:58:19 [txg:1219517] scan done complete=0
> > 2016-09-18.14:58:19 [txg:1219517] scan setup func=2 mintxg=3 maxtxg=1219391
> > 2016-09-18.14:58:45 [txg:1219521] scan done complete=0
> > 2016-09-18.14:58:45 [txg:1219521] scan setup func=2 mintxg=3 maxtxg=1219391
> >
> > I assume that "scan done complete=0" means that the resilver didn't
> > finish?
> >
> > pool layout is the following:
> >
> >   pool: pool
> >  state: DEGRADED
> > status: One or more devices is currently being resilvered. The pool
> >         will continue to function, possibly in a degraded state.
> > action: Wait for the resilver to complete.
> >   scan: resilver in progress since Sun Sep 18 14:51:39 2016
> >         235G scanned out of 9.81T at 830M/s, 3h21m to go
> >         13.2M resilvered, 2.34% done
> > config:
> >
> >         NAME                        STATE     READ WRITE CKSUM
> >         pool                        DEGRADED     0     0     0
> >           raidz2-0                  ONLINE       0     0     0
> >             da6.eli                 ONLINE       0     0     0
> >             da7.eli                 ONLINE       0     0     0
> >             ada1.eli                ONLINE       0     0     0
> >             ada2.eli                ONLINE       0     0     0
> >             da10.eli                ONLINE       0     0     2
> >             da11.eli                ONLINE       0     0     0
> >             da12.eli                ONLINE       0     0     0
> >             da13.eli                ONLINE       0     0     0
> >           raidz2-1                  DEGRADED     0     0     0
> >             da0.eli                 ONLINE       0     0     0
> >             da1.eli                 ONLINE       0     0     0
> >             da2.eli                 ONLINE       0     0     1  (resilvering)
> >             replacing-3             DEGRADED     0     0     1
> >               10699825708166646100  UNAVAIL      0     0     0  was /dev/da3.eli
> >               da4.eli               ONLINE       0     0     0  (resilvering)
> >             da3.eli                 ONLINE       0     0     0
> >             da5.eli                 ONLINE       0     0     0
> >             da8.eli                 ONLINE       0     0     0
> >             da9.eli                 ONLINE       0     0     0
> >
> > errors: No known data errors
> >
> > system is
> > FreeBSD xxx 10.1-BETA1 FreeBSD 10.1-BETA1 #27 r271633:
> > Mon Sep 15 22:34:05 CEST 2014
> > root@xxx:/usr/obj/usr/src/sys/xxx amd64
> >
> > controller is
> > SAS2116 PCI-Express Fusion-MPT SAS-2 [Meteor]
> >
> > Drives are connected via four four-port sata cables.
> >
> > Should I upgrade to 10.3-release or did I make some sort of
> > configuration error / overlook something?
> >
> > Thanks in advance!
> >
> > Cheers,
> > Marc
>
> Resilver will start over anytime there's new damage. In your case,
> with two failed drives, resilver should've begun after you replaced
> the first drive, and restarted after you replaced the second. Have
> you seen it restart more than that? If so, keep an eye on the error
> counters in "zpool status"; they might give you a clue. You could
> also raise the loglevel of devd to "info" in /etc/syslog.conf and see
> what gets logged to /etc/devd.log. That will tell you if drives are
> dropping out and automatically rejoining the pool, for example.

Thanks a lot for your fast reply. Unfortunately (or not), devd is silent and the error count for the pool remains at zero.
The resilver, however, just keeps restarting. The furthest it got was about 68% resilvered. Usually, it gets to 2 - 3%, then restarts. I plan on offlining the pool, upgrading to 10.3, and then reimporting the pool next. Does that make sense? Cheers, Marc -- Marc "UBM" Bocklet From owner-freebsd-stable@freebsd.org Sun Sep 18 17:41:56 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 54830BE0813 for ; Sun, 18 Sep 2016 17:41:56 +0000 (UTC) (envelope-from asomers@gmail.com) Received: from mail-oi0-x22b.google.com (mail-oi0-x22b.google.com [IPv6:2607:f8b0:4003:c06::22b]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 19CB3AB2 for ; Sun, 18 Sep 2016 17:41:56 +0000 (UTC) (envelope-from asomers@gmail.com) Received: by mail-oi0-x22b.google.com with SMTP id a62so34146061oib.1 for ; Sun, 18 Sep 2016 10:41:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:cc; bh=9ClkJI01npIqeY+DQUHww7kdpyrllnoJpMg9aqXpA9I=; b=TKn0iCVi2+7C4uH5qo/JBYNtu+Upg4DEpSekJwAPw9b/84RL0o7LZ4TTcWynzVixsr Gfyvf57cSnD7ZKpZzUManzD/i2UpTXogpCaYLX9AaAMqH0ACWmGlF01YEPP2r6AoTXIR N9v3y4uaFOe5I+lVJ8PYsDq25aAz0pTuS2PLmpzbNt28fCVcWayLg/ow6U9GRwyWPP1D NBE660J4jDue2FRfMfLMJjnkwwO5Zzj/FxeyAxuP2Rk5tLWMZDkPwiAWjek3stBWZcks y+yrnRkAEOIyHQGth+DdwLSfiwBoDlii/SzhAqF7TIb9PO6r3fzcGozd465Q0tekHKoV yPpA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:sender:in-reply-to:references:from :date:message-id:subject:to:cc; bh=9ClkJI01npIqeY+DQUHww7kdpyrllnoJpMg9aqXpA9I=; b=K+YCEEd0qScv940hA4XXfs2h7Il/Jl8Rnw6WXkKqkuGeLjg5QBkQgakMdeFwNLCLEs 
rCWMSCuskybIS1pisSESds18kDEAUPofFrCcoJ6L73Kq8ozW3HcAqhh18vE5Ry+GkQz+ e91HXUDFh7u2t/0jFtAsuH8e3tuERRtKVPBaUHBLLXgQlZ4dZNbawqkveGtL1z67sabr bgNRZHKsyHtWXfYxkVQbFPVPVBudhrpKfNn1PyeGISRU1Kd3aJw1zq+/GMtg3ChCosYx +UrylEncYgFoSmDe8Z0Tf7B4dYkrustSwAlONYNK7fRAP7z1EWfqIP7qd7arziODZc/9 znUA== X-Gm-Message-State: AE9vXwPfR+fd6Je0HOsNFZqLIiKy9HX/5AW8LkKiQOAkC8fi13M9PQ0s3WyjEwNfuKblPsB1DgIECGhpfPVFoA== X-Received: by 10.202.104.224 with SMTP id o93mr4499951oik.82.1474220515358; Sun, 18 Sep 2016 10:41:55 -0700 (PDT) MIME-Version: 1.0 Sender: asomers@gmail.com Received: by 10.202.71.11 with HTTP; Sun, 18 Sep 2016 10:41:54 -0700 (PDT) In-Reply-To: <20160918184636.5861562661d4376e845ac75d@gmail.com> References: <20160918150917.09f9448464d84d4e50808707@gmail.com> <20160918184636.5861562661d4376e845ac75d@gmail.com> From: Alan Somers Date: Sun, 18 Sep 2016 11:41:54 -0600 X-Google-Sender-Auth: VLPVdYzJTlnKKkYEukD1lcziLsM Message-ID: Subject: Re: zfs resilver keeps restarting To: Marc UBM Bocklet Cc: freebsd-stable Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 18 Sep 2016 17:41:56 -0000 On Sun, Sep 18, 2016 at 10:46 AM, Marc UBM Bocklet via freebsd-stable wrote: > On Sun, 18 Sep 2016 10:05:52 -0600 > Alan Somers wrote: > >> On Sun, Sep 18, 2016 at 7:09 AM, Marc UBM Bocklet via freebsd-stable >> wrote: >> > >> > Hi all, >> > >> > due to two bad cables, I had two drives drop from my striped raidz2 >> > pool (built on top of geli encrypted drives). I replaced one of the >> > drives before I realized that the cabling was at fault - that's the >> > drive which is being replaced in the ouput of zpool status below. >> > >> > I have just installed the new cables and all sata errors are gone. >> > However, the resilver of the pool keeps restarting. 
> > Thanks a lot for your fast reply, unfortunately (or not), devd is silent > and the error count for the pool remains at zero. The resilver, however, > just keeps restarting. The furthest it got was about 68% resilvered. > Usually, it gets to 2 - 3%, then restarts. > > I plan on offlining the pool, upgrading to 10.3, and then reimporting > the pool next. Does that make sense? > > Cheers, > Marc I suspect an upgrade won't make a difference, but it certainly won't hurt. Did you remember to change devd's loglevel to "info" and restart syslogd? From owner-freebsd-stable@freebsd.org Sun Sep 18 17:45:25 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8F29ABE08EE for ; Sun, 18 Sep 2016 17:45:25 +0000 (UTC) (envelope-from hps@selasky.org) Received: from mail.turbocat.net (mail.turbocat.net [IPv6:2a01:4f8:d16:4514::2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 5C533C96 for ; Sun, 18 Sep 2016 17:45:25 +0000 (UTC) (envelope-from hps@selasky.org) Received: from laptop015.home.selasky.org (unknown [62.141.129.119]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.turbocat.net (Postfix) with ESMTPSA id 3CE7C1FE023; Sun, 18 Sep 2016 19:45:21 +0200 (CEST) To: slw@zxy.spb.ru, freebsd-stable@freebsd.org From: Hans Petter Selasky Subject: 11.0 stuck on high network load Message-ID: <2490d030-e947-4842-8e91-6498864b0100@selasky.org> Date: Sun, 18 Sep 2016 19:50:08 +0200 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:45.0) Gecko/20100101 Thunderbird/45.0 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: 
Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 18 Sep 2016 17:45:25 -0000

Hi,

I have some tips regarding this thread. Some things you can try:

1) Compile kernel from projects/hps_head instead of your 11-stable?
2) Set net.inet.tcp.per_cpu_timers=1

If the system just hangs, it is pretty likely that the timers are going in a loop due to a typical use-after-free.

Please keep me CC'ed, as I'm not subscribed to @stable.

--HPS

From owner-freebsd-stable@freebsd.org Sun Sep 18 18:10:20 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0136DBE0EF7 for ; Sun, 18 Sep 2016 18:10:20 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id BA2E7BE1 for ; Sun, 18 Sep 2016 18:10:19 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from slw by zxy.spb.ru with local (Exim 4.86 (FreeBSD)) (envelope-from ) id 1blgXg-000CoW-71; Sun, 18 Sep 2016 21:10:16 +0300 Date: Sun, 18 Sep 2016 21:10:16 +0300 From: Slawa Olhovchenkov To: Hans Petter Selasky Cc: freebsd-stable@freebsd.org Subject: Re: 11.0 stuck on high network load Message-ID: <20160918181016.GK2840@zxy.spb.ru> References: <2490d030-e947-4842-8e91-6498864b0100@selasky.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <2490d030-e947-4842-8e91-6498864b0100@selasky.org> User-Agent: Mutt/1.5.24 (2015-08-30) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch
of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 18 Sep 2016 18:10:20 -0000

On Sun, Sep 18, 2016 at 07:50:08PM +0200, Hans Petter Selasky wrote:

> Hi,
>
> Got some tips regarding this thread.
>
> Some things you can try:
>
> 1) Compile kernel from projects/hps_head instead of your 11-stable?

How much does it differ from 11-stable?

> 2) Set net.inet.tcp.per_cpu_timers=1

Already done, since 10.x, via a manual MFC.

> If the system just hangs, it is pretty likely that the timers are going
> in a loop due to typical use after free.
>
> Please keep me CC'ed, hence I'm not subscribed to @stable.
>
> --HPS

From owner-freebsd-stable@freebsd.org Sun Sep 18 18:37:29 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5051ABDFB20 for ; Sun, 18 Sep 2016 18:37:29 +0000 (UTC) (envelope-from dioxinu@gmail.com) Received: from mail-wm0-x22b.google.com (mail-wm0-x22b.google.com [IPv6:2a00:1450:400c:c09::22b]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id E8F04BE for ; Sun, 18 Sep 2016 18:37:28 +0000 (UTC) (envelope-from dioxinu@gmail.com) Received: by mail-wm0-x22b.google.com with SMTP id d66so6396405wmf.0 for ; Sun, 18 Sep 2016 11:37:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:from:date:message-id:subject:to; bh=5aUXdDFlJNoZmCkMhJu+3FZ44mqyo+eFl7a1SUR8oGM=; b=qc1BoFUuaebAtX8b6xBB2fL96cNbbpRGFuVvx+Fp7VyPU9R8LwY4jk6bMkYLEAkRle G9bMkgO/3O5kDNgE6i8QUItuOBjlSy+N9hm0F+myM+E3vwz3Yu1xbEhF4VSDUUPW6en9 YLGRjTZEPr4Dz//Yqhr5KxEUuQdJlxJK3+Ah+rEEPuXYAyhUJBTc+8Z10bfLgu1QCKoJ L0QOU06XhfgsQHh9ZgCADz+CNvGja27FX8XYnVZRQqkKJpwNYjyXOrxCHxuXqWNVWRwE
YYyAKAszmf4RehzKd87PrqG3naLMn2YZLxEcBSsmAoi41hIU6fLqDd9eKYHI5K61RmrG MSYg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=5aUXdDFlJNoZmCkMhJu+3FZ44mqyo+eFl7a1SUR8oGM=; b=bPb9LhZ+ptoqPCiu2MUlgubWIrUpC1cymcDMjSJR2zrLE9xoMCL1gqvrQSmV1YJ3AI lvULuzvuexffldAuJ3rkLkU8FfYnR/YDcVDyouMCZNKI8ohwKy7Q52hcIeLIk+/RfoAt OiIFJ3t9eWJJQL3UC7JCLAOmoxvFw9Ah9N8P19o0ChDAi0bVkglXpoBho8XbiE1nMA+i nk1xW5BbETw3bTH1dXtle5zHcudGdDOlVCuZdtD3/BzI8kqTyUnDf0OFaFrd4y8OWCGN kqfYG4tZlmIH0gry+H2EZNLCbml/V8HYR/2b/ncGt9qqingg6TD47+tRH08Ra4scjpih OQDw== X-Gm-Message-State: AE9vXwMqioaeOdPf4Tb3lgZ1ND/KNHjmlIcwLAKTHMLGpUzAqAefn97earios7m4PABjhiXPuOVmzbysUDhOrw== X-Received: by 10.28.234.157 with SMTP id g29mr4098460wmi.86.1474223846379; Sun, 18 Sep 2016 11:37:26 -0700 (PDT) MIME-Version: 1.0 Received: by 10.80.171.165 with HTTP; Sun, 18 Sep 2016 11:37:25 -0700 (PDT) From: "Alex T." Date: Sun, 18 Sep 2016 11:37:25 -0700 Message-ID: Subject: buildkernel fails with a 'invalid conversion specifier' compiler error To: freebsd-stable@freebsd.org Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.23 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 18 Sep 2016 18:37:29 -0000 Hi guys, I'm on stable/10 branch and have been using it to rebuild world and kernel. This is the revision I'm currently trying to build but started seeing the following issue way before it. URL: svn://svn.freebsd.org/base/stable/10 Revision: 305760 The world builds fine, but building the kernel fails with this error: /usr/src/sys/cam/cam_xpt.c:1060:27: error: invalid conversion specifier 'b' [-Werror,-Wformat-invalid-specifier] ...printf("%s%d: quirks=0x%b\n", perip... 
~^
/usr/src/sys/cam/cam_xpt.c:1061:36: error: data argument not used by format string [-Werror,-Wformat-extra-args]
...periph->unit_number, quirks, bit_st...

This is what my /etc/make.conf looks like:

WITH_PKGNG=yes
SSP_CFLAGS=-fstack-protector-all
WITH_SSP_PORTS=yes
WITHOUT="DOCS"

and I don't have /etc/src.conf. Has anyone seen this issue?

Any idea what might be misconfigured or missing here?

Thank you.

From owner-freebsd-stable@freebsd.org Sun Sep 18 19:07:29 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id F2EB5BE037C for ; Sun, 18 Sep 2016 19:07:29 +0000 (UTC) (envelope-from dim@FreeBSD.org) Received: from tensor.andric.com (tensor.andric.com [IPv6:2001:7b8:3a7:1:2d0:b7ff:fea0:8c26]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "tensor.andric.com", Issuer "COMODO RSA Domain Validation Secure Server CA" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id BAF7B7E5 for ; Sun, 18 Sep 2016 19:07:29 +0000 (UTC) (envelope-from dim@FreeBSD.org) Received: from [IPv6:2001:7b8:3a7::7c4a:b7c3:933d:6041] (unknown [IPv6:2001:7b8:3a7:0:7c4a:b7c3:933d:6041]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by tensor.andric.com (Postfix) with ESMTPSA id 1B1AC373EF; Sun, 18 Sep 2016 21:07:18 +0200 (CEST) Subject: Re: buildkernel fails with a 'invalid conversion specifier' compiler error Mime-Version: 1.0 (Mac OS X Mail 9.3 \(3124\)) Content-Type: multipart/signed; boundary="Apple-Mail=_0865B78F-A055-49AF-BD5C-01ED104A0ECD"; protocol="application/pgp-signature"; micalg=pgp-sha1 X-Pgp-Agent: GPGMail 2.6.1 From: Dimitry Andric In-Reply-To: Date: Sun, 18 Sep 2016 21:07:08 +0200 Cc: freebsd-stable@freebsd.org Message-Id: References: To: "Alex T."
X-Mailer: Apple Mail (2.3124) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 18 Sep 2016 19:07:30 -0000 --Apple-Mail=_0865B78F-A055-49AF-BD5C-01ED104A0ECD Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=us-ascii On 18 Sep 2016, at 20:37, Alex T. wrote: > > I'm on stable/10 branch and have been using it to rebuild world > and kernel. This is the revision I'm currently trying to build but > started seeing the following issue way before it. > > URL: svn://svn.freebsd.org/base/stable/10 > Revision: 305760 > > The world builds fine, but building the kernel fails with this error: > > /usr/src/sys/cam/cam_xpt.c:1060:27: error: > invalid conversion specifier 'b' > [-Werror,-Wformat-invalid-specifier] > ...printf("%s%d: quirks=0x%b\n", perip... > ~^ > /usr/src/sys/cam/cam_xpt.c:1061:36: error: > data argument not used by format > string [-Werror,-Wformat-extra-args] > ...periph->unit_number, quirks, bit_st... > > This is how my /etc/make.conf looks like: > WITH_PKGNG=yes > SSP_CFLAGS=-fstack-protector-all > WITH_SSP_PORTS=yes > WITHOUT="DOCS" > > and I don't have /etc/src.conf. Has anyone seen this issue? > > Any idea what might me misconfigured missing here? It's hard to say what is different on your system, but it looks like the -fformat-extensions flag is somehow not being used for building your kernel. If you can't figure out what causes this, you can try to work around it by setting WITHOUT_FORMAT_EXTENSIONS, or setting WERROR to empty. 
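
The two workarounds above could look roughly like this. This is only a sketch under stated assumptions: the knob is placed in /etc/src.conf (a file the original poster does not have yet and would need to create), WERROR is overridden on the make command line rather than in a config file, and KERNCONF=GENERIC is a placeholder for the actual kernel config name.

```sh
# /etc/src.conf (assumed location; create the file if it does not exist)
#
# Option 1: build the kernel without -fformat-extensions. According to
# src.conf(5) this also disables format checking, so the '%b' specifier
# is no longer diagnosed.
WITHOUT_FORMAT_EXTENSIONS=yes

# Option 2 (an alternative to option 1, not an addition): keep the format
# checks but stop promoting warnings to errors by emptying WERROR for a
# single build. KERNCONF=GENERIC is a placeholder:
#   cd /usr/src && make buildkernel KERNCONF=GENERIC WERROR=
```

Either option only hides the symptom; finding out why -fformat-extensions is not being passed in the first place remains the real fix.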
-Dimitry --Apple-Mail=_0865B78F-A055-49AF-BD5C-01ED104A0ECD Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename=signature.asc Content-Type: application/pgp-signature; name=signature.asc Content-Description: Message signed with OpenPGP using GPGMail -----BEGIN PGP SIGNATURE----- Version: GnuPG/MacGPG2 v2.0.30 iEYEARECAAYFAlfe5eUACgkQsF6jCi4glqPmRQCgjKneVVaTuNV+lebM7n60ZFax rnAAn0Si1dD1kkHhLh0gWgEn6pPq+quC =t69F -----END PGP SIGNATURE----- --Apple-Mail=_0865B78F-A055-49AF-BD5C-01ED104A0ECD-- From owner-freebsd-stable@freebsd.org Sun Sep 18 20:34:21 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8C174BDFB84 for ; Sun, 18 Sep 2016 20:34:21 +0000 (UTC) (envelope-from hps@selasky.org) Received: from mail.turbocat.net (heidi.turbocat.net [88.198.202.214]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 529FD16C4 for ; Sun, 18 Sep 2016 20:34:20 +0000 (UTC) (envelope-from hps@selasky.org) Received: from laptop015.home.selasky.org (unknown [62.141.129.119]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.turbocat.net (Postfix) with ESMTPSA id 1564E1FE023; Sun, 18 Sep 2016 22:34:12 +0200 (CEST) Subject: Re: 11.0 stuck on high network load To: Slawa Olhovchenkov References: <2490d030-e947-4842-8e91-6498864b0100@selasky.org> <20160918181016.GK2840@zxy.spb.ru> Cc: freebsd-stable@freebsd.org From: Hans Petter Selasky Message-ID: Date: Sun, 18 Sep 2016 22:38:58 +0200 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:45.0) Gecko/20100101 Thunderbird/45.0 MIME-Version: 1.0 In-Reply-To: <20160918181016.GK2840@zxy.spb.ru> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: 
freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 18 Sep 2016 20:34:21 -0000 On 09/18/16 20:10, Slawa Olhovchenkov wrote: > On Sun, Sep 18, 2016 at 07:50:08PM +0200, Hans Petter Selasky wrote: > >> Hi, >> >> Got some tips regarding this thread. >> >> Some things you can try: >> >> 1) Compile kernel from projects/hps_head instead of your 11-stable? > > How many difference from 11-stable? Hi, The callout subsystem has a different implementation. Else identical. > >> 2) Set net.inet.tcp.per_cpu_timers=1 > > Already. From 10.x, by manual MFC. OK. > >> If the system just hangs, it is pretty likely that the timers are going >> in a loop due to typical use after free. >> >> Please keep me CC'ed, hence I'm not subscribed to @stable. >> --HPS From owner-freebsd-stable@freebsd.org Sun Sep 18 20:46:21 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7B2CBBDFED8 for ; Sun, 18 Sep 2016 20:46:21 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 2FA171C7F for ; Sun, 18 Sep 2016 20:46:21 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from slw by zxy.spb.ru with local (Exim 4.86 (FreeBSD)) (envelope-from ) id 1bliyf-000H5v-Gw; Sun, 18 Sep 2016 23:46:17 +0300 Date: Sun, 18 Sep 2016 23:46:17 +0300 From: Slawa Olhovchenkov To: Hans Petter Selasky Cc: freebsd-stable@freebsd.org Subject: Re: 11.0 stuck on high network load Message-ID: <20160918204617.GL2840@zxy.spb.ru> References: <2490d030-e947-4842-8e91-6498864b0100@selasky.org> 
<20160918181016.GK2840@zxy.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.24 (2015-08-30) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 18 Sep 2016 20:46:21 -0000 On Sun, Sep 18, 2016 at 10:38:58PM +0200, Hans Petter Selasky wrote: > On 09/18/16 20:10, Slawa Olhovchenkov wrote: > > On Sun, Sep 18, 2016 at 07:50:08PM +0200, Hans Petter Selasky wrote: > > > >> Hi, > >> > >> Got some tips regarding this thread. > >> > >> Some things you can try: > >> > >> 1) Compile kernel from projects/hps_head instead of your 11-stable? > > > > How many difference from 11-stable? > > Hi, > > The callout subsystem has a different implementation. Else identical.

Is it userland compatible? Can I recompile only the kernel?
From owner-freebsd-stable@freebsd.org Mon Sep 19 01:58:43 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5F070BDF68E for ; Mon, 19 Sep 2016 01:58:43 +0000 (UTC) (envelope-from unique.identifier@gmail.com) Received: from mail-qt0-x229.google.com (mail-qt0-x229.google.com [IPv6:2607:f8b0:400d:c0d::229]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 265C93ED for ; Mon, 19 Sep 2016 01:58:40 +0000 (UTC) (envelope-from unique.identifier@gmail.com) Received: by mail-qt0-x229.google.com with SMTP id l91so65680644qte.3 for ; Sun, 18 Sep 2016 18:58:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:from:date:message-id:subject:to; bh=TTJjIgTHokpL9/4DycRdIgm3cOoDH3CPbqyGT8ax3gI=; b=LVKNSk33rbqfksk93vwFlZb32OtaGW8oIiUu3QGIG3faZTyHo/t8SitaQLyIXsSt85 jCkqRjM0A4RtzyYpB559cRV4U5mrA21O7EouizkqLk5DBB3ASuy3Z1DD0QIH0JweP1AY 9CDZwMlacVvQgdbFaYcpS5y0TW9GkGgRRqUz8ziRle7vZJkFzud6O8NAmI2nMP3oeByj HIDJIhzlgQp08Wy/2YpbxgosTvugOlYr3ZtjA0MKUnnbOUmZKnvL+19XgJrI8EiWL+r5 68kCg+hQCT2X9pvoaawKcVBRFdjJLe6YeofmknzT71SMikhRYW0xfCJB9ySx9dwKeiwl R1bA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=TTJjIgTHokpL9/4DycRdIgm3cOoDH3CPbqyGT8ax3gI=; b=LAGM5ENawg9Seb+HHCKh5mUTE2QOQss2rmIMCTD4KU5CB0OUIX4uv0XIs6zhHy5YCM xSQE7pfFs/tOJ+jHr7EMS+3qbXQoGieBIWYS4J7Du6woUytc8EJXq6wPdQS98uhg7sPw W+A1v1VP6BRHz3a002PYfYqS+hWCnm2dM+2x/BdOLCuSRbomicRPXxCx/KHGq2rsifiO CHSy2j5uc7pk+p75eun4sv2kCngLCWOgrttN+C05O8uwwX8m8c/SwpaObQ06eVV6LJvo LJFDHrlq/zT4RErQTOVzi4Zj5dldnGiOShmZX0c+1umCEB9IX6e0qU+yUoC9BKnqCyYF 56iw== X-Gm-Message-State: 
AE9vXwOWjmK9YchaYsQGiaUsQ4Z6Lwc5imB8/vQBoeCykg9S4o2O5NkqcZS6B6C/CZL9aVcQPa657dlUl85YkQ== X-Received: by 10.237.36.33 with SMTP id r30mr20635037qtc.99.1474250319085; Sun, 18 Sep 2016 18:58:39 -0700 (PDT) MIME-Version: 1.0 Received: by 10.237.36.82 with HTTP; Sun, 18 Sep 2016 18:58:38 -0700 (PDT) From: Charles Cowart Date: Sun, 18 Sep 2016 18:58:38 -0700 Message-ID: Subject: Nvidia_load not working To: Freebsd-stable@freebsd.org Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.23 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Sep 2016 01:58:43 -0000 I did a clean install of RC3 over RC2, and I noticed that nvidia_load="yes" no longer appears to work in /boot/loader.conf. I can still load the module from etc/rc.conf From owner-freebsd-stable@freebsd.org Mon Sep 19 02:02:57 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C1F00BDF8C3 for ; Mon, 19 Sep 2016 02:02:57 +0000 (UTC) (envelope-from david@catwhisker.org) Received: from albert.catwhisker.org (mx.catwhisker.org [198.144.209.73]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 75BFEA8F for ; Mon, 19 Sep 2016 02:02:56 +0000 (UTC) (envelope-from david@catwhisker.org) Received: from albert.catwhisker.org (localhost [127.0.0.1]) by albert.catwhisker.org (8.15.2/8.15.2) with ESMTP id u8J22tYs009057; Mon, 19 Sep 2016 02:02:55 GMT (envelope-from david@albert.catwhisker.org) Received: (from david@localhost) by albert.catwhisker.org (8.15.2/8.15.2/Submit) id u8J22tNn009056; Sun, 18 Sep 2016 19:02:55 -0700 (PDT) (envelope-from david) Date: Sun, 18 
Sep 2016 19:02:55 -0700 From: David Wolfskill To: Charles Cowart Cc: Freebsd-stable@freebsd.org Subject: Re: Nvidia_load not working Message-ID: <20160919020255.GG1292@albert.catwhisker.org> Reply-To: stable@freebsd.org Mail-Followup-To: stable@freebsd.org, Charles Cowart , Freebsd-stable@freebsd.org References: MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="tvOENZuN7d6HfOWU" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.6.1 (2016-04-27) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Sep 2016 02:02:57 -0000 --tvOENZuN7d6HfOWU Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Sun, Sep 18, 2016 at 06:58:38PM -0700, Charles Cowart wrote: > I did a clean install of RC3 over RC2, and I noticed that nvidia_load="yes" > no longer appears to work in /boot/loader.conf. I can still load the module > from etc/rc.conf > ... As the nvidia kernel module is part of a port/package, I suspect that this is more of a "ports" issue than a "stable" issue; in particular, if the version of the nvidia-driver you're using is sufficiently recent, you may find a recent ports/UPDATING entry relevant: 20160829: AFFECTS: users of x11/nvidia-driver AUTHOR: cem@FreeBSD.org The NVidia driver has been updated to version 367.35. Starting with version 358.09, new kernel module was added, nvidia-modeset.ko. This new driver component works in conjunction with the nvidia.ko kernel module to program the display engine of the GPU. 
Users that experience hangs when starting X11 server, or observe (II) NVIDIA(0): Validated MetaModes: (II) NVIDIA(0): "NULL" messages in their /var/log/Xorg.0.log file should replace ``nvidia'' with ``nvidia-modeset'' in /boot/loader.conf or /etc/rc.conf files, depending on how they prefer to load NVidia driver kernel module. Peace, david -- David H. Wolfskill david@catwhisker.org Those who would murder in the name of God or prophet are blasphemous cowards. See http://www.catwhisker.org/~david/publickey.gpg for my public key. --tvOENZuN7d6HfOWU Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQF8BAEBCgBmBQJX30dPXxSAAAAAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRDQ0I3Q0VGOTE3QTgwMUY0MzA2NEQ3N0Ix NTM5Q0M0MEEwNDlFRTE3AAoJEBU5zECgSe4XPgMH/iAKb0OZQKAGvl/9+yrDvCwQ riBi0PJ5Di+D5F9hsHtoM9xPxm6MZogEHEo3CBK0dEd2biR490bS7Xc3j8prcu/m AGSFeZ0I8kOBkQmjPsnvRhd8foqUflLDGKy8CLrKgT8AFVpO0wgIZRTqHHzH9o9S Q6WMl33SAd07xc9KKUB5/k4SGUvFH7pvtwroSsq8DY0m5XJh8BJMHVu8xfvJhYwW gqLwfNfQ+XuIm5zDfYRELGJc0MzaK2oBB/VaiQJQbjBixQ+9FXvp5flquYaLXR69 htmwwU7OZ3RyRdze7MazwbO0Aql7mcDKsOuFW6U/QVxpAYsQqtIjwXrayNuCq/8= =Gubn -----END PGP SIGNATURE----- --tvOENZuN7d6HfOWU-- From owner-freebsd-stable@freebsd.org Mon Sep 19 06:25:50 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 458C0BE04D0 for ; Mon, 19 Sep 2016 06:25:50 +0000 (UTC) (envelope-from junchoon@dec.sakura.ne.jp) Received: from dec.sakura.ne.jp (dec.sakura.ne.jp [210.188.226.8]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id E0571810 for ; Mon, 19 Sep 2016 06:25:49 +0000 (UTC) (envelope-from junchoon@dec.sakura.ne.jp) Received: from fortune.joker.local (123-48-23-227.dz.commufa.jp [123.48.23.227]) 
(authenticated bits=0) by dec.sakura.ne.jp (8.15.2/8.15.2/[SAKURA-WEB]/20080708) with ESMTPA id u8J6PdN7071649; Mon, 19 Sep 2016 15:25:39 +0900 (JST) (envelope-from junchoon@dec.sakura.ne.jp) Date: Mon, 19 Sep 2016 15:25:39 +0900 From: Tomoaki AOKI To: freebsd-stable@freebsd.org Subject: Re: Nvidia_load not working Message-Id: <20160919152539.2bc014f05d1276d1ebd8f0ae@dec.sakura.ne.jp> In-Reply-To: <20160919020255.GG1292@albert.catwhisker.org> References: <20160919020255.GG1292@albert.catwhisker.org> Organization: Junchoon corps X-Mailer: Sylpheed 3.5.1 (GTK+ 2.24.29; amd64-portbld-freebsd11.0) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Sep 2016 06:25:50 -0000 I suspect a loader.efi issue rather than the one you mentioned. https://lists.freebsd.org/pipermail/svn-src-stable-11/2016-September/000503.html nvidia.ko is loaded as a dependency of nvidia-modeset.ko, so if nvidia.ko fails to load, nvidia-modeset.ko should fail as well. On Sun, 18 Sep 2016 19:02:55 -0700 David Wolfskill wrote: > On Sun, Sep 18, 2016 at 06:58:38PM -0700, Charles Cowart wrote: > > I did a clean install of RC3 over RC2, and I noticed that nvidia_load="yes" > > no longer appears to work in /boot/loader.conf. I can still load the module > > from etc/rc.conf > > ... > > As the nvidia kernel module is part of a port/package, I suspect that > this is more of a "ports" issue than a "stable" issue; in particular, if > the version of the nvidia-driver you're using is sufficiently recent, > you may find a recent ports/UPDATING entry relevant: > > 20160829: > AFFECTS: users of x11/nvidia-driver > AUTHOR: cem@FreeBSD.org > > The NVidia driver has been updated to version 367.35. 
Starting with > version 358.09, new kernel module was added, nvidia-modeset.ko. This > new driver component works in conjunction with the nvidia.ko kernel > module to program the display engine of the GPU. > > Users that experience hangs when starting X11 server, or observe > > (II) NVIDIA(0): Validated MetaModes: > (II) NVIDIA(0): "NULL" > > messages in their /var/log/Xorg.0.log file should replace ``nvidia'' > with ``nvidia-modeset'' in /boot/loader.conf or /etc/rc.conf files, > depending on how they prefer to load NVidia driver kernel module. > > Peace, > david > -- > David H. Wolfskill david@catwhisker.org > Those who would murder in the name of God or prophet are blasphemous cowards. > > See http://www.catwhisker.org/~david/publickey.gpg for my public key. -- Tomoaki AOKI From owner-freebsd-stable@freebsd.org Mon Sep 19 06:52:33 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B6EFBBE0E03 for ; Mon, 19 Sep 2016 06:52:33 +0000 (UTC) (envelope-from junchoon@dec.sakura.ne.jp) Received: from dec.sakura.ne.jp (dec.sakura.ne.jp [210.188.226.8]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 71E2B31D for ; Mon, 19 Sep 2016 06:52:33 +0000 (UTC) (envelope-from junchoon@dec.sakura.ne.jp) Received: from fortune.joker.local (123-48-23-227.dz.commufa.jp [123.48.23.227]) (authenticated bits=0) by dec.sakura.ne.jp (8.15.2/8.15.2/[SAKURA-WEB]/20080708) with ESMTPA id u8J6Ll38071593 for ; Mon, 19 Sep 2016 15:21:47 +0900 (JST) (envelope-from junchoon@dec.sakura.ne.jp) Date: Mon, 19 Sep 2016 15:21:46 +0900 From: Tomoaki AOKI To: freebsd-stable@freebsd.org Subject: Re: Nvidia_load not working Message-Id: <20160919152146.7e4a27541d1f07968087555e@dec.sakura.ne.jp> In-Reply-To: References: Organization: Junchoon corps 
X-Mailer: Sylpheed 3.5.1 (GTK+ 2.24.29; amd64-portbld-freebsd11.0) Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-2022-JP Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Sep 2016 06:52:33 -0000 If you're booting via UEFI, the cause is probably that r305779 (the MFC of r305484) has not yet been MFS'ed, which leaves the loader with insufficient memory to load nvidia.ko. See below for details. https://lists.freebsd.org/pipermail/svn-src-stable-11/2016-September/000503.html If so, while waiting for the MFS, you can load the module via rc.conf by adding nvidia-modeset.ko (or nvidia.ko, depending on which version you're installing) to the kld_list= line, by writing a kldload command directly in rc.conf, or by rebuilding loader.efi with the patch below applied. https://svnweb.freebsd.org/base/stable/11/sys/boot/efi/loader/copy.c?r1=302408&r2=305779&view=patch Don't forget to pass the -p2 option on the patch command line. If you're booting in legacy BIOS mode, sorry, I have no idea. :-( On Sun, 18 Sep 2016 18:58:38 -0700 Charles Cowart wrote: > I did a clean install of RC3 over RC2, and I noticed that nvidia_load="yes" > no longer appears to work in /boot/loader.conf. 
I can still load the module > from etc/rc.conf > _______________________________________________ > freebsd-stable@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" > -- 青木 知明 [Tomoaki AOKI] From owner-freebsd-stable@freebsd.org Mon Sep 19 07:19:08 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 499A7BE08D2 for ; Mon, 19 Sep 2016 07:19:08 +0000 (UTC) (envelope-from hps@selasky.org) Received: from mail.turbocat.net (heidi.turbocat.net [88.198.202.214]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 109A9B8A for ; Mon, 19 Sep 2016 07:19:07 +0000 (UTC) (envelope-from hps@selasky.org) Received: from laptop015.home.selasky.org (unknown [62.141.129.119]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.turbocat.net (Postfix) with ESMTPSA id A1C951FE023; Mon, 19 Sep 2016 09:19:04 +0200 (CEST) Subject: Re: 11.0 stuck on high network load To: Slawa Olhovchenkov References: <2490d030-e947-4842-8e91-6498864b0100@selasky.org> <20160918181016.GK2840@zxy.spb.ru> <20160918204617.GL2840@zxy.spb.ru> Cc: freebsd-stable@freebsd.org From: Hans Petter Selasky Message-ID: <7536cbef-0a83-3c8c-70e3-363d927704dd@selasky.org> Date: Mon, 19 Sep 2016 09:23:51 +0200 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:45.0) Gecko/20100101 Thunderbird/45.0 MIME-Version: 1.0 In-Reply-To: <20160918204617.GL2840@zxy.spb.ru> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Sep 2016 07:19:08 -0000 On 09/18/16 22:46, Slawa Olhovchenkov wrote: > userbase compatible? > can i recompile only kernel? Yes, only the kernel. --HPS From owner-freebsd-stable@freebsd.org Mon Sep 19 08:04:16 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8DC41BE0EBE for ; Mon, 19 Sep 2016 08:04:16 +0000 (UTC) (envelope-from Klauder@nt.ag) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id 72872196B for ; Mon, 19 Sep 2016 08:04:16 +0000 (UTC) (envelope-from Klauder@nt.ag) Received: by mailman.ysv.freebsd.org (Postfix) id 6E695BE0EBD; Mon, 19 Sep 2016 08:04:16 +0000 (UTC) Delivered-To: stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6BCFFBE0EBC for ; Mon, 19 Sep 2016 08:04:16 +0000 (UTC) (envelope-from Klauder@nt.ag) Received: from mailrelay.ntag.de (mailrelay.ntag.de [81.209.134.225]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id D2E07196A for ; Mon, 19 Sep 2016 08:04:15 +0000 (UTC) (envelope-from Klauder@nt.ag) Received: from mailhost.nt-web.de ([192.168.128.12]) by mailrelay.ntag.de (8.15.2/8.15.2) with ESMTP id u8J80Isc019308 for ; Mon, 19 Sep 2016 10:00:18 +0200 (CEST) (envelope-from Klauder@nt.ag) X-Notes-Item: Memo; name=Form X-Notes-Item: CN=Sascha Klauder/O=NTAG; name=SentBy X-Notes-Item: ; name=AltFrom X-Notes-Item: ; name=AltSendTo X-Notes-Item: ; name=$LangFrom X-Notes-Item: ; name=$NameLanguageTags Subject: =?ISO-8859-1?Q?AUTO=3A_Sascha_Klauder_ist_au=DFer_Haus_=28R=FCckkehr_am?= 
=?ISO-8859-1?Q?_10=2F25=2F2016=29?= X-Notes-Item: ; name=$AutoForward X-Notes-Item: OOAgent; name=GeneratedBy X-Notes-Item: 1; name=$AssistMail Auto-Submitted: auto-generated From: Sascha Klauder To: stable@freebsd.org Message-ID: Date: Mon, 19 Sep 2016 10:00:16 +0200 X-Notes-Item: 0; name=Encrypt X-Notes-Item: ; name=HasSafeStamp X-Notes-Item: CN=DOMINO-NTAG/O=NTAG; type=501; flags=44; name=$UpdatedBy X-Notes-Item: Mon, 19 Sep 2016 10:00:16 +0200; type=400; name=$Revisions X-Notes-Item: nt.ag; name=FromDomain X-Notes-Item: 25; type=300; name=$Hops X-Notes-Item: 1; name=$NoteHasNativeMIME X-Notes-Item: CN=Sascha Klauder/O=NTAG; name=OriginalFrom X-MIMETrack: Serialize by Router on DOMINO-NTAG/NTAG(Release 8.5.3FP6|November 21, 2013) at 19.09.2016 10:00:18 MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.23 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Sep 2016 08:04:16 -0000 I will return on 10/25/2016. Dear Sir or Madam, I am on parental leave from 26.08.2016 to 25.10.2016 and can only handle e-mail irregularly. Your message will not be forwarded. If necessary, please contact support@nt.ag. Note: This is an automatic reply to your message "Re: Nvidia_load not working" sent on 19.09.2016 04:02:55. 
This is the only notification you will receive while this person is away. From owner-freebsd-stable@freebsd.org Mon Sep 19 10:27:34 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id EEDFFBE0580 for ; Mon, 19 Sep 2016 10:27:34 +0000 (UTC) (envelope-from yaruta.arkadiy@gmail.com) Received: from mail-yw0-x22d.google.com (mail-yw0-x22d.google.com [IPv6:2607:f8b0:4002:c05::22d]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id AFDC4754 for ; Mon, 19 Sep 2016 10:27:34 +0000 (UTC) (envelope-from yaruta.arkadiy@gmail.com) Received: by mail-yw0-x22d.google.com with SMTP id u82so134916063ywc.2 for ; Mon, 19 Sep 2016 03:27:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:from:date:message-id:subject:to :content-transfer-encoding; bh=wYHFpp2vxNXTOQG4eqgvpMjJoh/VDcV6BQQkYgI3fcM=; b=miJYWp/yeJZqS1NfttNoA9Jt3x8CEHPjiGLBtuUzig+Vr8Q5FDHexlPGmrQs65Ztf+ SJkrOs5Y5zApNLFXEFWOoiW2lhcEV5PKzxtgO3iAgY5JeSQTE4SjwtxAUJOEiHnUboML NfUbLUuSHt5A/jo4jYCkJ4YqUjNtVAUS6k0/Cz5MJo9rYNBe1h9330EPp/6HEpdsgej8 N5StCWUnCGgfVJYYQsa0IrqgddKC7fKzhpXmH1WCi9jP7ZPZk0ju61xAWo075e9kxwNy /Szs9hJ6SgMT+av74Mxe35KjOmUkVouPDy5Ynsop6mEZqc/21JuLWwBD0pxtIbMnkTgk F2bA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:from:date:message-id:subject:to :content-transfer-encoding; bh=wYHFpp2vxNXTOQG4eqgvpMjJoh/VDcV6BQQkYgI3fcM=; b=lhRcRqVgQ2+5sG5MaoworkOgOo0Wsv30X3zmS7w6cSH14AptO4/Im/hixkHzWvJ2+w Ul3xbvME9IQtYuzlr9d3KBeuPHJgX+GxhTZKItAkwrL0aWHSiaudcHoIY1tpoeoNdBhm +ggyjj0tQPrEWxcsbGm96Zj148uvgUEr8cDq8JCK3qaKZAG4O4T3gD1d2MfDyyJek5Qk 
iUok07hMQPPJgEpHQ1LGq1jJDpB9AZH8yOhULVRqaOWP66sPBKn3WYCJeFlwh25/K6Nf E2v3uinlKf3Ch+t7pnDaypx6ZfqX1o0QIIDyUmTtll8kV2jvhAR/yqZcEk37Sm0o6XsR XtuQ== X-Gm-Message-State: AE9vXwMdQx1Kn4IrjrKufobN31Qbqq6+T9fhsLLsl0FvdFuD2bfDmhJdEi4MgwCEF/n0M1jOn/5nabIAHrkVwg== X-Received: by 10.13.213.19 with SMTP id x19mr26673926ywd.226.1474280853653; Mon, 19 Sep 2016 03:27:33 -0700 (PDT) MIME-Version: 1.0 Received: by 10.13.194.1 with HTTP; Mon, 19 Sep 2016 03:27:13 -0700 (PDT) From: =?UTF-8?B?0JDRgNC60LDQtNC40Lkg0K/RgNGD0YLQsA==?= Date: Mon, 19 Sep 2016 13:27:13 +0300 Message-ID: Subject: 11.0-RC3 broken Hyper-V 2012R2 support To: freebsd-stable@freebsd.org Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Sep 2016 10:27:35 -0000 Hi, I decided to test 11.0-RC3 on Hyper-V 2012 R2, but the installation failed at the disk partitioning step. I’ve found that this might be related to https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=212721 Can anybody help with a resolution? 
From owner-freebsd-stable@freebsd.org Mon Sep 19 18:42:55 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C78A6BE1AF9 for ; Mon, 19 Sep 2016 18:42:55 +0000 (UTC) (envelope-from ubm.freebsd@googlemail.com) Received: from mail-wm0-x230.google.com (mail-wm0-x230.google.com [IPv6:2a00:1450:400c:c09::230]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4F55018BC for ; Mon, 19 Sep 2016 18:42:55 +0000 (UTC) (envelope-from ubm.freebsd@googlemail.com) Received: by mail-wm0-x230.google.com with SMTP id b130so77955977wmc.0 for ; Mon, 19 Sep 2016 11:42:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=20120113; h=from:date:to:subject:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=VNkT2nDppMXoo5AF4+eswJ2W1Lg1+FDgZt0DM4ZJI6M=; b=XRV//QtjIzaTjY4ox/tk97yOeckabQx9x9isQcbidoTi4Jsu46GOWSFSiqj7zXNVhn 5wkOaKrLNQ33Lr2Hw6I3irQCXXKQiNlgOjoGknViOlKQDZ/K5UXAV9TJXiqx51R3T0/d Vj+znmX+iu5dtNDrVUBc55HTIXmjarmQyw0gPpjfH/EU4eURTEUZbsFXzYtZ50KWl26e o6TGeSrKxD5vvg0vdrh3PyRhEvz2nj9Uk+Ma40qD+7CH1iB/hFG6GEvIjIaFV7XnY+OK PkiHRzZSvVV+GfBZqnNGQK6cXcmH/rsUQGT/rB7xJIgGcK9ZhyXq0bguPCsW6b4epcjJ 4S6A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:date:to:subject:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=VNkT2nDppMXoo5AF4+eswJ2W1Lg1+FDgZt0DM4ZJI6M=; b=VTsTJzSqO79k0+2ygyEOUn+UjJdCNBma528PH3PxHt5LeWDiMd8/YDpJwqhH/ykHbg 9AjMqamTfh324497geg1P5nCkhfjhOEt1XkolAiXNzWZkJQr2HgQgElm4da0EoYXjW3e lDZiygnahrX9UmhLvZ+ozW0Cgdg1nqCPCFkCtfRVIkK1fIudq1j46dh09z+fJkBsHvHn S1Rf8qcuRsaIY9PV3ZsIHdKjImqVTSeImm8Yxlrj5Dj+mSBjlc+QBj5PJreRwT8d8wzP 
tCbiZL7dxsD71j0BhDMCUgv0Uz0qERTf176VObsMrSZg6QH4+BqOFuq/QlXhNu4tgSru VBXw== X-Gm-Message-State: AE9vXwM7xySjmmLf9yaePRehcp+qwfYjC3AWAl4I0qAZAHSWAt1urJXNGTdPMkclTw3Sow== X-Received: by 10.194.95.36 with SMTP id dh4mr24533462wjb.156.1474310573532; Mon, 19 Sep 2016 11:42:53 -0700 (PDT) Received: from ubm.strangled.net (ipb21a85d1.dynamic.kabel-deutschland.de. [178.26.133.209]) by smtp.gmail.com with ESMTPSA id 123sm23388003wmj.5.2016.09.19.11.42.52 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 19 Sep 2016 11:42:52 -0700 (PDT) From: Marc UBM Bocklet X-Google-Original-From: Marc "UBM" Bocklet Date: Mon, 19 Sep 2016 20:42:51 +0200 To: freebsd-stable Subject: Re: zfs resilver keeps restarting Message-Id: <20160919204251.47930526f3059bc62430b83a@gmail.com> In-Reply-To: References: <20160918150917.09f9448464d84d4e50808707@gmail.com> <20160918184636.5861562661d4376e845ac75d@gmail.com> X-Mailer: Sylpheed 3.5.1 (GTK+ 2.24.29; amd64-portbld-freebsd11.0) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Sep 2016 18:42:55 -0000 On Sun, 18 Sep 2016 11:41:54 -0600 Alan Somers wrote: > On Sun, Sep 18, 2016 at 10:46 AM, Marc UBM Bocklet via freebsd-stable > wrote: > > On Sun, 18 Sep 2016 10:05:52 -0600 > > Alan Somers wrote: > > > >> On Sun, Sep 18, 2016 at 7:09 AM, Marc UBM Bocklet via freebsd-stable > >> wrote: > >> > > >> > Hi all, > >> > > >> > due to two bad cables, I had two drives drop from my striped raidz2 > >> > pool (built on top of geli encrypted drives). I replaced one of the > >> > drives before I realized that the cabling was at fault - that's the > >> > drive which is being replaced in the output of zpool status below. 
> >> > > >> > I have just installed the new cables and all sata errors are gone. > >> > However, the resilver of the pool keeps restarting. > >> > > >> > I see no errors in /var/log/messages, but zpool history -i says: > >> > > >> > 2016-09-18.14:56:21 [txg:1219501] scan setup func=2 mintxg=3 > >> > maxtxg=1219391 2016-09-18.14:56:51 [txg:1219505] scan done complete=0 > >> > 2016-09-18.14:56:51 [txg:1219505] scan setup func=2 mintxg=3 > >> > maxtxg=1219391 2016-09-18.14:57:20 [txg:1219509] scan done complete=0 > >> > 2016-09-18.14:57:20 [txg:1219509] scan setup func=2 mintxg=3 > >> > maxtxg=1219391 2016-09-18.14:57:49 [txg:1219513] scan done complete=0 > >> > 2016-09-18.14:57:49 [txg:1219513] scan setup func=2 mintxg=3 > >> > maxtxg=1219391 2016-09-18.14:58:19 [txg:1219517] scan done complete=0 > >> > 2016-09-18.14:58:19 [txg:1219517] scan setup func=2 mintxg=3 > >> > maxtxg=1219391 2016-09-18.14:58:45 [txg:1219521] scan done complete=0 > >> > 2016-09-18.14:58:45 [txg:1219521] scan setup func=2 mintxg=3 > >> > maxtxg=1219391 > >> > > >> > I assume that "scan done complete=0" means that the resilver didn't > >> > finish? > >> > > >> > pool layout is the following: > >> > > >> > pool: pool > >> > state: DEGRADED > >> > status: One or more devices is currently being resilvered. The pool > >> > will continue to function, possibly in a degraded state. > >> > action: Wait for the resilver to complete. 
> >> > scan: resilver in progress since Sun Sep 18 14:51:39 2016 > >> > 235G scanned out of 9.81T at 830M/s, 3h21m to go > >> > 13.2M resilvered, 2.34% done > >> > config: > >> > > >> > NAME STATE READ WRITE CKSUM > >> > pool DEGRADED 0 0 0 > >> > raidz2-0 ONLINE 0 0 0 > >> > da6.eli ONLINE 0 0 0 > >> > da7.eli ONLINE 0 0 0 > >> > ada1.eli ONLINE 0 0 0 > >> > ada2.eli ONLINE 0 0 0 > >> > da10.eli ONLINE 0 0 2 > >> > da11.eli ONLINE 0 0 0 > >> > da12.eli ONLINE 0 0 0 > >> > da13.eli ONLINE 0 0 0 > >> > raidz2-1 DEGRADED 0 0 0 > >> > da0.eli ONLINE 0 0 0 > >> > da1.eli ONLINE 0 0 0 > >> > da2.eli ONLINE 0 0 1 > >> > (resilvering) > >> > replacing-3 DEGRADED 0 0 1 > >> > 10699825708166646100 UNAVAIL 0 0 0 > >> > was /dev/da3.eli da4.eli ONLINE 0 0 0 > >> > (resilvering) > >> > da3.eli ONLINE 0 0 0 > >> > da5.eli ONLINE 0 0 0 > >> > da8.eli ONLINE 0 0 0 > >> > da9.eli ONLINE 0 0 0 > >> > > >> > errors: No known data errors > >> > > >> > system is > >> > FreeBSD xxx 10.1-BETA1 FreeBSD 10.1-BETA1 #27 r271633: > >> > Mon Sep 15 22:34:05 CEST 2014 > >> > root@xxx:/usr/obj/usr/src/sys/xxx amd64 > >> > > >> > controller is > >> > SAS2116 PCI-Express Fusion-MPT SAS-2 [Meteor] > >> > > >> > Drives are connected via four four-port sata cables. > >> > > >> > Should I upgrade to 10.3-release or did I make some sort of > >> > configuration error / overlook something? > >> > > >> > Thanks in advance! > >> > > >> > Cheers, > >> > Marc > >> > >> Resilver will start over anytime there's new damage. In your case, > >> with two failed drives, resilver should've begun after you replaced > >> the first drive, and restarted after you replaced the second. Have > >> you seen it restart more than that? If so, keep an eye on the error > >> counters in "zpool status"; they might give you a clue. You could > >> also raise the loglevel of devd to "info" in /etc/syslog.conf and see > >> what gets logged to /etc/devd.log. 
That will tell you if drives a > >> dropping out and automatically rejoining the pool, for example. > > > > Thanks a lot for your fast reply, unfortunately (or not), devd is silent > > and the error count for the pool remains at zero. The resilver, however, > > just keeps restarting. The furthest it got was about 68% resilvered. > > Usually, it gets to 2 - 3%, then restarts. > > > > I plan on offlining the pool, upgrading to 10.3, and then reimporting > > the pool next. Does that make sense? > > > > Cheers, > > Marc > > I suspect an upgrade won't make a difference, but it certainly won't > hurt. Did you remember to change devd's loglevel to "info" and > restart syslogd? I just upgraded to FreeBSD xxx 10.3-STABLE FreeBSD 10.3-STABLE #29 r305937: Sun Sep 18 19:48:32 CEST 2016 xxx amd64 and the resilver is apparently finishing on the first try (currently at 97% and continuing). I changed devd's loglevel and restarted syslog, but caught no messages in the log except those (malformatted due to line wrap) Sep 19 19:55:54 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=15989198335786279524'' Sep 19 19:55:54 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=15989198335786279524 Sep 19 19:55:54 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=7556163349212343309'' Sep 19 19:55:54 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=7556163349212343309 Sep 19 19:55:54 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=11941432590348604434'' Sep 19 19:55:54 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=11941432590348604434 Sep 19 19:55:54 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=417984871670910036'' Sep 19 19:55:54 hamstor ZFS: vdev 
state changed, pool_guid=17098460560433284558 vdev_guid=417984871670910036 Sep 19 19:55:54 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=9683634954111285635'' Sep 19 19:55:54 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=9683634954111285635 Sep 19 19:55:54 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=13914729548038345942'' Sep 19 19:55:54 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=13914729548038345942 Sep 19 19:55:54 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=11390813927692356269'' Sep 19 19:55:54 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=11390813927692356269 Sep 19 19:55:54 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=17789791879708299257'' Sep 19 19:55:54 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=17789791879708299257 Sep 19 19:55:54 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=3906761392810563354'' Sep 19 19:55:54 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=3906761392810563354 Sep 19 19:55:54 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=14587480937528490734'' Sep 19 19:55:54 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=14587480937528490734 Sep 19 19:55:54 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=10215581546520896687'' Sep 19 19:55:54 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=10215581546520896687 Sep 19 19:55:56 hamstor devd: Executing 'logger -p kern.notice -t 
ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=15040171490277026943'' Sep 19 19:55:56 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=15040171490277026943 Sep 19 19:55:56 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=1334619275191205373'' Sep 19 19:55:56 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=1334619275191205373 Sep 19 19:55:56 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=10400807283721245288'' Sep 19 19:55:56 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=10400807283721245288 Sep 19 19:55:56 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=12117703925014396267'' Sep 19 19:55:56 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=12117703925014396267 Sep 19 19:55:56 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=1855871756760387608'' Sep 19 19:55:56 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=1855871756760387608 Sep 19 19:55:57 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=15989198335786279524'' Sep 19 19:55:57 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=15989198335786279524 Sep 19 19:55:57 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=7556163349212343309'' Sep 19 19:55:57 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=7556163349212343309 Sep 19 19:55:57 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=11941432590348604434'' Sep 19 19:55:57 hamstor ZFS: vdev state changed, 
pool_guid=17098460560433284558 vdev_guid=11941432590348604434 Sep 19 19:55:57 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=417984871670910036'' Sep 19 19:55:57 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=417984871670910036 Sep 19 19:55:57 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=9683634954111285635'' Sep 19 19:55:57 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=9683634954111285635 Sep 19 19:55:57 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=13914729548038345942'' Sep 19 19:55:57 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=13914729548038345942 Sep 19 19:55:57 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=11390813927692356269'' Sep 19 19:55:57 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=11390813927692356269 Sep 19 19:55:57 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=17789791879708299257'' Sep 19 19:55:57 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=17789791879708299257 Sep 19 19:55:57 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=3906761392810563354'' Sep 19 19:55:57 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=3906761392810563354 Sep 19 19:55:57 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=14587480937528490734'' Sep 19 19:55:57 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=14587480937528490734 Sep 19 19:55:57 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state 
changed, pool_guid=17098460560433284558 vdev_guid=10215581546520896687'' Sep 19 19:55:57 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=10215581546520896687 Sep 19 19:55:58 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=15040171490277026943'' Sep 19 19:55:58 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=15040171490277026943 Sep 19 19:55:58 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=1334619275191205373'' Sep 19 19:55:58 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=1334619275191205373 Sep 19 19:55:58 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=10400807283721245288'' Sep 19 19:55:58 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=10400807283721245288 Sep 19 19:55:58 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=12117703925014396267'' Sep 19 19:55:58 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=12117703925014396267 Sep 19 19:55:58 hamstor devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=17098460560433284558 vdev_guid=1855871756760387608'' Sep 19 19:55:58 hamstor ZFS: vdev state changed, pool_guid=17098460560433284558 vdev_guid=1855871756760387608 These were logged while already on 10.3-stable. Thanks a lot for taking the time to help me! 
Cheers, Marc From owner-freebsd-stable@freebsd.org Mon Sep 19 19:23:26 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3B052BE0939 for ; Mon, 19 Sep 2016 19:23:26 +0000 (UTC) (envelope-from dweimer@dweimer.net) Received: from webmail.dweimer.net (24-240-198-188.static.stls.mo.charter.com [24.240.198.188]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 0CE2FC22 for ; Mon, 19 Sep 2016 19:23:25 +0000 (UTC) (envelope-from dweimer@dweimer.net) Received: from webmail.dweimer.local (localhost [10.9.5.2]) by webmail.dweimer.net (8.15.2/8.15.2) with ESMTPS id u8JJNHJd006145 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO) for ; Mon, 19 Sep 2016 14:23:17 -0500 (CDT) (envelope-from dweimer@dweimer.net) Received: (from www@localhost) by webmail.dweimer.local (8.15.2/8.15.2/Submit) id u8JJNHca006144; Mon, 19 Sep 2016 14:23:17 -0500 (CDT) (envelope-from dweimer@dweimer.net) X-Authentication-Warning: webmail.dweimer.local: www set sender to dweimer@dweimer.net using -f To: FreeBSD Stable Subject: LAGG and Jumbo Frames MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII; format=flowed Content-Transfer-Encoding: 7bit Date: Mon, 19 Sep 2016 14:23:17 -0500 From: "Dean E. 
Weimer" Organization: dweimer.net Reply-To: dweimer@dweimer.net Mail-Reply-To: dweimer@dweimer.net Message-ID: <48926c6013f938af832c17e4ad10b232@dweimer.net> X-Sender: dweimer@dweimer.net User-Agent: Roundcube Webmail/1.2.1 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Sep 2016 19:23:26 -0000

This may not be specific to 11.0-RC3, but since that's what I was running when trying to set this up, I am posting to the stable mailing list first. I was trying to set up a 3-port LACP aggregate connection and having all kinds of problems. At first I thought it was an issue with NAT reflection and my firewall, because I could ping the IPs and ssh to the system but couldn't connect to the services. Oddly enough, I could connect from devices outside of my network, and once I noticed that I could also connect from wireless devices, I realized the common thread was that the Internet pipe and access points didn't support jumbo frames. Disabling jumbo frames on the interfaces restored connectivity to the LACP aggregate connection. I guess this could be an issue with the switch as well; I don't have any other LACP-enabled devices to test this with.

My configuration:

rc.conf settings (working):
hostname="freebsd.dweimer.local"
ifconfig_igb0="up"
ifconfig_igb1="up"
ifconfig_igb2="up"
cloned_interfaces="lagg0"
ifconfig_lagg0="laggproto lacp laggport igb0 laggport igb1 laggport igb2 10.9.5.5/24"

rc.conf settings (jumbo frames, broken):
hostname="freebsd.dweimer.local"
ifconfig_igb0="up mtu 9000"
ifconfig_igb1="up mtu 9000"
ifconfig_igb2="up mtu 9000"
cloned_interfaces="lagg0"
ifconfig_lagg0="laggproto lacp laggport igb0 laggport igb1 laggport igb2 10.9.5.5/24"

Does anyone see an issue with the jumbo frames setup above, or are jumbo frames not supported correctly in an LACP aggregate configuration?
-- Thanks, Dean E. Weimer http://www.dweimer.net/ From owner-freebsd-stable@freebsd.org Mon Sep 19 20:28:48 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 37864BE1B70 for ; Mon, 19 Sep 2016 20:28:48 +0000 (UTC) (envelope-from lyndon@orthanc.ca) Received: from orthanc.ca (orthanc.ca [IPv6:2607:f2f8:abf8::2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "orthanc.ca", Issuer "Let's Encrypt Authority X1" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 09CF1385 for ; Mon, 19 Sep 2016 20:28:48 +0000 (UTC) (envelope-from lyndon@orthanc.ca) Received: from localhost (localhost [IPv6:::1]) by orthanc.ca (8.15.2/8.15.2) with ESMTPS id u8JKSjI6055107 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Mon, 19 Sep 2016 13:28:45 -0700 (PDT) (envelope-from lyndon@orthanc.ca) Date: Mon, 19 Sep 2016 13:28:45 -0700 (PDT) From: Lyndon Nerenberg To: "Dean E. Weimer" cc: FreeBSD Stable Subject: Re: LAGG and Jumbo Frames In-Reply-To: <48926c6013f938af832c17e4ad10b232@dweimer.net> Message-ID: References: <48926c6013f938af832c17e4ad10b232@dweimer.net> User-Agent: Alpine 2.20 (BSF 67 2015-01-07) Organization: The Frobozz Magic Homing Pigeon Company MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset=US-ASCII X-Spam-Status: No, score=-2.9 required=5.0 tests=ALL_TRUSTED,BAYES_00 autolearn=ham autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on orthanc.ca X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Sep 2016 20:28:48 -0000 This is almost certainly a PMTUd issue. 
Unless your end-to-end paths to everything you talk to have jumboframes configured, there's no benefit to setting them up on the lagg. Just go with the default MTU. --lyndon From owner-freebsd-stable@freebsd.org Mon Sep 19 20:32:23 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 44A49BE1D4C for ; Mon, 19 Sep 2016 20:32:23 +0000 (UTC) (envelope-from julien.charbon@gmail.com) Received: from mail-wm0-f41.google.com (mail-wm0-f41.google.com [74.125.82.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id CEC15A36 for ; Mon, 19 Sep 2016 20:32:22 +0000 (UTC) (envelope-from julien.charbon@gmail.com) Received: by mail-wm0-f41.google.com with SMTP id l132so30114855wmf.1 for ; Mon, 19 Sep 2016 13:32:22 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:subject:to:references:cc:from:message-id:date :user-agent:mime-version:in-reply-to; bh=TIfrkBxfolzizDHPx/PMBOmQOg8Sxg9+F77OLRxcUy8=; b=XSkrEq1Z4L78gF8LQNVEKiJ0TViLKPedLRvC4RDSAFP6/musYW+HupP5G212Vt9pym YS6/0/8SwqpO7SG59erzXUAhpdpBT1fUwQmkYtVdQN/Xx04nWAcS2yK93Y925R/grZh2 XMGe0HtvVtk/UeIn2/8mixDN8McoQJkH4loKiqIH8Nm5GY8FuQg/HTvz4tALm4VUTAJP 6ERhGt1GL/1MP34s35VaUPpS19xsyZHKELfx7Z+UIbiGihC5FV/l/J31WgD5R/lEXWtP /99pGDV5Z36hciY6OI4zugQ/A5jy+jFSXpumUtFTcMQ1ilLZez1uIbQaTa1eziF5xyrK l2Pg== X-Gm-Message-State: AE9vXwPaYYmPgY5kqItzpAP/uYnTVNjUzW1CAO0LIwPIp6osqRHHVGl85donsAC5BLvZpw== X-Received: by 10.28.13.66 with SMTP id 63mr15820wmn.113.1474317140206; Mon, 19 Sep 2016 13:32:20 -0700 (PDT) Received: from [192.168.0.12] (217-162-163-184.dynamic.hispeed.ch. 
[217.162.163.184]) by smtp.gmail.com with ESMTPSA id ir9sm24803965wjb.16.2016.09.19.13.32.19 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 19 Sep 2016 13:32:19 -0700 (PDT) Subject: Re: 11.0 stuck on high network load To: Slawa Olhovchenkov , hiren panchasara References: <20160904215739.GC22212@zxy.spb.ru> <20160905014612.GA42393@strugglingcoder.info> <20160914213503.GJ2840@zxy.spb.ru> <20160915085938.GN38409@kib.kiev.ua> <20160915090633.GS2840@zxy.spb.ru> <20160916181839.GC2960@zxy.spb.ru> <20160916183053.GL9397@strugglingcoder.info> <20160916190330.GG2840@zxy.spb.ru> Cc: Konstantin Belousov , freebsd-stable@FreeBSD.org From: Julien Charbon Message-ID: <78cbcdc9-f565-1046-c157-2ddd8fcccc62@freebsd.org> Date: Mon, 19 Sep 2016 22:32:13 +0200 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:45.0) Gecko/20100101 Thunderbird/45.2.0 MIME-Version: 1.0 In-Reply-To: <20160916190330.GG2840@zxy.spb.ru> Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="MikMWBCTJh2GK09JwCirqxaop1nePIMxP" X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Sep 2016 20:32:23 -0000 This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --MikMWBCTJh2GK09JwCirqxaop1nePIMxP Content-Type: multipart/mixed; boundary="mgBFvcdIF2aRPu3PPAA6UmtpRvir4s5j4"; protected-headers="v1" From: Julien Charbon To: Slawa Olhovchenkov , hiren panchasara Cc: Konstantin Belousov , freebsd-stable@FreeBSD.org Message-ID: <78cbcdc9-f565-1046-c157-2ddd8fcccc62@freebsd.org> Subject: Re: 11.0 stuck on high network load References: <20160904215739.GC22212@zxy.spb.ru> <20160905014612.GA42393@strugglingcoder.info> <20160914213503.GJ2840@zxy.spb.ru> <20160915085938.GN38409@kib.kiev.ua> <20160915090633.GS2840@zxy.spb.ru> <20160916181839.GC2960@zxy.spb.ru> 
<20160916183053.GL9397@strugglingcoder.info> <20160916190330.GG2840@zxy.spb.ru> In-Reply-To: <20160916190330.GG2840@zxy.spb.ru> --mgBFvcdIF2aRPu3PPAA6UmtpRvir4s5j4 Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable

Hi Slawa,

On 9/16/16 9:03 PM, Slawa Olhovchenkov wrote: > On Fri, Sep 16, 2016 at 11:30:53AM -0700, hiren panchasara wrote: > >> On 09/16/16 at 09:18P, Slawa Olhovchenkov wrote: >>> On Thu, Sep 15, 2016 at 12:06:33PM +0300, Slawa Olhovchenkov wrote: >>> >>>> On Thu, Sep 15, 2016 at 11:59:38AM +0300, Konstantin Belousov wrote: >>>> >>>>> On Thu, Sep 15, 2016 at 12:35:04AM +0300, Slawa Olhovchenkov wrote: >>>>>> On Sun, Sep 04, 2016 at 06:46:12PM -0700, hiren panchasara wrote: >>>>>> >>>>>>> On 09/05/16 at 12:57P, Slawa Olhovchenkov wrote: >>>>>>>> I am try using 11.0 on Dual E5-2620 (no X2APIC). >>>>>>>> Under high network load and may be addtional conditional system go to >>>>>>>> unresponsible state -- no reaction to network and console (USB IPMI >>>>>>>> emulation). INVARIANTS give to high overhad. Is this exist some way to >>>>>>>> debug this? >>>>>>> >>>>>>> Can you panic it from console to get to db> to get backtrace and other >>>>>>> info when it goes unresponsive? >>>>>> >>>>>> ipmi console don't respond (chassis power diag don't react) >>>>>> login on sol console stuck on *tcp. >>>>> >>>>> Is 'login' you reference is the ipmi client state, or you mean login(1) >>>>> on the wedged host ? >>>> >>>> on the wedged host >>>> >>>>> If BMC stops responding simultaneously with the host, I would suspect >>>>> the hardware platform issues instead of a software problem. Do you have >>>>> dedicated LAN port for BMC ? >>>> >>>> Yes. >>>> But BMC emulate USB keyboard and this is may be lock inside USB >>>> system. >>>> "ipmi console don't respond" must be read as "ipmi console runnnig and >>>> attached but system don't react to keypress on this console".
>>>> at the sime moment system respon to `enter` on ipmi sol console, but >>>> after enter `root` stuck in login in the '*tcp' state (I think this is >>>> NIS related). >>> >>> ~^B don't break to debuger. >>> But I can login to sol console. >> >> You can probably: >> debug.kdb.enter: set to enter the debugger >> >> or force a panic and get vmcore: >> debug.kdb.panic: set to panic the kernel > > I am reset this host. > PMC samples collected and decoded: > > @ CPU_CLK_UNHALTED_CORE [4653445 samples] >
> 51.86% [2413083] lock_delay @ /boot/kernel.VSTREAM/kernel
> 100.0% [2413083] __rw_wlock_hard
> 100.0% [2413083] tcp_tw_2msl_scan
> 99.99% [2412958] pfslowtimo
> 100.0% [2412958] softclock_call_cc
> 100.0% [2412958] softclock
> 100.0% [2412958] intr_event_execute_handlers
> 100.0% [2412958] ithread_loop
> 100.0% [2412958] fork_exit
> 00.01% [125] tcp_twstart
> 100.0% [125] tcp_do_segment
> 100.0% [125] tcp_input
> 100.0% [125] ip_input
> 100.0% [125] swi_net
> 100.0% [125] intr_event_execute_handlers
> 100.0% [125] ithread_loop
> 100.0% [125] fork_exit

The only write lock tcp_tw_2msl_scan() tries to get is an INP_WLOCK(inp). Thus here, tcp_tw_2msl_scan() seems to be stuck spinning on INP_WLOCK (or pfslowtimo() is going crazy and calls tcp_tw_2msl_scan() at a high rate, but this would be quite unexpected).

Thus my hypothesis is that something is holding the INP_WLOCK and not releasing it, and tcp_tw_2msl_scan() is spinning on it.

If you can, could you compile the kernel with the options below:

options DDB # Support DDB.
options DEADLKRES # Enable the deadlock resolver options INVARIANTS # Enable calls of extra sanity checking options INVARIANT_SUPPORT # Extra sanity checks of internal structures, required by INVARIANTS options WITNESS # Enable checks to detect deadlocks and cycles options WITNESS_SKIPSPIN # Don't run witness on spinlocks for speed And once the issue is reproduce, run in ddb run the below commands: show pcpu show allpcpu show locks show alllocks show lockchain show allchains show all trace This is to see if the contention is indeed on the tcp_tw_2msl_scan's INP_WLOCK. -- Julien --mgBFvcdIF2aRPu3PPAA6UmtpRvir4s5j4-- --MikMWBCTJh2GK09JwCirqxaop1nePIMxP Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Comment: GPGTools - https://gpgtools.org iQEcBAEBCgAGBQJX4EtSAAoJEKVlQ5Je6dhx7mkIAJo089e6oC5ZUq4bLa9dOiGP NGf5Qq9DzbrrGQXI9AXCtcYUio8dSnBaK3Gkv5POxAW+7ppq9kaiznydugIZMfMC L0A1cLmiFesvZ1BxGFxeE/P+sn1iTIgiKXFbMW4MmesmKEVwTKJ/7BgIQKX7TYUA ZbuNLuL4yWS9HPQitb3I23pcfJ6tBRHCKORB9An6OJA76ETizodyp0mUPp9UxEXP xKGXdmJPKpnts8eUuMOCGdBTqOghzP9gwEnq8J+0cGjcE45zErxnI5FvMx/VeGnr xiwBb6E3/CvKURcvsDc2SDlZ6lokNM21drLzIBOX8WN++F+JG0ayKwHcxtdLebA= =FcSq -----END PGP SIGNATURE----- --MikMWBCTJh2GK09JwCirqxaop1nePIMxP-- From owner-freebsd-stable@freebsd.org Mon Sep 19 20:43:37 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B6F67BE1210 for ; Mon, 19 Sep 2016 20:43:37 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 7959A350; Mon, 19 Sep 2016 20:43:37 +0000 (UTC) (envelope-from slw@zxy.spb.ru) 
Received: from slw by zxy.spb.ru with local (Exim 4.86 (FreeBSD)) (envelope-from ) id 1bm5PU-0001Ki-4G; Mon, 19 Sep 2016 23:43:28 +0300 Date: Mon, 19 Sep 2016 23:43:28 +0300 From: Slawa Olhovchenkov To: Julien Charbon Cc: hiren panchasara , Konstantin Belousov , freebsd-stable@FreeBSD.org Subject: Re: 11.0 stuck on high network load Message-ID: <20160919204328.GN2840@zxy.spb.ru> References: <20160904215739.GC22212@zxy.spb.ru> <20160905014612.GA42393@strugglingcoder.info> <20160914213503.GJ2840@zxy.spb.ru> <20160915085938.GN38409@kib.kiev.ua> <20160915090633.GS2840@zxy.spb.ru> <20160916181839.GC2960@zxy.spb.ru> <20160916183053.GL9397@strugglingcoder.info> <20160916190330.GG2840@zxy.spb.ru> <78cbcdc9-f565-1046-c157-2ddd8fcccc62@freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <78cbcdc9-f565-1046-c157-2ddd8fcccc62@freebsd.org> User-Agent: Mutt/1.5.24 (2015-08-30) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Sep 2016 20:43:37 -0000 On Mon, Sep 19, 2016 at 10:32:13PM +0200, Julien Charbon wrote: > > > @ CPU_CLK_UNHALTED_CORE [4653445 samples] > > > > 51.86% [2413083] lock_delay @ /boot/kernel.VSTREAM/kernel > > 100.0% [2413083] __rw_wlock_hard > > 100.0% [2413083] tcp_tw_2msl_scan > > 99.99% [2412958] pfslowtimo > > 100.0% [2412958] softclock_call_cc > > 100.0% [2412958] softclock > > 100.0% [2412958] intr_event_execute_handlers > > 100.0% [2412958] ithread_loop > > 100.0% [2412958] fork_exit > > 00.01% [125] tcp_twstart > > 100.0% [125] tcp_do_segment > > 100.0% [125] tcp_input > > 100.0% [125] ip_input > > 100.0% [125] swi_net > > 100.0% [125] intr_event_execute_handlers > > 100.0% [125] 
ithread_loop > > 100.0% [125] fork_exit > > The only write lock tcp_tw_2msl_scan() tries to get is a > INP_WLOCK(inp). Thus here, tcp_tw_2msl_scan() seems to be stuck > spinning on INP_WLOCK (or pfslowtimo() is going crazy and calls > tcp_tw_2msl_scan() at high rate but this will be quite unexpected). > > Thus my hypothesis is that something is holding the INP_WLOCK and not > releasing it, and tcp_tw_2msl_scan() is spinning on it. > > If you can, could you compile the kernel with below options: > > options DDB # Support DDB. > options DEADLKRES # Enable the deadlock resolver > options INVARIANTS # Enable calls of extra sanity > checking > options INVARIANT_SUPPORT # Extra sanity checks of internal > structures, required by INVARIANTS > options WITNESS # Enable checks to detect > deadlocks and cycles > options WITNESS_SKIPSPIN # Don't run witness on spinlocks > for speed

Currently this host runs at 100% CPU load (on all cores), so enabling WITNESS will cause a significant drop in performance. Can I use only a subset of these options?

Also, I may have trouble entering DDB in this case. Maybe kgdb will succeed (not tried yet)?

> And once the issue is reproduce, run in ddb run the below commands: > > show pcpu > show allpcpu > show locks > show alllocks > show lockchain > show allchains > show all trace > > This is to see if the contention is indeed on the tcp_tw_2msl_scan's > INP_WLOCK.
From owner-freebsd-stable@freebsd.org Mon Sep 19 20:59:54 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 92AF5BE169F for ; Mon, 19 Sep 2016 20:59:54 +0000 (UTC) (envelope-from dweimer@dweimer.net) Received: from webmail.dweimer.net (24-240-198-188.static.stls.mo.charter.com [24.240.198.188]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 63A47D27 for ; Mon, 19 Sep 2016 20:59:53 +0000 (UTC) (envelope-from dweimer@dweimer.net) Received: from webmail.dweimer.local (localhost [10.9.5.2]) by webmail.dweimer.net (8.15.2/8.15.2) with ESMTPS id u8JKxqWe009143 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Mon, 19 Sep 2016 15:59:52 -0500 (CDT) (envelope-from dweimer@dweimer.net) Received: (from www@localhost) by webmail.dweimer.local (8.15.2/8.15.2/Submit) id u8JKxq6X009142; Mon, 19 Sep 2016 15:59:52 -0500 (CDT) (envelope-from dweimer@dweimer.net) X-Authentication-Warning: webmail.dweimer.local: www set sender to dweimer@dweimer.net using -f To: Lyndon Nerenberg Subject: Re: LAGG and Jumbo Frames MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII; format=flowed Content-Transfer-Encoding: 7bit Date: Mon, 19 Sep 2016 15:59:52 -0500 From: "Dean E. 
Weimer" Cc: FreeBSD Stable Organization: dweimer.net Reply-To: dweimer@dweimer.net Mail-Reply-To: dweimer@dweimer.net In-Reply-To: References: <48926c6013f938af832c17e4ad10b232@dweimer.net> Message-ID: <04c9065ee4a780c6f8986d1b204c4198@dweimer.net> X-Sender: dweimer@dweimer.net User-Agent: Roundcube Webmail/1.2.1 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Sep 2016 20:59:54 -0000

On 2016-09-19 3:28 pm, Lyndon Nerenberg wrote: > This is almost certainly a PMTUd issue. > > Unless your end-to-end paths to everything you talk to have > jumboframes configured, there's no benefit to setting them up on the > lagg. Just go with the default MTU. > > --lyndon

Everything on physical Ethernet supports it, including the LAN interface of the firewall, and talks to it just fine over a single interface with jumbo frames enabled. Only when I introduced the LAGG interface did other devices with jumbo frames enabled stop talking.

I was trying to speed up my backups (Bacula runs in one of the jails; NAT reflection isn't used for the Bacula services), which take about 7.5 hours over a single interface to complete on the weekly fulls. I have two simultaneous jobs running at the start, and I was hoping that the LAGG would speed them up, but I suspect that the LAGG without jumbo frames would be slower than the single interface with them. It's also possible it won't have an impact either way and the disk write is the bottleneck. The 930G written during the backup is the only load I have that pushes the network anywhere close to heavy use.

FYI, I do have net.inet.tcp.pmtud_blackhole_detection enabled on the server.

-- Thanks, Dean E. 
Weimer http://www.dweimer.net/ From owner-freebsd-stable@freebsd.org Mon Sep 19 21:28:59 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id CE48CBE1E76 for ; Mon, 19 Sep 2016 21:28:59 +0000 (UTC) (envelope-from lyndon@orthanc.ca) Received: from orthanc.ca (orthanc.ca [IPv6:2607:f2f8:abf8::2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "orthanc.ca", Issuer "Let's Encrypt Authority X1" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id B982610BB for ; Mon, 19 Sep 2016 21:28:59 +0000 (UTC) (envelope-from lyndon@orthanc.ca) Received: from localhost (localhost [IPv6:::1]) by orthanc.ca (8.15.2/8.15.2) with ESMTPS id u8JLSuOU055613 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Mon, 19 Sep 2016 14:28:56 -0700 (PDT) (envelope-from lyndon@orthanc.ca) Date: Mon, 19 Sep 2016 14:28:56 -0700 (PDT) From: Lyndon Nerenberg To: "Dean E. 
Weimer" cc: FreeBSD Stable Subject: Re: LAGG and Jumbo Frames In-Reply-To: <04c9065ee4a780c6f8986d1b204c4198@dweimer.net> Message-ID: References: <48926c6013f938af832c17e4ad10b232@dweimer.net> <04c9065ee4a780c6f8986d1b204c4198@dweimer.net> User-Agent: Alpine 2.20 (BSF 67 2015-01-07) Organization: The Frobozz Magic Homing Pigeon Company MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII; format=flowed X-Spam-Status: No, score=-1.5 required=5.0 tests=ALL_TRUSTED,BAYES_00, MISSING_DATE autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on orthanc.ca X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Sep 2016 21:28:59 -0000 > Everything on physical Ethernet has support for it Including the LAN > interface of Firewall, and talks to it just fine over a single interface with > Jumbo frames enabled. Well, before you get too carried away, try this: 1) Run a ttcp test between a pair of local hosts using the exiting jumboframes (pick two that you expect high volume traffic between). 2) Run the same test, but with the default MTU. If you don't see a very visible difference in throughput (e.g. >15%), it's not worth the hassle. Just as a datapoint, we're running 10-gigE off some low-end Supermicro boxes with 10.3-RELEASE. Using the default MTU we're getting > 750 MB/s TCP throughput. I can't believe that you won't be able to fully saturate a 1 Gb/s link running the default MTU on anything with more oomph than a dual-core 32-bit Atom. IOW, don't micro-optimize. Life's too short ... 
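As a rough sanity check on whether the link is the limit at all, the backup figures quoted earlier in this thread (about 930G written over about 7.5 hours on a single interface) can be turned into an average rate. This is only a sketch under stated assumptions: "930G" is taken as decimal gigabytes, the single interface is taken to be 1 Gb/s, and an average says nothing about short peaks, such as the start of the run when two jobs are active at once.

```python
# Average network rate implied by the backup figures quoted in this thread.
# Assumptions (not stated in the thread): "930G" means 930 * 10**9 bytes,
# and the single interface is a 1 Gb/s port.

BACKUP_BYTES = 930 * 10**9     # assumed decimal gigabytes
BACKUP_HOURS = 7.5             # duration of the weekly full backup
LINK_BITS_PER_SEC = 10**9      # assumed 1 Gb/s link


def average_rate_bits_per_sec(total_bytes: int, hours: float) -> float:
    """Return the mean transfer rate in bits per second."""
    return total_bytes * 8 / (hours * 3600)


rate = average_rate_bits_per_sec(BACKUP_BYTES, BACKUP_HOURS)
utilisation = rate / LINK_BITS_PER_SEC

print(f"average rate: {rate / 8 / 10**6:.1f} MB/s ({rate / 10**6:.0f} Mb/s)")
print(f"share of a 1 Gb/s link: {utilisation:.0%}")
```

If the average comes out well below line rate, that points away from the network (MTU or aggregate width) and toward something else, such as disk writes, as the place to measure next.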
--lyndon From owner-freebsd-stable@freebsd.org Mon Sep 19 22:08:15 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C4448BE097A for ; Mon, 19 Sep 2016 22:08:15 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 87DDC942 for ; Mon, 19 Sep 2016 22:08:15 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from slw by zxy.spb.ru with local (Exim 4.86 (FreeBSD)) (envelope-from ) id 1bm6jU-0003px-Bw; Tue, 20 Sep 2016 01:08:12 +0300 Date: Tue, 20 Sep 2016 01:08:12 +0300 From: Slawa Olhovchenkov To: Lyndon Nerenberg Cc: "Dean E. Weimer" , FreeBSD Stable Subject: Re: LAGG and Jumbo Frames Message-ID: <20160919220812.GG2960@zxy.spb.ru> References: <48926c6013f938af832c17e4ad10b232@dweimer.net> <04c9065ee4a780c6f8986d1b204c4198@dweimer.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.24 (2015-08-30) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Sep 2016 22:08:15 -0000 On Mon, Sep 19, 2016 at 02:28:56PM -0700, Lyndon Nerenberg wrote: > > Everything on physical Ethernet has support for it Including the LAN > > interface of Firewall, and talks to it just fine over a single interface with > > Jumbo frames enabled. 
> > Well, before you get too carried away, try this: > > 1) Run a ttcp test between a pair of local hosts using the exiting > jumboframes (pick two that you expect high volume traffic between). > > 2) Run the same test, but with the default MTU. > > If you don't see a very visible difference in throughput (e.g. >15%), it's > not worth the hassle. > > Just as a datapoint, we're running 10-gigE off some low-end Supermicro > boxes with 10.3-RELEASE. Using the default MTU we're getting > 750 MB/s > TCP throughput. I can't believe that you won't be able to fully saturate > a 1 Gb/s link running the default MTU on anything with more oomph than a > dual-core 32-bit Atom. > > IOW, don't micro-optimize. Life's too short ...

You may be surprised, but jumbo frames can degrade performance for hosts that are not directly connected, i.e. with multiple switches between the hosts: [hostA]=[SW1]=[SW2]=[SW3]=[hostB] This is because the RTT of this link is higher for jumbo frames than for 1500-byte frames across a store-and-forward switch chain.

From owner-freebsd-stable@freebsd.org Mon Sep 19 22:59:29 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5B27FBE1980 for ; Mon, 19 Sep 2016 22:59:29 +0000 (UTC) (envelope-from lyndon@orthanc.ca) Received: from orthanc.ca (orthanc.ca [IPv6:2607:f2f8:abf8::2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "orthanc.ca", Issuer "Let's Encrypt Authority X1" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 39F2DFA for ; Mon, 19 Sep 2016 22:59:29 +0000 (UTC) (envelope-from lyndon@orthanc.ca) Received: from [192.168.43.199] ([24.114.44.91]) (authenticated bits=0) by orthanc.ca (8.15.2/8.15.2) with ESMTPSA id u8JMxRdA056305 (version=TLSv1 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 19 Sep 2016 15:59:28 -0700 (PDT) (envelope-from lyndon@orthanc.ca) Subject: Re: LAGG and Jumbo Frames 
Mime-Version: 1.0 (Mac OS X Mail 9.3 \(3124\)) Content-Type: multipart/signed; boundary="Apple-Mail=_DE3568EF-F524-45B3-92F1-C07102A1522F"; protocol="application/pgp-signature"; micalg=pgp-sha256 X-Pgp-Agent: GPGMail From: Lyndon Nerenberg In-Reply-To: <20160919220812.GG2960@zxy.spb.ru> Date: Mon, 19 Sep 2016 15:59:20 -0700 Cc: FreeBSD Stable Message-Id: <42A03EA9-7F8E-446E-B430-7431AB9CE2E6@orthanc.ca> References: <48926c6013f938af832c17e4ad10b232@dweimer.net> <04c9065ee4a780c6f8986d1b204c4198@dweimer.net> <20160919220812.GG2960@zxy.spb.ru> To: Slawa Olhovchenkov X-Mailer: Apple Mail (2.3124) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Sep 2016 22:59:29 -0000 --Apple-Mail=_DE3568EF-F524-45B3-92F1-C07102A1522F Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii

> On Sep 19, 2016, at 3:08 PM, Slawa Olhovchenkov wrote: > > This is because RTT of this link for jumbo frames higher 1500 bytes > frame for store-and-forward switch chain.

For TCP, RTT isn't really a factor (in this scenario), as the windowing and congestion avoidance algorithms will adapt to the actual bandwidth-delay product of the link, and the delays in each direction will be symmetrical.

Now the ack for a single 9000 octet packet will take longer than that for a 1500 octet one, but that's because you're sending six times as many octets before the ACK can be generated. The time to send six 1500 octet packets and receive the ACK from the sixth packet is going to be comparable to that of receiving the ack from a single 9000 octet packet. It's simple arithmetic to calculate the extra protocol header overhead for 6x1500 vs 1x9000.
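That arithmetic can be sketched as follows. The numbers are an illustration under assumptions that are not stated in the thread: TCP over IPv4, no IP or TCP options (timestamps, for example, would add 12 bytes per segment), no VLAN tag, and the on-wire cost counted with preamble and inter-frame gap included.

```python
# Per-frame overhead for a TCP/IPv4 stream over Ethernet, comparing
# six 1500-byte-MTU frames with one 9000-byte-MTU frame.
# Assumptions: no VLAN tag, no IP options, no TCP options.

ETH_HEADER = 14      # destination MAC, source MAC, EtherType
ETH_FCS = 4          # frame check sequence
PREAMBLE_SFD = 8     # preamble plus start-of-frame delimiter
INTERFRAME_GAP = 12  # minimum gap between frames, in byte times
IP_HEADER = 20       # IPv4 header without options
TCP_HEADER = 20      # TCP header without options


def wire_bytes(mtu: int) -> int:
    """Bytes of link time consumed by one full-sized frame."""
    return mtu + ETH_HEADER + ETH_FCS + PREAMBLE_SFD + INTERFRAME_GAP


def payload_bytes(mtu: int) -> int:
    """TCP payload carried by one full-sized frame."""
    return mtu - IP_HEADER - TCP_HEADER


def efficiency(mtu: int) -> float:
    """Fraction of link time that carries TCP payload."""
    return payload_bytes(mtu) / wire_bytes(mtu)


for count, mtu in ((6, 1500), (1, 9000)):
    print(
        f"{count} x {mtu}: payload {count * payload_bytes(mtu)} bytes, "
        f"on the wire {count * wire_bytes(mtu)} bytes, "
        f"efficiency {efficiency(mtu):.1%}"
    )
```

Under these assumptions the standard MTU already delivers about 95% of the link rate as payload and jumbo frames about 99%, so the header overhead alone accounts for a difference of only a few percent.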
If there *is* a significant difference (beyond the extra protocol header overhead), it's time to take a very close look at the NICs you are using in the end hosts. A statistically significant difference would hint at poor interrupt handling performance on the part of one or more of the NICs and their associated device drivers.

The intermediate switch overhead will be a constant (unless the switch backplane becomes saturated from unrelated traffic).

--lyndon

--Apple-Mail=_DE3568EF-F524-45B3-92F1-C07102A1522F
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename=signature.asc
Content-Type: application/pgp-signature; name=signature.asc
Content-Description: Message signed with OpenPGP using GPGMail

-----BEGIN PGP SIGNATURE-----

iQIcBAEBCAAGBQJX4G3IAAoJEJCSmizucT9VZnQQALMucy5vTuLJZUitVi90UfdU
OTQO/fFJmT7uuG7wK1nsrFf051bsHbCIEQ4rBP8kTa2PfnG0mpm2UYT6TV352x0Y
R8yE3GJ19EFrc31t1NRqGdt2SNcndTAMGEJVOzXbPYNOrocB/adOVYxJ7HeVFIwK
BDRezxdrBA40pqGD0cbE/d61A664C6CVQvJPYS9VgDKAeO3G6aScuidkI0sCdcFS
qfJ3xji+4UftOfkIcmY6z2H+jhBckOu2kYeBMbN+S5eTjOc7xoWww5RmlIcVshPg
5e8kgm/M3X023p1gu2LRtQTok1JO3hEWb8mXbkj3zaP1UDDIhVe+MvvLjjnVFGks
ZMCtDbFSt/fItkXKFxFTqc7HlPiv5Lkr7l+5lwkvZExtL3IYYXeviDuNV+VB45AR
ln/whGc2b/CEgfon247LFpEEWS8a5uq8EW9GeWVcuIC07jloK/Bbn8J0xj72RUYP
VpRuqUrtgx7BVAk+6H8sw3QavCYVvYg49nogS6gTP/bvstVzMF9C3E+om1R0/2Yq
iH/QLUGj27m3uPaMptSvXoINt6rYFMBsBOZV8dtUymIg7vP/lCoBGy7f7VW/Uv11
4gebadhm2VnHEdq0qD73CZsI2KHBWmaHWEeVeJUIk+0QmX1F5P7HRw4pD1yd+bpa
okAgmLRI4qGnoCZmnMoA
=J1Ph
-----END PGP SIGNATURE-----

--Apple-Mail=_DE3568EF-F524-45B3-92F1-C07102A1522F--

From owner-freebsd-stable@freebsd.org Tue Sep 20 01:05:54 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id BD2E6BE0B83 for ; Tue, 20 Sep 2016 01:05:54 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from mail.baldwin.cx (bigwig.baldwin.cx
[IPv6:2001:470:1f11:75::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 7F2D7CF7 for ; Tue, 20 Sep 2016 01:05:54 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from ralph.baldwin.cx (c-73-231-226-104.hsd1.ca.comcast.net [73.231.226.104]) by mail.baldwin.cx (Postfix) with ESMTPSA id 0BB1F10AF8C; Mon, 19 Sep 2016 21:05:53 -0400 (EDT) From: John Baldwin To: Slawa Olhovchenkov Cc: freebsd-stable@freebsd.org Subject: Re: nginx and FreeBSD11 Date: Mon, 19 Sep 2016 18:05:46 -0700 Message-ID: <2122051.7RxZBKUSFc@ralph.baldwin.cx> User-Agent: KMail/4.14.10 (FreeBSD/11.0-PRERELEASE; KDE/4.14.10; amd64; ; ) In-Reply-To: <20160918162241.GE2960@zxy.spb.ru> References: <20160907191348.GD22212@zxy.spb.ru> <1823460.vTm8IvUQsF@ralph.baldwin.cx> <20160918162241.GE2960@zxy.spb.ru> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.4.3 (mail.baldwin.cx); Mon, 19 Sep 2016 21:05:53 -0400 (EDT) X-Virus-Scanned: clamav-milter 0.99.2 at mail.baldwin.cx X-Virus-Status: Clean X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Sep 2016 01:05:54 -0000 On Sunday, September 18, 2016 07:22:41 PM Slawa Olhovchenkov wrote: > On Thu, Sep 15, 2016 at 10:28:11AM -0700, John Baldwin wrote: > > > On Thursday, September 15, 2016 05:41:03 PM Slawa Olhovchenkov wrote: > > > On Wed, Sep 07, 2016 at 10:13:48PM +0300, Slawa Olhovchenkov wrote: > > > > > > > I am have strange issuse with nginx on FreeBSD11. > > > > I am have FreeBSD11 instaled over STABLE-10. > > > > nginx build for FreeBSD10 and run w/o recompile work fine. 
> > > > nginx build for FreeBSD11 crushed inside rbtree lookups: next node > > > > totaly craped. > > > > > > > > I am see next potential cause: > > > > > > > > 1) clang 3.8 code generation issuse > > > > 2) system library issuse > > > > > > > > may be i am miss something? > > > > > > > > How to find real cause? > > > > > > I find real cause and this like show-stopper for RELEASE. > > > I am use nginx with AIO and AIO from one nginx process corrupt memory > > > from other nginx process. Yes, this is cross-process memory > > > corruption. > > > > > > Last case, core dumped proccess with pid 1060 at 15:45:14. > > > Corruped memory at 0x860697000. > > > I am know about good memory at 0x86067f800. > > > Dumping (form core) this region to file and analyze by hexdump I am > > > found start of corrupt region -- offset 0000c8c0 from 0x86067f800. > > > 0x86067f800+0xc8c0 = 0x86068c0c0 > > > > > > I am preliminary enabled debuggin of AIO started operation to nginx > > > error log (memory address, file name, offset and size of transfer). > > > > > > grep -i 86068c0c0 error.log near 15:45:14 give target file. > > > grep ce949665cbcd.hls error.log near 15:45:14 give next result: > > > > > > 2016/09/15 15:45:13 [notice] 1055#0: *11659936 AIO_RD 000000082065DB60 start 000000086068C0C0 561b0 2646736 ce949665cbcd.hls > > > 2016/09/15 15:45:14 [notice] 1060#0: *10998125 AIO_RD 000000081F1FFB60 start 000000086FF2C0C0 6cdf0 140016832 ce949665cbcd.hls > > > 2016/09/15 15:45:14 [notice] 1055#0: *11659936 AIO_RD 00000008216B6B60 start 000000086472B7C0 7ff70 2999424 ce949665cbcd.hls > > > > Does nginx only use AIO for regular files or does it also use it with sockets? 
> > > > You can try using this patch as a diagnostic (you will need to > > run with INVARIANTS enabled, or at least enabled for vfs_aio.c): > > > > Index: vfs_aio.c > > =================================================================== > > --- vfs_aio.c (revision 305811) > > +++ vfs_aio.c (working copy) > > @@ -787,6 +787,8 @@ aio_process_rw(struct kaiocb *job) > > * aio_aqueue() acquires a reference to the file that is > > * released in aio_free_entry(). > > */ > > + KASSERT(curproc->p_vmspace == job->userproc->p_vmspace, > > + ("%s: vmspace mismatch", __func__)); > > if (cb->aio_lio_opcode == LIO_READ) { > > auio.uio_rw = UIO_READ; > > if (auio.uio_resid == 0) > > @@ -1054,6 +1056,8 @@ aio_switch_vmspace(struct kaiocb *job) > > { > > > > vmspace_switch_aio(job->userproc->p_vmspace); > > + KASSERT(curproc->p_vmspace == job->userproc->p_vmspace, > > + ("%s: vmspace mismatch", __func__)); > > } > > > > If this panics, then vmspace_switch_aio() is not working for > > some reason. > > I am try using next DTrace script: > ==== > #pragma D option dynvarsize=64m > > int req[struct vmspace *, void *]; > self int trace; > > syscall:freebsd:aio_read:entry > { > this->aio = *(struct aiocb *)copyin(arg0, sizeof(struct aiocb)); > req[curthread->td_proc->p_vmspace, this->aio.aio_buf] = curthread->td_proc->p_pid; > } > > fbt:kernel:aio_process_rw:entry > { > self->job = args[0]; > self->trace = 1; > } > > fbt:kernel:aio_process_rw:return > /self->trace/ > { > req[self->job->userproc->p_vmspace, self->job->uaiocb.aio_buf] = 0; > self->job = 0; > self->trace = 0; > } > > fbt:kernel:vn_io_fault:entry > /self->trace && !req[curthread->td_proc->p_vmspace, args[1]->uio_iov[0].iov_base]/ > { > this->buf = args[1]->uio_iov[0].iov_base; > printf("%Y vn_io_fault %p:%p pid %d\n", walltimestamp, curthread->td_proc->p_vmspace, this->buf, req[curthread->td_proc->p_vmspace, this->buf]); > } > === > > And don't got any messages near nginx core dump. > What I can check next? 
> May be check context/address space switch for kernel process? Which CPU are you using? Perhaps try disabling PCID support (I think vm.pmap.pcid_enabled=0 from loader prompt or loader.conf)? (Wondering if pmap_activate() is somehow not switching) -- John Baldwin From owner-freebsd-stable@freebsd.org Tue Sep 20 03:22:49 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id DFAF1BE1CB8 for ; Tue, 20 Sep 2016 03:22:49 +0000 (UTC) (envelope-from dioxinu@gmail.com) Received: from mail-wm0-x231.google.com (mail-wm0-x231.google.com [IPv6:2a00:1450:400c:c09::231]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id A1CA7185B; Tue, 20 Sep 2016 03:22:49 +0000 (UTC) (envelope-from dioxinu@gmail.com) Received: by mail-wm0-x231.google.com with SMTP id l132so183889059wmf.0; Mon, 19 Sep 2016 20:22:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=LALIxVCvCSou7GPX76aTfoJ1WPXwR/oRe6jFzKWjSAQ=; b=g3+i8WxlrjaYADanXEhToUt6rT4cQrFr4yD2hz5uwfaoKFsZveO/1fvBVcgxE7CxMP 9yxMROtAf8UeH9PgvLeQduggbZZBmlhQb3YaKyabvNM6TSO8QlvNCVGoW1TluaQ18UXL nTIJjJ7uMPtgpJQBH0mG9glbFHEiu5Xblkw96bhgE4sJhJDGMgEYU2ZY6NIXO/gkd/u9 pyvygQDDXIKbkIivnRUd5HmZgDCuWIHqcsyj05WKj1mhc2pifUGhpZfprTETVjLQbrdc iC8K9Mvb34FBdUWcS0f6VW6mVLyP9AcfDOxCalpa9FqghNKLLMlk+mcNh223/k85zeV2 fhjQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=LALIxVCvCSou7GPX76aTfoJ1WPXwR/oRe6jFzKWjSAQ=; b=hNlTOMFwMCq0/eLIHIQnqHBFjNdEDc80kGU71Bx2jK2aNlx502zmO8KV3hd1Jm6PE8 
HVJPrg4enJx1X3XT0FUmGgHhT8SrhperG3Yefv7RIMrzZSiV22r89DsOUQ7jQMdCoUW6 UbnCm/U+5Qk9p7eYHig4rbxWW6CbqCJePAwI1XYa78ZH4U4KN4toUXbhSDJR5IRDK2Io GRNb5g7v6sib9dMBageq7G6mA6zagrfJDCITkStNuCyjEzcyyODoNUbnUb3N0vvUCSWe g/Mi0RHTvWXTtJuJKS+9+EBoKA188WiWcJt76ObjHkN5tPvUzub/IzqGMe52wtVTJ//i XwKA== X-Gm-Message-State: AE9vXwMCyDh1z3bXEzlJYUZsMTuSO7CrGD3iUpfWC4bTPh0K4/ue9B8SfyHAuc7fK9DleNgmMVC1Ws0niGREvg== X-Received: by 10.194.84.134 with SMTP id z6mr18261321wjy.204.1474341767962; Mon, 19 Sep 2016 20:22:47 -0700 (PDT) MIME-Version: 1.0 Received: by 10.80.171.165 with HTTP; Mon, 19 Sep 2016 20:22:47 -0700 (PDT) In-Reply-To: References: From: "Alex T." Date: Mon, 19 Sep 2016 20:22:47 -0700 Message-ID: Subject: Re: buildkernel fails with a 'invalid conversion specifier' compiler error To: Dimitry Andric Cc: freebsd-stable@freebsd.org Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.23 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Sep 2016 03:22:50 -0000 Thank you. The workaround helped, but kind of curious now what caused this failures to start show up. Will try going through commits to see if I can spot anything. On 18 September 2016 at 12:07, Dimitry Andric wrote: > On 18 Sep 2016, at 20:37, Alex T. wrote: > > > > I'm on stable/10 branch and have been using it to rebuild world > > and kernel. This is the revision I'm currently trying to build but > > started seeing the following issue way before it. > > > > URL: svn://svn.freebsd.org/base/stable/10 > > Revision: 305760 > > > > The world builds fine, but building the kernel fails with this error: > > > > /usr/src/sys/cam/cam_xpt.c:1060:27: error: > > invalid conversion specifier 'b' > > [-Werror,-Wformat-invalid-specifier] > > ...printf("%s%d: quirks=0x%b\n", perip... 
> > ~^ > > /usr/src/sys/cam/cam_xpt.c:1061:36: error: > > data argument not used by format > > string [-Werror,-Wformat-extra-args] > > ...periph->unit_number, quirks, bit_st... > > > > This is how my /etc/make.conf looks like: > > WITH_PKGNG=yes > > SSP_CFLAGS=-fstack-protector-all > > WITH_SSP_PORTS=yes > > WITHOUT="DOCS" > > > > and I don't have /etc/src.conf. Has anyone seen this issue? > > > > Any idea what might me misconfigured missing here? > > It's hard to say what is different on your system, but it looks like the > -fformat-extensions flag is somehow not being used for building your > kernel. If you can't figure out what causes this, you can try to work > around it by setting WITHOUT_FORMAT_EXTENSIONS, or setting WERROR to > empty. > > -Dimitry > > From owner-freebsd-stable@freebsd.org Tue Sep 20 06:52:48 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 44494BE222F for ; Tue, 20 Sep 2016 06:52:48 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 0A35933E; Tue, 20 Sep 2016 06:52:48 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from slw by zxy.spb.ru with local (Exim 4.86 (FreeBSD)) (envelope-from ) id 1bmEv6-000HTE-RS; Tue, 20 Sep 2016 09:52:44 +0300 Date: Tue, 20 Sep 2016 09:52:44 +0300 From: Slawa Olhovchenkov To: John Baldwin Cc: freebsd-stable@freebsd.org Subject: Re: nginx and FreeBSD11 Message-ID: <20160920065244.GO2840@zxy.spb.ru> References: <20160907191348.GD22212@zxy.spb.ru> <1823460.vTm8IvUQsF@ralph.baldwin.cx> <20160918162241.GE2960@zxy.spb.ru> <2122051.7RxZBKUSFc@ralph.baldwin.cx> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: 
<2122051.7RxZBKUSFc@ralph.baldwin.cx> User-Agent: Mutt/1.5.24 (2015-08-30) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Sep 2016 06:52:48 -0000 On Mon, Sep 19, 2016 at 06:05:46PM -0700, John Baldwin wrote: > > > If this panics, then vmspace_switch_aio() is not working for > > > some reason. > > > > I am try using next DTrace script: > > ==== > > #pragma D option dynvarsize=64m > > > > int req[struct vmspace *, void *]; > > self int trace; > > > > syscall:freebsd:aio_read:entry > > { > > this->aio = *(struct aiocb *)copyin(arg0, sizeof(struct aiocb)); > > req[curthread->td_proc->p_vmspace, this->aio.aio_buf] = curthread->td_proc->p_pid; > > } > > > > fbt:kernel:aio_process_rw:entry > > { > > self->job = args[0]; > > self->trace = 1; > > } > > > > fbt:kernel:aio_process_rw:return > > /self->trace/ > > { > > req[self->job->userproc->p_vmspace, self->job->uaiocb.aio_buf] = 0; > > self->job = 0; > > self->trace = 0; > > } > > > > fbt:kernel:vn_io_fault:entry > > /self->trace && !req[curthread->td_proc->p_vmspace, args[1]->uio_iov[0].iov_base]/ > > { > > this->buf = args[1]->uio_iov[0].iov_base; > > printf("%Y vn_io_fault %p:%p pid %d\n", walltimestamp, curthread->td_proc->p_vmspace, this->buf, req[curthread->td_proc->p_vmspace, this->buf]); > > } > > === > > > > And don't got any messages near nginx core dump. > > What I can check next? > > May be check context/address space switch for kernel process? > > Which CPU are you using? CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz (2000.04-MHz K8-class CPU) > Perhaps try disabling PCID support (I think vm.pmap.pcid_enabled=0 from > loader prompt or loader.conf)? 
(Wondering if pmap_activate() is somehow not switching) > > -- > John Baldwin From owner-freebsd-stable@freebsd.org Tue Sep 20 08:11:39 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 160D6BE1C65 for ; Tue, 20 Sep 2016 08:11:39 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id CF54BED1 for ; Tue, 20 Sep 2016 08:11:38 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from slw by zxy.spb.ru with local (Exim 4.86 (FreeBSD)) (envelope-from ) id 1bmG9P-000JKq-6d; Tue, 20 Sep 2016 11:11:35 +0300 Date: Tue, 20 Sep 2016 11:11:35 +0300 From: Slawa Olhovchenkov To: Lyndon Nerenberg Cc: FreeBSD Stable Subject: Re: LAGG and Jumbo Frames Message-ID: <20160920081135.GH2960@zxy.spb.ru> References: <48926c6013f938af832c17e4ad10b232@dweimer.net> <04c9065ee4a780c6f8986d1b204c4198@dweimer.net> <20160919220812.GG2960@zxy.spb.ru> <42A03EA9-7F8E-446E-B430-7431AB9CE2E6@orthanc.ca> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <42A03EA9-7F8E-446E-B430-7431AB9CE2E6@orthanc.ca> User-Agent: Mutt/1.5.24 (2015-08-30) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Sep 2016 08:11:39 -0000 On Mon, Sep 19, 2016 at 03:59:20PM -0700, Lyndon Nerenberg wrote: > > > On Sep 19, 2016, at 3:08 PM, Slawa Olhovchenkov wrote: > > > > This is because RTT of this link for jumbo frames 
higher 1500 bytes
> > frame for store-and-forward switch chain.
>
> For TCP, RTT isn't really a factor (in this scenario),

I don't see a scenario in the first message. For my scenario this is
the limiting factor.

> as the windowing and congestion avoidance algorithms will adapt to the actual bandwidth-delay product of the link, and the delays in each direction will be symmetrical.
>
> Now the ack for a single 9000 octet packet will take longer than
> that for a 1500 octet one, but that's because you're sending six
> times as many octets before the ACK can be generated. The time to
> send six 1500 octet packets and receive the ACK from sixth packet is
> going to be comparable to that of receiving the ack from a single
> 9000 octet packet. It's simple arithmetic to calculate the extra
> protocol header overhead for 6x1500 vs 1x9000.

The time to send six 1500-octet packets is significantly less than the
time to send one 9000-octet packet over multiple switches:

H1-[S1]-[S2]-[S3]-H2

Sending a single 1500-octet packet from H1 to S1 over a 1 Gbit link:
(1500+14+4+12+8)*8/10^9 = 12us
The switch delays it for 3us.
The same applies to S1-S2, S2-S3 and S3-H2.
The 2nd packet is delayed by 12us; packets 3..6 likewise.
Sending all six packets (5 inter-packet intervals over 4 hops):
(12+3)*4 + 12*5 = 120us.

Sending a single 9000-octet packet from H1 to S1 over a 1 Gbit link:
(9000+14+4+12+8)*8/10^9 = 72us
The switch delays it for 3us.
Sending a single 9000-octet packet over 4 hops:
(72+3)*4 = 300us.

300/120 = 2.5 times slower.

> If there *is* a significant difference (beyond the extra protocol header overhead), it's time to take a very close look at the NICs you are using in the end hosts. A statistically significant difference would hint at poor interrupt handling performance on the part of one or more of the NICs and their associated device drivers.
>
> The intermediate switch overhead will be a constant (unless the switch backplane becomes saturated from unrelated traffic).

You left out the serialization time.
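
The arithmetic above can be reproduced with a short sketch. This is an illustration under the same assumptions as the message, not a measurement: a 1 Gbit/s link, 14+4+12+8 octets of per-frame Ethernet overhead, four store-and-forward hops, and a 3us delay charged on every hop. The function names and defaults are made up for the example.

```python
# Ethernet per-frame overhead in octets, as used in the message:
# 14 (header) + 4 (FCS) + 12 (inter-frame gap) + 8 (preamble).
FRAME_OVERHEAD = 14 + 4 + 12 + 8


def serialization_us(payload_octets, link_bps=1e9):
    """Time to clock one frame onto the link, in microseconds."""
    return (payload_octets + FRAME_OVERHEAD) * 8 / link_bps * 1e6


def transfer_us(payload_octets, frames, hops=4, switch_delay_us=3.0):
    """Time until the last of `frames` back-to-back frames arrives.

    Each hop is store-and-forward: a frame is fully received before it
    is sent on, so the first frame pays (serialization + delay) on every
    hop. The remaining frames are pipelined behind it, one serialization
    time apart.
    """
    t = serialization_us(payload_octets)
    return (t + switch_delay_us) * hops + t * (frames - 1)


six_standard = transfer_us(1500, frames=6)
one_jumbo = transfer_us(9000, frames=1)
print(f"6 x 1500: {six_standard:.1f} us")
print(f"1 x 9000: {one_jumbo:.1f} us")
print(f"ratio:    {one_jumbo / six_standard:.2f}")
```

Without rounding the per-frame times to 12us and 72us, this gives roughly 123us and 301us, a ratio close to the 2.5 quoted above.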
From owner-freebsd-stable@freebsd.org Tue Sep 20 19:20:56 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 49607BE29C0 for ; Tue, 20 Sep 2016 19:20:56 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 0FB2B839; Tue, 20 Sep 2016 19:20:56 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from slw by zxy.spb.ru with local (Exim 4.86 (FreeBSD)) (envelope-from ) id 1bmQb7-000ARb-AO; Tue, 20 Sep 2016 22:20:53 +0300 Date: Tue, 20 Sep 2016 22:20:53 +0300 From: Slawa Olhovchenkov To: John Baldwin Cc: freebsd-stable@freebsd.org Subject: Re: nginx and FreeBSD11 Message-ID: <20160920192053.GP2840@zxy.spb.ru> References: <20160907191348.GD22212@zxy.spb.ru> <1823460.vTm8IvUQsF@ralph.baldwin.cx> <20160918162241.GE2960@zxy.spb.ru> <2122051.7RxZBKUSFc@ralph.baldwin.cx> <20160920065244.GO2840@zxy.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160920065244.GO2840@zxy.spb.ru> User-Agent: Mutt/1.5.24 (2015-08-30) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Sep 2016 19:20:56 -0000 On Tue, Sep 20, 2016 at 09:52:44AM +0300, Slawa Olhovchenkov wrote: > On Mon, Sep 19, 2016 at 06:05:46PM -0700, John Baldwin wrote: > > > > > If this panics, then vmspace_switch_aio() is not working for > > > > some reason. 
> > >
> > > I am try using next DTrace script:
> > > ====
> > > #pragma D option dynvarsize=64m
> > >
> > > int req[struct vmspace *, void *];
> > > self int trace;
> > >
> > > syscall:freebsd:aio_read:entry
> > > {
> > > this->aio = *(struct aiocb *)copyin(arg0, sizeof(struct aiocb));
> > > req[curthread->td_proc->p_vmspace, this->aio.aio_buf] = curthread->td_proc->p_pid;
> > > }
> > >
> > > fbt:kernel:aio_process_rw:entry
> > > {
> > > self->job = args[0];
> > > self->trace = 1;
> > > }
> > >
> > > fbt:kernel:aio_process_rw:return
> > > /self->trace/
> > > {
> > > req[self->job->userproc->p_vmspace, self->job->uaiocb.aio_buf] = 0;
> > > self->job = 0;
> > > self->trace = 0;
> > > }
> > >
> > > fbt:kernel:vn_io_fault:entry
> > > /self->trace && !req[curthread->td_proc->p_vmspace, args[1]->uio_iov[0].iov_base]/
> > > {
> > > this->buf = args[1]->uio_iov[0].iov_base;
> > > printf("%Y vn_io_fault %p:%p pid %d\n", walltimestamp, curthread->td_proc->p_vmspace, this->buf, req[curthread->td_proc->p_vmspace, this->buf]);
> > > }
> > > ===
> > >
> > > And don't got any messages near nginx core dump.
> > > What I can check next?
> > > May be check context/address space switch for kernel process?
> >
> > Which CPU are you using?
>
> CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz (2000.04-MHz K8-class CPU)
>
> > Perhaps try disabling PCID support (I think vm.pmap.pcid_enabled=0 from
> > loader prompt or loader.conf)? (Wondering if pmap_activate() is somehow not switching)

I need some more time to test (a day or two), but for now this looks
like a workaround/solution: 12h of runtime, including the peak hour,
without an nginx crash (vm.pmap.pcid_enabled=0 in loader.conf).
From owner-freebsd-stable@freebsd.org Tue Sep 20 20:19:36 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5D8C1BE39BD for ; Tue, 20 Sep 2016 20:19:36 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id E7942BEB; Tue, 20 Sep 2016 20:19:35 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u8KKJPJe001639 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Tue, 20 Sep 2016 23:19:26 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u8KKJPJe001639 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u8KKJP2q001638; Tue, 20 Sep 2016 23:19:25 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Tue, 20 Sep 2016 23:19:25 +0300 From: Konstantin Belousov To: Slawa Olhovchenkov Cc: John Baldwin , freebsd-stable@freebsd.org Subject: Re: nginx and FreeBSD11 Message-ID: <20160920201925.GI38409@kib.kiev.ua> References: <20160907191348.GD22212@zxy.spb.ru> <1823460.vTm8IvUQsF@ralph.baldwin.cx> <20160918162241.GE2960@zxy.spb.ru> <2122051.7RxZBKUSFc@ralph.baldwin.cx> <20160920065244.GO2840@zxy.spb.ru> <20160920192053.GP2840@zxy.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160920192053.GP2840@zxy.spb.ru> User-Agent: Mutt/1.6.1 (2016-04-27) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no 
version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Sep 2016 20:19:36 -0000 On Tue, Sep 20, 2016 at 10:20:53PM +0300, Slawa Olhovchenkov wrote: > On Tue, Sep 20, 2016 at 09:52:44AM +0300, Slawa Olhovchenkov wrote: > > > On Mon, Sep 19, 2016 at 06:05:46PM -0700, John Baldwin wrote: > > > > > > > If this panics, then vmspace_switch_aio() is not working for > > > > > some reason. > > > > > > > > I am try using next DTrace script: > > > > ==== > > > > #pragma D option dynvarsize=64m > > > > > > > > int req[struct vmspace *, void *]; > > > > self int trace; > > > > > > > > syscall:freebsd:aio_read:entry > > > > { > > > > this->aio = *(struct aiocb *)copyin(arg0, sizeof(struct aiocb)); > > > > req[curthread->td_proc->p_vmspace, this->aio.aio_buf] = curthread->td_proc->p_pid; > > > > } > > > > > > > > fbt:kernel:aio_process_rw:entry > > > > { > > > > self->job = args[0]; > > > > self->trace = 1; > > > > } > > > > > > > > fbt:kernel:aio_process_rw:return > > > > /self->trace/ > > > > { > > > > req[self->job->userproc->p_vmspace, self->job->uaiocb.aio_buf] = 0; > > > > self->job = 0; > > > > self->trace = 0; > > > > } > > > > > > > > fbt:kernel:vn_io_fault:entry > > > > /self->trace && !req[curthread->td_proc->p_vmspace, args[1]->uio_iov[0].iov_base]/ > > > > { > > > > this->buf = args[1]->uio_iov[0].iov_base; > > > > printf("%Y vn_io_fault %p:%p pid %d\n", walltimestamp, curthread->td_proc->p_vmspace, this->buf, req[curthread->td_proc->p_vmspace, this->buf]); > > > > } > > > > === > > > > > > > > And don't got any messages near nginx core dump. > > > > What I can check next? > > > > May be check context/address space switch for kernel process? > > > > > > Which CPU are you using? 
> >
> > CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz (2000.04-MHz K8-class CPU)

Is this Sandy Bridge? Show me the first 100 lines of the verbose dmesg;
I want to see the CPU features lines. In particular, does your CPU
support the INVPCID feature? Also, you may show me the
'sysctl vm.pmap' output.

> >
> > > Perhaps try disabling PCID support (I think vm.pmap.pcid_enabled=0 from
> > > loader prompt or loader.conf)? (Wondering if pmap_activate() is somehow not switching)
>
> I am need some more time to test (day or two), but now this is like
> workaround/solution: 12h runtime and peak hour w/o nginx crash.
> (vm.pmap.pcid_enabled=0 in loader.conf).

Please try this variation of the previous patch.

diff --git a/sys/vm/vm_map.c b/sys/vm/vm_map.c
index a23468e..f754652 100644
--- a/sys/vm/vm_map.c
+++ b/sys/vm/vm_map.c
@@ -481,6 +481,7 @@ vmspace_switch_aio(struct vmspace *newvm)
 	if (oldvm == newvm)
 		return;
 
+	spinlock_enter();
 	/*
 	 * Point to the new address space and refer to it.
 	 */
@@ -489,6 +490,7 @@ vmspace_switch_aio(struct vmspace *newvm)
 
 	/* Activate the new mapping. */
 	pmap_activate(curthread);
+	spinlock_exit();
 
 	/* Remove the daemon's reference to the old address space.
*/ KASSERT(oldvm->vm_refcnt > 1, From owner-freebsd-stable@freebsd.org Tue Sep 20 20:24:19 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 66A4ABE3BCE for ; Tue, 20 Sep 2016 20:24:19 +0000 (UTC) (envelope-from prvs=00711ef7d9=rblayzor.bulk@inoc.net) Received: from mta2.alb.inoc.net (mta2.alb.inoc.net [IPv6:2607:f058:110:2::1:2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 33D43DE for ; Tue, 20 Sep 2016 20:24:19 +0000 (UTC) (envelope-from prvs=00711ef7d9=rblayzor.bulk@inoc.net) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=inoc.net; s=201501; h=To:References:Message-Id:Content-Transfer-Encoding:Cc:Date: In-Reply-To:From:Subject:Mime-Version:Content-Type:Sender:Reply-To:Content-ID :Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To: Resent-Cc:Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe :List-Post:List-Owner:List-Archive; bh=P2mA53TmfHchu7nllZH29PISNUbZAh3OledsiPxGfbE=; b=DkLFOai9Owagq5QStEaJ0iGRUv Gpr+YmVKCBlnrZIjrvn1m6Ha3O3HpEMWuXkkJZhrnpKWbcqZzqhUqGeKr3cILZEcqxTePM16qPcDe kEQR91LrydkIjSk3NYw+bCm2unuCfu+9SjxCOGEC2vpOVLrDmLb2schaal4KDUlqeHtFmaakek8Eh QbDXZ5mUI9aqUD9AOsj3rUqW+E+i7wq/zI4E1B+ypHf7rpqyojBQ9Nd1TSR7pIcMlQr+jChbYzGp8 vrRVOrve1sn0GV5IO6QbHLiF8wxk3KJkXy2cMidLoGzIZ9AoEsI4LzveGIEU25onOo/cR5r8luhMr fHE44ckw==; Received: from [2607:f058:11:a:1450:27ad:161c:82ae] by mail.inoc.net with ESMTPA (Exim 4.87) (envelope-from ) id 1bmRaT-000JNw-W0 by authid ; Tue, 20 Sep 2016 20:24:18 +0000 Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 9.3 \(3124\)) Subject: Re: LAGG and Jumbo Frames From: Robert Blayzor In-Reply-To: <48926c6013f938af832c17e4ad10b232@dweimer.net> Date: Tue, 20 Sep 2016 16:24:13 -0400 Cc: FreeBSD Stable 
Content-Transfer-Encoding: quoted-printable Message-Id: <8C764584-DCA9-4407-84E8-AFDC52B35AE1@inoc.net> References: <48926c6013f938af832c17e4ad10b232@dweimer.net> To: dweimer@dweimer.net X-Mailer: Apple Mail (2.3124) X-Auth-Info: cmJsYXl6b3JAaW5vYy5uZXQ= X-Virus-Scanned: ClamAV 0.99.2/22226/Tue Sep 20 16:04:49 2016 X-Origin-Country: US X-Anti-Abuse: Please report to abuse@inoc.net X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Sep 2016 20:24:19 -0000

If your lag interface is up, what’s not working?

Does something like this work?

ping -D -s 8972

and then this not?

ping -D -s 8972

If your firewall is on the LAN side supporting jumbo frames ok, but not WAN side, then the router will have to fragment all of the packets. (unless DF bit is set of course).

--
Robert
inoc.net!rblayzor
XMPP: rblayzor.AT.inoc.net
PGP Key: 78BEDCE1 @ pgp.mit.edu

> On Sep 19, 2016, at 3:23 PM, Dean E. Weimer wrote:
>
> Does anyone see an issue with the Jumbo Frames setup above, or are Jumbo Frames not supported correctly in a LACP Aggregate configuration.
From owner-freebsd-stable@freebsd.org Tue Sep 20 20:26:37 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 63A8DBE3C82 for ; Tue, 20 Sep 2016 20:26:37 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 1C51423F; Tue, 20 Sep 2016 20:26:37 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from slw by zxy.spb.ru with local (Exim 4.86 (FreeBSD)) (envelope-from ) id 1bmRcf-000C9r-5R; Tue, 20 Sep 2016 23:26:33 +0300 Date: Tue, 20 Sep 2016 23:26:33 +0300 From: Slawa Olhovchenkov To: Julien Charbon Cc: Konstantin Belousov , freebsd-stable@FreeBSD.org, hiren panchasara Subject: Re: 11.0 stuck on high network load Message-ID: <20160920202633.GQ2840@zxy.spb.ru> References: <20160905014612.GA42393@strugglingcoder.info> <20160914213503.GJ2840@zxy.spb.ru> <20160915085938.GN38409@kib.kiev.ua> <20160915090633.GS2840@zxy.spb.ru> <20160916181839.GC2960@zxy.spb.ru> <20160916183053.GL9397@strugglingcoder.info> <20160916190330.GG2840@zxy.spb.ru> <78cbcdc9-f565-1046-c157-2ddd8fcccc62@freebsd.org> <20160919204328.GN2840@zxy.spb.ru> <8ba75d6e-4f01-895e-0aed-53c6c6692cb9@freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <8ba75d6e-4f01-895e-0aed-53c6c6692cb9@freebsd.org> User-Agent: Mutt/1.5.24 (2015-08-30) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Sep 2016 
20:26:37 -0000 On Tue, Sep 20, 2016 at 10:00:25PM +0200, Julien Charbon wrote: > > Hi Slawa, > > On 9/19/16 10:43 PM, Slawa Olhovchenkov wrote: > > On Mon, Sep 19, 2016 at 10:32:13PM +0200, Julien Charbon wrote: > >> > >>> @ CPU_CLK_UNHALTED_CORE [4653445 samples] > >>> > >>> 51.86% [2413083] lock_delay @ /boot/kernel.VSTREAM/kernel > >>> 100.0% [2413083] __rw_wlock_hard > >>> 100.0% [2413083] tcp_tw_2msl_scan > >>> 99.99% [2412958] pfslowtimo > >>> 100.0% [2412958] softclock_call_cc > >>> 100.0% [2412958] softclock > >>> 100.0% [2412958] intr_event_execute_handlers > >>> 100.0% [2412958] ithread_loop > >>> 100.0% [2412958] fork_exit > >>> 00.01% [125] tcp_twstart > >>> 100.0% [125] tcp_do_segment > >>> 100.0% [125] tcp_input > >>> 100.0% [125] ip_input > >>> 100.0% [125] swi_net > >>> 100.0% [125] intr_event_execute_handlers > >>> 100.0% [125] ithread_loop > >>> 100.0% [125] fork_exit > >> > >> The only write lock tcp_tw_2msl_scan() tries to get is a > >> INP_WLOCK(inp). Thus here, tcp_tw_2msl_scan() seems to be stuck > >> spinning on INP_WLOCK (or pfslowtimo() is going crazy and calls > >> tcp_tw_2msl_scan() at high rate but this will be quite unexpected). > >> > >> Thus my hypothesis is that something is holding the INP_WLOCK and not > >> releasing it, and tcp_tw_2msl_scan() is spinning on it. > >> > >> If you can, could you compile the kernel with below options: > >> > >> options DDB # Support DDB. > >> options DEADLKRES # Enable the deadlock resolver > >> options INVARIANTS # Enable calls of extra sanity > >> checking > >> options INVARIANT_SUPPORT # Extra sanity checks of internal > >> structures, required by INVARIANTS > >> options WITNESS # Enable checks to detect > >> deadlocks and cycles > >> options WITNESS_SKIPSPIN # Don't run witness on spinlocks > >> for speed > > > > Currently this host run with 100% CPU load (on all cores), i.e. > > enabling WITNESS will be significant drop performance. > > Can I use only some subset of options? 
> >
> > Also, I can some troubles to DDB enter in this case.
> > May be kgdb will be success (not tryed yet)?
>
>  If these kernel options will certainly slow down your kernel, they also
> might find the root cause of your issue before reaching the point where
> you have 100% cpu load on all cores (thanks to INVARIANTS).  I would
> suggest:

Hmm, maybe I was not clear. At peak hours this host runs at 100% CPU
load as normal operation (it is serving 2x10G). That CPU load is not a
result of the lock issue; that is not our case. This is why I am afraid
to enable WITNESS: I am afraid of the performance drop.

The lock issue happens irregularly and may be caused by another issue
(nginx crashing). When it happens, about 1/3 of the cores are at 100%
CPU load, perhaps because of this lock. I can trace only one core at a
time and need more than an hour for that, so other cores may show a
different trace; I can't guarantee anything.

>  #1. Try above kernel options at least once, and see what you can get.

OK, I will try this a bit later.

>  #2. If #1 is a total failure try below patch:  It won't solve anything,
> it just makes tcp_tw_2msl_scan() less greedy when there is contention on
> the INP write lock.  If it makes the debugging more feasible, continue
> to #3.

OK, thanks.
What is the reason for not skipping a locked tcptw in this loop?

> diff --git a/sys/netinet/tcp_timewait.c b/sys/netinet/tcp_timewait.c
> index a8b78f9..4206ea3 100644
> --- a/sys/netinet/tcp_timewait.c
> +++ b/sys/netinet/tcp_timewait.c
> @@ -701,34 +701,42 @@ tcp_tw_2msl_scan(int reuse)
>  		in_pcbref(inp);
>  		TW_RUNLOCK(V_tw_lock);
>
> +retry:
>  		if (INP_INFO_TRY_RLOCK(&V_tcbinfo)) {
>
> -			INP_WLOCK(inp);
> -			tw = intotw(inp);
> -			if (in_pcbrele_wlocked(inp)) {
> -				KASSERT(tw == NULL, ("%s: held last inp "
> -				    "reference but tw not NULL", __func__));
> -				INP_INFO_RUNLOCK(&V_tcbinfo);
> -				continue;
> -			}
> +			if (INP_TRY_WLOCK(inp)) {
> +				tw = intotw(inp);
> +				if (in_pcbrele_wlocked(inp)) {
> +					KASSERT(tw == NULL, ("%s: held last inp "
> +					    "reference but tw not NULL", __func__));
> +					INP_INFO_RUNLOCK(&V_tcbinfo);
> +					continue;
> +				}
>
> -			if (tw == NULL) {
> -				/* tcp_twclose() has already been called */
> -				INP_WUNLOCK(inp);
> -				INP_INFO_RUNLOCK(&V_tcbinfo);
> -				continue;
> -			}
> +				if (tw == NULL) {
> +					/* tcp_twclose() has already been called */
> +					INP_WUNLOCK(inp);
> +					INP_INFO_RUNLOCK(&V_tcbinfo);
> +					continue;
> +				}
>
> -			tcp_twclose(tw, reuse);
> -			INP_INFO_RUNLOCK(&V_tcbinfo);
> -			if (reuse)
> -				return tw;
> +				tcp_twclose(tw, reuse);
> +				INP_INFO_RUNLOCK(&V_tcbinfo);
> +				if (reuse)
> +					return tw;
> +			} else {
> +				INP_INFO_RUNLOCK(&V_tcbinfo);
> +				goto retry;
> +			}
>  		} else {
>  			/* INP_INFO lock is busy, continue later. */
> -			INP_WLOCK(inp);
> -			if (!in_pcbrele_wlocked(inp))
> -				INP_WUNLOCK(inp);
> -			break;
> +			if (INP_TRY_WLOCK(inp)) {
> +				if (!in_pcbrele_wlocked(inp))
> +					INP_WUNLOCK(inp);
> +				break;
> +			} else {
> +				goto retry;
> +			}
>  		}
>  	}
>
> #3. Once the issue is reproduced, launch ddb and run the below commands:
>
> show pcpu
> show allpcpu
> show locks
> show alllocks
> show lockchain
> show allchains
> show all trace
>
> My 2 cents.
> > -- > Julien > From owner-freebsd-stable@freebsd.org Tue Sep 20 20:38:56 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4018DBE21E6 for ; Tue, 20 Sep 2016 20:38:56 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 02F5AE56; Tue, 20 Sep 2016 20:38:56 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from slw by zxy.spb.ru with local (Exim 4.86 (FreeBSD)) (envelope-from ) id 1bmRoc-000CUa-1H; Tue, 20 Sep 2016 23:38:54 +0300 Date: Tue, 20 Sep 2016 23:38:54 +0300 From: Slawa Olhovchenkov To: Konstantin Belousov Cc: John Baldwin , freebsd-stable@freebsd.org Subject: Re: nginx and FreeBSD11 Message-ID: <20160920203853.GR2840@zxy.spb.ru> References: <20160907191348.GD22212@zxy.spb.ru> <1823460.vTm8IvUQsF@ralph.baldwin.cx> <20160918162241.GE2960@zxy.spb.ru> <2122051.7RxZBKUSFc@ralph.baldwin.cx> <20160920065244.GO2840@zxy.spb.ru> <20160920192053.GP2840@zxy.spb.ru> <20160920201925.GI38409@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160920201925.GI38409@kib.kiev.ua> User-Agent: Mutt/1.5.24 (2015-08-30) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Sep 2016 20:38:56 -0000 On Tue, Sep 20, 2016 at 11:19:25PM +0300, Konstantin Belousov wrote: > On Tue, Sep 20, 2016 at 10:20:53PM +0300, Slawa Olhovchenkov wrote: > > On Tue, Sep 20, 2016 
at 09:52:44AM +0300, Slawa Olhovchenkov wrote:
> >
> > > On Mon, Sep 19, 2016 at 06:05:46PM -0700, John Baldwin wrote:
> > >
> > > > > > If this panics, then vmspace_switch_aio() is not working for
> > > > > > some reason.
> > > > >
> > > > > I am try using next DTrace script:
> > > > > ====
> > > > > #pragma D option dynvarsize=64m
> > > > >
> > > > > int req[struct vmspace *, void *];
> > > > > self int trace;
> > > > >
> > > > > syscall:freebsd:aio_read:entry
> > > > > {
> > > > > 	this->aio = *(struct aiocb *)copyin(arg0, sizeof(struct aiocb));
> > > > > 	req[curthread->td_proc->p_vmspace, this->aio.aio_buf] = curthread->td_proc->p_pid;
> > > > > }
> > > > >
> > > > > fbt:kernel:aio_process_rw:entry
> > > > > {
> > > > > 	self->job = args[0];
> > > > > 	self->trace = 1;
> > > > > }
> > > > >
> > > > > fbt:kernel:aio_process_rw:return
> > > > > /self->trace/
> > > > > {
> > > > > 	req[self->job->userproc->p_vmspace, self->job->uaiocb.aio_buf] = 0;
> > > > > 	self->job = 0;
> > > > > 	self->trace = 0;
> > > > > }
> > > > >
> > > > > fbt:kernel:vn_io_fault:entry
> > > > > /self->trace && !req[curthread->td_proc->p_vmspace, args[1]->uio_iov[0].iov_base]/
> > > > > {
> > > > > 	this->buf = args[1]->uio_iov[0].iov_base;
> > > > > 	printf("%Y vn_io_fault %p:%p pid %d\n", walltimestamp, curthread->td_proc->p_vmspace, this->buf, req[curthread->td_proc->p_vmspace, this->buf]);
> > > > > }
> > > > > ===
> > > > >
> > > > > And don't got any messages near nginx core dump.
> > > > > What I can check next?
> > > > > May be check context/address space switch for kernel process?
> > > >
> > > > Which CPU are you using?
> > >
> > > CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz (2000.04-MHz K8-class CPU)
> Is this sandy bridge ?

Yes, Sandy Bridge EP.

> Show me first 100 lines of the verbose dmesg,

In a day or two, once this test run has finished; I need to enable
verbose boot first.

> I want to see cpu features lines.  In particular, does you CPU support
> the INVPCID feature.

CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz (2000.05-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x206d7  Family=0x6  Model=0x2d  Stepping=7
  Features=0xbfebfbff
  Features2=0x1fbee3ff
  AMD Features=0x2c100800
  AMD Features2=0x1
  XSAVE Features=0x1
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
  TSC: P-state invariant, performance statistics

I don't see this feature on CPUs older than E5 v3:

CPU: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz (2600.06-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x306e4  Family=0x6  Model=0x3e  Stepping=4
  Features=0xbfebfbff
  Features2=0x7fbee3ff
  AMD Features=0x2c100800
  AMD Features2=0x1
  Structured Extended Features=0x281
  XSAVE Features=0x1
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr
  TSC: P-state invariant, performance statistics

(11.0 is not running on this CPU)

CPU: Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz (2600.05-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x306f2  Family=0x6  Model=0x3f  Stepping=2
  Features=0xbfebfbff
  Features2=0x7ffefbff
  AMD Features=0x2c100800
  AMD Features2=0x21
  Structured Extended Features=0x37ab
  XSAVE Features=0x1
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr
  TSC: P-state invariant, performance statistics

(11.0 runs without this issue)

> Also you may show me the 'sysctl vm.pmap' output.

# sysctl vm.pmap
vm.pmap.pdpe.demotions: 3
vm.pmap.pde.promotions: 172495
vm.pmap.pde.p_failures: 2119294
vm.pmap.pde.mappings: 1927
vm.pmap.pde.demotions: 126192
vm.pmap.pcid_save_cnt: 0
vm.pmap.invpcid_works: 0
vm.pmap.pcid_enabled: 0
vm.pmap.pg_ps_enabled: 1
vm.pmap.pat_works: 1

This is after setting vm.pmap.pcid_enabled=0 in loader.conf.

> > >
> > > > Perhaps try disabling PCID support (I think vm.pmap.pcid_enabled=0 from
> > > > loader prompt or loader.conf)?  (Wondering if pmap_activate() is somehow not switching)
> >
> > I am need some more time to test (day or two), but now this is like
> > workaround/solution: 12h runtime and peak hour w/o nginx crash.
> > (vm.pmap.pcid_enabled=0 in loader.conf).
>
> Please try this variation of the previous patch.

and remove vm.pmap.pcid_enabled=0? > diff --git a/sys/vm/vm_map.c b/sys/vm/vm_map.c > index a23468e..f754652 100644 > --- a/sys/vm/vm_map.c > +++ b/sys/vm/vm_map.c > @@ -481,6 +481,7 @@ vmspace_switch_aio(struct vmspace *newvm) > if (oldvm == newvm) > return; > > + spinlock_enter(); > /* > * Point to the new address space and refer to it. > */ > @@ -489,6 +490,7 @@ vmspace_switch_aio(struct vmspace *newvm) > > /* Activate the new mapping. */ > pmap_activate(curthread); > + spinlock_exit(); > > /* Remove the daemon's reference to the old address space. */ > KASSERT(oldvm->vm_refcnt > 1, From owner-freebsd-stable@freebsd.org Tue Sep 20 20:59:20 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 361BABE2915 for ; Tue, 20 Sep 2016 20:59:20 +0000 (UTC) (envelope-from julien.charbon@gmail.com) Received: from mail-wm0-f44.google.com (mail-wm0-f44.google.com [74.125.82.44]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id BF710CBD for ; Tue, 20 Sep 2016 20:59:19 +0000 (UTC) (envelope-from julien.charbon@gmail.com) Received: by mail-wm0-f44.google.com with SMTP id b130so56662612wmc.0 for ; Tue, 20 Sep 2016 13:59:19 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:subject:to:references:cc:from:message-id:date :user-agent:mime-version:in-reply-to; bh=TBhkwAZMxRQJJwZXaVq/7qX6IjfEvv5sa4DycmEMWV4=; b=F4ZJC4QOK4XJRGSoO2Vh4XJePXfci0kOqglFkGwXQ7IVtgm6LPWZI48/rfwKPH8RGa WevirCxalh4uqKXZ532724ptnli/l2RWP2jlDy3OWWKkjlwRhDpix4Mj70QkWaHBIA79 i3mMr0XotIRbwj7cjn2PImXnONd+h6fzRx/FyRc5LrdVWN41Da4DQ4+wm68ZobfhtYf+ wdybjyAhHWAxyCz7rzKylt5+A74sFTqFpLu58h+xapQ3b6wmCffTu9TVR0528sz7Fkw3 
uUyG1gY5lxr5Sd3Dq+zi0YnfpQPnqO/mpVmhix6C3uf6Q2i9cY8iU6p4a0YhopmKtgR7 7R/g== X-Gm-Message-State: AE9vXwMqZ6Os4Lt+lvJ0Ii0WK1lHh/DVHAQFNZJhGz6Vemm2dPTwwG82biNjGdFILAMVrg== X-Received: by 10.28.181.145 with SMTP id e139mr4654358wmf.114.1474401634944; Tue, 20 Sep 2016 13:00:34 -0700 (PDT) Received: from [192.168.0.12] (217-162-163-184.dynamic.hispeed.ch. [217.162.163.184]) by smtp.gmail.com with ESMTPSA id o2sm29998864wjo.3.2016.09.20.13.00.33 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 20 Sep 2016 13:00:33 -0700 (PDT) Subject: Re: 11.0 stuck on high network load To: Slawa Olhovchenkov References: <20160904215739.GC22212@zxy.spb.ru> <20160905014612.GA42393@strugglingcoder.info> <20160914213503.GJ2840@zxy.spb.ru> <20160915085938.GN38409@kib.kiev.ua> <20160915090633.GS2840@zxy.spb.ru> <20160916181839.GC2960@zxy.spb.ru> <20160916183053.GL9397@strugglingcoder.info> <20160916190330.GG2840@zxy.spb.ru> <78cbcdc9-f565-1046-c157-2ddd8fcccc62@freebsd.org> <20160919204328.GN2840@zxy.spb.ru> Cc: Konstantin Belousov , freebsd-stable@FreeBSD.org, hiren panchasara From: Julien Charbon Message-ID: <8ba75d6e-4f01-895e-0aed-53c6c6692cb9@freebsd.org> Date: Tue, 20 Sep 2016 22:00:25 +0200 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:45.0) Gecko/20100101 Thunderbird/45.2.0 MIME-Version: 1.0 In-Reply-To: <20160919204328.GN2840@zxy.spb.ru> Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="UfwU9GI729W6F2q2nLQ6jAwJJ1tANQG4J" X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Sep 2016 20:59:20 -0000 This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --UfwU9GI729W6F2q2nLQ6jAwJJ1tANQG4J Content-Type: multipart/mixed; boundary="v9b3hUSrNDIrqVvbMuxrNDHf6TvMtHHi3"; protected-headers="v1" From: Julien Charbon To: Slawa 
Olhovchenkov Cc: Konstantin Belousov , freebsd-stable@FreeBSD.org, hiren panchasara Message-ID: <8ba75d6e-4f01-895e-0aed-53c6c6692cb9@freebsd.org> Subject: Re: 11.0 stuck on high network load References: <20160904215739.GC22212@zxy.spb.ru> <20160905014612.GA42393@strugglingcoder.info> <20160914213503.GJ2840@zxy.spb.ru> <20160915085938.GN38409@kib.kiev.ua> <20160915090633.GS2840@zxy.spb.ru> <20160916181839.GC2960@zxy.spb.ru> <20160916183053.GL9397@strugglingcoder.info> <20160916190330.GG2840@zxy.spb.ru> <78cbcdc9-f565-1046-c157-2ddd8fcccc62@freebsd.org> <20160919204328.GN2840@zxy.spb.ru> In-Reply-To: <20160919204328.GN2840@zxy.spb.ru> --v9b3hUSrNDIrqVvbMuxrNDHf6TvMtHHi3 Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable Hi Slawa, On 9/19/16 10:43 PM, Slawa Olhovchenkov wrote: > On Mon, Sep 19, 2016 at 10:32:13PM +0200, Julien Charbon wrote: >> >>> @ CPU_CLK_UNHALTED_CORE [4653445 samples] >>> >>> 51.86% [2413083] lock_delay @ /boot/kernel.VSTREAM/kernel >>> 100.0% [2413083] __rw_wlock_hard >>> 100.0% [2413083] tcp_tw_2msl_scan >>> 99.99% [2412958] pfslowtimo >>> 100.0% [2412958] softclock_call_cc >>> 100.0% [2412958] softclock >>> 100.0% [2412958] intr_event_execute_handlers >>> 100.0% [2412958] ithread_loop >>> 100.0% [2412958] fork_exit >>> 00.01% [125] tcp_twstart >>> 100.0% [125] tcp_do_segment >>> 100.0% [125] tcp_input >>> 100.0% [125] ip_input >>> 100.0% [125] swi_net >>> 100.0% [125] intr_event_execute_handlers >>> 100.0% [125] ithread_loop >>> 100.0% [125] fork_exit >> >> The only write lock tcp_tw_2msl_scan() tries to get is a >> INP_WLOCK(inp). Thus here, tcp_tw_2msl_scan() seems to be stuck >> spinning on INP_WLOCK (or pfslowtimo() is going crazy and calls >> tcp_tw_2msl_scan() at high rate but this will be quite unexpected). >> >> Thus my hypothesis is that something is holding the INP_WLOCK and not= >> releasing it, and tcp_tw_2msl_scan() is spinning on it. 
>>
>>  If you can, could you compile the kernel with below options:
>>
>> options        DDB                     # Support DDB.
>> options        DEADLKRES               # Enable the deadlock resolver
>> options        INVARIANTS              # Enable calls of extra sanity checking
>> options        INVARIANT_SUPPORT       # Extra sanity checks of internal structures, required by INVARIANTS
>> options        WITNESS                 # Enable checks to detect deadlocks and cycles
>> options        WITNESS_SKIPSPIN        # Don't run witness on spinlocks for speed
>
> Currently this host run with 100% CPU load (on all cores), i.e.
> enabling WITNESS will be significant drop performance.
> Can I use only some subset of options?
>
> Also, I can some troubles to DDB enter in this case.
> May be kgdb will be success (not tryed yet)?

 If these kernel options will certainly slow down your kernel, they also
might find the root cause of your issue before reaching the point where
you have 100% cpu load on all cores (thanks to INVARIANTS).  I would
suggest:

 #1. Try above kernel options at least once, and see what you can get.

 #2. If #1 is a total failure try below patch:  It won't solve anything,
it just makes tcp_tw_2msl_scan() less greedy when there is contention on
the INP write lock.  If it makes the debugging more feasible, continue
to #3.

diff --git a/sys/netinet/tcp_timewait.c b/sys/netinet/tcp_timewait.c
index a8b78f9..4206ea3 100644
--- a/sys/netinet/tcp_timewait.c
+++ b/sys/netinet/tcp_timewait.c
@@ -701,34 +701,42 @@ tcp_tw_2msl_scan(int reuse)
 		in_pcbref(inp);
 		TW_RUNLOCK(V_tw_lock);

+retry:
 		if (INP_INFO_TRY_RLOCK(&V_tcbinfo)) {

-			INP_WLOCK(inp);
-			tw = intotw(inp);
-			if (in_pcbrele_wlocked(inp)) {
-				KASSERT(tw == NULL, ("%s: held last inp "
-				    "reference but tw not NULL", __func__));
-				INP_INFO_RUNLOCK(&V_tcbinfo);
-				continue;
-			}
+			if (INP_TRY_WLOCK(inp)) {
+				tw = intotw(inp);
+				if (in_pcbrele_wlocked(inp)) {
+					KASSERT(tw == NULL, ("%s: held last inp "
+					    "reference but tw not NULL", __func__));
+					INP_INFO_RUNLOCK(&V_tcbinfo);
+					continue;
+				}

-			if (tw == NULL) {
-				/* tcp_twclose() has already been called */
-				INP_WUNLOCK(inp);
-				INP_INFO_RUNLOCK(&V_tcbinfo);
-				continue;
-			}
+				if (tw == NULL) {
+					/* tcp_twclose() has already been called */
+					INP_WUNLOCK(inp);
+					INP_INFO_RUNLOCK(&V_tcbinfo);
+					continue;
+				}

-			tcp_twclose(tw, reuse);
-			INP_INFO_RUNLOCK(&V_tcbinfo);
-			if (reuse)
-				return tw;
+				tcp_twclose(tw, reuse);
+				INP_INFO_RUNLOCK(&V_tcbinfo);
+				if (reuse)
+					return tw;
+			} else {
+				INP_INFO_RUNLOCK(&V_tcbinfo);
+				goto retry;
+			}
 		} else {
 			/* INP_INFO lock is busy, continue later. */
-			INP_WLOCK(inp);
-			if (!in_pcbrele_wlocked(inp))
-				INP_WUNLOCK(inp);
-			break;
+			if (INP_TRY_WLOCK(inp)) {
+				if (!in_pcbrele_wlocked(inp))
+					INP_WUNLOCK(inp);
+				break;
+			} else {
+				goto retry;
+			}
 		}
 	}

#3. Once the issue is reproduced, launch ddb and run the below commands:

show pcpu
show allpcpu
show locks
show alllocks
show lockchain
show allchains
show all trace

 My 2 cents.

-- Julien --v9b3hUSrNDIrqVvbMuxrNDHf6TvMtHHi3-- --UfwU9GI729W6F2q2nLQ6jAwJJ1tANQG4J Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Comment: GPGTools - https://gpgtools.org iQEcBAEBCgAGBQJX4ZVgAAoJEKVlQ5Je6dhxXvIIANqvnsweFVWMO+fG4EHI1tZW UeXOwM5/Amyox9uVrVkaTlBDft8hEDVAgwDyQKWdEOqV3FJek9/edXkbg31nuVHJ LtQQZvgMgN2gFQs+42U6XTKXAk0XsVmFQaPqi99m97AThXOLvKkI4kS0DJZL6tvU drXFt5NT0zfQLjqo/WWbNNqBXYkBQ6NI+rQ5bpZIBPEmvPPXSm1RUQd7pkJOdnYa mZHXB7J/9avca6pPomNsm6bnyrsmvdg6ecOTaQRq4qBn6sEjb1bYIOzfz5Tc6hzk Y7+JwRUtN9+sN7QumghAupyTuET4kvIOLfjGDT0HTF+8cAXOMG18Yj4jdPX74GE= =BBIM -----END PGP SIGNATURE----- --UfwU9GI729W6F2q2nLQ6jAwJJ1tANQG4J-- From owner-freebsd-stable@freebsd.org Tue Sep 20 21:15:24 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id EF588BE3031 for ; Tue, 20 Sep 2016 21:15:24 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 805E68E2; Tue, 20 Sep 2016 21:15:24 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u8KLFIdG014464 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Wed, 21 Sep 2016 00:15:18 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u8KLFIdG014464 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u8KLFIfx014462; Wed, 21 Sep 2016 00:15:18 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com 
using -f Date: Wed, 21 Sep 2016 00:15:17 +0300 From: Konstantin Belousov To: Slawa Olhovchenkov Cc: John Baldwin , freebsd-stable@freebsd.org Subject: Re: nginx and FreeBSD11 Message-ID: <20160920211517.GJ38409@kib.kiev.ua> References: <20160907191348.GD22212@zxy.spb.ru> <1823460.vTm8IvUQsF@ralph.baldwin.cx> <20160918162241.GE2960@zxy.spb.ru> <2122051.7RxZBKUSFc@ralph.baldwin.cx> <20160920065244.GO2840@zxy.spb.ru> <20160920192053.GP2840@zxy.spb.ru> <20160920201925.GI38409@kib.kiev.ua> <20160920203853.GR2840@zxy.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160920203853.GR2840@zxy.spb.ru> User-Agent: Mutt/1.6.1 (2016-04-27) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Sep 2016 21:15:25 -0000 On Tue, Sep 20, 2016 at 11:38:54PM +0300, Slawa Olhovchenkov wrote: > On Tue, Sep 20, 2016 at 11:19:25PM +0300, Konstantin Belousov wrote: > > > On Tue, Sep 20, 2016 at 10:20:53PM +0300, Slawa Olhovchenkov wrote: > > > On Tue, Sep 20, 2016 at 09:52:44AM +0300, Slawa Olhovchenkov wrote: > > > > > > > On Mon, Sep 19, 2016 at 06:05:46PM -0700, John Baldwin wrote: > > > > > > > > > > > If this panics, then vmspace_switch_aio() is not working for > > > > > > > some reason. 
> > > > > >
> > > > > > I am try using next DTrace script:
> > > > > > ====
> > > > > > #pragma D option dynvarsize=64m
> > > > > >
> > > > > > int req[struct vmspace *, void *];
> > > > > > self int trace;
> > > > > >
> > > > > > syscall:freebsd:aio_read:entry
> > > > > > {
> > > > > > 	this->aio = *(struct aiocb *)copyin(arg0, sizeof(struct aiocb));
> > > > > > 	req[curthread->td_proc->p_vmspace, this->aio.aio_buf] = curthread->td_proc->p_pid;
> > > > > > }
> > > > > >
> > > > > > fbt:kernel:aio_process_rw:entry
> > > > > > {
> > > > > > 	self->job = args[0];
> > > > > > 	self->trace = 1;
> > > > > > }
> > > > > >
> > > > > > fbt:kernel:aio_process_rw:return
> > > > > > /self->trace/
> > > > > > {
> > > > > > 	req[self->job->userproc->p_vmspace, self->job->uaiocb.aio_buf] = 0;
> > > > > > 	self->job = 0;
> > > > > > 	self->trace = 0;
> > > > > > }
> > > > > >
> > > > > > fbt:kernel:vn_io_fault:entry
> > > > > > /self->trace && !req[curthread->td_proc->p_vmspace, args[1]->uio_iov[0].iov_base]/
> > > > > > {
> > > > > > 	this->buf = args[1]->uio_iov[0].iov_base;
> > > > > > 	printf("%Y vn_io_fault %p:%p pid %d\n", walltimestamp, curthread->td_proc->p_vmspace, this->buf, req[curthread->td_proc->p_vmspace, this->buf]);
> > > > > > }
> > > > > > ===
> > > > > >
> > > > > > And don't got any messages near nginx core dump.
> > > > > > What I can check next?
> > > > > > May be check context/address space switch for kernel process?
> > > > >
> > > > > Which CPU are you using?
> > > >
> > > > CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz (2000.04-MHz K8-class CPU)
> > Is this sandy bridge ?
>
> Yes, Sandy Bridge EP.
>
> > Show me first 100 lines of the verbose dmesg,
>
> In a day or two, once this test run has finished; I need to enable
> verbose boot first.
>
> > I want to see cpu features lines.  In particular, does you CPU support
> > the INVPCID feature.
>
> CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz (2000.05-MHz K8-class CPU)
>   Origin="GenuineIntel"  Id=0x206d7  Family=0x6  Model=0x2d  Stepping=7
>   Features=0xbfebfbff
>   Features2=0x1fbee3ff
>   AMD Features=0x2c100800
>   AMD Features2=0x1
>   XSAVE Features=0x1
>   VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
>   TSC: P-state invariant, performance statistics
>
> I don't see this feature on CPUs older than E5 v3:
>
> CPU: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz (2600.06-MHz K8-class CPU)
>   Origin="GenuineIntel"  Id=0x306e4  Family=0x6  Model=0x3e  Stepping=4
>   Features=0xbfebfbff
>   Features2=0x7fbee3ff
>   AMD Features=0x2c100800
>   AMD Features2=0x1
>   Structured Extended Features=0x281
>   XSAVE Features=0x1
>   VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr
>   TSC: P-state invariant, performance statistics
>
> (11.0 is not running on this CPU)
Ok.

>
> CPU: Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz (2600.05-MHz K8-class CPU)
>   Origin="GenuineIntel"  Id=0x306f2  Family=0x6  Model=0x3f  Stepping=2
>   Features=0xbfebfbff
>   Features2=0x7ffefbff
>   AMD Features=0x2c100800
>   AMD Features2=0x21
>   Structured Extended Features=0x37ab
>   XSAVE Features=0x1
>   VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr
>   TSC: P-state invariant, performance statistics
>
> (11.0 runs without this issue)
Do you mean that a similarly configured nginx+aio does not demonstrate the corruption on this machine?

>
> > Also you may show me the 'sysctl vm.pmap' output.
>
> # sysctl vm.pmap
> vm.pmap.pdpe.demotions: 3
> vm.pmap.pde.promotions: 172495
> vm.pmap.pde.p_failures: 2119294
> vm.pmap.pde.mappings: 1927
> vm.pmap.pde.demotions: 126192
> vm.pmap.pcid_save_cnt: 0
> vm.pmap.invpcid_works: 0
> vm.pmap.pcid_enabled: 0
> vm.pmap.pg_ps_enabled: 1
> vm.pmap.pat_works: 1
>
> This is after setting vm.pmap.pcid_enabled=0 in loader.conf.
>
> > > >
> > > > > Perhaps try disabling PCID support (I think vm.pmap.pcid_enabled=0 from
> > > > > loader prompt or loader.conf)?
(Wondering if pmap_activate() is somehow not switching) > > > > > > I am need some more time to test (day or two), but now this is like > > > workaround/solution: 12h runtime and peak hour w/o nginx crash. > > > (vm.pmap.pcid_enabled=0 in loader.conf). > > > > Please try this variation of the previous patch. > > and remove vm.pmap.pcid_enabled=0? Definitely. > > > diff --git a/sys/vm/vm_map.c b/sys/vm/vm_map.c > > index a23468e..f754652 100644 > > --- a/sys/vm/vm_map.c > > +++ b/sys/vm/vm_map.c > > @@ -481,6 +481,7 @@ vmspace_switch_aio(struct vmspace *newvm) > > if (oldvm == newvm) > > return; > > > > + spinlock_enter(); > > /* > > * Point to the new address space and refer to it. > > */ > > @@ -489,6 +490,7 @@ vmspace_switch_aio(struct vmspace *newvm) > > > > /* Activate the new mapping. */ > > pmap_activate(curthread); > > + spinlock_exit(); > > > > /* Remove the daemon's reference to the old address space. */ > > KASSERT(oldvm->vm_refcnt > 1, From owner-freebsd-stable@freebsd.org Tue Sep 20 21:47:39 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 856D3BE3A28 for ; Tue, 20 Sep 2016 21:47:39 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 47AB51E14; Tue, 20 Sep 2016 21:47:39 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from slw by zxy.spb.ru with local (Exim 4.86 (FreeBSD)) (envelope-from ) id 1bmSsz-000ENC-MI; Wed, 21 Sep 2016 00:47:29 +0300 Date: Wed, 21 Sep 2016 00:47:29 +0300 From: Slawa Olhovchenkov To: Konstantin Belousov Cc: John Baldwin , freebsd-stable@freebsd.org Subject: Re: nginx and FreeBSD11 Message-ID: <20160920214729.GS2840@zxy.spb.ru> References: <20160907191348.GD22212@zxy.spb.ru> 
<1823460.vTm8IvUQsF@ralph.baldwin.cx> <20160918162241.GE2960@zxy.spb.ru> <2122051.7RxZBKUSFc@ralph.baldwin.cx> <20160920065244.GO2840@zxy.spb.ru> <20160920192053.GP2840@zxy.spb.ru> <20160920201925.GI38409@kib.kiev.ua> <20160920203853.GR2840@zxy.spb.ru> <20160920211517.GJ38409@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160920211517.GJ38409@kib.kiev.ua> User-Agent: Mutt/1.5.24 (2015-08-30) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Sep 2016 21:47:39 -0000 On Wed, Sep 21, 2016 at 12:15:17AM +0300, Konstantin Belousov wrote: > On Tue, Sep 20, 2016 at 11:38:54PM +0300, Slawa Olhovchenkov wrote: > > On Tue, Sep 20, 2016 at 11:19:25PM +0300, Konstantin Belousov wrote: > > > > > On Tue, Sep 20, 2016 at 10:20:53PM +0300, Slawa Olhovchenkov wrote: > > > > On Tue, Sep 20, 2016 at 09:52:44AM +0300, Slawa Olhovchenkov wrote: > > > > > > > > > On Mon, Sep 19, 2016 at 06:05:46PM -0700, John Baldwin wrote: > > > > > > > > > > > > > If this panics, then vmspace_switch_aio() is not working for > > > > > > > > some reason. 
> > > > > > >
> > > > > > > I am try using next DTrace script:
> > > > > > > ====
> > > > > > > #pragma D option dynvarsize=64m
> > > > > > >
> > > > > > > int req[struct vmspace *, void *];
> > > > > > > self int trace;
> > > > > > >
> > > > > > > syscall:freebsd:aio_read:entry
> > > > > > > {
> > > > > > > 	this->aio = *(struct aiocb *)copyin(arg0, sizeof(struct aiocb));
> > > > > > > 	req[curthread->td_proc->p_vmspace, this->aio.aio_buf] = curthread->td_proc->p_pid;
> > > > > > > }
> > > > > > >
> > > > > > > fbt:kernel:aio_process_rw:entry
> > > > > > > {
> > > > > > > 	self->job = args[0];
> > > > > > > 	self->trace = 1;
> > > > > > > }
> > > > > > >
> > > > > > > fbt:kernel:aio_process_rw:return
> > > > > > > /self->trace/
> > > > > > > {
> > > > > > > 	req[self->job->userproc->p_vmspace, self->job->uaiocb.aio_buf] = 0;
> > > > > > > 	self->job = 0;
> > > > > > > 	self->trace = 0;
> > > > > > > }
> > > > > > >
> > > > > > > fbt:kernel:vn_io_fault:entry
> > > > > > > /self->trace && !req[curthread->td_proc->p_vmspace, args[1]->uio_iov[0].iov_base]/
> > > > > > > {
> > > > > > > 	this->buf = args[1]->uio_iov[0].iov_base;
> > > > > > > 	printf("%Y vn_io_fault %p:%p pid %d\n", walltimestamp, curthread->td_proc->p_vmspace, this->buf, req[curthread->td_proc->p_vmspace, this->buf]);
> > > > > > > }
> > > > > > > ===
> > > > > > >
> > > > > > > And don't got any messages near nginx core dump.
> > > > > > > What I can check next?
> > > > > > > May be check context/address space switch for kernel process?
> > > > > >
> > > > > > Which CPU are you using?
> > > > >
> > > > > CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz (2000.04-MHz K8-class CPU)
> > > Is this sandy bridge ?
> >
> > Yes, Sandy Bridge EP.
> >
> > > Show me first 100 lines of the verbose dmesg,
> >
> > In a day or two, once this test run has finished; I need to enable
> > verbose boot first.
> >
> > > I want to see cpu features lines.  In particular, does you CPU support
> > > the INVPCID feature.
> > > > CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz (2000.05-MHz K8-class CPU) > > Origin="GenuineIntel" Id=0x206d7 Family=0x6 Model=0x2d Stepping=7 > > Features=0xbfebfbff > > Features2=0x1fbee3ff > > AMD Features=0x2c100800 > > AMD Features2=0x1 > > XSAVE Features=0x1 > > VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID > > TSC: P-state invariant, performance statistics > > > > I am don't see this feature before E5v3: > > > > CPU: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz (2600.06-MHz K8-class CPU) > > Origin="GenuineIntel" Id=0x306e4 Family=0x6 Model=0x3e Stepping=4 > > Features=0xbfebfbff > > Features2=0x7fbee3ff > > AMD Features=0x2c100800 > > AMD Features2=0x1 > > Structured Extended Features=0x281 > > XSAVE Features=0x1 > > VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr > > TSC: P-state invariant, performance statistics > > > > (don't run 11.0 on this CPU) > Ok. > > > > > CPU: Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz (2600.05-MHz K8-class CPU) > > Origin="GenuineIntel" Id=0x306f2 Family=0x6 Model=0x3f Stepping=2 > > Features=0xbfebfbff > > Features2=0x7ffefbff > > AMD Features=0x2c100800 > > AMD Features2=0x21 > > Structured Extended Features=0x37ab > > XSAVE Features=0x1 > > VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr > > TSC: P-state invariant, performance statistics > > > > (11.0 run w/o this issuse) > Do you mean that similarly configured nginx+aio do not demonstrate the corruption on this machine ? Yes, but with a different storage configuration and a different load pattern. Also, 11.0 runs without this issue on CPU: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz (2200.04-MHz K8-class CPU) Origin="GenuineIntel" Id=0x406f1 Family=0x6 Model=0x4f Stepping=1 Features=0xbfebfbff Features2=0x7ffefbff AMD Features=0x2c100800 AMD Features2=0x121 Structured Extended Features=0x21cbfbb XSAVE Features=0x1 VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr TSC: P-state invariant, performance statistics PS: all systems are dual-CPU.
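The dmesg excerpts in this message show only raw hex for each feature word (the decoded flag lists are missing), so the INVPCID question has to be answered from the numbers. A minimal sketch, assuming that "Structured Extended Features" is the EBX word of CPUID leaf 7, subleaf 0, where INVPCID is bit 10. The helper name has_invpcid and the table are illustrative only, not part of any FreeBSD tool:

```python
# Hypothetical helper: test the INVPCID bit in the hex values quoted in this thread.
# Assumption: "Structured Extended Features" is CPUID leaf 7 (subleaf 0) EBX,
# in which INVPCID is bit 10.
INVPCID_BIT = 1 << 10

# Values copied from the dmesg excerpts above. None means the excerpt
# shows no "Structured Extended Features" line at all.
STDEXT_FEATURES = {
    "E5-2620 0": None,
    "E5-2650 v2": 0x281,
    "E5-2640 v3": 0x37AB,
    "E5-2650 v4": 0x21CBFBB,
}


def has_invpcid(stdext):
    """Return True if the feature word is present and has bit 10 set."""
    return stdext is not None and bool(stdext & INVPCID_BIT)


if __name__ == "__main__":
    for cpu, value in STDEXT_FEATURES.items():
        print(f"{cpu}: INVPCID {'yes' if has_invpcid(value) else 'no'}")
```

Under that assumption, only the v3 and v4 values have the bit set, which agrees with the remark that the feature does not appear before E5 v3.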
From owner-freebsd-stable@freebsd.org Tue Sep 20 22:00:11 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B0CF2BE3CF4 for ; Tue, 20 Sep 2016 22:00:11 +0000 (UTC) (envelope-from wlosh@bsdimp.com) Received: from mail-io0-x229.google.com (mail-io0-x229.google.com [IPv6:2607:f8b0:4001:c06::229]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 80F025F6 for ; Tue, 20 Sep 2016 22:00:11 +0000 (UTC) (envelope-from wlosh@bsdimp.com) Received: by mail-io0-x229.google.com with SMTP id m79so34505762ioo.3 for ; Tue, 20 Sep 2016 15:00:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bsdimp-com.20150623.gappssmtp.com; s=20150623; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:cc; bh=N+sEnO3G/LxeidDHAO+0U2MJN0+jJwOzBwyShzuEbDA=; b=bfhoR7PekeYk369mP8hX19rXl3hk4sC26KSmMMtaCBcqjojFdCL9uT0AYIdutXyTKn dcibKDVwPCAKuUfVecZPRnLyp/uPm4qTH3TI4fjRA0Xu/6XF1grFcnZUHl9Boypl/gWc i6Ot74d3odwFRT8casoKGUQKqyWWYjowPTihXOkCIqaa1sgS7arIAfjfrA7YIK/5UR37 JJkG30ys+d07QuetlvgzoY/JBOKU5xsBdmpsl5llBNhWo1CMd+ToDw81I65EQulvzPV3 MihOf/zRG38YxxgrYk7psdC88PvGJ6Zk0p1vV7blxA03ocSb3ZfnxbelGwEr7sjO4qS6 Awbg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:sender:in-reply-to:references:from :date:message-id:subject:to:cc; bh=N+sEnO3G/LxeidDHAO+0U2MJN0+jJwOzBwyShzuEbDA=; b=OSUt+BiqysNDWLeMvs/dUZA8QYEZwX74LQQHh3BxwPu4KaMSexMN8K5B89dsgDGUFd tBeMtXdfD5EozPMoisKueK/8siy5H9jPKbljBz8LKnsDE0LCJjWNyEyZz1d9mNtwLrzW F0WwE3GeUauoqiCPehz5WIkDZSyKZ2n5MLh5xVM8VRi05Ki6oHCBT284VL475rNUZxuH sKNJIHR/OgnoKTdSER1S4ym/cpndKmMnAyVYfSJvxi3oByvuERKah47Lfed6SXBGIBwM 
gwHFm5JsqjKo4yyTq4WOxQbdiuBQ/RePakAX8jrKwzN7D4n7Fi0ufNayBRP3ZycG7xHb T0Yg== X-Gm-Message-State: AE9vXwP67u7YDHHQtqpFMR1ZsvgZYHsLdJ2bpuvzutOtBxzu/uazZajZlyeqP1edVaiG3Otks5MhAcF1Yf2OMw== X-Received: by 10.107.184.131 with SMTP id i125mr51553459iof.167.1474408810769; Tue, 20 Sep 2016 15:00:10 -0700 (PDT) MIME-Version: 1.0 Sender: wlosh@bsdimp.com Received: by 10.36.65.7 with HTTP; Tue, 20 Sep 2016 15:00:10 -0700 (PDT) X-Originating-IP: [69.53.245.200] In-Reply-To: <20160920214729.GS2840@zxy.spb.ru> References: <20160907191348.GD22212@zxy.spb.ru> <1823460.vTm8IvUQsF@ralph.baldwin.cx> <20160918162241.GE2960@zxy.spb.ru> <2122051.7RxZBKUSFc@ralph.baldwin.cx> <20160920065244.GO2840@zxy.spb.ru> <20160920192053.GP2840@zxy.spb.ru> <20160920201925.GI38409@kib.kiev.ua> <20160920203853.GR2840@zxy.spb.ru> <20160920211517.GJ38409@kib.kiev.ua> <20160920214729.GS2840@zxy.spb.ru> From: Warner Losh Date: Tue, 20 Sep 2016 16:00:10 -0600 X-Google-Sender-Auth: oeE2MriJ8mCLk8wyBN64xENAK4g Message-ID: Subject: Re: nginx and FreeBSD11 To: Slawa Olhovchenkov Cc: Konstantin Belousov , FreeBSD-STABLE Mailing List , John Baldwin Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Sep 2016 22:00:11 -0000 On Tue, Sep 20, 2016 at 3:47 PM, Slawa Olhovchenkov wrote: > On Wed, Sep 21, 2016 at 12:15:17AM +0300, Konstantin Belousov wrote: > >> On Tue, Sep 20, 2016 at 11:38:54PM +0300, Slawa Olhovchenkov wrote: >> > On Tue, Sep 20, 2016 at 11:19:25PM +0300, Konstantin Belousov wrote: >> > >> > > On Tue, Sep 20, 2016 at 10:20:53PM +0300, Slawa Olhovchenkov wrote: >> > > > On Tue, Sep 20, 2016 at 09:52:44AM +0300, Slawa Olhovchenkov wrote: >> > > > >> > > > > On Mon, Sep 19, 2016 at 06:05:46PM -0700, John Baldwin wrote: >> > > > > >> > > > > > > > If this panics, then 
vmspace_switch_aio() is not working for >> > > > > > > > some reason. >> > > > > > > >> > > > > > > I am try using next DTrace script: >> > > > > > > ==== >> > > > > > > #pragma D option dynvarsize=64m >> > > > > > > >> > > > > > > int req[struct vmspace *, void *]; >> > > > > > > self int trace; >> > > > > > > >> > > > > > > syscall:freebsd:aio_read:entry >> > > > > > > { >> > > > > > > this->aio = *(struct aiocb *)copyin(arg0, sizeof(struct aiocb)); >> > > > > > > req[curthread->td_proc->p_vmspace, this->aio.aio_buf] = curthread->td_proc->p_pid; >> > > > > > > } >> > > > > > > >> > > > > > > fbt:kernel:aio_process_rw:entry >> > > > > > > { >> > > > > > > self->job = args[0]; >> > > > > > > self->trace = 1; >> > > > > > > } >> > > > > > > >> > > > > > > fbt:kernel:aio_process_rw:return >> > > > > > > /self->trace/ >> > > > > > > { >> > > > > > > req[self->job->userproc->p_vmspace, self->job->uaiocb.aio_buf] = 0; >> > > > > > > self->job = 0; >> > > > > > > self->trace = 0; >> > > > > > > } >> > > > > > > >> > > > > > > fbt:kernel:vn_io_fault:entry >> > > > > > > /self->trace && !req[curthread->td_proc->p_vmspace, args[1]->uio_iov[0].iov_base]/ >> > > > > > > { >> > > > > > > this->buf = args[1]->uio_iov[0].iov_base; >> > > > > > > printf("%Y vn_io_fault %p:%p pid %d\n", walltimestamp, curthread->td_proc->p_vmspace, this->buf, req[curthread->td_proc->p_vmspace, this->buf]); >> > > > > > > } >> > > > > > > === >> > > > > > > >> > > > > > > And don't got any messages near nginx core dump. >> > > > > > > What I can check next? >> > > > > > > May be check context/address space switch for kernel process? >> > > > > > >> > > > > > Which CPU are you using? >> > > > > >> > > > > CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz (2000.04-MHz K8-class CPU) >> > > Is this sandy bridge ? >> > >> > Sandy Bridge EP >> > >> > > Show me first 100 lines of the verbose dmesg, >> > >> > After day or two, after end of this test run -- I am need to enable verbose. 
>> > >> > > I want to see cpu features lines. In particular, does you CPU support >> > > the INVPCID feature. >> > >> > CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz (2000.05-MHz K8-class CPU) >> > Origin="GenuineIntel" Id=0x206d7 Family=0x6 Model=0x2d Stepping=7 >> > Features=0xbfebfbff >> > Features2=0x1fbee3ff >> > AMD Features=0x2c100800 >> > AMD Features2=0x1 >> > XSAVE Features=0x1 >> > VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID >> > TSC: P-state invariant, performance statistics >> > >> > I am don't see this feature before E5v3: >> > >> > CPU: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz (2600.06-MHz K8-class CPU) >> > Origin="GenuineIntel" Id=0x306e4 Family=0x6 Model=0x3e Stepping=4 >> > Features=0xbfebfbff >> > Features2=0x7fbee3ff >> > AMD Features=0x2c100800 >> > AMD Features2=0x1 >> > Structured Extended Features=0x281 >> > XSAVE Features=0x1 >> > VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr >> > TSC: P-state invariant, performance statistics >> > >> > (don't run 11.0 on this CPU) >> Ok. >> >> > >> > CPU: Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz (2600.05-MHz K8-class CPU) >> > Origin="GenuineIntel" Id=0x306f2 Family=0x6 Model=0x3f Stepping=2 >> > Features=0xbfebfbff >> > Features2=0x7ffefbff >> > AMD Features=0x2c100800 >> > AMD Features2=0x21 >> > Structured Extended Features=0x37ab >> > XSAVE Features=0x1 >> > VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr >> > TSC: P-state invariant, performance statistics >> > >> > (11.0 run w/o this issuse) >> Do you mean that similarly configured nginx+aio do not demonstrate the corruption on this machine ? > > Yes. > But different storage configuration and different pattern load. 
> > Also 11.0 run w/o this issuse on > > CPU: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz (2200.04-MHz K8-class CPU) > Origin="GenuineIntel" Id=0x406f1 Family=0x6 Model=0x4f Stepping=1 > Features=0xbfebfbff > Features2=0x7ffefbff > AMD Features=0x2c100800 > AMD Features2=0x121 > Structured Extended Features=0x21cbfbb > XSAVE Features=0x1 > VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr > TSC: P-state invariant, performance statistics > > PS: all systems is dual-cpu. Does this mean 2 cores or two sockets? We've seen a similar hang with the following CPU: CPU: Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (2700.06-MHz K8-class CPU) Origin="GenuineIntel" Id=0x306e4 Family=0x6 Model=0x3e Stepping=4 Features=0xbfebfbff Features2=0x7fbee3ff AMD Features=0x2c100800 AMD Features2=0x1 Structured Extended Features=0x281 XSAVE Features=0x1 VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr TSC: P-state invariant, performance statistics real memory = 274877906944 (262144 MB) avail memory = 267146330112 (254770 MB) 12 cores x 2 SMT x 1 socket Warner From owner-freebsd-stable@freebsd.org Tue Sep 20 22:02:59 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C1B89BE3E6F; Tue, 20 Sep 2016 22:02:59 +0000 (UTC) (envelope-from jenkins-admin@FreeBSD.org) Received: from jenkins-9.freebsd.org (jenkins-9.freebsd.org [8.8.178.209]) by mx1.freebsd.org (Postfix) with ESMTP id A7ADEA01; Tue, 20 Sep 2016 22:02:59 +0000 (UTC) (envelope-from jenkins-admin@FreeBSD.org) Received: from jenkins-9.freebsd.org (localhost [127.0.0.1]) by jenkins-9.freebsd.org (Postfix) with ESMTP id E540C8A; Tue, 20 Sep 2016 22:02:59 +0000 (UTC) Date: Tue, 20 Sep 2016 22:02:57 +0000 (GMT) From: jenkins-admin@FreeBSD.org To: ae@FreeBSD.org, bde@FreeBSD.org, jenkins-admin@FreeBSD.org, freebsd-stable@FreeBSD.org, freebsd-arm@FreeBSD.org Message-ID: 
<260542609.2.1474408979951.JavaMail.jenkins@jenkins-9.freebsd.org> Subject: FreeBSD_STABLE_11-arm64 - Build #154 - Failure MIME-Version: 1.0 X-Jenkins-Job: FreeBSD_STABLE_11-arm64 X-Jenkins-Result: FAILURE Precedence: bulk Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.23 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Sep 2016 22:02:59 -0000 FreeBSD_STABLE_11-arm64 - Build #154 - Failure: Build information: https://jenkins.FreeBSD.org/job/FreeBSD_STABLE_11-arm64/154/ Full change log: https://jenkins.FreeBSD.org/job/FreeBSD_STABLE_11-arm64/154/changes Full build log: https://jenkins.FreeBSD.org/job/FreeBSD_STABLE_11-arm64/154/console Change summaries: 306025 by ae: MFC r305778: Fix swap tables between sets when this functional is enabled. We have 6 opcode rewriters for table opcodes. When `set swap' command invoked, it is called for each rewriter, so at the end we get the same result, because opcode rewriter uses ETLV type to match opcode. And all tables opcodes have the same ETLV type. To solve this problem, use separate sets handler for one opcode rewriter. Use it to handle TEST_ALL, SWAP_ALL and MOVE_ALL commands. PR: 212630 305971 by bde: MFC r305380: Fix missing fmodl() on arches with 53-bit long doubles. PR: 199422, 211965 The end of the build log: [...truncated 21808 lines...] 
===> share/doc/papers/relengr (includes) --- includes_subdir_share/doc/papers/sysperf --- ===> share/doc/papers/sysperf (includes) --- includes_subdir_share/doc/papers/timecounter --- ===> share/doc/papers/timecounter (includes) --- includes_subdir_share/doc/psd --- ===> share/doc/psd (includes) --- includes_subdir_share/doc/psd/title --- ===> share/doc/psd/title (includes) --- includes_subdir_share/doc/psd/contents --- ===> share/doc/psd/contents (includes) --- includes_subdir_share/doc/psd/01.cacm --- ===> share/doc/psd/01.cacm (includes) --- includes_subdir_share/doc/psd/02.implement --- ===> share/doc/psd/02.implement (includes) --- includes_subdir_share/doc/psd/03.iosys --- ===> share/doc/psd/03.iosys (includes) --- includes_subdir_share/doc/psd/04.uprog --- ===> share/doc/psd/04.uprog (includes) --- includes_subdir_share/doc/psd/05.sysman --- ===> share/doc/psd/05.sysman (includes) --- includes_subdir_share/doc/psd/06.Clang --- ===> share/doc/psd/06.Clang (includes) --- includes_subdir_share/doc/psd/12.make --- ===> share/doc/psd/12.make (includes) --- includes_subdir_share/doc/psd/13.rcs --- ===> share/doc/psd/13.rcs (includes) --- includes_subdir_share/doc/psd/13.rcs/rcs --- ===> share/doc/psd/13.rcs/rcs (includes) --- includes_subdir_share/doc/psd/13.rcs/rcs_func --- ===> share/doc/psd/13.rcs/rcs_func (includes) --- includes_subdir_share/doc/psd/15.yacc --- ===> share/doc/psd/15.yacc (includes) --- includes_subdir_share/doc/psd/16.lex --- ===> share/doc/psd/16.lex (includes) --- includes_subdir_share/doc/psd/17.m4 --- ===> share/doc/psd/17.m4 (includes) --- includes_subdir_share/doc/psd/18.gprof --- ===> share/doc/psd/18.gprof (includes) --- includes_subdir_share/doc/psd/20.ipctut --- ===> share/doc/psd/20.ipctut (includes) --- includes_subdir_share/doc/psd/21.ipc --- ===> share/doc/psd/21.ipc (includes) --- includes_subdir_share/doc/psd/22.rpcgen --- ===> share/doc/psd/22.rpcgen (includes) --- includes_subdir_share/doc/psd/23.rpc --- ===> 
share/doc/psd/23.rpc (includes) --- includes_subdir_share/doc/psd/24.xdr --- ===> share/doc/psd/24.xdr (includes) --- includes_subdir_share/doc/psd/25.xdrrfc --- ===> share/doc/psd/25.xdrrfc (includes) --- includes_subdir_share/doc/psd/26.rpcrfc --- ===> share/doc/psd/26.rpcrfc (includes) --- includes_subdir_share/doc/psd/27.nfsrpc --- ===> share/doc/psd/27.nfsrpc (includes) --- includes_subdir_share/doc/smm --- ===> share/doc/smm (includes) --- includes_subdir_share/doc/smm/title --- ===> share/doc/smm/title (includes) --- includes_subdir_share/doc/smm/contents --- ===> share/doc/smm/contents (includes) --- includes_subdir_share/doc/smm/01.setup --- ===> share/doc/smm/01.setup (includes) --- includes_subdir_share/doc/smm/02.config --- ===> share/doc/smm/02.config (includes) --- includes_subdir_share/doc/smm/03.fsck --- ===> share/doc/smm/03.fsck (includes) --- includes_subdir_share/doc/smm/04.quotas --- ===> share/doc/smm/04.quotas (includes) --- includes_subdir_share/doc/smm/05.fastfs --- ===> share/doc/smm/05.fastfs (includes) --- includes_subdir_share/doc/smm/06.nfs --- ===> share/doc/smm/06.nfs (includes) --- includes_subdir_share/doc/smm/07.lpd --- ===> share/doc/smm/07.lpd (includes) --- includes_subdir_share/doc/smm/08.sendmailop --- ===> share/doc/smm/08.sendmailop (includes) --- includes_subdir_share/doc/smm/11.timedop --- ===> share/doc/smm/11.timedop (includes) --- includes_subdir_share/doc/smm/12.timed --- ===> share/doc/smm/12.timed (includes) --- includes_subdir_share/doc/smm/18.net --- ===> share/doc/smm/18.net (includes) --- includes_subdir_share/doc/usd --- ===> share/doc/usd (includes) --- includes_subdir_share/doc/usd/title --- ===> share/doc/usd/title (includes) --- includes_subdir_share/doc/usd/contents --- ===> share/doc/usd/contents (includes) --- includes_subdir_share/doc/usd/04.csh --- ===> share/doc/usd/04.csh (includes) --- includes_subdir_share/doc/usd/05.dc --- ===> share/doc/usd/05.dc (includes) --- includes_subdir_share/doc/usd/06.bc 
--- ===> share/doc/usd/06.bc (includes) --- includes_subdir_share/doc/usd/07.mail --- ===> share/doc/usd/07.mail (includes) --- includes_subdir_share/doc/usd/10.exref --- ===> share/doc/usd/10.exref (includes) --- includes_subdir_share/doc/usd/10.exref/exref --- ===> share/doc/usd/10.exref/exref (includes) --- includes_subdir_share/doc/usd/10.exref/summary --- ===> share/doc/usd/10.exref/summary (includes) --- includes_subdir_share/doc/usd/11.vitut --- ===> share/doc/usd/11.vitut (includes) --- includes_subdir_share/doc/usd/12.vi --- ===> share/doc/usd/12.vi (includes) --- includes_subdir_share/doc/usd/12.vi/vi --- ===> share/doc/usd/12.vi/vi (includes) --- includes_subdir_share/doc/usd/12.vi/viapwh --- ===> share/doc/usd/12.vi/viapwh (includes) --- includes_subdir_share/doc/usd/12.vi/summary --- ===> share/doc/usd/12.vi/summary (includes) --- includes_subdir_share/doc/usd/13.viref --- ===> share/doc/usd/13.viref (includes) --- includes_subdir_share/doc/usd/18.msdiffs --- ===> share/doc/usd/18.msdiffs (includes) --- includes_subdir_share/doc/usd/19.memacros --- ===> share/doc/usd/19.memacros (includes) --- includes_subdir_share/doc/usd/20.meref --- ===> share/doc/usd/20.meref (includes) --- includes_subdir_share/doc/usd/21.troff --- ===> share/doc/usd/21.troff (includes) --- includes_subdir_share/doc/usd/22.trofftut --- ===> share/doc/usd/22.trofftut (includes) --- includes_subdir_share/dtrace --- ===> share/dtrace (includes) --- includes_subdir_share/examples --- ===> share/examples (includes) --- includes_subdir_share/examples/tests --- ===> share/examples/tests (includes) --- includes_subdir_share/examples/tests/tests --- ===> share/examples/tests/tests (includes) --- includes_subdir_share/examples/tests/tests/atf --- ===> share/examples/tests/tests/atf (includes) --- includes_subdir_share/examples/tests/tests/plain --- ===> share/examples/tests/tests/plain (includes) --- includes_subdir_share/i18n --- ===> share/i18n (includes) --- 
includes_subdir_share/i18n/csmapper --- ===> share/i18n/csmapper (includes) --- includes_subdir_share/i18n/csmapper/APPLE --- ===> share/i18n/csmapper/APPLE (includes) --- includes_subdir_share/i18n/csmapper/AST --- ===> share/i18n/csmapper/AST (includes) --- includes_subdir_share/i18n/csmapper/BIG5 --- ===> share/i18n/csmapper/BIG5 (includes) --- includes_subdir_share/i18n/csmapper/CNS --- ===> share/i18n/csmapper/CNS (includes) --- includes_subdir_share/i18n/csmapper/CP --- ===> share/i18n/csmapper/CP (includes) --- includes_subdir_share/i18n/csmapper/EBCDIC --- ===> share/i18n/csmapper/EBCDIC (includes) --- includes_subdir_share/i18n/csmapper/GB --- ===> share/i18n/csmapper/GB (includes) --- includes_subdir_share/i18n/csmapper/GEORGIAN --- ===> share/i18n/csmapper/GEORGIAN (includes) --- includes_subdir_share/i18n/csmapper/ISO646 --- ===> share/i18n/csmapper/ISO646 (includes) --- includes_subdir_share/i18n/csmapper/ISO-8859 --- ===> share/i18n/csmapper/ISO-8859 (includes) --- includes_subdir_share/i18n/csmapper/JIS --- ===> share/i18n/csmapper/JIS (includes) --- includes_subdir_share/i18n/csmapper/KAZAKH --- ===> share/i18n/csmapper/KAZAKH (includes) --- includes_subdir_share/i18n/csmapper/KOI --- ===> share/i18n/csmapper/KOI (includes) --- includes_subdir_share/i18n/csmapper/KS --- ===> share/i18n/csmapper/KS (includes) --- includes_subdir_share/i18n/csmapper/MISC --- ===> share/i18n/csmapper/MISC (includes) --- includes_subdir_share/i18n/csmapper/TCVN --- ===> share/i18n/csmapper/TCVN (includes) --- includes_subdir_share/i18n/esdb --- ===> share/i18n/esdb (includes) --- includes_subdir_share/i18n/esdb/APPLE --- ===> share/i18n/esdb/APPLE (includes) --- includes_subdir_share/i18n/esdb/AST --- ===> share/i18n/esdb/AST (includes) --- includes_subdir_share/i18n/esdb/BIG5 --- ===> share/i18n/esdb/BIG5 (includes) --- includes_subdir_share/i18n/esdb/CP --- ===> share/i18n/esdb/CP (includes) --- includes_subdir_share/i18n/esdb/DEC --- ===> share/i18n/esdb/DEC 
(includes) --- includes_subdir_share/i18n/esdb/EUC --- ===> share/i18n/esdb/EUC (includes) --- includes_subdir_share/i18n/esdb/EBCDIC --- ===> share/i18n/esdb/EBCDIC (includes) --- includes_subdir_share/i18n/esdb/GB --- ===> share/i18n/esdb/GB (includes) --- includes_subdir_share/i18n/esdb/GEORGIAN --- ===> share/i18n/esdb/GEORGIAN (includes) --- includes_subdir_share/i18n/esdb/ISO-2022 --- ===> share/i18n/esdb/ISO-2022 (includes) --- includes_subdir_share/i18n/esdb/ISO-8859 --- ===> share/i18n/esdb/ISO-8859 (includes) --- includes_subdir_share/i18n/esdb/ISO646 --- ===> share/i18n/esdb/ISO646 (includes) --- includes_subdir_share/i18n/esdb/KAZAKH --- ===> share/i18n/esdb/KAZAKH (includes) --- includes_subdir_share/i18n/esdb/KOI --- ===> share/i18n/esdb/KOI (includes) --- includes_subdir_share/i18n/esdb/MISC --- ===> share/i18n/esdb/MISC (includes) --- includes_subdir_share/i18n/esdb/TCVN --- ===> share/i18n/esdb/TCVN (includes) --- includes_subdir_share/i18n/esdb/UTF --- ===> share/i18n/esdb/UTF (includes) --- includes_subdir_share/keys --- ===> share/keys (includes) --- includes_subdir_share/keys/pkg --- ===> share/keys/pkg (includes) --- includes_subdir_share/keys/pkg/trusted --- ===> share/keys/pkg/trusted (includes) --- includes_subdir_share/man --- ===> share/man (includes) --- includes_subdir_share/man/man1 --- ===> share/man/man1 (includes) --- includes_subdir_share/man/man3 --- ===> share/man/man3 (includes) --- includes_subdir_secure --- --- includes_subdir_secure/lib --- --- includes_subdir_secure/lib/libssh --- ===> secure/lib/libssh (includes) Agent went offline during the build Build step 'Execute shell' marked build as failure ERROR: Step ?Scan for compiler warnings? failed: no workspace for FreeBSD_STABLE_11-arm64 #154 [PostBuildScript] - Execution post build scripts. 
ERROR: Build step failed with exception java.lang.NullPointerException: no workspace from node hudson.slaves.DumbSlave[kyua2.nyi.freebsd.org] which is computer hudson.slaves.SlaveComputer@3745450f and has channel null at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:74) at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:64) at org.jenkinsci.plugins.postbuildscript.PostBuildScript.processBuildSteps(PostBuildScript.java:204) at org.jenkinsci.plugins.postbuildscript.PostBuildScript.processScripts(PostBuildScript.java:143) at org.jenkinsci.plugins.postbuildscript.PostBuildScript._perform(PostBuildScript.java:105) at org.jenkinsci.plugins.postbuildscript.PostBuildScript.perform(PostBuildScript.java:85) at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20) at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:779) at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:720) at hudson.model.Build$BuildExecution.post2(Build.java:185) at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:665) at hudson.model.Run.execute(Run.java:1745) at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43) at hudson.model.ResourceController.execute(ResourceController.java:98) at hudson.model.Executor.run(Executor.java:404) Build step 'Execute a set of scripts' marked build as failure Email was triggered for: Failure - Any Sending email for trigger: Failure - Any From owner-freebsd-stable@freebsd.org Tue Sep 20 22:08:34 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9BFE5BE20BC for ; Tue, 20 Sep 2016 22:08:34 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a 
certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 5892EDD2; Tue, 20 Sep 2016 22:08:34 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from slw by zxy.spb.ru with local (Exim 4.86 (FreeBSD)) (envelope-from ) id 1bmTDL-000Ew4-Ul; Wed, 21 Sep 2016 01:08:31 +0300 Date: Wed, 21 Sep 2016 01:08:31 +0300 From: Slawa Olhovchenkov To: Warner Losh Cc: Konstantin Belousov , FreeBSD-STABLE Mailing List , John Baldwin Subject: Re: nginx and FreeBSD11 Message-ID: <20160920220831.GT2840@zxy.spb.ru> References: <1823460.vTm8IvUQsF@ralph.baldwin.cx> <20160918162241.GE2960@zxy.spb.ru> <2122051.7RxZBKUSFc@ralph.baldwin.cx> <20160920065244.GO2840@zxy.spb.ru> <20160920192053.GP2840@zxy.spb.ru> <20160920201925.GI38409@kib.kiev.ua> <20160920203853.GR2840@zxy.spb.ru> <20160920211517.GJ38409@kib.kiev.ua> <20160920214729.GS2840@zxy.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.24 (2015-08-30) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Sep 2016 22:08:34 -0000 On Tue, Sep 20, 2016 at 04:00:10PM -0600, Warner Losh wrote: > >> > > Is this sandy bridge ? > >> > > >> > Sandy Bridge EP > >> > > >> > > Show me first 100 lines of the verbose dmesg, > >> > > >> > After day or two, after end of this test run -- I am need to enable verbose. > >> > > >> > > I want to see cpu features lines. In particular, does you CPU support > >> > > the INVPCID feature. 
> >> > > >> > CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz (2000.05-MHz K8-class CPU) > >> > Origin="GenuineIntel" Id=0x206d7 Family=0x6 Model=0x2d Stepping=7 > >> > Features=0xbfebfbff > >> > Features2=0x1fbee3ff > >> > AMD Features=0x2c100800 > >> > AMD Features2=0x1 > >> > XSAVE Features=0x1 > >> > VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID > >> > TSC: P-state invariant, performance statistics > >> > > >> > I am don't see this feature before E5v3: > >> > > >> > CPU: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz (2600.06-MHz K8-class CPU) > >> > Origin="GenuineIntel" Id=0x306e4 Family=0x6 Model=0x3e Stepping=4 > >> > Features=0xbfebfbff > >> > Features2=0x7fbee3ff > >> > AMD Features=0x2c100800 > >> > AMD Features2=0x1 > >> > Structured Extended Features=0x281 > >> > XSAVE Features=0x1 > >> > VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr > >> > TSC: P-state invariant, performance statistics > >> > > >> > (don't run 11.0 on this CPU) > >> Ok. > >> > >> > > >> > CPU: Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz (2600.05-MHz K8-class CPU) > >> > Origin="GenuineIntel" Id=0x306f2 Family=0x6 Model=0x3f Stepping=2 > >> > Features=0xbfebfbff > >> > Features2=0x7ffefbff > >> > AMD Features=0x2c100800 > >> > AMD Features2=0x21 > >> > Structured Extended Features=0x37ab > >> > XSAVE Features=0x1 > >> > VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr > >> > TSC: P-state invariant, performance statistics > >> > > >> > (11.0 run w/o this issuse) > >> Do you mean that similarly configured nginx+aio do not demonstrate the corruption on this machine ? > > > > Yes. > > But different storage configuration and different pattern load. 
> > > > Also 11.0 run w/o this issuse on > > > > CPU: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz (2200.04-MHz K8-class CPU) > > Origin="GenuineIntel" Id=0x406f1 Family=0x6 Model=0x4f Stepping=1 > > Features=0xbfebfbff > > Features2=0x7ffefbff > > AMD Features=0x2c100800 > > AMD Features2=0x121 > > Structured Extended Features=0x21cbfbb > > XSAVE Features=0x1 > > VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr > > TSC: P-state invariant, performance statistics > > > > PS: all systems is dual-cpu. > > Does this mean 2 cores or two sockets? We've seen a similar hang with > the following CPU: Two sockets. Not sure how important this is, just for the record. Your system is also without the INVPCID feature (the one kib asked about). Maybe your case will also be resolved by vm.pmap.pcid_enabled=0? > CPU: Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (2700.06-MHz K8-class CPU) > Origin="GenuineIntel" Id=0x306e4 Family=0x6 Model=0x3e Stepping=4 > Features=0xbfebfbff > Features2=0x7fbee3ff > AMD Features=0x2c100800 > AMD Features2=0x1 > Structured Extended Features=0x281 > XSAVE Features=0x1 > VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr > TSC: P-state invariant, performance statistics > real memory = 274877906944 (262144 MB) > avail memory = 267146330112 (254770 MB) > > 12 cores x 2 SMT x 1 socket > > Warner From owner-freebsd-stable@freebsd.org Wed Sep 21 01:47:31 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E03ACBE2ADB; Wed, 21 Sep 2016 01:47:31 +0000 (UTC) (envelope-from jenkins-admin@FreeBSD.org) Received: from jenkins-9.freebsd.org (jenkins-9.freebsd.org [8.8.178.209]) by mx1.freebsd.org (Postfix) with ESMTP id D3A16364; Wed, 21 Sep 2016 01:47:31 +0000 (UTC) (envelope-from jenkins-admin@FreeBSD.org) Received: from jenkins-9.freebsd.org (localhost [127.0.0.1]) by jenkins-9.freebsd.org (Postfix) with ESMTP id F05C59C; Wed, 21 Sep 2016
01:47:31 +0000 (UTC) Date: Wed, 21 Sep 2016 01:47:30 +0000 (GMT) From: jenkins-admin@FreeBSD.org To: karels@FreeBSD.org, jenkins-admin@FreeBSD.org, freebsd-stable@FreeBSD.org, freebsd-arm@FreeBSD.org Message-ID: <483114260.8.1474422451991.JavaMail.jenkins@jenkins-9.freebsd.org> In-Reply-To: <260542609.2.1474408979951.JavaMail.jenkins@jenkins-9.freebsd.org> References: <260542609.2.1474408979951.JavaMail.jenkins@jenkins-9.freebsd.org> Subject: FreeBSD_STABLE_11-arm64 - Build #155 - Fixed MIME-Version: 1.0 X-Jenkins-Job: FreeBSD_STABLE_11-arm64 X-Jenkins-Result: SUCCESS Precedence: bulk Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.23 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 21 Sep 2016 01:47:32 -0000 FreeBSD_STABLE_11-arm64 - Build #155 - Fixed: Build information: https://jenkins.FreeBSD.org/job/FreeBSD_STABLE_11-arm64/155/ Full change log: https://jenkins.FreeBSD.org/job/FreeBSD_STABLE_11-arm64/155/changes Full build log: https://jenkins.FreeBSD.org/job/FreeBSD_STABLE_11-arm64/155/console Change summaries: 306060 by karels: MFC r304713: Fix L2 caching for UDP over IPv6 ip6_output() was missing cache invalidation code analogous to ip_output.c. r304545 disabled L2 caching for UDP/IPv6 as a workaround. This change adds the missing cache invalidation code and reverts r304545.
Reviewed by: gnn Approved by: gnn (mentor) Tested by: peter@, Mike Andrews Differential Revision: https://reviews.freebsd.org/D7591 From owner-freebsd-stable@freebsd.org Wed Sep 21 07:23:22 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id AA391BE36EB for ; Wed, 21 Sep 2016 07:23:22 +0000 (UTC) (envelope-from julien.charbon@gmail.com) Received: from mail-wm0-f66.google.com (mail-wm0-f66.google.com [74.125.82.66]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 40CEEE7D for ; Wed, 21 Sep 2016 07:23:21 +0000 (UTC) (envelope-from julien.charbon@gmail.com) Received: by mail-wm0-f66.google.com with SMTP id b184so6988169wma.3 for ; Wed, 21 Sep 2016 00:23:21 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:subject:to:references:cc:newsgroups:from :message-id:date:user-agent:mime-version:in-reply-to :content-transfer-encoding; bh=kTQRwfUDGuvyfuPXrv9QfiPmsEEr2dgXINcPNUqvExU=; b=i+mGZBEkI2lokL2P+Nb+9aixsXdBX5wfggA51cCF4hc6GrqyHQr6okMuMK/MgXIxX/ iTnYnUUwBlKj1gAysxm2ud2yl4BYRS9dfVs7uhCgAV3YDDDgiJ4lmymvGFSb+tnHALrw 6Xw4/m0W+qr9KbR0yIxGxSmaOV77Ca9VZuxfNMkwHD5cctvKCMdYdwMJb1RPcDFi33rk 4MkbQ030gdC4tRpfBwWZeS3DyHaALpJayE2rgbyzrGUJNLpJNOBiSsiKjHXH1ftk4oJH wWTL5qIKgRLPN4K+ewgZ4jYnSs8LSkoxfo8ZgDDSV5g5/TH8j3NuWSeboKKCO0pJlyqL mO/w== X-Gm-Message-State: AE9vXwNC4tZIB9+e0IOnHMs4KEFE6k+G7KW+rHUIneGTuTc4oYsegS4os4SEBhzGguYi8A== X-Received: by 10.194.223.33 with SMTP id qr1mr5622165wjc.216.1474441887289; Wed, 21 Sep 2016 00:11:27 -0700 (PDT) Received: from [172.20.10.4] (177.227.197.178.dynamic.wless.zhbmb00p-cgnat.res.cust.swisscom.ch. 
[178.197.227.177]) by smtp.gmail.com with ESMTPSA id r9sm32073461wjp.15.2016.09.21.00.11.25 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 21 Sep 2016 00:11:26 -0700 (PDT) Subject: Re: 11.0 stuck on high network load To: Slawa Olhovchenkov References: <20160905014612.GA42393@strugglingcoder.info> <20160914213503.GJ2840@zxy.spb.ru> <20160915085938.GN38409@kib.kiev.ua> <20160915090633.GS2840@zxy.spb.ru> <20160916181839.GC2960@zxy.spb.ru> <20160916183053.GL9397@strugglingcoder.info> <20160916190330.GG2840@zxy.spb.ru> <78cbcdc9-f565-1046-c157-2ddd8fcccc62@freebsd.org> <20160919204328.GN2840@zxy.spb.ru> <8ba75d6e-4f01-895e-0aed-53c6c6692cb9@freebsd.org> <20160920202633.GQ2840@zxy.spb.ru> Cc: Konstantin Belousov , freebsd-stable@FreeBSD.org, hiren panchasara Newsgroups: gmane.os.freebsd.stable From: Julien Charbon Message-ID: Date: Wed, 21 Sep 2016 09:11:24 +0200 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:45.0) Gecko/20100101 Thunderbird/45.2.0 MIME-Version: 1.0 In-Reply-To: <20160920202633.GQ2840@zxy.spb.ru> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 21 Sep 2016 07:23:22 -0000 Hi Slawa, On 9/20/16 10:26 PM, Slawa Olhovchenkov wrote: > On Tue, Sep 20, 2016 at 10:00:25PM +0200, Julien Charbon wrote: >> On 9/19/16 10:43 PM, Slawa Olhovchenkov wrote: >>> On Mon, Sep 19, 2016 at 10:32:13PM +0200, Julien Charbon wrote: >>>> >>>>> @ CPU_CLK_UNHALTED_CORE [4653445 samples] >>>>> >>>>> 51.86% [2413083] lock_delay @ /boot/kernel.VSTREAM/kernel >>>>> 100.0% [2413083] __rw_wlock_hard >>>>> 100.0% [2413083] tcp_tw_2msl_scan >>>>> 99.99% [2412958] pfslowtimo >>>>> 100.0% [2412958] softclock_call_cc >>>>> 100.0% [2412958] softclock >>>>> 100.0% [2412958] 
intr_event_execute_handlers >>>>> 100.0% [2412958] ithread_loop >>>>> 100.0% [2412958] fork_exit >>>>> 00.01% [125] tcp_twstart >>>>> 100.0% [125] tcp_do_segment >>>>> 100.0% [125] tcp_input >>>>> 100.0% [125] ip_input >>>>> 100.0% [125] swi_net >>>>> 100.0% [125] intr_event_execute_handlers >>>>> 100.0% [125] ithread_loop >>>>> 100.0% [125] fork_exit >>>> >>>> The only write lock tcp_tw_2msl_scan() tries to get is a >>>> INP_WLOCK(inp). Thus here, tcp_tw_2msl_scan() seems to be stuck >>>> spinning on INP_WLOCK (or pfslowtimo() is going crazy and calls >>>> tcp_tw_2msl_scan() at high rate but this will be quite unexpected). >>>> >>>> Thus my hypothesis is that something is holding the INP_WLOCK and not >>>> releasing it, and tcp_tw_2msl_scan() is spinning on it. >>>> >>>> If you can, could you compile the kernel with below options: >>>> >>>> options DDB # Support DDB. >>>> options DEADLKRES # Enable the deadlock resolver >>>> options INVARIANTS # Enable calls of extra sanity >>>> checking >>>> options INVARIANT_SUPPORT # Extra sanity checks of internal >>>> structures, required by INVARIANTS >>>> options WITNESS # Enable checks to detect >>>> deadlocks and cycles >>>> options WITNESS_SKIPSPIN # Don't run witness on spinlocks >>>> for speed >>> >>> Currently this host run with 100% CPU load (on all cores), i.e. >>> enabling WITNESS will be significant drop performance. >>> Can I use only some subset of options? >>> >>> Also, I can some troubles to DDB enter in this case. >>> May be kgdb will be success (not tryed yet)? >> >> If these kernel options will certainly slow down your kernel, they also >> might found the root cause of your issue before reaching the point where >> you have 100% cpu load on all cores (thanks to INVARIANTS). I would >> suggest: > > Hmmm, may be I am not clarified. > This host run at peak hours with 100% CPU load as normal operation, > this is for servering 2x10G, this is CPU load not result of lock > issuse, this is not us case. 
And this is because I am fear to enable
> WITNESS -- I am fear drop performance.
>
> This lock issuse happen irregulary and may be caused by other issuse
> (nginx crashed). In this case about 1/3 cores have 100% cpu load,
> perhaps by this lock -- I am can trace only from one core and need
> more then hour for this (may be on other cores different trace, I
> can't guaranted anything).

I see. If you are running in production, WITNESS might indeed not be
practical for you. In that case, before trying WITNESS, I would suggest
the following to get more information:

#0: Do lock profiling:

https://www.freebsd.org/cgi/man.cgi?query=LOCK_PROFILING

options LOCK_PROFILING

Example of usage:

# Run
$ sudo sysctl debug.lock.prof.enable=1
$ sleep 10
$ sudo sysctl debug.lock.prof.enable=0

# Get results
$ sysctl debug.lock.prof.stats | head -2; sysctl debug.lock.prof.stats | sort -n -k 4 -r

You can also use DTrace and lockstat (especially with the lockstat -s
option):

https://wiki.freebsd.org/DTrace/One-Liners#Kernel_Locks
https://www.freebsd.org/cgi/man.cgi?query=lockstat&manpath=FreeBSD+11.0-RELEASE

But I am less familiar with the DTrace/lockstat tools.

>> #1. Try above kernel options at least once, and see what you can get.
>
> OK, I am try this after some time.
>
>> #2. If #1 is a total failure try below patch: It won't solve anything,
>> it just makes tcp_tw_2msl_scan() less greedy when there is contention on
>> the INP write lock. If it makes the debugging more feasible, continue
>> to #3.
>
> OK, thanks.
> What purpose to not skip locked tcptw in this loop?

If I understand your question correctly: According to your pmcstat
result, tcp_tw_2msl_scan() currently struggles with a write lock
(__rw_wlock_hard), and the only write lock used by tcp_tw_2msl_scan() is
INP_WLOCK. There is no sign of contention on TW_RLOCK(V_tw_lock) currently.
51.86% [2413083] lock_delay @ /boot/kernel.VSTREAM/kernel 100.0% [2413083] __rw_wlock_hard 100.0% [2413083] tcp_tw_2msl_scan -- Julien From owner-freebsd-stable@freebsd.org Wed Sep 21 08:31:52 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 13605BE047E for ; Wed, 21 Sep 2016 08:31:52 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id C98E1EA3; Wed, 21 Sep 2016 08:31:51 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from slw by zxy.spb.ru with local (Exim 4.86 (FreeBSD)) (envelope-from ) id 1bmcwW-0005cR-Mk; Wed, 21 Sep 2016 11:31:48 +0300 Date: Wed, 21 Sep 2016 11:31:48 +0300 From: Slawa Olhovchenkov To: Julien Charbon Cc: Konstantin Belousov , freebsd-stable@FreeBSD.org, hiren panchasara Subject: Re: 11.0 stuck on high network load Message-ID: <20160921083148.GU2840@zxy.spb.ru> References: <20160915085938.GN38409@kib.kiev.ua> <20160915090633.GS2840@zxy.spb.ru> <20160916181839.GC2960@zxy.spb.ru> <20160916183053.GL9397@strugglingcoder.info> <20160916190330.GG2840@zxy.spb.ru> <78cbcdc9-f565-1046-c157-2ddd8fcccc62@freebsd.org> <20160919204328.GN2840@zxy.spb.ru> <8ba75d6e-4f01-895e-0aed-53c6c6692cb9@freebsd.org> <20160920202633.GQ2840@zxy.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.24 (2015-08-30) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Wed, 21 Sep 2016 08:31:52 -0000 On Wed, Sep 21, 2016 at 09:11:24AM +0200, Julien Charbon wrote: > > Hi Slawa, > > On 9/20/16 10:26 PM, Slawa Olhovchenkov wrote: > > On Tue, Sep 20, 2016 at 10:00:25PM +0200, Julien Charbon wrote: > >> On 9/19/16 10:43 PM, Slawa Olhovchenkov wrote: > >>> On Mon, Sep 19, 2016 at 10:32:13PM +0200, Julien Charbon wrote: > >>>> > >>>>> @ CPU_CLK_UNHALTED_CORE [4653445 samples] > >>>>> > >>>>> 51.86% [2413083] lock_delay @ /boot/kernel.VSTREAM/kernel > >>>>> 100.0% [2413083] __rw_wlock_hard > >>>>> 100.0% [2413083] tcp_tw_2msl_scan > >>>>> 99.99% [2412958] pfslowtimo > >>>>> 100.0% [2412958] softclock_call_cc > >>>>> 100.0% [2412958] softclock > >>>>> 100.0% [2412958] intr_event_execute_handlers > >>>>> 100.0% [2412958] ithread_loop > >>>>> 100.0% [2412958] fork_exit > >>>>> 00.01% [125] tcp_twstart > >>>>> 100.0% [125] tcp_do_segment > >>>>> 100.0% [125] tcp_input > >>>>> 100.0% [125] ip_input > >>>>> 100.0% [125] swi_net > >>>>> 100.0% [125] intr_event_execute_handlers > >>>>> 100.0% [125] ithread_loop > >>>>> 100.0% [125] fork_exit > >>>> > >>>> The only write lock tcp_tw_2msl_scan() tries to get is a > >>>> INP_WLOCK(inp). Thus here, tcp_tw_2msl_scan() seems to be stuck > >>>> spinning on INP_WLOCK (or pfslowtimo() is going crazy and calls > >>>> tcp_tw_2msl_scan() at high rate but this will be quite unexpected). > >>>> > >>>> Thus my hypothesis is that something is holding the INP_WLOCK and not > >>>> releasing it, and tcp_tw_2msl_scan() is spinning on it. > >>>> > >>>> If you can, could you compile the kernel with below options: > >>>> > >>>> options DDB # Support DDB. 
> >>>> options DEADLKRES # Enable the deadlock resolver > >>>> options INVARIANTS # Enable calls of extra sanity > >>>> checking > >>>> options INVARIANT_SUPPORT # Extra sanity checks of internal > >>>> structures, required by INVARIANTS > >>>> options WITNESS # Enable checks to detect > >>>> deadlocks and cycles > >>>> options WITNESS_SKIPSPIN # Don't run witness on spinlocks > >>>> for speed > >>> > >>> Currently this host run with 100% CPU load (on all cores), i.e. > >>> enabling WITNESS will be significant drop performance. > >>> Can I use only some subset of options? > >>> > >>> Also, I can some troubles to DDB enter in this case. > >>> May be kgdb will be success (not tryed yet)? > >> > >> If these kernel options will certainly slow down your kernel, they also > >> might found the root cause of your issue before reaching the point where > >> you have 100% cpu load on all cores (thanks to INVARIANTS). I would > >> suggest: > > > > Hmmm, may be I am not clarified. > > This host run at peak hours with 100% CPU load as normal operation, > > this is for servering 2x10G, this is CPU load not result of lock > > issuse, this is not us case. And this is because I am fear to enable > > WITNESS -- I am fear drop performance. > > > > This lock issuse happen irregulary and may be caused by other issuse > > (nginx crashed). In this case about 1/3 cores have 100% cpu load, > > perhaps by this lock -- I am can trace only from one core and need > > more then hour for this (may be on other cores different trace, I > > can't guaranted anything). > > I see, especially if you are running in production WITNESS might indeed > be not practical for you. 
In this case, I would suggest before doing
> WITNESS and still get more information to:
>
> #0: Do a lock profiling:
>
> https://www.freebsd.org/cgi/man.cgi?query=LOCK_PROFILING
>
> options LOCK_PROFILING
>
> Example of usage:
>
> # Run
> $ sudo sysctl debug.lock.prof.enable=1
> $ sleep 10
> $ sudo sysctl debug.lock.prof.enable=0
>
> # Get results
> $ sysctl debug.lock.prof.stats | head -2; sysctl debug.lock.prof.stats |
> sort -n -k 4 -r

OK, but if the lock is leaked (why is the inp lock held for so long when
tcp_tw_2msl_scan wants it?), can I still see the cause of the lock by
running these commands after the hang has already happened?

> You can also use Dtrace and lockstat (especially with the lockstat -s
> option):
>
> https://wiki.freebsd.org/DTrace/One-Liners#Kernel_Locks
> https://www.freebsd.org/cgi/man.cgi?query=lockstat&manpath=FreeBSD+11.0-RELEASE
>
> But I am less familiar with Dtrace/lockstat tools.

OK, interesting too. Thanks.

> >> #1. Try above kernel options at least once, and see what you can get.
> >
> > OK, I am try this after some time.
> >
> >> #2. If #1 is a total failure try below patch: It won't solve anything,
> >> it just makes tcp_tw_2msl_scan() less greedy when there is contention on
> >> the INP write lock. If it makes the debugging more feasible, continue
> >> to #3.
> >
> > OK, thanks.
> > What purpose to not skip locked tcptw in this loop?
>
> If I understand your question correctly: According to your pmcstat
> result, tcp_tw_2msl_scan() currently struggles with a write lock
> (__rw_wlock_hard) and the only write lock used tcp_tw_2msl_scan() is
> INP_WLOCK. No sign of contention on TW_RLOCK(V_tw_lock) currently.

As I see in the code, tcp_tw_2msl_scan takes the first node from
V_twq_2msl and has to take the RW lock on its inp, with no alternative.
Can tcp_tw_2msl_scan skip the current node and go to the next node in
the V_twq_2msl list if the current node is locked for some reason?
> 51.86% [2413083] lock_delay @ /boot/kernel.VSTREAM/kernel > 100.0% [2413083] __rw_wlock_hard > 100.0% [2413083] tcp_tw_2msl_scan > > -- > Julien From owner-freebsd-stable@freebsd.org Wed Sep 21 10:57:42 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1377DBE2834 for ; Wed, 21 Sep 2016 10:57:42 +0000 (UTC) (envelope-from jenkins-admin@FreeBSD.org) Received: from jenkins-9.freebsd.org (jenkins-9.freebsd.org [8.8.178.209]) by mx1.freebsd.org (Postfix) with ESMTP id 070B3CC8; Wed, 21 Sep 2016 10:57:42 +0000 (UTC) (envelope-from jenkins-admin@FreeBSD.org) Received: from jenkins-9.freebsd.org (localhost [127.0.0.1]) by jenkins-9.freebsd.org (Postfix) with ESMTP id 1449DAE; Wed, 21 Sep 2016 10:57:41 +0000 (UTC) Date: Wed, 21 Sep 2016 10:57:40 +0000 (GMT) From: jenkins-admin@FreeBSD.org To: jenkins-admin@FreeBSD.org, freebsd-stable@FreeBSD.org Message-ID: <193106415.12.1474455460958.JavaMail.jenkins@jenkins-9.freebsd.org> Subject: Jenkins build became unstable: FreeBSD_stable_10 #403 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Instance-Identity: MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAkKKb2VAfYQKfu1t7qk4nR5qzUBEI+UqT4BPec4qHVhqUy0FFdq50sMH+3y9bCDNOufctov6VqTNffZ3YXArnZK95YF0OX97fh+E9txYOUX1adc+TikcKjuYpHmL5dE62eaZTI+4A5jnRonskQ1PaoIFz0Kbu4mWzkFsmdiXTraGzomXq4cHUCATA2+K4eDYgjXEQI30z3GOMmmZ4t/+6QGk1cMb/BqMWHbn80AsRCb4tU7Hpd72XLDpsuO7YRP1Q0CjmNAuBOTj+sFiiOe6U9HpqOlQN+iFUvBdZo/ybuy5Kh71cAaYQNL68cYdZJ6binH/DkG3KY/fS7DFYAeuwjwIDAQAB X-Jenkins-Job: FreeBSD_stable_10 X-Jenkins-Result: UNSTABLE X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 21 Sep 2016 10:57:42 -0000 See From 
owner-freebsd-stable@freebsd.org Wed Sep 21 17:23:01 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 73231BD6D63 for ; Wed, 21 Sep 2016 17:23:01 +0000 (UTC) (envelope-from shashaness@hotmail.com) Received: from SNT004-OMC1S11.hotmail.com (snt004-omc1s11.hotmail.com [65.55.90.22]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits)) (Client CN "*.outlook.com", Issuer "Microsoft IT SSL SHA2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 30BDAD71 for ; Wed, 21 Sep 2016 17:23:00 +0000 (UTC) (envelope-from shashaness@hotmail.com) Received: from NAM03-CO1-obe.outbound.protection.outlook.com ([65.55.90.8]) by SNT004-OMC1S11.hotmail.com over TLS secured channel with Microsoft SMTPSVC(7.5.7601.23008); Wed, 21 Sep 2016 10:21:54 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hotmail.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version; bh=oX/tvScB5x0LVdbO1QRVlHO/2rPm5FtonoGn6e9NMzw=; b=AMai+RLPvU8qAqVNRlfpzwMTc7fXqITISdM8z5HawYswzTO0ELNnuUHw+l5PMDRxY5ZcgEJ3W+jwI9x4uDMGoRhKk4xjaR6pMtaD9ydeayo5vTFfc4wCZ4ngeCqgY4+fSZvpK+9zHSLmhgrHxFyRkiQ5nW+CWrMNLDVg7GXaH8LnRTNHXw+9iWHCMEwTH6MrhAYY0PHXyxIVYTKkucldfpVFAwfLjiDplVuofuUz6GhltvKOZqU2BnIZRNwHNtfS5HLj8/5HTWDeu6EqJEmzKS4FtC2SKKCZi5ITTcKIq282ntf8Z9/vlswIGrbhb1C+MVR3RZHoa9sAtEn8v8DO4Q== Received: from CO1NAM03FT038.eop-NAM03.prod.protection.outlook.com (10.152.80.53) by CO1NAM03HT152.eop-NAM03.prod.protection.outlook.com (10.152.81.59) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P384) id 15.1.629.5; Wed, 21 Sep 2016 17:21:53 +0000 Received: from CY1PR14MB0520.namprd14.prod.outlook.com (10.152.80.60) by CO1NAM03FT038.mail.protection.outlook.com (10.152.81.212) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P384) id 15.1.629.5 via Frontend 
Transport; Wed, 21 Sep 2016 17:21:53 +0000 Received: from CY1PR14MB0520.namprd14.prod.outlook.com ([10.164.71.150]) by CY1PR14MB0520.namprd14.prod.outlook.com ([10.164.71.150]) with mapi id 15.01.0629.006; Wed, 21 Sep 2016 17:21:53 +0000 From: Shawn Bakhtiar To: "freebsd-stable@freebsd.org" Subject: Problem with nsswitch.conf Thread-Topic: Problem with nsswitch.conf Thread-Index: AQHSFCylTHoBNWaZl0ySmezYDK7bEA== Date: Wed, 21 Sep 2016 17:21:53 +0000 Message-ID: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: authentication-results: spf=softfail (sender IP is 10.152.80.60) smtp.mailfrom=hotmail.com; freebsd.org; dkim=none (message not signed) header.d=none;freebsd.org; dmarc=fail action=none header.from=hotmail.com; received-spf: SoftFail (protection.outlook.com: domain of transitioning hotmail.com discourages use of 10.152.80.60 as permitted sender) x-incomingtopheadermarker: OriginalChecksum:42C400D6D49F0086747804B98970FDA028AB7827863DFFD0E3C5D93EBFF678DC; UpperCasedChecksum:BAF3AB976BCFC9978C232FFDD0637D83BB94F20D6D6EBD9132D536A40027C1B3; SizeAsReceived:6877; Count:33 x-ms-exchange-messagesentrepresentingtype: 1 x-incomingheadercount: 33 x-eopattributedmessage: 0 x-microsoft-exchange-diagnostics: 1; CO1NAM03HT152; 6:N0Y8hv74+/NZkCRC/1lkn3taTmjiollBKG5t3GjweBNoDCwJvFCbhRihnb2WBJJGRQmGI50vkioNVBVDFmvVSdsQPNWwUdul8zSz1mjsIZDsacSrXxagGKnWoHzLZMRBQFdGdcfliIJtkIe3+oyc8Ty5ifCPwq2HSvtDKChoDpE/QfHJC1fl0Sc9VXOSkaB1Hywg0awYD2Xn+qPQh7X+m0jZaBOBB5zO431mzi5mLflo+mURgjsInKsW5GqGl5GaR6qDazoxDTUfBtQWAq84Ywme/x3MGyba8n+BUYlhS9M=; 5:JS1VAYcrcYxFBqoZrQ1NZttyxpeCwzFaNQLRWjO/3zvq+t5REfIIClEAgiAvSHH7luSfKGTjPbTZZemV1gAJYNR0jWKCEaIdfimM7ri/ku5PX8xCW5I/h7NPpM05eLYhhsB69nzFsn5IVM4qprIsww==; 24:7VXdOf5Hnd8eg3O7RjmYoTzKC42SAqFvhuF2Xbv5VndYrY69LJ8dxSqgz5Qx14yHIHCAaCwUaF4SQjTTvGJD5S7tPPICuOsVTDcaAoXmn4M=; 
7:f6HCsP0L3tyOYW273NQBDSrr0rJqKw7kk0cpjvjzJnPM/Fmdtcu/i4IjYJPMtPh9PhFxXu6IwG381KZAe3QtiTRJP+v46BgVpPd9dcVyMFRGuhr1QETQnTWibvoM6ZgU2bUqfp5Yk766gJLcpDlmIRWS3OhFkWbLL9oDSKhqdiNmOqtqnnL4WZhWEW7fn0JJdkHisoXZQv7Tq8Few2abJX/mza4KVRhx97JauyBWmBm+HjYKPa3LGiSt5tLoCfkbMVe9EPYeHObZU4X8PbmMKFPXOUGW9cDYvSvfTtuhuu2+Zm1v9F/LA7stEl4gPZi9 x-forefront-antispam-report: EFV:NLI; SFV:NSPM; SFS:(10019020)(98900003); DIR:OUT; SFP:1102; SCL:1; SRVR:CO1NAM03HT152; H:CY1PR14MB0520.namprd14.prod.outlook.com; FPR:; SPF:None; LANG:en; x-ms-office365-filtering-correlation-id: 39026188-1076-43c1-7f32-08d3e243c727 x-microsoft-antispam: UriScan:; BCL:0; PCL:0; RULEID:(1601124038)(1603103081)(1601125047); SRVR:CO1NAM03HT152; x-exchange-antispam-report-cfa-test: BCL:0; PCL:0; RULEID:(432015012)(82015046); SRVR:CO1NAM03HT152; BCL:0; PCL:0; RULEID:; SRVR:CO1NAM03HT152; x-forefront-prvs: 007271867D spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM Content-Type: text/plain; charset="us-ascii" Content-ID: <59C586FCA46FA64EB6128F6673B01324@namprd14.prod.outlook.com> Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: hotmail.com X-MS-Exchange-CrossTenant-originalarrivaltime: 21 Sep 2016 17:21:53.3582 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Internet X-MS-Exchange-CrossTenant-id: 84df9e7f-e9f6-40af-b435-aaaaaaaaaaaa X-MS-Exchange-Transport-CrossTenantHeadersStamped: CO1NAM03HT152 X-OriginalArrivalTime: 21 Sep 2016 17:21:54.0896 (UTC) FILETIME=[A6565D00:01D2142C] X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 21 Sep 2016 17:23:01 -0000 Good morning All, I'm trying to configure my server as an LDAP client. I installed the nslcd service and it's working great.
My problem is when I issue the command getent passwd it only returns the LDAP user not the local users.

#
# nsswitch.conf(5) - name service switch configuration file
# $FreeBSD: releng/10.2/etc/nsswitch.conf 224765 2011-08-10 20:52:02Z dougb $
#
group: file ldap
group_compat: nis ldap
hosts: files dns
networks: files
passwd: file ldap
passwd_compat: nis ldap
shells: files
services: files
services_compat: nis
protocols: files
rpc: files

When I change the above group and passwd setting back to compat (which was the default configuration) I get the local users but none of the ldap users show up. In fact nslcd is not even called (i've checked by running it in debug mode). So how do I configure nsswitch to use both the local /etc/passwd file and the ldap. I need this because without it services will not start. IE nslcd complains that nslcd is not a valid user when using the above configuration.

Any help would greatly be appreciated,
Shawn

From owner-freebsd-stable@freebsd.org Wed Sep 21 17:28:11 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A3F25BE226B for ; Wed, 21 Sep 2016 17:28:11 +0000 (UTC) (envelope-from vangyzen@FreeBSD.org) Received: from smtp.vangyzen.net (hotblack.vangyzen.net [IPv6:2607:fc50:1000:7400:216:3eff:fe72:314f]) by mx1.freebsd.org (Postfix) with ESMTP id 8F48314D for ; Wed, 21 Sep 2016 17:28:11 +0000 (UTC) (envelope-from vangyzen@FreeBSD.org) Received: from sweettea.beer.town (unknown [76.164.8.130]) by smtp.vangyzen.net (Postfix) with ESMTPSA id E464256495; Wed, 21 Sep 2016 12:28:10 -0500 (CDT) Subject: Re: Problem with nsswitch.conf To: Shawn Bakhtiar , "freebsd-stable@freebsd.org" References: From: Eric van Gyzen Message-ID: Date: Wed, 21 Sep 2016 12:28:10 -0500 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:45.0) Gecko/20100101 Thunderbird/45.3.0 MIME-Version: 1.0
In-Reply-To: Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 21 Sep 2016 17:28:11 -0000 On 09/21/2016 12:21, Shawn Bakhtiar wrote: > Good morning All, > > I'm trying to configure my server as an LDAP client. I installed the nslcd service and it's working great. > > My problem is when I issue the command getent passwd it only returns the LDAP user not the local users. > > # > # nsswitch.conf(5) - name service switch configuration file > # $FreeBSD: releng/10.2/etc/nsswitch.conf 224765 2011-08-10 20:52:02Z dougb $ > # > group: file ldap > group_compat: nis ldap > hosts: files dns > networks: files > passwd: file ldap > passwd_compat: nis ldap > shells: files > services: files > services_compat: nis > protocols: files > rpc: files > > > When I change the above group and passwd setting back to compat (which was the default configuration) I get the local users but none of the ldap users show up. In fact nslcd is not even called (i've checked by running it in debug mode). So how do I configure nsswitch to use both the local /etc/passwd file and the ldap. I need this because without it services will not start. IE nslcd complains that nslcd is not a valid user when using the above configuration. It should be "files", plural. 
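
A minimal sketch of the two affected lines after that change, assuming the
rest of the file stays exactly as quoted above:

```
group: files ldap
passwd: files ldap
```

With this in place, getent passwd should list the local accounts as well as
the LDAP ones.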
Eric From owner-freebsd-stable@freebsd.org Wed Sep 21 17:31:26 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D6E6CBE267F for ; Wed, 21 Sep 2016 17:31:26 +0000 (UTC) (envelope-from shashaness@hotmail.com) Received: from BLU004-OMC4S5.hotmail.com (blu004-omc4s5.hotmail.com [65.55.111.144]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits)) (Client CN "*.outlook.com", Issuer "Microsoft IT SSL SHA2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 87AC968D for ; Wed, 21 Sep 2016 17:31:26 +0000 (UTC) (envelope-from shashaness@hotmail.com) Received: from NAM02-SN1-obe.outbound.protection.outlook.com ([65.55.111.137]) by BLU004-OMC4S5.hotmail.com over TLS secured channel with Microsoft SMTPSVC(7.5.7601.23008); Wed, 21 Sep 2016 10:30:18 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hotmail.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version; bh=LYg0XJ15mgwcAwEElOgR9Hn+LInlTwSWTwC4TZya+I4=; b=bX9Pfa2Ui/kqrZjL+CFV0oY2a4qgOQLha3RJWpO/oNFXvfJCc8MatRqQa0Ss9RzbNdTbq08hXpdkn+o9Vn28gt7J7U3kko2mYoqjMkuB54UmlCqFBxPBMYC9oQbIB47+WjL30G6yV7hbOhKaGxr/n4oWP+yZ++8re/+NmiF9BLv51Z7mDL/wacuRZR5sWP+MFD884cfJ1NRyTtfzgaSF/SNo716iS1OLGUZ1k/SH+D53Jfwdd+Jth7H8z50Qtn41f5qeDE5zLb6jwRNygxaDNrTtIATMnYCnXKrnnv8WMc95WFFkrI2tuG4sOkA6PkCmFNmktHfNW3eFwpuq4CM21A== Received: from BL2NAM02FT005.eop-nam02.prod.protection.outlook.com (10.152.76.54) by BL2NAM02HT174.eop-nam02.prod.protection.outlook.com (10.152.77.10) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P384) id 15.1.629.5; Wed, 21 Sep 2016 17:30:14 +0000 Received: from CY1PR14MB0520.namprd14.prod.outlook.com (10.152.76.56) by BL2NAM02FT005.mail.protection.outlook.com (10.152.76.252) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P384) id 15.1.629.5 
via Frontend Transport; Wed, 21 Sep 2016 17:30:14 +0000 Received: from CY1PR14MB0520.namprd14.prod.outlook.com ([10.164.71.150]) by CY1PR14MB0520.namprd14.prod.outlook.com ([10.164.71.150]) with mapi id 15.01.0629.006; Wed, 21 Sep 2016 17:30:13 +0000 From: Shawn Bakhtiar To: "freebsd-stable@freebsd.org" Subject: Re: Problem with nsswitch.conf Thread-Topic: Problem with nsswitch.conf Thread-Index: AQHSFCylTHoBNWaZl0ySmezYDK7bEKCEMrIAgAAAjoA= Date: Wed, 21 Sep 2016 17:30:13 +0000 Message-ID: References: In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: authentication-results: spf=softfail (sender IP is 10.152.76.56) smtp.mailfrom=hotmail.com; freebsd.org; dkim=none (message not signed) header.d=none;freebsd.org; dmarc=fail action=none header.from=hotmail.com; received-spf: SoftFail (protection.outlook.com: domain of transitioning hotmail.com discourages use of 10.152.76.56 as permitted sender) x-ms-exchange-messagesentrepresentingtype: 1 x-eopattributedmessage: 0 x-microsoft-exchange-diagnostics: 1; BL2NAM02HT174; 6:zJSge8CqdtBWqntJPCabh0RyQ2ZGUhl/bLhcTBSC5VTuZeIIGFKCokQR+qeDYg23qj+5CrR11gwDfet6oHrYDHi+uk+5kWyarzRPgtJ/JBRUFrzPjO+VNYYJlVbWnaAOXyLNbmYUz1sztDsOa1YuhC+enSyJop0AiE3OTVw8MexF6ugSaASc9xrAg/01z/iFLf6rnkkbk425EtPDRD+6k6BpYkDYcQ/mBvO92y1sCrDVhcoPPmHhrN8mQrSWRGVseqfPLmEF4+24jD/iA7wu+Ut4t7sSXm6aOuk/Jlb2tn0=; 5:XGrR6uG8QTA24qyFBw0hl6M+T13idU6+e0M901BN75w6D6Ee9z2oFS5cZ8YR1LYslF9qiIMx5Wh8OPaKNw5xZ0wCyCKA314GTmfZpXVJxDe/nFXiQkNGru0f4YU1O1iQuzAluvUvlx+FvPD6ICoQ2w==; 24:5T7NYPMbrqzZmIbIBI+gGV/v1cFH8Yeqo+QrvPGb/fc2B+sE9pfWIlxtpwz4WUqcdy7tcHLY3uqy24e/jsm59Yaj3AU7ziRY/cWYyVWZ0Lg=; 7:encGKLlEKBO8m32e/OgPMf+7pQQRsqTU+l+vdaw8nJRLGH/72lwtV1EuSrM9xmwoH9PZ633k5Xfqs7qqNvuoIyTUyXRZGW+/X28i7YnfVvhCelvkIE8aXBOfP2uFwOYkmVYn3Vm1FIKYLAwlM5IZ5qiOJnJyggjMOi0dChMdWoA+qeUMVgDhZmGX0h8KRnFzP0qlzVPp2sZb0Okd3THb80426V6yTCW9a3xn0U06XtkeAI99CCTZG3TCMzuWMgLZRPVj8nTFvTCyvfa9a7OTPi+ijRA6lAXpGst9hAbiBMUH5+hCIL5SQncQ3xH6Pz/s 
x-forefront-antispam-report: EFV:NLI; SFV:NSPM; SFS:(10019020)(98900003); DIR:OUT; SFP:1102; SCL:1; SRVR:BL2NAM02HT174; H:CY1PR14MB0520.namprd14.prod.outlook.com; FPR:; SPF:None; LANG:en; x-ms-office365-filtering-correlation-id: 20d950ab-4e29-4fb6-7e37-08d3e244f150 x-microsoft-antispam: UriScan:; BCL:0; PCL:0; RULEID:(1601124038)(1603103081)(1601125047); SRVR:BL2NAM02HT174; x-exchange-antispam-report-cfa-test: BCL:0; PCL:0; RULEID:(432015012)(82015046); SRVR:BL2NAM02HT174; BCL:0; PCL:0; RULEID:; SRVR:BL2NAM02HT174; x-forefront-prvs: 007271867D spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM Content-Type: text/plain; charset="Windows-1252" Content-ID: <6D3763515CB0FB479C77AF2ED96A1D06@namprd14.prod.outlook.com> Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: hotmail.com X-MS-Exchange-CrossTenant-originalarrivaltime: 21 Sep 2016 17:30:13.4718 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Internet X-MS-Exchange-CrossTenant-id: 84df9e7f-e9f6-40af-b435-aaaaaaaaaaaa X-MS-Exchange-Transport-CrossTenantHeadersStamped: BL2NAM02HT174 X-OriginalArrivalTime: 21 Sep 2016 17:30:18.0778 (UTC) FILETIME=[D2ACA7A0:01D2142D] X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 21 Sep 2016 17:31:26 -0000

Oh Jesus!!! Thank you.. that worked.. obviously...

> On Sep 21, 2016, at 10:28 AM, Eric van Gyzen wrote:
>
> On 09/21/2016 12:21, Shawn Bakhtiar wrote:
>> Good morning All,
>>
>> I'm trying to configure my server as an LDAP client. I installed the nslcd service and it's working great.
>>
>> My problem is when I issue the command getent passwd it only returns the LDAP user not the local users.
>>
>> #
>> # nsswitch.conf(5) - name service switch configuration file
>> # $FreeBSD: releng/10.2/etc/nsswitch.conf 224765 2011-08-10 20:52:02Z dougb $
>> #
>> group: file ldap
>> group_compat: nis ldap
>> hosts: files dns
>> networks: files
>> passwd: file ldap
>> passwd_compat: nis ldap
>> shells: files
>> services: files
>> services_compat: nis
>> protocols: files
>> rpc: files
>>
>>
>> When I change the above group and passwd setting back to compat (which was the default configuration) I get the local users but none of the ldap users show up. In fact nslcd is not even called (i've checked by running it in debug mode). So how do I configure nsswitch to use both the local /etc/passwd file and the ldap. I need this because without it services will not start. IE nslcd complains that nslcd is not a valid user when using the above configuration.
>
> It should be "files", plural.
>
> Eric

From owner-freebsd-stable@freebsd.org Wed Sep 21 19:51:59 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 17C29BE4696 for ; Wed, 21 Sep 2016 19:51:59 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id CC3A1DC3; Wed, 21 Sep 2016 19:51:58 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from slw by zxy.spb.ru with local (Exim 4.86 (FreeBSD)) (envelope-from ) id 1bmnYh-000Mcq-Mz; Wed, 21 Sep 2016 22:51:55 +0300 Date: Wed, 21 Sep 2016 22:51:55 +0300 From: Slawa Olhovchenkov To: Julien Charbon Cc: Konstantin Belousov , freebsd-stable@FreeBSD.org, hiren panchasara Subject: Re: 11.0 stuck on high network load Message-ID: <20160921195155.GW2840@zxy.spb.ru> References: <20160915085938.GN38409@kib.kiev.ua> <20160915090633.GS2840@zxy.spb.ru> <20160916181839.GC2960@zxy.spb.ru> <20160916183053.GL9397@strugglingcoder.info> <20160916190330.GG2840@zxy.spb.ru> <78cbcdc9-f565-1046-c157-2ddd8fcccc62@freebsd.org> <20160919204328.GN2840@zxy.spb.ru> <8ba75d6e-4f01-895e-0aed-53c6c6692cb9@freebsd.org> <20160920202633.GQ2840@zxy.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.24 (2015-08-30) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 21 Sep 2016 19:51:59 -0000

On Wed, Sep 21, 2016 at 09:11:24AM +0200, Julien Charbon wrote:
>
>
You can also use Dtrace and lockstat (especially with the lockstat -s
> option):
>
> https://wiki.freebsd.org/DTrace/One-Liners#Kernel_Locks
> https://www.freebsd.org/cgi/man.cgi?query=lockstat&manpath=FreeBSD+11.0-RELEASE
>
> But I am less familiar with Dtrace/lockstat tools.

I am still using the old kernel and got the lockdown again.
I tried lockstat (I saved more output); the interesting part may be the following:

R/W writer spin on writer: 190019 events in 1.070 seconds (177571 events/sec)

-------------------------------------------------------------------------------
 Count indv cuml rcnt     nsec Lock                   Caller
140839  74%  74% 0.00    24659 tcpinp                 tcp_tw_2msl_scan+0xc6

      nsec ------ Time Distribution ------ count     Stack
      4096 |                               913       tcp_twstart+0xa3
      8192 |@@@@@@@@@@@@                   58191     tcp_do_segment+0x201f
     16384 |@@@@@@                         29594     tcp_input+0xe1c
     32768 |@@@@                           23447     ip_input+0x15f
     65536 |@@@                            16197
    131072 |@                              8674
    262144 |                               3358
    524288 |                               456
   1048576 |                               9
-------------------------------------------------------------------------------
 Count indv cuml rcnt     nsec Lock                   Caller
 49180  26% 100% 0.00    15929 tcpinp                 tcp_tw_2msl_scan+0xc6

      nsec ------ Time Distribution ------ count     Stack
      4096 |                               157       pfslowtimo+0x54
      8192 |@@@@@@@@@@@@@@@                24796     softclock_call_cc+0x179
     16384 |@@@@@@                         11223     softclock+0x44
     32768 |@@@@                           7426      intr_event_execute_handlers+0x95
     65536 |@@                             3918
    131072 |                               1363
    262144 |                               278
    524288 |                               19
-------------------------------------------------------------------------------

> >> #1. Try above kernel options at least once, and see what you can get.
> >
> > OK, I am try this after some time.
> >
> >> #2. If #1 is a total failure try below patch: It won't solve anything,
> >> it just makes tcp_tw_2msl_scan() less greedy when there is contention on
> >> the INP write lock. If it makes the debugging more feasible, continue
> >> to #3.
> >
> > OK, thanks.
> > What purpose to not skip locked tcptw in this loop?
> > If I understand your question correctly: According to your pmcstat > result, tcp_tw_2msl_scan() currently struggles with a write lock > (__rw_wlock_hard) and the only write lock used tcp_tw_2msl_scan() is > INP_WLOCK. No sign of contention on TW_RLOCK(V_tw_lock) currently. > > 51.86% [2413083] lock_delay @ /boot/kernel.VSTREAM/kernel > 100.0% [2413083] __rw_wlock_hard > 100.0% [2413083] tcp_tw_2msl_scan > > -- > Julien From owner-freebsd-stable@freebsd.org Wed Sep 21 19:55:08 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8D073BE4B47 for ; Wed, 21 Sep 2016 19:55:08 +0000 (UTC) (envelope-from jenkins-admin@FreeBSD.org) Received: from jenkins-9.freebsd.org (jenkins-9.freebsd.org [8.8.178.209]) by mx1.freebsd.org (Postfix) with ESMTP id 7F3957F6; Wed, 21 Sep 2016 19:55:08 +0000 (UTC) (envelope-from jenkins-admin@FreeBSD.org) Received: from jenkins-9.freebsd.org (localhost [127.0.0.1]) by jenkins-9.freebsd.org (Postfix) with ESMTP id 5A350BA; Wed, 21 Sep 2016 19:55:08 +0000 (UTC) Date: Wed, 21 Sep 2016 19:55:07 +0000 (GMT) From: jenkins-admin@FreeBSD.org To: jenkins-admin@FreeBSD.org, freebsd-stable@FreeBSD.org Message-ID: <2026074152.17.1474487707926.JavaMail.jenkins@jenkins-9.freebsd.org> In-Reply-To: <193106415.12.1474455460958.JavaMail.jenkins@jenkins-9.freebsd.org> References: <193106415.12.1474455460958.JavaMail.jenkins@jenkins-9.freebsd.org> Subject: Jenkins build is back to stable : FreeBSD_stable_10 #404 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Instance-Identity: 
MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAkKKb2VAfYQKfu1t7qk4nR5qzUBEI+UqT4BPec4qHVhqUy0FFdq50sMH+3y9bCDNOufctov6VqTNffZ3YXArnZK95YF0OX97fh+E9txYOUX1adc+TikcKjuYpHmL5dE62eaZTI+4A5jnRonskQ1PaoIFz0Kbu4mWzkFsmdiXTraGzomXq4cHUCATA2+K4eDYgjXEQI30z3GOMmmZ4t/+6QGk1cMb/BqMWHbn80AsRCb4tU7Hpd72XLDpsuO7YRP1Q0CjmNAuBOTj+sFiiOe6U9HpqOlQN+iFUvBdZo/ybuy5Kh71cAaYQNL68cYdZJ6binH/DkG3KY/fS7DFYAeuwjwIDAQAB X-Jenkins-Job: FreeBSD_stable_10 X-Jenkins-Result: SUCCESS X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 21 Sep 2016 19:55:08 -0000 See From owner-freebsd-stable@freebsd.org Wed Sep 21 21:25:28 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E1543BE37FD for ; Wed, 21 Sep 2016 21:25:28 +0000 (UTC) (envelope-from julien.charbon@gmail.com) Received: from mail-wm0-x232.google.com (mail-wm0-x232.google.com [IPv6:2a00:1450:400c:c09::232]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 6642B86C for ; Wed, 21 Sep 2016 21:25:28 +0000 (UTC) (envelope-from julien.charbon@gmail.com) Received: by mail-wm0-x232.google.com with SMTP id l132so292168327wmf.0 for ; Wed, 21 Sep 2016 14:25:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=subject:to:references:cc:from:message-id:date:user-agent :mime-version:in-reply-to; bh=k5CLCtVpIkmqkgjvo2JIE8j8wgkatI9oCXcqIs/iwEA=; b=qTR8atZmRpnJMcGNfW6bYHIc1IfduguNQ08QwS0/ARK8yxcL6Tr7pTnbMYB5D8RzMl yNc8CauKu/EBSnPlv3pdKGj54DDlcTMfxZlhgtajHgPwWUDI2DAF0zBEd0ePU5qpEN4s 
frNSlrln/n4YeRteqb63vN4Wksx9miFLerwAVKq+fVr6WR6wy+ZE0sf3Y+nhcs+P+PbB /ZBR5cqWXqe4zy1SEnWkMY4IdTyOsOdJkhiReoGJ/9nt99M8oNf81mD0BSg5iJ3o0pFY C4ShfC+jFcwRQVFdzSeycDoX+rS6apA0EoYeP9qOJVijZW2X37Ahe9ETlL7z3XidvwAH NODA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:subject:to:references:cc:from:message-id:date :user-agent:mime-version:in-reply-to; bh=k5CLCtVpIkmqkgjvo2JIE8j8wgkatI9oCXcqIs/iwEA=; b=O2dmMMwwUTBvv0h4cqGZaiYLTgKF770O1UG+oylyJJpugGm9A8hG5kRWlzIbKQH76Z PhW6cg9eUn2dvEehLkklafHZG7yphXWF++od1gCtzr0g93fcSdAyHaB16bS6/rgGQefU rWexNkAZZ4ZbkYkgT/Ko1emDGbIM9eg139/dsFVhscCHHK4nvBfEfwZ3N6k3fowLGbUv pKHCK0jb7B5UQ5s/PSnlwMbahQzUHQ3sfORu6VR7XlYGsslTDRnQ9EL0exHnjZbGFeEb qw+JAFK+tvVfLX71wRfBpKk8QK3kU0rcxocaFvYkGuij3tDR3T+d7cozdArNYWQIKENC q0dQ== X-Gm-Message-State: AE9vXwOquAmtPIOfgZHBVNjTcSy0XmrMK+87CF72RSLiP/J4G9bbjYdGVyY47BYXJ5qGZw== X-Received: by 10.28.66.6 with SMTP id p6mr4737960wma.59.1474493126961; Wed, 21 Sep 2016 14:25:26 -0700 (PDT) Received: from [192.168.0.14] (217-162-163-184.dynamic.hispeed.ch. 
[217.162.163.184]) by smtp.gmail.com with ESMTPSA id xb6sm35418168wjb.30.2016.09.21.14.25.25 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 21 Sep 2016 14:25:26 -0700 (PDT) Subject: Re: 11.0 stuck on high network load To: Slawa Olhovchenkov References: <20160915085938.GN38409@kib.kiev.ua> <20160915090633.GS2840@zxy.spb.ru> <20160916181839.GC2960@zxy.spb.ru> <20160916183053.GL9397@strugglingcoder.info> <20160916190330.GG2840@zxy.spb.ru> <78cbcdc9-f565-1046-c157-2ddd8fcccc62@freebsd.org> <20160919204328.GN2840@zxy.spb.ru> <8ba75d6e-4f01-895e-0aed-53c6c6692cb9@freebsd.org> <20160920202633.GQ2840@zxy.spb.ru> <20160921195155.GW2840@zxy.spb.ru> Cc: Konstantin Belousov , freebsd-stable@FreeBSD.org, hiren panchasara From: Julien Charbon Message-ID: Date: Wed, 21 Sep 2016 23:25:18 +0200 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:45.0) Gecko/20100101 Thunderbird/45.2.0 MIME-Version: 1.0 In-Reply-To: <20160921195155.GW2840@zxy.spb.ru> Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="PMIcB2RuhcdqOnqGRO71Tkne4QgWf9AS6" X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 21 Sep 2016 21:25:29 -0000 This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --PMIcB2RuhcdqOnqGRO71Tkne4QgWf9AS6 Content-Type: multipart/mixed; boundary="31rbt2q54Ws8RPnw0ruU6la6KWx342Bne"; protected-headers="v1" From: Julien Charbon To: Slawa Olhovchenkov Cc: Konstantin Belousov , freebsd-stable@FreeBSD.org, hiren panchasara Message-ID: Subject: Re: 11.0 stuck on high network load References: <20160915085938.GN38409@kib.kiev.ua> <20160915090633.GS2840@zxy.spb.ru> <20160916181839.GC2960@zxy.spb.ru> <20160916183053.GL9397@strugglingcoder.info> <20160916190330.GG2840@zxy.spb.ru> <78cbcdc9-f565-1046-c157-2ddd8fcccc62@freebsd.org> 
<20160919204328.GN2840@zxy.spb.ru> <8ba75d6e-4f01-895e-0aed-53c6c6692cb9@freebsd.org> <20160920202633.GQ2840@zxy.spb.ru> <20160921195155.GW2840@zxy.spb.ru> In-Reply-To: <20160921195155.GW2840@zxy.spb.ru> --31rbt2q54Ws8RPnw0ruU6la6KWx342Bne Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable

 Hi Slawa,

On 9/21/16 9:51 PM, Slawa Olhovchenkov wrote:
> On Wed, Sep 21, 2016 at 09:11:24AM +0200, Julien Charbon wrote:
>> You can also use Dtrace and lockstat (especially with the lockstat -s
>> option):
>>
>> https://wiki.freebsd.org/DTrace/One-Liners#Kernel_Locks
>> https://www.freebsd.org/cgi/man.cgi?query=lockstat&manpath=FreeBSD+11.0-RELEASE
>>
>> But I am less familiar with Dtrace/lockstat tools.
>
> I am still using the old kernel and got the lockdown again.
> I tried lockstat (I saved more output); the interesting part may be the following:
>
> R/W writer spin on writer: 190019 events in 1.070 seconds (177571 events/sec)
>
> -------------------------------------------------------------------------------
>  Count indv cuml rcnt     nsec Lock                   Caller
> 140839  74%  74% 0.00    24659 tcpinp                 tcp_tw_2msl_scan+0xc6
>
>       nsec ------ Time Distribution ------ count     Stack
>       4096 |                               913       tcp_twstart+0xa3
>       8192 |@@@@@@@@@@@@                   58191     tcp_do_segment+0x201f
>      16384 |@@@@@@                         29594     tcp_input+0xe1c
>      32768 |@@@@                           23447     ip_input+0x15f
>      65536 |@@@                            16197
>     131072 |@                              8674
>     262144 |                               3358
>     524288 |                               456
>    1048576 |                               9
> -------------------------------------------------------------------------------
>  Count indv cuml rcnt     nsec Lock                   Caller
>  49180  26% 100% 0.00    15929 tcpinp                 tcp_tw_2msl_scan+0xc6
>
>       nsec ------ Time Distribution ------ count     Stack
>       4096 |                               157       pfslowtimo+0x54
>       8192 |@@@@@@@@@@@@@@@                24796     softclock_call_cc+0x179
>      16384 |@@@@@@                         11223     softclock+0x44
>      32768 |@@@@                           7426      intr_event_execute_handlers+0x95
>      65536 |@@                             3918
>     131072 |                               1363
>     262144 |                               278
>     524288 |                               19
> -------------------------------------------------------------------------------

 This is interesting: it seems that you have two call paths competing
for INP locks here:

 - pfslowtimo()/tcp_tw_2msl_scan(reuse=0) and

 - tcp_input()/tcp_twstart()/tcp_tw_2msl_scan(reuse=1)

 These paths can indeed compete for the same INP lock, as both
tcp_tw_2msl_scan() calls always start with the first inp found in the
twq_2msl list. But in both cases, this first inp should be used quickly
and its lock released anyway, so what could explain your situation is
that the TCP stack is doing that all the time, for example:

 - Let's say that you are completely and constantly running out of
tcptw, and then all connections transitioning to the TIME_WAIT state are
competing with the TIME_WAIT timeout scan that tries to free all the
expired tcptw. If the stack is doing that all the time, it can appear
"live" locked.

 This is just a hypothesis and as usual might be a red herring.
Anyway, could you run:

$ vmstat -z | head -2; vmstat -z | grep -E 'tcp|sock'

 Ideally, once when everything is OK, and once when you have the issue,
to see the differences (if any).

 If it appears you are quite low on tcptw, and if you have enough
memory, could you try increasing the tcptw limit using sysctl
net.inet.tcp.maxtcptw? And then see whether it improves your
performance (or not).

 My 2 cents.
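
 A minimal sketch of how the two vmstat -z snapshots asked for above could be compared, assuming the usual "ITEM SIZE LIMIT USED FREE REQ FAIL SLEEP" header and comma-separated rows. The helper names parse_vmstat_z and near_limit are hypothetical, and the numbers in SAMPLE are invented for illustration; they are not output from the machine discussed in this thread.

```python
def parse_vmstat_z(text):
    """Parse `vmstat -z` style text into {zone_name: {column: int}}."""
    columns = None
    zones = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if columns is None and line.startswith("ITEM"):
            # Header row is whitespace separated; drop the ITEM column name.
            columns = line.split()[1:]
            continue
        # Data rows look like "name: v1, v2, ..."; split on the last colon.
        name, sep, rest = line.rpartition(":")
        if not sep or columns is None:
            continue
        values = [v.strip() for v in rest.split(",")]
        if len(values) != len(columns) or not all(v.isdigit() for v in values):
            continue  # skip rows that do not match the header layout
        zones[name.strip()] = dict(zip(columns, map(int, values)))
    return zones


def near_limit(zones, threshold=0.9):
    """Return sorted names of zones with USED >= threshold * LIMIT.

    A LIMIT of 0 is assumed to mean "no limit" and is skipped.
    """
    hot = []
    for name, z in sorted(zones.items()):
        limit = z.get("LIMIT", 0)
        if limit > 0 and z.get("USED", 0) >= threshold * limit:
            hot.append(name)
    return hot


# Invented sample in the assumed layout, for illustration only.
SAMPLE = """\
ITEM                   SIZE   LIMIT     USED     FREE      REQ FAIL SLEEP

tcptw:                   88,  27810,   27790,      20, 9000000, 120,   0
tcpcb:                 1032, 524288,   41000,     300, 9500000,   0,   0
socket:                 728, 524288,   43000,     250, 9900000,   0,   0
"""

if __name__ == "__main__":
    zones = parse_vmstat_z(SAMPLE)
    for name in near_limit(zones):
        z = zones[name]
        print(f"{name}: {z['USED']}/{z['LIMIT']} used, {z['FAIL']} failures")
```

 Feeding it one snapshot taken while the box is healthy and one taken during the stall would show whether tcptw only reaches its limit while the problem occurs, which is the case the hypothesis above depends on.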
-- Julien --31rbt2q54Ws8RPnw0ruU6la6KWx342Bne-- --PMIcB2RuhcdqOnqGRO71Tkne4QgWf9AS6 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Comment: GPGTools - https://gpgtools.org iQEcBAEBCgAGBQJX4vrEAAoJEKVlQ5Je6dhxDDAH/1ZVvUOHAOrR83zbYHmPgaGZ NgNAj8jsXSf8Q37bnl/m4QwF25Dg9srpEv8VXmgKFO7MlpoIvNcf9I/exILbAJTO TnHB48CKL82SvXtVdb7DBmRC51fOZZJCZ5FQwjAuJSz9GOxbrxrARjWXH+jrNEx6 GjA/4CfcnaM7Uq5Y7jsAnxjYyBDdBdaTPqc03asUySKtdEEPHa9bonK/YTsHElik W8V4OIsarjeLrF6RLaxW9oYTTIWLsDEG8Hvw/M/PEFvxggHYNXfOx78L24PW/twF cXLoh0+U1UOUR4dob+fprdRARbDcTm53VS16DSSn/0IZ4jLcbPeKt0Ng/UgJh0g= =aRb8 -----END PGP SIGNATURE----- --PMIcB2RuhcdqOnqGRO71Tkne4QgWf9AS6-- From owner-freebsd-stable@freebsd.org Thu Sep 22 02:47:25 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id AB996BE42AB for ; Thu, 22 Sep 2016 02:47:25 +0000 (UTC) (envelope-from chio1990@gmail.com) Received: from mail-pa0-x244.google.com (mail-pa0-x244.google.com [IPv6:2607:f8b0:400e:c03::244]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 7DB8FAF1 for ; Thu, 22 Sep 2016 02:47:25 +0000 (UTC) (envelope-from chio1990@gmail.com) Received: by mail-pa0-x244.google.com with SMTP id my20so3010359pab.3 for ; Wed, 21 Sep 2016 19:47:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=to:from:subject:message-id:date:user-agent:mime-version :content-transfer-encoding; bh=E7stK/jcKsWt7J4Pm8VudXPmD7dab8L+48EYaVKT0do=; b=TVXQSfJ8ehKNAlJQvsJ+XVmSkR+tKOQUlZqKUCP039ALUhb2ZVyXmMVw52xv+Emjea XrNIM/7aWal4VqMqujnKgWlixhD5iPtvlaWabQXfMDnFU8RvLGH8lSqQEasdy4rrj3Ks 
Jnx4kptM1Gdmbi0JKPK/GZIIhTTKcPh/zQ+wsI5HPg5uRHjMMKtFiyDy9ESmyuDibU+r Eeafjfxm0ztmU2LgMV86ANc7uzsM84pQGtnMcCYDvcQNzJrxesJxQ0oPK8HUo+kBeRce 4TTCmqM/B4BsleRMmRww1a5lK/7oUkfxz/9s1uy8ssfyuMxsn6ack/fsJuothyfP0Ecg VbJw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:to:from:subject:message-id:date:user-agent :mime-version:content-transfer-encoding; bh=E7stK/jcKsWt7J4Pm8VudXPmD7dab8L+48EYaVKT0do=; b=ZmZKKFZjosep9ki3XJOnVASc7n6DEE86sy2ZPlEEkvOvu4+zMUDOQAOyKZi1j3klU5 h1XL6Yj9AmiteVjfQh3APhUxEpqWLT6MnppDTEqDWkzOfO9K/sWYkA+DiW/M2anpf0A1 Rvi36fgNjymMmHcn74s0sZP9Kdn7u033IWN7vBsw0k0I45p5ekMg1HNVk1IKmXHnts+4 GFq1gKVL5SJkzSP8g0PRktJkbha2hYqE7nEkIJtKElMUIlhq3mOA/u1LBOV5j3vnWyX9 i+gCnZ57rgWxvYixYLALo6ApJbFjtlUo5nSagd+3QWg4egYtuuBz7VLQqwWmAbsBaePX rprg== X-Gm-Message-State: AE9vXwM8zDFCXjYL0Jec6iILYd5X15orI5fjaGcx7tk3c/XeVV3zdkUZTHbOTAjlnNGi9A== X-Received: by 10.66.252.167 with SMTP id zt7mr69092299pac.93.1474512444937; Wed, 21 Sep 2016 19:47:24 -0700 (PDT) Received: from kmatoMacBook-Pro.local ([221.234.47.40]) by smtp.gmail.com with ESMTPSA id x66sm788826pfb.86.2016.09.21.19.47.23 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 21 Sep 2016 19:47:24 -0700 (PDT) To: freebsd-stable@FreeBSD.org From: k simon Subject: 11.0-stable can not buildworld Message-ID: <0afcde33-5127-6ee3-fcdc-54a252d02ff2@gmail.com> Date: Thu, 22 Sep 2016 10:47:21 +0800 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:45.0) Gecko/20100101 Thunderbird/45.2.0 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Sep 2016 02:47:25 -0000 Hi,lists, 11.0-stable r306419 can not buildworld. 
The output is as below:

clang -O2 -pipe -fno-omit-frame-pointer -march=core2 -I/usr/src/lib/libc/include -I/usr/src/lib/libc/../../include -I/usr/src/lib/libc/amd64 -DNLS -D__DBINTERFACE_PRIVATE -I/usr/src/lib/libc/../../contrib/gdtoa -I/usr/src/lib/libc/../../contrib/libc-vis -DINET6 -I/usr/obj/usr/src/lib/libc -I/usr/src/lib/libc/resolv -D_ACL_PRIVATE -DPOSIX_MISTAKE -I/usr/src/lib/libc/../libmd -I/usr/src/lib/libc/../../contrib/jemalloc/include -DMALLOC_PRODUCTION -I/usr/src/lib/libc/../../contrib/tzcode/stdtime -I/usr/src/lib/libc/stdtime -I/usr/src/lib/libc/locale -DBROKEN_DES -DPORTMAP -DDES_BUILTIN -I/usr/src/lib/libc/rpc -DYP -DNS_CACHING -DSYMBOL_VERSIONING -MD -MF.depend.__vdso_gettimeofday.o -MT__vdso_gettimeofday.o -std=gnu99 -fstack-protector-strong -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -Wno-empty-body -Wno-string-plus-int -Wno-unused-const-variable -Wno-tautological-compare -Wno-unused-value -Wno-parentheses-equality -Wno-unused-function -Wno-enum-conversion -Wno-unused-local-typedef -Wno-switch -Wno-switch-enum -Wno-knr-promoted-parameter -Qunused-arguments -I/usr/src/lib/libutil -I/usr/src/lib/msun/amd64 -I/usr/src/lib/msun/x86 -I/usr/src/lib/msun/src -c /usr/src/lib/libc/sys/__vdso_gettimeofday.c -o __vdso_gettimeofday.o
/usr/src/lib/libc/sys/__vdso_gettimeofday.c:43:27: error: too many arguments to function call, expected single argument 'vdso_th', have 2 arguments
        error = __vdso_gettc(th, &tc);
                ~~~~~~~~~~~~     ^~~
/usr/include/sys/vdso.h:65:1: note: '__vdso_gettc' declared here
u_int __vdso_gettc(const struct vdso_timehands *vdso_th);
^
1 error generated.
--- __error.o --- --- __vdso_gettimeofday.o --- *** [__vdso_gettimeofday.o] Error code 1 make[4]: stopped in /usr/src/lib/libc --- __error.o --- Simon 200922 From owner-freebsd-stable@freebsd.org Thu Sep 22 07:56:18 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5F974BE5C7F for ; Thu, 22 Sep 2016 07:56:18 +0000 (UTC) (envelope-from jenkins-admin@FreeBSD.org) Received: from jenkins-9.freebsd.org (jenkins-9.freebsd.org [8.8.178.209]) by mx1.freebsd.org (Postfix) with ESMTP id 53228806; Thu, 22 Sep 2016 07:56:18 +0000 (UTC) (envelope-from jenkins-admin@FreeBSD.org) Received: from jenkins-9.freebsd.org (localhost [127.0.0.1]) by jenkins-9.freebsd.org (Postfix) with ESMTP id 0B77BD1; Thu, 22 Sep 2016 07:56:17 +0000 (UTC) Date: Thu, 22 Sep 2016 07:56:16 +0000 (GMT) From: jenkins-admin@FreeBSD.org To: jenkins-admin@FreeBSD.org, freebsd-stable@FreeBSD.org Message-ID: <1758150249.26.1474530976648.JavaMail.jenkins@jenkins-9.freebsd.org> Subject: Jenkins build became unstable: FreeBSD_stable_10 #406 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Instance-Identity: MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAkKKb2VAfYQKfu1t7qk4nR5qzUBEI+UqT4BPec4qHVhqUy0FFdq50sMH+3y9bCDNOufctov6VqTNffZ3YXArnZK95YF0OX97fh+E9txYOUX1adc+TikcKjuYpHmL5dE62eaZTI+4A5jnRonskQ1PaoIFz0Kbu4mWzkFsmdiXTraGzomXq4cHUCATA2+K4eDYgjXEQI30z3GOMmmZ4t/+6QGk1cMb/BqMWHbn80AsRCb4tU7Hpd72XLDpsuO7YRP1Q0CjmNAuBOTj+sFiiOe6U9HpqOlQN+iFUvBdZo/ybuy5Kh71cAaYQNL68cYdZJ6binH/DkG3KY/fS7DFYAeuwjwIDAQAB X-Jenkins-Job: FreeBSD_stable_10 X-Jenkins-Result: UNSTABLE X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Sep 2016 07:56:18 -0000 See From 
owner-freebsd-stable@freebsd.org Thu Sep 22 07:59:39 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 17AE3BE5FA8 for ; Thu, 22 Sep 2016 07:59:39 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 994C0A0D; Thu, 22 Sep 2016 07:59:38 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u8M7xXR1035597 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Thu, 22 Sep 2016 10:59:33 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u8M7xXR1035597 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u8M7xX8J035596; Thu, 22 Sep 2016 10:59:33 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 22 Sep 2016 10:59:33 +0300 From: Konstantin Belousov To: Slawa Olhovchenkov Cc: John Baldwin , freebsd-stable@freebsd.org, alc@freebsd.org Subject: Re: nginx and FreeBSD11 Message-ID: <20160922075933.GL38409@kib.kiev.ua> References: <20160907191348.GD22212@zxy.spb.ru> <1823460.vTm8IvUQsF@ralph.baldwin.cx> <20160918162241.GE2960@zxy.spb.ru> <2122051.7RxZBKUSFc@ralph.baldwin.cx> <20160920065244.GO2840@zxy.spb.ru> <20160920192053.GP2840@zxy.spb.ru> <20160920201925.GI38409@kib.kiev.ua> <20160920203853.GR2840@zxy.spb.ru> <20160920211517.GJ38409@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160920211517.GJ38409@kib.kiev.ua> User-Agent: Mutt/1.6.1 (2016-04-27) X-Spam-Status: No, score=-2.0 
required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Sep 2016 07:59:39 -0000 On Wed, Sep 21, 2016 at 12:15:17AM +0300, Konstantin Belousov wrote: > > > diff --git a/sys/vm/vm_map.c b/sys/vm/vm_map.c > > > index a23468e..f754652 100644 > > > --- a/sys/vm/vm_map.c > > > +++ b/sys/vm/vm_map.c > > > @@ -481,6 +481,7 @@ vmspace_switch_aio(struct vmspace *newvm) > > > if (oldvm == newvm) > > > return; > > > > > > + spinlock_enter(); > > > /* > > > * Point to the new address space and refer to it. > > > */ > > > @@ -489,6 +490,7 @@ vmspace_switch_aio(struct vmspace *newvm) > > > > > > /* Activate the new mapping. */ > > > pmap_activate(curthread); > > > + spinlock_exit(); > > > > > > /* Remove the daemon's reference to the old address space. */ > > > KASSERT(oldvm->vm_refcnt > 1, Did you tested the patch ? Below is, I believe, the committable fix, of course supposing that the patch above worked. If you want to retest it on stable/11, ignore efirt.c chunks. diff --git a/sys/amd64/amd64/efirt.c b/sys/amd64/amd64/efirt.c index f1d67f7..c883af8 100644 --- a/sys/amd64/amd64/efirt.c +++ b/sys/amd64/amd64/efirt.c @@ -53,6 +53,7 @@ __FBSDID("$FreeBSD$"); #include #include #include +#include #include #include #include @@ -301,6 +302,17 @@ efi_enter(void) PMAP_UNLOCK(curpmap); return (error); } + + /* + * IPI TLB shootdown handler invltlb_pcid_handler() reloads + * %cr3 from the curpmap->pm_cr3, which would disable runtime + * segments mappings. Block the handler's action by setting + * curpmap to impossible value. See also comment in + * pmap.c:pmap_activate_sw(). 
+ */ + if (pmap_pcid_enabled && !invpcid_works) + PCPU_SET(curpmap, NULL); + load_cr3(VM_PAGE_TO_PHYS(efi_pml4_page) | (pmap_pcid_enabled ? curpmap->pm_pcids[PCPU_GET(cpuid)].pm_pcid : 0)); /* @@ -317,7 +329,9 @@ efi_leave(void) { pmap_t curpmap; - curpmap = PCPU_GET(curpmap); + curpmap = &curproc->p_vmspace->vm_pmap; + if (pmap_pcid_enabled && !invpcid_works) + PCPU_SET(curpmap, curpmap); load_cr3(curpmap->pm_cr3 | (pmap_pcid_enabled ? curpmap->pm_pcids[PCPU_GET(cpuid)].pm_pcid : 0)); if (!pmap_pcid_enabled) diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c index 63042e4..59e1b67 100644 --- a/sys/amd64/amd64/pmap.c +++ b/sys/amd64/amd64/pmap.c @@ -6842,6 +6842,7 @@ pmap_activate_sw(struct thread *td) { pmap_t oldpmap, pmap; uint64_t cached, cr3; + register_t rflags; u_int cpuid; oldpmap = PCPU_GET(curpmap); @@ -6865,16 +6866,43 @@ pmap_activate_sw(struct thread *td) pmap == kernel_pmap, ("non-kernel pmap thread %p pmap %p cpu %d pcid %#x", td, pmap, cpuid, pmap->pm_pcids[cpuid].pm_pcid)); + + /* + * If the INVPCID instruction is not available, + * invltlb_pcid_handler() is used for handle + * invalidate_all IPI, which checks for curpmap == + * smp_tlb_pmap. Below operations sequence has a + * window where %CR3 is loaded with the new pmap's + * PML4 address, but curpmap value is not yet updated. + * This causes invltlb IPI handler, called between the + * updates, to execute as NOP, which leaves stale TLB + * entries. + * + * Note that the most typical use of + * pmap_activate_sw(), from the context switch, is + * immune to this race, because interrupts are + * disabled (while the thread lock is owned), and IPI + * happends after curpmap is updated. Protect other + * callers in a similar way, by disabling interrupts + * around the %cr3 register reload and curpmap + * assignment. 
+ */ + if (!invpcid_works) + rflags = intr_disable(); + if (!cached || (cr3 & ~CR3_PCID_MASK) != pmap->pm_cr3) { load_cr3(pmap->pm_cr3 | pmap->pm_pcids[cpuid].pm_pcid | cached); if (cached) PCPU_INC(pm_save_cnt); } + PCPU_SET(curpmap, pmap); + if (!invpcid_works) + intr_restore(rflags); } else if (cr3 != pmap->pm_cr3) { load_cr3(pmap->pm_cr3); + PCPU_SET(curpmap, pmap); } - PCPU_SET(curpmap, pmap); #ifdef SMP CPU_CLR_ATOMIC(cpuid, &oldpmap->pm_active); #else From owner-freebsd-stable@freebsd.org Thu Sep 22 08:25:36 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3F0F7BE30E0 for ; Thu, 22 Sep 2016 08:25:36 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 01BF2E2; Thu, 22 Sep 2016 08:25:36 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from slw by zxy.spb.ru with local (Exim 4.86 (FreeBSD)) (envelope-from ) id 1bmzJv-000GWh-6N; Thu, 22 Sep 2016 11:25:27 +0300 Date: Thu, 22 Sep 2016 11:25:27 +0300 From: Slawa Olhovchenkov To: Konstantin Belousov Cc: John Baldwin , freebsd-stable@freebsd.org, alc@freebsd.org Subject: Re: nginx and FreeBSD11 Message-ID: <20160922082527.GX2840@zxy.spb.ru> References: <20160907191348.GD22212@zxy.spb.ru> <1823460.vTm8IvUQsF@ralph.baldwin.cx> <20160918162241.GE2960@zxy.spb.ru> <2122051.7RxZBKUSFc@ralph.baldwin.cx> <20160920065244.GO2840@zxy.spb.ru> <20160920192053.GP2840@zxy.spb.ru> <20160920201925.GI38409@kib.kiev.ua> <20160920203853.GR2840@zxy.spb.ru> <20160920211517.GJ38409@kib.kiev.ua> <20160922075933.GL38409@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160922075933.GL38409@kib.kiev.ua> User-Agent: 
Mutt/1.5.24 (2015-08-30) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Sep 2016 08:25:36 -0000 On Thu, Sep 22, 2016 at 10:59:33AM +0300, Konstantin Belousov wrote: > On Wed, Sep 21, 2016 at 12:15:17AM +0300, Konstantin Belousov wrote: > > > > diff --git a/sys/vm/vm_map.c b/sys/vm/vm_map.c > > > > index a23468e..f754652 100644 > > > > --- a/sys/vm/vm_map.c > > > > +++ b/sys/vm/vm_map.c > > > > @@ -481,6 +481,7 @@ vmspace_switch_aio(struct vmspace *newvm) > > > > if (oldvm == newvm) > > > > return; > > > > > > > > + spinlock_enter(); > > > > /* > > > > * Point to the new address space and refer to it. > > > > */ > > > > @@ -489,6 +490,7 @@ vmspace_switch_aio(struct vmspace *newvm) > > > > > > > > /* Activate the new mapping. */ > > > > pmap_activate(curthread); > > > > + spinlock_exit(); > > > > > > > > /* Remove the daemon's reference to the old address space. */ > > > > KASSERT(oldvm->vm_refcnt > 1, > Did you tested the patch ? I am now installed it. For success test need 2-3 days. If test failed result may be quickly. > Below is, I believe, the committable fix, of course supposing that > the patch above worked. If you want to retest it on stable/11, ignore > efirt.c chunks. and remove patch w/ spinlock? 
> diff --git a/sys/amd64/amd64/efirt.c b/sys/amd64/amd64/efirt.c > index f1d67f7..c883af8 100644 > --- a/sys/amd64/amd64/efirt.c > +++ b/sys/amd64/amd64/efirt.c > @@ -53,6 +53,7 @@ __FBSDID("$FreeBSD$"); > #include > #include > #include > +#include > #include > #include > #include > @@ -301,6 +302,17 @@ efi_enter(void) > PMAP_UNLOCK(curpmap); > return (error); > } > + > + /* > + * IPI TLB shootdown handler invltlb_pcid_handler() reloads > + * %cr3 from the curpmap->pm_cr3, which would disable runtime > + * segments mappings. Block the handler's action by setting > + * curpmap to impossible value. See also comment in > + * pmap.c:pmap_activate_sw(). > + */ > + if (pmap_pcid_enabled && !invpcid_works) > + PCPU_SET(curpmap, NULL); > + > load_cr3(VM_PAGE_TO_PHYS(efi_pml4_page) | (pmap_pcid_enabled ? > curpmap->pm_pcids[PCPU_GET(cpuid)].pm_pcid : 0)); > /* > @@ -317,7 +329,9 @@ efi_leave(void) > { > pmap_t curpmap; > > - curpmap = PCPU_GET(curpmap); > + curpmap = &curproc->p_vmspace->vm_pmap; > + if (pmap_pcid_enabled && !invpcid_works) > + PCPU_SET(curpmap, curpmap); > load_cr3(curpmap->pm_cr3 | (pmap_pcid_enabled ? > curpmap->pm_pcids[PCPU_GET(cpuid)].pm_pcid : 0)); > if (!pmap_pcid_enabled) > diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c > index 63042e4..59e1b67 100644 > --- a/sys/amd64/amd64/pmap.c > +++ b/sys/amd64/amd64/pmap.c > @@ -6842,6 +6842,7 @@ pmap_activate_sw(struct thread *td) > { > pmap_t oldpmap, pmap; > uint64_t cached, cr3; > + register_t rflags; > u_int cpuid; > > oldpmap = PCPU_GET(curpmap); > @@ -6865,16 +6866,43 @@ pmap_activate_sw(struct thread *td) > pmap == kernel_pmap, > ("non-kernel pmap thread %p pmap %p cpu %d pcid %#x", > td, pmap, cpuid, pmap->pm_pcids[cpuid].pm_pcid)); > + > + /* > + * If the INVPCID instruction is not available, > + * invltlb_pcid_handler() is used for handle > + * invalidate_all IPI, which checks for curpmap == > + * smp_tlb_pmap. 
Below operations sequence has a > + * window where %CR3 is loaded with the new pmap's > + * PML4 address, but curpmap value is not yet updated. > + * This causes invltlb IPI handler, called between the > + * updates, to execute as NOP, which leaves stale TLB > + * entries. > + * > + * Note that the most typical use of > + * pmap_activate_sw(), from the context switch, is > + * immune to this race, because interrupts are > + * disabled (while the thread lock is owned), and IPI > + * happends after curpmap is updated. Protect other > + * callers in a similar way, by disabling interrupts > + * around the %cr3 register reload and curpmap > + * assignment. > + */ > + if (!invpcid_works) > + rflags = intr_disable(); > + > if (!cached || (cr3 & ~CR3_PCID_MASK) != pmap->pm_cr3) { > load_cr3(pmap->pm_cr3 | pmap->pm_pcids[cpuid].pm_pcid | > cached); > if (cached) > PCPU_INC(pm_save_cnt); > } > + PCPU_SET(curpmap, pmap); > + if (!invpcid_works) > + intr_restore(rflags); > } else if (cr3 != pmap->pm_cr3) { > load_cr3(pmap->pm_cr3); > + PCPU_SET(curpmap, pmap); > } > - PCPU_SET(curpmap, pmap); > #ifdef SMP > CPU_CLR_ATOMIC(cpuid, &oldpmap->pm_active); > #else From owner-freebsd-stable@freebsd.org Thu Sep 22 08:27:45 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2B048BE348E for ; Thu, 22 Sep 2016 08:27:45 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id B6C3528B; Thu, 22 Sep 2016 08:27:44 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u8M8Rebb049032 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); 
Thu, 22 Sep 2016 11:27:40 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u8M8Rebb049032 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u8M8Re3W049031; Thu, 22 Sep 2016 11:27:40 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 22 Sep 2016 11:27:40 +0300 From: Konstantin Belousov To: Slawa Olhovchenkov Cc: John Baldwin , freebsd-stable@freebsd.org, alc@freebsd.org Subject: Re: nginx and FreeBSD11 Message-ID: <20160922082740.GN38409@kib.kiev.ua> References: <1823460.vTm8IvUQsF@ralph.baldwin.cx> <20160918162241.GE2960@zxy.spb.ru> <2122051.7RxZBKUSFc@ralph.baldwin.cx> <20160920065244.GO2840@zxy.spb.ru> <20160920192053.GP2840@zxy.spb.ru> <20160920201925.GI38409@kib.kiev.ua> <20160920203853.GR2840@zxy.spb.ru> <20160920211517.GJ38409@kib.kiev.ua> <20160922075933.GL38409@kib.kiev.ua> <20160922082527.GX2840@zxy.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160922082527.GX2840@zxy.spb.ru> User-Agent: Mutt/1.6.1 (2016-04-27) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Sep 2016 08:27:45 -0000 On Thu, Sep 22, 2016 at 11:25:27AM +0300, Slawa Olhovchenkov wrote: > On Thu, Sep 22, 2016 at 10:59:33AM +0300, Konstantin Belousov wrote: > > Below is, I believe, the committable fix, of course supposing that > > the patch above worked. If you want to retest it on stable/11, ignore > > efirt.c chunks. 
> > and remove patch w/ spinlock? Yes. From owner-freebsd-stable@freebsd.org Thu Sep 22 08:34:27 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5DC5FBE3BFB for ; Thu, 22 Sep 2016 08:34:27 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 2204B9DF; Thu, 22 Sep 2016 08:34:27 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from slw by zxy.spb.ru with local (Exim 4.86 (FreeBSD)) (envelope-from ) id 1bmzSa-000GlB-V9; Thu, 22 Sep 2016 11:34:24 +0300 Date: Thu, 22 Sep 2016 11:34:24 +0300 From: Slawa Olhovchenkov To: Konstantin Belousov Cc: John Baldwin , freebsd-stable@freebsd.org, alc@freebsd.org Subject: Re: nginx and FreeBSD11 Message-ID: <20160922083424.GY2840@zxy.spb.ru> References: <20160918162241.GE2960@zxy.spb.ru> <2122051.7RxZBKUSFc@ralph.baldwin.cx> <20160920065244.GO2840@zxy.spb.ru> <20160920192053.GP2840@zxy.spb.ru> <20160920201925.GI38409@kib.kiev.ua> <20160920203853.GR2840@zxy.spb.ru> <20160920211517.GJ38409@kib.kiev.ua> <20160922075933.GL38409@kib.kiev.ua> <20160922082527.GX2840@zxy.spb.ru> <20160922082740.GN38409@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160922082740.GN38409@kib.kiev.ua> User-Agent: Mutt/1.5.24 (2015-08-30) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Sep 2016 08:34:27 -0000 On Thu, Sep 22, 
2016 at 11:27:40AM +0300, Konstantin Belousov wrote: > On Thu, Sep 22, 2016 at 11:25:27AM +0300, Slawa Olhovchenkov wrote: > > On Thu, Sep 22, 2016 at 10:59:33AM +0300, Konstantin Belousov wrote: > > > Below is, I believe, the committable fix, of course supposing that > > > the patch above worked. If you want to retest it on stable/11, ignore > > > efirt.c chunks. > > > > and remove patch w/ spinlock? > Yes. What you prefer now -- I am test spinlock patch or this patch? For success in any case need wait 2-3 days. From owner-freebsd-stable@freebsd.org Thu Sep 22 08:53:27 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D35C3BE4CF2 for ; Thu, 22 Sep 2016 08:53:27 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4386588A; Thu, 22 Sep 2016 08:53:27 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u8M8rKcO055003 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Thu, 22 Sep 2016 11:53:20 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u8M8rKcO055003 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u8M8rKAx055002; Thu, 22 Sep 2016 11:53:20 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 22 Sep 2016 11:53:20 +0300 From: Konstantin Belousov To: Slawa Olhovchenkov Cc: John Baldwin , freebsd-stable@freebsd.org, alc@freebsd.org Subject: Re: nginx and FreeBSD11 Message-ID: <20160922085320.GQ38409@kib.kiev.ua> References: 
<2122051.7RxZBKUSFc@ralph.baldwin.cx> <20160920065244.GO2840@zxy.spb.ru> <20160920192053.GP2840@zxy.spb.ru> <20160920201925.GI38409@kib.kiev.ua> <20160920203853.GR2840@zxy.spb.ru> <20160920211517.GJ38409@kib.kiev.ua> <20160922075933.GL38409@kib.kiev.ua> <20160922082527.GX2840@zxy.spb.ru> <20160922082740.GN38409@kib.kiev.ua> <20160922083424.GY2840@zxy.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160922083424.GY2840@zxy.spb.ru> User-Agent: Mutt/1.6.1 (2016-04-27) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Sep 2016 08:53:27 -0000 On Thu, Sep 22, 2016 at 11:34:24AM +0300, Slawa Olhovchenkov wrote: > On Thu, Sep 22, 2016 at 11:27:40AM +0300, Konstantin Belousov wrote: > > > On Thu, Sep 22, 2016 at 11:25:27AM +0300, Slawa Olhovchenkov wrote: > > > On Thu, Sep 22, 2016 at 10:59:33AM +0300, Konstantin Belousov wrote: > > > > Below is, I believe, the committable fix, of course supposing that > > > > the patch above worked. If you want to retest it on stable/11, ignore > > > > efirt.c chunks. > > > > > > and remove patch w/ spinlock? > > Yes. > > What you prefer now -- I am test spinlock patch or this patch? > For success in any case need wait 2-3 days. If you already run previous (spinlock) version for 1 day, then finish with it. I am confident that spinlock version results are indicative for the refined patch as well. If you did not applied the spinlock variant at all, there is no reason to spend efforts on it, use the patch I sent today. 
From owner-freebsd-stable@freebsd.org Thu Sep 22 09:28:43 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 049ABBE3674 for ; Thu, 22 Sep 2016 09:28:43 +0000 (UTC) (envelope-from julien.charbon@gmail.com) Received: from mail-wm0-f46.google.com (mail-wm0-f46.google.com [74.125.82.46]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id A9956F31 for ; Thu, 22 Sep 2016 09:28:42 +0000 (UTC) (envelope-from julien.charbon@gmail.com) Received: by mail-wm0-f46.google.com with SMTP id l132so137147795wmf.1 for ; Thu, 22 Sep 2016 02:28:42 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:subject:to:references:cc:message-id:date :user-agent:mime-version:in-reply-to; bh=I+QrymTMW6wvBJPlMH8F7dtMRM+p/6TjIqvPHG92LGo=; b=NxNasZyap0f5cKoh+GE5w80UltWSiqqOFzgeT8M9SvGW7VPN6+ITchDXs1kmX8Z4mJ G5R2ROgFJEJa/QsJGYIHDIKCM3zlwt+oenyEa7ZM9JcsXgx91OBkJat6W+VJmYX+mff9 ooAwja7sNUXlYwEGEjdklTGhhmotiwBAN4tPaXRDW6juVGu5YJXBsz7SXvGpF6vV37hZ 9cfMG3AgqnWSHcnkDtKM0CvaQlSG1C6e5bz1K+Y/8e5dPh8kz1D+dvmIKJfWVBe0BbXd A+lgJ2RMuq6QDGfjinr4YfT1lH6YOCIjHrwIp+zEZD2nypx5uweOLo/t/kG9w22oWJxp PiUg== X-Gm-Message-State: AE9vXwO+WFNSA/wmmT25jME/UF/JFRGe/x8BxAb4dENVQYHXfTKr59BK7pNXvyEtv6wRCg== X-Received: by 10.28.215.67 with SMTP id o64mr1430067wmg.98.1474536520885; Thu, 22 Sep 2016 02:28:40 -0700 (PDT) Received: from [10.100.64.44] ([217.30.88.7]) by smtp.gmail.com with ESMTPSA id 137sm36795859wmi.16.2016.09.22.02.28.39 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 22 Sep 2016 02:28:40 -0700 (PDT) From: Julien Charbon Subject: Re: 11.0 stuck on high network load To: Slawa Olhovchenkov References: 
<20160915085938.GN38409@kib.kiev.ua> <20160915090633.GS2840@zxy.spb.ru> <20160916181839.GC2960@zxy.spb.ru> <20160916183053.GL9397@strugglingcoder.info> <20160916190330.GG2840@zxy.spb.ru> <78cbcdc9-f565-1046-c157-2ddd8fcccc62@freebsd.org> <20160919204328.GN2840@zxy.spb.ru> <8ba75d6e-4f01-895e-0aed-53c6c6692cb9@freebsd.org> <20160920202633.GQ2840@zxy.spb.ru> <20160921083148.GU2840@zxy.spb.ru> X-Mozilla-News-Host: news://news.gmane.org Cc: Konstantin Belousov , freebsd-stable@FreeBSD.org, hiren panchasara Message-ID: <05ba1a3a-2d99-f8e2-40a1-4c1fca317db3@freebsd.org> Date: Thu, 22 Sep 2016 11:28:38 +0200 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:45.0) Gecko/20100101 Thunderbird/45.2.0 MIME-Version: 1.0 In-Reply-To: <20160921083148.GU2840@zxy.spb.ru> Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="5WrnxOFLdULVi1adJj9CXBnKvjsw6ABrP" X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Sep 2016 09:28:43 -0000 This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --5WrnxOFLdULVi1adJj9CXBnKvjsw6ABrP Content-Type: multipart/mixed; boundary="nDdrl8RqBsaIktWKwGtj7FQJva3SNWuwD"; protected-headers="v1" From: Julien Charbon To: Slawa Olhovchenkov Cc: Konstantin Belousov , freebsd-stable@FreeBSD.org, hiren panchasara Message-ID: <05ba1a3a-2d99-f8e2-40a1-4c1fca317db3@freebsd.org> Subject: Re: 11.0 stuck on high network load References: <20160915085938.GN38409@kib.kiev.ua> <20160915090633.GS2840@zxy.spb.ru> <20160916181839.GC2960@zxy.spb.ru> <20160916183053.GL9397@strugglingcoder.info> <20160916190330.GG2840@zxy.spb.ru> <78cbcdc9-f565-1046-c157-2ddd8fcccc62@freebsd.org> <20160919204328.GN2840@zxy.spb.ru> <8ba75d6e-4f01-895e-0aed-53c6c6692cb9@freebsd.org> <20160920202633.GQ2840@zxy.spb.ru> <20160921083148.GU2840@zxy.spb.ru> 
In-Reply-To: <20160921083148.GU2840@zxy.spb.ru>

--nDdrl8RqBsaIktWKwGtj7FQJva3SNWuwD
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable

Hi Slawa,

On 9/21/16 10:31 AM, Slawa Olhovchenkov wrote:
> On Wed, Sep 21, 2016 at 09:11:24AM +0200, Julien Charbon wrote:
>> On 9/20/16 10:26 PM, Slawa Olhovchenkov wrote:
>>> On Tue, Sep 20, 2016 at 10:00:25PM +0200, Julien Charbon wrote:
>>>> On 9/19/16 10:43 PM, Slawa Olhovchenkov wrote:
>>>>> On Mon, Sep 19, 2016 at 10:32:13PM +0200, Julien Charbon wrote:
>>>>>>
>>>>>>> @ CPU_CLK_UNHALTED_CORE [4653445 samples]
>>>>>>>
>>>>>>> 51.86% [2413083] lock_delay @ /boot/kernel.VSTREAM/kernel
>>>>>>>  100.0% [2413083] __rw_wlock_hard
>>>>>>>   100.0% [2413083] tcp_tw_2msl_scan
>>>>>>>    99.99% [2412958] pfslowtimo
>>>>>>>     100.0% [2412958] softclock_call_cc
>>>>>>>      100.0% [2412958] softclock
>>>>>>>       100.0% [2412958] intr_event_execute_handlers
>>>>>>>        100.0% [2412958] ithread_loop
>>>>>>>         100.0% [2412958] fork_exit
>>>>>>>    00.01% [125] tcp_twstart
>>>>>>>     100.0% [125] tcp_do_segment
>>>>>>>      100.0% [125] tcp_input
>>>>>>>       100.0% [125] ip_input
>>>>>>>        100.0% [125] swi_net
>>>>>>>         100.0% [125] intr_event_execute_handlers
>>>>>>>          100.0% [125] ithread_loop
>>>>>>>           100.0% [125] fork_exit
>>>>>>
>>>>>> The only write lock tcp_tw_2msl_scan() tries to get is a
>>>>>> INP_WLOCK(inp). Thus here, tcp_tw_2msl_scan() seems to be stuck
>>>>>> spinning on INP_WLOCK (or pfslowtimo() is going crazy and calls
>>>>>> tcp_tw_2msl_scan() at high rate but this will be quite unexpected).
>>>>>>
>>>>>> Thus my hypothesis is that something is holding the INP_WLOCK and not
>>>>>> releasing it, and tcp_tw_2msl_scan() is spinning on it.
>>>>>>
>>>>>> If you can, could you compile the kernel with below options:
>>>>>>
>>>>>> options DDB # Support DDB.
>>>>>> options DEADLKRES # Enable the deadlock resolver
>>>>>> options INVARIANTS # Enable calls of extra sanity checking
>>>>>> options INVARIANT_SUPPORT # Extra sanity checks of internal structures, required by INVARIANTS
>>>>>> options WITNESS # Enable checks to detect deadlocks and cycles
>>>>>> options WITNESS_SKIPSPIN # Don't run witness on spinlocks for speed
>>>>>
>>>>> Currently this host run with 100% CPU load (on all cores), i.e.
>>>>> enabling WITNESS will be significant drop performance.
>>>>> Can I use only some subset of options?
>>>>>
>>>>> Also, I can some troubles to DDB enter in this case.
>>>>> May be kgdb will be success (not tryed yet)?
>>>>
>>>> If these kernel options will certainly slow down your kernel, they also
>>>> might found the root cause of your issue before reaching the point where
>>>> you have 100% cpu load on all cores (thanks to INVARIANTS). I would
>>>> suggest:
>>>
>>> Hmmm, may be I am not clarified.
>>> This host run at peak hours with 100% CPU load as normal operation,
>>> this is for servering 2x10G, this is CPU load not result of lock
>>> issuse, this is not us case. And this is because I am fear to enable
>>> WITNESS -- I am fear drop performance.
>>>
>>> This lock issuse happen irregulary and may be caused by other issuse
>>> (nginx crashed). In this case about 1/3 cores have 100% cpu load,
>>> perhaps by this lock -- I am can trace only from one core and need
>>> more then hour for this (may be on other cores different trace, I
>>> can't guaranted anything).
>>
>> I see, especially if you are running in production WITNESS might indeed
>> be not practical for you.
>> In this case, I would suggest before doing
>> WITNESS and still get more information to:
>>
>> #0: Do a lock profiling:
>>
>> https://www.freebsd.org/cgi/man.cgi?query=LOCK_PROFILING
>>
>> options LOCK_PROFILING
>>
>> Example of usage:
>>
>> # Run
>> $ sudo sysctl debug.lock.prof.enable=1
>> $ sleep 10
>> $ sudo sysctl debug.lock.prof.enable=0
>>
>> # Get results
>> $ sysctl debug.lock.prof.stats | head -2; sysctl debug.lock.prof.stats | sort -n -k 4 -r
>
> OK, but in case of leak lock (why inp lock too long for
> tcp_tw_2msl_scan?) I can't see cause of this lock running this
> commands after stuck happen?
>
>>> What purpose to not skip locked tcptw in this loop?
>>
>> If I understand your question correctly: According to your pmcstat
>> result, tcp_tw_2msl_scan() currently struggles with a write lock
>> (__rw_wlock_hard) and the only write lock used tcp_tw_2msl_scan() is
>> INP_WLOCK. No sign of contention on TW_RLOCK(V_tw_lock) currently.
>
> As I see in code, tcp_tw_2msl_scan got first node from V_twq_2msl and
> need got RW lock on inp w/o alternates. Can tcp_tw_2msl_scan skip current node
> and go to next node in V_twq_2msl list if current node locked by some
> reasson?

Interesting question indeed: It is not optimal that all simultaneous
calls to tcp_tw_2msl_scan() compete for the same oldest tcptw. The next
tcptws in the list are certainly old enough as well.

Let me see if I can make a simple change that makes kernel threads
calling tcp_tw_2msl_scan() at the same time work on different
old-enough tcptws.
So far, I have found only solutions that are quite complex to implement.

--
Julien

--nDdrl8RqBsaIktWKwGtj7FQJva3SNWuwD--

--5WrnxOFLdULVi1adJj9CXBnKvjsw6ABrP
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Comment: GPGTools - https://gpgtools.org

iQEcBAEBCgAGBQJX46RHAAoJEKVlQ5Je6dhx/kAIAJ5JeWeAmgBn4vyYboMP2XIL
h7rqNrTtGfFYxkjp03TgHk27iCUCUTJHfMpsZ0hoFKypi4n648bvFFZadrfVZZxq
1dzDfJm5DJcHgKu0iL7NDw2pGHx3NwLPPVCiP/1kMxuzHTTfHW5Pm/2+FxKUZYKg
SZian0jdtxXkZTsYTtOXo2Gug3h/FA9PgPCJHPt3T6Nzwdlk4r4ou1OVh0Cxq/fn
JjLMc8AM2YFtZj1us9RAPo2cShbR3RgJlt5K7Rwa9OMmstX3IFHr+MmO1ZHSVzjh
4IwfYwBFyrxpwN815Q+H3BIP7mAOVGFytNMRi4zg5XdlX7gNaHVJeUS0+sLJlzM=
=ixjt
-----END PGP SIGNATURE-----

--5WrnxOFLdULVi1adJj9CXBnKvjsw6ABrP--

From owner-freebsd-stable@freebsd.org Thu Sep 22 09:33:58 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id CB37EBE3CA9 for ; Thu, 22 Sep 2016 09:33:58 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 8DDF4825; Thu, 22 Sep 2016 09:33:58 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from slw by zxy.spb.ru with local (Exim 4.86 (FreeBSD)) (envelope-from ) id 1bn0OB-000IGE-3E; Thu, 22 Sep 2016 12:33:55 +0300 Date: Thu, 22 Sep 2016 12:33:55 +0300 From: Slawa Olhovchenkov To: Konstantin Belousov Cc: John Baldwin , freebsd-stable@freebsd.org, alc@freebsd.org Subject: Re: nginx and FreeBSD11 Message-ID: <20160922093355.GZ2840@zxy.spb.ru> References: <20160920065244.GO2840@zxy.spb.ru> <20160920192053.GP2840@zxy.spb.ru> <20160920201925.GI38409@kib.kiev.ua>
<20160920203853.GR2840@zxy.spb.ru> <20160920211517.GJ38409@kib.kiev.ua> <20160922075933.GL38409@kib.kiev.ua> <20160922082527.GX2840@zxy.spb.ru> <20160922082740.GN38409@kib.kiev.ua> <20160922083424.GY2840@zxy.spb.ru> <20160922085320.GQ38409@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160922085320.GQ38409@kib.kiev.ua> User-Agent: Mutt/1.5.24 (2015-08-30) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Sep 2016 09:33:58 -0000 On Thu, Sep 22, 2016 at 11:53:20AM +0300, Konstantin Belousov wrote: > On Thu, Sep 22, 2016 at 11:34:24AM +0300, Slawa Olhovchenkov wrote: > > On Thu, Sep 22, 2016 at 11:27:40AM +0300, Konstantin Belousov wrote: > > > > > On Thu, Sep 22, 2016 at 11:25:27AM +0300, Slawa Olhovchenkov wrote: > > > > On Thu, Sep 22, 2016 at 10:59:33AM +0300, Konstantin Belousov wrote: > > > > > Below is, I believe, the committable fix, of course supposing that > > > > > the patch above worked. If you want to retest it on stable/11, ignore > > > > > efirt.c chunks. > > > > > > > > and remove patch w/ spinlock? > > > Yes. > > > > What you prefer now -- I am test spinlock patch or this patch? > > For success in any case need wait 2-3 days. > > If you already run previous (spinlock) version for 1 day, then finish > with it. I am confident that spinlock version results are indicative for > the refined patch as well. > > If you did not applied the spinlock variant at all, there is no reason to > spend efforts on it, use the patch I sent today. No, I am did not applied the spinlock variant at all. OK, try this patch. Do you still need first 100 lines from verbose boot? 
From owner-freebsd-stable@freebsd.org Thu Sep 22 09:40:52 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A5556BE43DF for ; Thu, 22 Sep 2016 09:40:52 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 6733DB0C; Thu, 22 Sep 2016 09:40:52 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from slw by zxy.spb.ru with local (Exim 4.86 (FreeBSD)) (envelope-from ) id 1bn0Us-000IRP-9v; Thu, 22 Sep 2016 12:40:50 +0300 Date: Thu, 22 Sep 2016 12:40:50 +0300 From: Slawa Olhovchenkov To: Julien Charbon Cc: Konstantin Belousov , freebsd-stable@FreeBSD.org, hiren panchasara Subject: Re: 11.0 stuck on high network load Message-ID: <20160922094050.GA2840@zxy.spb.ru> References: <20160916181839.GC2960@zxy.spb.ru> <20160916183053.GL9397@strugglingcoder.info> <20160916190330.GG2840@zxy.spb.ru> <78cbcdc9-f565-1046-c157-2ddd8fcccc62@freebsd.org> <20160919204328.GN2840@zxy.spb.ru> <8ba75d6e-4f01-895e-0aed-53c6c6692cb9@freebsd.org> <20160920202633.GQ2840@zxy.spb.ru> <20160921083148.GU2840@zxy.spb.ru> <05ba1a3a-2d99-f8e2-40a1-4c1fca317db3@freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <05ba1a3a-2d99-f8e2-40a1-4c1fca317db3@freebsd.org> User-Agent: Mutt/1.5.24 (2015-08-30) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Sep 2016 09:40:52 -0000 On Thu, Sep 22, 
2016 at 11:28:38AM +0200, Julien Charbon wrote:

> >>> What purpose to not skip locked tcptw in this loop?
> >>
> >> If I understand your question correctly: According to your pmcstat
> >> result, tcp_tw_2msl_scan() currently struggles with a write lock
> >> (__rw_wlock_hard) and the only write lock used tcp_tw_2msl_scan() is
> >> INP_WLOCK. No sign of contention on TW_RLOCK(V_tw_lock) currently.
> >
> > As I see in code, tcp_tw_2msl_scan got first node from V_twq_2msl and
> > need got RW lock on inp w/o alternates. Can tcp_tw_2msl_scan skip current node
> > and go to next node in V_twq_2msl list if current node locked by some
> > reasson?
>
> Interesting question indeed: It is not optimal that all simultaneous
> calls to tcp_tw_2msl_scan() compete for the same oldest tcptw. The next
> tcptws in the list are certainly old enough also.
>
> Let me see if I can make a simple change that makes kernel threads
> calling tcp_tw_2msl_scan() at same time to work on a different old
> enough tcptws. So far, I found only solutions quite complex to implement.

A simple solution is for each thread to skip ncpu elements per step, and
to skip as many elements as the current CPU number at the start, if I
understand you correctly.
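
The striding idea in the message above can be modeled in userland C. This is only a sketch under stated assumptions: `struct tw_node`, `tw_advance()` and `tw_strided_visit()` are hypothetical names made up for illustration, not kernel code, and the list is assumed not to change while it is walked. The real TIME_WAIT queue is modified concurrently, so list positions there are not stable and the scheme would need more care.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of the TIME_WAIT queue. */
struct tw_node {
	int id;
	struct tw_node *next;
};

/* Follow up to n links; returns NULL if the list ends first. */
static struct tw_node *
tw_advance(struct tw_node *node, unsigned n)
{
	while (node != NULL && n > 0) {
		node = node->next;
		n--;
	}
	return (node);
}

/*
 * Record the ids that the scanner running on `cpu` (0 <= cpu < ncpu)
 * would visit: skip `cpu` elements at the start, then move `ncpu`
 * elements per step. Returns the number of ids stored in `out`.
 */
static size_t
tw_strided_visit(struct tw_node *head, unsigned cpu, unsigned ncpu,
    int *out, size_t outlen)
{
	struct tw_node *node;
	size_t n = 0;

	if (ncpu == 0)
		return (0);	/* a zero stride would never advance */
	for (node = tw_advance(head, cpu); node != NULL && n < outlen;
	    node = tw_advance(node, ncpu))
		out[n++] = node->id;
	return (n);
}
```

With ten entries and four scanners, scanner 0 would touch entries 0, 4 and 8 while scanner 1 would touch 1, 5 and 9, so no two scanners start on the same entry. Whether that is acceptable for expiry ordering is a separate question that this model does not answer.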
From owner-freebsd-stable@freebsd.org Thu Sep 22 09:53:34 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3DE77BE53F3 for ; Thu, 22 Sep 2016 09:53:34 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 006C86A4 for ; Thu, 22 Sep 2016 09:53:34 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from slw by zxy.spb.ru with local (Exim 4.86 (FreeBSD)) (envelope-from ) id 1bn0h9-000IlA-Ov; Thu, 22 Sep 2016 12:53:31 +0300 Date: Thu, 22 Sep 2016 12:53:31 +0300 From: Slawa Olhovchenkov To: Julien Charbon Cc: Konstantin Belousov , freebsd-stable@FreeBSD.org, hiren panchasara Subject: Re: 11.0 stuck on high network load Message-ID: <20160922095331.GB2840@zxy.spb.ru> References: <20160916181839.GC2960@zxy.spb.ru> <20160916183053.GL9397@strugglingcoder.info> <20160916190330.GG2840@zxy.spb.ru> <78cbcdc9-f565-1046-c157-2ddd8fcccc62@freebsd.org> <20160919204328.GN2840@zxy.spb.ru> <8ba75d6e-4f01-895e-0aed-53c6c6692cb9@freebsd.org> <20160920202633.GQ2840@zxy.spb.ru> <20160921195155.GW2840@zxy.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.24 (2015-08-30) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Sep 2016 09:53:34 -0000 On Wed, Sep 21, 2016 at 11:25:18PM +0200, Julien Charbon wrote: > > Hi Slawa, > > On 9/21/16 9:51 PM, Slawa 
Olhovchenkov wrote:
> > On Wed, Sep 21, 2016 at 09:11:24AM +0200, Julien Charbon wrote:
> >> You can also use Dtrace and lockstat (especially with the lockstat -s
> >> option):
> >>
> >> https://wiki.freebsd.org/DTrace/One-Liners#Kernel_Locks
> >> https://www.freebsd.org/cgi/man.cgi?query=lockstat&manpath=FreeBSD+11.0-RELEASE
> >>
> >> But I am less familiar with Dtrace/lockstat tools.
> >
> > I am still use old kernel and got lockdown again.
> > Try using lockstat (I am save more output), interesting may be next:
> >
> > R/W writer spin on writer: 190019 events in 1.070 seconds (177571 events/sec)
> >
> > -------------------------------------------------------------------------------
> >  Count indv cuml rcnt     nsec Lock                   Caller
> > 140839  74%  74% 0.00    24659 tcpinp                 tcp_tw_2msl_scan+0xc6
> >
> >       nsec ------ Time Distribution ------ count     Stack
> >       4096 |                               913       tcp_twstart+0xa3
> >       8192 |@@@@@@@@@@@@                   58191     tcp_do_segment+0x201f
> >      16384 |@@@@@@                         29594     tcp_input+0xe1c
> >      32768 |@@@@                           23447     ip_input+0x15f
> >      65536 |@@@                            16197
> >     131072 |@                              8674
> >     262144 |                               3358
> >     524288 |                               456
> >    1048576 |                               9
> > -------------------------------------------------------------------------------
> >  Count indv cuml rcnt     nsec Lock                   Caller
> >  49180  26% 100% 0.00    15929 tcpinp                 tcp_tw_2msl_scan+0xc6
> >
> >       nsec ------ Time Distribution ------ count     Stack
> >       4096 |                               157       pfslowtimo+0x54
> >       8192 |@@@@@@@@@@@@@@@                24796     softclock_call_cc+0x179
> >      16384 |@@@@@@                         11223     softclock+0x44
> >      32768 |@@@@                           7426      intr_event_execute_handlers+0x95
> >      65536 |@@                             3918
> >     131072 |                               1363
> >     262144 |                               278
> >     524288 |                               19
> > -------------------------------------------------------------------------------
>
> This is interesting, it seems that you have two call paths competing
> for INP locks here:
>
> - pfslowtimo()/tcp_tw_2msl_scan(reuse=0) and
>
> - tcp_input()/tcp_twstart()/tcp_tw_2msl_scan(reuse=1)

I think the same.
> These paths can indeed compete for the same INP lock, as both
> tcp_tw_2msl_scan() calls always start with the first inp found in
> twq_2msl list. But in both cases, this first inp should be quickly used
> and its lock released anyway, thus that could explain your situation it
> that the TCP stack is doing that all the time, for example:
>
> - Let say that you are running out completely and constantly of tcptw,
> and then all connections transitioning to TIME_WAIT state are competing
> with the TIME_WAIT timeout scan that tries to free all the expired
> tcptw. If the stack is doing that all the time, it can appear like
> "live" locked.
>
> This is just an hypothesis and as usual might be a red herring.
> Anyway, could you run:
>
> $ vmstat -z | head -2; vmstat -z | grep -E 'tcp|sock'

ITEM                   SIZE   LIMIT     USED     FREE      REQ FAIL SLEEP

socket:                 864, 4192664,   18604,   25348,49276158,   0,   0
tcp_inpcb:              464, 4192664,   34226,   18702,49250593,   0,   0
tcpcb:                 1040, 4192665,   18424,   18953,49250593,   0,   0
tcptw:                   88,   16425,   15802,     623,14526919,   8,   0
tcpreass:                40,   32800,      15,    2285,  632381,   0,   0

In the normal case tcptw is about 16425/600/900.

And after `sysctl -a | grep tcp` the system got stuck on the serial console and I reset it.

> Ideally, once when everything is ok, and once when you have the issue
> to see the differences (if any).
>
> If it appears your are quite low in tcptw, and if you have enough
> memory, could you try increase the tcptw limit using sysctl

I think this will not eliminate the stuck condition; it may just make it less frequent.

> net.inet.tcp.maxtcptw? And actually see if it improve (or not) your
> performance.

I have already played with net.inet.tcp.maxtcptw and it does not affect performance.
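
To make the `vmstat -z` rows above easier to compare between the healthy and the stuck state, one row can be parsed and turned into a utilization figure. The following is a minimal sketch, not part of any FreeBSD tool: `struct zone_row`, `parse_zone_row()` and `zone_used_pct()` are hypothetical helpers, and the parser assumes the comma-separated layout shown in the output above. A LIMIT of 0 is treated here as "no limit", which is an assumption of this sketch.

```c
#include <assert.h>
#include <stdio.h>

/* One row of `vmstat -z` output, in the column order shown above. */
struct zone_row {
	char name[32];
	unsigned long size, limit, used, free_cnt, req, fail, sleep_cnt;
};

/* Parse "name: SIZE, LIMIT, USED, FREE, REQ, FAIL, SLEEP"; 0 on success. */
static int
parse_zone_row(const char *line, struct zone_row *z)
{
	int n;

	n = sscanf(line, " %31[^:]: %lu, %lu, %lu, %lu, %lu, %lu, %lu",
	    z->name, &z->size, &z->limit, &z->used, &z->free_cnt,
	    &z->req, &z->fail, &z->sleep_cnt);
	return (n == 8 ? 0 : -1);
}

/* Share of the zone limit currently in use, in percent. */
static double
zone_used_pct(const struct zone_row *z)
{
	if (z->limit == 0)
		return (0.0);	/* assumed to mean "unlimited" */
	return (100.0 * (double)z->used / (double)z->limit);
}
```

Fed the `tcptw` row from the stuck system, this reports that roughly 96% of the zone limit is in use with a non-zero FAIL count, which is the kind of figure the tcptw-exhaustion hypothesis would predict.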
From owner-freebsd-stable@freebsd.org Thu Sep 22 10:04:50 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 338EDBE5F2B for ; Thu, 22 Sep 2016 10:04:50 +0000 (UTC) (envelope-from julien.charbon@gmail.com) Received: from mail-wm0-f48.google.com (mail-wm0-f48.google.com [74.125.82.48]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id C705FE38 for ; Thu, 22 Sep 2016 10:04:49 +0000 (UTC) (envelope-from julien.charbon@gmail.com) Received: by mail-wm0-f48.google.com with SMTP id l132so138814886wmf.1 for ; Thu, 22 Sep 2016 03:04:49 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:subject:to:references:cc:from:message-id:date :user-agent:mime-version:in-reply-to:content-transfer-encoding; bh=2/trLX7Oqf49tG+npLioKVH7OvVp2s0tV4QGicJCc80=; b=AcsiA+4HCIZRtdbW13meKoIYD8Fj+DyKXKMqrgKos4lN/v5Z72Lh4zvaIVDkpuU8Uy JAdqvtP6cINTEnB7hzvOP9GyNFrdf1kEQnlnJRN0CmwOeJ0czumzBloktQwVascGa9Yh kcinQT3Ib39K44Jif+0kJOFTo7fLtVduwMRVKuCOs7cNMQQ1GBWWFQgFZdrjeCChGg+8 16LG5mS2DjsxuBwIQ1YEuYqt1eu3GBohDq/YRExvoz0ufrjtQ1DjRMW3yXYD4M3llLp7 /aJmhdbuwka+eIwTNmxdjvrK0iIrtH8/byXfhcjnNtcNa8rfaz9YE6eFz44SVW1I8I7y z0Gw== X-Gm-Message-State: AE9vXwPDX9lLaeLOBplVBrPPJ+zCr0bwqyeupWNqOmtoAZjnghCP5LyK2u7wMf4AkY/RfQ== X-Received: by 10.28.189.197 with SMTP id n188mr6895136wmf.116.1474538682441; Thu, 22 Sep 2016 03:04:42 -0700 (PDT) Received: from [10.100.64.44] ([217.30.88.7]) by smtp.gmail.com with ESMTPSA id t202sm36782033wmt.22.2016.09.22.03.04.41 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 22 Sep 2016 03:04:41 -0700 (PDT) Subject: Re: 11.0 stuck on high network load To: Slawa Olhovchenkov References: 
<20160916181839.GC2960@zxy.spb.ru> <20160916183053.GL9397@strugglingcoder.info> <20160916190330.GG2840@zxy.spb.ru> <78cbcdc9-f565-1046-c157-2ddd8fcccc62@freebsd.org> <20160919204328.GN2840@zxy.spb.ru> <8ba75d6e-4f01-895e-0aed-53c6c6692cb9@freebsd.org> <20160920202633.GQ2840@zxy.spb.ru> <20160921195155.GW2840@zxy.spb.ru> <20160922095331.GB2840@zxy.spb.ru> Cc: Konstantin Belousov , freebsd-stable@FreeBSD.org, hiren panchasara From: Julien Charbon Message-ID: <67862b33-63c0-2f23-d254-5ddc55dbb554@freebsd.org> Date: Thu, 22 Sep 2016 12:04:40 +0200 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:45.0) Gecko/20100101 Thunderbird/45.2.0 MIME-Version: 1.0 In-Reply-To: <20160922095331.GB2840@zxy.spb.ru> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Sep 2016 10:04:50 -0000 Hi Slawa, On 9/22/16 11:53 AM, Slawa Olhovchenkov wrote: > On Wed, Sep 21, 2016 at 11:25:18PM +0200, Julien Charbon wrote: >> On 9/21/16 9:51 PM, Slawa Olhovchenkov wrote: >>> On Wed, Sep 21, 2016 at 09:11:24AM +0200, Julien Charbon wrote: >>>> You can also use Dtrace and lockstat (especially with the lockstat -s >>>> option): >>>> >>>> https://wiki.freebsd.org/DTrace/One-Liners#Kernel_Locks >>>> https://www.freebsd.org/cgi/man.cgi?query=lockstat&manpath=FreeBSD+11.0-RELEASE >>>> >>>> But I am less familiar with Dtrace/lockstat tools. >>> >>> I am still use old kernel and got lockdown again. 
>>> Try using lockstat (I am save more output), interesting may be next:
>>>
>>> R/W writer spin on writer: 190019 events in 1.070 seconds (177571 events/sec)
>>>
>>> -------------------------------------------------------------------------------
>>> Count indv cuml rcnt     nsec Lock                   Caller
>>> 140839  74%  74% 0.00    24659 tcpinp                 tcp_tw_2msl_scan+0xc6
>>>
>>>       nsec ------ Time Distribution ------ count     Stack
>>>       4096 |                               913       tcp_twstart+0xa3
>>>       8192 |@@@@@@@@@@@@                   58191     tcp_do_segment+0x201f
>>>      16384 |@@@@@@                         29594     tcp_input+0xe1c
>>>      32768 |@@@@                           23447     ip_input+0x15f
>>>      65536 |@@@                            16197
>>>     131072 |@                              8674
>>>     262144 |                               3358
>>>     524288 |                               456
>>>    1048576 |                               9
>>> -------------------------------------------------------------------------------
>>> Count indv cuml rcnt     nsec Lock                   Caller
>>>  49180  26% 100% 0.00    15929 tcpinp                 tcp_tw_2msl_scan+0xc6
>>>
>>>       nsec ------ Time Distribution ------ count     Stack
>>>       4096 |                               157       pfslowtimo+0x54
>>>       8192 |@@@@@@@@@@@@@@@                24796     softclock_call_cc+0x179
>>>      16384 |@@@@@@                         11223     softclock+0x44
>>>      32768 |@@@@                           7426      intr_event_execute_handlers+0x95
>>>      65536 |@@                             3918
>>>     131072 |                               1363
>>>     262144 |                               278
>>>     524288 |                               19
>>> -------------------------------------------------------------------------------
>>
>> This is interesting, it seems that you have two call paths competing
>> for INP locks here:
>>
>> - pfslowtimo()/tcp_tw_2msl_scan(reuse=0) and
>>
>> - tcp_input()/tcp_twstart()/tcp_tw_2msl_scan(reuse=1)
>
> I think same.
>
>> These paths can indeed compete for the same INP lock, as both
>> tcp_tw_2msl_scan() calls always start with the first inp found in
>> twq_2msl list.
>> But in both cases, this first inp should be quickly used
>> and its lock released anyway, thus that could explain your situation it
>> that the TCP stack is doing that all the time, for example:
>>
>> - Let say that you are running out completely and constantly of tcptw,
>> and then all connections transitioning to TIME_WAIT state are competing
>> with the TIME_WAIT timeout scan that tries to free all the expired
>> tcptw. If the stack is doing that all the time, it can appear like
>> "live" locked.
>>
>> This is just an hypothesis and as usual might be a red herring.
>> Anyway, could you run:
>>
>> $ vmstat -z | head -2; vmstat -z | grep -E 'tcp|sock'
>
> ITEM                   SIZE   LIMIT    USED    FREE      REQ FAIL SLEEP
>
> socket:                 864, 4192664,  18604,  25348,49276158,   0,   0
> tcp_inpcb:              464, 4192664,  34226,  18702,49250593,   0,   0
> tcpcb:                 1040, 4192665,  18424,  18953,49250593,   0,   0
> tcptw:                   88,   16425,  15802,    623,14526919,   8,   0
> tcpreass:                40,   32800,     15,   2285,  632381,   0,   0
>
> In normal case tcptw is about 16425/600/900
>
> And after `sysctl -a | grep tcp` system stuck on serial console and I am reset it.
>
>> Ideally, once when everything is ok, and once when you have the issue
>> to see the differences (if any).
>>
>> If it appears your are quite low in tcptw, and if you have enough
>> memory, could you try increase the tcptw limit using sysctl
>
> I think this is not eliminate stuck, just may do it less frequency

You are right, it would just be a big hint that the tcp_tw_2msl_scan()
contention hypothesis is the right one. Since I see you have plenty of
memory on your server, could you try with:

net.inet.tcp.maxtcptw=4192665

And see what happens. Just to validate this hypothesis.

Thanks.
-- Julien From owner-freebsd-stable@freebsd.org Thu Sep 22 10:12:43 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5B5B8BE4865 for ; Thu, 22 Sep 2016 10:12:43 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id E71726C7; Thu, 22 Sep 2016 10:12:42 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u8MACcTO074102 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Thu, 22 Sep 2016 13:12:38 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u8MACcTO074102 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u8MACcjN074101; Thu, 22 Sep 2016 13:12:38 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 22 Sep 2016 13:12:38 +0300 From: Konstantin Belousov To: Slawa Olhovchenkov Cc: John Baldwin , freebsd-stable@freebsd.org, alc@freebsd.org Subject: Re: nginx and FreeBSD11 Message-ID: <20160922101238.GS38409@kib.kiev.ua> References: <20160920192053.GP2840@zxy.spb.ru> <20160920201925.GI38409@kib.kiev.ua> <20160920203853.GR2840@zxy.spb.ru> <20160920211517.GJ38409@kib.kiev.ua> <20160922075933.GL38409@kib.kiev.ua> <20160922082527.GX2840@zxy.spb.ru> <20160922082740.GN38409@kib.kiev.ua> <20160922083424.GY2840@zxy.spb.ru> <20160922085320.GQ38409@kib.kiev.ua> <20160922093355.GZ2840@zxy.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160922093355.GZ2840@zxy.spb.ru> User-Agent: Mutt/1.6.1 
(2016-04-27) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Sep 2016 10:12:43 -0000 On Thu, Sep 22, 2016 at 12:33:55PM +0300, Slawa Olhovchenkov wrote: > Do you still need first 100 lines from verbose boot? No. From owner-freebsd-stable@freebsd.org Thu Sep 22 10:20:48 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 68621BE4FB7 for ; Thu, 22 Sep 2016 10:20:48 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 2B34CAF7; Thu, 22 Sep 2016 10:20:48 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from slw by zxy.spb.ru with local (Exim 4.86 (FreeBSD)) (envelope-from ) id 1bn17V-000JSS-KO; Thu, 22 Sep 2016 13:20:45 +0300 Date: Thu, 22 Sep 2016 13:20:45 +0300 From: Slawa Olhovchenkov To: Julien Charbon Cc: Konstantin Belousov , freebsd-stable@FreeBSD.org, hiren panchasara Subject: Re: 11.0 stuck on high network load Message-ID: <20160922102045.GC2840@zxy.spb.ru> References: <20160916190330.GG2840@zxy.spb.ru> <78cbcdc9-f565-1046-c157-2ddd8fcccc62@freebsd.org> <20160919204328.GN2840@zxy.spb.ru> <8ba75d6e-4f01-895e-0aed-53c6c6692cb9@freebsd.org> <20160920202633.GQ2840@zxy.spb.ru> <20160921195155.GW2840@zxy.spb.ru> <20160922095331.GB2840@zxy.spb.ru> <67862b33-63c0-2f23-d254-5ddc55dbb554@freebsd.org> 
MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <67862b33-63c0-2f23-d254-5ddc55dbb554@freebsd.org> User-Agent: Mutt/1.5.24 (2015-08-30) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Sep 2016 10:20:48 -0000 On Thu, Sep 22, 2016 at 12:04:40PM +0200, Julien Charbon wrote: > >> These paths can indeed compete for the same INP lock, as both > >> tcp_tw_2msl_scan() calls always start with the first inp found in > >> twq_2msl list. But in both cases, this first inp should be quickly used > >> and its lock released anyway, thus that could explain your situation it > >> that the TCP stack is doing that all the time, for example: > >> > >> - Let say that you are running out completely and constantly of tcptw, > >> and then all connections transitioning to TIME_WAIT state are competing > >> with the TIME_WAIT timeout scan that tries to free all the expired > >> tcptw. If the stack is doing that all the time, it can appear like > >> "live" locked. > >> > >> This is just an hypothesis and as usual might be a red herring. > >> Anyway, could you run: > >> > >> $ vmstat -z | head -2; vmstat -z | grep -E 'tcp|sock' > > > > ITEM SIZE LIMIT USED FREE REQ FAIL SLEEP > > > > socket: 864, 4192664, 18604, 25348,49276158, 0, 0 > > tcp_inpcb: 464, 4192664, 34226, 18702,49250593, 0, 0 > > tcpcb: 1040, 4192665, 18424, 18953,49250593, 0, 0 > > tcptw: 88, 16425, 15802, 623,14526919, 8, 0 > > tcpreass: 40, 32800, 15, 2285, 632381, 0, 0 > > > > In normal case tcptw is about 16425/600/900 > > > > And after `sysctl -a | grep tcp` system stuck on serial console and I am reset it. 
> >
> >> Ideally, once when everything is ok, and once when you have the issue
> >> to see the differences (if any).
> >>
> >> If it appears your are quite low in tcptw, and if you have enough
> >> memory, could you try increase the tcptw limit using sysctl
> >
> > I think this is not eliminate stuck, just may do it less frequency
>
> You are right, it would just be a big hint that the tcp_tw_2msl_scan()
> contention hypothesis is the right one. As I see you have plenty of
> memory on your server, thus could you try with:
>
> net.inet.tcp.maxtcptw=4192665
>
> And see what happen. Just to validate this hypothesis.

This is a bad way to validate: with maxtcptw=16384 the lockup happens at
random, and I may have to wait a month for it. With maxtcptw=4192665 I
don't know how long I would need to wait to verify this hypothesis.

Lesser traffic drops (not to zero for minutes) happen more frequently
(maybe 3-5 times per day). Maybe these are also caused by contention in
tcp_tw_2msl_scan, but resolve quickly (a stochastic process). With the CPU
being eaten, nginx can't service connections, so clients close their
connections; that needs more TIME_WAIT entries and can trigger
tcp_tw_2msl_scan(reuse=1). After this we can get a live lock.

Maybe once I learn how to catch and diagnose this, the validation will be
more accurate.
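
[Editorial sketch, not part of the original thread.] The "are we running out of tcptw?" check discussed above can be made mechanical by comparing USED against LIMIT for each zone in `vmstat -z` output. The following is a minimal sketch in Python; the helper names `parse_vmstat_z` and `utilization` are hypothetical, and the parser assumes the comma-separated row layout shown in the quoted output (SIZE, LIMIT, USED, FREE, REQ, FAIL, SLEEP). The sample rows are copied from the message above.

```python
def parse_vmstat_z(text):
    """Parse `vmstat -z` rows of the form
    'name: size, limit, used, free, req, fail, sleep' into a dict.
    Lines that do not match this layout (e.g. the header) are skipped."""
    zones = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # no 'name:' prefix, e.g. the ITEM/SIZE header line
        fields = [f.strip() for f in rest.split(",")]
        if len(fields) != 7 or not all(f.isdigit() for f in fields):
            continue
        size, limit, used, free, req, fail, sleep = map(int, fields)
        zones[name.strip()] = {
            "size": size, "limit": limit, "used": used, "free": free,
            "req": req, "fail": fail, "sleep": sleep,
        }
    return zones


def utilization(zone):
    """Fraction of the zone limit currently in use; None if the zone is unlimited."""
    if zone["limit"] == 0:
        return None
    return zone["used"] / zone["limit"]


# Rows taken from the vmstat -z output quoted in the thread.
SAMPLE = """\
ITEM                   SIZE   LIMIT    USED    FREE      REQ FAIL SLEEP

socket:                 864, 4192664,  18604,  25348,49276158,   0,   0
tcp_inpcb:              464, 4192664,  34226,  18702,49250593,   0,   0
tcpcb:                 1040, 4192665,  18424,  18953,49250593,   0,   0
tcptw:                   88,   16425,  15802,    623,14526919,   8,   0
tcpreass:                40,   32800,     15,   2285,  632381,   0,   0
"""

if __name__ == "__main__":
    for name, zone in parse_vmstat_z(SAMPLE).items():
        ratio = utilization(zone)
        shown = "n/a" if ratio is None else f"{ratio:.1%}"
        print(f"{name:10s} used/limit = {shown}  fail = {zone['fail']}")
```

On the quoted numbers, tcptw is the only zone close to its limit and the only one with a non-zero FAIL count, which is consistent with (but does not prove) the contention hypothesis.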
From owner-freebsd-stable@freebsd.org Thu Sep 22 11:54:57 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4C7AFBE5FB1 for ; Thu, 22 Sep 2016 11:54:57 +0000 (UTC) (envelope-from emz@norma.perm.ru) Received: from elf.hq.norma.perm.ru (mail.norma.perm.ru [IPv6:2a00:7540:1::5]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mail.norma.perm.ru", Issuer "Vivat-Trade UNIX Root CA" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id B48F81240 for ; Thu, 22 Sep 2016 11:54:56 +0000 (UTC) (envelope-from emz@norma.perm.ru) Received: from bsdrookie.norma.com. ([IPv6:fd00::7fe]) by elf.hq.norma.perm.ru (8.15.2/8.15.2) with ESMTPS id u8MBsqmr036632 (version=TLSv1.2 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO) for ; Thu, 22 Sep 2016 16:54:52 +0500 (YEKT) (envelope-from emz@norma.perm.ru) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=norma.perm.ru; s=key; t=1474545292; bh=G5TbbUvZhIL3hLyO9kY1RvHA9CIQc6CB1ojt4D5kQ+E=; h=To:From:Subject:Date; b=iGoFEAQQgK58G++jBa45xLC0CREVsBd+Nd6k2CqxqR0rtNREDWAwUYXUhJuM4+/S+ aREPZjRqCUM1efrXP6PYqQKyzYL0lth4w31JDaUF57DuNqMylXz32U312BrdCpJbXx V3Vle6BGxtCWYjRIUL+JXIPMmKVxJ1XBztAlqkFI= To: freebsd-stable@FreeBSD.org From: "Eugene M. Zheganin" Subject: zfs/raidz and creation pause/blocking Message-ID: <57E3C68C.8060200@norma.perm.ru> Date: Thu, 22 Sep 2016 16:54:52 +0500 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:38.0) Gecko/20100101 Thunderbird/38.7.0 MIME-Version: 1.0 Content-Type: text/plain; charset=koi8-r Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Sep 2016 11:54:57 -0000 Hi. 
Recently I spent a lot of time setting up various zfs installations, and
I have a question.
Often when creating a raidz on fairly big disks (>~ 1T) I see something
weird: "zpool create" blocks and waits for several minutes. At the same
time the system is fully responsive, and I can see in gstat that the
kernel touches all the pool candidates sequentially, each at 100% busy
with iops around zero (in the example below, taken from a live system,
it's doing something with da11):

(zpool create gamestop raidz da5 da7 da8 da9 da10 da11)

dT: 1.064s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    0      0      0      0    0.0      0      0    0.0     0.0| da0
    0      0      0      0    0.0      0      0    0.0     0.0| da1
    0      0      0      0    0.0      0      0    0.0     0.0| da2
    0      0      0      0    0.0      0      0    0.0     0.0| da3
    0      0      0      0    0.0      0      0    0.0     0.0| da4
    0      0      0      0    0.0      0      0    0.0     0.0| da5
    0      0      0      0    0.0      0      0    0.0     0.0| da6
    0      0      0      0    0.0      0      0    0.0     0.0| da7
    0      0      0      0    0.0      0      0    0.0     0.0| da8
    0      0      0      0    0.0      0      0    0.0     0.0| da9
    0      0      0      0    0.0      0      0    0.0     0.0| da10
  150      3      0      0    0.0      0      0    0.0   112.6| da11
    0      0      0      0    0.0      0      0    0.0     0.0| da0p1
    0      0      0      0    0.0      0      0    0.0     0.0| da0p2
    0      0      0      0    0.0      0      0    0.0     0.0| da0p3
    0      0      0      0    0.0      0      0    0.0     0.0| da1p1
    0      0      0      0    0.0      0      0    0.0     0.0| da1p2
    0      0      0      0    0.0      0      0    0.0     0.0| da1p3
    0      0      0      0    0.0      0      0    0.0     0.0| da0p4
    0      0      0      0    0.0      0      0    0.0     0.0| gpt/boot0
    0      0      0      0    0.0      0      0    0.0     0.0| gptid/22659641-7ee6-11e6-9b56-0cc47aa41194
    0      0      0      0    0.0      0      0    0.0     0.0| gpt/zroot0
    0      0      0      0    0.0      0      0    0.0     0.0| gpt/esx0
    0      0      0      0    0.0      0      0    0.0     0.0| gpt/boot1
    0      0      0      0    0.0      0      0    0.0     0.0| gptid/23c1fbec-7ee6-11e6-9b56-0cc47aa41194
    0      0      0      0    0.0      0      0    0.0     0.0| gpt/zroot1
    0      0      0      0    0.0      0      0    0.0     0.0| mirror/mirror
    0      0      0      0    0.0      0      0    0.0     0.0| da1p4
    0      0      0      0    0.0      0      0    0.0     0.0| gpt/esx1

The funniest thing is that da5 and da7-11 are SSDs, capable of something
like 30K iops at the very least.
So I wonder what is happening during this time and why it takes that
long, because usually pools are created very quickly.

Thanks.
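
[Editorial sketch, not part of the original thread.] The symptom described above, a provider that is close to 100% busy while completing almost no operations, can be picked out of gstat output programmatically. The following is a minimal Python sketch; the helper names `parse_gstat` and `busy_but_idle` and both thresholds are assumptions chosen for illustration, and the parser relies on the nine-column layout followed by `| name` that appears in the message above.

```python
def parse_gstat(text):
    """Parse gstat rows of the form
    'L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy| name'.
    Lines without a '|' separator (the dT: line, the header) are skipped."""
    rows = []
    for line in text.splitlines():
        stats, sep, name = line.partition("|")
        if not sep:
            continue
        fields = stats.split()
        if len(fields) != 9:
            continue
        rows.append({
            "name": name.strip(),
            "queue": int(fields[0]),   # L(q): queue length
            "ops": int(fields[1]),     # ops/s
            "busy": float(fields[8]),  # %busy
        })
    return rows


def busy_but_idle(rows, busy_threshold=90.0, ops_threshold=10):
    """Names of providers that look saturated yet complete few operations.
    Both thresholds are illustrative assumptions, not gstat defaults."""
    return [r["name"] for r in rows
            if r["busy"] >= busy_threshold and r["ops"] <= ops_threshold]


# Three rows copied from the gstat output in the message above.
SAMPLE = """\
dT: 1.064s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    0      0      0      0    0.0      0      0    0.0     0.0| da10
  150      3      0      0    0.0      0      0    0.0   112.6| da11
    0      0      0      0    0.0      0      0    0.0     0.0| da0p1
"""

if __name__ == "__main__":
    print(busy_but_idle(parse_gstat(SAMPLE)))
```

A deep queue combined with near-zero ops/s suggests the device is working on a few long-running requests rather than many small ones; the sketch only flags the pattern and says nothing about its cause.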
From owner-freebsd-stable@freebsd.org Thu Sep 22 11:56:57 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A1262BE417F for ; Thu, 22 Sep 2016 11:56:57 +0000 (UTC) (envelope-from emz@norma.perm.ru) Received: from elf.hq.norma.perm.ru (mail.norma.perm.ru [IPv6:2a00:7540:1::5]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mail.norma.perm.ru", Issuer "Vivat-Trade UNIX Root CA" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 24C51143E for ; Thu, 22 Sep 2016 11:56:56 +0000 (UTC) (envelope-from emz@norma.perm.ru) Received: from bsdrookie.norma.com. ([IPv6:fd00::7fe]) by elf.hq.norma.perm.ru (8.15.2/8.15.2) with ESMTPS id u8MBura9036784 (version=TLSv1.2 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO) for ; Thu, 22 Sep 2016 16:56:54 +0500 (YEKT) (envelope-from emz@norma.perm.ru) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=norma.perm.ru; s=key; t=1474545414; bh=aNqY3RqpJv4OYOPqCxUv7SDzZNfcRs+Smwqu8Mbe9PA=; h=From:Subject:To:Date; b=Aa75zAqMA5VhctbhYf6qhKvxotB03KHumngXnVwTYjfDBApuwtZGUpDIjTVnXFqZJ Xj9yz0KE5ejGi7UniV+JHiMnUeDEu5Df2xIZsgomW3Zxr7TlW+JftiY6SiOfb4EJ4P ZxmV+Rdu3y4dhXzrVWVTNdMIA3AEd1eEL5KJCwrk= From: "Eugene M. Zheganin" Subject: zvol clone diffs To: freebsd-stable@FreeBSD.org Message-ID: <57E3C705.2010702@norma.perm.ru> Date: Thu, 22 Sep 2016 16:56:53 +0500 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:38.0) Gecko/20100101 Thunderbird/38.7.0 MIME-Version: 1.0 Content-Type: text/plain; charset=koi8-r Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Sep 2016 11:56:57 -0000 Hi. 
I should mention from the start that this is a question about an
engineering task, not about a FreeBSD issue.

I have a set of zvol clones that I redistribute over iSCSI. Several
Windows VMs use these clones as disks via their embedded iSCSI
initiators (each clone represents a disk with an NTFS partition, is
imported as a "foreign" disk and functions just fine). In my opinion,
they should not need to do any additional writes on these clones
(each VM should only read data, from my point of view). But zfs shows
they do, and sometimes they write a lot of data, so clearly facts and
expectations differ a lot - obviously there is something I didn't take
into account.

Is there any way to figure out what these writes are? I cannot come up
with any simple enough method.

Thanks.
Eugene.

From owner-freebsd-stable@freebsd.org Thu Sep 22 12:01:06 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id CFDB1BE4D39 for ; Thu, 22 Sep 2016 12:01:06 +0000 (UTC) (envelope-from killing@multiplay.co.uk) Received: from mail-wm0-x22d.google.com (mail-wm0-x22d.google.com [IPv6:2a00:1450:400c:c09::22d]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 67B8E1AD9 for ; Thu, 22 Sep 2016 12:01:06 +0000 (UTC) (envelope-from killing@multiplay.co.uk) Received: by mail-wm0-x22d.google.com with SMTP id l132so323610499wmf.0 for ; Thu, 22 Sep 2016 05:01:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=multiplay-co-uk.20150623.gappssmtp.com; s=20150623; h=subject:to:references:from:message-id:date:user-agent:mime-version :in-reply-to; bh=AfK6KS3C0KGXqgRs3JIxXHMrsWgrEjsK7hvM2qdH+DU=; b=HscqE/Wi94Ta2IYz6EzCs2hl8VXIBgXbg4moShNxOI3O246ohn1JJ6U2SqwD11VNR6
CxwJpLW/TkxZ02gBqhPtOYjcb7LZrJEQgDQTsEDhQVk5c1A+3CordtZpXwL2MxHcWnGK 28aaZYJpPjlrJBJNCguKn+oT4ExwjXfyUZHE9CLdslAnzS5oWJU2hEflDkiGQFlEnvRQ tZSUDJ6FcMWWOXLndYRtdesEQVGTG5FgRsspVCU6lBAMY6whO1/IXHwI5lPU1tm/z6cA ECm1yBKukbxLqnZ4jTtoUCmbqqT9lM2qRBlqAWf/x/J6SH/TL2vCF5tOcDyD6r02Vi9f L9VQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:subject:to:references:from:message-id:date :user-agent:mime-version:in-reply-to; bh=AfK6KS3C0KGXqgRs3JIxXHMrsWgrEjsK7hvM2qdH+DU=; b=BsS6/A5n6hee3mkM2I0Qx4MK8lORHuha58WaYCya8UxBjvkJLi7I2st0bNe0m4N16u zlffZ0QluDTqs+WfpWbM39VuiVI2k9o/wnD6Q21vPyjfPphBngT4FR5H9DMv3njx4iHq 2MkLcq0cusrTsV6GKtswFnH0xAtPOnkSNghe7Laymv8iQ4acOnZe22IRKIOu7d8d1SvA lNi76UxtOSBtQoEpayOT97ca88PqlwpDfXPAST1HcjI+vzelslGQb/Upg4JgXd3CbZA7 DBu6ge5d5kD/LNVjCWfmanQ4BPWN2gGJI/PciSmAMu4JT09CiF0GHICmw2b1z4up/twz 6n6Q== X-Gm-Message-State: AE9vXwPomnMoPi1voZX/fXdX6wUYkZe0DQJfTekKbMi+SF1WdtdAfRc7oSSdfupxnQmpduIO X-Received: by 10.28.131.199 with SMTP id f190mr8615104wmd.30.1474545664321; Thu, 22 Sep 2016 05:01:04 -0700 (PDT) Received: from [10.10.1.58] (liv3d.labs.multiplay.co.uk. 
[82.69.141.171]) by smtp.gmail.com with ESMTPSA id r9sm1783225wjp.15.2016.09.22.05.01.03 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 22 Sep 2016 05:01:03 -0700 (PDT) Subject: Re: zfs/raidz and creation pause/blocking To: freebsd-stable@freebsd.org References: <57E3C68C.8060200@norma.perm.ru> From: Steven Hartland Message-ID: Date: Thu, 22 Sep 2016 13:01:02 +0100 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101 Thunderbird/45.2.0 MIME-Version: 1.0 In-Reply-To: <57E3C68C.8060200@norma.perm.ru> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.23 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Sep 2016 12:01:06 -0000

Almost certainly it's TRIMing the drives; try setting the sysctl
vfs.zfs.vdev.trim_on_init=0

On 22/09/2016 12:54, Eugene M. Zheganin wrote:
> Hi.
>
> Recently I spent a lot of time setting up various zfs installations, and
> I got a question.
> Often when creating a raidz on disks considerably big (>~ 1T) I'm seeing
> a weird stuff: "zpool create" blocks, and waits for several minutes. In
> the same time system is fully responsive and I can see in gstat that the
> kernel starts to tamper all the pool candidates sequentially at 100%
> busy with iops around zero (in the example below, taken from a live
> system, it's doing something with da11):
>
> (zpool create gamestop raidz da5 da7 da8 da9 da10 da11)
>
> dT: 1.064s  w: 1.000s
>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
>     0      0      0      0    0.0      0      0    0.0     0.0| da0
>     0      0      0      0    0.0      0      0    0.0     0.0| da1
>     0      0      0      0    0.0      0      0    0.0     0.0| da2
>     0      0      0      0    0.0      0      0    0.0     0.0| da3
>     0      0      0      0    0.0      0      0    0.0     0.0| da4
>     0      0      0      0    0.0      0      0    0.0     0.0| da5
>     0      0      0      0    0.0      0      0    0.0     0.0| da6
>     0      0      0      0    0.0      0      0    0.0     0.0| da7
>     0      0      0      0    0.0      0      0    0.0     0.0| da8
>     0      0      0      0    0.0      0      0    0.0     0.0| da9
>     0      0      0      0    0.0      0      0    0.0     0.0| da10
>   150      3      0      0    0.0      0      0    0.0   112.6| da11
>     0      0      0      0    0.0      0      0    0.0     0.0| da0p1
>     0      0      0      0    0.0      0      0    0.0     0.0| da0p2
>     0      0      0      0    0.0      0      0    0.0     0.0| da0p3
>     0      0      0      0    0.0      0      0    0.0     0.0| da1p1
>     0      0      0      0    0.0      0      0    0.0     0.0| da1p2
>     0      0      0      0    0.0      0      0    0.0     0.0| da1p3
>     0      0      0      0    0.0      0      0    0.0     0.0| da0p4
>     0      0      0      0    0.0      0      0    0.0     0.0| gpt/boot0
>     0      0      0      0    0.0      0      0    0.0     0.0| gptid/22659641-7ee6-11e6-9b56-0cc47aa41194
>     0      0      0      0    0.0      0      0    0.0     0.0| gpt/zroot0
>     0      0      0      0    0.0      0      0    0.0     0.0| gpt/esx0
>     0      0      0      0    0.0      0      0    0.0     0.0| gpt/boot1
>     0      0      0      0    0.0      0      0    0.0     0.0| gptid/23c1fbec-7ee6-11e6-9b56-0cc47aa41194
>     0      0      0      0    0.0      0      0    0.0     0.0| gpt/zroot1
>     0      0      0      0    0.0      0      0    0.0     0.0| mirror/mirror
>     0      0      0      0    0.0      0      0    0.0     0.0| da1p4
>     0      0      0      0    0.0      0      0    0.0     0.0| gpt/esx1
>
> The most funny thing is that da5,7-11 are SSD, with a capability of like
> 30K iops at their least.
> So I wonder what is happening during this and why does it take that
> long. Because usually pools are creating very fast.
>
> Thanks.
> _______________________________________________ > freebsd-stable@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" From owner-freebsd-stable@freebsd.org Thu Sep 22 12:44:18 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1C451BE57EA for ; Thu, 22 Sep 2016 12:44:18 +0000 (UTC) (envelope-from matthew@FreeBSD.org) Received: from smtp.infracaninophile.co.uk (smtp.infracaninophile.co.uk [81.2.117.100]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "smtp.infracaninophile.co.uk", Issuer "infracaninophile.co.uk" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id CD588C39 for ; Thu, 22 Sep 2016 12:44:17 +0000 (UTC) (envelope-from matthew@FreeBSD.org) Received: from liminal.local (unknown [109.111.229.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) (Authenticated sender: m.seaman@infracaninophile.co.uk) by smtp.infracaninophile.co.uk (Postfix) with ESMTPSA id 1DAB41B81 for ; Thu, 22 Sep 2016 12:44:04 +0000 (UTC) Authentication-Results: smtp.infracaninophile.co.uk; dmarc=none header.from=FreeBSD.org Authentication-Results: smtp.infracaninophile.co.uk/1DAB41B81; dkim=none; dkim-atps=neutral Subject: Re: zvol clone diffs To: freebsd-stable@freebsd.org References: <57E3C705.2010702@norma.perm.ru> From: Matthew Seaman Message-ID: Date: Thu, 22 Sep 2016 14:43:49 +0200 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:45.0) Gecko/20100101 Thunderbird/45.2.0 MIME-Version: 1.0 In-Reply-To: <57E3C705.2010702@norma.perm.ru> Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="6JkaobB5JuQ2pmCeL7TaxhwxULuIsTlgk" X-Spam-Status: No, score=1.0 required=5.0 
tests=BAYES_00, RCVD_IN_BRBL_LASTEXT, RDNS_NONE,SPF_SOFTFAIL autolearn=no autolearn_force=no version=3.4.1 X-Spam-Level: * X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on smtp.infracaninophile.co.uk X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Sep 2016 12:44:18 -0000 This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --6JkaobB5JuQ2pmCeL7TaxhwxULuIsTlgk Content-Type: multipart/mixed; boundary="dlINSGC5NL3vqMMePkSoQStnNFVgPoicI"; protected-headers="v1" From: Matthew Seaman To: freebsd-stable@freebsd.org Message-ID: Subject: Re: zvol clone diffs References: <57E3C705.2010702@norma.perm.ru> In-Reply-To: <57E3C705.2010702@norma.perm.ru> --dlINSGC5NL3vqMMePkSoQStnNFVgPoicI Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable

On 22/09/2016 13:56, Eugene M. Zheganin wrote:
> Is there any way to figure out what these writes are ? Because I cannot
> propose any simple enough method.

Given you're using volumes for datasets where ZFS knows nothing about
the contained filesystem structure, about the only way to proceed is via
the Windows side of things. You'd need to somehow trap where Windows
issues a write and proceed from there.

Ideally you could do something like snapshot NTFS, wait until Windows
has written something and then compare the snapshot with the live
filesystem.
Very cursory Googling suggests that Microsoft calls this sort of thing a 'shadow copy' Cheers, Matthew --dlINSGC5NL3vqMMePkSoQStnNFVgPoicI-- --6JkaobB5JuQ2pmCeL7TaxhwxULuIsTlgk Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- iQJ8BAEBCgBmBQJX49IFXxSAAAAAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXQ2NTNBNjhCOTEzQTRFNkNGM0UxRTEzMjZC QjIzQUY1MThFMUE0MDEzAAoJELsjr1GOGkATGg0QAIFn3WmH9Igw7KW3rrxXrmX+ 92s3bQUPl9c8xzgKlcpLynVCkITQq5ztkC1qVlnNzE+TbGLMJNaHe61o10txGnoA E5VYKW8KsLGUTSDx2orI+caHK6WVGSxlmjay8IJumefaN8Ks7+6omC9BpfV1me1g KUqGstdNixTIQ0ghDvAUtUixY74/BhpvPEnufhczqBeHcg8H2ZPO/sWWS+icf09V FC3mWYaUhsvoW+8IlA+lyTh0QXGYfH86U6VoXeKjOIBxUx5eteRXIYmZ7Bybhcbt qNIHy6Jv2ltetniwNZwNUIqZpwqqsiORb86u7AGD3X7FCDr6fuVHAtsGnb9pcYUW yvuRAcu5iLvrU5c6BGCmj7O72/HWyI5CsFJY+TzbN/ApeAEfI/N/Rc7K5a4Qrrx3 QkYbn48CjLeGney2GieQJ+THobgBpREHDfTzjcbYGENIcNnV+aOAZNWyODx4rTte /93UChH/sYVrwSvM+oTOb2XxlDzvbPKtYQEsC7NPZeEC2L2qC7qtOiKxu0fY6TiS 0K98RcG94isFoTcC1rGF8CZBSrd4w7N7YkDTZC0AkC9qzXCF4hBGVUU/bpJmrIoA 0lFxKqxlLKst0Z80WuywZKcc/SSTWc9ZgbhvZDUw/ERPr+lfFbPojkRGgPvPSeiw 8enH90yDOqrnWiLQ2lln =qSav -----END PGP SIGNATURE----- --6JkaobB5JuQ2pmCeL7TaxhwxULuIsTlgk-- From owner-freebsd-stable@freebsd.org Thu Sep 22 13:56:32 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 542A5BE5120 for ; Thu, 22 Sep 2016 13:56:32 +0000 (UTC) (envelope-from jenkins-admin@FreeBSD.org) Received: from jenkins-9.freebsd.org (jenkins-9.freebsd.org [8.8.178.209]) by mx1.freebsd.org (Postfix) with ESMTP id 467BEC5E; Thu, 22 Sep 2016 13:56:32 +0000 (UTC) (envelope-from jenkins-admin@FreeBSD.org) Received: from jenkins-9.freebsd.org (localhost [127.0.0.1]) by jenkins-9.freebsd.org (Postfix) with ESMTP id 6EA16D8; 
Thu, 22 Sep 2016 13:56:32 +0000 (UTC) Date: Thu, 22 Sep 2016 13:56:31 +0000 (GMT) From: jenkins-admin@FreeBSD.org To: jenkins-admin@FreeBSD.org, freebsd-stable@FreeBSD.org Message-ID: <644852396.27.1474552591868.JavaMail.jenkins@jenkins-9.freebsd.org> In-Reply-To: <1758150249.26.1474530976648.JavaMail.jenkins@jenkins-9.freebsd.org> References: <1758150249.26.1474530976648.JavaMail.jenkins@jenkins-9.freebsd.org> Subject: Jenkins build is back to stable : FreeBSD_stable_10 #407 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Instance-Identity: MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAkKKb2VAfYQKfu1t7qk4nR5qzUBEI+UqT4BPec4qHVhqUy0FFdq50sMH+3y9bCDNOufctov6VqTNffZ3YXArnZK95YF0OX97fh+E9txYOUX1adc+TikcKjuYpHmL5dE62eaZTI+4A5jnRonskQ1PaoIFz0Kbu4mWzkFsmdiXTraGzomXq4cHUCATA2+K4eDYgjXEQI30z3GOMmmZ4t/+6QGk1cMb/BqMWHbn80AsRCb4tU7Hpd72XLDpsuO7YRP1Q0CjmNAuBOTj+sFiiOe6U9HpqOlQN+iFUvBdZo/ybuy5Kh71cAaYQNL68cYdZJ6binH/DkG3KY/fS7DFYAeuwjwIDAQAB X-Jenkins-Job: FreeBSD_stable_10 X-Jenkins-Result: SUCCESS X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Sep 2016 13:56:32 -0000 See From owner-freebsd-stable@freebsd.org Thu Sep 22 14:38:36 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 12D56BE5957 for ; Thu, 22 Sep 2016 14:38:36 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id CD4F35E5 for ; Thu, 22 Sep 2016 14:38:35 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from slw by zxy.spb.ru with local 
(Exim 4.86 (FreeBSD)) (envelope-from ) id 1bn58y-00004T-Kw; Thu, 22 Sep 2016 17:38:32 +0300 Date: Thu, 22 Sep 2016 17:38:32 +0300 From: Slawa Olhovchenkov To: "Eugene M. Zheganin" Cc: freebsd-stable@FreeBSD.org Subject: Re: zvol clone diffs Message-ID: <20160922143832.GJ2960@zxy.spb.ru> References: <57E3C705.2010702@norma.perm.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <57E3C705.2010702@norma.perm.ru> User-Agent: Mutt/1.5.24 (2015-08-30) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Sep 2016 14:38:36 -0000 On Thu, Sep 22, 2016 at 04:56:53PM +0500, Eugene M. Zheganin wrote: > Hi. > > I should mention from the start that this is a question about an > engineering task, not a question about FreeBSD issue. > > I have a set of zvol clones that I redistribute over iSCSI. Several > Windows VMs use these clones as disks via their embedded iSCSI > initiators (each clone represents a disk with an NTFS partition, is > imported as a "foreign" disk and functions just fine). From my opinion, > they should not have any need to do additional writes on these clones > (each VM should only read data, from my point of view). But zfs shows > they do, and sometimes they write a lot of data, so clearly facts and > expactations differ a lot - obviously I didn't take something into > accounting. May be atime like on NTFS? 
http://serverfault.com/questions/33932/how-do-you-disable-the-last-accessed-attribute-on-ntfs-windows From owner-freebsd-stable@freebsd.org Thu Sep 22 14:48:36 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0D04BBE5464; Thu, 22 Sep 2016 14:48:36 +0000 (UTC) (envelope-from dweimer@dweimer.net) Received: from webmail.dweimer.net (24-240-198-188.static.stls.mo.charter.com [24.240.198.188]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id D2668EC4; Thu, 22 Sep 2016 14:48:35 +0000 (UTC) (envelope-from dweimer@dweimer.net) Received: from webmail.dweimer.local (localhost [10.9.5.2]) by webmail.dweimer.net (8.15.2/8.15.2) with ESMTPS id u8MEmSs2004461 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Thu, 22 Sep 2016 09:48:28 -0500 (CDT) (envelope-from dweimer@dweimer.net) Received: (from www@localhost) by webmail.dweimer.local (8.15.2/8.15.2/Submit) id u8MEmSDJ004460; Thu, 22 Sep 2016 09:48:28 -0500 (CDT) (envelope-from dweimer@dweimer.net) X-Authentication-Warning: webmail.dweimer.local: www set sender to dweimer@dweimer.net using -f To: Slawa Olhovchenkov Subject: Re: zvol clone diffs MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII; format=flowed Content-Transfer-Encoding: 7bit Date: Thu, 22 Sep 2016 09:48:28 -0500 From: "Dean E. Weimer" Cc: "Eugene M. 
Zheganin" , freebsd-stable@freebsd.org, owner-freebsd-stable@freebsd.org Organization: dweimer.net Reply-To: dweimer@dweimer.net Mail-Reply-To: dweimer@dweimer.net In-Reply-To: <20160922143832.GJ2960@zxy.spb.ru> References: <57E3C705.2010702@norma.perm.ru> <20160922143832.GJ2960@zxy.spb.ru> Message-ID: X-Sender: dweimer@dweimer.net User-Agent: Roundcube Webmail/1.2.1 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Sep 2016 14:48:36 -0000

On 2016-09-22 9:38 am, Slawa Olhovchenkov wrote:
> On Thu, Sep 22, 2016 at 04:56:53PM +0500, Eugene M. Zheganin wrote:
>
>> Hi.
>>
>> I should mention from the start that this is a question about an
>> engineering task, not a question about a FreeBSD issue.
>>
>> I have a set of zvol clones that I redistribute over iSCSI. Several
>> Windows VMs use these clones as disks via their embedded iSCSI
>> initiators (each clone represents a disk with an NTFS partition, is
>> imported as a "foreign" disk and functions just fine). In my opinion,
>> they should not have any need to do additional writes on these clones
>> (each VM should only read data, from my point of view). But zfs shows
>> they do, and sometimes they write a lot of data, so clearly facts and
>> expectations differ a lot - obviously I didn't take something into
>> account.
>
> Maybe it is something like atime on NTFS?
>
> http://serverfault.com/questions/33932/how-do-you-disable-the-last-accessed-attribute-on-ntfs-windows

I would recommend using the Windows Diskpart command and setting the
volume's attribute to read-only; this will force the NTFS volume to be
read-only and shouldn't allow changes to be saved.

--
Thanks,
Dean E.
Weimer http://www.dweimer.net/ From owner-freebsd-stable@freebsd.org Thu Sep 22 19:32:21 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E48D6BE5C37 for ; Thu, 22 Sep 2016 19:32:21 +0000 (UTC) (envelope-from gjb@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id D5046F9F; Thu, 22 Sep 2016 19:32:21 +0000 (UTC) (envelope-from gjb@FreeBSD.org) Received: from FreeBSD.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by freefall.freebsd.org (Postfix) with ESMTP id 1CD22105D; Thu, 22 Sep 2016 19:32:21 +0000 (UTC) (envelope-from gjb@FreeBSD.org) Date: Thu, 22 Sep 2016 19:32:19 +0000 From: Glen Barber To: Mark Millard Cc: freebsd-stable@freebsd.org Subject: Re: svn commit: r306207 - releng/11.0/release/doc/en_US.ISO8859-1/relnotes Message-ID: <20160922193219.GD35413@FreeBSD.org> References: MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="balsNyzuYCBdi6WK" Content-Disposition: inline In-Reply-To: X-Operating-System: FreeBSD 11.0-CURRENT amd64 X-SCUD-Definition: Sudden Completely Unexpected Dataloss X-SULE-Definition: Sudden Unexpected Learning Event X-PEKBAC-Definition: Problem Exists, Keyboard Between Admin/Computer User-Agent: Mutt/1.5.24 (2015-08-30) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Sep 2016 19:32:22 -0000 --balsNyzuYCBdi6WK Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Bah. I'll add this to the errata.html page that it was turned off. It's my fault. 
Glen

On Thu, Sep 22, 2016 at 12:30:40PM -0700, Mark Millard wrote:
> https://svnweb.freebsd.org/base/releng/11.0/release/doc/en_US.ISO8859-1/relnotes/article.xml?revision=306207&view=markup
>
> says. . .
>
>
> > The
> >
> > WITH_SYSTEM_COMPILER &man.src.conf.5;
> >
> > option is enabled by default.
>
> but. . .
>
> > Author: bdrewery
> > Date: Wed Sep 21 21:23:09 2016
> > New Revision: 306143
> > URL: https://svnweb.freebsd.org/changeset/base/306143
> >
> > Log:
> > Disable SYSTEM_COMPILER by default.
> >
> > This is a direct commit to releng/11.0.
> >
> > Having it enabled can lead to a situation where building
> > on one system and installing on another will fail due
> > to not finding cc in the OBJDIR.
> >
> > An actual fix will be made on head separately.
> >
> > PR: 212877
> > Relnotes: yes
> > Sponsored by: Dell EMC Isilon
> > Approved by: re (gjb)
>
> ===
> Mark Millard
> markmi at dsl-only.net
>

--balsNyzuYCBdi6WK
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQIcBAEBCAAGBQJX5DG+AAoJEAMUWKVHj+KTAQ4P/R++P+p3WPbcTqV74oAdUjSZ
r0Q7qxSK5HKczyIYSkb0GpDJE8JI+5kn4H8C+rL6DWEg0N9Ss9cm3Gg+NKcgijbu
BS2QzlwG0Rt17jgDyLDeSG1gzyacrujI8BjiehhlYepXowcKL88+hq0sV9QC5VZy
Fukl74+KwLN/Q1dPcnsEJxKXbObQTfkY1PlHAXjdDOr+RhLBbOqSLgdG7HD7ZpCA
dixFU2IZo7H1dJs2pv7e6E9tAZHisuQcRTRq4b32KaXoPHM9F52OcuOrysh2d+n/
hHjO4xLAu65hwcqOTRH6kVcNO2ZRQP+FDVaak4kMKGh/bOTGcALcLdSsZOD3PbXV
HxDzz2ayjIijnUMsFL/QMRCA2eC/pHnhLKF1dWKA3OfsT3ww1ST6fPhKadS7gV3N
H9CeOKxZtpYc670tqjHA26D5tXiJDP8BY/SnYg33lrVmIxocC2EG5bH3dnwFIeTc
6KY7776Hhm9Oewir4xlWSMvaz5kQWQJDtdHb40BRjZRrEVlb1/rRfC2T0xqBg5XL
lr4TUEoIrUEFkKtYhRMknyjvQkh4RFtslFNDlFKGBrWpuw95zhtS+x++ICDvFadz
RMjOU5uAgxrdI0DbL9QnRDFrpLrif/YtypTKtNVg+bYzBX1ZHHHDbei8lXTMdRGP
5Uk3OzIr+hqheL5VBpMx
=6AxE
-----END PGP SIGNATURE-----

--balsNyzuYCBdi6WK--

From owner-freebsd-stable@freebsd.org Thu Sep 22 19:37:26 2016 Return-Path: Delivered-To:
freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 32B2DBE6021 for ; Thu, 22 Sep 2016 19:37:26 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: from asp.reflexion.net (outbound-mail-210-72.reflexion.net [208.70.210.72]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id C5292132D for ; Thu, 22 Sep 2016 19:37:25 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: (qmail 5768 invoked from network); 22 Sep 2016 19:30:35 -0000 Received: from unknown (HELO mail-cs-02.app.dca.reflexion.local) (10.81.19.2) by 0 (rfx-qmail) with SMTP; 22 Sep 2016 19:30:35 -0000 Received: by mail-cs-02.app.dca.reflexion.local (Reflexion email security v8.00.0) with SMTP; Thu, 22 Sep 2016 15:30:35 -0400 (EDT) Received: (qmail 21897 invoked from network); 22 Sep 2016 19:30:33 -0000 Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with (AES256-SHA encrypted) SMTP; 22 Sep 2016 19:30:33 -0000 Received: from [192.168.0.105] (ip70-189-131-151.lv.lv.cox.net [70.189.131.151]) by iron2.pdx.net (Postfix) with ESMTPSA id 6967EEC8843; Thu, 22 Sep 2016 12:30:41 -0700 (PDT) From: Mark Millard Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Subject: Re: svn commit: r306207 - releng/11.0/release/doc/en_US.ISO8859-1/relnotes Date: Thu, 22 Sep 2016 12:30:40 -0700 Message-Id: Cc: freebsd-stable@freebsd.org To: Glen Barber Mime-Version: 1.0 (Mac OS X Mail 9.3 \(3124\)) X-Mailer: Apple Mail (2.3124) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Sep 2016 19:37:26 -0000

https://svnweb.freebsd.org/base/releng/11.0/release/doc/en_US.ISO8859-1/relnotes/article.xml?revision=306207&view=markup

says. . .

> The
>
> WITH_SYSTEM_COMPILER &man.src.conf.5;
>
> option is enabled by default.

but. . .

> Author: bdrewery
> Date: Wed Sep 21 21:23:09 2016
> New Revision: 306143
> URL: https://svnweb.freebsd.org/changeset/base/306143
>
> Log:
> Disable SYSTEM_COMPILER by default.
>
> This is a direct commit to releng/11.0.
>
> Having it enabled can lead to a situation where building
> on one system and installing on another will fail due
> to not finding cc in the OBJDIR.
>
> An actual fix will be made on head separately.
>
> PR: 212877
> Relnotes: yes
> Sponsored by: Dell EMC Isilon
> Approved by: re (gjb)

===
Mark Millard
markmi at dsl-only.net

From owner-freebsd-stable@freebsd.org Thu Sep 22 23:31:37 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 88FADBE64EC for ; Thu, 22 Sep 2016 23:31:37 +0000 (UTC) (envelope-from chris@stankevitz.com) Received: from mango.stankevitz.com (mango.stankevitz.com [208.79.93.194]) by mx1.freebsd.org (Postfix) with ESMTP id 7C45FF21 for ; Thu, 22 Sep 2016 23:31:37 +0000 (UTC) (envelope-from chris@stankevitz.com) Received: from Chriss-MacBook-Pro.local (209-203-101-124.static.twtelecom.net [209.203.101.124]) by mango.stankevitz.com (Postfix) with ESMTPSA id 8E01E706A9 for ; Thu, 22 Sep 2016 16:22:48 -0700 (PDT) From: Chris Stankevitz To: freebsd-stable@freebsd.org Subject: Source upgrade to 10.3: Undefined symbol "__set_error_selector" Message-ID: <736b94be-3e0b-5723-d89d-fc1dd0584b10@stankevitz.com> Date: Thu, 22 Sep 2016 16:22:47 -0700 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:45.0) Gecko/20100101 Thunderbird/45.2.0 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Sep 2016 23:31:37 -0000 FYI (issue is resolved so I'm just reporting for posterity)... I have four offline ("air gapped") FreeBSD systems with nearly identical hardware. Two started life as 10.1-RELEASE and the other two started life as 10.2-RELEASE. All are kept up to date by bringing over /usr/src for their associated releases (using 'svn co https://svn.freebsd.org/base/releng/10.x') and make/buildworld. All were upgraded to 10.3-p5 and failed 'make installworld' with: /lib/libthr.so.3: Undefined symbol "__set_error_selector" All resolved with 'cd /usr/src/lib/libc && make install && cd /usr/src && make installworld' Chris From owner-freebsd-stable@freebsd.org Fri Sep 23 19:17:06 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5B462BE6180 for ; Fri, 23 Sep 2016 19:17:06 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id EF271A9E; Fri, 23 Sep 2016 19:17:05 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from slw by zxy.spb.ru with local (Exim 4.86 (FreeBSD)) (envelope-from ) id 1bnVxw-000IaI-MA; Fri, 23 Sep 2016 22:16:56 +0300 Date: Fri, 23 Sep 2016 22:16:56 +0300 From: Slawa Olhovchenkov To: Julien Charbon Cc: Konstantin Belousov , freebsd-stable@FreeBSD.org, hiren panchasara Subject: Re: 11.0 stuck on high network load Message-ID: <20160923191656.GF2840@zxy.spb.ru> References: <78cbcdc9-f565-1046-c157-2ddd8fcccc62@freebsd.org> 
<20160919204328.GN2840@zxy.spb.ru> <8ba75d6e-4f01-895e-0aed-53c6c6692cb9@freebsd.org> <20160920202633.GQ2840@zxy.spb.ru> <20160921195155.GW2840@zxy.spb.ru> <20160922095331.GB2840@zxy.spb.ru> <67862b33-63c0-2f23-d254-5ddc55dbb554@freebsd.org> <20160922102045.GC2840@zxy.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160922102045.GC2840@zxy.spb.ru> User-Agent: Mutt/1.5.24 (2015-08-30) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 23 Sep 2016 19:17:06 -0000

On Thu, Sep 22, 2016 at 01:20:45PM +0300, Slawa Olhovchenkov wrote:
> On Thu, Sep 22, 2016 at 12:04:40PM +0200, Julien Charbon wrote:
>
> > >> These paths can indeed compete for the same INP lock, as both
> > >> tcp_tw_2msl_scan() calls always start with the first inp found in
> > >> twq_2msl list. But in both cases, this first inp should be quickly used
> > >> and its lock released anyway, thus that could explain your situation if
> > >> the TCP stack is doing that all the time, for example:
> > >>
> > >> - Let's say that you are completely and constantly running out of tcptw,
> > >> and then all connections transitioning to TIME_WAIT state are competing
> > >> with the TIME_WAIT timeout scan that tries to free all the expired
> > >> tcptw. If the stack is doing that all the time, it can appear like
> > >> "live" locked.
> > >>
> > >> This is just a hypothesis and as usual might be a red herring.
> > >> Anyway, could you run:
> > >>
> > >> $ vmstat -z | head -2; vmstat -z | grep -E 'tcp|sock'
> > >
> > > ITEM SIZE LIMIT USED FREE REQ FAIL SLEEP
> > >
> > > socket: 864, 4192664, 18604, 25348,49276158, 0, 0
> > > tcp_inpcb: 464, 4192664, 34226, 18702,49250593, 0, 0
> > > tcpcb: 1040, 4192665, 18424, 18953,49250593, 0, 0
> > > tcptw: 88, 16425, 15802, 623,14526919, 8, 0
> > > tcpreass: 40, 32800, 15, 2285, 632381, 0, 0
> > >
> > > In the normal case tcptw is about 16425/600/900
> > >
> > > And after `sysctl -a | grep tcp` the system got stuck on the serial console and I reset it.
> > >
> > >> Ideally, once when everything is ok, and once when you have the issue
> > >> to see the differences (if any).
> > >>
> > >> If it appears you are quite low on tcptw, and if you have enough
> > >> memory, could you try increasing the tcptw limit using sysctl
> > >
> > > I think this will not eliminate the hang, it may just make it less frequent
> >
> > You are right, it would just be a big hint that the tcp_tw_2msl_scan()
> > contention hypothesis is the right one. As I see you have plenty of
> > memory on your server, thus could you try with:
> >
> > net.inet.tcp.maxtcptw=4192665
> >
> > And see what happens. Just to validate this hypothesis.
>
> This is a bad way to validate: with maxtcptw=16384 the hang occurs at
> random and may take a month of waiting. With maxtcptw=4192665 I don't
> know how long I would need to wait to verify this hypothesis.
>
> More frequently (maybe 3-5 times per day) there are smaller traffic drops
> (not to zero for minutes). Maybe these are also caused by contention in
> tcp_tw_2msl_scan, but resolved quickly (a stochastic process). With its
> CPU power eaten, nginx can't service connections, clients close their
> connections, more TIME_WAIT entries are needed, and this can trigger
> tcp_tw_2msl_scan(reuse=1). After this we can get a live lock.
>
> Maybe once I learn to catch and diagnose this, the validation will be
> more accurate.

Some more bits:

socket: 864, 4192664, 30806, 790,28524160, 0, 0
ipq: 56, 32802, 0, 1278, 1022, 0, 0
udp_inpcb: 464, 4192664, 44, 364, 14066, 0, 0
udpcb: 32, 4192750, 44, 3081, 14066, 0, 0
tcp_inpcb: 464, 4192664, 38558, 378,28476709, 0, 0
tcpcb: 1040, 4192665, 30690, 738,28476709, 0, 0
tcptw: 88, 32805, 7868, 772, 8412249, 0, 0

last pid: 49575; load averages: 2.00, 2.05, 3.75 up 1+01:12:08 22:13:42
853 processes: 15 running, 769 sleeping, 35 waiting, 34 lock
CPU 0: 0.0% user, 0.0% nice, 0.0% system, 100% interrupt, 0.0% idle
CPU 1: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 2: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 3: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 4: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 5: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 6: 0.0% user, 0.0% nice, 0.4% system, 0.0% interrupt, 99.6% idle
CPU 7: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 8: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 9: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 10: 0.0% user, 0.0% nice, 0.4% system, 0.0% interrupt, 99.6% idle
CPU 11: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
Mem: 8659M Active, 8385M Inact, 107G Wired, 1325M Free
ARC: 99G Total, 88G MFU, 10G MRU, 32K Anon, 167M Header, 529M Other
Swap: 32G Total, 32G Free

PID UID PRI NICE SIZE RES STATE C TIME WCPU COMMAND
49566 0 20 0 26264K 7068K CPU10 10 0:00 0.14% top
5 0 -8 - 0K 192K l2arc_ 6 31:07 0.13% zfskern{l2arc_fe
46120 0 20 0 14608K 3868K nanslp 8 0:10 0.10% tcpw
12 0 -76 - 0K 848K WAIT 1 0:00 0.02% intr{swi0: uart

total = used + free
Max used total used total used total used total
2016-09-23 21:46:41 tcptw 32805 389 1665 0 tcpcb 27788 28557 inpcb 28177 29296 socket 27875 28740
2016-09-23 21:46:42 tcptw 32805 409 1665 0 tcpcb 27723 28557 inpcb 28132 29296 socket 27812 28740
2016-09-23 21:46:43 tcptw 32805 405 1665 0 tcpcb 27750 28557 inpcb 28155 29296 socket 27838 28740
2016-09-23 21:46:44 tcptw 32805 409 1665 0 tcpcb 27765 28557 inpcb 28174 29296 socket 27852 28740
2016-09-23 21:46:45 tcptw 32805 422 1665 0 tcpcb 27737 28557 inpcb 28159 29296 socket 27824 28740
2016-09-23 21:46:46 tcptw 32805 444 1665 0 tcpcb 27729 28557 inpcb 28173 29296 socket 27817 28740
2016-09-23 21:46:47 tcptw 32805 444 1665 0 tcpcb 27712 28557 inpcb 28156 29296 socket 27800 28740
2016-09-23 21:46:48 tcptw 32805 428 1665 0 tcpcb 27693 28557 inpcb 28121 29296 socket 27783 28740
2016-09-23 21:46:49 tcptw 32805 431 1665 0 tcpcb 27677 28557 inpcb 28108 29296 socket 27766 28740
2016-09-23 21:46:50 tcptw 32805 455 1665 0 tcpcb 27721 28557 inpcb 28176 29296 socket 27809 28740
2016-09-23 21:46:51 tcptw 32805 432 1665 0 tcpcb 27739 28557 inpcb 28171 29296 socket 27828 28740
2016-09-23 21:46:52 tcptw 32805 434 1665 0 tcpcb 27731 28557 inpcb 28165 29296 socket 27819 28740
2016-09-23 21:46:53 tcptw 32805 431 1665 0 tcpcb 27742 28557 inpcb 28173 29296 socket 27831 28740
2016-09-23 21:46:54 tcptw 32805 424 1665 0 tcpcb 27738 28557 inpcb 28162 29296 socket 27826 28740
2016-09-23 21:46:55 tcptw 32805 397 1665 0 tcpcb 27740 28557 inpcb 28137 29296 socket 27827 28740
2016-09-23 21:46:56 tcptw 32805 412 1665 0 tcpcb 27742 28557 inpcb 28154 29296 socket 27830 28740
2016-09-23 21:46:57 tcptw 32805 418 1665 0 tcpcb 27749 28557 inpcb 28167 29296 socket 27838 28740
2016-09-23 21:46:58 tcptw 32805 426 1665 0 tcpcb 27740 28557 inpcb 28166 29296 socket 27827 28740
2016-09-23 21:46:59 tcptw 32805 423 1665 0 tcpcb 27687 28557 inpcb 28110 29296 socket 27773 28740
2016-09-23 21:47:00 tcptw 32805 426 1665 0 tcpcb 27716 28557 inpcb 28142 29296 socket 27804 28740
2016-09-23 21:47:01 tcptw 32805 437 1665 0 tcpcb 27732 28557 inpcb 28169 29296 socket 27821 28740
2016-09-23 21:47:02 tcptw 32805 471 1665 0 tcpcb 27672 28557 inpcb 28143 29296 socket 27760 28740
2016-09-23 21:47:03 tcptw 32805 426 1665 0 tcpcb 27752 28557 inpcb 28178 29296 socket 27838 28740
2016-09-23 21:47:04 tcptw 32805 402 1665 0 tcpcb 27760 28557 inpcb 28162 29296 socket 27847 28740
2016-09-23 21:47:05 tcptw 32805 406 1665 0 tcpcb 27757 28557 inpcb 28163 29296 socket 27845 28740
2016-09-23 21:47:06 tcptw 32805 443 1665 0 tcpcb 27783 28557 inpcb 28226 29296 socket 27871 28740
2016-09-23 21:47:07 tcptw 32805 484 1665 0 tcpcb 27707 28557 inpcb 28191 29296 socket 27794 28740
2016-09-23 21:47:08 tcptw 32805 473 1665 0 tcpcb 27721 28557 inpcb 28194 29296 socket 27807 28740
2016-09-23 21:47:09 tcptw 32805 432 1665 0 tcpcb 27749 28557 inpcb 28181 29296 socket 27837 28740
2016-09-23 21:47:10 tcptw 32805 421 1665 0 tcpcb 27780 28557 inpcb 28201 29296 socket 27868 28740
2016-09-23 21:47:11 tcptw 32805 530 1665 0 tcpcb 27814 28557 inpcb 28344 29296 socket 27902 28740
2016-09-23 21:47:12 tcptw 32805 680 1665 0 tcpcb 27874 28557 inpcb 28554 29296 socket 27964 28740
2016-09-23 21:47:13 tcptw 32805 832 1665 0 tcpcb 27881 28557 inpcb 28713 29296 socket 27971 28740
2016-09-23 21:47:14 tcptw 32805 997 1665 0 tcpcb 27880 28557 inpcb 28877 29352 socket 27972 28740
2016-09-23 21:47:15 tcptw 32805 1155 1890 0 tcpcb 27931 28557 inpcb 29086 29560 socket 28023 28740
2016-09-23 21:47:16 tcptw 32805 1322 2250 0 tcpcb 27981 28557 inpcb 29303 29800 socket 28075 28740
2016-09-23 21:47:17 tcptw 32805 1496 2385 0 tcpcb 28065 28557 inpcb 29561 30040 socket 28159 28740
2016-09-23 21:47:18 tcptw 32805 1648 2385 0 tcpcb 28151 28557 inpcb 29799 30280 socket 28245 28740
2016-09-23 21:47:19 tcptw 32805 1790 2655 0 tcpcb 28398 28599 inpcb 30188 30672 socket 28492 28796
2016-09-23 21:47:20 tcptw 32805 1954 2655 0 tcpcb 28712 28923 inpcb 30666 31120 socket 28807 29116
2016-09-23 21:47:21 tcptw 32805 2115 3015 0 tcpcb 29061 29244 inpcb 31176 31576 socket 29156 29468
2016-09-23 21:47:22 tcptw 32805 2265 3150 0 tcpcb 29335 29538 inpcb 31600 32056 socket 29430 29704
2016-09-23 21:47:23 tcptw 32805 2424 3150 0 tcpcb 29553 29775 inpcb 31977 32440 socket 29648 29956
2016-09-23 21:47:24 tcptw 32805 2590 3375 0 tcpcb 29711 29901 inpcb 32301 32744 socket 29807 30112
2016-09-23 21:47:25 tcptw 32805 2760 3780 0 tcpcb 29794 30015 inpcb 32554 33040 socket 29891 30224
2016-09-23 21:47:26 tcptw 32805 2935 3915 0 tcpcb 29879 30111 inpcb 32814 33312 socket 29976 30292
2016-09-23 21:47:27 tcptw 32805 3109 3915 0 tcpcb 29953 30195 inpcb 33062 33584 socket 30050 30392
2016-09-23 21:47:28 tcptw 32805 3264 4140 0 tcpcb 30060 30267 inpcb 33324 33824 socket 30158 30476
2016-09-23 21:47:29 tcptw 32805 3435 4275 0 tcpcb 30137 30363 inpcb 33572 34032 socket 30235 30572
2016-09-23 21:47:30 tcptw 32805 3600 4500 0 tcpcb 30221 30489 inpcb 33821 34304 socket 30320 30644
2016-09-23 21:47:31 tcptw 32805 3775 4635 0 tcpcb 30309 30588 inpcb 34084 34576 socket 30408 30740
2016-09-23 21:47:32 tcptw 32805 3936 4770 0 tcpcb 30534 30741 inpcb 34470 34960 socket 30634 30908
2016-09-23 21:47:33 tcptw 32805 4097 4905 0 tcpcb 30744 30951 inpcb 34841 35352 socket 30844 31160
2016-09-23 21:47:34 tcptw 32805 4233 5040 0 tcpcb 31006 31176 inpcb 35239 35680 socket 31106 31372
2016-09-23 21:47:35 tcptw 32805 4366 5265 0 tcpcb 31160 31386 inpcb 35526 35920 socket 31260 31568
2016-09-23 21:47:36 tcptw 32805 4738 5535 0 tcpcb 29529 31428 inpcb 34267 36016 socket 29629 31596
2016-09-23 21:47:37 tcptw 32805 4879 5625 0 tcpcb 29506 31428 inpcb 34385 36016 socket 29607 31596
2016-09-23 21:47:38 tcptw 32805 5011 5895 0 tcpcb 29590 31428 inpcb 34601 36016 socket 29691 31596
2016-09-23 21:47:39 tcptw 32805 5130 5895 0 tcpcb 29713 31428 inpcb 34843 36016 socket 29815 31596
2016-09-23 21:47:40 tcptw 32805 5259 6165 0 tcpcb 29783 31428 inpcb 35042 36016 socket 29886 31596
2016-09-23 21:47:41 tcptw 32805 5378 6255 0 tcpcb 29606 31428 inpcb 34984 36016 socket 29709 31596
2016-09-23 21:47:42 tcptw 32805 5489 6255 0 tcpcb 29638 31428 inpcb 35127 36016 socket 29741 31596
2016-09-23 21:47:43 tcptw 32805 5629 6390 0 tcpcb 29630 31428 inpcb 35259 36016 socket 29735 31596
2016-09-23 21:47:44 tcptw 32805 5754 6660 0 tcpcb 29593 31428 inpcb 35347 36016 socket 29696 31596
2016-09-23 21:47:45 tcptw 32805 5887 6660 0 tcpcb 29606 31428 inpcb 35493 36016 socket 29709 31596
2016-09-23 21:47:46 tcptw 32805 6011 6750 0 tcpcb 29613 31428 inpcb 35624 36016 socket 29716 31596
2016-09-23 21:47:47 tcptw 32805 6128 7020 0 tcpcb 29642 31428 inpcb 35770 36128 socket 29745 31596
2016-09-23 21:47:48 tcptw 32805 6250 7020 0 tcpcb 29742 31428 inpcb 35992 36416 socket 29845 31596
2016-09-23 21:47:49 tcptw 32805 6378 7155 0 tcpcb 29745 31428 inpcb 36123 36472 socket 29850 31596
2016-09-23 21:47:50 tcptw 32805 6486 7290 0 tcpcb 29756 31428 inpcb 36242 36648 socket 29861 31596
2016-09-23 21:47:51 tcptw 32805 6603 7515 0 tcpcb 29807 31428 inpcb 36410 36792 socket 29912 31596
2016-09-23 21:47:52 tcptw 32805 6736 7515 0 tcpcb 29830 31428 inpcb 36566 36912 socket 29935 31596
2016-09-23 21:47:53 tcptw 32805 6852 7785 0 tcpcb 29892 31428 inpcb 36744 37112 socket 29996 31596
2016-09-23 21:47:54 tcptw 32805 6991 7785 0 tcpcb 29876 31428 inpcb 36867 37288 socket 29981 31596
2016-09-23 21:47:55 tcptw 32805 7102 8010 0 tcpcb 29928 31428 inpcb 37030 37400 socket 30033 31596
2016-09-23 21:47:56 tcptw 32805 7227 8010 0 tcpcb 29960 31428 inpcb 37187 37544 socket 30065 31596
2016-09-23 21:47:57 tcptw 32805 7356 8280 0 tcpcb 30031 31428 inpcb 37387 37752 socket 30136 31596
2016-09-23 21:47:58 tcptw 32805 7505 8415 0 tcpcb 30164 31428 inpcb 37669 38040 socket 30270 31596
2016-09-23 21:47:59 tcptw 32805 7618 8415 0 tcpcb 30302 31428 inpcb 37920 38328 socket 30408 31596
2016-09-23 21:48:00 tcptw 32805 7740 8505 0 tcpcb 30478 31428 inpcb 38218 38560 socket 30584 31596
2016-09-23 21:48:01 tcptw 32805 7858 8640 0 tcpcb 30663 31428 inpcb 38521 38880 socket 30769 31596

From owner-freebsd-stable@freebsd.org Fri Sep 23 20:01:45 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org
(Postfix) with ESMTP id 95CA0BE84BD for ; Fri, 23 Sep 2016 20:01:45 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 5872CEA8 for ; Fri, 23 Sep 2016 20:01:45 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from slw by zxy.spb.ru with local (Exim 4.86 (FreeBSD)) (envelope-from ) id 1bnWfH-000JZx-DH; Fri, 23 Sep 2016 23:01:43 +0300 Date: Fri, 23 Sep 2016 23:01:43 +0300 From: Slawa Olhovchenkov To: Julien Charbon Cc: Konstantin Belousov , freebsd-stable@FreeBSD.org, hiren panchasara Subject: Re: 11.0 stuck on high network load Message-ID: <20160923200143.GG2840@zxy.spb.ru> References: <20160916181839.GC2960@zxy.spb.ru> <20160916183053.GL9397@strugglingcoder.info> <20160916190330.GG2840@zxy.spb.ru> <78cbcdc9-f565-1046-c157-2ddd8fcccc62@freebsd.org> <20160919204328.GN2840@zxy.spb.ru> <8ba75d6e-4f01-895e-0aed-53c6c6692cb9@freebsd.org> <20160920202633.GQ2840@zxy.spb.ru> <20160921195155.GW2840@zxy.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.24 (2015-08-30) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 23 Sep 2016 20:01:45 -0000 On Wed, Sep 21, 2016 at 11:25:18PM +0200, Julien Charbon wrote: > > Hi Slawa, > > On 9/21/16 9:51 PM, Slawa Olhovchenkov wrote: > > On Wed, Sep 21, 2016 at 09:11:24AM +0200, Julien Charbon wrote: > >> You can also use Dtrace and lockstat (especially with the lockstat -s > >> option): > >> > >> https://wiki.freebsd.org/DTrace/One-Liners#Kernel_Locks 
> >> https://www.freebsd.org/cgi/man.cgi?query=lockstat&manpath=FreeBSD+11.0-RELEASE
> >>
> >> But I am less familiar with Dtrace/lockstat tools.
> >
> > I am still using the old kernel and got a lockup again.
> > I tried lockstat (I saved more output); the interesting part may be the following:
> >
> > R/W writer spin on writer: 190019 events in 1.070 seconds (177571 events/sec)
> >
> > -------------------------------------------------------------------------------
> > Count indv cuml rcnt     nsec Lock                   Caller
> > 140839  74%  74% 0.00    24659 tcpinp                 tcp_tw_2msl_scan+0xc6
> >
> >       nsec ------ Time Distribution ------ count     Stack
> >       4096 |                               913       tcp_twstart+0xa3
> >       8192 |@@@@@@@@@@@@                   58191     tcp_do_segment+0x201f
> >      16384 |@@@@@@                         29594     tcp_input+0xe1c
> >      32768 |@@@@                           23447     ip_input+0x15f
> >      65536 |@@@                            16197
> >     131072 |@                              8674
> >     262144 |                               3358
> >     524288 |                               456
> >    1048576 |                               9
> > -------------------------------------------------------------------------------
> > Count indv cuml rcnt     nsec Lock                   Caller
> >  49180  26% 100% 0.00    15929 tcpinp                 tcp_tw_2msl_scan+0xc6
> >
> >       nsec ------ Time Distribution ------ count     Stack
> >       4096 |                               157       pfslowtimo+0x54
> >       8192 |@@@@@@@@@@@@@@@                24796     softclock_call_cc+0x179
> >      16384 |@@@@@@                         11223     softclock+0x44
> >      32768 |@@@@                           7426      intr_event_execute_handlers+0x95
> >      65536 |@@                             3918
> >     131072 |                               1363
> >     262144 |                               278
> >     524288 |                               19
> > -------------------------------------------------------------------------------
>
> This is interesting, it seems that you have two call paths competing
> for INP locks here:
>
> - pfslowtimo()/tcp_tw_2msl_scan(reuse=0) and
>
> - tcp_input()/tcp_twstart()/tcp_tw_2msl_scan(reuse=1)

My current hypothesis: nginx does a write() (or maybe close()?) on a
socket, the kernel locks the first inp in V_twq_2msl, then the callout for
pfslowtimo() fires on the same CPU core and tcp_tw_2msl_scan spins forever
on the same inp. In this case your modification can't help; before the
next try we need something like yield().
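
[Annotation on the thread's data, not written by any poster.] The
`vmstat -z` zone lines quoted earlier in this thread (name, then SIZE,
LIMIT, USED, FREE, REQ, FAIL, SLEEP) can be checked mechanically to see
how close tcptw is to its limit. The following is only a minimal sketch:
the function names `parse_vmstat_z` and `usage_ratio` are hypothetical,
and it assumes the seven-column layout shown in the quoted output, so
lines with any other shape are skipped.

```python
FIELDS = ("size", "limit", "used", "free", "req", "fail", "sleep")


def parse_vmstat_z(text):
    """Parse `vmstat -z` style lines into {zone_name: {field: int}}.

    Assumes the seven-column layout quoted in the thread; the header
    line and anything that does not match are ignored.
    """
    zones = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # header or blank line: no "name:" prefix
        parts = [p.strip() for p in rest.split(",")]
        if len(parts) != len(FIELDS) or not all(p.isdigit() for p in parts):
            continue  # unexpected shape, skip rather than guess
        zones[name.strip()] = dict(zip(FIELDS, (int(p) for p in parts)))
    return zones


def usage_ratio(zone):
    """Fraction of the zone limit in use (0.0 when the limit is 0)."""
    return zone["used"] / zone["limit"] if zone["limit"] else 0.0


# Sample copied from the output quoted in the thread.
SAMPLE = """\
ITEM SIZE LIMIT USED FREE REQ FAIL SLEEP
socket: 864, 4192664, 18604, 25348,49276158, 0, 0
tcp_inpcb: 464, 4192664, 34226, 18702,49250593, 0, 0
tcpcb: 1040, 4192665, 18424, 18953,49250593, 0, 0
tcptw: 88, 16425, 15802, 623,14526919, 8, 0
tcpreass: 40, 32800, 15, 2285, 632381, 0, 0
"""

zones = parse_vmstat_z(SAMPLE)
tw = zones["tcptw"]
print(f"tcptw: {tw['used']}/{tw['limit']} used "
      f"({usage_ratio(tw):.0%}), {tw['fail']} failed allocations")
```

On the quoted sample this reports tcptw at roughly 96% of its limit with
8 failed allocations, which is the condition the contention hypothesis
in the thread depends on.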
From owner-freebsd-stable@freebsd.org Sat Sep 24 02:01:12 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 252A5BE7977 for ; Sat, 24 Sep 2016 02:01:12 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: from asp.reflexion.net (outbound-mail-210-59.reflexion.net [208.70.210.59]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id CFA35FE for ; Sat, 24 Sep 2016 02:01:10 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: (qmail 14736 invoked from network); 24 Sep 2016 02:02:00 -0000 Received: from unknown (HELO mail-cs-01.app.dca.reflexion.local) (10.81.19.1) by 0 (rfx-qmail) with SMTP; 24 Sep 2016 02:02:00 -0000 Received: by mail-cs-01.app.dca.reflexion.local (Reflexion email security v8.00.0) with SMTP; Fri, 23 Sep 2016 22:01:15 -0400 (EDT) Received: (qmail 4279 invoked from network); 24 Sep 2016 02:01:14 -0000 Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with (AES256-SHA encrypted) SMTP; 24 Sep 2016 02:01:14 -0000 Received: from [192.168.0.105] (ip70-189-131-151.lv.lv.cox.net [70.189.131.151]) by iron2.pdx.net (Postfix) with ESMTPSA id 9AB2BEC8FC3 for ; Fri, 23 Sep 2016 19:01:08 -0700 (PDT) From: Mark Millard Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Subject: 11.0-RELEASE tier level for arm64/aaarch64 and the officially built arm/armv6 variants? 
Message-Id: <4076CFFA-7BE2-4E1B-A7E8-08FD8FC27D21@dsl-only.net> Date: Fri, 23 Sep 2016 19:01:07 -0700 To: freebsd-stable@freebsd.org Mime-Version: 1.0 (Mac OS X Mail 9.3 \(3124\)) X-Mailer: Apple Mail (2.3124) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 24 Sep 2016 02:01:12 -0000 =46rom https://www.freebsd.org/platforms/arm.html : > 32-bit ARM is officially a Tier 2 architecture, as the FreeBSD project = does not provide official releases or pre-built packages for this = platform due to it primarily targeting the embedded arena. However, = FreeBSD/ARM is being actively developed and maintained, is well = supported, and provides an excellent framework for building ARM-based = systems. FreeBSD/arm supports ARMv4 and ARMv5 processors. FreeBSD/armv6 = supports ARMv6 and ARMv7 processors, including SMP on the latter. "does not provide official releases or pre-built packages"? > # uname -apKU > FreeBSD rpi2 11.0-PRERELEASE FreeBSD 11.0-PRERELEASE #5 r304943M: Sun = Aug 28 03:17:54 PDT 2016 = markmi@FreeBSDx64:/usr/obj/clang/arm.armv6/usr/src/sys/RPI2-NODBG arm = armv6 1100502 1100502 > # pkg search '.*' | wc > 21349 155540 1596736 Will 11.0-RELEASE change the tier level for any of the specific = arm-armv6 variants that have FreeBSD-11.0-*-arm-armv6-*.img* files = built, such as for RPI2? Even if all the officially built arm-armv6 variants stay tier 2, the = wording on the web page likely needs to be changed because so much is = built and available that the above quote claims is not available. Also from https://www.freebsd.org/platforms/arm.html : > Initial support for 64-bit ARM is complete. 64-bit ARM platforms = follow a set of standard conventions, and a single FreeBSD build will = work on hardware from multiple vendors. 
> As a result, FreeBSD will provide official releases for FreeBSD/arm64 and packages will be available. FreeBSD/arm64 is on the path to becoming a Tier 1 architecture.

Will 11.0-RELEASE make arm64/aarch64 Tier 1?

[I will note that, while there are no official builds for the Pine64 family (A64 based) that are under the Allwinner arm activity, the SoCs involved are Cortex-A53 64-bit arm based. They likely do not fit in the "standard conventions", or arm64/aarch64 would be where they would have been supported. Some rewording might be appropriate for the above quote as well.]

===
Mark Millard
markmi at dsl-only.net

From owner-freebsd-stable@freebsd.org Sat Sep 24 02:56:20 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 08CF3BE693D for ; Sat, 24 Sep 2016 02:56:20 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: from asp.reflexion.net (outbound-mail-210-51.reflexion.net [208.70.210.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id B27C074C for ; Sat, 24 Sep 2016 02:56:19 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: (qmail 10889 invoked from network); 24 Sep 2016 02:30:23 -0000 Received: from unknown (HELO rtc-sm-01.app.dca.reflexion.local) (10.81.150.1) by 0 (rfx-qmail) with SMTP; 24 Sep 2016 02:30:23 -0000 Received: by rtc-sm-01.app.dca.reflexion.local (Reflexion email security v8.00.0) with SMTP; Fri, 23 Sep 2016 22:29:37 -0400 (EDT) Received: (qmail 23047 invoked from network); 24 Sep 2016 02:29:37 -0000 Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with (AES256-SHA encrypted) SMTP; 24 Sep 2016 02:29:37 -0000 Received: from [192.168.0.105] (ip70-189-131-151.lv.lv.cox.net [70.189.131.151]) by iron2.pdx.net (Postfix) with ESMTPSA id
9A69AEC8A8B; Fri, 23 Sep 2016 19:29:31 -0700 (PDT) From: Mark Millard Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Subject: Fwd: 11.0-RELEASE tier level for arm64/aaarch64 and the officially built arm/armv6 variants? Date: Fri, 23 Sep 2016 19:29:30 -0700 References: <4076CFFA-7BE2-4E1B-A7E8-08FD8FC27D21@dsl-only.net> To: freebsd-stable@freebsd.org, freebsd-arm Message-Id: <332FA120-31E5-4D31-B63E-A0DFDD7DEFC7@dsl-only.net> Mime-Version: 1.0 (Mac OS X Mail 9.3 \(3124\)) X-Mailer: Apple Mail (2.3124) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 24 Sep 2016 02:56:20 -0000

[A resend since I forgot to list freebsd-arm in the To: the first time.]

From https://www.freebsd.org/platforms/arm.html :

> 32-bit ARM is officially a Tier 2 architecture, as the FreeBSD project does not provide official releases or pre-built packages for this platform due to it primarily targeting the embedded arena. However, FreeBSD/ARM is being actively developed and maintained, is well supported, and provides an excellent framework for building ARM-based systems. FreeBSD/arm supports ARMv4 and ARMv5 processors. FreeBSD/armv6 supports ARMv6 and ARMv7 processors, including SMP on the latter.

"does not provide official releases or pre-built packages"?

> # uname -apKU
> FreeBSD rpi2 11.0-PRERELEASE FreeBSD 11.0-PRERELEASE #5 r304943M: Sun Aug 28 03:17:54 PDT 2016 markmi@FreeBSDx64:/usr/obj/clang/arm.armv6/usr/src/sys/RPI2-NODBG arm armv6 1100502 1100502

> # pkg search '.*' | wc
> 21349 155540 1596736

Will 11.0-RELEASE change the tier level for any of the specific arm-armv6 variants that have FreeBSD-11.0-*-arm-armv6-*.img* files built, such as for RPI2?

Even if all the officially built arm-armv6 variants stay Tier 2, the wording on the web page likely needs to be changed, because so much of what the above quote claims is not available is in fact built and available.

Also from https://www.freebsd.org/platforms/arm.html :

> Initial support for 64-bit ARM is complete. 64-bit ARM platforms follow a set of standard conventions, and a single FreeBSD build will work on hardware from multiple vendors. As a result, FreeBSD will provide official releases for FreeBSD/arm64 and packages will be available. FreeBSD/arm64 is on the path to becoming a Tier 1 architecture.

Will 11.0-RELEASE make arm64/aarch64 Tier 1?

[I will note that, while there are no official builds for the Pine64 family (A64 based) that are under the Allwinner arm activity, the SoCs involved are Cortex-A53 64-bit arm based. They likely do not fit in the "standard conventions", or arm64/aarch64 would be where they would have been supported. Some rewording might be appropriate for the above quote as well.]

===
Mark Millard
markmi at dsl-only.net

From owner-freebsd-stable@freebsd.org Sat Sep 24 07:45:24 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2F5A8BE6A86 for ; Sat, 24 Sep 2016 07:45:24 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: from asp.reflexion.net (outbound-mail-210-56.reflexion.net [208.70.210.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id DA507C00 for ; Sat, 24 Sep 2016 07:45:23 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: (qmail 28415 invoked from network); 24 Sep 2016 07:46:03 -0000 Received: from unknown (HELO mail-cs-02.app.dca.reflexion.local) (10.81.19.2) by 0 (rfx-qmail) with SMTP; 24 Sep 2016 07:46:03 -0000 Received: by mail-cs-02.app.dca.reflexion.local (Reflexion email security v8.00.0) with SMTP; Sat, 24 Sep 2016 03:45:07 -0400 (EDT) Received: (qmail 5511 invoked from network); 24 Sep 2016 07:45:07 -0000 Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with (AES256-SHA encrypted) SMTP; 24 Sep 2016 07:45:07 -0000 Received: from [192.168.0.105] (ip70-189-131-151.lv.lv.cox.net [70.189.131.151]) by iron2.pdx.net (Postfix) with ESMTPSA id 8988BEC8A8B; Sat, 24 Sep 2016 00:45:15 -0700 (PDT) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 9.3 \(3124\)) Subject: Re: 11.0-RELEASE tier level for arm64/aarch64 and the officially built arm/armv6 variants?
From: Mark Millard In-Reply-To: <332FA120-31E5-4D31-B63E-A0DFDD7DEFC7@dsl-only.net> Date: Sat, 24 Sep 2016 00:45:14 -0700 Content-Transfer-Encoding: quoted-printable Message-Id: <13AA43A5-AC5C-4BB0-81BE-3B87C00B0B58@dsl-only.net> References: <4076CFFA-7BE2-4E1B-A7E8-08FD8FC27D21@dsl-only.net> <332FA120-31E5-4D31-B63E-A0DFDD7DEFC7@dsl-only.net> To: freebsd-stable@freebsd.org, freebsd-arm X-Mailer: Apple Mail (2.3124) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 24 Sep 2016 07:45:24 -0000

[Another resend since (A) I forgot to list freebsd-arm in the To: the first time and (B) the second send bounced due to temporary conditions for the outgoing mail server.]

From https://www.freebsd.org/platforms/arm.html :

> 32-bit ARM is officially a Tier 2 architecture, as the FreeBSD project does not provide official releases or pre-built packages for this platform due to it primarily targeting the embedded arena. However, FreeBSD/ARM is being actively developed and maintained, is well supported, and provides an excellent framework for building ARM-based systems. FreeBSD/arm supports ARMv4 and ARMv5 processors. FreeBSD/armv6 supports ARMv6 and ARMv7 processors, including SMP on the latter.

"does not provide official releases or pre-built packages"?

> # uname -apKU
> FreeBSD rpi2 11.0-PRERELEASE FreeBSD 11.0-PRERELEASE #5 r304943M: Sun Aug 28 03:17:54 PDT 2016 markmi@FreeBSDx64:/usr/obj/clang/arm.armv6/usr/src/sys/RPI2-NODBG arm armv6 1100502 1100502

> # pkg search '.*' | wc
> 21349 155540 1596736

Will 11.0-RELEASE change the tier level for any of the specific arm-armv6 variants that have FreeBSD-11.0-*-arm-armv6-*.img* files built, such as for RPI2?

Even if all the officially built arm-armv6 variants stay Tier 2, the wording on the web page likely needs to be changed, because so much of what the above quote claims is not available is in fact built and available.

Also from https://www.freebsd.org/platforms/arm.html :

> Initial support for 64-bit ARM is complete. 64-bit ARM platforms follow a set of standard conventions, and a single FreeBSD build will work on hardware from multiple vendors. As a result, FreeBSD will provide official releases for FreeBSD/arm64 and packages will be available. FreeBSD/arm64 is on the path to becoming a Tier 1 architecture.

Will 11.0-RELEASE make arm64/aarch64 Tier 1?

[I will note that, while there are no official builds for the Pine64 family (A64 based) that are under the Allwinner arm activity, the SoCs involved are Cortex-A53 64-bit arm based. They likely do not fit in the "standard conventions", or arm64/aarch64 would be where they would have been supported. Some rewording might be appropriate for the above quote as well.]

===
Mark Millard
markmi at dsl-only.net

From owner-freebsd-stable@freebsd.org Sat Sep 24 21:11:16 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 57EC8BE9290 for ; Sat, 24 Sep 2016 21:11:16 +0000 (UTC) (envelope-from wlosh@bsdimp.com) Received: from mail-it0-x22c.google.com (mail-it0-x22c.google.com [IPv6:2607:f8b0:4001:c0b::22c]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 22721284 for ; Sat, 24 Sep 2016 21:11:16 +0000 (UTC) (envelope-from wlosh@bsdimp.com) Received: by mail-it0-x22c.google.com with SMTP id o3so43817463ita.1 for ; Sat, 24 Sep 2016 14:11:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bsdimp-com.20150623.gappssmtp.com; s=20150623; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:cc:content-transfer-encoding; bh=89DKoYYhdL6Gc9+8lFSTJW9jxj1XZNHtE4rtptCdEO0=; b=VHta04g8zm8BbLgDy9eyfsxZd+pW8xWBkAyl9DVltLHqNx2+Yd1pKcvkZfyNJpjdnM 9qHNtSgSTCAwgWVUyr7SfRCLuIiLIc+f/aC1ckSNytc6euvb7+nBHWte/OQ4ohmJX7L5 2PPHf+9uIT6k88qEMKyebtibZz7hJ2filvjdDOCImsKFuWIS7mTqfDosVtqr2jJTh0nj IdGsMX1tbSlJu3PHAGTuGYlQRQcwVkOH8eX6HTe3hL/2534QPtPHjJnIek49jA3AOygF Xf4x+Ne1zsn2fP3i66re6UfR6lINZ/qsXJ+mvcaPkOI1cMccqCmIlAPbAbU/4zCLGHtL Ip2Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:sender:in-reply-to:references:from :date:message-id:subject:to:cc:content-transfer-encoding; bh=89DKoYYhdL6Gc9+8lFSTJW9jxj1XZNHtE4rtptCdEO0=; b=R2GJ3vw7SYVL1ydWZKv5OXEVmCn6oLN0ZyufwCSOe53fsJjo3EIYfO10uD2bSo4EiN DW6JIQZFLd0uLN8CJA+qTHwWBFF9UB8K/BdRa2cdymrwXJbynaHr0wAfPRnSOdX70vZi y6bY03rn4PYDNj+P2KXmWJ+nKhzBXtJh2ZeDHN8hjqlyz9Lk3G7qXGmJ2cD6xR8dbhkv
5jTQD4Za4sowj0/kOW5dkxD9ZWQ6fyNMpRELcaz7iMombB2veUFDR1sETBuTjzizlzMp FyAkVUzt/XmUv3znbZoTpD0dzp2Ay6QhNnoaro/vike2tV2j5bsB9Zl/r294jzMfXcTx ymSg== X-Gm-Message-State: AA6/9RkC2LeohDyeVvlc0NeLHwHjXF4gXMG/cpXKkMbImkdyU/RNN5sS4Z2wxORQB7OlP1ilBCigdHH1xIU0Lg== X-Received: by 10.36.152.5 with SMTP id n5mr10058085itd.79.1474751475446; Sat, 24 Sep 2016 14:11:15 -0700 (PDT) MIME-Version: 1.0 Sender: wlosh@bsdimp.com Received: by 10.36.65.7 with HTTP; Sat, 24 Sep 2016 14:11:15 -0700 (PDT) X-Originating-IP: [69.53.245.200] In-Reply-To: <332FA120-31E5-4D31-B63E-A0DFDD7DEFC7@dsl-only.net> References: <4076CFFA-7BE2-4E1B-A7E8-08FD8FC27D21@dsl-only.net> <332FA120-31E5-4D31-B63E-A0DFDD7DEFC7@dsl-only.net> From: Warner Losh Date: Sat, 24 Sep 2016 15:11:15 -0600 X-Google-Sender-Auth: 0vZbZaSU6pCbd8gn6KSvP3ERS04 Message-ID: Subject: Re: 11.0-RELEASE tier level for arm64/aaarch64 and the officially built arm/armv6 variants? To: Mark Millard Cc: FreeBSD-STABLE Mailing List , freebsd-arm Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 24 Sep 2016 21:11:16 -0000

On Fri, Sep 23, 2016 at 8:29 PM, Mark Millard wrote:
> [A resend since I forgot to list freebsd-arm in the To: the first time.]
>
> From https://www.freebsd.org/platforms/arm.html :
>
>> 32-bit ARM is officially a Tier 2 architecture, as the FreeBSD project does not provide official releases or pre-built packages for this platform due to it primarily targeting the embedded arena. However, FreeBSD/ARM is being actively developed and maintained, is well supported, and provides an excellent framework for building ARM-based systems. FreeBSD/arm supports ARMv4 and ARMv5 processors. FreeBSD/armv6 supports ARMv6 and ARMv7 processors, including SMP on the latter.
>
> "does not provide official releases or pre-built packages"?
>
>> # uname -apKU
>> FreeBSD rpi2 11.0-PRERELEASE FreeBSD 11.0-PRERELEASE #5 r304943M: Sun Aug 28 03:17:54 PDT 2016 markmi@FreeBSDx64:/usr/obj/clang/arm.armv6/usr/src/sys/RPI2-NODBG arm armv6 1100502 1100502
>
>> # pkg search '.*' | wc
>> 21349 155540 1596736
>
> Will 11.0-RELEASE change the tier level for any of the specific arm-armv6 variants that have FreeBSD-11.0-*-arm-armv6-*.img* files built, such as for RPI2?
>
> Even if all the officially built arm-armv6 variants stay Tier 2, the wording on the web page likely needs to be changed, because so much of what the above quote claims is not available is in fact built and available.

armv6 is basically Tier 1 right now, though not as Tier 1 as i386 or amd64 due to the fragmented nature of the arm world. On the platforms we run on and create releases for, however, it's my opinion that it is Tier 1: it has been running in production for a while, things people expect from a FreeBSD system are present, you can get decent support if you ask questions, and there are no known major gotchas in deploying this hardware.

The only remaining annoying issue is the 'u-boot' problem, where we have to have a different u-boot image for every board and have no standardized way to convert a 'generic' image into one that's specific to a particular board. For x86 this is all done with the installer since that boot environment is more standardized. Does this last issue keep arm from being Tier 1? That's a judgement call, but I think the project should promote w/o this last issue.

> Also from https://www.freebsd.org/platforms/arm.html :
>
>> Initial support for 64-bit ARM is complete. 64-bit ARM platforms follow a set of standard conventions, and a single FreeBSD build will work on hardware from multiple vendors. As a result, FreeBSD will provide official releases for FreeBSD/arm64 and packages will be available. FreeBSD/arm64 is on the path to becoming a Tier 1 architecture.
>
> Will 11.0-RELEASE make arm64/aarch64 Tier 1?
>
> [I will note that, while there are no official builds for the Pine64 family (A64 based) that are under the Allwinner arm activity, the SoCs involved are Cortex-A53 64-bit arm based. They likely do not fit in the "standard conventions", or arm64/aarch64 would be where they would have been supported. Some rewording might be appropriate for the above quote as well.]

No. aarch64 isn't Tier 1 yet. There are many small bits that are missing. It is quite solidly Tier 2, but we don't have a linker, we don't have widespread hardware availability, and we don't have production experience with the platform. Most things work, but there are still some gotchas. There's still the 'u-boot' problem with many arm64 systems, because for systems that use u-boot to bootstrap UEFI, you need a different image for each board (some closely related board families can get by with one, to be pedantic). All these issues are still significant barriers to production use. It's not been officially promoted yet and I don't think the time is quite right yet.

Warner