References: <20161015161848.GD2532@acme.spoerlein.net> <6926bd72-35c9-cb21-4785-b50a05e581be@selasky.org>
From: Ulrich Spörlein
Date: Sat, 15 Oct 2016 21:17:17 +0200
Subject: Re: FreeBSD 11.x grinds to a halt after about 48h of uptime
To: Kevin Oberman
Cc: Hans Petter Selasky, FreeBSD Current

2016-10-15 18:36 GMT+02:00 Kevin Oberman :
>
> On Sat, Oct 15, 2016 at 9:26 AM, Hans Petter Selasky wrote:
>
> > On 10/15/16 18:18, Ulrich Spörlein wrote:
> >
> >> Hey all, while 11.x is -STABLE now, this happens to my machine ever
> >> since I upgraded it to 11-CURRENT years ago. I have no idea when this
> >> started, actually, but what always happens is this:
> >>
> >> - System and X11 are up and running, I keep it running overnight as I'm
> >> too lazy to reboot and restart everything.
> >> - There's a bunch of xterms, Chrome, Clementine-Player and some other
> >> programs running
> >> - Coming back to the machine the next day (or the day after) it will
> >> exit the screensaver just fine and then either I can use it for a couple
> >> of seconds before it freezes, or it's pretty much dead already.
> >> The mouse cursor still moves for a bit, but then also freezes (so is this a
> >> GPU problem??)
> >>
> >> Now what I currently see on the screen is a clock widget stuck at 18:04
> >> but conky itself has last updated at 18:00:18 ...
> >>
> >> This time I had some SSH sessions from another machine to see some more
> >> useful things. There was nothing in various logs under /var/log (I also
> >> can't run dmesg anymore ...)
> >> I had top(1) running in a loop, this is the last output:
> >>
> >> last pid: 25633;  load averages:  0.27,  0.39,  0.36    up 1+23:03:28  18:00:12
> >> 202 processes: 2 running, 188 sleeping, 11 zombie, 1 waiting
> >>
> >> Mem: 8873M Active, 1783M Inact, 5072M Wired, 567M Buf, 132M Free
> >> ARC: 1844M Total, 469M MFU, 268M MRU, 16K Anon, 96M Header, 1012M Other
> >> Swap: 4096M Total, 2395M Used, 1701M Free, 58% Inuse
> >>
> >>   PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
> >>    11 root        8 155 ki31     0K   128K CPU0    0 364.6H 772.95% idle
> >>  3122 uqs        15  28    0  7113M  5861M uwait   0  94:44  13.96% chrome
> >>  2887 uqs        28  22    0  1394M   237M select  2 172:53   6.98% chrome
> >>  2890 uqs        11  21    0  1034M   178M select  5 231:21   1.95% chrome
> >>  1062 root        9  21    0   440M 47220K select  0  67:09   0.98% Xorg
> >>  3002 uqs        15  25    5  1159M   172M uwait   2  19:09   0.00% chrome
> >>  3139 uqs        17  25    5  1163M   156M uwait   2  16:15   0.00% chrome
> >>  3001 uqs        18  25    5  1639M   575M uwait   0  16:05   0.00% chrome
> >>    12 root       24 -64    -     0K   384K WAIT   -1  10:53   0.00% intr
> >>  3129 uqs        12  20    0  2820M  1746M uwait   6   8:36   0.00% chrome
> >>  2822 uqs         9  20    0   217M 81300K select  0   5:10   0.00% conky
> >>  3174 root        1  20    0 21532K  3188K select  0   4:20   0.00% systat
> >>  3130 uqs        16  20    0  1058M   131M uwait   4   3:03   0.00% chrome
> >>  2998 uqs        16  20    0  1110M   123M uwait   2   2:53   0.00% chrome
> >>  3165 uqs        10  20    0  1209M   215M uwait   6   2:52   0.00% chrome
> >>  3142 uqs        11  25    5  1344M   195M uwait   2   2:46   0.00% chrome
> >>  2876 uqs        19  20    0   580M 37164K select  3   2:42   0.00% clementine-player
> >>    20 root        2 -16    -     0K    32K psleep  6   2:25   0.00% pagedaemon
> >>
> >> I also had systat -vm running and it continued to update its screen ...
> >> for a short while, this is the last update before SSH died:
> >>
> >> Mem usage: 0k%Phy 5%Kmem
> >> Mem: KB    REAL            VIRTUAL                     VN PAGER   SWAP PAGER
> >>         Tot   Share      Tot    Share    Free           in   out     in   out
> >> Act  11051k   67868 71051992   255448   61840  count
> >> All  11051k   67924 71058776   262100          pages
> >> Proc:                                                            Interrupts
> >>   r   p   d   s   w   Csw  Trp  Sys  Int  Sof  Flt        ioflt   224 total
> >>              25       730   11  724  109  404  101     13 cow       2 ehci0 16
> >>                                                            zfod      3 ehci1 23
> >>  0.0%Sys   0.1%Intr  0.0%User  0.0%Nice 99.9%Idle          ozfod    16 cpu0:timer
> >> |    |    |    |    |    |    |    |    |    |            %ozfod      xhci0 264
> >>                                                            daefr     3 em0 265
> >>                                         50 dtbuf           prcfr    94 hdac1 266
> >> Namei     Name-cache   Dir-cache    349167 desvn           totfr       ahci0 270
> >>    Calls    hits   %    hits   %    349155 numvn           react     5 cpu1:timer
> >>      121     121 100                253501 frevn           pdwak     1 cpu2:timer
> >>                                                            pdpgs    29 cpu7:timer
> >> Disks   md0  ada0  ada1 pass0 pass1 pass2                  intrn    12 cpu3:timer
> >> KB/t   0.00  0.00  0.00  0.00  0.00  0.00          5318892 wire     41 cpu6:timer
> >> tps       0     0     0     0     0     0          9261404 act      12 cpu5:timer
> >> MB/s   0.00  0.00  0.00  0.00  0.00  0.00          1598184 inact     6 cpu4:timer
> >> %busy     0     0     0     0     0     0                  cache       vgapci0
> >>                                                      61840 free
> >>                                                     712304 buf
> >>
> >> Why do I have a Chrome tab using about 6G? What other sort of debugging
> >> output can be helpful to get to the bottom of this? The machine still
> >> responds to pings just fine, TCP connections get set up but the SSH
> >> handshake never completes.
> >>
> >> This always happens between 30-50h and is super annoying and has been
> >> going on for >1 year. Help?
> >>
> >> Note, I cut the power to the monitor overnight to save electricity, can
> >> this mess up something in the Radeon card or X server?
> >> What combinations would be most useful to try next?
> >>
> >
> > Hi,
> >
> > Sounds like a memory leak. Can you track the memory use over time?

Memory leak or not, it shouldn't lock up the whole system just the
minute/second that I start using it again.

> >
> > Did you look at the output from:
> >
> > vmstat -m ?

No, but I'll capture it for the next cycle :)

> >
> > --HPS
> >
> I have noted significant memory leakage in chromium for some time. If I
> leave it running overnight, my system is essentially frozen. If I terminate
> the chromium process, it slowly comes back to life. I always keep a gkrellm
> session on-screen where the memory and swap utilization is continuously
> displayed and that clearly shows resources declining.
>
> Try closing your chromium at night and see if that fixes the problem.
>
> If you have never tried gkrellm (sysutils/gkrellm2), it is the best
> system monitor I have found, though it pulls in a lot of dependencies. It also
> can run as a server with remote systems displaying the data. Handy to
> monitor servers.

I'll try w/o Chrome, it's easy to stop and restart anyway. I'll be back in
a week or so :)

Uli
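
On tracking memory use over time, as suggested in the thread: a minimal
sketch, assuming the top(1) summary lines (Mem:, ARC:, Swap:) are saved to a
log at intervals by whatever means is convenient. It only turns such lines
into numbers in MiB so that two samples can be compared. The function names
parse_top_summary and growth are made up for this illustration, and the
"earlier" sample in the demo is invented, not a measurement.

```python
import re

# Multipliers that convert top(1) size suffixes to MiB.
_TO_MIB = {"K": 1 / 1024, "M": 1.0, "G": 1024.0}


def parse_top_summary(line):
    """Parse a line such as 'Mem: 8873M Active, 132M Free'.

    Returns (label, fields), where fields maps each name to a size in MiB.
    Entries without a K/M/G suffix (e.g. '58% Inuse') are skipped.
    """
    label, _, rest = line.partition(":")
    fields = {}
    for size, unit, name in re.findall(r"(\d+)([KMG])\s+(\w+)", rest):
        fields[name] = int(size) * _TO_MIB[unit]
    return label.strip(), fields


def growth(before, after):
    """Return the per-field change in MiB between two parsed samples."""
    return {k: after[k] - before[k] for k in before if k in after}


if __name__ == "__main__":
    # Last sample, taken from the top(1) output quoted in the thread.
    _, last = parse_top_summary(
        "Swap: 4096M Total, 2395M Used, 1701M Free, 58% Inuse"
    )
    # Hypothetical earlier sample, invented purely to show the comparison.
    _, first = parse_top_summary("Swap: 4096M Total, 100M Used, 3996M Free")
    for name, delta in growth(first, last).items():
        print(f"Swap {name}: {delta:+.0f} MiB")
```

Fed with real samples taken, say, every few minutes, a steadily growing
"Used" or shrinking "Free" figure would point at a leak well before the
machine stops responding.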