From owner-freebsd-current@freebsd.org  Thu Oct 27 12:51:06 2016
Return-Path: <owner-freebsd-current@freebsd.org>
Delivered-To: freebsd-current@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 09A2CC2289F
 for <freebsd-current@mailman.ysv.freebsd.org>;
 Thu, 27 Oct 2016 12:51:06 +0000 (UTC)
 (envelope-from dnebdal@gmail.com)
Received: from mail-wm0-x22b.google.com (mail-wm0-x22b.google.com
 [IPv6:2a00:1450:400c:c09::22b])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 99CD25E6;
 Thu, 27 Oct 2016 12:51:05 +0000 (UTC)
 (envelope-from dnebdal@gmail.com)
Received: by mail-wm0-x22b.google.com with SMTP id b80so32135708wme.1;
 Thu, 27 Oct 2016 05:51:05 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:from:date:message-id:subject:to
 :content-transfer-encoding;
 bh=k8XSr+5jgrLXk5wHP5q73ZWArSm7wtFeI4pvMZTcLL4=;
 b=jaGDGICAgAthmeTRnJFnwg7SzqU43LrgC7AVsBjL7JyljvdCRURp8LmxeCNzaxMQ6p
 0ry5LGSqABDxcn8S2BsduKU/H9Twj1jOC++JEsu48o0BIJRmfQ8dxNa0unKeTVmLK9n8
 vxNc3cb8mwIslYHHuj3ygXr7wS9mBwpeGERIEWhiR8SBmX4e+vKh0lhnCvrmKmugdEGj
 07gH4xwR3R+gclcOsUx1u87nq6McOwbVX7ZfBcDYAmB85PZfuuqLPV//bCLZtTRg/dYD
 g48U/OcocmLJADyWV7ziEvMh0iRyn/35bHvcLzmgdRMV+4tC55y7FhsB9z7jVRMH2jEY
 jqRQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:mime-version:in-reply-to:references:from:date
 :message-id:subject:to:content-transfer-encoding;
 bh=k8XSr+5jgrLXk5wHP5q73ZWArSm7wtFeI4pvMZTcLL4=;
 b=g/XlquxfpXFShqWXSgMmKfjFbUiEGCXn9i4Uk4Ju0Dz5ti7tx/jhoU4qT1EuOk82ta
 JMvYsJMFxjC/aMarOOMjgWUzy9HFrwJdSi9B6sSjLdHnUHxCbebNRlU8m69lrt24FOla
 4BTgVxkn3Kwsv6qK3d9aZ1kNl4xHWb1xmk+1UtNWelN9Y66YuA1hWu8OcPFRVMYUNOdV
 fxYCQIbxthM1ttqoutnLeQo9QtU5dv0sgBIGiCiDKJetzc2OnR9z3TsnNh6fqAJiv0E/
 F+GG7iSFWRR0oqJlAg/JY3lMbrHW/sxT7Q3EiqDP3IFc1amDyYerg2mKN4QGUlSI5cwd
 +1Zw==
X-Gm-Message-State: ABUngve6L028MF3HUPt9Y49nlhKe4YYKgkGDpYydMKlQQ2vPgVWKDFZZVKMUmMtiYrAtAZUUPOfztgZI1XyMeQ==
X-Received: by 10.28.12.77 with SMTP id 74mr8902726wmm.115.1477572663680; Thu,
 27 Oct 2016 05:51:03 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.28.109.132 with HTTP; Thu, 27 Oct 2016 05:51:03 -0700 (PDT)
In-Reply-To: <20161026164518.GD2734@acme.spoerlein.net>
References: <20161015161848.GD2532@acme.spoerlein.net>
 <6926bd72-35c9-cb21-4785-b50a05e581be@selasky.org>
 <CAN6yY1toUPRt3zBBJtgDVoFJCmZdgoGaUXznS9BbooN9BbfeEQ@mail.gmail.com>
 <20161024174327.GB2734@acme.spoerlein.net>
 <20161026164343.GC2734@acme.spoerlein.net>
 <20161026164518.GD2734@acme.spoerlein.net>
From: Daniel Nebdal <dnebdal@gmail.com>
Date: Thu, 27 Oct 2016 14:51:03 +0200
Message-ID: <CA+t49PLPHXw-F5t16G0O8YrNPynyB7eo2g6+ErZnbMziwCGj0g@mail.gmail.com>
Subject: Re: 11.x deadlocking during pfault (was Re: FreeBSD 11.x grinds to a
 halt after about 48h of uptime)
To: Kevin Oberman <rkoberman@gmail.com>, kib@freebsd.org, 
 Hans Petter Selasky <hps@selasky.org>,
 FreeBSD Current <freebsd-current@freebsd.org>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
 <freebsd-current.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-current>, 
 <mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current/>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-current>, 
 <mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 27 Oct 2016 12:51:06 -0000

On Wed, Oct 26, 2016 at 6:45 PM, Ulrich Sp=C3=B6rlein <uqs@freebsd.org> wro=
te:
>
> On Wed, 2016-10-26 at 18:43:43 +0200, Ulrich Sp=C3=B6rlein wrote:
> > On Mon, 2016-10-24 at 19:43:27 +0200, Ulrich Sp=C3=B6rlein wrote:
> > > On Sat, 2016-10-15 at 09:36:27 -0700, Kevin Oberman wrote:
> > > > On Sat, Oct 15, 2016 at 9:26 AM, Hans Petter Selasky <hps@selasky.o=
rg>
> > > > wrote:
> > > >
> > > > > On 10/15/16 18:18, Ulrich Sp=C3=B6rlein wrote:
> > > > >
> > > > >> Hey all, while 11.x is -STABLE now, this happens to my machine e=
ver
> > > > >> since I upgraded it to 11-CURRENT years ago. I have no idea when=
 this
> > > > >> started, actually, but what always happens is this:
> > > > >>
> > > > >> - System and X11 is up and running, I keep it running over night=
 as I'm
> > > > >> too lazy to reboot and restart everthing.
> > > > >> - There's a bunch of xterms, Chrome, Clementine-Player and some =
other
> > > > >> programs running
> > > > >> - Coming back to the machine the next day (or the day after) it =
will
> > > > >> exit the screensaver just fine and then either I can use it for =
a couple
> > > > >> of seconds before it freezes, or it's pretty much dead already. =
The
> > > > >> mouse cursor still moves for a bit, but the also freezes (so it =
this a
> > > > >> GPU problem??)
> > > > >>
> > > > >> Now what I currently see on the screen is a clock widget stuck a=
t 18:04
> > > > >> but conky itself has last updated at 18:00:18 ...
> > > > >>
> > > > >> This time I had some SSH sessions from another machine to see so=
me more
> > > > >> useful things. There was nothing in various logs under /var/log =
(I also
> > > > >> can't run dmesg anymore ...)
> > > > >> I had top(1) running in a loop, this is the last output:
> > > > >>
> > > > >> last pid: 25633;  load averages:  0.27,  0.39,  0.36  up 1+23:03=
:28
> > > > >> 18:00:12
> > > > >> 202 processes: 2 running, 188 sleeping, 11 zombie, 1 waiting
> > > > >>
> > > > >> Mem: 8873M Active, 1783M Inact, 5072M Wired, 567M Buf, 132M Free
> > > > >> ARC: 1844M Total, 469M MFU, 268M MRU, 16K Anon, 96M Header, 1012=
M Other
> > > > >> Swap: 4096M Total, 2395M Used, 1701M Free, 58% Inuse
> > > > >>
> > > > >>
> > > > >>   PID USERNAME      THR PRI NICE   SIZE    RES STATE   C   TIME =
   WCPU
> > > > >> COMMAND
> > > > >>    11 root            8 155 ki31     0K   128K CPU0    0 364.6H =
772.95%
> > > > >> idle
> > > > >>              3122 uqs            15  28    0  7113M  5861M uwait=
   0
> > > > >> 94:44  13.96% chrome
> > > > >>                            2887 uqs            28  22    0  1394=
M   237M
> > > > >> select  2 172:53   6.98% chrome
> > > > >>                                        2890 uqs            11  2=
1    0
> > > > >> 1034M   178M select  5 231:21   1.95% chrome
> > > > >>                                                    1062 root    =
        9
> > > > >> 21    0   440M 47220K select  0  67:09   0.98% Xorg
> > > > >>                                                              300=
2 uqs
> > > > >>       15  25    5  1159M   172M uwait   2  19:09   0.00% chrome
> > > > >>  3139 uqs            17  25    5  1163M   156M uwait   2  16:15 =
  0.00%
> > > > >> chrome
> > > > >>  3001 uqs            18  25    5  1639M   575M uwait   0  16:05 =
  0.00%
> > > > >> chrome
> > > > >>    12 root           24 -64    -     0K   384K WAIT   -1  10:53 =
  0.00%
> > > > >> intr
> > > > >>  3129 uqs            12  20    0  2820M  1746M uwait   6   8:36 =
  0.00%
> > > > >> chrome
> > > > >>  2822 uqs             9  20    0   217M 81300K select  0   5:10 =
  0.00%
> > > > >> conky
> > > > >>  3174 root            1  20    0 21532K  3188K select  0   4:20 =
  0.00%
> > > > >> systat
> > > > >>  3130 uqs            16  20    0  1058M   131M uwait   4   3:03 =
  0.00%
> > > > >> chrome
> > > > >>  2998 uqs            16  20    0  1110M   123M uwait   2   2:53 =
  0.00%
> > > > >> chrome
> > > > >>  3165 uqs            10  20    0  1209M   215M uwait   6   2:52 =
  0.00%
> > > > >> chrome
> > > > >>  3142 uqs            11  25    5  1344M   195M uwait   2   2:46 =
  0.00%
> > > > >> chrome
> > > > >>  2876 uqs            19  20    0   580M 37164K select  3   2:42 =
  0.00%
> > > > >> clementine-player
> > > > >>    20 root            2 -16    -     0K    32K psleep  6   2:25 =
  0.00%
> > > > >> pagedaemon
> > > > >>
> > > > >> I also had systat -vm running and it continued to update its scr=
een ...
> > > > >> for a short while, this is the last update before SSH died:
> > > > >>
> > > > >>
> > > > >>    Mem usage:  0k%Phy  5%Kmem
> > > > >> Mem: KB    REAL            VIRTUAL                      VN PAGER=
   SWAP
> > > > >> PAGER
> > > > >>         Tot   Share      Tot    Share    Free           in   out=
     in
> > > > >>  out
> > > > >> Act  11051k   67868 71051992   255448   61840  count
> > > > >> All  11051k   67924 71058776   262100          pages
> > > > >> Proc:
> > > > >> Interrupts
> > > > >>   r   p   d   s   w   Csw  Trp  Sys  Int  Sof  Flt        ioflt =
  224
> > > > >> total
> > > > >>      25     730  11   724  109  404  101   13             cow   =
    2
> > > > >> ehci0 16
> > > > >>                                                           zfod  =
    3
> > > > >> ehci1 23
> > > > >>  0.0%Sys   0.1%Intr  0.0%User  0.0%Nice 99.9%Idle         ozfod =
   16
> > > > >> cpu0:timer
> > > > >> |    |    |    |    |    |    |    |    |    |           %ozfod
> > > > >>  xhci0 264
> > > > >>                                                           daefr =
    3 em0
> > > > >> 265
> > > > >>                                         50 dtbuf          prcfr =
   94
> > > > >> hdac1 266
> > > > >> Namei     Name-cache   Dir-cache    349167 desvn          totfr
> > > > >>  ahci0 270
> > > > >>    Calls    hits   %    hits   %    349155 numvn          react =
    5
> > > > >> cpu1:timer
> > > > >>      121     121 100                253501 frevn          pdwak =
    1
> > > > >> cpu2:timer
> > > > >>                                                           pdpgs =
   29
> > > > >> cpu7:timer
> > > > >> Disks   md0  ada0  ada1 pass0 pass1 pass2                 intrn =
   12
> > > > >> cpu3:timer
> > > > >> KB/t   0.00  0.00  0.00  0.00  0.00  0.00         5318892 wire  =
   41
> > > > >> cpu6:timer
> > > > >> tps       0     0     0     0     0     0         9261404 act   =
   12
> > > > >> cpu5:timer
> > > > >> MB/s   0.00  0.00  0.00  0.00  0.00  0.00         1598184 inact =
    6
> > > > >> cpu4:timer
> > > > >> %busy     0     0     0     0     0     0                 cache
> > > > >>  vgapci0
> > > > >>                                                     61840 free
> > > > >>                                                    712304 buf
> > > > >>
> > > > >>
> > > > >> Why do I have a Chrome tab using about 6G? What other sort of de=
bugging
> > > > >> output can be helpful to get to the bottom of this? The machine =
still
> > > > >> responds to pings just fine, TCP connections get set up but the =
SSH
> > > > >> handshake never completes.
> > > > >>
> > > > >> This always happens between 30-50h and is super annoying and has=
 been
> > > > >> going on for >1year. Help?
> > > > >>
> > > > >> Note, I cut the power to the monitor overnight to save electrici=
ty, can
> > > > >> this mess up something in the Radeon card or X server? What comb=
inations
> > > > >> would be most useful to try next?
> > > > >>
> > > > >>
> > > > > Hi,
> > > > >
> > > > > Sounds like a memory leak. Can you track the memory use over time=
?
> > > > >
> > > > > Did you look at the output from:
> > > > >
> > > > > vmstat -m ?
> > > > >
> > > > > --HPS
> > > >
> > > >
> > > > I have noted significant  memory leakage in chromium for some time.=
 If I
> > > > leave it running overnight, my system is essentially frozen. If I t=
erminate
> > > > the chromium process, it slowly comes back to life. I always keep a=
 gkrellm
> > > > session on-screen where the memory and swap utilization is continuo=
usly
> > > > displayed and that clearly shows resources declining.
> > >
> > > That is not what is happening to my system though, it actually
> > > deadlocks. There's no way to recover from it, it seems.
> > >
> > > So I killed Chromium overnight each day, and I'm at this:
> > >
> > > % top -Sbores
> > > last pid: 44526;  load averages:  0.10,  0.11,  0.56  up 7+09:53:30  =
  19:33:25
> > > 156 processes: 2 running, 153 sleeping, 1 waiting
> > >
> > > Mem: 315M Active, 550M Inact, 5671M Wired, 515M Buf, 9324M Free
> > > ARC: 1852M Total, 541M MFU, 196M MRU, 16K Anon, 93M Header, 1022M Oth=
er
> > > Swap: 4096M Total, 2186M Used, 1910M Free, 53% Inuse
> > >
> > >
> > >   PID USERNAME      THR PRI NICE   SIZE    RES STATE   C   TIME    WC=
PU COMMAND
> > >  2755 uqs            10  20    0  1697M   311M select  1  47:23   0.0=
0% conky
> > >  2736 uqs            32  20    0   699M   116M select  7  94:29   0.0=
0% clementine-player
> > >  3000 uqs            12  20    0  1126M 69380K select  5   9:48   0.0=
0% digikam
> > >   960 root            9  20    0   448M 59076K select  0 250:22   0.0=
0% Xorg
> > > 72608 uqs             8  20    0   939M 55432K uwait   5   0:01   0.0=
0% chrome
> > > 72599 uqs             9  52    0   929M 55116K uwait   0   0:00   0.0=
0% chrome
> > >  2567 root            1  20    0 89948K 42964K select  1   1:51   0.0=
0% bsnmpd
> > > 70476 uqs             1  20    0 93656K 25712K select  2   0:05   0.0=
0% xterm
> > >  2730 uqs             5  20    0   208M 14988K select  1   0:22   0.0=
0% clock-applet
> > >   880 root            1  20    0 22628K 12500K select  3   0:20   0.0=
0% ntpd
> > >  2726 uqs             4  20    0   206M 12456K select  6   0:09   0.0=
0% mateweather-applet
> > > 44352 uqs             1  20    0 75224K 12348K select  4   0:00   0.0=
0% xterm
> > > 43049 uqs             1  20    0 75224K 11792K select  5   0:00   0.0=
0% xterm
> > >  3074 uqs             2  20    0   308M  9692K select  1   0:02   0.0=
0% kdeinit4
> > >  2671 uqs             1  20    0   144M  9488K select  1   0:13   0.0=
0% openbox
> > >  3072 uqs             1  20    0   210M  8284K select  3   0:00   0.0=
0% kdeinit4
> > >  2724 uqs             4  20    0   154M  8256K select  2   0:19   0.0=
0% wnck-applet
> > >  2701 uqs             5  20    0   177M  8144K select  2   0:01   0.0=
0% mate-panel
> > >
> > >
> > > 7d running, pretty good. But look closer, the system is doing pretty
> > > much nothing but did swap out 2G. What?
> > >
> > > > Try closing your chromium at night and see if that fixes the proble=
m.
> > >
> > > It's better, but I'm not sure it's a real fix. I've now turned off
> > > "hardware acceleration" in Chromium, though chrome://gpu didn't real
> > > inspire confidence that it was actually using any h/w accel at all.
> > >
> > >
> > > > If you have never tried gkrellm (sysutils/gkrellm2), it is a the be=
st
> > > > system monitor I have found. though pulls in a lot of dependencies.=
 It also
> > > > can run as a server with remote systems displaying the data. Handy =
to
> > > > monitor servers.
> > >
> > > I had a cacti-setup that would also monitor my workstation (through a
> > > OpenVPN tunnel), but that has bit-rotted and Apache only gives me 500=
s
> > > on that cacti URL and nothing in the logs, oh well ...)
> > >
> > > Hooking up a serial console and testing whether DDB works is probably
> > > the next best step to take ...
> >
> > Sigh, I forgot to shut down Chrome overnight, and my system is
> > deadlocked again. I can still switch virtual desktops in X11 and a
> > running xterm accepts keyboard input, but it is 18:35 now and I see tha=
t
> > my X11 clock has last updated at 17:18 and conky is stuck at 17:14:11
> >
> > my top-in-a-loop is stuck here and no longer loops:
> >
> > last pid: 73731;  load averages:  0.23,  0.24,  0.23  up 9+07:34:15    =
17:14:10
> > 160 processes: 3 running, 146 sleeping, 11 zombie
> >
> > Mem: 9302M Active, 763M Inact, 5682M Wired, 752M Buf, 113M Free
> > ARC: 1731M Total, 549M MFU, 129M MRU, 16K Anon, 91M Header, 963M Other
> > Swap:
> >
> >
> >   PID USERNAME      THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU=
 COMMAND
> >   960 root            9  21    0   451M 55796K CPU7    7 299:15   2.98%=
 Xorg
> > 53884 uqs            27  22    0   904M   250M CPU4    4 105:05   2.98%=
 chrome
> > 54081 uqs            15  21    0  3601M  2527M uwait   6  38:11   0.98%=
 chrome
> >  2736 uqs            33  20    0   697M 84620K select  5  95:38   0.00%=
 clementine-player
> >  2755 uqs            10  20    0  2111M   721M select  0  60:47   0.00%=
 conky
> > 78943 uqs             1  20    0 21532K  3328K select  7  13:12   0.00%=
 systat
> > 54048 uqs            15  25    5  1093M   159M uwait   4   9:51   0.00%=
 chrome
> >  3000 uqs            12  20    0  1126M 53248K select  7   9:50   0.00%=
 digikam
> > 53999 uqs            19  25    5  1525M   514M uwait   5   6:28   0.00%=
 chrome
> >  2703 uqs             1  20    0   169M  5948K select  0   5:40   0.00%=
 mate-volume-control
> >  2707 uqs             1  20    0 60240K  2516K select  7   5:21   0.00%=
 autocutsel
> > 54077 uqs            15  20    0  1803M   821M uwait   6   3:51   0.00%=
 chrome
> > 78869 uqs             1  20    0 93480K  6636K select  7   3:38   0.00%=
 sshd
> > 38396 uqs             1  20    0 75224K  3896K select  0   3:18   0.00%=
 xterm
> >  2567 root            1  20    0 89948K 43892K select  6   2:25   0.00%=
 bsnmpd
> >   876 root            9  52    0 30120K  3468K uwait   7   2:20   0.00%=
 nscd
> >   883 root            1  20    0 10444K  1708K select  0   2:16   0.00%=
 powerd
> >  2586 haldaemon       2  20    0 56620K  3920K select  1   2:04   0.00%=
 hald
> >
> >
> > systat -vm:
> >
> >    10 users    Load  0.21  0.23  0.23                  Oct 26 17:14
> >    Mem usage:  0k%Phy  5%Kmem
> > Mem: KB    REAL            VIRTUAL                      VN PAGER   SWAP=
 PAGER
> >         Tot   Share      Tot    Share    Free           in   out     in=
   out
> > Act  10545k   51036 67451248   249344   61912  count   120     3
> > All  10550k   55504 67480212   275720          pages   807     3
> > Proc:                                                            Interr=
upts
> >   r   p   d   s   w   Csw  Trp  Sys  Int  Sof  Flt    149 ioflt   845 t=
otal
> >       6     781  12  9327 6923  24k  318   18 6649     20 cow       2 e=
hci0 16
> >                                                      5064 zfod      3 e=
hci1 23
> >  4.7%Sys   0.0%Intr  1.7%User  0.0%Nice 93.6%Idle      16 ozfod   106 c=
pu0:timer
> > |    |    |    |    |    |    |    |    |    |           %ozfod       x=
hci0 264
> > =3D=3D>                                                   927 daefr    =
45 em0 265
> >                                         51 dtbuf      545 prcfr    94 h=
dac1 266
> > Namei     Name-cache   Dir-cache    349167 desvn     1787 totfr   175 a=
hci0 270
> >    Calls    hits   %    hits   %    337762 numvn          react    50 c=
pu1:timer
> >      935     901  96                170078 frevn          pdwak    48 c=
pu3:timer
> >                                                   1435985 pdpgs    66 c=
pu2:timer
> > Disks   md0  ada0  ada1 pass0 pass1 pass2                 intrn    77 c=
pu7:timer
> > KB/t   0.00 22.59  0.00  0.00  0.00  0.00         5819100 wire     75 c=
pu4:timer
> > tps      17   168     0     0     0     0         9723008 act      51 c=
pu6:timer
> > MB/s   0.00  3.70  0.00  0.00  0.00  0.00          636300 inact    53 c=
pu5:timer
> > %busy     0     3     0     0     0     0                 cache       v=
gapci0
> >                                                     61912 free
> >                                                    770732 buf
> >
> >
> > I'm unable to scroll back the vmstat -m output in my tmux pane (running=
 on a
> > different system, this is super strange), so all I can show is this:
> >
> >
> >      inodedep    16 12293K       -  2770872  512
> >     bmsafemap     9    49K       -  2136243  256,8192
> >        newblk    28 24582K       -  4111401  256
> >      indirdep     4     1K       -   865983  128,16384
> >      freefrag     6     1K       -   682027  128
> >      freeblks     3     1K       -  1318985  256
> >      freefile     3     1K       -  1272832  64
> >        diradd     2     1K       -  1306741  128
> >         mkdir     0     0K       -     5120  128
> >        dirrem     0     0K       -  1305358  128
> >     newdirblk     0     0K       -     2610  64
> >      freework     9     1K       -  1566489  64,128
> >         sbdep     0     0K       -    45661  64
> >      savedino     0     0K       -   186280  256
> >       softdep     6     3K       -        6  512
> >   ufs_dirhash  1533   767K       -   109513  16,32,64,128,256,512,1024,=
2048,4096
> >     ufs_quota     1  2048K       -        1
> >     ufs_mount    18    55K       -       18  512,2048,4096,8192
> >     vm_pgdata     1  2048K       -        2  128
> >       UMAHash    23  5385K       -      108  512,1024,2048,4096,8192,16=
384,32768,65536
> >       memdesc     1     4K       -        1  4096
> >      atkbddev     2     1K       -        2  64
> >       entropy     1     1K       -   894489  32,4096
> >        apmdev     1     1K       -        1  128
> >    madt_table     0     0K       -        1  4096
> >      SCSI ENC    25   100K       -   120744  16,64,256,2048,32768
> >       io_apic     1     2K       -        1  2048
> >           MCA    18     3K       -       18  64,128
> >           msi    16     2K       -       16  128
> >      nexusdev     5     1K       -        5  16
> >          hdaa     5    54K       -        5  2048,16384,32768
> >          hdac     1     1K       -        1  1024
> >         hdacc     1     1K       -        1  32
> >         linux    29     2K       -       29  64
> >       solaris 295750 228696K       - 140520323  16,32,64,128,256,512,10=
24,2048,4096,8192,32768
> >    kstat_data     6     1K       -        6  64
> >      eli data    22     4K       -  6218901  64,256,512,1024,2048,4096,=
8192,16384,32768,65536
> >          ksem     1     8K       -        1  8192
> >   nullfs_hash     1  2048K       -        1
> >   nullfs_node     9     1K       -       41  64
> >  nullfs_mount     9     1K       -        9  32
> >   fdesc_mount     1     1K       -        1  16
> >      gem_name    46    14K       -      122  32,4096
> >    drm_global     2     1K       -        2  128,256
> >       drm_dma     1     1K       -        1  32
> >     drm_sarea     1     1K       -        1  16
> >    drm_driver    91  2278K       -      125  16,32,64,128,256,512,1024,=
2048,4096,8192,32768
> >     drm_minor     2     1K       -        2  128
> >     drm_files     1     1K       -        3  512
> > drm_ctxbitmap     1     4K       -        1  4096
> >      drm_sman    41     6K       -       42  128
> >   drm_hashtab     3  4129K       -        4  128,32768
> >       drm_kms    69    19K       -      163  16,32,64,128,256,512,2048,=
4096,8192
> >    drm_vblank     7     1K       -        7  32,256
> >        ttm_pd    16    17K       -       18  16,128,2048,65536
> >      ttm_rman     2     1K       -        2  256
> >      ttm_zone     2     1K       -        2  64
> >   ttm_poolmgr     1     1K       -        1  512
> >
> >
> > Now what?
> >
> > The xterm I have running locally with a stuck top is showing the top 3 =
chrome
> > processes in pfault state and it has "Swap: 11M In" in the header, so c=
learly
> > 11.x is prone to deadlock during page faults and or swapping. It has la=
st
> > updated on 17:14:13 (compare to the other top at 17:14:10 not showing p=
fault
> > state yet).
>
> Addendum: I still see the HDD indication light flickering every couple
> of seconds, so something is still doing I/O. My SSH sessions into the
> machine haven't time out, and the screensaver (just DPMS blank) is also
> still kicking in correctly.
>

Just out of general interest, how is your swap set up? Is it on a zvol
or something exotic like that, or is it just a normal partition?

--=20
Daniel Nebdal