Date:      Mon, 13 Jul 2015 09:48:38 -0300
From:      Christopher Forgeron <csforgeron@gmail.com>
To:        Karl Denninger <karl@denninger.net>
Cc:        FreeBSD Filesystems <freebsd-fs@freebsd.org>
Subject:   Re: FreeBSD 10.1 Memory Exhaustion
Message-ID:  <CAB2_NwCJ4sOV0LOLR2dRw+DNYmFJ_LxUx566nevCcgy-Ld0SYQ@mail.gmail.com>
In-Reply-To: <55A3A800.5060904@denninger.net>
References:  <CAB2_NwCngPqFH4q-YZk00RO_aVF9JraeSsVX3xS0z5EV3YGa1Q@mail.gmail.com> <55A3A800.5060904@denninger.net>

Hey, thanks, that's sounding exactly like my issue.

I'll patch and build that kernel this week and have it up for testing on at
least one of my boxes.
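
The rough plan, for reference (the patch path below is just a
placeholder for whatever attachment I grab from the PR):

        cd /usr/src
        patch < /tmp/arc-refactor.patch
        make buildkernel && make installkernel
        shutdown -r now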

While I agree that ZFS is the instigator of this behaviour, shouldn't the
kernel protect against it?

What would happen if a web server started chewing memory aggressively due
to a DoS attack?

On Mon, Jul 13, 2015 at 8:58 AM, Karl Denninger <karl@denninger.net> wrote:

> Put this on your box and see if the problem goes away.... :-)
>
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=187594
>
> The 2015-02-10 refactor will apply against 10.1-STABLE and 10.2-PRE (the
> latter will give you a 10-line fuzz in one block, but it applies and
> works).
>
> I've been unable to provoke misbehavior with this patch in and I run a
> cron job that does auto-snapshotting.  There are others that have run
> this patch with similarly positive results.
>
> On 7/13/2015 06:48, Christopher Forgeron wrote:
> > TL;DR Summary: I can run FreeBSD out of memory quite consistently, and
> > it’s not a TSO/mbuf exhaustion issue. It’s quite possible that ZFS is
> > the culprit, but shouldn’t the pager be able to handle aggressive
> > memory requests in a low-memory situation gracefully, without needing
> > custom tuning of ZFS / VM?
> >
> >
> > Hello,
> >
> > I’ve been dealing with some instability in my 10.1-RELEASE and
> > STABLEr282701M machines for the last few months.
> >
> > These machines are NFS/iSCSI storage machines, running on Dell M610x
> > or similar hardware, 96 GiB of memory, 10 Gig network cards, dual Xeon
> > processors – fairly beefy stuff.
> >
> > Initially I thought it was more issues with TSO / jumbo mbufs, as I
> > had this problem last year. I had thought that this was properly
> > resolved, but setting my MTU to 1500 and turning off TSO did give me a
> > bit more stability. Currently all my machines are set this way.
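> >
> > (For reference, that amounts to something like the following on each
> > 10 Gig interface, where 'ix0' is only a placeholder for the real
> > interface name:
> >
> >         ifconfig ix0 mtu 1500 -tso
> >
> > plus the matching ifconfig_ix0 line in /etc/rc.conf to keep it across
> > reboots.)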
> >
> > Crashes were usually represented by loss of network connectivity, and
> > the ctld daemon scrolling messages across the screen at full speed
> > about lost connections.
> >
> > All of this did seem like more network stack problems, but with each
> > crash I’d be able to learn a bit more.
> >
> > Usually there was nothing of any use in the logfile, but every now and
> > then I’d get this:
> >
> > Jun  3 13:02:04 san0 kernel: WARNING: 172.16.0.97
> > (iqn.1998-01.com.vmware:esx5a-3387a188): failed to allocate memory
> > Jun  3 13:02:04 san0 kernel: WARNING: icl_pdu_new: failed to allocate
> > 80 bytes
> > Jun  3 13:02:04 san0 kernel: WARNING: 172.16.0.97
> > (iqn.1998-01.com.vmware:esx5a-3387a188): failed to allocate memory
> > Jun  3 13:02:04 san0 kernel: WARNING: icl_pdu_new: failed to allocate
> > 80 bytes
> > Jun  3 13:02:04 san0 kernel: WARNING: 172.16.0.97
> > (iqn.1998-01.com.vmware:esx5a-3387a188): failed to allocate memory
> > ---------
> > Jun  4 03:03:09 san0 kernel: WARNING: icl_pdu_new: failed to allocate
> > 80 bytes
> > Jun  4 03:03:09 san0 kernel: WARNING: icl_pdu_new: failed to allocate
> > 80 bytes
> > Jun  4 03:03:09 san0 kernel: WARNING: 172.16.0.97
> > (iqn.1998-01.com.vmware:esx5a-3387a188): failed to allocate memory
> > Jun  4 03:03:09 san0 kernel: WARNING: 172.16.0.97
> > (iqn.1998-01.com.vmware:esx5a-3387a188): connection error; dropping
> > connection
> > Jun  4 03:03:09 san0 kernel: WARNING: 172.16.0.97
> > (iqn.1998-01.com.vmware:esx5a-3387a188): connection error; dropping
> > connection
> > Jun  4 03:03:10 san0 kernel: WARNING: 172.16.0.97
> > (iqn.1998-01.com.vmware:esx5a-3387a188): waiting for CTL to terminate
> > tasks, 1 remaining
> > Jun  4 06:04:27 san0 syslogd: kernel boot file is /boot/kernel/kernel
> >
> > So knowing that it seemed to be running out of memory, I started
> > leaving ‘vmstat 5’ running on a console, to see what it was
> > displaying during the crash.
> >
> > It was always the same thing:
> >
> >  0 0 0   1520M  4408M    15   0   0   0    25  19   0   0 21962 1667 91390  0 33 67
> >  0 0 0   1520M  4310M     9   0   0   0     2  15   3   0 21527 1385 95165  0 31 69
> >  0 0 0   1520M  4254M     7   0   0   0    14  19   0   0 17664 1739 72873  0 18 82
> >  0 0 0   1520M  4145M     2   0   0   0     0  19   0   0 23557 1447 96941  0 36 64
> >  0 0 0   1520M  4013M     4   0   0   0    14  19   0   0  4288  490 34685  0 72 28
> >  0 0 0   1520M  3885M     2   0   0   0     0  19   0   0 11141 1038 69242  0 52 48
> >  0 0 0   1520M  3803M    10   0   0   0    14  19   0   0 24102 1834 91050  0 33 67
> >  0 0 0   1520M  8192B     2   0   0   0     2  15   1   0 19037 1131 77470  0 45 55
> >  0 0 0   1520M  8192B     0  22   0   0     2   0   6   0   146   82   578  0  0 100
> >  0 0 0   1520M  8192B     1   0   0   0     0   0   0   0   130   40   510  0  0 100
> >  0 0 0   1520M  8192B     0   0   0   0     0   0   0   0   143   40   501  0  0 100
> >  0 0 0   1520M  8192B     0   0   0   0     0   0   0   0   201   62   660  0  0 100
> >  0 0 0   1520M  8192B     0   0   0   0     0   0   0   0   101   28   404  0  0 100
> >  0 0 0   1520M  8192B     0   0   0   0     0   0   0   0    97   27   398  0  0 100
> >  0 0 0   1520M  8192B     0   0   0   0     0   0   0   0    93   28   377  0  0 100
> >  0 0 0   1520M  8192B     0   0   0   0     0   0   0   0    92   27   373  0  0 100
> >
> >
> >  I’d go from a decent amount of free memory to suddenly having none.
> > Vmstat would stop outputting, console commands would hang, etc. The
> > whole system would be useless.
> >
> > Looking into this, I came across a similar issue:
> >
> > https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=199189
> >
> > I started increasing vm.v_free_min, and it helped – my crashes went
> > from being ~every 6 hours to every few days.
> >
> > Currently I’m running with vm.v_free_min=1254507 – that’s (1254507 *
> > 4 KiB), or 4.78 GiB of reserve. The vmstat above is of a machine with
> > that setting still running down to 8192 B of free memory.
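> >
> > (Checking that arithmetic: 1254507 pages * 4096 B/page =
> > 5,138,460,672 B, which is about 4.78 GiB.)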
> >
> > I have two issues here:
> >
> > 1) I don’t think I should ever be able to run the system into the
> > ground on memory. Deny me new memory until the pager can free more.
> > 2) Setting ‘min’ doesn’t really mean ‘min’, as it can obviously go
> > below that threshold.
> >
> >
> > I have plenty of local UFS swap (non-ZFS drives).
> >
> >  Adrian requested that I output a few more diagnostic items, and this
> > is what I’m running on a console now, in a loop (a one-liner form is
> > shown after the list):
> >
> >         vmstat
> >         netstat -m
> >         vmstat -z
> >         sleep 1
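> >
> > As a single sh one-liner, that is roughly:
> >
> >         while :; do vmstat; netstat -m; vmstat -z; sleep 1; done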
> >
> > The output of four crashes is attached here, as they can be a bit
> > long. Let me know if that’s not a good way to report them. They will
> > each start mid-way through a ‘vmstat -z’ output, as that’s as far
> > back as my terminal buffer allows.
> >
> >
> >
> > Now, I have a good idea of the conditions that are causing this: ZFS
> > Snapshots, run by cron, during times of high ZFS writes.
> >
> > The crashes are all nearly on the hour, as that’s when crontab
> > triggers my Python scripts to make new snapshots and delete old ones.
> >
> > My average FreeBSD machine has ~30 ZFS datasets, with each pool
> > having ~20 TiB used. These all need to snapshot on the hour.
> >
> > By staggering the snapshots by a few minutes (see the crontab sketch
> > below), I have been able to reduce crashing from every other day to
> > perhaps once a week if I’m lucky – but if I start moving a lot of
> > data around, I can cause daily crashes again.
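> >
> > The staggered schedule looks something like this (the script path and
> > dataset names here are illustrative, not my real ones):
> >
> >         # /etc/crontab entries: snapshots offset a few minutes apart
> >         0  *  *  *  *  root  /root/bin/zfs_snap.py tank/vol01
> >         4  *  *  *  *  root  /root/bin/zfs_snap.py tank/vol02
> >         8  *  *  *  *  root  /root/bin/zfs_snap.py tank/vol03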
> >
> > It’s looking to be the memory demand of snapshotting lots of ZFS
> > datasets at the same time while accepting a lot of write traffic.
> >
> > Now perhaps the answer is ‘don’t do that’, but I feel that FreeBSD
> > should be robust enough to handle this. I don’t mind tuning for now
> > to reduce/eliminate this, but others shouldn’t run into this pain
> > just because they heavily load their machines – there must be a way
> > of avoiding this condition.
> >
> > Here are the contents of my /boot/loader.conf and sysctl.conf, to
> > show the minimal tuning that makes this problem a little more
> > bearable:
> >
> > /boot/loader.conf
> > vfs.zfs.arc_meta_limit=49656727553
> > vfs.zfs.arc_max=91489280512
> >
> > /etc/sysctl.conf
> > vm.v_free_min=1254507
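> >
> > (The vm.v_free_min setting can also be applied live with
> > ‘sysctl vm.v_free_min=1254507’; as far as I know, the ARC tunables
> > only take effect from loader.conf at boot on this release.)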
> >
> >
> > Any suggestions/help is appreciated.
> >
> > Thank you.
> >
> >
> > _______________________________________________
> > freebsd-fs@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>
> --
> Karl Denninger
> karl@denninger.net <mailto:karl@denninger.net>
> /The Market Ticker/
> /[S/MIME encrypted email preferred]/
>


