Date: Sat, 2 Apr 2016 11:39:10 +0200
From: "O. Hartmann" <ohartman@zedat.fu-berlin.de>
To: Cy Schubert <Cy.Schubert@komquats.com>
Cc: Michael Butler <imb@protected-networks.net>, "K. Macy" <kmacy@freebsd.org>, FreeBSD CURRENT <freebsd-current@freebsd.org>
Subject: Re: CURRENT slow and shaky network stability
Message-ID: <20160402113910.14de7eaf.ohartman@zedat.fu-berlin.de>
In-Reply-To: <20160402105503.7ede5be1.ohartman@zedat.fu-berlin.de>
References: <56F6C6B0.6010103@protected-networks.net> <201604020807.u3287tgc034452@slippy.cwsent.com> <20160402105503.7ede5be1.ohartman@zedat.fu-berlin.de>

On Sat, 2 Apr 2016 10:55:03 +0200, "O. Hartmann" <ohartman@zedat.fu-berlin.de> wrote:

> On Sat, 02 Apr 2016 01:07:55 -0700,
> Cy Schubert <Cy.Schubert@komquats.com> wrote:
>
> > In message <56F6C6B0.6010103@protected-networks.net>, Michael Butler writes:
> > > -current is not great for interactive use at all. The strategy of
> > > pre-emptively dropping idle processes to swap is hurting .. big time.
> >
> > FreeBSD doesn't "preemptively" or arbitrarily push pages out to disk. LRU
> > doesn't do this.
> >
> > >
> > > Compare inactive memory to swap in this example ..
> > >
> > > 110 processes: 1 running, 108 sleeping, 1 zombie
> > > CPU: 1.2% user, 0.0% nice, 4.3% system, 0.0% interrupt, 94.5% idle
> > > Mem: 474M Active, 1609M Inact, 764M Wired, 281M Buf, 119M Free
> > > Swap: 4096M Total, 917M Used, 3178M Free, 22% Inuse
> >
> > To analyze this you need to capture vmstat output. You'll see the free pool
> > dip below a threshold and pages go out to disk in response. If you have
> > daemons with small working sets, pages that are not part of the working
> > sets for daemons or applications will eventually be paged out. This is not
> > a bad thing. In your example above, the 281 MB of UFS buffers are more
> > active than the 917 MB paged out. If it's paged out and never used again,
> > then it doesn't hurt. However the 281 MB of buffers saves you I/O. The
> > inactive pages are part of your free pool that were active at one time but
> > now are not. They may be reclaimed and if they are, you've just saved more
> > I/O.
> >
> > Top is a poor tool to analyze memory use. Vmstat is the better tool to help
> > understand memory use. Inactive memory isn't a bad thing per se. Monitor
> > page outs, scan rate and page reclaims.
> >
>
> I give up!
> Tried to check via ssh/vmstat what is going on. Last lines before the broken
> pipe:
>
> [...]
> procs   memory    page                       disks   faults            cpu
>  r b  w  avm  fre   flt re pi po     fr   sr ad0 ad1  in     sy     cs us sy id
> 22 0 22 5.8G 1.0G 46319  0  0  0  55721 1297   0   4 219  23907   5400 95  5  0
> 22 0 22 5.4G 1.3G 51733  0  0  0  72436 1162   0   0 108  40869   3459 93  7  0
> 15 0 22  12G 1.2G 54400  0 27  0  52188 1160   0  42 148  52192   4366 91  9  0
> 14 0 22  12G 1.0G 44954  0 37  0  37550 1179   0  39 141  86209   4368 88 12  0
> 26 0 22  12G 1.1G 60258  0 81  0  69459 1119   0  27 123 779569 704359 87 13  0
> 29 3 22  13G 774M 50576  0 68  0  32204 1304   0   2 102 507337 484861 93  7  0
> 27 0 22  13G 937M 47477  0 48  0  59458 1264   3   2 112  68131  44407 95  5  0
> 36 0 22  13G 829M 83164  0  2  0  82575 1225   1   0 126  99366  38060 89 11  0
> 35 0 22 6.2G 1.1G 98803  0 13  0 121375 1217   2   8 112  99371   4999 85 15  0
> 34 0 22  13G 723M 54436  0 20  0  36952 1276   0  17 153  29142   4431 95  5  0
> Fssh_packet_write_wait: Connection to 192.168.0.1 port 22: Broken pipe
>
>
> This makes this crap system completely unusable. The server (FreeBSD 11.0-CURRENT #20
> r297503: Sat Apr 2 09:02:41 CEST 2016 amd64) in question was running a poudriere bulk job. I
> cannot even determine which terminal goes down first - another one, idle for much longer than
> the one showing the "vmstat 5" output, is still alive!
>
> I consider this a serious bug, and nothing that has happened since this "fancy"
> update is a benefit. :-(

By the way, this might be of interest and give some hint.

One of my boxes acts as server and gateway. It uses NAT and IPFW. When it is
under high load, as it was today, passing the network flow from the ISP into the
network for clients is sometimes extremely slow.
I do not consider this the reason for the collapsing ssh sessions, since the
incident also happens under no load, but in the overall view of the problem it
could be a hint, I hope.
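
A minimal sketch of how the counters Cy suggests watching (page-outs, scan rate, page reclaims) could be pulled out of a saved "vmstat 5" capture like the one quoted above. It assumes the 19-column layout of that capture (two disks, ad0 and ad1); the function names and the `sy_cpu` label for the second `sy` column are made up for this illustration and are not part of vmstat.

```python
# Column order assumed from the header row of the quoted capture.
# The second "sy" (CPU system time) is renamed so the keys stay unique.
COLUMNS = [
    "r", "b", "w", "avm", "fre", "flt", "re", "pi", "po", "fr", "sr",
    "ad0", "ad1", "in", "sy", "cs", "us", "sy_cpu", "id",
]


def parse_vmstat_row(line):
    """Return a dict for one data row, or None for headers and other lines."""
    fields = line.split()
    if len(fields) != len(COLUMNS) or not fields[0].isdigit():
        return None
    return dict(zip(COLUMNS, fields))


def paging_summary(lines):
    """Summarise the paging counters over all data rows found in `lines`."""
    rows = [row for row in map(parse_vmstat_row, lines) if row is not None]
    return {
        "samples": len(rows),
        "total_reclaims": sum(int(row["re"]) for row in rows),
        "max_page_ins": max((int(row["pi"]) for row in rows), default=0),
        "total_page_outs": sum(int(row["po"]) for row in rows),
        "max_scan_rate": max((int(row["sr"]) for row in rows), default=0),
    }


# Three rows copied from the capture above, plus its column header.
SAMPLE = """\
 r b  w  avm  fre   flt re pi po     fr   sr ad0 ad1  in     sy     cs us sy id
22 0 22 5.8G 1.0G 46319  0  0  0  55721 1297   0   4 219  23907   5400 95  5  0
15 0 22  12G 1.2G 54400  0 27  0  52188 1160   0  42 148  52192   4366 91  9  0
14 0 22  12G 1.0G 44954  0 37  0  37550 1179   0  39 141  86209   4368 88 12  0
"""

if __name__ == "__main__":
    for name, value in paging_summary(SAMPLE.splitlines()).items():
        print(f"{name}: {value}")
```

Running it over a longer capture (for example one written to a file on the server itself, so it survives a dropped ssh session) would show whether `po` and `sr` climb before a session dies. A machine with a different number of disks prints a different number of columns, so `COLUMNS` would need adjusting.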