From owner-freebsd-net@FreeBSD.ORG Fri Jun 12 12:37:39 2015
In-Reply-To: <557AD10D.5070205@field.hu>
References: <374339249.53058039.1433681874571.JavaMail.root@uoguelph.ca>
 <55744F28.5000402@field.hu> <557AB1BB.60502@field.hu>
 <557AD10D.5070205@field.hu>
Date: Fri, 12 Jun 2015 09:37:37 -0300
Subject: Re: FreeBSD 10.1-REL - network unaccessible after high
 traffic
From: Christopher Forgeron
To: Cs
Cc: FreeBSD Net
Content-Type: text/plain; charset=UTF-8
List-Id: Networking and TCP/IP with FreeBSD

rsync burns memory - I'd say there's a good chance you're running out of
memory before it's replenished.

For vmstat 5 - don't run it on the console. Connect from a second box with
ssh and run it there. That way it's the last thing on the ssh terminal
screen when the box dies, and you'll have your proof.

On Fri, Jun 12, 2015 at 9:31 AM, Cs wrote:

> The machine was restarted before I could check the "vmstat 5" output.
> Yep, it's rsync. Anyway, I disabled the backup transfer, which will
> solve it, but I can't really accept this as a solution.
>
> On 2015.06.12 at 14:29, Christopher Forgeron wrote:
>
>> Well, even at low speed it could drop due to memory, from what I've seen.
>>
>> What was the last line from vmstat 5 before it locked up?
>>
>> I find that the em driver isn't crap, but there is a deeper problem
>> inside of FreeBSD that is being exposed now - for me it's due to faster
>> network connections.
>>
>> Are you using rsync to move the files?
>>
>> On Fri, Jun 12, 2015 at 7:17 AM, Cs wrote:
>>
>>> It seems it's not memory related. The server just died a few minutes
>>> ago while transferring the backup (400GB) at around 800Mbps.
>>> I will disable the remote backup; it's a shame that the em driver is
>>> such crap.
>>>
>>> On 2015.06.08 at 5:01, Christopher Forgeron wrote:
>>>
>>>> You know what helped me:
>>>>
>>>> 'vmstat 5'
>>>>
>>>> Leave that running. If the last thing on the console after a
>>>> crash/hang is vmstat showing 8k of memory left, then you're in the
>>>> same problem-park as me.
>>>>
>>>> My 10.1 96GiB RAM box is chewing ~8 GiB of RAM in less than 5 seconds,
>>>> and then crashing/panicking/hanging.
>>>>
>>>> There are others with this issue if you search for it; using sysctl
>>>> to set vm.v_free_min to double or triple its current value may help,
>>>> but first let us know if that's what is bonking your server.
>>>>
>>>> On Sun, Jun 7, 2015 at 11:03 AM, Cs wrote:
>>>>
>>>>> OK, just lowered it to 1500, but please also note that it was on
>>>>> 1500 for 2 years.
>>>>>
>>>>> On 2015.06.07 at 14:57, Rick Macklem wrote:
>>>>>
>>>>>> Since disabling TSO didn't help, you could try dropping to 1500 mtu
>>>>>> on both interfaces. Some people run into problems when 9K jumbo
>>>>>> clusters fragment the kernel address space used to allocate mbufs.
>>>>>>
>>>>>> Good luck with it, rick
>>>>>>
>>>>>> ----- Original Message -----
>>>>>>
>>>>>>> Hi All,
>>>>>>>
>>>>>>> It worked fine for two weeks, but I had a network outage 2 days
>>>>>>> ago and then another today. I tried disabling rxcsum and txcsum
>>>>>>> after the first one; it didn't help. I don't know what else to do.
>>>>>>> It's a shame that I can't use this card with fbsd. I REALLY don't
>>>>>>> want to install linux instead, but outages on my production
>>>>>>> servers are not welcomed by the customers.
>>>>>>>
>>>>>>> On 2015.05.26 at 10:36, Cs wrote:
>>>>>>>
>>>>>>>> Thanks Mark, good idea. I found this thread, which is exactly
>>>>>>>> the same problem as mine:
>>>>>>>>
>>>>>>>> https://forums.freebsd.org/threads/workaround-freebsd-10-1-sudden-network-down.49264/
>>>>>>>>
>>>>>>>> Will see if it helps in a couple weeks.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Csaba
>>>>>>>>
>>>>>>>> On 2015.05.26 at 10:30, Mark Schouten wrote:
>>>>>>>>
>>>>>>>>> Oh, didn't see your lowest remark. Then, the next thing that
>>>>>>>>> comes past here a few times per week is 'Try disabling TSO'.
>>>>>>>>>
>>>>>>>>> Kind regards,
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Kerio Operator in the Cloud? https://www.kerioindecloud.nl/
>>>>>>>>> Mark Schouten | Tuxis Internet Engineering
>>>>>>>>> KvK: 61527076 | http://www.tuxis.nl/
>>>>>>>>> T: 0318 200208 | info@tuxis.nl
>>>>>>>>>
>>>>>>>>> From: Cs
>>>>>>>>> To: Mark Schouten
>>>>>>>>> Cc:
>>>>>>>>> Sent: 25-5-2015 11:12
>>>>>>>>> Subject: Re: FreeBSD 10.1-REL - network unaccessible after high
>>>>>>>>> traffic
>>>>>>>>>
>>>>>>>>> It was on 1500 for ~3 years :)
>>>>>>>>> Regards,
>>>>>>>>> Csaba
>>>>>>>>>
>>>>>>>>> On May 25, 2015, at 10:30, Mark Schouten wrote:
>>>>>>>>>
>>>>>>>>>> Try lowering your mtu to 1500, that worked miracles for me.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Mark Schouten
>>>>>>>>>> Tuxis Internet Engineering
>>>>>>>>>> mark@tuxis.nl / 0318 200208
>>>>>>>>>>
>>>>>>>>>> On 25 May 2015, at 09:36, "Cs" wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> I have two FreeBSD 10.1-RELEASE servers connected to each
>>>>>>>>>>> other. They were connected via a cross link, but they are
>>>>>>>>>>> connected to a cisco switch now (the problem was the same
>>>>>>>>>>> with the cross link too). When transferring huge files
>>>>>>>>>>> (50-500GB backup files) via Gigabit (it is important!) the
>>>>>>>>>>> network randomly dies. The backup runs every day/week, and
>>>>>>>>>>> sometimes the connection is OK for months, sometimes it
>>>>>>>>>>> happens twice a week. When the network dies I can log in to
>>>>>>>>>>> the server via IPMI and use the console; everything is OK,
>>>>>>>>>>> but I can't send anything out on the network. ifconfig em0
>>>>>>>>>>> down/up doesn't help, nor does netif restart.
>>>>>>>>>>> The problem never occurred when I used a 100Mbit connection
>>>>>>>>>>> between them, but that was a 3com NIC (xl); the gigabit
>>>>>>>>>>> adapter is Intel (em0). When I limit the transfer rate (rsync
>>>>>>>>>>> bandwidth limit or ipfw pipe) the problem is much rarer.
>>>>>>>>>>>
>>>>>>>>>>> I tried setting these tuning parameters on both servers with
>>>>>>>>>>> different buffer sizes, but nothing helped:
>>>>>>>>>>>
>>>>>>>>>>> # cat /etc/sysctl.conf
>>>>>>>>>>> security.bsd.see_other_uids=0
>>>>>>>>>>> net.inet.tcp.recvspace=512000
>>>>>>>>>>> net.route.netisr_maxqlen=2048
>>>>>>>>>>> kern.ipc.nmbclusters=1310720
>>>>>>>>>>> net.inet.tcp.sendbuf_max=16777216
>>>>>>>>>>> net.inet.tcp.recvbuf_max=16777216
>>>>>>>>>>> kern.ipc.soacceptqueue=32768
>>>>>>>>>>> # cat /boot/loader.conf
>>>>>>>>>>> geom_mirror_load="YES" # RAID1 disk driver (see gmirror(8))
>>>>>>>>>>> ipfw_load="YES"
>>>>>>>>>>> net.inet.ip.fw.default_to_accept=1
>>>>>>>>>>> kern.maxusers=4096
>>>>>>>>>>> accf_data_load="YES"
>>>>>>>>>>>
>>>>>>>>>>> The duplex settings are identical on both servers.
>>>>>>>>>>>
>>>>>>>>>>> Server A:
>>>>>>>>>>> em1: flags=8843 metric 0 mtu 9000
>>>>>>>>>>>         options=4219b
>>>>>>>>>>>         ether 00:25:90:24:52:66
>>>>>>>>>>>         inet x.x.x.x netmask 0xfffffe00 broadcast x.x.x.x
>>>>>>>>>>>         nd6 options=29
>>>>>>>>>>>         media: Ethernet autoselect (1000baseT )
>>>>>>>>>>>         status: active
>>>>>>>>>>> Server B:
>>>>>>>>>>> em0: flags=8843 metric 0 mtu 9000
>>>>>>>>>>>         options=4219b
>>>>>>>>>>>         ether 00:30:48:dd:fe:3e
>>>>>>>>>>>         inet x.x.x.x netmask 0xfffffe00 broadcast x.x.x.x
>>>>>>>>>>>         nd6 options=29
>>>>>>>>>>>         media: Ethernet autoselect (1000baseT )
>>>>>>>>>>>         status: active
>>>>>>>>>>>
>>>>>>>>>>> Today I tried to set the mtu to 9000, but in tcpdump I see
>>>>>>>>>>> that during scp it is still 1500:
>>>>>>>>>>>
>>>>>>>>>>> x.x.x.x.222 > x.x.x.x.37612: Flags [.], cksum 0xb6ee
>>>>>>>>>>>     (incorrect -> 0xda6f), seq 35749, ack 113701596, win 7986,
>>>>>>>>>>>     options [nop,nop,TS val 3103966325 ecr 853712893], length 0
>>>>>>>>>>> 09:27:33.912354 IP (tos 0x8, ttl 64, id 1028, offset 0,
>>>>>>>>>>>     flags [DF], proto TCP (6), length 1500)
>>>>>>>>>>> 09:27:33.912358 IP (tos 0x8, ttl 64, id 1029, offset 0,
>>>>>>>>>>>     flags [DF], proto TCP (6), length 1500)
>>>>>>>>>>>
>>>>>>>>>>> Any ideas? Thanks guys!
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> freebsd-net@freebsd.org mailing list
>>>>>>>>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>>>>>>>>>>> To unsubscribe, send any mail to
>>>>>>>>>>> "freebsd-net-unsubscribe@freebsd.org"
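
A minimal sketch of the two diagnostics suggested in this thread: capturing
'vmstat 5' from a second machine so the last sample survives a hang, and
working out a doubled or tripled vm.v_free_min. The helper names
(stamp_lines, scaled_free_min_cmd), the host name serverA and the log file
name are placeholders and not something from the thread. The second helper
only prints the sysctl command, it does not run it.

```shell
#!/bin/sh
# Sketch only: helper names, host name and log path are hypothetical.

# Prefix every line read from stdin with a timestamp, so the last vmstat
# sample before a hang can be matched against the time of the outage.
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
    done
}

# Print (do not execute) the sysctl command that sets vm.v_free_min to
# <current value> * <factor>, e.g. factor 2 or 3 as suggested above.
scaled_free_min_cmd() {
    printf 'sysctl vm.v_free_min=%s\n' "$(( $1 * $2 ))"
}

# Usage from a second machine (left commented out, it needs a reachable host).
# -t requests a tty, so samples should arrive line by line:
#   ssh -t root@serverA vmstat 5 | stamp_lines | tee -a vmstat-serverA.log
#
# Usage on the affected box, after reading the current value:
#   scaled_free_min_cmd "$(sysctl -n vm.v_free_min)" 2
```

Printing the command instead of applying it leaves the decision to change
vm.v_free_min with whoever runs the box; whether doubling or tripling is the
right factor is not settled in this thread.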