Date:      Fri, 12 Jun 2015 09:29:22 -0300
From:      Christopher Forgeron <csforgeron@gmail.com>
To:        Cs <bimmer@field.hu>
Cc:        FreeBSD Net <freebsd-net@freebsd.org>
Subject:   Re: FreeBSD 10.1-REL - network unaccessible after high traffic
Message-ID:  <CAB2_NwA9i-wMXGH2%2BcP9SWxDMNomFRjoVP25hsGWaTDGjBxFTw@mail.gmail.com>
In-Reply-To: <557AB1BB.60502@field.hu>
References:  <374339249.53058039.1433681874571.JavaMail.root@uoguelph.ca> <55744F28.5000402@field.hu> <CAB2_NwA-D7bH47=Qkf9QLF3=mZOQBVo81bUsQzQr02W9U4vHMA@mail.gmail.com> <557AB1BB.60502@field.hu>

Well, from what I've seen, it can drop even at low speed if memory runs out.

What was the last line from vmstat 5 before it locked up?
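
If the console scrolls away or resets before you can read it, one option is to timestamp the vmstat output and keep a copy on disk. A minimal sketch, assuming a POSIX sh; the helper name stamp_lines and the log path /var/tmp/vmstat-hang.log are placeholders made up for this example, and on a hard hang the last few lines may never reach the disk:

```shell
#!/bin/sh
# stamp_lines (hypothetical helper): prefix each line from stdin with the
# current wall-clock time, so the last sample before a hang can be dated.
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
    done
}

# Intended use (the log path is an assumption, pick any local filesystem):
#   vmstat 5 | stamp_lines | tee -a /var/tmp/vmstat-hang.log
```

Running it from the IPMI console, or inside tmux/screen, should keep it going when the network drops.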

 I don't think the em driver is crap; there is a deeper problem inside
of FreeBSD that is only now being exposed. For me, it is faster network
connections that expose it.

 Are you using rsync to move the files?

On Fri, Jun 12, 2015 at 7:17 AM, Cs <bimmer@field.hu> wrote:

> It seems it's not memory related. The server just died a few minutes ago
> while transferring the backup (400GB) at around 800Mbps.
> I will disable the remote backup; it's a shame that the em driver is such crap.
>
>
> On 2015.06.08. 5:01, Christopher Forgeron wrote:
>
>> You know what helped me:
>>
>> 'vmstat 5'
>>
>> Leave that running. If the last thing on the console after a crash/hang is
>> vmstat showing 8k of memory left, then you're in the same problem-park as
>> me.
>>
>> My 10.1 96GiB RAM box is chewing ~8 GiB of RAM in less than 5 seconds, and
>> then crashing/panicking/hanging.
>>
>> There are others with this issue if you search for it; using sysctl to
>> raise vm.v_free_min to double or triple its current value may help, but
>> first let us know if that's what is bonking your server.
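
To make the "double or triple" suggestion concrete, here is a small sketch, assuming a POSIX sh. scaled_free_min is a name made up for this example; the script only prints the command you could run and changes nothing itself. It relies on sysctl -n to read the bare value.

```shell
#!/bin/sh
# scaled_free_min (hypothetical helper): multiply a value by an integer factor.
scaled_free_min() {
    echo $(( $1 * $2 ))
}

# Only meaningful where the vm.v_free_min OID exists (FreeBSD); skipped elsewhere.
if cur=$(sysctl -n vm.v_free_min 2>/dev/null) && [ -n "$cur" ]; then
    new=$(scaled_free_min "$cur" 2)
    echo "vm.v_free_min is $cur; to double it, as root: sysctl vm.v_free_min=$new"
fi
```

To keep a value across reboots, it would go in /etc/sysctl.conf as a vm.v_free_min=<value> line, the same way as the other tunables quoted further down.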
>>
>>
>>
>> On Sun, Jun 7, 2015 at 11:03 AM, Cs <bimmer@field.hu> wrote:
>>
>>> ok, just lowered it to 1500, but please also note that it was on 1500 for
>>> 2 years
>>>
>>> On 2015.06.07. 14:57, Rick Macklem wrote:
>>>
>>>> Since disabling TSO didn't help, you could try dropping to 1500mtu
>>>> on both interfaces. Some people run into problems when 9K jumbo clusters
>>>> fragment the kernel address space used to allocate mbufs.
>>>>
>>>> Good luck with it, rick
>>>>
>>>> ----- Original Message -----
>>>>
>>>>> Hi All,
>>>>>
>>>>> It worked fine for two weeks, but I had a network outage 2 days ago, then
>>>>> another today. I tried disabling rxcsum and txcsum after the first one; it
>>>>> didn't help. I don't know what else to do. It's a shame that I can't use
>>>>> this card with FreeBSD. I REALLY don't want to install Linux instead, but
>>>>> outages on my production servers are not welcomed by the customers.
>>>>>
>>>>> On 2015.05.26. 10:36, Cs wrote:
>>>>>
>>>>>> Thanks Mark, good idea. I found this thread which is exactly the same
>>>>>> problem as mine:
>>>>>>
>>>>>> https://forums.freebsd.org/threads/workaround-freebsd-10-1-sudden-network-down.49264/
>>>>>>
>>>>>> Will see if it helps in a couple weeks.
>>>>>>
>>>>>> Regards,
>>>>>> Csaba
>>>>>>
>>>>>> On 2015.05.26. 10:30, Mark Schouten wrote:
>>>>>>
>>>>>>> Oh, didn't see your lowest remark. Then, the next thing that comes
>>>>>>> past here a few times per week is 'Try disabling TSO'.
>>>>>>>
>>>>>>>
>>>>>>> Kind regards,
>>>>>>>
>>>>>>> --
>>>>>>> Kerio Operator in the Cloud? https://www.kerioindecloud.nl/
>>>>>>> Mark Schouten  | Tuxis Internet Engineering
>>>>>>> KvK: 61527076 | http://www.tuxis.nl/
>>>>>>> T: 0318 200208 | info@tuxis.nl
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>     From:    Cs <bimmer@field.hu>
>>>>>>>     To:      Mark Schouten <mark@tuxis.nl>
>>>>>>>     Cc:      <freebsd-net@freebsd.org>
>>>>>>>     Sent:    25-5-2015 11:12
>>>>>>>     Subject: Re: FreeBSD 10.1-REL - network unaccessible after high traffic
>>>>>>>
>>>>>>> It was on 1500 for ~3 years :)
>>>>>>> Regards,
>>>>>>> Csaba
>>>>>>>
>>>>>>> On May 25, 2015, at 10:30, Mark Schouten <mark@tuxis.nl> wrote:
>>>>>>>
>>>>>>>> Try lowering your mtu to 1500, that worked miracles for me..
>>>>>>>>
>>>>>>>> --
>>>>>>>> Mark Schouten
>>>>>>>> Tuxis Internet Engineering
>>>>>>>> mark@tuxis.nl / 0318 200208
>>>>>>>>
>>>>>>>>   On 25 May 2015, at 09:36, "Cs" <bimmer@field.hu> wrote:
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> I have two FreeBSD 10.1-RELEASE servers connected to each other. They
>>>>>>>>> were connected via cross link, but they are connected to a Cisco
>>>>>>>>> switch now (the problem was the same with cross link too). When
>>>>>>>>> transferring huge files (50-500GB backup files) via Gigabit (it is
>>>>>>>>> important!) the network randomly dies. The backup runs every day/week
>>>>>>>>> and sometimes the connection is ok for months, sometimes it happens
>>>>>>>>> twice a week. When the network dies I can log in to the server via
>>>>>>>>> IPMI and use the console; everything is OK, but I can't send anything
>>>>>>>>> out on the network. ifconfig em0 down/up doesn't help, nor does netif
>>>>>>>>> restart. The problem never occurred when I used a 100Mbit connection
>>>>>>>>> between them, but that was a 3com NIC (xl); the gigabit adapter is
>>>>>>>>> Intel (em0). When I limit the transfer rate (rsync bandwidth limit or
>>>>>>>>> ipfw pipe) the problem is much more rare.
>>>>>>>>>
>>>>>>>>> I tried to set these tuning parameters on both servers with different
>>>>>>>>> buffer sizes but nothing helped:
>>>>>>>>>
>>>>>>>>> # cat /etc/sysctl.conf
>>>>>>>>> security.bsd.see_other_uids=0
>>>>>>>>> net.inet.tcp.recvspace=512000
>>>>>>>>> net.route.netisr_maxqlen=2048
>>>>>>>>> kern.ipc.nmbclusters=1310720
>>>>>>>>> net.inet.tcp.sendbuf_max=16777216
>>>>>>>>> net.inet.tcp.recvbuf_max=16777216
>>>>>>>>> kern.ipc.soacceptqueue=32768
>>>>>>>>>
>>>>>>>>> # cat /boot/loader.conf
>>>>>>>>> geom_mirror_load="YES" # RAID1 disk driver (see gmirror(8))
>>>>>>>>> ipfw_load="YES"
>>>>>>>>> net.inet.ip.fw.default_to_accept=1
>>>>>>>>> kern.maxusers=4096
>>>>>>>>> accf_data_load="YES"
>>>>>>>>>
>>>>>>>>> The duplex settings are identical on both servers.
>>>>>>>>>
>>>>>>>>> Server A:
>>>>>>>>> em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
>>>>>>>>>         options=4219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,VLAN_HWTSO>
>>>>>>>>>         ether 00:25:90:24:52:66
>>>>>>>>>         inet x.x.x.x netmask 0xfffffe00 broadcast x.x.x.x
>>>>>>>>>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>>>>>>>>>         media: Ethernet autoselect (1000baseT <full-duplex>)
>>>>>>>>>         status: active
>>>>>>>>>
>>>>>>>>> Server B:
>>>>>>>>> em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
>>>>>>>>>         options=4219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,VLAN_HWTSO>
>>>>>>>>>         ether 00:30:48:dd:fe:3e
>>>>>>>>>         inet x.x.x.x netmask 0xfffffe00 broadcast x.x.x.x
>>>>>>>>>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>>>>>>>>>         media: Ethernet autoselect (1000baseT <full-duplex>)
>>>>>>>>>         status: active
>>>>>>>>>
>>>>>>>>> Today I tried to set mtu to 9000 but in tcpdump I see that during scp
>>>>>>>>> it is still 1500:
>>>>>>>>>
>>>>>>>>>     x.x.x.x.222 > x.x.x.x.37612: Flags [.], cksum 0xb6ee (incorrect -> 0xda6f), seq 35749, ack 113701596, win 7986, options [nop,nop,TS val 3103966325 ecr 853712893], length 0
>>>>>>>>> 09:27:33.912354 IP (tos 0x8, ttl 64, id 1028, offset 0, flags [DF], proto TCP (6), length 1500)
>>>>>>>>> 09:27:33.912358 IP (tos 0x8, ttl 64, id 1029, offset 0, flags [DF], proto TCP (6), length 1500)
>>>>>>>>>
>>>>>>>>> Any ideas? Thanks guys!
>>>>>>>>> _______________________________________________
>>>>>>>>> freebsd-net@freebsd.org mailing list
>>>>>>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>>>>>>>>> To unsubscribe, send any mail to
>>>>>>>>> "freebsd-net-unsubscribe@freebsd.org"
>>>>>>>>


