Date: Sun, 14 Apr 2019 13:47:47 -0500
From: Jason Bacon <bacon4000@gmail.com>
To: "Rodney W. Grimes" <freebsd-rwg@gndrsh.dnsmgr.net>
Cc: Justin Clift <justin@postgresql.org>, freebsd-infiniband@freebsd.org
Subject: Re: Kernel modules
Message-ID: <1586a292-0ee1-73e5-dc72-c03087516c7a@gmail.com>
In-Reply-To: <201904141754.x3EHsOZD086306@gndrsh.dnsmgr.net>
On 2019-04-14 12:54, Rodney W. Grimes wrote:
>> On 2019-04-13 13:29, Justin Clift wrote:
>>> On 2019-04-13 23:52, Jason Bacon wrote:
>>> <snip>
>>>> Stability will take a long time to test properly.  I'm going to start
>>>> by rerunning some of our most I/O-intensive jobs on it - jobs that
>>>> actually broke our CentOS RAID servers until I switched them to NFS
>>>> over RDMA.
>>> That's got to be the first time anyone's ever mentioned "NFS over
>>> RDMA" as increasing a systems' stability. :)
>>>
>>> + Justin
>> Believe it or not...  ;-)
>>
>> After my upgrade from CentOS 6 to CentOS 7, NFS over TCP started falling
>> apart under heavy load; servers and compute nodes becoming unresponsive
>> and requiring a reboot to restore stability.
>>
>> If it's due to problems in the CentOS TCP stack, NFS over RDMA would
>> help by eliminating the TCP stack from the pathway.
> Any idea what happened in the CentOS TCP stack between 6 and 7?
>
Not really - I don't have time to do a deep dive into such specifics,
given the number of hats I have to wear.

I can only say that for our particular use case, CentOS 7 is generally
more complicated, slower and slightly less reliable than 6 (which
actually served us well for years).  I hit a few pitfalls following the
upgrades, but I found my way around them and our clusters are pretty
stable now.

-- 
Earth is a beta site.
