Date:      Thu, 09 Jun 2011 16:10:08 +0300
From:      Mikolaj Golub <trociny@freebsd.org>
To:        Maxim Sobolev <sobomax@FreeBSD.org>
Cc:        vadim_nuclight@mail.ru, Kostik Belousov <kib@FreeBSD.org>, svn-src-all@FreeBSD.org, Pawel Jakub Dawidek <pjd@FreeBSD.org>
Subject:   Re: svn commit: r222688 - head/sbin/hastd
Message-ID:  <86wrgvkv67.fsf@kopusha.home.net>
In-Reply-To: <4DED1CC5.1070001@FreeBSD.org> (Maxim Sobolev's message of "Mon,  06 Jun 2011 11:30:29 -0700")
References:  <201106041601.p54G1Ut7016697@svn.freebsd.org> <BA66495E-AED3-459F-A5CD-69B91DB359BC@lists.zabbadoz.net> <4DEA653F.7070503@FreeBSD.org> <201106061057.p56Av3u7037614@kernblitz.nuclight.avtf.net> <4DED1CC5.1070001@FreeBSD.org>


On Mon, 06 Jun 2011 11:30:29 -0700 Maxim Sobolev wrote:

 MS> On 6/6/2011 3:57 AM, Vadim Goncharov wrote:
 >> Hi Maxim Sobolev!
 >>
 >> On Sat, 04 Jun 2011 10:02:55 -0700; Maxim Sobolev<sobomax@FreeBSD.org>  wrote:
 >>
 >>>> I don't know about the hast internal protocol but the above reads kind of
 >>>> wrong to me.
 >>
 >>> Hmm, not sure what exactly is wrong? Sender does 3 writes to the TCP
 >>> socket - 32k, 32k and 1071 bytes, while receiver does one
 >>> recv(MSG_WAITALL) with the size of 66607. So I suspect sender's kernel
 >>> does deliver two 32k packets and fills up receiver's buffer or
 >>> something. And the remaining 1071 bytes stay somewhere in sender's
 >>> kernel indefinitely, while recv() cannot complete in receiver's. Using
 >>> the same size when doing recv() solves the issue for me.

With MSG_WAITALL, if the data to receive is larger than the receive buffer,
then after part of the data has been received it is drained into the user
buffer and the protocol is notified (a window update is sent) that there is
space in the receive buffer again. So, normally, the scenario described above
should not be a problem. But there was a race in soreceive_generic(), which I
believe I fixed in r222454, where the connection could stall in sbwait. Do you
still observe the issue with only r222454 applied?

 >>
 >> I also don't know the hast internal protocol, but the very need to adjust
 >> some *user* buffers while using _TCP_ is pretty strange: TCP doesn't depend
 >> on the sender's behavior only. Maybe setsockopt(SO_RCVBUF) needs to be used.
 >> Also, why is recv() even used there on TCP, instead of read()? Is that a
 >> blocking or non-blocking read? In the latter case kqueue(2) is very useful.
 >>

 MS> MSG_WAITALL might be an issue here. I suspect receiver's kernel can't
 MS> dequeue two 32k packets until the last chunk arrives. Unfortunately I
 MS> don't have time to look into it in detail.

Sorry, but I think your patch is wrong. Even if it fixes the issue for you, I
think it does not fix the real problem we have to address but merely hides it.

Receiving the whole chunk at once should be more efficient, because we do one
syscall instead of several. Also, if you receive in smaller chunks there is no
need to set MSG_WAITALL at all.

Besides, with your patch I am observing hangs on primary startup in

init_remote->primary_connect->proto_connection_recv->proto_common_recv

The primary worker process asks the parent to connect to the secondary. After
establishing the connection the parent sends connection protocol name and
descriptor to the worker (proto_connection_send/proto_connection_recv). The
issue here is that in proto_connection_recv() the size of the protoname is
unknown, so it calls proto_common_recv() with size = 127, which is larger
than the protoname ("tcp").

It worked previously because, after sending the protoname,
proto_connection_send() sends the descriptor by calling sendmsg(). This is
data of a different type, and it makes recv() return even though only 4 of
the 127 requested bytes have been received.

With your patch, after receiving these 4 bytes it goes back into recv() to
wait for the remaining 123 bytes and gets stuck forever. Don't you observe
this? It is strange, because for me it hangs on every startup. I am seeing
this on yesterday's CURRENT.

-- 
Mikolaj Golub


