Date:      Wed, 11 Sep 2013 19:18:26 +0400
From:      Gleb Smirnoff <glebius@FreeBSD.org>
To:        Yuri <yuri@rawbw.com>
Cc:        net@FreeBSD.org
Subject:   Re: Packet loss when 'control' messages are present with large data (sendmsg(2))
Message-ID:  <20130911151826.GP4574@FreeBSD.org>
In-Reply-To: <522300E3.6050303@rawbw.com>
References:  <522300E3.6050303@rawbw.com>

  Yuri,

On Sun, Sep 01, 2013 at 01:54:59AM -0700, Yuri wrote:
Y> I found a case where sendmsg(2) silently loses packets in the AF_LOCAL
Y> domain when large packets with a control part are sent.
Y> 
Y> Here is how:
Y> There is a watermark limit on the sockbuf determined by
Y> net.local.stream.sendspace; the default is 8192 bytes (field sockbuf.sb_hiwat).
Y> When sendmsg(2) sends large enough data (8K+, which hits this 8192 limit)
Y> with a control message, sosend_generic cuts the message data
Y> into separate mbufs based on 'sbspace' (derived from the above-mentioned
Y> sb_hiwat limit), adjusted for the control message size as it sees it.
Y> This way it tries to make sure the sb_hiwat limit is enforced.
Y> 
Y> However, down at the uipc level the control message is further modified in
Y> two ways: unp_internalize converts it into an 'internal' form, and
Y> unp_addsockcred adds another control message when LOCAL_CREDS
Y> are requested by the client. Both functions only increase the control message
Y> size beyond its original size (as seen by sosend_generic), so the
Y> first final mbuf sent (concatenation of control and data) will always be
Y> larger than the 'sbspace' limit that sosend_generic was cutting data for.
Y> 
Y> There is also the function sbappendcontrol_locked. It checks the
Y> 'sbspace' limit again, and discards the packet when the sbspace limit is
Y> exceeded. Its result code is essentially ignored in uipc_send. I
Y> believe sbappendcontrol_locked shouldn't be checking space at all,
Y> since packets are expected to be properly sized to begin with. But removing
Y> the check won't be the right fix, since sizes would exceed the sbspace limit
Y> anyway.
Y> 
Y> sosend_generic is one level above the uipc level, and it doesn't know what
Y> uipc will do with the control message. Therefore it can't know what the real
Y> adjustment for the control message should be (to properly cut data). It
Y> wrongly takes the original control size, and this makes the first packet
Y> too large, so it is discarded by sbappendcontrol_locked.
Y> 
Y> To solve the problem, I propose to add another function into struct 
Y> pr_usrreqs:
Y> int     (*pru_finalizecontrol)(struct socket *so, int flags, struct mbuf **pcontrol);
Y> 
Y> This function will be called from sosend_generic and sosend_dgram.
Y> uipc_finalizecontrol will do the same as unp_internalize and
Y> unp_addsockcred do at the uipc level, and it will allow sosend_generic to
Y> see the final version of the control message and properly split data
Y> into pieces when data is large enough to hit the limit.
Y> 
Y> I felt I had better discuss such an addition to struct pr_usrreqs, because it
Y> might seem like overkill to add this function just to solve this one
Y> local issue. But it seems there is no other solution (other than just
Y> ignoring the occasionally larger mbuf size).
Y> 
Y> I can easily make a patch fixing this issue with this new function.
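
For illustration, the size mismatch described in the quoted text can be
modelled with plain arithmetic. This is a user-space sketch, not kernel
code: struct sizing and both helper functions are hypothetical names, the
socket buffer is assumed to be empty (so the available space equals
sb_hiwat), and mbuf overhead accounting is ignored.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the sizes involved; not a kernel structure. */
struct sizing {
    size_t hiwat;     /* sb_hiwat, e.g. 8192 (net.local.stream.sendspace) */
    size_t ctl_user;  /* control size as seen by the upper layer */
    size_t ctl_final; /* control size after the uipc layer has grown it */
};

/* Data bytes the upper layer would put into the first chunk,
 * having subtracted only the control size it knows about. */
static size_t first_chunk_data(const struct sizing *s)
{
    return s->hiwat > s->ctl_user ? s->hiwat - s->ctl_user : 0;
}

/* True when the first chunk (data plus final control) no longer fits,
 * which is the condition under which the lower layer would drop it. */
static bool first_chunk_rejected(const struct sizing *s)
{
    return first_chunk_data(s) + s->ctl_final > s->hiwat;
}
```

With hiwat = 8192 and ctl_user = 16, the first chunk carries 8176 data
bytes; any ctl_final larger than 16 pushes the total over the limit.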

Thanks for the investigation!

Can you please send at least a program that serves as a test case for the
above problem?

A patch that fixes it would also be appreciated.
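
A test case of the kind requested might look roughly like the sketch
below. It is not the program from this thread: send_with_control is a
hypothetical helper that sends len bytes (len > 0) together with one
SCM_RIGHTS control message over an AF_LOCAL stream socketpair, lets a child
process read until EOF, and returns the number of bytes the child saw.
It does not set LOCAL_CREDS, so it exercises only the control-message
path; whether that alone reproduces the reported loss is an assumption.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <sys/wait.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: returns bytes seen by the reader, or -1 on error. */
static long send_with_control(size_t len)
{
    int sv[2], report[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1 || pipe(report) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* Child: count every data byte until the writer closes. */
        char buf[4096];
        long total = 0;
        ssize_t r;
        close(sv[0]);
        close(report[0]);
        while ((r = read(sv[1], buf, sizeof(buf))) > 0)
            total += r;
        if (write(report[1], &total, sizeof(total)) != (ssize_t)sizeof(total))
            _exit(1);
        _exit(0);
    }

    /* Parent: build one message with data and an SCM_RIGHTS control part. */
    close(sv[1]);
    close(report[1]);
    char *data = calloc(1, len);
    if (data == NULL)
        return -1;

    union {
        struct cmsghdr hdr; /* forces correct alignment of buf */
        char buf[CMSG_SPACE(sizeof(int))];
    } ctl;
    struct iovec iov = { .iov_base = data, .iov_len = len };
    struct msghdr msg;
    memset(&ctl, 0, sizeof(ctl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
    int fd = report[0]; /* any open descriptor will do */
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type = SCM_RIGHTS;
    cm->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cm), &fd, sizeof(fd));

    /* Send; if the kernel accepts only part, push the rest without control. */
    ssize_t n = sendmsg(sv[0], &msg, 0);
    size_t off = n > 0 ? (size_t)n : 0;
    while (n > 0 && off < len) {
        n = send(sv[0], data + off, len - off, 0);
        if (n > 0)
            off += (size_t)n;
    }
    close(sv[0]); /* lets the reader see EOF */

    long total = -1;
    if (read(report[0], &total, sizeof(total)) != (ssize_t)sizeof(total))
        total = -1;
    close(report[0]);
    waitpid(pid, NULL, 0);
    free(data);
    return n < 0 ? -1 : total;
}
```

Calling send_with_control(64 * 1024) and comparing the result with
64 * 1024 gives a simple pass/fail check: a smaller return value means
the sender was told the data went out but the reader never received all
of it.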

-- 
Totus tuus, Glebius.