Date: Wed, 11 Sep 2013 19:18:26 +0400
From: Gleb Smirnoff <glebius@FreeBSD.org>
To: Yuri <yuri@rawbw.com>
Cc: net@FreeBSD.org
Subject: Re: Packet loss when 'control' messages are present with large data (sendmsg(2))
Message-ID: <20130911151826.GP4574@FreeBSD.org>
In-Reply-To: <522300E3.6050303@rawbw.com>
References: <522300E3.6050303@rawbw.com>

  Yuri,

On Sun, Sep 01, 2013 at 01:54:59AM -0700, Yuri wrote:
Y> I found the case when sendmsg(2) silently loses packets for AF_LOCAL
Y> domain when large packets with control part in them are sent.
Y>
Y> Here is how:
Y> There is the watermark limit on sockbuf determined by
Y> net.local.stream.sendspace, default is 8192 bytes (field sockbuf.sb_hiwat).
Y> When sendmsg(2) sends large enough data (8K+ that hits this 8192 limit)
Y> with control message, sosend_generic will be cutting the message data
Y> into separate mbufs based on 'sbspace' (derived from the above-mentioned
Y> sb_hiwat limit) with adjustment for control message size as it sees it.
Y> This way it tries to make sure this sb_hiwat limit is enforced.
Y>
Y> However, down on uipc level control message is being further modified in
Y> two ways: unp_internalize modifies it into some 'internal' form, also
Y> unp_addsockcred function adds another control message when LOCAL_CREDS
Y> are requested by client. Both functions only increase control message
Y> size beyond its original size (seen by sosend_generic). So that the
Y> first final mbuf sent (concatenation of control and data) will always be
Y> larger than 'sbspace' limit that sosend_generic was cutting data for.
Y>
Y> There is also the function sbappendcontrol_locked. It checks the
Y> 'sbspace' limit again, and discards the packet when sbspace limit is
Y> exceeded. Its result code is essentially ignored in uipc_send. I
Y> believe, sbappendcontrol_locked shouldn't be checking space at all,
Y> since packets are expected to be properly sized to begin with. But this
Y> won't be the right fix, since sizes would be exceeding the sbspace limit
Y> anyway.
Y>
Y> sosend_default is one level up over uipc level, and it doesn't know what
Y> uipc will do with control message. Therefore it can't know what the real
Y> adjustment for control message is needed (to properly cut data). It
Y> wrongly takes the original control size and this makes the first packet
Y> too large and discarded by sbappendcontrol_locked.
Y>
Y> To solve the problem, I propose to add another function into struct
Y> pr_usrreqs:
Y> int (*pru_finalizecontrol)(struct socket *so, int flags, struct mbuf
Y> **pcontrol);
Y>
Y> This function will be called from sosend_default and sosend_dgram.
Y> uipc_finalizecontrol will do the same that unp_internalize and
Y> unp_addsockcred do on uipc level, and it will allow sosend_default to
Y> see the final version of the control message, and properly split data
Y> into pieces when data is large enough to hit the limit.
Y>
Y> I felt I better discuss such addition to struct pr_usrreqs, because it
Y> might seem like an overkill to add this function just to solve this one
Y> local issue. But it seems there is no other solution (other than just
Y> ignoring the occasionally larger mbuf size).
Y>
Y> I can easily make a patch fixing this issue with this new function.

Thanks for the investigation! Can you please send at least a program
that serves as a test case for the above problem? A patch that fixes
it would also be appreciated.

-- 
Totus tuus, Glebius.
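
A test case of the kind requested above might look like the following
sketch. It is not from the thread. The helper names send_with_control
and run_case are hypothetical, and the choice of SCM_RIGHTS as the
control message is an assumption (the report also mentions LOCAL_CREDS,
which is not exercised here). A child process sends one large buffer
plus a control message in a single sendmsg(2) call over an AF_LOCAL
stream socketpair, and the parent counts the bytes that arrive. Whether
a shortfall actually shows up depends on the kernel version and on
net.local.stream.sendspace; this has not been verified against an
affected system.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (sender side): send data_len bytes together with
 * one SCM_RIGHTS control message in a single sendmsg(2) call.
 * Returns 0 if sendmsg(2) reported the full length, -1 otherwise.
 */
static int
send_with_control(int sock, size_t data_len)
{
	union {
		struct cmsghdr hdr;	/* forces suitable alignment */
		char buf[CMSG_SPACE(sizeof(int))];
	} ctl;
	struct msghdr msg;
	struct cmsghdr *cm;
	struct iovec iov;
	ssize_t n;
	char *data;
	int pfd[2];

	/* Any valid descriptor will do as the payload; use a fresh pipe. */
	if (pipe(pfd) == -1)
		return (-1);
	if ((data = calloc(1, data_len)) == NULL)
		return (-1);

	iov.iov_base = data;
	iov.iov_len = data_len;
	memset(&ctl, 0, sizeof(ctl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	cm = CMSG_FIRSTHDR(&msg);
	cm->cmsg_level = SOL_SOCKET;
	cm->cmsg_type = SCM_RIGHTS;
	cm->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cm), &pfd[0], sizeof(int));

	n = sendmsg(sock, &msg, 0);
	free(data);
	close(pfd[0]);
	close(pfd[1]);
	return (n == (ssize_t)data_len ? 0 : -1);
}

/*
 * Hypothetical driver: returns the number of data bytes the receiver
 * actually read, or -1 on setup failure or if the sender saw an error.
 * A result smaller than data_len while the sender reported success
 * would match the silent loss described in the report.
 */
long
run_case(size_t data_len)
{
	char buf[4096];
	long total = 0;
	ssize_t n;
	pid_t pid;
	int sv[2], status;

	assert(data_len > 0);
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);
	if ((pid = fork()) == -1) {
		close(sv[0]);
		close(sv[1]);
		return (-1);
	}
	if (pid == 0) {
		/* Child: sender. */
		close(sv[1]);
		_exit(send_with_control(sv[0], data_len) == 0 ? 0 : 1);
	}

	/* Parent: receiver. Read until the sender closes its end. */
	close(sv[0]);
	while ((n = read(sv[1], buf, sizeof(buf))) > 0)
		total += n;
	close(sv[1]);

	if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status) ||
	    WEXITSTATUS(status) != 0)
		return (-1);
	return (total);
}
```

Calling run_case(64 * 1024) from a small main() and comparing the
result with the requested size gives a pass/fail check. Sizes just
above the 8192-byte default mentioned in the report are probably the
most interesting ones to try.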