From owner-freebsd-transport@freebsd.org Tue May 5 00:22:33 2020
From: "Reynolds, Paul"
To: "freebsd-transport@freebsd.org"
Subject: SCTP deadlock
Date: Tue, 5 May 2020 00:22:23 +0000
Message-ID: <112525e87fce467e97f1d455ef9bf685@redcom.com>

Hi,

My apologies if this is not the right mailing list for this question. I am hoping to get some suggestions on how to debug an SCTP issue I have been having.

I have a set of programs that use SCTP for communication. For now these processes are all running on the same machine. They are using SEQPACKET mode to send messages back and forth. Each process has multiple threads, but only one socket.
The threads are mutex-protected such that the socket cannot be accessed by more than one thread at a time. Very occasionally one of the sockets will become permanently blocked on an SCTP send or receive call, and I am trying to figure out why. In the cases where it has become blocked on a send call, the corresponding receive process is not blocked and can send/receive data to other destinations. These processes can run fine for months and then suddenly run into this problem. I have been able to reproduce this once or twice by subjecting the system to unrealistically high levels of traffic, but it still takes several days or more to reproduce the problem. I now have a system that is stuck in this mode and am trying to gather as much information as possible.

How should I go about debugging this? The output of sockstat and netstat has not been very helpful up to this point.

Thanks for any help you might be able to provide,

Paul
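
A minimal sketch of the access pattern described above (several threads, one socket, one mutex), in case it helps to have something concrete to discuss. This is not the actual application code. It is written in Python for brevity, all names (`make_pair`, `GuardedSocket`, `demo`) are hypothetical, and a local AF_UNIX SOCK_SEQPACKET socket pair stands in for the SCTP one-to-many socket. The timeout is an assumption added for the sketch and is not part of the setup described above.

```python
import socket
import threading


def make_pair():
    """Return two connected, message-oriented local sockets.

    Stand-in for the SCTP SEQPACKET socket; falls back to SOCK_DGRAM
    on platforms without AF_UNIX SOCK_SEQPACKET support.
    """
    try:
        return socket.socketpair(socket.AF_UNIX, socket.SOCK_SEQPACKET)
    except OSError:
        return socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)


class GuardedSocket:
    """One socket shared by many threads, serialized by a single mutex."""

    def __init__(self, sock, timeout=5.0):
        self._sock = sock
        self._lock = threading.Lock()
        # Assumption for this sketch: bound every call with a timeout so a
        # stuck call raises instead of holding the lock forever.
        self._sock.settimeout(timeout)

    def send(self, payload):
        with self._lock:  # only one thread touches the socket at a time
            return self._sock.send(payload)

    def recv(self, bufsize=4096):
        with self._lock:
            return self._sock.recv(bufsize)


def demo(n_threads=4, per_thread=5):
    """Send from several threads through one guarded socket; return what arrived."""
    a, b = make_pair()
    left, right = GuardedSocket(a), GuardedSocket(b)

    def worker(tid):
        for i in range(per_thread):
            left.send(f"{tid}:{i}".encode())

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    # Drain on the peer while the senders are still running.
    received = [right.recv() for _ in range(n_threads * per_thread)]
    for t in threads:
        t.join()
    a.close()
    b.close()
    return received


if __name__ == "__main__":
    print(len(demo()), "messages received")
```

With this structure, any call that blocks while holding the lock also stalls every other thread in the same process that wants the socket, which is why the sketch bounds each call with a timeout.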