Date:      Fri, 17 Jul 2020 19:50:55 +0200
From:      Bernd Walter <ticso@cicely7.cicely.de>
To:        Michael Tuexen <Michael.Tuexen@lurchi.franken.de>
Cc:        ticso@cicely.de, freebsd-net@freebsd.org, Bernd Walter <ticso@cicely7.cicely.de>
Subject:   Re: SCTP problem, how to debug?
Message-ID:  <20200717175055.GB79604@cicely7.cicely.de>
In-Reply-To: <EE7DEB1F-D941-4FC2-91FB-7A1A3D3E11C3@lurchi.franken.de>
References:  <20200717160739.GA79604@cicely7.cicely.de> <EE7DEB1F-D941-4FC2-91FB-7A1A3D3E11C3@lurchi.franken.de>

On Fri, Jul 17, 2020 at 07:27:00PM +0200, Michael Tuexen wrote:
> 
> 
> > On 17. Jul 2020, at 18:07, Bernd Walter <ticso@cicely7.cicely.de> wrote:
> > 
> > I'm running an LED matrix with SCTP.
> > The matrix consists of 24 Raspberry Pis running NFS-root FreeBSD
> > 12.0-RELEASE (they have an SD card for u-boot and loader).
> > A client system is running FreeBSD 12.1-RELEASE.
> I fixed iterator related bugs and this was MFCed to stable/12 recently.
> The problem was that the iterator stopped sending.

The client side should be easy to update.
The modules can be a bit more tricky, but I will try.

> > 
> > The matrix modules have a one-to-many service socket.
> > The daemon regularly sends status information (temperature, etc.) to
> > each association and has a second thread to receive.
> > 
> > The client system has two daemons running.
> > 
> > One daemon is to control power output related to temperature states.
> > It has one thread reestablishing associations via sctp_connectx() to
> > each of the 24 modules using a single one-to-many socket.
> > Another thread collects all regularly received data and updates stored
> > values.
> > Yet another thread sends power control data via SCTP_SENDALL, so that all modules know
> > the maximum allowed brightness rating.
> > 
> > The other daemon uses the same threads to reconnect and receive.
> > It connects to the very same sockets on the modules.
> > Another thread updates picture data and the wanted power rating.
> > That is, it sends picture data to the given matrix module and then
> > sends an update trigger to all modules via SCTP_SENDALL.
> > The wanted power rating is, for example, reduced brightness at night, ...
> > 
> > All SCTP_SENDALL messages are just triggers with 0 length and different ppid values.
> Are you really sending messages of length 0? That shouldn't work...

No - I was wrong.
Just checked and I do send a dummy byte:
void
apa_push_leds()
{
        // send one dummy byte, because a zero-length message would not be sent
        char message = 0;
        send_message(&message, 1, 0, 0x00000002, SCTP_SENDALL);
}
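
Since the triggers carry their meaning only in the ppid, the receiving side needs some mapping from ppid to action. A minimal sketch of such a mapping, assuming a hypothetical `TriggerTable` class (not part of the daemon code in this mail); only the ppid value 0x00000002 for pushing the LEDs is taken from the code above:

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>

// Hypothetical helper: maps a received ppid to the action it triggers.
class TriggerTable {
public:
    // Register the action for one ppid; a later registration replaces an earlier one.
    void on(std::uint32_t ppid, std::function<void()> handler)
    {
        assert(handler);  // an empty handler would throw when called
        handlers_[ppid] = std::move(handler);
    }

    // Run the action for ppid. Returns false if no action is registered,
    // so unknown triggers can be logged instead of silently ignored.
    bool dispatch(std::uint32_t ppid) const
    {
        auto it = handlers_.find(ppid);
        if (it == handlers_.end())
            return false;
        it->second();
        return true;
    }

private:
    std::unordered_map<std::uint32_t, std::function<void()>> handlers_;
};
```

The receive thread would call dispatch() with the ppid reported for each incoming message and ignore the dummy payload byte.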

void
send_message(const void* data, size_t len, uint32_t stream, uint32_t ppid, uint32_t flags, const String& dest = "")
{
        Mutex::Guard mtx(sctp_mtx);
        ssize_t remain = len;

        if (!dest.empty()) {
                struct addrinfo ai;
                struct addrinfo *lips;
                bzero(&ai, sizeof(ai));
                ai.ai_flags = AI_ADDRCONFIG | AI_NUMERICSERV;
                ai.ai_family = AF_INET6;
                ai.ai_protocol = IPPROTO_SCTP;
                ai.ai_socktype = SOCK_SEQPACKET;
                int res = 0;
                String sport = port;
                String addr = dest;
                res = getaddrinfo(addr.c_str(), sport.c_str(), &ai, &lips);
                if (res != 0) {
                        throw Error("failed to resolve local ips");
                }
                struct addrinfo *lip;
                for (lip = lips; lip && remain > 0; lip = lip->ai_next) {
                        while (remain > 0) {
                                ssize_t res;
                                res = sctp_sendmsgx(sctp_socket, ((uint8_t*)data) + (len - remain), remain, lip->ai_addr, 1, ppid, flags | SCTP_EOR, stream, 0, 0);
                                if (res > 0) {
                                        remain -= res;
                                } else {
                                        if (errno != EAGAIN && errno != ENOBUFS) {
                                                // hard error: release the resolver results before giving up
                                                freeaddrinfo(lips);
                                                return;
                                        } else {
                                                int res;
                                                do {
                                                        struct pollfd pfd;
                                                        pfd.fd = sctp_socket;
                                                        pfd.events = POLLOUT;
                                                        pfd.revents = 0;
                                                        res = poll(&pfd, 1, 5000);
                                                } while (res == 0);
                                        }
                                }
                        }
                }
                freeaddrinfo(lips);
        } else {
                while (remain > 0) {
                        ssize_t res;
                        res = sctp_sendmsg(sctp_socket, ((uint8_t*)data) + (len - remain), remain, NULL, 0, ppid, flags | SCTP_EOR, stream, 0, 0);
                        if (res > 0) {
                                remain -= res;
                        } else {
                                if (errno != EAGAIN && errno != ENOBUFS) {
                                        return;
                                } else {
                                        int res;
                                        do {
                                                struct pollfd pfd;
                                                pfd.fd = sctp_socket;
                                                pfd.events = POLLOUT;
                                                pfd.revents = 0;
                                                res = poll(&pfd, 1, 5000);
                                        } while (res == 0);
                                }
                        }
                }
        }
}
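
The send loop above returns silently on a hard error, so a failed write to one module would go unnoticed. A minimal sketch of the same retry pattern that reports the errno to the caller, assuming a hypothetical `send_all()` helper and a `SendFn` callable that stands in for the sctp_sendmsg()/sctp_sendmsgx() call (neither name is from the daemon):

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <poll.h>
#include <sys/types.h>

// Hypothetical stand-in for the actual send call; returns bytes sent or -1 with errno set.
using SendFn = std::function<ssize_t(const std::uint8_t*, std::size_t)>;

// Sends len bytes through send_fn, waiting for POLLOUT on fd after EAGAIN/ENOBUFS.
// Returns 0 on success, otherwise the errno value that stopped the transfer.
int send_all(int fd, const void* data, std::size_t len, const SendFn& send_fn,
             int timeout_ms = 5000)
{
    assert(data != nullptr || len == 0);
    const std::uint8_t* p = static_cast<const std::uint8_t*>(data);
    std::size_t off = 0;

    while (off < len) {
        ssize_t n = send_fn(p + off, len - off);
        if (n > 0) {
            off += static_cast<std::size_t>(n);
            continue;
        }
        // Assumption: a zero-byte result is treated as an I/O error here.
        int err = (n == 0) ? EIO : errno;
        if (err != EAGAIN && err != ENOBUFS)
            return err;  // hard error: let the caller log it

        struct pollfd pfd{};
        pfd.fd = fd;
        pfd.events = POLLOUT;
        int r;
        do {
            r = poll(&pfd, 1, timeout_ms);
        } while (r == 0 || (r < 0 && errno == EINTR));  // retry on timeout or signal
        if (r < 0)
            return errno;
    }
    return 0;
}
```

A caller could log the returned errno together with the destination address, which would show whether writes to the stalled module fail or appear to succeed.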


> > 
> > From time to time (1-5 days) I notice that a module won't get updates
> > anymore.
> > I see that the association got an SCTP_SENDER_DRY_EVENT event.
> > Therefore my expectation is that there is nothing to send.
> > I still see the association in the socket list and also receive the regular
> > temperature data.
> > However, obviously sending won't happen.
> > The other modules still get data.
> > 
> > When I restart the client daemon, things start to work again.
> > 
> > Currently I'm clueless on how to debug this problem any further.
> Can you try stable/12?
> > Doing a full network log would be too big and I'm not experienced
> > enough to understand the SCTP packets.
> > I have no idea how to see what data is in the send buffer.
> > netstat with TCP would show send and receive queue, not so with SCTP.
> > Data is sent by a single thread, which uses sctp_sendmsgx() to send the
> > data for all modules sequentially into the same socket.
> I'm not sure I understood what you are actually doing on which socket
> and how many associations are involved.

Each of the 24 modules has a single socket with two associations from the
client host.
The client host has two daemons, which have a socket each, and both sockets
have an association to each of the 24 modules.

This is the client host:
Proto  Type  Local Address                                 Foreign Address                               (state)
sctp46 1toN  2003:df:b017:115::100.10405                   2003:df:b017:115:ba27:ebff:fe66:62de.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.10405                   2003:df:b017:115:ba27:ebff:fecf:7cb7.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.10405                   2003:df:b017:115:ba27:ebff:fe4c:b9c9.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.10405                   2003:df:b017:115:ba27:ebff:fee6:41f6.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.10405                   2003:df:b017:115:ba27:ebff:fec4:6a45.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.10405                   2003:df:b017:115:ba27:ebff:fe93:5ab4.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.10405                   2003:df:b017:115:ba27:ebff:fec6:aaea.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.10405                   2003:df:b017:115:ba27:ebff:feef:ba3.1000      ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.10405                   2003:df:b017:115:ba27:ebff:fe87:b229.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.10405                   2003:df:b017:115:ba27:ebff:fe82:9ece.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.10405                   2003:df:b017:115:ba27:ebff:fe96:bf2a.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.10405                   2003:df:b017:115:ba27:ebff:fe10:195b.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.10405                   2003:df:b017:115:ba27:ebff:fe46:cb7.1000      ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.10405                   2003:df:b017:115:ba27:ebff:feb5:65dc.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.10405                   2003:df:b017:115:ba27:ebff:fec5:30dd.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.10405                   2003:df:b017:115:ba27:ebff:fe52:54bc.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.10405                   2003:df:b017:115:ba27:ebff:fe8a:2fcb.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.10405                   2003:df:b017:115:ba27:ebff:fec6:5d6e.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.10405                   2003:df:b017:115:ba27:ebff:fe03:c920.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.10405                   2003:df:b017:115:ba27:ebff:fecb:66a3.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.10405                   2003:df:b017:115:ba27:ebff:fe9c:9e54.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.10405                   2003:df:b017:115:ba27:ebff:fe7c:5702.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.10405                   2003:df:b017:115:ba27:ebff:fef2:186c.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.10405                   2003:df:b017:115:ba27:ebff:fe4d:3de5.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.38953                   2003:df:b017:115:ba27:ebff:fecb:66a3.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.38953                   2003:df:b017:115:ba27:ebff:fe66:62de.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.38953                   2003:df:b017:115:ba27:ebff:fecf:7cb7.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.38953                   2003:df:b017:115:ba27:ebff:fe4c:b9c9.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.38953                   2003:df:b017:115:ba27:ebff:fee6:41f6.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.38953                   2003:df:b017:115:ba27:ebff:fec4:6a45.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.38953                   2003:df:b017:115:ba27:ebff:fe93:5ab4.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.38953                   2003:df:b017:115:ba27:ebff:fec6:aaea.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.38953                   2003:df:b017:115:ba27:ebff:feef:ba3.1000      ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.38953                   2003:df:b017:115:ba27:ebff:fe87:b229.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.38953                   2003:df:b017:115:ba27:ebff:fe82:9ece.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.38953                   2003:df:b017:115:ba27:ebff:fe96:bf2a.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.38953                   2003:df:b017:115:ba27:ebff:fe10:195b.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.38953                   2003:df:b017:115:ba27:ebff:fe46:cb7.1000      ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.38953                   2003:df:b017:115:ba27:ebff:feb5:65dc.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.38953                   2003:df:b017:115:ba27:ebff:fec5:30dd.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.38953                   2003:df:b017:115:ba27:ebff:fe52:54bc.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.38953                   2003:df:b017:115:ba27:ebff:fe8a:2fcb.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.38953                   2003:df:b017:115:ba27:ebff:fec6:5d6e.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.38953                   2003:df:b017:115:ba27:ebff:fe03:c920.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.38953                   2003:df:b017:115:ba27:ebff:fe9c:9e54.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.38953                   2003:df:b017:115:ba27:ebff:fe7c:5702.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.38953                   2003:df:b017:115:ba27:ebff:fef2:186c.1000     ESTABLISHED
sctp46 1toN  2003:df:b017:115::100.38953                   2003:df:b017:115:ba27:ebff:fe4d:3de5.1000     ESTABLISHED

This is one of the 24 modules:
Proto  Type  Local Address                                 Foreign Address                               (state)
sctp46 1toN  127.0.0.1.1000                                                                              LISTEN
             fe80::1%lo0.1000                              
             ::1.1000                                      
             2003:df:b017:115:ba27:ebff:fe87:b229.1000     
             fe80::ba27:ebff:fe87:b229%ue0.1000            
             10.215.74.118.1000                            
sctp46 1toN  2003:df:b017:115:ba27:ebff:fe87:b229.1000     2003:df:b017:115::100.10405                   ESTABLISHED
sctp46 1toN  2003:df:b017:115:ba27:ebff:fe87:b229.1000     2003:df:b017:115::100.38953                   ESTABLISHED
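
Checking listings like these by eye across 24 modules is tedious. A minimal sketch of a counter for ESTABLISHED associations per local endpoint, assuming every association row carries all five columns as in the output shown here; `count_established()` is a hypothetical helper, not an existing tool:

```cpp
#include <map>
#include <sstream>
#include <string>

// Counts ESTABLISHED rows per local endpoint in netstat-style SCTP output.
// Header lines, LISTEN rows and address continuation rows are skipped because
// they do not have five whitespace-separated fields starting with "sctp".
std::map<std::string, int> count_established(const std::string& netstat_output)
{
    std::map<std::string, int> counts;
    std::istringstream lines(netstat_output);
    std::string line;

    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string proto, type, local, foreign, state;
        if (!(fields >> proto >> type >> local >> foreign >> state))
            continue;  // fewer than five fields
        if (proto.rfind("sctp", 0) != 0 || state != "ESTABLISHED")
            continue;  // not an established SCTP association row
        ++counts[local];
    }
    return counts;
}
```

On the client host each of the two local endpoints should then report 24 associations, and on a module the service endpoint should report 2.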

> > I haven't checked yet if I get an error with the write to the specific
> > module IP.

-- 
B.Walter <bernd@bwct.de> http://www.bwct.de
Modbus/TCP Ethernet I/O modules, ARM-based FreeBSD computers, and much more.
