Date: Tue, 20 Mar 2018 16:11:42 -0700
From: Aleksandr Miroslav <alexmiroslav@gmail.com>
To: freebsd-questions@freebsd.org
Subject: weird network/DNS issues (nsd not returning answer)
Message-ID: <CACcSE1yXgcp8vJS2CTmXWTwSs_cxrAU%2BNjirNn2n8Ls7fSsjgQ@mail.gmail.com>
I have a number of FreeBSD servers online. The other day, one of them, which I set up a month back, started exhibiting really weird behavior: it doesn't get answers back to queries made to my two DNS servers, both of which are running nsd.

Initially I suspected pf or sshguard to be the issue, but this happens with pf and sshguard turned off on all servers in question. The other weird thing is that all other network traffic between these servers is passing back and forth normally; only the nsd replies are not getting through.

Here is the issue, roughly:

- given multiple servers, labeled a-z
- servers k and z run nsd
- with the exception of server b, all other servers can communicate normally with servers k and z
- with the exception of DNS queries, server b can communicate normally with servers k and z
- b can ping, ssh, rsync, and scp to and from servers k and z

The only issue is when b makes a DNS query to k or z. I see those two servers get the query and return the answer, but that answer never reaches b. I have sniffed the network to confirm this. Observe:

    # in these examples:
    #   b.example.org = 66.66.66.66, the server that is misbehaving
    #   k.example.org = 1.1.1.1, one of my DNS servers
    #   c.example.org = 3.3.3.3, another server of mine, which I am looking up in DNS

    # b makes the initial query to k
    14:11:46.912995 IP 66.66.66.66.18394 > 1.1.1.1.53: 22479+ A? c.example.org. (31)

    # k receives the query and immediately returns the answer
    14:11:46.931605 IP 66.66.66.66.18394 > 1.1.1.1.53: 22479+ A? c.example.org. (31)
    14:11:46.931854 IP 1.1.1.1.53 > 66.66.66.66.18394: 22479*- 1/2/1 A 3.3.3.3 (103)

    # this second line, the answer, never makes it to b

    # after a second or two, b makes another query:
    14:11:51.969083 IP 66.66.66.66.12645 > 1.1.1.1.53: 22479+ A? c.example.org. (31)

    # k receives the second query and immediately returns the answer again
    14:11:51.991267 IP 66.66.66.66.12645 > 1.1.1.1.53: 22479+ A? c.example.org. (31)
    14:11:51.991508 IP 1.1.1.1.53 > 66.66.66.66.12645: 22479*- 1/2/1 A 3.3.3.3 (103)

    # there is still nothing from tcpdump on b's interface showing that it received the answer

    # [DNS names and IPs have been changed above.]

Here's what it looks like from b's command line:

    $ host c.example.org k.example.org
    # a few seconds delay
    ;; connection timed out; no servers could be reached
    $

b has the same problem with my other server z, which also runs nsd. All my other servers can query k and z just fine. Only b is exhibiting this problem.

All the servers run pf/sshguard, but these rules/configs have not been updated in months.

I did do one other thing to debug. I shut down nsd on k and set up a listener on b like this:

    nc -l 10000

And on k, I did this:

    ls /etc | sudo nc -s 1.1.1.1 -p 53 b.example.org 10000

This produced the contents of /etc on b. So that means that without nsd in the picture, k is able to talk to b via port 53 just fine.

All the above servers in question are running FreeBSD 11.1-RELEASE-p6.

I'm not exactly sure how I can debug this problem further, and I'm not sure where the block is happening. Any help appreciated.
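One caveat about the nc test above: nc without -u speaks TCP, while the failing queries in the tcpdump output are UDP datagrams, so that test does not exercise the same path. Below is a minimal sketch, assuming Python 3 is available on b, of a probe that sends the same kind of A query over UDP and over TCP so the two can be compared. The function names (build_query, probe_udp, probe_tcp) are hypothetical and only for illustration; the query it builds for c.example.org is 31 bytes, matching the (31) shown by tcpdump.

```python
import socket
import struct


def build_query(name, qid=0x1234):
    """Build a minimal DNS A query in RFC 1035 wire format."""
    # Header: ID, flags (only RD set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=A (1), QCLASS=IN (1)
    return header + qname + struct.pack("!HH", 1, 1)


def probe_udp(server, name, timeout=3.0, port=53):
    """Send the query over UDP; return the raw reply, or None if none arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        try:
            s.sendto(build_query(name), (server, port))
            data, _ = s.recvfrom(4096)
        except OSError:  # includes socket.timeout
            return None
    return data


def probe_tcp(server, name, timeout=3.0, port=53):
    """Send the same query over TCP (2-byte length prefix); reply or None."""
    query = build_query(name)
    try:
        with socket.create_connection((server, port), timeout=timeout) as s:
            s.sendall(struct.pack("!H", len(query)) + query)
            raw = s.recv(4096)  # one recv is enough for a small answer in this sketch
    except OSError:
        return None
    return raw[2:] if len(raw) > 2 else None


# Example use on b (addresses are the placeholders from the post):
#   print("udp:", probe_udp("1.1.1.1", "c.example.org") is not None)
#   print("tcp:", probe_tcp("1.1.1.1", "c.example.org") is not None)
```

If the TCP probe gets a reply and the UDP probe does not, that would suggest the block is specific to UDP replies coming from port 53 rather than to nsd itself. The same comparison could be made with nc by adding -u on both ends of the listener test.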