Date: Sun, 8 Dec 2024 20:30:36 +0100
From: Ronald Klop <ronald@FreeBSD.org>
To: FreeBSD User <freebsd@walstatt-de.de>
Cc: freebsd-current@freebsd.org
Subject: (ipfw) Re: HELP! fetch: stuck forever OR error: RPC failed: curl 56 recv failure: Operation timed out
Message-ID: <f8952585-4b68-4cfd-a60f-1ebbd7f2545f@FreeBSD.org>
In-Reply-To: <20241206210947.3ae835e4@thor.intern.walstatt.dynvpn.de>
References: <20241206034709.4dd32cc5@thor.intern.walstatt.dynvpn.de> <279848701.11738.1733510402875@localhost> <20241206210947.3ae835e4@thor.intern.walstatt.dynvpn.de>

Hi,

I can reproduce your error. Today I updated my RPI4 from a build of Oct 23 to Dec 6, and I can reproduce the problem.

After about 2 hours scp exits with:
client_loop: send disconnect: Broken pipe
scp: Connection closed

Working:
FreeBSD rpi4 15.0-CURRENT FreeBSD 15.0-CURRENT #4 main-d2e7bb630b8-dirty: Wed Oct 23 00:55:12 CEST 2024 ronald@rpi4:/data/ronald/freebsd/obj/data/ronald/freebsd/src/main/arm64.aarch64/sys/GENERIC-NODEBUG arm64

Broken:
FreeBSD rpi4 15.0-CURRENT FreeBSD 15.0-CURRENT #5 main-839fb85336a-dirty: Sat Dec 7 22:33:27 CET 2024 ronald@rpi4:/data/ronald/freebsd/obj/data/ronald/freebsd/src/main/arm64.aarch64/sys/GENERIC-NODEBUG arm64

A cron job that does an scp to another server stopped working. When I go back to the previous boot environment (BE) it works fine again. Running "ipfw disable firewall" also makes the scp work. Scp also seems to work fine if I replace the stateful firewall rules with a stateless "pass all from any to any".

Regards,
Ronald.

On 06-12-2024 at 21:09, FreeBSD User wrote:
> On Fri, 6 Dec 2024 19:40:02 +0100 (CET)
> Ronald Klop <ronald-lists@klop.ws> wrote:
>
>> Might be useful to share your ipfw config.
>
> Sorry, my posting must have been disturbing (having in mind a "deny any" rule and then
> disabling the FW ...).
>
> Well, the IPFW setup itself is explained quickly - I use almost the vanilla rc.conf-issued
> IPFW (settings: firewall_type="workstation", firewall_logif="YES",
> firewall_myservices="22/tcp", firewall_allowservices="any"). The hosts in question have the
> following kernel configuration; I provide the option tags that might be of interest or, if
> not, just for the record, as they are not part of GENERIC, see below.
>
> Also, I'll provide some sysctl settings performed via /etc/sysctl.conf.local, see below.
>
> The configuration and settings have been mostly unchanged over a couple of months now and
> did not induce trouble so far.
>
> As far as time and my limited skills allowed, I disabled and enabled the MAC_ and NETGRAPH_
> options piece by piece - without any success so far - my "measurement" is fetching
> emails via claws-mail (all TLS). claws-mail reports "corrupted/broken stream", has
> authentication issues and is de facto unusable - it doesn't refresh IMAP based email fetches
> and doesn't even quit without a hard kill.
> Another "indicator" is the time taken to "git pull" of ZFS filesystems: cloning and pulling
> takes unusually long (/usr/src is UFS/FFS, /usr/ports on a ZFS pool and since the problem
> occurred, it makes a mutual difference).
>
> While git pull or clone mutually stuck and claws-mail is endlessly fetching/authenticating
> emails and never responding back in a usable manner, performing
>
> "ipfw disable firewall"
>
> makes the system all of a sudden work again as usual and expected.
>
> As reported - the problem spreads across all of my CURRENT hosts as I'm going to update them
> towards a recent CURRENT (they all share similar static kernel configs as described here). Most
> of the boxes do not show the weird reluctant behaviour when pulling via git, but weren't
> capable of cloning, bailing out with the timeout reported earlier.
>
> I use one CURRENT box as my personal desktop, so no other (server) CURRENT shows the email
> problem in detail as described.
>
> And, for the record: I haven't commented out the "options IPFIREWALL" yet in the kernel
> config ...
>
>
> Kind regards
>
> oh
>
> [ KERNEL config different from vanilla GENERIC ]
>
> options RATELIMIT
> options ZFS
> options TCPHPTS
> options MROUTING
> options IPSEC
> options SCTP
>
> options MAC_BSDEXTENDED
> options MAC_PORTACL
> options MAC_IPACL
> options MAC_NTPD
> #options MAC_DO
>
> options NETGRAPH
> options NETGRAPH_IPFW
> options NETGRAPH_ETHER
> options NETGRAPH_EIFACE
> options NETGRAPH_VLAN
> #options NETGRAPH_NAT
> options NETGRAPH_DEVICE
> #options NETGRAPH_PPPOE
> options NETGRAPH_SOCKET
> options NETGRAPH_KSOCKET
> options NETGRAPH_NETFLOW
> #options NETGRAPH_CAR
>
> # IPFW firewall
> options IPFIREWALL
> options IPFIREWALL_VERBOSE
> options DUMMYNET # traffic shaper
>
> options BPF_JITTER # adds support for BPF just-in-time compiler.
>
> # Pseudo devices not in GENERIC.
> device enc # IPsec device
> device stf # 6to4 IPv6 over IPv4 encapsulation
> device carp # Common address redundancy protocol
> device lagg # Link aggregation
> device gre # GRE Tunnel
> device epair # A pair of virtual back-to-back connected Ethernet interfaces
> device if_bridge # bridge device
> device vxlan # Virtual eXtensible LAN interface
>
>
> For the MAC_ modules: the appropriate OIDs (sysctl) are disabled as far as the MAC module
> influences the initial behaviour if unconfigured, for instance
> (/etc/sysctl.conf.local)
>
> [ /etc/sysctl.conf.local ]
> security.mac.bsdextended.enabled=0
> security.mac.mls.enabled=0
> security.mac.portacl.enabled=0
> security.mac.do.enabled=0
> security.mac.ipacl.ipv6=0
> security.mac.ipacl.ipv4=0
> #
> net.bpf.optimize_writers=1
> #
> net.inet.ip.fw.verbose=1
> #net.inet.ip.fw.verbose_limit=10
> net.inet.ip.fw.dyn_keep_states=1
>
>
>>
>> From: FreeBSD User <freebsd@walstatt-de.de>
>> Date: 6 December 2024 03:47
>> To: freebsd-current@freebsd.org, freebsd-ipfw@freebsd.org
>> Subject: Re: HELP! fetch: stuck forever OR error: RPC failed: curl 56 recv failure:
>> Operation timed out
>>
>>>
>>>
>>> On Thu, 5 Dec 2024 17:33:54 +0100
>>> FreeBSD User wrote:
>>>
>>> I found the culprit!
>>>
>>> Disabling IPFW ("ipfw disable firewall") turns the system back to normal!
>>>
>>> For the record: recent CURRENT, since approx. Nov. 30 and/or December 1st, seems
>>> to corrupt network connections.
>>>
>>> IPFW is compiled statically into the kernel.
>>>
>>> The problem sketched below can be reproduced in a more or less obvious manner on recent
>>> CURRENT: git pull/git clone of a regular FreeBSD source repo or ports via git+https takes
>>> either a long time (up to several minutes to initiate the pull) - or, in some worse
>>> cases here, the box runs into
>>> error: RPC failed; curl 56 Recv failure: Operation timed out
>>>
>>> claws-mail complains about "corrupted/broken stream", fetching emails takes aeons -
>>> forever, the client does not come back even after several hours.
>>>
>>>> On Thu, 5 Dec 2024 16:55:00 +0100
>>>> Daniel Tameling wrote:
>>>>
>>>>> On Thu, Dec 05, 2024 at 11:51:03AM +0100, FreeBSD User wrote:
>>>>>> On Wed, 04 Dec 2024 17:20:39 +0000
>>>>>> "Dave Cottlehuber" wrote:
>>>>>>
>>>>>> Thank you very much for responding!
>>>>>>
>>>>>>> On Tue, 3 Dec 2024, at 19:46, FreeBSD User wrote:
>>>>>>>> On most recent CURRENT (on some boxes of ours, not all) fetch/git seem to be stuck
>>>>>>>> forever fetching tarballs from ports, fetching emails via claws-mail (TLS), opening
>>>>>>>> websites via librewolf and firefox or pulling repositories via git.
>>>>>>>>
>>>>>>>> CURRENT: FreeBSD 15.0-CURRENT #1 main-n273978-b5a8abe9502e: Mon Dec 2 23:11:07 CET 2024 amd64
>>>>>>>>
>>>>>>>> When performing "git pull" under /usr/ports, I received after roughly 5-7 minutes:
>>>>>>>>
>>>>>>>> error: RPC failed: curl 56 recv failure: Operation timed out
>>>>>>>
>>>>>>> Generally it would be worth seeing if the HTTP(S) layers are doing the right thing
>>>>>>> or not, and then working down from there, to tcpdump / wireshark and then if
>>>>>>> necessary into the kernel itself.
>>>>>>
>>>>>> My skills are limited regarding packet analysis utilizing tcpdump/wireshark (and
>>>>>> theory, of course). I tried due to "a feeling" that my older Intel based NIC could
>>>>>> have some checksum issues like in the past (I saw e1000 driver updates recently
>>>>>> flowing into FreeBSD CURRENT).
>>>>>>>
>>>>>>> If fetch fails reliably in ports distfile fetching, then isolate a suitable
>>>>>>> tarball, and try it again in curl, with tcpdump already prepared to capture
>>>>>>> traffic to the remote host.
>>>>>>>
>>>>>>> tcpdump -w /tmp/curl.pcap -i ... host ...
>>>>>>>
>>>>>>> env SSLKEYLOGFILE=/tmp/ssl.keys curl -vsSLo /dev/null --trace /tmp/curl.log https://what.ev/er
>>>>>>>
>>>>>>> I would guess that between the two something useful should pop up.
>>>>>>>
>>>>>>> I like opening the pcap in wireshark, it often has angry red and black highlighted
>>>>>>> lines already giving me a hint.
>>>>>>>
>>>>>>> The SSLKEYLOGFILE can be imported into wireshark, and allows decrypting the TLS
>>>>>>> traffic as well in case there are issues further in. Very handy,
>>>>>>> see https://everything.curl.dev/usingcurl/tls/sslkeylogfile.html for how to do that.
>>>>>>>
>>>>>>> If your issues only occur with git pull, it's also curl inside and supports similar
>>>>>>> debugging. Ferreting
>>>>>>> through https://stackoverflow.com/questions/6178401/how-can-i-debug-git-git-shell-related-problems/56094711#56094711
>>>>>>> should get you similar info.
>>>>>>>
>>>>>>> A+
>>>>>>> Dave
>>>>>>>
>>>>>>
>>>>>> Thanks for the hints and precious tips! I'll dig deeper into the matter.
>>>>>>
>>>>>> In the meanwhile, I updated some other machines running CURRENT since approx. two
>>>>>> weeks with an older CURRENT to the most recent one - and face similar but not
>>>>>> identical problems!
>>>>>> Updating existing FreeBSD repositories, like src.git and ports.git, shows no problems
>>>>>> except that it takes longer to accomplish than expected.
>>>>>> Cloning a repo is impossible, after 10 or 15 minutes I receive a timeout.
>>>>>>
>>>>>> On a CURRENT recently updated that worked flawlessly before (CURRENT now: FreeBSD
>>>>>> 15.0-CURRENT #5 main-n274014-b2bde8a6d39: Wed Dec 4 22:22:22 CET 2024 amd64),
>>>>>> cloning attempts for 14.2-RELENG end up in this mess:
>>>>>>
>>>>>> # git clone --branch releng/14.2 https://git.freebsd.org/src.git 14.2-RELENG/src/
>>>>>> Cloning into '14.2-RELENG/src'...
>>>>>> error: RPC failed; curl 56 Recv failure: Operation timed out
>>>>>> fatal: expected 'packfile'
>>>>>>
>>>>>> This is nasty. The host now in question has an i350 based dual-port NIC - the host's
>>>>>> kernel is very similar to the box on which I first reported the issue, both do have
>>>>>> customized kernels (in most cases, I compile several modules like ZFS and
>>>>>> several NETGRAPH modules statically into the kernel - a habit inherited from a small
>>>>>> FBSD project I configured (I wouldn't say developed) which does not allow loadable
>>>>>> kernel modules due to regulations).
>>>>>>
>>>>>> I hoped others would stumble over this tripwire in recent CURRENT sources, since the
>>>>>> phenomena and its distribution over a bunch of CURRENT boxes with different OS states
>>>>>> seemingly show different behaviour.
>>>>>>
>>>>>> And for the record: I also build my ports via poudriere and mostly via make. I also
>>>>>> rebuilt in a two-day marathon all packages via "make -f" - for librewolf, curl and
>>>>>> so on to ensure having latest sources/packages.
>>>>>>
>>>>>> (I repeat myself here again, sorry, it's for the record).
>>>>>>
>>>>>> Will report in on further development and "investigations"
>>>>>>
>>>>>> Kind regards and thanks,
>>>>>>
>>>>>> oh
>>>>>>
>>>>>
>>>>> This is a shot into the dark but is this a virtual machine? VirtualBox 7.1.0 had some
>>>>> networking issues that got fixed later.
>>>>
>>>> No, pure Hardware and FreeBSD ...
>>>>
>>>>>
>>>>> Otherwise I would start with ping and traceroute to figure out if they show this issue
>>>>> and where it occurs.
>>>>>
>>>>
>>>
>>> --
>>> O. Hartmann
>>>