Date: Wed, 10 Jan 2024 11:21:39 -0800
From: Mark Millard <marklmi@yahoo.com>
To: bob prohaska <fbsd@www.zefox.net>
Cc: freebsd-arm@freebsd.org
Subject: Re: USB-serial adapter suggestions needed
Message-ID: <E769A770-8D23-4EFC-8E75-F0ACF6705C4E@yahoo.com>
In-Reply-To: <ZZ7fBDxYd8Yyw5fm@www.zefox.net>
References: <ZZoOMoM/iwcFqSNi@www.zefox.net> <7D27DF9F-AA9A-4D44-BC28-8CC637D0F550@yahoo.com> <C40967D3-D480-41DD-8054-421EA7AE5EFE@yahoo.com> <ZZy2FRO7MCOLMQhp@www.zefox.net> <DD96D1B0-92A8-4CBD-9661-FC86AC547405@yahoo.com> <ZZ2E8OZJBfXLkwQ%2B@www.zefox.net> <ZZ3NGeOtadKMHgIj@www.zefox.net> <3012A549-9482-4D69-9DF4-7987E650DFFA@yahoo.com> <ZZ7fBDxYd8Yyw5fm@www.zefox.net>

On Jan 10, 2024, at 10:16, bob prohaska <fbsd@www.zefox.net> wrote:

> On Tue, Jan 09, 2024 at 05:03:42PM -0800, Mark Millard wrote:
>> On Jan 9, 2024, at 14:47, bob prohaska <fbsd@www.zefox.net> wrote:
>>
> [transcript of ssh-tip disconnect omitted]
>>
>> Interesting.
>>
>> www.zefox.org is the aarch64 that is not configured in config.txt
>> in the normal aarch64 manner. As I've requested before, please test
>> using a config.txt that instead has:
>>
>> QUOTE
>> [all]
>> arm_64bit=1
>> dtparam=audio=on,i2c_arm=on,spi=on
>> dtoverlay=mmc
>> dtoverlay=disable-bt
>> device_tree_address=0x4000
>> kernel=u-boot.bin
>>
>> [pi4]
>> hdmi_safe=1
>> armstub=armstub8-gic.bin
>>
>> # Local addition:
>> [all]
>> force_mac_address=b8:27:eb:71:46:4f
>> END QUOTE
>>
>> Please do not use a configuration based in part on armv7 FreeBSD
>> config.txt material any more for the testing activity: Just use
>> the FreeBSD normal aarch64 configuration with the force_mac_address
>> related material added at the end.
>>
>> I want to know if this also fails when powerd is not in
>> use anywhere.
>>
>
> I'd like to let the present OS build/install cycle complete.
> Then I'll replace config.txt on www.zefox.org and reboot.
> That should be done in another day or two. The remote console
> disconnect reported above hasn't happened again, all consoles
> have stayed connected and responsive.
>
>
>> [I assume that the "The Pi4 workstation" is the "pi4 RasPiOS
>> workstation". True? Presuming yes: Is the RasPiOS the 64 bit
>> variation? (Just my curiosity.)]
>>
> Yes. Uname -a reports
> Linux raspberrypi 6.1.21-v8+ #1642 SMP PREEMPT Mon Apr 3 17:24:16
> BST 2023 aarch64 GNU/Linux
>
>> Do you run the buildworld on www.zefox.org and such via the
>> tip session on pelorus.zefox.org ? Via an ssh session from the
>> "pi4 RasPiOS workstation" to www.zefox.org ?
>> More generally:
>>
>> A) What runs (if anything) via the tip session started from
>> pelorus.zefox.org ?
>>
>> B) What runs (if anything) via the ssh session connected to
>> www.zefox.org ?
>>
>
> In general the tip session is used only for observation or
> troubleshooting. Ssh connections are used for other activity,
> including OS build/install cycles, poudriere, etc. They are
> usually placed in the background, writing to log files so that
> accidental disconnects from the workstation don't stop them.

Are you using:

NAME
     nohup – invoke a utility immune to hangups

SYNOPSIS
     nohup [--] utility [arguments]

DESCRIPTION
     The nohup utility invokes utility with its arguments and at this time
     sets the signal SIGHUP to be ignored.  If the standard output is a
     terminal, the standard output is appended to the file nohup.out in the
     current directory.  If standard error is a terminal, it is directed to
     the same place as the standard output.

     Some shells may provide a builtin nohup command which is similar or
     identical to this utility.  Consult the builtin(1) manual page.

?

>> A useful test would be to not have the tip command running
>> on polaris.zefox.org and to just use the ssh to www.zefox.org
>> instead to start the buildworld/buildkernel. So: No use
>> of the serial connection when the buildworld is started or
>> during the build(s). Using tip before that but quitting tip
>> before starting to load the RPi4B would be okay for this type
>> of test. The question would be if the:
>>
>> client_loop: send disconnect: Broken pipe
>>
>> still happens.
>>
>> (I'm not claiming that recovery if it fails would be nice. But
>> finding out if it fails looks to be important.)
>>
>> The contrasting useful test would be to start the buildworld
>> from the tip session on polaris.zefox.org and to not have any
>> ssh session to www.zefox.org . The question would be if a
>> failure of some kind still happens.
>> (The tip session does not
>> have a pipe in use as far as I know so the detail for
>> identifying failure would likely be different.)
>>
>
> Normal practice is to leave the tip sessions displaying the
> console host's login prompt. So long as the console login is
> responsive I can assume that host isn't hung.
>
>> Another question would be: do both such tests fail? Just one
>> (which)? None? So trying both tests eventually would be
>> important.
>
> In general, ssh sessions behave completely independently.
> Ssh connections to tip sessions commonly fail but no other
> ssh connection to that terminal server is disturbed visibly.
>>
>> It is important to have only one of the 2 types of connections
>> in use during the buildworld/buildkernel and such activity for
>> this type of test --and only the one instance of which ever
>> type the active test is for.
>>
>>
>
> Apologies if I didn't answer your question; I'm missing the gist.

I only want one source of hangups/failure, no worries about which
one (network vs. serial) lead to the activity if a failure happens.

That only ssh sessions that in turn run tip fail suggests that
the tip session gets the initial problem and then things propagate.
I want more than a suggestion. For example: direct tip runs that
are not in any ssh session: still get some form of failures? For
another: no tip use, just ssh: still get failures? Do both ways
still get failures?

Yes, the implication is that some experiments that do not have
your normal structure are involved and there may be risk of not
being able to use a tip session as a responsiveness test during
such an experiment. I'm not suggesting any such thing for normal
operation once such experiments are finished.

> It remains unclear where the disconnects to tip originate.

That is part of what I'm requesting exploration of via different
techniques than past attempts that did not provide the information.
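
The nohup description quoted earlier turns on SIGHUP, and nohup ignores that signal silently, so it leaves no trace of whether a hangup ever arrived. A minimal Python sketch of the opposite approach, recording each SIGHUP with a timestamp, might look like the following. The function names and the log file are hypothetical, chosen only for illustration; nothing here is taken from the systems under discussion.

```python
import os
import signal
import time

hangups = []  # arrival times (seconds since the epoch) of SIGHUPs seen so far


def record_hangup(signum, frame):
    """Note the arrival time of a SIGHUP instead of terminating."""
    hangups.append(time.time())


def install_hangup_logger():
    """Replace the default SIGHUP action (terminate) with record_hangup."""
    signal.signal(signal.SIGHUP, record_hangup)


def write_report(path):
    """Append the recorded arrival times to a log file (hypothetical path)."""
    with open(path, "a") as log:
        for stamp in hangups:
            log.write("SIGHUP at %s\n" % time.ctime(stamp))
```

A wrapper would call install_hangup_logger() before starting the long-running work and write_report("hangups.log") afterwards. An empty report after a disconnect would point away from SIGHUP as the mechanism, at least for that process.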

> If the tip
> session is stopped by typing ~~. from the originating ssh instance I'm
> returned to the shell on the terminal server. Ssh isn't disturbed. If
> I type ~. the ssh session terminates and I'm back to the workstation's
> shell. Would it be informative to start a tip session, then ssh in
> separately and try to kill tip?

A question is if SIGHUP is happening. If it is, then the kill
that would simulate the issue would be via sending SIGHUP. But
this may be only one of however many alternatives there may be.
I prefer to explore what is actually happening than attempted
simulations via guesses at what is happening.

> I'd expect the ssh part of the link
> to remain up. If not, would it be significant?
>
> Occasionally warnings like
> Jan 10 00:23:30 ns1 sshd[925]: error: beginning MaxStartups throttling
> show up in console messages. Might those be relevant in some way?

Hmm. Interesting. Looking around I see notation like:

MaxStartups 10:30:100

where (mostly copy/pasted wording from an example, other than
detailed formatting):

10: concurrent unauthenticated sessions before it begins rejecting
    some subsequent connections
30: The percent of subsequent connections that are rejected [but
    see below]
100: At this many concurrent unauthenticated sessions, sshd rejects
     all subsequent connections

Looking, "man sshd_config" reports:

     MaxStartups
             Specifies the maximum number of concurrent unauthenticated
             connections to the SSH daemon.  Additional connections will be
             dropped until authentication succeeds or the LoginGraceTime
             expires for a connection.  The default is 10:30:100.

             Alternatively, random early drop can be enabled by specifying
             the three colon separated values start:rate:full (e.g.
             "10:30:60").  sshd(8) will refuse connection attempts with a
             probability of rate/100 (30%) if there are currently start (10)
             unauthenticated connections.
             The probability increases linearly and all
             connection attempts are refused if the number of unauthenticated
             connections reaches full (60).

It does suggest that testing isolated from the source(s) of
unauthenticated sessions could be worth while in case handling
the load from such sessions when already heavily loaded with
buildworld/buildkernel or the like leads to other problems (and
denial of service consequences?). I do not expect that this
issue is all that likely but expectations are not evidence of
their own accuracy/inaccuracy.

===
Mark Millard
marklmi at yahoo.com
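
For reference, the random early drop rule in the sshd_config text quoted above can be sketched in Python. This is only a reading of the man page wording (refusal probability rate/100 at start, rising linearly to certainty at full), not a copy of sshd's own code; the names parse_max_startups and drop_probability are made up for illustration.

```python
def parse_max_startups(spec):
    """Split a 'start:rate:full' string into three integers."""
    start, rate, full = (int(part) for part in spec.split(":"))
    return start, rate, full


def drop_probability(unauthenticated, start, rate, full):
    """Refusal probability for a new connection, per the man page wording.

    Below 'start' nothing is refused; at 'full' or above everything is;
    in between the probability rises linearly from rate/100 to 1.
    """
    if unauthenticated < start:
        return 0.0
    if unauthenticated >= full:
        return 1.0
    base = rate / 100.0
    fraction = (unauthenticated - start) / (full - start)
    return base + (1.0 - base) * fraction


if __name__ == "__main__":
    start, rate, full = parse_max_startups("10:30:100")
    for count in (5, 10, 55, 100):
        print(count, drop_probability(count, start, rate, full))
```

With the default 10:30:100, this reading gives no refusals below 10 pending unauthenticated connections, 30% at 10, about 65% at 55, and all refused at 100. The "beginning MaxStartups throttling" message would then mean at least 10 unauthenticated connections were pending at that moment.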