Date:      Wed, 5 Jul 2023 23:42:08 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        Current FreeBSD <freebsd-current@freebsd.org>, freebsd-arm <freebsd-arm@freebsd.org>
Subject:   Re: For snapshot builds: armv7 chroot on aarch64 has kyua test -k /usr/tests/Kyuafile sys/kern/kern_copyin hung up [in getpid?], unkillable, prevents reboot
Message-ID:  <C3952A9A-E21B-41C1-9BB1-68D189083F5D@yahoo.com>
In-Reply-To: <7A41DED4-876F-4270-A980-549A4832B39A@yahoo.com>
References:  <7A41DED4-876F-4270-A980-549A4832B39A@yahoo.com>

On Jun 25, 2023, at 17:16, Mark Millard <marklmi@yahoo.com> wrote:

> Using the likes of:
>
> FreeBSD-14.0-CURRENT-arm64-aarch64-ROCK64-20230622-b95d2237af40-263748.img
> and:
> FreeBSD-14.0-CURRENT-arm-armv7-GENERICSD-20230622-b95d2237af40-263748.img
>
> I have shown the following behavior after setting up storage
> media based on them. (This was a check that the issue is not
> specific to my own builds.)
>
> Boot the aarch64 media and log in. (Note: I logged in
> as root.)
>
> Mount the armv7 media (-onoatime is just my habit)
> and then put it to use:
>
> # mount -onoatime /dev/da1s2a /mnt
>
> # chroot /mnt/
>
> # kyua test -k /usr/tests/Kyuafile sys/kern/kern_copyin
> sys/kern/kern_copyin:kern_copyin  ->
>
> On the serial console:
>
> # ps -xu
> USER  PID   %CPU %MEM   VSZ  RSS TT  STAT STARTED      TIME COMMAND
> root   11 1498.4  0.0     0  256  -  RNL  23:24   542:52.92 [idle]
> root 1174  100.0  0.0     0   16  -  Rs   23:37     0:00.00 /usr/tests/sys/kern/kern_copyin -vunprivileged-user=tests -r/tmp/kyua.9YUttj/2/result.atf kern_copyin
> root    0    0.0  0.0     0 1616  -  DLs  23:24     0:00.50 [kernel]
> root    1    0.0  0.0 11704 1288  -  ILs  23:24     0:00.02 /sbin/init
> root    2    0.0  0.0     0  256  -  WL   23:24     0:00.26 [clock]
> root    3    0.0  0.0     0  272  -  DL   23:24     0:00.00 [crypto]
> root    4    0.0  0.0     0   80  -  DL   23:24     0:00.95 [cam]
> root    5    0.0  0.0     0   16  -  DL   23:24     0:00.00 [busdma]
> root    6    0.0  0.0     0   16  -  DL   23:24     0:00.03 [rand_harvestq]
> root    7    0.0  0.0     0   48  -  DL   23:24     0:00.06 [pagedaemon]
> root    8    0.0  0.0     0   16  -  DL   23:24     0:00.00 [vmdaemon]
> root    9    0.0  0.0     0  160  -  DL   23:24     0:00.38 [bufdaemon]
> root   10    0.0  0.0     0   16  -  DL   23:24     0:00.00 [audit]
> root   12    0.0  0.0     0  880  -  WL   23:24     0:11.81 [intr]
> root   13    0.0  0.0     0   48  -  DL   23:24     0:00.04 [geom]
> root   14    0.0  0.0     0   16  -  DL   23:24     0:00.00 [sequencer 00]
> root   15    0.0  0.0     0  160  -  DL   23:24     0:06.42 [usb]
> root   16    0.0  0.0     0   16  -  DL   23:24     0:00.10 [acpi_thermal]
> root   17    0.0  0.0     0   16  -  DL   23:24     0:00.00 [acpi_cooling0]
> root   18    0.0  0.0     0   16  -  DL   23:24     0:00.04 [syncer]
> root   19    0.0  0.0     0   16  -  DL   23:24     0:00.00 [vnlru]
> root  671    0.0  0.0 13260 2600  -  Is   23:25     0:00.00 dhclient: system.syslog (dhclient)
> root  674    0.0  0.0 13260 2752  -  Is   23:25     0:00.00 dhclient: dpni0 [priv] (dhclient)
> root  761    0.0  0.0 14572 3972  -  Ss   23:25     0:00.02 /sbin/devd
> root  964    0.0  0.0 12832 2764  -  Is   23:25     0:00.02 /usr/sbin/syslogd -s
> root 1033    0.0  0.0 13012 2604  -  Ss   23:25     0:00.01 /usr/sbin/cron -s
> root 1058    0.0  0.0 21052 8308  -  Is   23:25     0:00.01 sshd: /usr/sbin/sshd [listener] 0 of 10-100 startups (sshd)
> root 1078    0.0  0.0 21288 9304  -  Is   23:26     0:00.09 sshd: root@pts/0 (sshd)
> root 1175    0.0  0.0 21288 9496  -  Is   23:37     0:00.04 sshd: root@pts/1 (sshd)
> root 1074    0.0  0.0 13380 3008 u0  Is   23:25     0:00.01 login [pam] (login)
> root 1075    0.0  0.0 13460 3292 u0  S    23:25     0:00.02 -sh (sh)
> root 1233    0.0  0.0 13588 3016 u0  R+   00:00     0:00.00 ps -xu
> root 1081    0.0  0.0 13460 3328  0  Is   23:26     0:00.02 -sh (sh)
> root 1170    0.0  0.0  5788 2884  0  I    23:36     0:00.02 /bin/sh -i
> root 1172    0.0  0.0 10408 7192  0  I+   23:37     0:00.30 kyua test -k /usr/tests/Kyuafile sys/kern/kern_copyin
> root 1178    0.0  0.0 13460 3320  1  Is+  23:38     0:00.01 -sh (sh)
>
> 1174 is stuck, even if one waits for 30min+.
> kill and kill -9 will not kill 1174.
>
> "shutdown -r now" hangs before the reboot happens
> and reports: "some processes would not die".
>
> An interesting property is that ps and top disagree
> about 1174's CPU usage: ps shows 100%, top shows 0%. But
> top also always shows CPU0 as the "STATE" for 1174.
> (Across tests the CPUn varies, but within a test n is
> fixed.)
>
> I have also seen the ps "STAT" being RXs.
>
> The following is from my earlier activity with my own
> builds involved, so the PID here is 1119, not the 1174
> from above. truss reports "getpid()" as the last thing
> for the stuck process.
>
> . . .
> 1119: 0.588983953 fstatat(AT_FDCWD,"/usr/tests/sys/kern/kern_copyin",{ mode=-r-xr-xr-x ,inode=111756,size=9776,blksize=10240 },AT_SYMLINK_NOFOLLOW) = 0 (0x0)
> 1119: 0.589065030 mmap(0x0,20480,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON|MAP_ALIGNED(12),-1,0x0) = 1074188288 (0x4006d000)
> 1119: 0.589227544 openat(AT_FDCWD,"/tmp/kyua.aBQv6E/2/result.atf",O_WRONLY|O_CREAT|O_TRUNC,0644) = 3 (0x3)
> 1119: 0.589276503 getpid()                      = 1119 (0x45f)
>
> For reference, from inside an armv7 chroot session
> before doing such a test:
>
> # uname -apKU
> FreeBSD generic 14.0-CURRENT FreeBSD 14.0-CURRENT #0 main-n263748-b95d2237af40: Thu Jun 22 11:10:50 UTC 2023     root@releng1.nyi.freebsd.org:/usr/obj/usr/src/arm64.aarch64/sys/GENERIC arm armv7 1400090 1400090
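
For anyone trying to spot the same condition: in the quoted ps output the stuck process is in state R at 100% CPU but has accumulated no TIME. A rough filter for that combination might look like the following. This is only a heuristic sketch, not a diagnosis. The function name find_suspect is made up for illustration, and it assumes the "ps -xu" column layout shown in the quoted output (PID in field 2, %CPU in field 3, STAT in field 8, TIME in field 10).

```shell
#!/bin/sh
# Heuristic sketch (hypothetical helper, not part of any FreeBSD tool):
# read `ps -xu` output on stdin and print the PIDs of processes that are
# runnable (STAT starts with R), report >= 100 %CPU, and yet show no
# accumulated CPU time. Assumes the column layout quoted in this thread.
find_suspect() {
    awk 'NR > 1 && $8 ~ /^R/ && $3 + 0 >= 100 && $10 == "0:00.00" { print $2 }'
}
```

Usage would be along the lines of "ps -xu | find_suspect". The [idle] entry is not matched because its TIME is nonzero, and the ps process itself is not matched because its %CPU is 0.0.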

I've replicated the same sort of hangup based on:

aarch64 (booted):
# uname -apKU
FreeBSD CA72-16Gp-ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #0 n263893-0631830a7a3c-dirty: Wed Jul  5 13:54:15 PDT 2023     root@CA72-16Gp-ZFS:/usr/obj/BUILDs/alt-main-CA72-nodbg-clang-alt/usr/alt-main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm64 aarch64 1400092 1400092

armv7 (as seen from inside the chroot):
# uname -apKU
FreeBSD CA72-16Gp-ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #0 n263893-0631830a7a3c-dirty: Wed Jul  5 13:54:15 PDT 2023     root@CA72-16Gp-ZFS:/usr/obj/BUILDs/alt-main-CA72-nodbg-clang-alt/usr/alt-main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm armv7 1400092 1400092
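
The two uname outputs differ only in their trailing fields, which is how one can tell an armv7 chroot on an aarch64 kernel from the native environment. As a minimal sketch, the helpers below classify a "uname -apKU" line. The names uname_tail and classify_env are hypothetical. The sketch assumes that the last four fields are machine, processor, kernel version and userland version, as in the output above, and that the kernel build path contains "arm64.aarch64/sys/", which need not hold for other build setups.

```shell
#!/bin/sh
# Sketch under stated assumptions (hypothetical helpers):
# uname_tail prints the last four whitespace-separated fields of a
# `uname -apKU` line: machine, processor, kernel version, userland version.
uname_tail() {
    printf '%s\n' "$1" | awk '{ print $(NF-3), $(NF-2), $(NF-1), $NF }'
}

# classify_env reports whether the line looks like an armv7 userland
# running on an aarch64 kernel, a native aarch64 system, or something else.
classify_env() {
    line=$1
    # Split the trailing fields into positional parameters.
    set -- $(uname_tail "$line")
    machine=$1
    processor=$2
    # Assumption: an aarch64 kernel shows arm64.aarch64/sys/ in its build path.
    case "$line" in
        *arm64.aarch64/sys/*) kernel64=yes ;;
        *) kernel64=no ;;
    esac
    if [ "$kernel64" = yes ] && [ "$processor" = armv7 ]; then
        echo "armv7 userland on aarch64 kernel"
    elif [ "$processor" = aarch64 ]; then
        echo "native aarch64"
    else
        echo "other: $machine/$processor"
    fi
}
```

On a FreeBSD system this could be run as: classify_env "$(uname -apKU)". Checking the build path is a crude stand-in for asking the kernel directly, so treat the result as a hint only.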



===
Mark Millard
marklmi at yahoo.com



