Date: Mon, 10 Jul 2023 02:59:35 +0300
From: Konstantin Belousov
To: John F Carr
Cc: Current FreeBSD
Subject: Re: shell hung in fork system call
References: <909E2C96-3BFA-41AD-8EE7-0902231C2B95@mit.edu> <52A8F775-17D9-4240-A444-98AD5339622F@mit.edu>
In-Reply-To: <52A8F775-17D9-4240-A444-98AD5339622F@mit.edu>
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

On Sun, Jul 09, 2023 at 11:36:03PM +0000, John F Carr wrote:
>
>
> > On Jul 9, 2023, at 19:25, Konstantin Belousov wrote:
> >
> > On Sun, Jul 09, 2023 at 10:41:27PM +0000, John F Carr wrote:
> >> Kernel and system at a146207d66f320ed239c1059de9df854b66b55b7 plus some irrelevant local changes, four 64 bit ARM processors, make.conf sets CPUTYPE?=cortex-a57.
> >>
> >> I typed ^C while /bin/sh was starting a pipeline and my shell got hung in the middle of fork().
> >>
> >> From the terminal:
> >>
> >> # git log --oneline --|more
> >> ^C^C^C
> >> load: 3.26 cmd: sh 95505 [fork] 5308.67r 0.00u 0.03s 0% 2860k
> >> mi_switch+0x198 sleepq_switch+0xfc sleepq_timedwait+0x40 _sleep+0x264 fork1+0x67c sys_fork+0x34 do_el0_sync+0x4c8 handle_el0_sync+0x44
> >> load: 3.16 cmd: sh 95505 [fork] 5311.75r 0.00u 0.03s 0% 2860k
> >> mi_switch+0x198 sleepq_switch+0xfc sleepq_timedwait+0x40 _sleep+0x264 fork1+0x67c sys_fork+0x34 do_el0_sync+0x4c8 handle_el0_sync+0x44
> >>
> >> According to ps -d on another terminal the shell has no children:
> >>
> >>   PID TT STAT    TIME COMMAND
> >> [...]
> >>   873 u0 IWs  0:00.00 `-- login [pam] (login)
> >>   874 u0 I    0:00.17   `-- -sh (sh)
> >> 95504 u0 I    0:00.01     `-- su -
> >> 95505 u0 D+   0:00.05       `-- -su (sh)
> >> [...]
> >>
> >> Nothing on the (115200 bps serial) console. No change in system performance.
> >>
> >> The system is busy copying a large amount of data from the network to a ZFS pool on spinning disks. The git|more pipeline could have taken some time to get going while I/O requests worked their way through the queue. It would not have touched the busy pool, only the zroot pool on an SSD.
> >>
> >> Has anything changed recently that might cause this?
> >
> > There was some change around fork, but your sleep seems to be not from
> > that change. Can you show the wait channel for the process? Do something
> > like
> > $ ps alxww
> >
>
> UID   PID  PPID C PRI NI   VSZ  RSS MWCHAN STAT TT    TIME COMMAND
>   0 95505 95504 2  20  0 13508 2876 fork   D+   u0 0:00.13 -su (sh)
>
> This is probably the same information displayed as [fork] in the output from ^T.
>
> Does it correspond to the source line
>
>     pause("fork", hz / 2);
>
> ?
Yes, it is rate-limiting code.

Still it is interesting to see the whole ps output.
Do you have 7a70f17ac4bd64dc1a5020f in your source?
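
For readers following the thread, the idea of a rate-limiting pause such as pause("fork", hz / 2) can be sketched in userland C. This is only an illustrative analogue under stated assumptions, not the kernel code: hz is the kernel's ticks-per-second value, so hz / 2 ticks is about half a second. The HZ value of 1000 and the helper names ticks_to_timespec and fail_with_delay are hypothetical and do not appear in the original message.

```c
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Assumed tick rate for illustration only; the kernel's hz is tunable. */
#define HZ 1000

/* Convert a non-negative tick count to a timespec (hypothetical helper). */
static struct timespec
ticks_to_timespec(long ticks)
{
	struct timespec ts;

	assert(ticks >= 0);
	ts.tv_sec = ticks / HZ;
	ts.tv_nsec = (ticks % HZ) * (1000000000L / HZ);
	return (ts);
}

/*
 * Hypothetical userland analogue of a rate-limited error return:
 * sleep for HZ / 2 ticks before handing the error back, so that a
 * caller retrying in a loop cannot spin at full speed.
 */
static int
fail_with_delay(int error)
{
	struct timespec delay = ticks_to_timespec(HZ / 2);

	/* An early wakeup by a signal is ignored in this sketch. */
	(void)nanosleep(&delay, NULL);
	return (error);
}
```

Calling fail_with_delay(EAGAIN) returns EAGAIN after roughly half a second. A process that keeps re-entering such a path would show up as sleeping on the same wait channel each time it is sampled.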