Date: Fri, 29 Nov 2019 23:10:36 +0100 From: Peter Eriksson <pen@lysator.liu.se> To: freebsd-current@freebsd.org Cc: Enji Cooper <yaneurabeya@gmail.com> Subject: Re: Best way to print something from the kernel at 1s intervals? Message-ID: <057870D3-B59E-47FB-A832-A72DC6F54AFA@lysator.liu.se> In-Reply-To: <DA769610-A087-40C4-B903-ACF34388CDE8@gmail.com> References: <E6B8A0DA-5F4D-4B08-A376-0AE65575A6F9@lysator.liu.se> <DA769610-A087-40C4-B903-ACF34388CDE8@gmail.com>
next in thread | previous in thread | raw e-mail | index | archive | help
I love dtrace, but I seriously doubt that it could be used it for this. =
This is the absolutely last code that executes at kernel =
shutdown/reboot. All other processes are terminated when this is =
happening...
Basically the code done in kern_reboot() (and stuff it calls) in =
/usr/src/sys/kern/kern_shutdown.c
This code in kern_reboot():
...
> EVENTHANDLER_INVOKE(shutdown_pre_sync, howto);
>
> /*
> * Now sync filesystems
> */
> if (!cold && (howto & RB_NOSYNC) =3D=3D 0 && once =3D=3D 0) {
> once =3D 1;
> bufshutdown(show_busybufs);
> }
>
> print_uptime();
>
> cngrab();
>
> /*
> * Ok, now do things that assume all filesystem activity has
> * been completed.
> */
> EVENTHANDLER_INVOKE(shutdown_post_sync, howto);
>
> if ((howto & (RB_HALT|RB_DUMP)) =3D=3D RB_DUMP && !cold && =
!dumping)
> doadump(TRUE);
>
> /* Now that we're going to really halt the system... */
> EVENTHANDLER_INVOKE(shutdown_final, howto);
What I=E2=80=99ve seen is that the three different EVENTHANDLER_INVOKE() =
all can cause a lot of calls that can take a long time. As can =
bufshutdown(), probably when it calls vfs_unmountall() to unmount all =
filesystems.
We=E2=80=99re talking like 5-20 minutes (or more). I=E2=80=99ve not =
really timed it.
I have a feeling it sometimes is due to transactions stored up in the =
ZFS filesystems that the system tries to =E2=80=9Crun=E2=80=9D so it =
doesn=E2=80=99t have to do it at mount() time after the reboot. If I =
reset the machine while it is busy doing this then sometimes the =E2=80=9C=
zfs mount -a=E2=80=9D takes very long time at boot instead.
Hmm.. I should be able to use the getnanouptime() call to get a =
=E2=80=9Cclock=E2=80=9D to look for (it=E2=80=99s used in =
print_uptime()). As long as the clock isn=E2=80=99t stopped at this time =
in the shutdown sequence atleast :-)
*Time to write some code and test this* :-)
- Peter
> On 29 Nov 2019, at 22:09, Enji Cooper <yaneurabeya@gmail.com> wrote:
>=20
>=20
>> On Nov 28, 2019, at 12:52, Peter Eriksson <pen@lysator.liu.se> wrote:
>>=20
>> =EF=BB=BFI=E2=80=99ve been looking into the =E2=80=9Ckernel looks to =
be hung at reboot=E2=80=9D problem at bit. Adding a lot of printf() =
calls to the relevant parts it looks like it actually isn=E2=80=99t hung =
but busy unmounting filesystems (which we have thousands of), flushing =
disk caches, calling registered callbacks and stuff and sometimes it =
takes a bit longer than usual - probably due to ZFS having stuff queued =
up that needs to be written to disk before it finishes=E2=80=A6
>>=20
>> Anyway, I=E2=80=99d like to propose that we add some kind of =
counter/printf() calls in that code area so we can see that things are =
indeed progressing. However, I=E2=80=99d probably prefer not to print =
_every_ filesystem (or registered callbacks - a lot of those...) - that =
generates a lot of output (tried that :-) but something like:
>>=20
>> Unmounting filesystems:
>> 1234 done
>> (With the "1234 done=E2=80=9D updated something like once per =
second).
>>=20
>> What=E2=80=99s the right/best way to do that from the kernel? In user =
space I=E2=80=99d just call time(&t) at some convenient points and only =
print something if =E2=80=9Ct=E2=80=9D has changed. :-)
>=20
> Can you use DTrace instead to get the information you need?
> Cheers,
> -Enji
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?057870D3-B59E-47FB-A832-A72DC6F54AFA>
