Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 4 Dec 2019 13:48:05 +0100
From:      Peter Eriksson <pen@lysator.liu.se>
To:        FreeBSD FS <freebsd-fs@freebsd.org>
Cc:        Andriy Gapon <avg@freebsd.org>
Subject:   Re: Slow reboots due to ZFS cleanup in kern_shutdown() .. zio_fini()
Message-ID:  <1119144A-C6DA-4CBF-8616-108FB50D2F8D@lysator.liu.se>
In-Reply-To: <BDE8C3F8-7D55-4C4F-8E40-0DF2FD732D5A@lysator.liu.se>
References:  <AD17E454-6A51-436D-A853-07F04A406EC9@lysator.liu.se> <D2A11CE9-9B24-4E40-A51A-8D318E0288C9@lysator.liu.se> <20191202225424.GG43802@raichu> <3b71fe37-c29f-e3e5-ff96-5dce15cc7553@FreeBSD.org> <20191203162219.GI43802@raichu> <247C6990-BC6E-4E3D-8CEF-5A861D8A25EC@lysator.liu.se> <4D44B25E-2F15-4536-A653-DA242051C8A9@lysator.liu.se> <e15d66e2-dde4-aaff-87fb-cacd15c1e719@FreeBSD.org> <BDE8C3F8-7D55-4C4F-8E40-0DF2FD732D5A@lysator.liu.se>

next in thread | previous in thread | raw e-mail | index | archive | help
Going deeper=E2=80=A6 now it=E2=80=99s pointing towards pmap_remove(), =
called from kmem_unback(), which is called from kmem_free(), which is =
called from page_free() which is called from keg_free_slab()...


kmem_unback: pmap_remove(kernel_pmap, 18446741884786954240, =
18446741884786970624) took 1 seconds
kmem_free: kmem_unback(kmem_object, 18446741884786954240, 16384) took 1 =
seconds
page_free: kmem_free(kmem_arena, 18446741884786954240, 16384) took 1 =
seconds
keg_free_slab: keg->uk_freef(mem) {page_free} took 1 seconds

kmem_unback: pmap_remove(kernel_pmap, 18446741884989739008, =
18446741884989755392) took 1 seconds
kmem_free: kmem_unback(kmem_object, 18446741884989739008, 16384) took 1 =
seconds
page_free: kmem_free(kmem_arena, 18446741884989739008, 16384) took 1 =
seconds
keg_free_slab: keg->uk_freef(mem) {page_free} took 1 seconds

kmem_unback: pmap_remove(kernel_pmap, 18446741885257895936, =
18446741885257912320) took 1 seconds
kmem_free: kmem_unback(kmem_object, 18446741885257895936, 16384) took 1 =
seconds
page_free: kmem_free(kmem_arena, 18446741885257895936, 16384) took 1 =
seconds
keg_free_slab: keg->uk_freef(mem) {page_free} took 1 seconds


- Peter

> On 4 Dec 2019, at 11:13, Peter Eriksson <pen@lysator.liu.se> wrote:
>=20
> Yeah, I noticed that after I sent that email (should send stuff just =
before you go to bed :-)
>=20
> Anyway, some more timing/debug printfs point to the keg->uk_freef(mem) =
call in keg_free_slab() which sometimes take about a second to execute.
> (Should probably switch to some other time source with greater =
resolution than 1s next :-)
>=20
> keg_free_slab: keg->uk_freef(mem) took 1 seconds
> keg_free_slab: keg->uk_freef(mem) took 1 seconds
> keg_free_slab: keg->uk_freef(mem) took 1 seconds
> keg_free_slab: keg->uk_freef(mem) took 1 seconds
> keg_free_slab: keg->uk_freef(mem) took 1 seconds
> keg_free_slab: keg->uk_freef(mem) took 1 seconds
> keg_free_slab: keg->uk_freef(mem) took 1 seconds
>=20
> - Peter
>=20
>> On 4 Dec 2019, at 10:48, Andriy Gapon <avg@FreeBSD.org> wrote:
>>=20
>> On 04/12/2019 00:28, Peter Eriksson wrote:
>>> =E2=80=A6 and X levels deeper it gives this (not all filesystems =
mounted - I rebooted fairly quickly):
>>>=20
>>> keg_drain: while (SLIST_FIRST/SLIST_REMOVE)-loop took 14 seconds =
[20021 loops]
>>> zone_drain_wait(): zone_foreach_keg(zone, &keg_drain) took 14 =
seconds
>>> zone_dtor(): zone_drain_wait(zone, M_WAITOK) took 14 seconds
>>> zone_free_item(zone=3DUMA Zones): zone->uz_dtor() took 14 seconds
>>> uma_zdestroy(zio_buf_12288) took 14 seconds
>>> kmem_cache_destroy: uma_zdestroy(0xfffff803467c8ac0) [zio_buf_12288] =
took 14 seconds
>>> kmem_cache_destroy(zio_buf_cache[20]) took 14 seconds
>>>=20
>>>=20
>>> Hmm=E2=80=A6 I wonder if it isn=E2=80=99t this code I =
/usr/src/sys/sys/queue.h:
>>>=20
>>>> #define SLIST_REMOVE(head, elm, type, field) do {                   =
    \
>>>>      QMD_SAVELINK(oldnext, (elm)->field.sle_next);                  =
 \
>>>>      if (SLIST_FIRST((head)) =3D=3D (elm)) {                        =
     \
>>>>              SLIST_REMOVE_HEAD((head), field);                      =
 \
>>>>      }                                                              =
 \
>>>>      else {                                                         =
 \
>>>>              QUEUE_TYPEOF(type) *curelm =3D SLIST_FIRST(head);      =
   \
>>>>              while (SLIST_NEXT(curelm, field) !=3D (elm))           =
   \
>>>>                      curelm =3D SLIST_NEXT(curelm, field);          =
   \
>>>>              SLIST_REMOVE_AFTER(curelm, field);                     =
 \
>>>>      }                                                              =
 \
>>>>      TRASHIT(*oldnext);                                             =
 \
>>>> } while (0)
>>>=20
>>> Combined with this in /usr/src/sys/vm/uma_core.c: keg_drain():
>>>=20
>>>> finished:
>>>>      KEG_UNLOCK(keg);
>>>>=20
>>>>      while ((slab =3D SLIST_FIRST(&freeslabs)) !=3D NULL) {
>>>>              SLIST_REMOVE(&freeslabs, slab, uma_slab, us_hlink);
>>>>              keg_free_slab(keg, slab, keg->uk_ipers);
>>>>      }
>>>=20
>>> (The keg_drain print above is the time for the while() loop).
>>>=20
>>> If I=E2=80=99m reading this right it looks like it=E2=80=99s =
potentially doing a linear search through a (long) linked list, for =
every entry in the free slabs list. 20000 entries in 14 seconds is 1400 =
entries per second.
>>=20
>> It's not a search.  The loop is removing *all* items.  There is no =
other way to
>> do it :-)
>> Expanding your calculations, it takes on the order of a millisecond =
to process a
>> single slab.  I bet that the time is spent in keg_free_slab() and =
it's a lot of
>> time.
>>=20
>> I wonder if the OR should actually be AND in this condition:
>>               if (!uma_dbg_kskip(keg, slab->us_data + (keg->uk_rsize =
* i)) ||
>>                   keg->uk_fini !=3D trash_fini)
>> Looks like there is a "ton" of memory reading done here.
>>=20
>> --=20
>> Andriy Gapon
>=20
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?1119144A-C6DA-4CBF-8616-108FB50D2F8D>