Date:      Wed, 9 Oct 2024 19:21:02 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        Yuri <yuri@FreeBSD.org>, freebsd-hackers <freebsd-hackers@freebsd.org>
Subject:   RE: Why is the process gets killed because "a thread waited too long to allocate a page"?
Message-ID:  <CA420288-406A-4A96-BEF4-BEE48B1ABC1F@yahoo.com>
References:  <CA420288-406A-4A96-BEF4-BEE48B1ABC1F.ref@yahoo.com>

Yuri <yuri_at_FreeBSD.org> wrote on
Date: Wed, 09 Oct 2024 16:12:50 UTC :

> When I tried to build lang/rust in the 14i386 poudriere VM the compiler
> got killed with this message in the kernel log:
>
> > Oct  9 05:21:11 yv kernel: pid 35188 (rustc), jid 1129, uid 65534, was killed: a thread waited too long to allocate a page
>
> The same system has no problem building lang/rust in the 14amd64 VM.
>
> What does it mean "waited too long"? Why is the process killed when
> something is slow?
> Shouldn't it just wait instead?


If you want to allow it to potentially wait forever,
you can use:

sysctl vm.pfault_oom_attempts=-1

(or the analogous setting in the appropriate *.conf
files that would later be executed).

You might end up with deadlock/livelock/. . .
if you do so. (I've not analyzed the details.)


Details:

Looking around, sys/vm/vm_pageout.c has:

               case VM_OOM_MEM_PF:
                        reason = "a thread waited too long to allocate a page";
                        break;

# grep -r VM_OOM_MEM_PF /usr/main-src/sys/
/usr/main-src/sys/vm/vm_pageout.h:#define VM_OOM_MEM_PF 2
/usr/main-src/sys/vm/vm_fault.c: vm_pageout_oom(VM_OOM_MEM_PF);
/usr/main-src/sys/vm/vm_pageout.c: if (shortage == VM_OOM_MEM_PF &&
/usr/main-src/sys/vm/vm_pageout.c: if (shortage == VM_OOM_MEM || shortage == VM_OOM_MEM_PF)
/usr/main-src/sys/vm/vm_pageout.c: case VM_OOM_MEM_PF:

sys/vm/vm_fault.c :
(NOTE: official code has its variant of the printf under a
"if (bootverbose)" but I locally remove that conditional.)

/*
 * Initiate page fault after timeout.  Returns true if caller should
 * do vm_waitpfault() after the call.
 */
static bool
vm_fault_allocate_oom(struct faultstate *fs)
{
        struct timeval now;

        vm_fault_unlock_and_deallocate(fs);
        if (vm_pfault_oom_attempts < 0)
                return (true);
        if (!fs->oom_started) {
                fs->oom_started = true;
                getmicrotime(&fs->oom_start_time);
                return (true);
        }

        getmicrotime(&now);
        timevalsub(&now, &fs->oom_start_time);
        if (now.tv_sec < vm_pfault_oom_attempts * vm_pfault_oom_wait)
                return (true);

        printf("vm_fault_allocate_oom: proc %d (%s) failed to alloc page on fault, starting OOM\n",
                curproc->p_pid, curproc->p_comm);

        vm_pageout_oom(VM_OOM_MEM_PF);
        fs->oom_started = false;
        return (false);
}
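
The decision made by the function above can be modeled in user space.
The following is only a sketch, not kernel code: the names
pf_oom_model and pf_oom_should_wait are hypothetical, the clock is
reduced to whole seconds passed in by the caller, and the unlock,
printf and vm_pageout_oom() side effects are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the state kept in struct faultstate. */
struct pf_oom_model {
        bool started;   /* models fs->oom_started */
        long start_sec; /* models fs->oom_start_time, whole seconds only */
};

/*
 * Returns true if the faulting thread should wait and retry,
 * false if the model would start an OOM kill at this point.
 */
bool
pf_oom_should_wait(struct pf_oom_model *m, long now_sec, int attempts,
    int wait_secs)
{
        assert(wait_secs >= 0);

        /* A negative attempts value means: always wait and retry. */
        if (attempts < 0)
                return (true);

        /* First failed allocation: start the timer and wait. */
        if (!m->started) {
                m->started = true;
                m->start_sec = now_sec;
                return (true);
        }

        /* Still inside the attempts * wait budget: wait again. */
        if (now_sec - m->start_sec < (long)attempts * wait_secs)
                return (true);

        /* Budget used up: reset the state and report "start OOM". */
        m->started = false;
        return (false);
}
```

With attempts 3 and wait 10, a model whose timer started at second
100 still answers "wait" at second 129 and answers "start OOM" at
second 130. With attempts -1 it answers "wait" no matter how much
time has passed.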


This is associated with vm.pfault_oom_attempts and
vm.pfault_oom_wait . An old comment in my
/boot/loader.conf is:

#
# For possibly insufficient swap/paging space
# (might run out), increase the pageout delay
# that leads to Out Of Memory killing of
# processes (showing defaults at the time):
#vm.pfault_oom_attempts= 3
#vm.pfault_oom_wait= 10
# (The multiplication is the total but there
# are other potential tradeoffs in the factors
# multiplied, even for nearly the same total.)

(Note: the "tradeoffs" is associated with:
sys/vm/vm_fault.c: vm_waitpfault(dset, vm_pfault_oom_wait * hz);
)
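
One way to look at that tradeoff is a worst-case model in which
every sleep runs for its full vm_pfault_oom_wait timeout and no time
passes outside the sleeps. Both are assumptions made here for
illustration; a real vm_waitpfault() sleep may end earlier. The
function name pf_oom_count_sleeps is hypothetical.

```c
#include <assert.h>

/*
 * Hypothetical worst-case model: each sleep lasts exactly wait_secs.
 * Returns how many sleeps happen before the model would start OOM
 * and stores the total elapsed seconds in *elapsed_out.
 */
int
pf_oom_count_sleeps(int attempts, int wait_secs, long *elapsed_out)
{
        long start = 0, now = 0;
        int sleeps = 0;

        assert(attempts >= 0 && wait_secs > 0);

        /* First failed allocation: the timer starts, then one sleep. */
        now += wait_secs;
        sleeps++;

        /* Later failures: sleep again while under the time budget. */
        while (now - start < (long)attempts * wait_secs) {
                now += wait_secs;
                sleeps++;
        }

        *elapsed_out = now - start;
        return (sleeps);
}
```

Under these assumptions 3 attempts of 10 seconds and 10 attempts of
3 seconds both reach 30 seconds, but the first does so in 3 long
sleeps and the second in 10 short ones, so the two settings differ
in how often the fault is retried.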

sys/vm/vm_pageout.c :

void
vm_pageout_oom(int shortage)
{
        const char *reason;
        struct proc *p, *bigproc;
        vm_offset_t size, bigsize;
        struct thread *td;
        struct vmspace *vm;
        int now;
        bool breakout;

        /*
         * For OOM requests originating from vm_fault(), there is a high
         * chance that a single large process faults simultaneously in
         * several threads.  Also, on an active system running many
         * processes of middle-size, like buildworld, all of them
         * could fault almost simultaneously as well.
         *
         * To avoid killing too many processes, rate-limit OOMs
         * initiated by vm_fault() time-outs on the waits for free
         * pages.
         */
        mtx_lock(&vm_oom_ratelim_mtx);
        now = ticks;
        if (shortage == VM_OOM_MEM_PF &&
            (u_int)(now - vm_oom_ratelim_last) < hz * vm_oom_pf_secs) {
                mtx_unlock(&vm_oom_ratelim_mtx);
                return;
        }
        vm_oom_ratelim_last = now;
        mtx_unlock(&vm_oom_ratelim_mtx);
. . .
                size = vmspace_swap_count(vm);
                if (shortage == VM_OOM_MEM || shortage == VM_OOM_MEM_PF)
                        size += vm_pageout_oom_pagecount(vm);
. . .
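
The rate limit at the top of vm_pageout_oom() can be sketched in
user space as below. This is an illustration only: the names
oom_ratelim and oom_ratelim_allow are hypothetical, the mutex is
omitted, and only the VM_OOM_MEM_PF case is modeled (in the excerpt,
other shortage values skip the check but still update the
timestamp).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for vm_oom_ratelim_last. */
struct oom_ratelim {
        int last;       /* tick count of the last pass that was allowed */
};

/*
 * Returns true if a page-fault OOM pass may proceed at tick `now`,
 * false if fewer than hz * pf_secs ticks have passed since the last one.
 */
bool
oom_ratelim_allow(struct oom_ratelim *rl, int now, int hz, int pf_secs)
{
        unsigned int delta;

        assert(hz > 0 && pf_secs >= 0);

        /*
         * Subtract as unsigned so that a wrapped tick counter still
         * gives the distance between the two readings.
         */
        delta = (unsigned int)now - (unsigned int)rl->last;
        if (delta < (unsigned int)hz * (unsigned int)pf_secs)
                return (false);

        rl->last = now;
        return (true);
}
```

With hz 1000 and an illustrative pf_secs of 10 (an arbitrary value
chosen here, not taken from the source), a request 5000 ticks after
the last allowed pass is dropped and one 10000 ticks after it goes
through.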

This looks like time-based retrying: the fault
handler gives up after about
vm.pfault_oom_attempts * vm.pfault_oom_wait
seconds overall, which avoids potentially waiting
forever when 0 <= vm.pfault_oom_attempts .


===
Mark Millard
marklmi at yahoo.com



