From: Mateusz Guzik
Date: Sat, 3 Feb 2024 12:12:35 +0100
Subject: Re: libc/libsys split coming soon
To: David Chisnall
Cc: Brooks Davis, current@freebsd.org

On 2/3/24, David Chisnall wrote:
> On 3 Feb 2024, at 09:15, Mateusz Guzik wrote:
>>
>> Binary startup is very slow, for example execve of a hello world
>> binary in a Linux-based chroot on FreeBSD is faster by a factor of 2
>> compared to a native one. As such perf-wise this looks like a step in
>> the wrong direction.
>
> Have you profiled this? Is the Linux version using BIND_NOW (which comes
> with a load of problems, but is often the default for Linux systems and
> reduces the number of slow-path entries into rtld)? Do they trigger the
> same number of CoW faults? Is there a path in rtld that’s slower than the
> equivalent ld-linux.so path?
>

I only profiled FreeBSD, and that was 4 years ago. I have neither time
nor interest in working on this.

Relevant excerpts from profiling an fexecve loop:

Sampling which syscall was being executed when in kernel mode (or trap):

syscalls:
    pread                                108
    fstat                                162
    issetugid                            250
    sigprocmask                          303
    read                                 310
    mprotect                             341
    open                                 380
    close                               1547
    mmap                                2787
    trap                                5421

[snip]

In userspace most of the time is spent here:

    ld-elf.so.1`memset                   406
    ld-elf.so.1`matched_symbol           431
    ld-elf.so.1`strcmp                  1078
    ld-elf.so.1`reloc_non_plt           1102
    ld-elf.so.1`symlook_obj             1102
    ld-elf.so.1`find_symdef             1439

find_symdef iterates a linked list, which I presume induces strcmp calls
due to unwanted entries.
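To make that presumption concrete, here is a toy model of a lookup that walks a linked list of loaded objects and compares names with strcmp until one matches. It is only a sketch and not rtld's code: the real lookup hashes names per object, and struct sym, struct obj, lookup_linear and strcmp_calls are all names made up for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; these are not rtld's real structures. */
struct sym {
	const char *name;
	int value;
};

struct obj {
	const struct sym *syms;		/* symbols defined by this object */
	size_t nsyms;
	const struct obj *next;		/* next object in load order */
};

/* Number of name comparisons performed so far. */
static unsigned long strcmp_calls;

/*
 * Walk every object in list order and compare names until one matches.
 * Every entry visited before the match costs one strcmp call, so symbols
 * defined late in the list pay for all the unwanted entries ahead of them.
 */
static const struct sym *
lookup_linear(const struct obj *head, const char *name)
{
	for (const struct obj *o = head; o != NULL; o = o->next) {
		for (size_t i = 0; i < o->nsyms; i++) {
			strcmp_calls++;
			if (strcmp(o->syms[i].name, name) == 0)
				return (&o->syms[i]);
		}
	}
	return (NULL);
}
```

Calling lookup_linear() for a symbol that lives in the last object of the list bumps strcmp_calls once per entry in every earlier object, which is the kind of cost the strcmp line in the profile would be consistent with. Whether that is what actually happens in rtld would need to be checked against its source.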
[snip]

Full profile

user:
    libc.so.7`__je_extent_heap_new        71
    libc.so.7`__vdso_clock_gettime        73
    libc.so.7`memset                      75
    ld-elf.so.1`_rtld                     83
    ld-elf.so.1`getenv                    85
    libc.so.7`__je_malloc_mutex_boot     132
    ld-elf.so.1`reloc_plt                148
    ld-elf.so.1`__crt_malloc             163
    ld-elf.so.1`symlook_default          166
    ld-elf.so.1`digest_dynamic1          184
    libc.so.7`__je_malloc_mutex_init     205
    ld-elf.so.1`symlook_global           281
    ld-elf.so.1`memset                   406
    ld-elf.so.1`matched_symbol           431
    ld-elf.so.1`strcmp                  1078
    ld-elf.so.1`reloc_non_plt           1102
    ld-elf.so.1`symlook_obj             1102
    ld-elf.so.1`find_symdef             1439

kernel:
    kernel`vm_reserv_alloc_page           89
    kernel`amd64_syscall                  95
    kernel`0xffffffff80                  102
    kernel`vm_page_alloc_domain_after    114
    kernel`vm_object_deallocate          117
    kernel`vm_map_pmap_enter             122
    kernel`pmap_enter_object             140
    kernel`uma_zalloc_arg                148
    kernel`vm_map_lookup_entry           148
    kernel`pmap_try_insert_pv_entry      156
    kernel`vm_fault_dirty                168
    kernel`pagecopy                      177
    kernel`vm_fault                      260
    kernel`get_pv_entry                  265
    kernel`pagezero_erms                 367
    kernel`pmap_enter_quick_locked       380
    kernel`pmap_enter                    432
    kernel`0xffffffff80                 1126
    kernel`0xffffffff80                 2017
    kernel`trap                         2097

syscalls:
    pread                                108
    fstat                                162
    issetugid                            250
    sigprocmask                          303
    read                                 310
    mprotect                             341
    open                                 380
    close                               1547
    mmap                                2787
    trap                                5421

Counting fexecve:
dtrace -n 'fbt::sys_fexecve:entry { @[stack()] = count(); } tick-30s { exit(0); }'

dtrace script, can be run as:
dtrace -w -x aggsize=128M -s script.d

It assumes the binary name is a.out.

syscall::fexecve:entry
{
        self->inexec = 1;
}

syscall::fexecve:return
{
        self->inexec = 0;
}

fbt::trap:entry
{
        self->trap = 1;
}

fbt::trap:return
{
        self->trap = 0;
}

/* Userspace samples, taken outside of fexecve itself. */
profile:::profile-4999
/execname == "a.out" && arg1 && self->inexec == 0/
{
        @user[usym(arg1)] = count();
}

/* Kernel samples while servicing a syscall. */
profile:::profile-4999
/execname == "a.out" && arg0 && self->inexec == 0 && self->trap == 0/
{
        @kernel[sym(arg0)] = count();
        @kernel_syscall[stringof(`syscallnames[curthread->td_sa.code])] = count();
}

/* Kernel samples while handling a trap. */
profile:::profile-4999
/execname == "a.out" && arg0 && self->inexec == 0 && self->trap == 1/
{
        @kernel[sym(arg0)] = count();
        @kernel_syscall["trap"] = count();
}

tick-5s
{
        system("clear");
        trunc(@user, 20);
        trunc(@kernel, 20);
        printa("%40A %@16d\n", @user);
        printa("%40a %@16d\n", @kernel);
        clear(@user);
        clear(@kernel);
        trunc(@kernel_syscall, 10);
        printf("\t\t\tsyscalls:\n");
        printa("%40s %@16d\n", @kernel_syscall);
        clear(@kernel_syscall);
}

fexecve-loop.c:

/*
 * The header names were lost in the archived copy of this message;
 * the ones listed here are those the code below needs.
 */
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
        char buf[8];
        char *cargv[3];
        int fd;

        switch (argc) {
        case 1:
                /* First run: open ourselves and pass the fd number along. */
                fd = open(argv[0], O_EXEC);
                if (fd == -1)
                        err(1, "open");
                cargv[0] = argv[0];
                cargv[1] = buf;
                sprintf(cargv[1], "%d", fd);
                cargv[2] = NULL;
                break;
        case 2:
                /* Later runs: reuse the fd inherited across fexecve. */
                fd = atoi(argv[1]);
                cargv[0] = argv[0];
                cargv[1] = argv[1];
                cargv[2] = NULL;
                break;
        default:
                printf("invalid argc %d\n", argc);
                exit(1);
        }

        fexecve(fd, cargv, NULL);
        err(1, "fexecve");
}

-- 
Mateusz Guzik