List-Id: Technical discussions relating to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers
From: Alan Somers
Date: Mon, 16 Aug 2021 19:38:56 -0600
Subject: Re: sysctl is too slow
To: Mateusz Guzik
Cc: FreeBSD Hackers

Actually, I did get a flamegraph, and only 0.77% of samples were in ZFS.

On Mon, Aug 16, 2021 at 7:19 PM Mateusz Guzik wrote:

> On 8/16/21, Alan Somers wrote:
> > Yes, I see what you're talking about now. There are a bunch of linked
> > lists in sysctl_find_oid etc. Good point.
> > -Alan
> >
>
> You still want to get a flamegraph, chances are most of the problem is
> in zfs.
>
> > On Mon, Aug 16, 2021 at 1:30 PM Mateusz Guzik wrote:
> >
> >> Last time I checked lookup of a sysctl was very bad with linear
> >> scans all over.
> >>
> >> Short of complete revamp of the entire thing I would start with
> >> replacing the scans with a RB tree at each level. As is if you indeed
> >> have 5000 datasets, you are doing increasingly longer walks.
> >>
> >> On 8/16/21, Alan Somers wrote:
> >> > ztop feels very sluggish on a server with 5000 ZFS datasets. Dtrace
> >> > shows that almost all of its time is spent in sys_sysctl. ktrace
> >> > shows that both ztop and sysctl(8) call sys_sysctl a total of five
> >> > times for each sysctl they care about:
> >> >
> >> > 1) To get the next oid
> >> > 2) To get the sysctl's name
> >> > 3) To get the oidfmt
> >> > 4) To get the size of the value
> >> > 5) To get the value itself.
> >> >
> >> > Each of these steps takes about equal time, and together all five
> >> > take about 100us.
> >> > If the time per call is mostly syscall overhead, then the process
> >> > could be sped up by 80% by combining all of these things into a
> >> > single syscall: return the next oid, its name, its format, the size
> >> > of its value, and optimistically the value itself, assuming the
> >> > user passed a sufficiently large buffer.
> >> >
> >> > Am I missing something? Is there any other reason why sysctl is so
> >> > slow? Or should I forget about it, and try to export ZFS's dataset
> >> > stats through devstat instead?
> >> > -Alan
> >> >
> >>
> >>
> >> --
> >> Mateusz Guzik
> >>
> >
>
>
> --
> Mateusz Guzik
>
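
The two costs discussed in the thread can be put side by side with a toy model. This is only a sketch under stated assumptions, not a measurement and not how the kernel is implemented: it takes the five calls and the roughly 100us total from the original message, assumes each call costs the same, and assumes (hypothetically) that all 5000 dataset nodes are siblings at one level of the sysctl tree. The function names are made up for illustration.

```python
# Toy cost model for the two effects discussed in the thread.
# All constants are assumptions taken from figures quoted in the messages.

CALLS_PER_SYSCTL = 5          # next oid, name, oidfmt, size, value
TOTAL_US_PER_SYSCTL = 100.0   # "together all five take about 100us"
SIBLINGS = 5000               # assumed: every dataset node sits at one level


def combined_call_saving(calls: int, total_us: float) -> float:
    """Fraction of time saved if `calls` equal-cost syscalls become one."""
    per_call_us = total_us / calls
    return (total_us - per_call_us) / total_us


def linear_scan_steps(siblings: int) -> int:
    """Nodes visited when each of `siblings` entries is located by a scan
    from the head of a linked list (the i-th entry costs i steps)."""
    return siblings * (siblings + 1) // 2


def balanced_tree_steps(siblings: int) -> int:
    """Order-of-magnitude estimate for the same lookups in a balanced tree:
    about log2(n) + 1 nodes per lookup. A red-black tree can be up to
    roughly twice as deep, so treat this as a rough figure only."""
    return siblings * siblings.bit_length()


if __name__ == "__main__":
    saving = combined_call_saving(CALLS_PER_SYSCTL, TOTAL_US_PER_SYSCTL)
    print(f"saving from one combined syscall: {saving:.0%}")
    print(f"linear scan steps:   {linear_scan_steps(SIBLINGS)}")
    print(f"balanced tree steps: {balanced_tree_steps(SIBLINGS)}")
```

Under these assumptions the first function returns 0.8, which is where the 80% figure comes from. It only holds if per-call overhead dominates; if most of the time goes to list walks in the lookup path, merging calls alone would save less than that.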