From owner-freebsd-hackers@freebsd.org  Sun Dec 11 20:07:59 2016
In-Reply-To: <CABh_MKk87hJTsu1ETX8Ffq9E8gqRPELeSEKzf1jKk_wwUROgAw@mail.gmail.com>
References: <CABh_MKk87hJTsu1ETX8Ffq9E8gqRPELeSEKzf1jKk_wwUROgAw@mail.gmail.com>
From: Igor Mozolevsky <igor@hybrid-lab.co.uk>
Date: Sun, 11 Dec 2016 20:07:16 +0000
X-Google-Sender-Auth: Im64ikZQFUybZiacjH7CWHIRK80
Message-ID: <CADWvR2iTFiaWtrJ1NVSq_ycno=2sy0ihMhm0PfU+hdNFEWt9hQ@mail.gmail.com>
Subject: Re: Sysctl as a Service, or: making sysctl(3) more friendly for
 monitoring systems
To: Ed Schouten <ed@nuxi.nl>
Cc: hackers@freebsd.org
Content-Type: text/plain; charset=UTF-8

Can SNMP not do that already?


-- 
Igor M.


On 11 December 2016 at 19:35, Ed Schouten <ed@nuxi.nl> wrote:

> Hi there,
>
> The last couple of months I've been playing around with a monitoring
> system called Prometheus (https://prometheus.io/). In short,
> Prometheus works like this:
>
> ----- If you already know Prometheus, skip this -----
>
> 1. For the thing you want to monitor, you either integrate the
> Prometheus client library into your codebase or you write a separate
> exporter process. The client library or the exporter process then
> exposes key metrics of your application over HTTP. Simplified example:
>
> $ curl http://localhost:12345/metrics
> # HELP open_file_descriptors The number of files opened by the process
> # TYPE open_file_descriptors gauge
> open_file_descriptors 12
> # HELP http_requests The number of HTTP requests received.
> # TYPE http_requests counter
> http_requests{result="2xx"} 100
> http_requests{result="4xx"} 14
> http_requests{result="5xx"} 0
>
> 2. You fire up Prometheus and configure it to scrape and store all of
> the things you want to monitor. Prometheus can then add more labels to
> the metrics it scrapes. So the example above may get transformed by
> Prometheus to look like this:
>
> open_file_descriptors{job="nginx",instance="web1.mycompany.com"} 12
> http_requests{job="nginx",instance="web1.mycompany.com",result="2xx"} 100
> http_requests{job="nginx",instance="web1.mycompany.com",result="4xx"} 14
> http_requests{job="nginx",instance="web1.mycompany.com",result="5xx"} 0
>
> Fun fact: Prometheus can also scrape Prometheus, so if you operate
> multiple datacenters, you can let a global instance scrape a per-DC
> instance and add a dc="..." label to all metrics.
>
> 3. After scraping data for some time, you can do fancy queries like these:
>
> - Compute the 5-minute rate of HTTP requests per server and per HTTP
> result class:
> rate(http_requests[5m])
>
> - Compute the 5-minute rate of all HTTP requests on the entire cluster:
> sum(rate(http_requests[5m]))
>
> - Same as the above, but aggregated by HTTP result class:
> sum(rate(http_requests[5m])) by (result)
>
> Prometheus can do alerting as well by using these expressions as matchers.
>
> 4. Set up Grafana and voila: you can create fancy dashboards!
>
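
To make the queries in step 3 concrete, here is a minimal Python sketch of what sum(rate(http_requests[5m])) by (result) computes. It is an illustration under stated assumptions, not how Prometheus implements it: the real rate() also handles counter resets and extrapolates to the window boundaries, and the sample data and the helper names simple_rate and sum_by are made up for this example.

```python
from collections import defaultdict


def simple_rate(samples, window_seconds):
    """Per-second increase of a counter over the trailing window.

    samples is a list of (timestamp_seconds, value) pairs, oldest first.
    Simplification: no counter-reset handling and no extrapolation.
    """
    newest_ts = samples[-1][0]
    in_window = [(t, v) for t, v in samples if t >= newest_ts - window_seconds]
    if len(in_window) < 2:
        return None  # not enough data points to compute a rate
    (t0, v0), (t1, v1) = in_window[0], in_window[-1]
    return (v1 - v0) / (t1 - t0)


def sum_by(series, label, window_seconds=300):
    """Roughly: sum(rate(metric[5m])) by (label)."""
    totals = defaultdict(float)
    for labels, samples in series:
        rate = simple_rate(samples, window_seconds)
        if rate is not None:
            totals[labels[label]] += rate
    return dict(totals)


# Hypothetical scraped data: counters from two servers, sampled 300 s apart.
series = [
    ({"instance": "web1", "result": "2xx"}, [(0, 100), (300, 400)]),
    ({"instance": "web2", "result": "2xx"}, [(0, 50), (300, 200)]),
    ({"instance": "web1", "result": "4xx"}, [(0, 14), (300, 44)]),
]
by_result = sum_by(series, "result")
print(by_result)
```

Dropping the grouping label and adding up all values of by_result gives the cluster-wide figure from the second query.
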
> ----- If you skipped the introduction, start reading here -----
>
> The Prometheus folks have developed a tool called the node_exporter
> (https://github.com/prometheus/node_exporter). Basically it extracts a
> whole bunch of interesting system-related metrics (disk usage, network
> I/O, etc.) by calling sysctl(3), invoking ioctl(2), parsing /proc files,
> and so on, and exposes that information in Prometheus' text format.
>
> The other day I was thinking: in a certain way, the node exporter is a
> bit of a redundant tool on the BSDs. Instead of needing to write
> custom collectors for every kernel subsystem, we could write a generic
> exporter for converting the entire sysctl(3) tree to Prometheus
> metrics, which is exactly what I'm experimenting with here:
>
> https://github.com/EdSchouten/prometheus_sysctl_exporter
>
> An example of what this tool's output looks like:
>
> $ ./prometheus_sysctl_exporter
> ...
> # HELP sysctl_kern_maxfiles Maximum number of files
> sysctl_kern_maxfiles 1043382
> # HELP sysctl_kern_openfiles System-wide number of open files
> sysctl_kern_openfiles 316
> ...
>
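
The name mapping behind such output can be sketched in a few lines of Python. This is only an illustration, not the exporter's actual code: the helper names metric_name and render are invented here, and the values are hard-coded from the example above, whereas a real exporter would read them through sysctl(3). The HELP line carries the full metric name, which is what the Prometheus text format expects.

```python
import re


def metric_name(sysctl_name, prefix="sysctl"):
    """Turn 'kern.maxfiles' into 'sysctl_kern_maxfiles'."""
    # Prometheus metric names must match [a-zA-Z_:][a-zA-Z0-9_:]*, so every
    # other character (dots, dashes, '%') is replaced with an underscore.
    return prefix + "_" + re.sub(r"[^a-zA-Z0-9_]", "_", sysctl_name)


def render(entries):
    """Render (sysctl name, description, value) tuples as exposition text."""
    lines = []
    for name, description, value in entries:
        metric = metric_name(name)
        if description:
            lines.append(f"# HELP {metric} {description}")
        lines.append(f"{metric} {value}")
    return "\n".join(lines) + "\n"


# Values copied from the example output; hard-coded for illustration only.
entries = [
    ("kern.maxfiles", "Maximum number of files", 1043382),
    ("kern.openfiles", "System-wide number of open files", 316),
]
output = render(entries)
print(output, end="")
```
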
> You could use this to write alerting rules like this:
>
> ALERT FileDescriptorUsageHigh
>   IF sysctl_kern_openfiles / sysctl_kern_maxfiles > 0.5
>   FOR 15m
>   ANNOTATIONS {
>     description = "More than half of all FDs are in use!",
>   }
>
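
For readers unfamiliar with the IF/FOR pair in that rule, the following Python sketch shows the intended semantics: the alert fires only once the condition has held continuously for the whole FOR period. It is a simplified model with assumptions of my own (the function name firing, the sample layout, and a fixed 5-minute evaluation interval are hypothetical), not a description of how Prometheus evaluates rules internally.

```python
def firing(samples, threshold=0.5, hold_seconds=15 * 60):
    """Return True once openfiles/maxfiles has stayed above threshold
    for at least hold_seconds without interruption.

    samples is a list of (timestamp_seconds, openfiles, maxfiles) tuples,
    oldest first.
    """
    pending_since = None  # time at which the condition first became true
    for ts, openfiles, maxfiles in samples:
        if openfiles / maxfiles > threshold:
            if pending_since is None:
                pending_since = ts
            if ts - pending_since >= hold_seconds:
                return True
        else:
            pending_since = None  # condition broken, start over
    return False


# Hypothetical evaluations every 5 minutes with maxfiles fixed at 1000.
steady = [(0, 600, 1000), (300, 610, 1000), (600, 620, 1000), (900, 630, 1000)]
dipped = [(0, 600, 1000), (300, 610, 1000), (600, 400, 1000), (900, 630, 1000)]
print(firing(steady), firing(dipped))
```
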
> There you go. Access to a very large number of metrics without too much
> effort.
>
> My main question here is: are there any people in here interested in
> seeing something like this being developed into something usable? If
> so, let me know and I'll pursue this further.
>
> I also have a couple of technical questions related to sysctl(3)'s
> in-kernel design:
>
> - Prometheus differentiates between gauges (memory usage), counters
> (number of HTTP requests), histograms (per-RPC latency stats), etc.,
> while sysctl(3) does not. It would be nice if we could have that info
> on a per-sysctl basis. Mind if I add flags such as CTLFLAG_GAUGE and
> CTLFLAG_COUNTER?
>
> - Semantically sysctl(3) and Prometheus are slightly different.
> Consider this sysctl:
>
> hw.acpi.thermal.tz0.temperature: 27.8C
>
> My tool currently converts this metric's name to
> sysctl_hw_acpi_thermal_tz0_temperature. This is suboptimal, as it
> would ideally be called
> sysctl_hw_acpi_thermal_temperature{sensor="tz0"}. Otherwise you
> wouldn't be able to write generic alerting rules, use aggregation in
> queries, etc.
>
> I was thinking: we could quite easily do such a translation by
> attaching labels to SYSCTL_NODE objects. As in, the hw.acpi.thermal
> node would get a label "sensor". The name of any OID placed directly
> underneath this node would then become the value of that label rather
> than an infix of the metric name. Thoughts?
>
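
The proposed translation can be tried out in userland before touching the kernel. The Python sketch below assumes a hypothetical table (labeled_nodes) that maps a node's path to the label name its children should fill; in the proposal that information would be attached to the SYSCTL_NODE itself. The function name to_metric is likewise made up, and sanitizing of unusual characters in names is left out for brevity.

```python
def to_metric(sysctl_name, labeled_nodes, prefix="sysctl"):
    """Convert a dotted sysctl name into a Prometheus metric name,
    turning the child of every labeled node into a label value.

    labeled_nodes maps a node path (tuple of name components) to the
    label name used for that node's direct children.
    """
    components = sysctl_name.split(".")
    name_parts = [prefix]
    labels = {}
    i = 0
    while i < len(components):
        name_parts.append(components[i])
        label = labeled_nodes.get(tuple(components[: i + 1]))
        if label is not None and i + 1 < len(components):
            labels[label] = components[i + 1]
            i += 1  # the child became a label value, keep it out of the name
        i += 1
    name = "_".join(name_parts)
    if not labels:
        return name
    rendered = ",".join(f'{key}="{value}"' for key, value in labels.items())
    return f"{name}{{{rendered}}}"


# Hypothetical label table matching the example in the text.
labeled_nodes = {("hw", "acpi", "thermal"): "sensor"}
print(to_metric("hw.acpi.thermal.tz0.temperature", labeled_nodes))
print(to_metric("kern.maxfiles", labeled_nodes))
```

One wrinkle this exposes: a leaf placed directly under a labeled node (say, a hypothetical hw.acpi.thermal.some_leaf) would also be consumed as a label value, so the scheme probably needs a way to tell per-instance children apart from ordinary leaves.
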
> A final remark I want to make: a concern might be that changes like
> these would not be generic, but only apply to Prometheus. I tend to
> disagree. First of all, an advantage of Prometheus is that the
> coupling is very loose: it's just a GET request with key-value pairs.
> Anyone is free to add his/her own implementation.
>
> Second, emaste@ also pointed me to another monitoring framework being
> developed by Intel right now:
>
> https://github.com/intelsdi-x/snap
>
> The changes I'm proposing would also seem to make exporting sysctl
> data to that system easier.
>
> Anyway, thanks for reading this huge wall of text.
>
> Best regards,
> --
> Ed Schouten <ed@nuxi.nl>
> Nuxi, 's-Hertogenbosch, the Netherlands
> KvK-nr.: 62051717