Date:      Fri, 16 Sep 2011 16:59:10 -0400
From:      Arnaud Lacombe <lacombar@gmail.com>
To:        Poul-Henning Kamp <phk@phk.freebsd.dk>
Cc:        FreeBSD-Current <freebsd-current@freebsd.org>
Subject:   Re: Very imprecise watchdogd(8) timeout
Message-ID:  <CACqU3MVF5MwqeC%2Bs9VKk4mLJenmoS9Q_bJWkbYeFzaBFjo67gQ@mail.gmail.com>
In-Reply-To: <58772.1316203388@critter.freebsd.dk>
References:  <CACqU3MWs0HHnZchOwmwWG8U9Vd2pBDKAqf6Pdw5zS_XO_S6Ppw@mail.gmail.com> <58772.1316203388@critter.freebsd.dk>

Hi,

On Fri, Sep 16, 2011 at 4:03 PM, Poul-Henning Kamp <phk@phk.freebsd.dk> wrote:
> In message <CACqU3MWs0HHnZchOwmwWG8U9Vd2pBDKAqf6Pdw5zS_XO_S6Ppw@mail.gmail.com>
> , Arnaud Lacombe writes:
>
>>I just had a look to the way the timeout specified to watchdogd is
>>passed to the kernel. watchdogd(8) says:
>
> The API was designed for simplicity, not precision.
>
> Watchdog hardware often have weird and strange limitations on the
> actual values you can set.
>
Yes.

> A very typical design, the most typical in my experience, is "some
> frequency, a binary prescaler, possibly with a counter".
>
> It is also not uncommon to have more than one watchdog
> mechanism in the same system.
>
in which case the current notifier-based architecture is broken. You
may want to have a soft-watchdog triggering after 5s, and a fallback
hardware watchdog triggering after 60s.

> It would be overkill to design and implement a complex API to
> communicate these limitations to userland.
>
Linux is going this way, at least for min/max timeout information in
seconds; I did not check the rest.

> So the API was designed around the power-of-two scale to give it
> a wide range, and with the semantics "no shorter than", to make
> it easy to use, and for multiple watchdogs to be engaged to the
> best of their ability.
>
Wide range? 50% of the possible values are unusable (every value below
29), and the rest is limited by what the device supports anyway. Take
the geodewdt, with a max timeout of 2h26: with the actual sparse range,
you will only be able to set the timeout to 1s, 2s, 4s, 8s, 17s, 34s, 68s,
137s, 274s, 549s, 1099s, 2199s, or 4398s. That's 20% of the original range...
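To make the sparse range concrete, here is a minimal sketch that enumerates the power-of-two timeouts, assuming (as this thread describes) that the timeout is 2^n nanoseconds. The names wd_exp_to_ns and wd_list_timeouts are made up for illustration and are not part of any FreeBSD API; the 8760s limit is simply 2h26 expressed in seconds.

```c
#include <stdint.h>
#include <stdio.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical helper: timeout in nanoseconds for exponent n (n < 64). */
static uint64_t
wd_exp_to_ns(unsigned n)
{
	return ((uint64_t)1 << n);
}

/*
 * Hypothetical helper: print every exponent whose timeout is at least
 * one whole second and no longer than max_sec, and return how many
 * such exponents exist. Seconds are truncated, not rounded.
 */
static int
wd_list_timeouts(uint64_t max_sec)
{
	int count = 0;

	for (unsigned n = 0; n < 64; n++) {
		uint64_t sec = wd_exp_to_ns(n) / NSEC_PER_SEC;

		if (sec < 1 || sec > max_sec)
			continue;
		printf("2^%u ns = %llus\n", n, (unsigned long long)sec);
		count++;
	}
	return (count);
}
```

Calling wd_list_timeouts(8760) prints one line per usable exponent for a device capped at 2h26; the next step up, 2^43 ns, is already about 8796s and exceeds that cap.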

> If this is not precise enough for you, come up with something
> better.
>
I do not really care, actually, but the manpage is wrong and the code is
needlessly complicated. You can just rip out all the
double-to-int-log-of-nanosecond-timeout mumbo-jumbo and advertise in
watchdogd(8) that only power-of-two timeouts are supported, or add an
option to specify the shift directly. That would be simpler and correct.
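As a rough illustration of how simple the conversion could be, the sketch below picks the smallest shift n such that 2^n nanoseconds is no shorter than the requested timeout, using integer arithmetic only. This is not the watchdogd code; wd_ns_to_exp is a hypothetical name, and the -1 error convention is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the smallest n such that
 * 2^n ns >= timeout_ns ("no shorter than" semantics).
 * Returns -1 if timeout_ns is 0 or larger than 2^63 ns.
 */
static int
wd_ns_to_exp(uint64_t timeout_ns)
{
	uint64_t v = 1;
	int n = 0;

	if (timeout_ns == 0)
		return (-1);
	while (v < timeout_ns) {
		if (n == 63)
			return (-1);	/* 2^64 does not fit in 64 bits */
		v <<= 1;
		n++;
	}
	assert(v >= timeout_ns);	/* never shorter than requested */
	return (n);
}
```

With this, a requested 5s timeout (5000000000 ns) maps to shift 33, i.e. roughly 8.6s, and a requested 60s maps to shift 36, i.e. roughly 68.7s, which shows how far the effective timeout can drift from what the user asked for.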

 - Arnaud


