Date:      Tue, 18 Dec 2018 09:00:57 -0700
From:      Warner Losh <imp@bsdimp.com>
To:        "Rodney W. Grimes" <freebsd-rwg@pdx.rh.cn85.dnsmgr.net>
Cc:        George Neville-Neil <gnn@neville-neil.com>, "freebsd-arch@freebsd.org" <arch@freebsd.org>
Subject:   Re: A proposal for code removal prior to FreeBSD 13
Message-ID:  <CANCZdfqK9T3x8=z14pPbg7pLNfUz4JcErhSEEsJMvU5h_EnZFw@mail.gmail.com>
In-Reply-To: <201812180109.wBI19eaK098408@pdx.rh.CN85.dnsmgr.net>
References:  <CANCZdfqkh7eAbSn45cUW0LMtrz85rny_qX_xOEAp7CZ%2B8=3Y0g@mail.gmail.com> <201812180109.wBI19eaK098408@pdx.rh.CN85.dnsmgr.net>

On Mon, Dec 17, 2018 at 6:09 PM Rodney W. Grimes <
freebsd-rwg@pdx.rh.cn85.dnsmgr.net> wrote:

> > On Mon, Dec 17, 2018 at 12:22 AM Rodney W. Grimes <
> > freebsd-rwg@pdx.rh.cn85.dnsmgr.net> wrote:
> >
> > > > On Sun, Dec 16, 2018, 9:49 PM George Neville-Neil <gnn@neville-neil.com>
> > > > wrote:
> > > >
> > > > >
> > > > >
> > > > > On 17 Dec 2018, at 0:59, Rodney W. Grimes wrote:
> > > > >
> > > > > >> On Sun, Dec 16, 2018 at 8:27 AM George Neville-Neil
> > > > > >> <gnn@neville-neil.com>
> > > > > >> wrote:
> > > > > >>
> > > > > >>> Howdy,
> > > > > >>>
> > > > > >>> A few of us are working on a list of programs and other code
> > > > > >>> that we'd like to remove before FreeBSD 13.  If others wish to
> > > > > >>> collaborate on this removal, or discuss it, please do so here.
> > > > > >>>
> > > > > >>> The list is being maintained on the project wiki:
> > > > > >>> https://wiki.freebsd.org/WhatsGoing/FreeBSD13
> > > > > >>
> > > > > >>
> > > > > >> I'm in the process of writing some of the criteria I've been
> > > > > >> using to drive discussions. I've promised a formal policy for a
> > > > > >> long time, but that's turning out to be harder than I thought to
> > > > > >> write, and I have just documented the criteria I look at to do the
> > > > > >> cost / benefit analysis for things. It's skewed a bit towards old
> > > > > >> drivers and old architectures (or platforms within those
> > > > > >> architectures), but it's likely a useful strawman to work towards
> > > > > >> something better. https://wiki.freebsd.org/ObsoleteCriteria
> > > > > >
> > > > > > THANK YOU!!!
> > > > > >
> > > > > >> This doesn't get into the process at all of who to have the
> > > > > >> conversation with, what steps you go through to ensure that
> > > > > >> there's a transition plan for current users (if there's enough to
> > > > > >> warrant it), etc. It's just a set of things I've found useful to
> > > > > >> think about and that I think the project should adopt (with
> > > > > >> refinement) as a standard template to use when making these calls.
> > > > > >
> > > > > > > Can you also draft a short proposal on the "process" of
> > > > > > > deprecation? Based, I guess, on our current model, with the
> > > > > > > addition of some of the macros and other stuff you and Brooks
> > > > > > > worked on; the gone_in stuff is what I am thinking of.  How to do
> > > > > > > the head commit with the deprecation notices, merge that back if
> > > > > > > applicable, then do the removal when appropriate.  Basically a
> > > > > > > more complete, fleshed-out version of:
> > > > > > >
> > > > > > > https://www.freebsd.org/doc/en/articles/committers-guide/article.html#rules
> > > > > > > @ 17.4.  We should also IMHO do some wordsmithing and
> > > > > > > rearrangement to make this read much more as steps; some of the
> > > > > > > steps as listed have substeps that appear to be overlooked often.
> > > > > >
> > > > > > Something like this:
> > > > > > 17.4. Deprecating Features
> > > > > >
> > > > > > When it is necessary to remove functionality from software in
> > > > > > the base system, follow these guidelines whenever possible:
> > > > > >
> > > > > >     1.  Use of the deprecated feature generates a warning.
> > > > > > >     2.  Mention is made in the manual page and the release
> > > > > > >         notes that the option, utility, or interface is
> > > > > > >         deprecated.
> > > > > > >         (I removed the word possible here, we should always
> > > > > > >         mention deprecation of features in the release notes)
> > > > > >     3.  The option, utility, or interface is preserved until
> > > > > >         the next major (point zero) release.
> > > > >
> > > > > Given the age of some of the things in our tree, of which timed was
> > > > > a classic example, I hardly think that this rule will work in our
> > > > > favor. Removing old code reduces our attack surface, and keeping
> > > > > something that's been dead for years, for another 2 years, seems
> > > > > problematic at the very least.  Some of the software now on the
> > > > > proposed removal list should have been gone by 11 or 12.
> > >
> > > Not giving the users proper and adequate notification is asking
> > > for problems as well.
> > >
> > > In the case of amd(8) at least the man page has had an obsolete
> > > notice in it with a pointer to autofs for some time (my 11.1
> > > systems have such notice, not sure how far back that goes.)
> > >
> > > And again, age is not the correct metric, BSD is near 40 years
> > > old, I think the word you want is obsolete, which is more correct.
> > > Also timed appears to be in use, so what might be dead to you
> > > is useful to someone else.
> > >
> >
> > You can't say it's in use, other than one person saying they have it up
> > and running. That's not data, but an anecdote. We have no data to show
> > one way or another.
>
> I didn't say it was in use, I said "Age is not the correct metric".
> In use or not in use would be a correct metric.
>

There is no single metric. This code is old. Old code has more security
issues than new. That's a small factor towards removal.


> > However, given how little it's changed in the last 25 years
> > since it was imported to the tree, it's not unreasonable to conclude that
> > it's not in active use.
>
> Change is probably a marginal metric, things that work well
> tend to just keep working, without fuss or change.
> timed works, works very well, and has no reason to be
> mucked with, so I would not expect any change to that
> code.  You're drawing, IMHO, very bad conclusions from
> a poor metric.
>

Change rate is important. There have been no new features added to timed in
30 years. However, the state of the art in timekeeping has advanced
significantly since then. The only changes here have been churn: marking
licenses, converting from K&R, minor tweaks to appease Coverity, moving
files around. That's not living, breathing code. Did anybody
use newer research to get better synchronization? Has anybody added
frequency estimates to improve accuracy? Has anybody looked at ways for the
networks of people using timed to be self-organizing? Nope on all these.
Look at UFS. It's of similar age, and has had much work done on it since
then. Heck, even cat, cp, dd, and other utilities we've had around since V7
have all seen significant work done and continue to see new features as the
needs arise.


> > Couple that with the significant technical issues
> > with the method of timekeeping it's trying to do, and it's easy to see why
> > people wouldn't want to use it.
> >
> >
> > > I would also argue that the attack surface of timed is much
> > > smaller than the attack surface of ntpd, shall we remove ntpd
> > > to reduce our attack surface?  I can not recall any cve against
> > > timed, I can recall many against ntpd.
> > >
> >
> > ntpd is useful. timed, not so much. ntpd is widely deployed, timed not so
> > much. ntpd is actively maintained, timed hasn't changed in 30 years in
> > any meaningful way. Etc. On all these factors, timed should go.
>
> Your "not so much" is very subjective.
> ntpd may or may not be actively maintained, as pointed out by others.
> Change != used, unchanged != unused, please stop drawing conclusions
> that are marginal at best.
>

Nope. This is incorrect. NTP is absolutely actively maintained. The NTP
working group has dozens of draft standard papers today (timed has had none
in 30 years). The ntp code base has had dozens of commits in the last 6
months, and hundreds in the last few years that I could find. In contrast,
there have been no real commits to timed since kris@ fixed the
interoperability issues in 2001, and the real commits before that were back
in 1996 or 1997 depending on how you count them.

Real programs that are used get changes. People find bugs. While this isn't
conclusive evidence, it is suggestive. When you look at active software
projects vs inactive ones, it's clear as day.


>
> > The rest doesn't matter. So few people use timed that it's not worth
> > having in the tree. Its age is but one factor, as is lack of meaningful
>
> FFS, age is a totally incorrect metric, if you can't see that then
> maybe you should realize that much of our code is VERY old and has
> ages beyond timed, should we axe it?   Age has 0 bearing on useful,
> used, etc.
>

Please don't yell at me. FFS? Really? You've already lost the argument.

And you are trying to conflate two different issues. timed is not useful.
Technically, it's a crappy solution. It uses only phase errors to adjust
the clocks, but not frequency. It uses the median time of the ensemble,
which is very sensitive to systemic frequency errors which can cause the
whole ensemble to wander off into the weeds. Its statistical code is
simplistic. These factors make it less than robust. The data is also
unencrypted / unauthenticated, with no provision to change that. This makes
it spoofable. It would take considerable effort to change these features.
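
As a rough illustration of the median problem, here is a toy Python model. It is not timed's actual code. Each clock drifts by its own frequency error between synchronization rounds, and then every clock is stepped to the ensemble median, which is a phase-only correction. The function name, the ppm values, and the one-hour interval are all assumptions chosen for illustration.

```python
from statistics import median


def simulate_ensemble(freq_errors_ppm, interval_s, rounds):
    """Toy model: phase-only sync of every clock to the ensemble median.

    freq_errors_ppm: per-clock frequency error in parts per million
                     (hypothetical values, positive means the clock runs fast).
    Returns the ensemble's offset from true time, in seconds, after each round.
    """
    offsets = [0.0] * len(freq_errors_ppm)
    history = []
    for _ in range(rounds):
        # Between rounds each clock drifts according to its own frequency error.
        offsets = [
            off + ppm * 1e-6 * interval_s
            for off, ppm in zip(offsets, freq_errors_ppm)
        ]
        # Phase-only correction: step every clock to the median offset.
        # No clock's rate is adjusted, so the same drift recurs next round.
        target = median(offsets)
        offsets = [target] * len(offsets)
        history.append(target)
    return history


if __name__ == "__main__":
    biased = simulate_ensemble([40, 50, 60], interval_s=3600, rounds=24)
    centered = simulate_ensemble([-10, 0, 10], interval_s=3600, rounds=24)
    print(f"all clocks fast: ensemble is off by {biased[-1]:.2f} s after 24 rounds")
    print(f"errors centered on zero: ensemble is off by {centered[-1]:.2f} s")
```

In this model, three clocks that all run fast (40, 50 and 60 ppm) stay in agreement with each other yet end up about 4.32 seconds ahead of true time after 24 hourly rounds. With errors centered on zero the median does not drift. Nothing in the loop estimates or removes the shared frequency bias.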

All of these problems with timed are a result of its age. While age isn't
necessarily a problem, it's a useful factor to consider.


> > maintenance, but it too militates against. In fact, all the commits after
> > its import have been make-work for the most part. This is a poster child
> > for why software in the tree needlessly creates friction: At least 5
> > different people had to spend time on this bit of code making sure it was
> > up to whatever tree-wide thing they were doing. Individually, this isn't
> > so bad, but with enough of these it becomes a real problem.
> >
> > So it's not just any one of these factors, before you try to argue one
> > point I might have overstated. It's that the cost of having it in the
> > tree, all things considered, outweighs the benefit it gives to the project
> > when viewed as a whole over all these factors. There's a general consensus
> > that's true, with a tiny number of people holding a contrary view. That
> > qualifies as "rough consensus" in my book. That was the old standard
> > before we filibustered things to death with the mistaken notion that even
> > one dissenter is enough to keep something around. It's time to come up
> > with some clear guidelines that embody the "rough consensus" standard of
> > old, while recognizing that with a larger audience of stakeholders we'll
> > see more of the 'long tail' than we ever did when the group was smaller
> > (the law of large numbers tells us to expect this). We've not made that
> > transition, and we need to do so. We have to find some useful way to talk
> > about retirement of features, architectures, platforms, etc. because we
> > don't have enough manpower to maintain everything that ever entered the
> > tree forever. We have to make sure that the burden things place on other
> > developers is, on the whole, outweighed by the benefit to the project.
> > Since both of these factors lack hard data to back up assertions, we must
> > instead substitute our collective wisdom. There will be differences of
> > opinion as to the cost and worth of things, but the average of those
> > opinions has generally been shown to be good so long as the sample size is
> > large and diverse enough.
>
> I did not object to it going, you do not have to make a case for that,
> BUT I do object to the "reason" being "it is old", that is an incorrect
> reason.  The reasons are more correctly:
>

Being too old isn't the only reason. That's one of the many reasons
articulated. The age of the code, the lack of innovation, the severe
technical deficiencies, the sparse use, the lack of a maintainer, the opinion
of the domain experts in this field (Ian, myself, George and Poul-Henning),
who say this is poo, the nature of the commits in the last decade
suggesting it's more of a drag on the tree than a help, etc. All strongly
suggest it's time to go. Don't get hung up on any one of these being
insufficient on its own: taken all together it's clear timed costs more
than the benefit it brings to the project.


>         1) It is only used by a small group of people (we should
>            define what the size of that group is that makes things
>            deprecatable.)  The nonexistent data we have that leads
>            us to think it is ok to remove timed may be wrong, and
>            the 17.4 committers procedure is designed to help catch
>            that.
>
>         2) It has a reasonable replacement in ntpd
>         3) It has a small, but perhaps measurable maintenance cost.
>            (Though this will now be borne by whoever, if anyone,
>            bothers to maintain your git version)
>

It will be. My GitHub branch is complete now. All that it needs is someone
to create a port and I'll move it over to the project repo space. I'm
doubtful anybody will step up, but I've done the hard work of extraction.


> >
> > > > If it's old, we should remove it. The one major release thing is nice
> > > > to have in the steady state where you are caught up and retire things
> > > > just in time. For the backlog we have, though, it will just get in the
> > > > way. But in this case all we need to do is a direct commit to fix the
> > > > man page in 12 to say it is gone in 13....
> > >
> > > Yes, we can short circuit the process, but we should not just skip
> > > the process.  Need to fix both the man page and the binary to output
> > > a message when it is invoked, and that needs to be committed to both
> > > stable/11 and stable/12.
> > >
> >
> > It's a daemon. We should not have it blast random bits. There's no need
> > for it and the bits would get lost anyway. Also, changes to the code could
> > break things in unanticipated ways if they can't be meaningfully tested.
>
> It is not only a daemon, it has an interactive command component
> called timedc, and even then we *must* address this daemon issue
> for deprecating.  To rely on a note in a man page is not a good
> plan.
>
> Daemons often can and do spit out startup messages, these are
> not "blasting random bits" and your hyperbole is not helping
> you make a good case.
>
> And if our skill levels are so low we cannot add a startup message
> to timed and a message to timedc I hate to think of the state of
> the more complex code.
>

Unless we can test the code, it has a good chance to be broken. I don't
think there's widespread consensus that all these things must be done
without fail every time. In this case, there's no point: it's just
make-work. It would be better to create something in rc.d that prints out a
message / sends it to syslog / etc. when a deprecated daemon is in use. That
way it's super low risk (timed_deprecated=yes in /etc/defaults/rc.conf) and
is a reusable solution that will remove this friction point.
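
The check itself is small. The sketch below models it in Python purely for illustration; a real version would be shell inside the rc.d framework. Only timed_deprecated=yes comes from the text above. The generalized `<name>_deprecated` convention, the function name, and the warning wording are assumptions, and the parser is deliberately simplistic (it does not handle `#` inside quoted values).

```python
def deprecation_warnings(rc_conf_text):
    """Return one warning per daemon that is both enabled and marked deprecated.

    Hypothetical convention: "<name>_deprecated=yes" marks a daemon as
    deprecated, and "<name>_enable=YES" means it is in use.
    """
    settings = {}
    for line in rc_conf_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Normalize values such as "YES", yes, "yes" to a lowercase token.
        settings[key.strip()] = value.strip().strip('"').lower()

    warnings = []
    for key, value in settings.items():
        if key.endswith("_deprecated") and value == "yes":
            name = key[: -len("_deprecated")]
            # Only warn when the deprecated daemon is actually enabled.
            if settings.get(name + "_enable") == "yes":
                warnings.append(
                    f"{name} is deprecated and enabled; "
                    "see its manual page for a replacement"
                )
    return warnings


if __name__ == "__main__":
    sample = 'timed_enable="YES"\ntimed_deprecated=yes\nntpd_enable="YES"\n'
    for message in deprecation_warnings(sample):
        print(message)
```

Because the daemon's own source is never touched, a check like this carries no risk of breaking the binary, and the same flag can be reused for the next daemon slated for removal.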


> > A direct commit to the man page is all we need do here.
>
> We disagree, and stated procedure bears weight to my side.
>

Stated procedure is broken. We need to refine it. You know a procedure is
not serving the needs of the group when it is routinely ignored. What is
needed in this case isn't more enforcement, but a better procedure that's
more concrete and easier for people to use. A lot of the process was our
best guess at what to do at the time, but we never got around to testing
our best guess and refining the bits that we got wrong. And now the
atmosphere around this issue has become too dysfunctional for people to do
the work that needs to be done. We need to fix that.


> > > The preferred method would have been to do the notification stuff in
> > > head, then merge that back to stable/11 and /12, then do the remove
> > > in head.
> > >
> >
> > True, but that wasn't done here.
>
> Sadly true.
>

Yea. But it wasn't done for a reason. If we can't take a look at why it
wasn't done here, or in many other cases, then we're doomed to have this
same freak-out after the fact. That doesn't help anybody.

Warner


