Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 8 Aug 2019 20:55:44 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Conrad Meyer <cem@freebsd.org>
Cc:        src-committers@freebsd.org, svn-src-all@freebsd.org,  svn-src-head@freebsd.org
Subject:   Re: svn commit: r350713 - head/sys/kern
Message-ID:  <20190808183733.O1073@besplex.bde.org>
In-Reply-To: <201908080042.x780gThf078293@repo.freebsd.org>
References:  <201908080042.x780gThf078293@repo.freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Thu, 8 Aug 2019, Conrad Meyer wrote:

> Log:
>  ddb(4): Add 'sysctl' command
>
>  Implement `sysctl` in `ddb` by overriding `SYSCTL_OUT`.  When handling the
>  req, we install custom ddb in/out handlers.  The out handler prints straight
>  to the debugger, while the in handler ignores all input.  This is intended
>  to allow us to print just about any sysctl.
>
>  There is a known issue when used from ddb(4) entered via 'sysctl
>  debug.kdb.enter=1'.  The DDB mode does not quite prevent all lock
>  interactions, and it is possible for the recursive Giant lock to be unlocked
>  when the ddb(4) 'sysctl' command is used.  This may result in a panic on
>  return from ddb(4) via 'c' (continue).  Obviously, this is not a problem
>  when debugging already-paniced systems.

Locking for kdb was grossly broken in r339917, by setting and unsetting
the SCHEDULER_STOPPED() flag on entry and exit to kdb.  This flag is only
usable during panic() since when it is set all mutex operations potentially
corrupt their mutex, and unsetting the flag doesn't undo the corruption.

Many different types of corruption are possible, but my simplest example
if like the one here for Giant, but for a spinlock.  kdb runs happily while
SCHEDULER_STOPPED() is set, but if it corrupts a mutex then the corruption
is fatal later when SCHEDULER_STOPPED() is clear (setting SCHEDULER_STOPPED()
breaks early detection of the corruption in most cases).

kern_sysctl.c does only 1 pair of mutex calls.  In
sysctl_root_handle_locked(), it locks and unlocks Giant for the !MPSAFE case.
If that is the only mutex call, then the problem is easy to fix by not
making these calls if kdb_active.

One of my versions of kdb asserts that no mutex calls are made if
kdb_active.  This makes it easy to find buggy mutex calls.
Unfortunately, low-quality low level console drivers like vt make such
calls unconditionally.  I used this to avoid all the buggy calls in
syscons (a typical error there was for to wander into wakeup or timeout
code which uses mutexes.  This was "fixed" by removing things like
beeping for errors.  The keyboard driver uses Giant fundamentally.
This was "fixed" using the usual hack of not doing the mutex call if
kdb_active).

> ...
> Modified: head/sys/kern/kern_sysctl.c
> ==============================================================================
> --- head/sys/kern/kern_sysctl.c	Thu Aug  8 00:33:23 2019	(r350712)
> +++ head/sys/kern/kern_sysctl.c	Thu Aug  8 00:42:29 2019	(r350713)
> ...
> +/*
> + * Run a sysctl handler with the DDB oldfunc and newfunc attached.
> + * Instead of copying any output to a buffer we'll dump it right to
> + * the console.
> + */
> +static int
> +db_sysctl(struct sysctl_oid *oidp, int *name, u_int namelen,
> +    void *old, size_t *oldlenp, size_t *retval, int flags)
> +{
> +	struct sysctl_req req;
> +	int error;
> +
> +	/* Setup the request */
> +	bzero(&req, sizeof req);
> +	req.td = kdb_thread;
> +	req.oldfunc = sysctl_old_ddb;
> +	req.newfunc = sysctl_new_ddb;
> +	req.lock = REQ_UNWIRED;
> +	if (oldlenp) {
> +		req.oldlen = *oldlenp;
> +	}
> +	req.validlen = req.oldlen;
> +	if (old) {
> +		req.oldptr = old;
> +	}
> +
> +	/* Setup our globals for sysctl_old_ddb */
> +	g_ddb_oid = oidp;
> +	g_ddb_sysctl_flags = flags;
> +	g_ddb_sysctl_printed = 0;

This has many style bugs.  So does the rest of the file.

> +
> +	error = sysctl_root(0, name, namelen, &req);

This makes invalid mutex calls for all !MPSAFE sysctls.

Setting SCHEDULER_STOPPED() for kdb is only very broken for spin mutexes.
Then for a contested mutex, mtx_lock_spin() ends up doing nothing except
invalid lock and/or kdtrace profiling when it finds SCHEDULER_STOPPED()
set. but mtx_unlock_spin() never checks SCHEDULER_STOPPED(), so it ends
up corrupting the mutex and the spinlock state by attempting to unlock
a mutex whose corresponding locking was not done.  INVARIANTS causes
similar bugs for recursed-on spin mutexes.  Without INVARIANTS, everything
is done inline without checking SCHEDULER_STOPPED(), so it works well enough
in practice in uncontested non-recursive cases.  With INVARIANTS, the null
locking gives no increment of the recursion counter and no call to
spinlock_enter(), but the non-null unlocking decrements the recursion
counter and calls spinlock_exit().

For sleep mutexes, unlocking checks SCHEDULER_STOPPED() consistently with
locking, so the bugs are smaller.  The easiest way to demonstrate them is
to configure INVARIANTS and WITNESS.  Starting with Giant unheld,
mtx_lock(&Giant) does nothing with the mutex except invalid lock and/or
kdtrace profiling.  mtx_unlock) calls __mtx_unlock_flags()
which calls __mtx_assert(m, MA_OWNED), but this check is subverted by not
doing it when SCHEDULER_STOPPED().  But before that, __mtx_unlock_flags()
calls WITNESS_UNLOCK() and witness_unlock() is not similarly subverted.
Also, WITNESS uses a critical spin mutex w_mtx, and this is corrupted as
above.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20190808183733.O1073>