Date:      Fri, 1 Jul 2005 16:04:23 +0100 (BST)
From:      Robert Watson <rwatson@FreeBSD.org>
To:        Peter Edwards <peadar@FreeBSD.org>
Cc:        arch@freebsd.org
Subject:   Re: ktrace and KTR_DROP
Message-ID:  <20050701155757.A36905@fledge.watson.org>
In-Reply-To: <20050701132104.GA95135@freefall.freebsd.org>
References:  <20050701132104.GA95135@freefall.freebsd.org>


On Fri, 1 Jul 2005, Peter Edwards wrote:

> Ever since the introduction of a separate ktrace worker thread for 
> writing output, there's the distinct possibility that ktrace output will 
> drop requests. For some processes, it's actually inevitable: as long as 
> the traced processes can sustain a rate of generating ktrace events 
> faster than the ktrace thread can write them, you'll eventually run out 
> of ktrace requests.
>
> I'd like to propose that rather than just dropping the request on the 
> floor, we at least configurably allow ktraced threads to block until 
> there are resources available to satisfy their requests.

There are two benefits to the current ktrace dispatch model:

(1) Avoiding untimely sleeping in the execution paths of threads that are
     being traced.

(2) Allowing the traced thread to run ahead asynchronously, hopefully
     impacting performance less.
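
For concreteness, here is a rough userspace approximation of that dispatch
model -- a sketch only, not the kernel code, and the names (ktr_submit,
struct ktr_request, POOL_SIZE, etc.) are invented: traced threads take a
request structure from a fixed pool and hand it to a writer thread, and
when the pool is empty the event is simply dropped, which is the KTR_DROP
behaviour under discussion.

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define POOL_SIZE 4

struct ktr_request {
    int seq;                        /* event payload (placeholder) */
    struct ktr_request *next;
};

static struct ktr_request pool[POOL_SIZE];
static struct ktr_request *freelist;
static struct ktr_request *todo;    /* hand-off list for the writer thread */
static int drops;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t work = PTHREAD_COND_INITIALIZER;

/* Traced thread: never sleeps; the record is dropped if no request is free. */
static void
ktr_submit(int seq)
{
    struct ktr_request *req;

    pthread_mutex_lock(&lock);
    if ((req = freelist) == NULL) {
        drops++;                    /* the KTR_DROP case */
    } else {
        freelist = req->next;
        req->seq = seq;
        req->next = todo;           /* LIFO for brevity; ordering ignored */
        todo = req;
        pthread_cond_signal(&work);
    }
    pthread_mutex_unlock(&lock);
}

/* Writer thread: drains the list and "writes" (slowly) to the trace file. */
static void *
writer(void *arg)
{
    struct ktr_request *req;

    (void)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        while ((req = todo) == NULL)
            pthread_cond_wait(&work, &lock);
        todo = req->next;
        pthread_mutex_unlock(&lock);

        usleep(2000);               /* stand-in for a slow file system */
        printf("wrote record %d\n", req->seq);

        pthread_mutex_lock(&lock);
        req->next = freelist;       /* recycle the request */
        freelist = req;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int
main(void)
{
    pthread_t tid;
    int i;

    for (i = 0; i < POOL_SIZE; i++) {   /* build the free list */
        pool[i].next = freelist;
        freelist = &pool[i];
    }
    pthread_create(&tid, NULL, writer, NULL);
    for (i = 0; i < 100; i++)           /* a fast producer outruns the writer */
        ktr_submit(i);
    sleep(1);
    pthread_mutex_lock(&lock);
    printf("%d records dropped\n", drops);
    pthread_mutex_unlock(&lock);
    return 0;
}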

One of the things I've been thinking for a few years is that I actually 
preferred the old model, where processes (now threads) would hang a 
"current record" off of their process (now thread) structure and fill it 
in as they went along.  The upsides of this correspond to the downsides of 
the current model: you don't allow fully asynchronous execution of the 
threads with respect to queueing the records to disk, so you don't run 
into "drop" scenarios, and instead just slow down the process.  Likewise, 
its downsides correspond to the current model's upsides.
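
Roughly, the old model looked like this -- again a userspace sketch with
invented names (td_ktrrec, traced_syscall), not the historical code: the
record lives in the thread structure, gets filled in during the system
call, and is written out synchronously by the traced thread itself, so
nothing can be dropped.

#include <stdio.h>

struct ktr_record {
    int  type;
    char data[64];
};

struct thread {
    struct ktr_record td_ktrrec;    /* per-thread "current record" */
};

/* In the kernel this would be a (possibly slow) write to the trace file. */
static void
ktr_write(const struct ktr_record *rec)
{
    printf("type %d: %s\n", rec->type, rec->data);
}

/* The traced thread fills in its own record and writes it before returning. */
static void
traced_syscall(struct thread *td, int type, const char *what)
{
    struct ktr_record *rec = &td->td_ktrrec;

    rec->type = type;               /* fill it in as the call progresses */
    snprintf(rec->data, sizeof(rec->data), "%s", what);
    ktr_write(rec);                 /* synchronous: nothing can be dropped */
}

int
main(void)
{
    struct thread td;

    traced_syscall(&td, 1, "open(\"/etc/motd\")");
    traced_syscall(&td, 2, "read(3, ..., 1024)");
    return 0;
}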

In the audit code, we pull from a common record queue, but we allocate the 
record for each process when the system call starts -- if there aren't 
records available (or various other reliability-related conditions aren't 
met, such as having adequate disk space), we stall the thread entering the 
kernel until we can satisfy its record allocation requirements.
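
Something like the following, as a simplified sketch of that approach --
the names (audit_record_alloc, NRECORDS, and so on) are illustrative, not
the actual audit implementation: records come from a fixed pool allocated
at system call entry, and the entering thread sleeps until one is free.

#include <pthread.h>
#include <stdio.h>

#define NRECORDS 2

struct audit_record {
    int used;
    int event;
};

static struct audit_record records[NRECORDS];
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t avail = PTHREAD_COND_INITIALIZER;

/* Called at system call entry: block rather than drop. */
static struct audit_record *
audit_record_alloc(int event)
{
    struct audit_record *rec = NULL;
    int i;

    pthread_mutex_lock(&lock);
    for (;;) {
        for (i = 0; i < NRECORDS; i++) {
            if (!records[i].used) {
                rec = &records[i];
                break;
            }
        }
        if (rec != NULL)
            break;
        /* Stall the thread entering the kernel until a record frees up. */
        pthread_cond_wait(&avail, &lock);
    }
    rec->used = 1;
    rec->event = event;
    pthread_mutex_unlock(&lock);
    return rec;
}

/* Called once the record has been committed to the trail. */
static void
audit_record_free(struct audit_record *rec)
{
    pthread_mutex_lock(&lock);
    rec->used = 0;
    pthread_cond_signal(&avail);
    pthread_mutex_unlock(&lock);
}

int
main(void)
{
    struct audit_record *rec;

    rec = audit_record_alloc(42);
    printf("allocated record for event %d\n", rec->event);
    audit_record_free(rec);
    return 0;
}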

There are two cases where I really run into problems with the current 
model:

(1) When I'm interacting with a slow file system, such as NFS over
     100Mbps, I will always lose records, because it doesn't take long for
     the process to get quite far ahead of the write-behind.

(2) When I trace more than one process at a time, the volume of records
     overwhelms the write-behind.

Write coalescing/etc. is already provided "for free" by pushing the writes 
down into the file system, so other than slowing down the traced process a 
little, I think we don't lose much by moving back to this model.  And if 
we pre-commit the record storage on system call entry (with the exception 
of paths, which generally require potential sleeps anyway), we probably 
won't hurt performance all that much, and we'll avoid sleeping in bad 
places.
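
In other words, something along these lines -- a sketch of the hybrid,
with invented names (ktr_enqueue, QUEUE_DEPTH): reserve queue space at
system call entry, where sleeping is acceptable, then let the writer
thread drain records asynchronously.  A full queue throttles the traced
thread instead of dropping records, and the asynchronous writer keeps the
write coalescing.

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define QUEUE_DEPTH 8

static int queue[QUEUE_DEPTH];
static int head, tail, count;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t not_full = PTHREAD_COND_INITIALIZER;
static pthread_cond_t not_empty = PTHREAD_COND_INITIALIZER;

/* System call entry: wait for space instead of dropping the record. */
static void
ktr_enqueue(int seq)
{
    pthread_mutex_lock(&lock);
    while (count == QUEUE_DEPTH)
        pthread_cond_wait(&not_full, &lock);    /* throttle, don't drop */
    queue[tail] = seq;
    tail = (tail + 1) % QUEUE_DEPTH;
    count++;
    pthread_cond_signal(&not_empty);
    pthread_mutex_unlock(&lock);
}

/* Writer thread: still asynchronous, so write coalescing is preserved. */
static void *
writer(void *arg)
{
    int seq;

    (void)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        while (count == 0)
            pthread_cond_wait(&not_empty, &lock);
        seq = queue[head];
        head = (head + 1) % QUEUE_DEPTH;
        count--;
        pthread_cond_signal(&not_full);
        pthread_mutex_unlock(&lock);

        usleep(2000);               /* stand-in for a slow backing store */
        printf("wrote record %d\n", seq);
    }
    return NULL;
}

int
main(void)
{
    pthread_t tid;
    int i;

    pthread_create(&tid, NULL, writer, NULL);
    for (i = 0; i < 32; i++)
        ktr_enqueue(i);             /* never drops, may sleep briefly */
    sleep(1);
    return 0;
}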

Robert N M Watson


