Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 08 Jul 2001 22:06:51 +0100
From:      Ian Dowse <iedowse@maths.tcd.ie>
To:        freebsd-fs@FreeBSD.ORG
Cc:        Matt Dillon <dillon@earth.backplane.com>, mckusick@mckusick.com
Subject:   Re: CFR: UFS directory hashing 
Message-ID:   <200107082206.aa68029@salmon.maths.tcd.ie>
In-Reply-To: Your message of "Tue, 03 Jul 2001 10:53:49 PDT." <200107031753.f63Hrnd33021@earth.backplane.com> 

next in thread | previous in thread | raw e-mail | index | archive | help
In message <200107031753.f63Hrnd33021@earth.backplane.com>, Matt Dillon writes:
>    will help.  A delayed approach to hashing the directory might
>    work, you could use the directory vnode or the namei cache
>    (associated with the directory itself) to store the statistics
>    or something like that.

Ok, after thinking about this for a while, I have made a relatively
simple change to the recycling code that seems to improve the
situation considerably. The new algorithm is pretty arbitrary, but
it significantly reduces the number of hash-build operations that
occur when the working set is larger than the memory limit.

The old algorithm was simple LRU, so a working set larger than the
memory limit resulted in every lookup causing a hash build. The
new approach is a kind of a hybrid between LRU and least-often-used
with an adjustment that results in only a small fraction of lookups
causing a hash build in the overload case.

Each hash is placed on a LRU linked list as before, with new entries
inserted at the MRU end of the list. Candidates for recycling are
taken from the LRU end. Each hash now has a `score' that is
incremented when a hash is used and decremented when the hash is
a candidate for recycling. A hash is only recycled if its score
reaches zero, and on access, a hash is moved one slot towards the
MRU end of the list iff its score is greater than that of the next
entry.

The score of a new hash entry is set to 8, and scores are limited
to a maximum of 64. The initial value means that in the worst-case
thrashing situation, at most 1/8 of lookups result in a hash build.
The least-often-used discard policy also improves the chance of
holding on to a hashed directory across a large burst of accesses
to other directories. It might be better to weight the score
increments and decrements by the directory size, but that introduces
some other problems so I'd prefer to keep it simple for now.

The new code, which is at

	http://www.maths.tcd.ie/~iedowse/FreeBSD/dirhash.diff6

also includes a performance improvement and a bugfix to the sequential
access optimisation. I haven't updated the releng_4 version, but
I will soon.

Ian

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi? <200107082206.aa68029>