Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 22 Dec 2015 18:12:12 +0000 (UTC)
From:      Ed Schouten <ed@FreeBSD.org>
To:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   svn commit: r292613 - in head: include lib/libc/stdlib lib/libc/tests/stdlib
Message-ID:  <201512221812.tBMICCPZ003177@repo.freebsd.org>

next in thread | raw e-mail | index | archive | help
Author: ed
Date: Tue Dec 22 18:12:11 2015
New Revision: 292613
URL: https://svnweb.freebsd.org/changeset/base/292613

Log:
  Let tsearch()/tdelete() use an AVL tree.
  
  The existing implementations of POSIX tsearch() and tdelete() don't
  attempt to perform any balancing at all. Testing reveals that inserting
  100k nodes into a tree sequentially takes approximately one minute on my
  system.
  
  Though most other BSDs also don't use any balanced tree internally, C
  libraries like glibc and musl do provide better implementations. glibc
  uses a red-black tree and musl uses an AVL tree.
  
  Red-black trees have the advantage over AVL trees that they only require
  O(1) rotations after insertion and deletion, but have the disadvantage
  that the tree has a maximum depth of 2*log2(n) instead of 1.44*log2(n).
  My take is that it's better to focus on having a lower maximum depth,
  for the reason that in the case of tsearch() the invocation of the
  comparator likely dominates the running time.
  
  This change replaces the tsearch() and tdelete() functions by versions
  that create an AVL tree. Compared to musl's implementation, this version
  is different in two different ways:
  
  - We don't keep track of heights; just balances. This is sufficient.
    This has the advantage that it reduces the number of nodes that are
    being accessed. Storing heights requires us to also access all of the
    siblings along the path.
  
  - Don't use any recursion at all. We know that the tree cannot 2^64
    elements in size, so the height of the tree can never be larger than
    96. Use a 128-bit bitmask to keep track of the path that is computed.
    This allows us to iterate over the same path twice, meaning we can
    apply rotations from top to bottom.
  
  Inserting 100k nodes into a tree now only takes 0.015 seconds. Insertion
  seems to be twice as fast as glibc, whereas deletion has about the same
  performance. Unlike glibc, it uses a fixed amount of memory.
  
  I also experimented with both recursive and iterative bottom-up
  implementations of the same algorithm. This iterative top-down version
  performs similar to the recursive bottom-up version in terms of speed
  and code size.
  
  For some reason, the iterative bottom-up algorithm was actually 30%
  faster for deletion, but has a quadratic memory complexity to keep track
  of all the parent pointers.
  
  Reviewed by:	jilles
  Obtained from:	https://github.com/NuxiNL/cloudlibc
  Differential Revision:	https://reviews.freebsd.org/D4412

Added:
  head/lib/libc/stdlib/tsearch_path.h   (contents, props changed)
  head/lib/libc/tests/stdlib/tsearch_test.c   (contents, props changed)
Modified:
  head/include/search.h
  head/lib/libc/stdlib/tdelete.c
  head/lib/libc/stdlib/tsearch.3
  head/lib/libc/stdlib/tsearch.c
  head/lib/libc/tests/stdlib/Makefile

Modified: head/include/search.h
==============================================================================
--- head/include/search.h	Tue Dec 22 17:46:14 2015	(r292612)
+++ head/include/search.h	Tue Dec 22 18:12:11 2015	(r292613)
@@ -35,8 +35,9 @@ typedef	enum {
 
 #ifdef _SEARCH_PRIVATE
 typedef	struct node {
-	char         *key;
+	void         *key;
 	struct node  *llink, *rlink;
+	signed char   balance;
 } node_t;
 
 struct que_elem {

Modified: head/lib/libc/stdlib/tdelete.c
==============================================================================
--- head/lib/libc/stdlib/tdelete.c	Tue Dec 22 17:46:14 2015	(r292612)
+++ head/lib/libc/stdlib/tdelete.c	Tue Dec 22 18:12:11 2015	(r292613)
@@ -1,72 +1,213 @@
-/*	$NetBSD: tdelete.c,v 1.2 1999/09/16 11:45:37 lukem Exp $	*/
-
-/*
- * Tree search generalized from Knuth (6.2.2) Algorithm T just like
- * the AT&T man page says.
- *
- * The node_t structure is for internal use only, lint doesn't grok it.
+/*-
+ * Copyright (c) 2015 Nuxi, https://nuxi.nl/
  *
- * Written by reading the System V Interface Definition, not the code.
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
  *
- * Totally public domain.
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
-#if 0
-#if defined(LIBC_SCCS) && !defined(lint)
-__RCSID("$NetBSD: tdelete.c,v 1.6 2012/06/25 22:32:45 abs Exp $");
-#endif /* LIBC_SCCS and not lint */
-#endif
 __FBSDID("$FreeBSD$");
 
-#define _SEARCH_PRIVATE
+#define	_SEARCH_PRIVATE
 #include <search.h>
+#include <stdbool.h>
 #include <stdlib.h>
 
+#include "tsearch_path.h"
 
 /*
- * find a node with given key
- *
- * vkey:   key to be found
- * vrootp: address of the root of the tree
- * compar: function to carry out node comparisons
- */
+ * Makes a step to the left along the binary search tree. This step is
+ * also saved, so it can be replayed while rebalancing.
+*/
+#define	GO_LEFT() do {							\
+	if ((*leaf)->balance == 0 ||					\
+	    ((*leaf)->balance < 0 && (*leaf)->rlink->balance == 0)) {	\
+		/*							\
+		 * If we reach a node that is balanced, or has a child	\
+		 * in the opposite direction that is balanced, we know	\
+		 * that we won't need to perform any rotations above	\
+		 * this point. In this case rotations are always	\
+		 * capable of keeping the subtree in balance. Make	\
+		 * this the base node and reset the path.		\
+		 */							\
+		base = leaf;						\
+		path_init(&path);					\
+	}								\
+	path_taking_left(&path);					\
+	leaf = &(*leaf)->llink;						\
+} while (0)
+
+/* Makes a step to the right along the binary search tree. */
+#define	GO_RIGHT() do {							\
+	if ((*leaf)->balance == 0 ||					\
+	    ((*leaf)->balance > 0 && (*leaf)->llink->balance == 0)) {	\
+		base = leaf;						\
+		path_init(&path);					\
+	}								\
+	result = &(*leaf)->key;						\
+	path_taking_right(&path);					\
+	leaf = &(*leaf)->rlink;						\
+} while (0)
+
 void *
-tdelete(const void * __restrict vkey, void ** __restrict vrootp,
+tdelete(const void *restrict key, void **restrict rootp,
     int (*compar)(const void *, const void *))
 {
-	node_t **rootp = (node_t **)vrootp;
-	node_t *p, *q, *r;
+	struct path path;
+	node_t *root, **base, **leaf, *old, **n, *x, *y, *z;
+	void *result;
 	int cmp;
 
-	if (rootp == NULL || (p = *rootp) == NULL)
-		return NULL;
+	/* POSIX requires that tdelete() returns NULL if rootp is NULL. */
+	if (rootp == NULL)
+		return (NULL);
+	root = *rootp;
+
+	/*
+	 * Find the leaf that needs to be removed. Return if we cannot
+	 * find an existing entry. Keep track of the path that is taken
+	 * to get to the node, as we will need it to adjust the
+	 * balances.
+	 */
+	result = (void *)1;
+	path_init(&path);
+	base = &root;
+	leaf = &root;
+	for (;;) {
+		if (*leaf == NULL)
+			return (NULL);
+		cmp = compar(key, (*leaf)->key);
+		if (cmp < 0) {
+			result = &(*leaf)->key;
+			GO_LEFT();
+		} else if (cmp > 0) {
+			result = &(*leaf)->key;
+			GO_RIGHT();
+		} else {
+			break;
+		}
+	}
 
-	while ((cmp = (*compar)(vkey, (*rootp)->key)) != 0) {
-		p = *rootp;
-		rootp = (cmp < 0) ?
-		    &(*rootp)->llink :		/* follow llink branch */
-		    &(*rootp)->rlink;		/* follow rlink branch */
-		if (*rootp == NULL)
-			return NULL;		/* key not found */
+	/* Found a matching key in the tree. Remove the node. */
+	if ((*leaf)->llink == NULL) {
+		/* Node has no left children. Replace by its right subtree. */
+		old = *leaf;
+		*leaf = old->rlink;
+		free(old);
+	} else {
+		/*
+		 * Node has left children. Replace this node's key by
+		 * its predecessor's and remove that node instead.
+		 */
+		void **keyp = &(*leaf)->key;
+		GO_LEFT();
+		while ((*leaf)->rlink != NULL)
+			GO_RIGHT();
+		old = *leaf;
+		*keyp = old->key;
+		*leaf = old->llink;
+		free(old);
 	}
-	r = (*rootp)->rlink;			/* D1: */
-	if ((q = (*rootp)->llink) == NULL)	/* Left NULL? */
-		q = r;
-	else if (r != NULL) {			/* Right link is NULL? */
-		if (r->llink == NULL) {		/* D2: Find successor */
-			r->llink = q;
-			q = r;
-		} else {			/* D3: Find NULL link */
-			for (q = r->llink; q->llink != NULL; q = r->llink)
-				r = q;
-			r->llink = q->rlink;
-			q->llink = (*rootp)->llink;
-			q->rlink = (*rootp)->rlink;
+
+	/*
+	 * Walk along the same path a second time and adjust the
+	 * balances. Though this code looks similar to the rebalancing
+	 * performed in tsearch(), it is not identical. We now also need
+	 * to consider the case of outward imbalance in the right-right
+	 * and left-left case that only exists when deleting. Hence the
+	 * duplication of code.
+	 */
+	for (n = base; n != leaf;) {
+		if (path_took_left(&path)) {
+			x = *n;
+			if (x->balance < 0) {
+				y = x->rlink;
+				if (y->balance > 0) {
+					/* Right-left case. */
+					z = y->llink;
+					x->rlink = z->llink;
+					z->llink = x;
+					y->llink = z->rlink;
+					z->rlink = y;
+					*n = z;
+
+					x->balance = z->balance < 0 ? 1 : 0;
+					y->balance = z->balance > 0 ? -1 : 0;
+					z->balance = 0;
+				} else {
+					/* Right-right case. */
+					x->rlink = y->llink;
+					y->llink = x;
+					*n = y;
+
+					if (y->balance < 0) {
+						x->balance = 0;
+						y->balance = 0;
+					} else {
+						x->balance = -1;
+						y->balance = 1;
+					}
+				}
+			} else {
+				--x->balance;
+			}
+			n = &x->llink;
+		} else {
+			x = *n;
+			if (x->balance > 0) {
+				y = x->llink;
+				if (y->balance < 0) {
+					/* Left-right case. */
+					z = y->rlink;
+					y->rlink = z->llink;
+					z->llink = y;
+					x->llink = z->rlink;
+					z->rlink = x;
+					*n = z;
+
+					x->balance = z->balance > 0 ? -1 : 0;
+					y->balance = z->balance < 0 ? 1 : 0;
+					z->balance = 0;
+				} else {
+					/* Left-left case. */
+					x->llink = y->rlink;
+					y->rlink = x;
+					*n = y;
+
+					if (y->balance > 0) {
+						x->balance = 0;
+						y->balance = 0;
+					} else {
+						x->balance = 1;
+						y->balance = -1;
+					}
+				}
+			} else {
+				++x->balance;
+			}
+			n = &x->rlink;
 		}
 	}
-	if (p != *rootp)
-		free(*rootp);			/* D4: Free node */
-	*rootp = q;				/* link parent to new node */
-	return p;
+
+	/* Return the parent of the old entry. */
+	*rootp = root;
+	return (result);
 }

Modified: head/lib/libc/stdlib/tsearch.3
==============================================================================
--- head/lib/libc/stdlib/tsearch.3	Tue Dec 22 17:46:14 2015	(r292612)
+++ head/lib/libc/stdlib/tsearch.3	Tue Dec 22 18:12:11 2015	(r292613)
@@ -27,7 +27,7 @@
 .\"	OpenBSD: tsearch.3,v 1.2 1998/06/21 22:13:49 millert Exp
 .\" $FreeBSD$
 .\"
-.Dd June 15, 1997
+.Dd December 6, 2015
 .Dt TSEARCH 3
 .Os
 .Sh NAME
@@ -50,8 +50,12 @@ The
 .Fn tsearch ,
 and
 .Fn twalk
-functions manage binary search trees based on algorithms T and D
-from Knuth (6.2.2).
+functions manage binary search trees.
+This implementation uses a balanced AVL tree,
+which due to its strong theoretical limit on the height of the tree has
+the advantage of calling the comparison function relatively
+infrequently.
+.Pp
 The comparison function passed in by
 the user has the same style of return values as
 .Xr strcmp 3 .

Modified: head/lib/libc/stdlib/tsearch.c
==============================================================================
--- head/lib/libc/stdlib/tsearch.c	Tue Dec 22 17:46:14 2015	(r292612)
+++ head/lib/libc/stdlib/tsearch.c	Tue Dec 22 18:12:11 2015	(r292613)
@@ -1,55 +1,200 @@
-/*	$NetBSD: tsearch.c,v 1.7 2012/06/25 22:32:45 abs Exp $	*/
-
-/*
- * Tree search generalized from Knuth (6.2.2) Algorithm T just like
- * the AT&T man page says.
- *
- * The node_t structure is for internal use only, lint doesn't grok it.
+/*-
+ * Copyright (c) 2015 Nuxi, https://nuxi.nl/
  *
- * Written by reading the System V Interface Definition, not the code.
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
  *
- * Totally public domain.
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
-#if 0
-#if defined(LIBC_SCCS) && !defined(lint)
-__RCSID("$NetBSD: tsearch.c,v 1.7 2012/06/25 22:32:45 abs Exp $");
-#endif /* LIBC_SCCS and not lint */
-#endif
 __FBSDID("$FreeBSD$");
 
-#define _SEARCH_PRIVATE
+#define	_SEARCH_PRIVATE
 #include <search.h>
 #include <stdlib.h>
 
-/* find or insert datum into search tree */
+#include "tsearch_path.h"
+
 void *
-tsearch(const void *vkey, void **vrootp,
+tsearch(const void *key, void **rootp,
     int (*compar)(const void *, const void *))
 {
-	node_t *q;
-	node_t **rootp = (node_t **)vrootp;
+	struct path path;
+	node_t *root, **base, **leaf, *result, *n, *x, *y, *z;
+	int cmp;
 
+	/* POSIX requires that tsearch() returns NULL if rootp is NULL. */
 	if (rootp == NULL)
-		return NULL;
+	return (NULL);
+		root = *rootp;
 
-	while (*rootp != NULL) {	/* Knuth's T1: */
-		int r;
-
-		if ((r = (*compar)(vkey, (*rootp)->key)) == 0)	/* T2: */
-			return *rootp;		/* we found it! */
+	/*
+	 * Find the leaf where the new key needs to be inserted. Return
+	 * if we've found an existing entry. Keep track of the path that
+	 * is taken to get to the node, as we will need it to adjust the
+	 * balances.
+	*/
+	path_init(&path);
+	base = &root;
+	leaf = &root;
+	while (*leaf != NULL) {
+		if ((*leaf)->balance != 0) {
+			/*
+			 * If we reach a node that has a non-zero
+			 * balance on the way, we know that we won't
+			 * need to perform any rotations above this
+			 * point. In this case rotations are always
+			 * capable of keeping the subtree in balance.
+			 * Make this the base node and reset the path.
+			 */
+			base = leaf;
+			path_init(&path);
+		}
+		cmp = compar(key, (*leaf)->key);
+		if (cmp < 0) {
+			path_taking_left(&path);
+			leaf = &(*leaf)->llink;
+		} else if (cmp > 0) {
+			path_taking_right(&path);
+			leaf = &(*leaf)->rlink;
+		} else {
+			return (&(*leaf)->key);
+		}
+	}
 
-		rootp = (r < 0) ?
-		    &(*rootp)->llink :		/* T3: follow left branch */
-		    &(*rootp)->rlink;		/* T4: follow right branch */
+	/* Did not find a matching key in the tree. Insert a new node. */
+	result = *leaf = malloc(sizeof(**leaf));
+	if (result == NULL)
+		return (NULL);
+	result->key = (void *)key;
+	result->llink = NULL;
+	result->rlink = NULL;
+	result->balance = 0;
+
+	/*
+	 * Walk along the same path a second time and adjust the
+	 * balances. Except for the first node, all of these nodes must
+	 * have a balance of zero, meaning that these nodes will not get
+	 * out of balance.
+	*/
+	for (n = *base; n != *leaf;) {
+		if (path_took_left(&path)) {
+			n->balance += 1;
+			n = n->llink;
+		} else {
+			n->balance -= 1;
+			n = n->rlink;
+		}
 	}
 
-	q = malloc(sizeof(node_t));		/* T5: key not found */
-	if (q != 0) {				/* make new node */
-		*rootp = q;			/* link new node to old */
-		q->key = __DECONST(void *, vkey);/* initialize new node */
-		q->llink = q->rlink = NULL;
+	/*
+	 * Adjusting the balances may have pushed the balance of the
+	 * base node out of range. Perform a rotation to bring the
+	 * balance back in range.
+	 */
+	x = *base;
+	if (x->balance > 1) {
+		y = x->llink;
+		if (y->balance < 0) {
+			/*
+			 * Left-right case.
+			 *
+			 *         x
+			 *        / \            z
+			 *       y   D          / \
+			 *      / \     -->    y   x
+			 *     A   z          /|   |\
+			 *        / \        A B   C D
+			 *       B   C
+			 */
+			z = y->rlink;
+			y->rlink = z->llink;
+			z->llink = y;
+			x->llink = z->rlink;
+			z->rlink = x;
+			*base = z;
+
+			x->balance = z->balance > 0 ? -1 : 0;
+			y->balance = z->balance < 0 ? 1 : 0;
+			z->balance = 0;
+		} else {
+			/*
+			 * Left-left case.
+			 *
+			 *        x           y
+			 *       / \         / \
+			 *      y   C  -->  A   x
+			 *     / \             / \
+			 *    A   B           B   C
+			 */
+			x->llink = y->rlink;
+			y->rlink = x;
+			*base = y;
+
+			x->balance = 0;
+			y->balance = 0;
+		}
+	} else if (x->balance < -1) {
+		y = x->rlink;
+		if (y->balance > 0) {
+			/*
+			 * Right-left case.
+			 *
+			 *       x
+			 *      / \              z
+			 *     A   y            / \
+			 *        / \   -->    x   y
+			 *       z   D        /|   |\
+			 *      / \          A B   C D
+			 *     B   C
+			 */
+			node_t *z = y->llink;
+			x->rlink = z->llink;
+			z->llink = x;
+			y->llink = z->rlink;
+			z->rlink = y;
+			*base = z;
+
+			x->balance = z->balance < 0 ? 1 : 0;
+			y->balance = z->balance > 0 ? -1 : 0;
+			z->balance = 0;
+		} else {
+			/*
+			 * Right-right case.
+			 *
+			 *       x               y
+			 *      / \             / \
+			 *     A   y    -->    x   C
+			 *        / \         / \
+			 *       B   C       A   B
+			 */
+			x->rlink = y->llink;
+			y->llink = x;
+			*base = y;
+
+			x->balance = 0;
+			y->balance = 0;
+		}
 	}
-	return q;
+
+	/* Return the new entry. */
+	*rootp = root;
+	return (&result->key);
 }

Added: head/lib/libc/stdlib/tsearch_path.h
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ head/lib/libc/stdlib/tsearch_path.h	Tue Dec 22 18:12:11 2015	(r292613)
@@ -0,0 +1,97 @@
+/*-
+ * Copyright (c) 2015 Nuxi, https://nuxi.nl/
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ *
+ * $FreeBSD$
+ */
+
+#ifndef TSEARCH_PATH_H
+#define	TSEARCH_PATH_H
+
+#include <limits.h>
+#include <stdbool.h>
+#include <stdint.h>
+
+/*
+ * Bookkeeping for storing a path in a balanced binary search tree from
+ * the root to a leaf node.
+ *
+ * For an AVL tree we know that its maximum height of a tree is bounded
+ * by approximately 1.44 * log2(n) - 0.328. Given that the number of
+ * entries of the tree is constrained by the size of the address space,
+ * two uintptr_t's provide sufficient space to store the path from the
+ * root to any leaf.
+ */
+struct path {
+	uintptr_t steps[2];
+	unsigned int nsteps;
+};
+
+/* Initializes the path structure with a zero-length path. */
+static inline void
+path_init(struct path *p)
+{
+
+	p->nsteps = 0;
+}
+
+#define	STEPS_BIT (sizeof(uintptr_t) * CHAR_BIT)
+
+/* Pushes a step to the left to the end of the path. */
+static inline void
+path_taking_left(struct path *p)
+{
+
+	p->steps[p->nsteps / STEPS_BIT] |=
+	    (uintptr_t)1 << (p->nsteps % STEPS_BIT);
+	++p->nsteps;
+}
+
+/* Pushes a step to the right to the end of the path. */
+static inline void
+path_taking_right(struct path *p)
+{
+
+	p->steps[p->nsteps / STEPS_BIT] &=
+	    ~((uintptr_t)1 << (p->nsteps % STEPS_BIT));
+	++p->nsteps;
+}
+
+/*
+ * Pops the first step from the path and returns whether it was a step
+ * to the left.
+ */
+static inline bool
+path_took_left(struct path *p)
+{
+	bool result;
+
+	result = p->steps[0] & 0x1;
+	p->steps[0] = (p->steps[0] >> 1) | (p->steps[1] << (STEPS_BIT - 1));
+	p->steps[1] >>= 1;
+	return (result);
+}
+
+#undef STEPS_BIT
+
+#endif

Modified: head/lib/libc/tests/stdlib/Makefile
==============================================================================
--- head/lib/libc/tests/stdlib/Makefile	Tue Dec 22 17:46:14 2015	(r292612)
+++ head/lib/libc/tests/stdlib/Makefile	Tue Dec 22 18:12:11 2015	(r292613)
@@ -3,6 +3,7 @@
 ATF_TESTS_C+=		heapsort_test
 ATF_TESTS_C+=		mergesort_test
 ATF_TESTS_C+=		qsort_test
+ATF_TESTS_C+=		tsearch_test
 
 # TODO: t_getenv_thread, t_mi_vector_hash
 NETBSD_ATF_TESTS_C+=	abs_test

Added: head/lib/libc/tests/stdlib/tsearch_test.c
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ head/lib/libc/tests/stdlib/tsearch_test.c	Tue Dec 22 18:12:11 2015	(r292613)
@@ -0,0 +1,131 @@
+/*-
+ * Copyright (c) 2015 Nuxi, https://nuxi.nl/
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <sys/cdefs.h>
+__FBSDID("$FreeBSD$");
+
+#include <atf-c.h>
+#define _SEARCH_PRIVATE
+#include <search.h>
+#include <stdbool.h>
+#include <stdlib.h>
+
+/* Validates the integrity of an AVL tree. */
+static inline unsigned int
+tnode_assert(const node_t *n)
+{
+	unsigned int height_left, height_right;
+	int balance;
+
+	if (n == NULL)
+		return 0;
+	height_left = tnode_assert(n->llink);
+	height_right = tnode_assert(n->rlink);
+	balance = (int)height_left - (int)height_right;
+	ATF_CHECK(balance >= -1);
+	ATF_CHECK(balance <= 1);
+	ATF_CHECK_EQ(balance, n->balance);
+	return (height_left > height_right ? height_left : height_right) + 1;
+}
+
+static int
+compar(const void *a, const void *b)
+{
+
+	return *(int *)a - *(int *)b;
+}
+
+ATF_TC_WITHOUT_HEAD(tsearch_test);
+ATF_TC_BODY(tsearch_test, tc)
+{
+	/*
+	 * Run the test below in a deterministic fashion to prevent this
+	 * test from potentially flapping. We assume that this provides
+	 * enough coverage.
+	 */
+#if 0
+	unsigned short random_state[3];
+	arc4random_buf(random_state, sizeof(random_state));
+#else
+	unsigned short random_state[3] = { 26554, 13330, 3246 };
+#endif
+
+#define NKEYS 1000
+	/* Create 1000 possible keys. */
+	int keys[NKEYS];
+	for (int i = 0; i < NKEYS; ++i)
+		keys[i] = i;
+
+	/* Apply random operations on a binary tree and check the results. */
+	void *root = NULL;
+	bool present[NKEYS] = {};
+	for (int i = 0; i < NKEYS * 10; ++i) {
+		int key = nrand48(random_state) % NKEYS;
+		switch (nrand48(random_state) % 3) {
+		case 0:  /* tdelete(). */
+			if (present[key]) {
+				ATF_CHECK(tdelete(&key, &root, compar) != NULL);
+				present[key] = false;
+			} else {
+				ATF_CHECK_EQ(NULL,
+				    tdelete(&key, &root, compar));
+			}
+			break;
+		case 1:  /* tfind(). */
+			if (present[key]) {
+				ATF_CHECK_EQ(&keys[key],
+				    *(int **)tfind(&key, &root, compar));
+			} else {
+				ATF_CHECK_EQ(NULL, tfind(&key, &root, compar));
+			}
+			break;
+		case 2:  /* tsearch(). */
+			if (present[key]) {
+				ATF_CHECK_EQ(&keys[key],
+				    *(int **)tsearch(&key, &root, compar));
+			} else {
+				ATF_CHECK_EQ(&keys[key], *(int **)tsearch(
+				    &keys[key], &root, compar));
+				present[key] = true;
+			}
+			break;
+		}
+		tnode_assert(root);
+	}
+
+	/* Remove all entries from the tree. */
+	for (int key = 0; key < NKEYS; ++key)
+		if (present[key])
+			ATF_CHECK(tdelete(&key, &root, compar) != NULL);
+	ATF_CHECK_EQ(NULL, root);
+}
+
+ATF_TP_ADD_TCS(tp)
+{
+
+	ATF_TP_ADD_TC(tp, tsearch_test);
+
+	return (atf_no_error());
+}



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201512221812.tBMICCPZ003177>