From owner-freebsd-hackers@freebsd.org  Mon May 22 19:11:36 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id C64E3D79BC4
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Mon, 22 May 2017 19:11:36 +0000 (UTC)
 (envelope-from jfs@jstimpfle.de)
Received: from h1929017.stratoserver.net (jstimpfle.de [85.214.245.181])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 90E2D1E2D
 for <freebsd-hackers@freebsd.org>; Mon, 22 May 2017 19:11:35 +0000 (UTC)
 (envelope-from jfs@jstimpfle.de)
Received: from jfs by h1929017.stratoserver.net with local (Exim 4.84_2)
 (envelope-from <jfs@jstimpfle.de>) id 1dCsjn-0007Jf-CK
 for freebsd-hackers@freebsd.org; Mon, 22 May 2017 21:11:27 +0200
Date: Mon, 22 May 2017 21:11:27 +0200
From: Jens Stimpfle <jfs@jstimpfle.de>
To: freebsd-hackers@freebsd.org
Subject: Re: Improved Red-black tree implementation
Message-ID: <20170522191127.GB27949@jstimpfle.de>
References: <20170514050220.GA768@jstimpfle.de>
 <591EA74A.3080500@embedded-brains.de>
 <20170520165303.GA22452@jstimpfle.de>
 <59227456.9020805@embedded-brains.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <59227456.9020805@embedded-brains.de>
User-Agent: Mutt/1.5.23 (2014-03-12)
X-Mailman-Approved-At: Mon, 22 May 2017 19:36:09 +0000
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 May 2017 19:11:36 -0000

On Mon, May 22, 2017 at 07:17:10AM +0200, Sebastian Huber wrote:
> >Considering memory savings, have you considered 2-pointer trees (no
> >parent pointer)? They need a temporary search stack for insertion and
> >deletion, but not for iteration ("threaded trees").
>
> The RB and Eternally Confuzzled variants don't use a parent pointer as far
> as I remember.

I don't know "RB", but a few more words since both do not do too well in
your bench, performance-wise. I know that Eternally Confuzzled uses a
possibly easier to implement ("top-down") algorithm which is expected to
be slower. A 2-ptr tree with a conventional implementation need not be
much slower than a 3-ptr, or could in fact be faster due to memory
savings. Here are measurements including an unpolished 2-ptr tree.

Amd X2-250 (x86-64):
====================

  SIL 3-pointer Red-black tree: 0.699s 0.646s 0.635s
  SIL 2-pointer Red-black tree: 0.856s 0.856s 0.856s
  STL std::set:                 0.750s 0.751s 0.750s
  BSD RB tree:                  0.804s 0.832s 0.828s

Raspberry Pi 2 (ARM ???)
========================

  SIL 3-pointer Red-black tree: 3.844s 3.841s 3.839s
  SIL 2-pointer Red-black tree: 4.002s 4.231s 3.954s
  STL std::set:                 3.742s 3.786s 3.776s
  BSD RB tree:                  4.112s 4.107s 4.105s

That particular benchmark generates a random permutation of [0,2^16) in
a flat array of C ints. It adds 2^15 random elements from that array to
an empty tree and then makes another 2^15 operations, randomly chosen to
be add, delete, or find (in 128-bursts) on random elements from the
array.

The variance in the measurements accounts for the randomly chosen array
elements. There is no variance if I chose neighbouring elements or when
the bench data is small enough to fit in the L1 cache (it's also about
100x faster).

The measurements do not account for the performance disadvantage of
2-ptr that is having to "find", i.e. create a search stack, an
(intrusively linked) element that you already hold in your hands, if you
want to delete it or use it as a hint to add a similar element. This is
to my knowledge the only case where maybe 2x slowdown can be expected.

Jens