FreeBSD Mail Archives

Date:      Tue, 8 Jan 2019 00:45:30 -0700
From:      Gavin Howard <gavin.d.howard@gmail.com>
To:        Devin Teske <dteske@freebsd.org>
Cc:        cem@freebsd.org, "freebsd-arch@freebsd.org" <freebsd-arch@freebsd.org>
Subject:   Re: GNU-compatible, BSD-licensed bc
Message-ID:  <CAF=dzROC1P=44D58hY0RcQW-3nwWeXvQ_5s4BNPG3AE=OzCZqQ@mail.gmail.com>
In-Reply-To: <61F802DC-2E59-4E0A-955D-899EBD7874A1@FreeBSD.org>
References:  <CAF=dzRNnurahLBOaKgq8_bDXNuM8biYPFbj6F2vp0t58Ejp8bg@mail.gmail.com> <8FFA4578-0BAE-4F9F-8A06-AE83283BDEA4@FreeBSD.org> <CAG6CVpXam0bJD9B7n0xDQiRF=ZTeH0hN7wd8f8fDGyMSsCwh0w@mail.gmail.com> <CAF=dzRNYxYf7P8q7mZo=Tc6a%2BfTYsARGpG0=ZTvBP1ESLPBLOg@mail.gmail.com> <61F802DC-2E59-4E0A-955D-899EBD7874A1@FreeBSD.org>

On Mon, Jan 7, 2019 at 10:57 PM Devin Teske <dteske@freebsd.org> wrote:
>
>
>
> > On Jan 7, 2019, at 4:42 PM, Gavin Howard <gavin.d.howard@gmail.com> wrote:
> >
> > On Mon, Jan 7, 2019 at 5:38 PM Conrad Meyer <cem@freebsd.org> wrote:
> >>
> >> On Mon, Jan 7, 2019 at 4:00 PM Devin Teske <dteske@freebsd.org> wrote:
> >>> How do you handle arbitrary arithmetic precision?
> >>
> >> Looks like https://github.com/gavinhoward/bc/blob/master/src/num.c .
> >
> > That is correct. Because this bc is meant to help bootstrap the Linux
> > kernel and have no dependencies other than POSIX 2008, I wrote my own.
>
> Impressive. It might be worth turning this into a library itself.

Eh...it is completely tuned for bc. And it won't be fast,
unfortunately. See below.

> > Also, the POSIX bc standard mandates doing math in decimal. OpenSSL
> > would not be smart if they did that.
>
> Not sure I understand what you mean here.

Well, for starters, OpenSSL's BIGNUM is integer only. Yes, those
integers can be any size, but they are only integers. That is not good
enough for bc; it has to allow arbitrary precision, including
non-integers, and its fractional part can be any size, up to a certain
limit, which you can get from my bc by typing "limits" at the prompt
and looking for the value of BC_NUM_MAX (which is actually the maximum
number of decimal digits, period).

Also, while bc requires arithmetic to be in decimal, OpenSSL probably
uses unsigned integers to do binary-based arithmetic. Let me explain.
(And yes, I am explaining in detail, even though you probably know
this already. I am just making sure I cover my bases.)

In any (normal) bignum implementation, a bignum is going to be an
array of hardware-based integers (in the case of OpenSSL, probably
unsigned ints). A normal bignum implementation would use the entire
range (or almost the entire range) of that integer type. For 32-bit
unsigned integers, that means that each "limb" (item in the bignum
array) can hold 2^32 integers. For a hardware-based, unsigned integer
implementation, that means that it only needs one item in the array
until the bignum is bigger than 2^32-1. Once it overflows that,
another unsigned integer will be added to the array and gets the
overflow. This allows the bignum implementation at least 2 advantages:
1) it can use very fast hardware types, and 2) the size of the bignum
array is kept as small as possible.

I could not do that for bc. Each "limb" could only hold one decimal
digit, meaning that each item in the array was limited to 10^1
numbers. Obviously, that is *much* smaller than 2^32. Now, in my bc, I
use unsigned chars, so my limbs are 1/4 the size of the limbs in a
32-bit-based bignum, but each limb can only use a small fraction of
the range of an unsigned char. This means three things: 1) I have to
manually check for overflow over 9, which adds code, 2) math on each
limb in the bc takes longer than in a 32-bit-based bignum, since
unsigned chars have to be zero-extended to the size of an unsigned int
before doing the actual math, and 3) the amount of memory needed (on
average) is much higher (though it would be astronomical if I used
unsigned ints). I also have to store the location of the radix
(decimal point) and sometimes perform more expensive operations when
doing math on digits behind the radix.

I would not have written it that way, but the POSIX bc spec mandated
it. (See https://pubs.opengroup.org/onlinepubs/9699919799/utilities/bc.html#tag_20_09_13_03,
where it says "Internal computations shall be conducted as if in
decimal, regardless of the input and output bases, to the specified
number of decimal digits.") So, in essence, a bc-based bignum library
is never going to be very fast. And it generally doesn't have to be;
it interacts directly with a user most of the time, and users are slow
compared to computers. Now, I have taken a few pains to make sure my
bc is as fast as possible, under the circumstances, but the math is
still the most expensive part.

It's kind of freaky, actually. My parser zips through my math library
fast enough that the user doesn't even notice the load lag, and though
my bytecode interpreter is not slow, it is not fast either, since my
parser does not optimize the bytecode. Despite that, the math is
consistently taking about 80+ percent of the time, and that percentage
goes *way* up when there are bigger numbers, since only addition and
subtraction are linear. (I do use Karatsuba for multiplication, and it
is a miracle, but it is still superlinear.)

> > There are also a few
> > peculiarities with the POSIX bc standard that (more or less) require a
> > standalone implementation.
> >
>
> How hard would it be to convert a bn(3)-based library to use your code?
> Would there be any performance impact -- I've benchmarked bn(3) to
> be pretty fast.

It would not be terribly hard, but as I said above, it would not be
fast at all, at least compared to a well-written hardware-based,
binary bignum implementation. But if you want to, go ahead; I would
appreciate the credit (though the license does not even require that).

> > Also, right now I am working on getting a release candidate out that
> > will enable me to make a quick port that Stefan could use as a jumping
> > off point. My build system changed between 1.0 and now, and I would
> > like to be able to test it.
> >
>
> Cool. Looking forward to it.

It's out. It works great. The Makefile that I sent to the mailing list
a few messages back does the job well enough, though I was told in a
private message not to use the GNU bc port's pkg-descr file, since it
might be under the GPL.

However, note that this is not the final 1.1 release; it is just for
testing, even though it is high quality.

Gavin Howard

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAF=dzROC1P=44D58hY0RcQW-3nwWeXvQ_5s4BNPG3AE=OzCZqQ>

Header And Logo

Peripheral Links

Site Navigation

Header And Logo

Peripheral Links

Search

Site Navigation