[Toybox] Toybox bc's transcendental and irrational functions produce incorrect results, memory leaks and exhibit brute force algorithmic complexity.

Mon Aug 27 13:39:26 PDT 2018

Hi Sasha,

Thank you for bringing to my attention these potentially serious
issues. I will certainly investigate each of them and make the
appropriate changes. I've included an associate who is familiar
with the origins and development of this `bc' to help me address
each point below, as you make some strong assertions.

To your message [1] "Toybox bc run away interactive process",

> When attempting to use ctrl-c to kill the toybox `bc' implementation
> while in non-interactive mode a bug results in it entering interactive
> mode while the program continues to run and then the user loses
> control of both interactive and non-interactive modes.
>
> To reproduce:
> echo "s(131231)" | ./bc -l   .. and hit ctrl-c
>
> This consumes many GB of memory before finally exhausting the system.

On a glibc-based x86_64 system I was not able to reproduce this
issue as stated above, however one of your later messages stated
that you encountered the issues on an Alpine Linux (musl-based)
system. On an Alpine 3.8 system I still could not reproduce it,
both in the Toybox environment and the stand-alone upstream [5].

My associate was able to reproduce this using '131231' and '0.5'
on Alpine, even using a fixed scale, e.g., "scale=50", using the
stand-alone version.

As to your second message [2] "Toybox bc's transcendental and
irrational functions produce incorrect results, memory leaks and
exhibit brute force algorithmic complexity.",

> Upon inspection of the mathematics core of the toybox `bc' it appears
> to have multiple systemic issues and is potentially unsalvagable.

Let me begin by saying that I'm grateful for your insightful and
detailed suggestions. If you have any more suggestions that may
be helpful please feel free to write those to the mailing list.
Several subscribers are definitely interested in this material.

> All transcendental functions produce out of bounds memory accesses.
> All builtin functions produce gratuitous and insecure memory errors.

We've tested `bc' using three static analysis tools, two memory
scanners (Google Address Sanitizer and Valgrind), in addition to
manual review, and have not found "gratuitous and insecure
memory errors". There may be a single leak in the IO module, and
we are presently looking into this.

Valgrind does indicate excess (but not illegal) memory allocation in
some of the transcendental functions, with certain inputs. We've
yet to isolate specific classes of inputs that cause the issue.

> Excessive or incorrect memory consumption for invocations of
> transcendental functions result in total memory exhaustion consuming
> about 1 GB for every few seconds of run time.

Would you please explain how we might be able to reproduce this?
Is there a specific scenario or condition where this occurs, and
on which particular platform(s) does this occur?

> The algorithms used are of the most naive possible variety and balk
> under even the smallest inputs taking hundreds of times longer than
> similar implementations.

Typically, "naive" algorithms perform better for smaller inputs
and more "advanced" algorithms are desired for their lower bound
which is realized as the input grows large. To hear that "even
the smallest inputs [are] taking hundreds of times longer" than
similar implementations is ... odd.

Would you please share your benchmarks and testing methods? It
will be tremendously helpful. Reference benchmarks to "similar"
implementations (and what those are) would be even better.

For larger problems, you are certainly correct. This `bc' hasn't
been optimized for performance; it's been optimized for size. It
implements a complete language, virtual machine and math library
in under 6000 LOC.

> All builtin transcendental and irrational functions produce incorrect
> results.

Would you please provide an example (ideally for each) showing
an instance where incorrect results are produced?

> Functions that should be complete in a fraction of a second are
> getting bogged down in naive algorithmic complexity.

See my above remark re: algorithmic complexity, and please offer
an example or two. There had been some discussion regarding just
how "precise" the computation needs to be: an exact solution for
all digits except the last? We will refer to the specification
for clarification on this matter.

> I had read that a team of mathematicians and systems programmers were
> working on the new toybox bc implementation. What exactly happened
> here?

I was not aware of such efforts/arrangements. The Toybox mailing
list on 2018-03-09 by a Christopher M. Graff states that he had
asked a few people, including Gavin D. Howard ("gdh") who wrote
the `bc' in question [5], to assist him with his implementation.
Who are you referring to? That message doesn't reference either,
but perhaps we could collaborate with the mathematicians to help
in achieving lower bounds?

In your third message [4],

> We ran across these errors while doing "make check" on the hebimath
> bignum library on an alpine linux system running against toybox's
> `bc'. Hebimath uses POSIX bc as part of its test suite, Perhaps it
> could be used to drive the core arithmetic of your POSIX bc
> implementation as it is MIT licensed and is fast and accurate.

> https://github.com/suiginsoft/hebimath

This is a helpful suggestion (to use the 'hebimath' test suite);
thank you Sasha. In December 2017 my associate had discussed the
'hebimath' library in a Freenode channel with the aforementioned
CM Graff and determined it a reasonable implementation, but that
it was rather large and would not "fit" into the Toybox project
(and size is the limiting factor).

Anyway, we'll review and make the necessary improvements.

Regards,

    Gavin D. Howard             Zach van Rijn
    "gdh"                       "Thalheim"

[1]: http://lists.landley.net/pipermail/toybox-landley.net
/2018-August/009626.html

[2]: http://lists.landley.net/pipermail/toybox-landley.net
/2018-August/009628.html

[3]: http://lists.landley.net/pipermail/toybox-landley.net
/2018-March/009403.html

[4]: http://lists.landley.net/pipermail/toybox-landley.net
/2018-August/009629.html

[5]: https://github.com/gavinhoward/bc

On Mon, Aug 27, 2018 at 7:29 AM Sasha Toltec <sashatoltec1 at gmail.com> wrote:

> We ran across these errors while doing "make check" on the hebimath
> bignum library on an alpine linux system running against toybox's
> `bc'. Hebimath uses POSIX bc as part of its test suite, Perhaps it
> could be used to drive the core arithmetic of your POSIX bc
> implementation as it is MIT licensed and is fast and accurate.
>
> https://github.com/suiginsoft/hebimath
>
> Sasha Toltec
> Toltec Enterprises
> sashatoltec1 at gmail.com
>
> On 8/27/18, Sasha Toltec <sashatoltec1 at gmail.com> wrote:
> > Upon inspection of the mathematics core of the toybox bc it appears to
> > have multiple systemic issues and is potentially unsalvagable.
> >
> > All transcendental functions produce out of bounds memory accesses.
> >
> > All builtin functions produce gratuitous and insecure memory errors.
> >
> > Excessive or incorrect memory consumption for invocations of
> > transcendental functions result in total memory exhaustion consuming
> > about 1 GB for every few seconds of run time.
> >
> > The algorithms used are of the most naive possible variety and balk
> > under even the smallest inputs taking hundreds of times longer than
> > similar implementations.
> >
> > All builtin transcendental and irrational functions produce incorrect
> > results.
> >
> > Functions that should be complete in a fraction of a second are
> > getting bogged down in naive algorithmic complexity.
> >
> > I had read that a team of mathematicians and systems programmers were
> > working on the
> > new toybox bc implementation. What exactly happened here?
> >
> >
> > Sasha Toltec
> > Toltec Enterprises
> > sashatoltec1 at gmail.com
> >
> _______________________________________________
> Toybox mailing list
> Toybox at lists.landley.net
> http://lists.landley.net/listinfo.cgi/toybox-landley.net
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.landley.net/pipermail/toybox-landley.net/attachments/20180827/4d2444f1/attachment.html>