|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
<html>
|
|
Packit |
5c3484 |
<head>
|
|
Packit |
5c3484 |
<title>GMP Development Projects</title>
|
|
Packit |
5c3484 |
<link rel="shortcut icon" href="favicon.ico">
|
|
Packit |
5c3484 |
<link rel="stylesheet" href="gmp.css">
|
|
Packit |
5c3484 |
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
|
|
Packit |
5c3484 |
</head>
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
<center>
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
GMP Development Projects
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
</center>
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
<font size=-1>
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Copyright 2000-2006, 2008-2011 Free Software Foundation, Inc.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
This file is part of the GNU MP Library.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
The GNU MP Library is free software; you can redistribute it and/or modify
|
|
Packit |
5c3484 |
it under the terms of either:
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
* the GNU Lesser General Public License as published by the Free
|
|
Packit |
5c3484 |
Software Foundation; either version 3 of the License, or (at your
|
|
Packit |
5c3484 |
option) any later version.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
or
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
* the GNU General Public License as published by the Free Software
|
|
Packit |
5c3484 |
Foundation; either version 2 of the License, or (at your option) any
|
|
Packit |
5c3484 |
later version.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
or both in parallel, as here.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
The GNU MP Library is distributed in the hope that it will be useful, but
|
|
Packit |
5c3484 |
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
|
|
Packit |
5c3484 |
or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
|
|
Packit |
5c3484 |
for more details.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
You should have received copies of the GNU General Public License and the
|
|
Packit |
5c3484 |
GNU Lesser General Public License along with the GNU MP Library. If not,
|
|
Packit |
5c3484 |
see https://www.gnu.org/licenses/.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
</font>
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
This file current as of 29 Jan 2014. An up-to-date version is available at
|
|
Packit |
5c3484 |
https://gmplib.org/projects.html.
|
|
Packit |
5c3484 |
Please send comments about this page to gmp-devel<font>@</font>gmplib.org.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
This file lists projects suitable for volunteers. Please see the
|
|
Packit |
5c3484 |
tasks file for smaller tasks.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
If you want to work on any of the projects below, please let
|
|
Packit |
5c3484 |
gmp-devel<font>@</font>gmplib.org know. If you want to help with a project
|
|
Packit |
5c3484 |
that already somebody else is working on, you will get in touch through
|
|
Packit |
5c3484 |
gmp-devel<font>@</font>gmplib.org. (There are no email addresses of
|
|
Packit |
5c3484 |
volunteers below, due to spamming problems.)
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Faster multiplication
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Work on the algorithm selection code for unbalanced multiplication.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Implement an FFT variant computing the coefficients mod m different
|
|
Packit |
5c3484 |
limb size primes of the form l*2^k+1. i.e., compute m separate FFTs.
|
|
Packit |
5c3484 |
The wanted coefficients will at the end be found by lifting with CRT
|
|
Packit |
5c3484 |
(Chinese Remainder Theorem). If we let m = 3, i.e., use 3 primes, we
|
|
Packit |
5c3484 |
can split the operands into coefficients at limb boundaries, and if
|
|
Packit |
5c3484 |
our machine uses b-bit limbs, we can multiply numbers with close to
|
|
Packit |
5c3484 |
2^b limbs without coefficient overflow. For smaller multiplication,
|
|
Packit |
5c3484 |
we might perhaps let m = 1, and instead of splitting our operands at
|
|
Packit |
5c3484 |
limb boundaries, split them in much smaller pieces. We might also use
|
|
Packit |
5c3484 |
4 or more primes, and split operands into bigger than b-bit chunks.
|
|
Packit |
5c3484 |
By using more primes, the gain in shorter transform length, but lose
|
|
Packit |
5c3484 |
in having to do more FFTs, but that is a slight total save. We then
|
|
Packit |
5c3484 |
lose in more expensive CRT.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
[We now have two implementations of this algorithm, one by Tommy
|
|
Packit |
5c3484 |
Färnqvist and one by Niels Möller.]
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Work on short products. Our mullo and mulmid are probably K, but we
|
|
Packit |
5c3484 |
lack mulhi.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Another possibility would be an optimized cube. In the basecase that
|
|
Packit |
5c3484 |
should definitely be able to save cross-products in a similar fashion to
|
|
Packit |
5c3484 |
squaring, but some investigation might be needed for how best to adapt
|
|
Packit |
5c3484 |
the higher-order algorithms. Not sure whether cubing or further small
|
|
Packit |
5c3484 |
powers have any particularly important uses though.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Assembly routines
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Write new and improve existing assembly routines. The tests/devel
|
|
Packit |
5c3484 |
programs and the tune/speed.c and tune/many.pl programs are useful for
|
|
Packit |
5c3484 |
testing and timing the routines you write. See the README files in those
|
|
Packit |
5c3484 |
directories for more information.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Please make sure your new routines are fast for these three situations:
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Small operands of less than, say, 10 limbs.
|
|
Packit |
5c3484 |
Medium size operands, that fit into the cache.
|
|
Packit |
5c3484 |
Huge operands that does not fit into the cache.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
The most important routines are mpn_addmul_1, mpn_mul_basecase and
|
|
Packit |
5c3484 |
mpn_sqr_basecase. The latter two don't exist for all machines, while
|
|
Packit |
5c3484 |
mpn_addmul_1 exists for almost all machines.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Standard techniques for these routines are unrolling, software
|
|
Packit |
5c3484 |
pipelining, and specialization for common operand values. For machines
|
|
Packit |
5c3484 |
with poor integer multiplication, it is sometimes possible to remedy the
|
|
Packit |
5c3484 |
situation using floating-point operations or SIMD operations such as MMX
|
|
Packit |
5c3484 |
(x86) (x86), SSE (x86), VMX (PowerPC), VIS (Sparc).
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Using floating-point operations is interesting but somewhat tricky.
|
|
Packit |
5c3484 |
Since IEEE double has 53 bit of mantissa, one has to split the operands
|
|
Packit |
5c3484 |
in small pieces, so that no intermediates are greater than 2^53. For
|
|
Packit |
5c3484 |
32-bit computers, splitting one operand into 16-bit pieces works. For
|
|
Packit |
5c3484 |
64-bit machines, one operand can be split into 21-bit pieces and the
|
|
Packit |
5c3484 |
other into 32-bit pieces. (A 64-bit operand can be split into just three
|
|
Packit |
5c3484 |
21-bit pieces if one allows the split operands to be negative!)
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Faster sqrt
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
The current code uses divisions, which are reasonably fast, but it'd be
|
|
Packit |
5c3484 |
possible to use only multiplications by computing 1/sqrt(A) using this
|
|
Packit |
5c3484 |
iteration:
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
2
|
|
Packit |
5c3484 |
x = x (3 − A x )/2
|
|
Packit |
5c3484 |
i+1 i i
|
|
Packit |
5c3484 |
The square root can then be computed like this:
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
sqrt(A) = A x
|
|
Packit |
5c3484 |
n
|
|
Packit |
5c3484 |
That final multiply might be the full size of the input (though it might
|
|
Packit |
5c3484 |
only need the high half of that), so there may or may not be any speedup
|
|
Packit |
5c3484 |
overall.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
We should probably allow a special exponent-like parameter, to speed
|
|
Packit |
5c3484 |
computations of a precise square root of a small number in mpf and mpfr.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Nth root
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Improve mpn_rootrem. The current code is not too bad, but its time
|
|
Packit |
5c3484 |
complexity is a function of the input, while it is possible to make
|
|
Packit |
5c3484 |
the average complexity a function of the output.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Fat binaries
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Add more functions to the set of fat functions.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
The speed of multiplication is today highly dependent on combination
|
|
Packit |
5c3484 |
functions like addlsh1_n . A fat binary will never use any such
|
|
Packit |
5c3484 |
functions, since they are classified as optional. Ideally, we should use
|
|
Packit |
5c3484 |
them, but making the current compile-time selections of optional functions
|
|
Packit |
5c3484 |
become run-time selections for fat binaries.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
If we make fat binaries work really well, we should move away frm tehe
|
|
Packit |
5c3484 |
current configure scheme (at least by default) and instead include all code
|
|
Packit |
5c3484 |
always.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Exceptions
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Some sort of scheme for exceptions handling would be desirable.
|
|
Packit |
5c3484 |
Presently the only thing documented is that divide by zero in GMP
|
|
Packit |
5c3484 |
functions provokes a deliberate machine divide by zero (on those systems
|
|
Packit |
5c3484 |
where such a thing exists at least). The global gmp_errno
|
|
Packit |
5c3484 |
is not actually documented, except for the old gmp_randinit
|
|
Packit |
5c3484 |
function. Being currently just a plain global means it's not
|
|
Packit |
5c3484 |
thread-safe.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
The basic choices for exceptions are returning an error code or having a
|
|
Packit |
5c3484 |
handler function to be called. The disadvantage of error returns is they
|
|
Packit |
5c3484 |
have to be checked, leading to tedious and rarely executed code, and
|
|
Packit |
5c3484 |
strictly speaking such a scheme wouldn't be source or binary compatible.
|
|
Packit |
5c3484 |
The disadvantage of a handler function is that a longjmp or
|
|
Packit |
5c3484 |
similar recovery from it may be difficult. A combination would be
|
|
Packit |
5c3484 |
possible, for instance by allowing the handler to return an error code.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Divide-by-zero, sqrt-of-negative, and similar operand range errors can
|
|
Packit |
5c3484 |
normally be detected at the start of functions, so exception handling
|
|
Packit |
5c3484 |
would have a clean state. What's worth considering though is that the
|
|
Packit |
5c3484 |
GMP function detecting the exception may have been called via some third
|
|
Packit |
5c3484 |
party library or self contained application module, and hence have
|
|
Packit |
5c3484 |
various bits of state to be cleaned up above it. It'd be highly
|
|
Packit |
5c3484 |
desirable for an exceptions scheme to allow for such cleanups.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
The C++ destructor mechanism could help with cleanups both internally and
|
|
Packit |
5c3484 |
externally, but being a plain C library we don't want to depend on that.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
A C++ throw might be a good optional extra exceptions
|
|
Packit |
5c3484 |
mechanism, perhaps under a build option. For
|
|
Packit |
5c3484 |
GCC -fexceptions will add the necessary frame information to
|
|
Packit |
5c3484 |
plain C code, or GMP could be compiled as C++.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Out-of-memory exceptions are expected to be handled by the
|
|
Packit |
5c3484 |
mp_set_memory_functions routines, rather than being a
|
|
Packit |
5c3484 |
prospective part of divide-by-zero etc. Some similar considerations
|
|
Packit |
5c3484 |
apply but what differs is that out-of-memory can arise deep within GMP
|
|
Packit |
5c3484 |
internals. Even fundamental routines like mpn_add_n and
|
|
Packit |
5c3484 |
mpn_addmul_1 can use temporary memory (for instance on Cray
|
|
Packit |
5c3484 |
vector systems). Allowing for an error code return would require an
|
|
Packit |
5c3484 |
awful lot of checking internally. Perhaps it'd still be worthwhile, but
|
|
Packit |
5c3484 |
it'd be a lot of changes and the extra code would probably be rather
|
|
Packit |
5c3484 |
rarely executed in normal usages.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
A longjmp recovery for out-of-memory will currently, in
|
|
Packit |
5c3484 |
general, lead to memory leaks and may leave GMP variables operated on in
|
|
Packit |
5c3484 |
inconsistent states. Maybe it'd be possible to record recovery
|
|
Packit |
5c3484 |
information for use by the relevant allocate or reallocate function, but
|
|
Packit |
5c3484 |
that too would be a lot of changes.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
One scheme for out-of-memory would be to note that all GMP allocations go
|
|
Packit |
5c3484 |
through the mp_set_memory_functions routines. So if the
|
|
Packit |
5c3484 |
application has an intended setjmp recovery point it can
|
|
Packit |
5c3484 |
record memory activity by GMP and abandon space allocated and variables
|
|
Packit |
5c3484 |
initialized after that point. This might be as simple as directing the
|
|
Packit |
5c3484 |
allocation functions to a separate pool, but in general would have the
|
|
Packit |
5c3484 |
disadvantage of needing application-level bookkeeping on top of the
|
|
Packit |
5c3484 |
normal system malloc . An advantage however is that it needs
|
|
Packit |
5c3484 |
nothing from GMP itself and on that basis doesn't burden applications not
|
|
Packit |
5c3484 |
needing recovery. Note that there's probably some details to be worked
|
|
Packit |
5c3484 |
out here about reallocs of existing variables, and perhaps about copying
|
|
Packit |
5c3484 |
or swapping between "permanent" and "temporary" variables.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Applications desiring a fine-grained error control, for instance a
|
|
Packit |
5c3484 |
language interpreter, would very possibly not be well served by a scheme
|
|
Packit |
5c3484 |
requiring longjmp . Wrapping every GMP function call with a
|
|
Packit |
5c3484 |
setjmp would be very inconvenient.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Another option would be to let mpz_t etc hold a sort of NaN,
|
|
Packit |
5c3484 |
a special value indicating an out-of-memory or other failure. This would
|
|
Packit |
5c3484 |
be similar to NaNs in mpfr. Unfortunately such a scheme could only be
|
|
Packit |
5c3484 |
used by programs prepared to handle such special values, since for
|
|
Packit |
5c3484 |
instance a program waiting for some condition to be satisfied could
|
|
Packit |
5c3484 |
become an infinite loop if it wasn't also watching for NaNs. The work to
|
|
Packit |
5c3484 |
implement this would be significant too, lots of checking of inputs and
|
|
Packit |
5c3484 |
intermediate results. And if mpn routines were to
|
|
Packit |
5c3484 |
participate in this (which they would have to internally) a lot of new
|
|
Packit |
5c3484 |
return values would need to be added, since of course there's no
|
|
Packit |
5c3484 |
mpz_t etc structure for them to indicate failure in.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Stack overflow is another possible exception, but perhaps not one that
|
|
Packit |
5c3484 |
can be easily detected in general. On i386 GNU/Linux for instance GCC
|
|
Packit |
5c3484 |
normally doesn't generate stack probes for an alloca , but
|
|
Packit |
5c3484 |
merely adjusts %esp . A big enough alloca can
|
|
Packit |
5c3484 |
miss the stack redzone and hit arbitrary data. GMP stack usage is
|
|
Packit |
5c3484 |
normally a function of operand size, which might be enough for some
|
|
Packit |
5c3484 |
applications to know they'll be safe. Otherwise a fixed maximum usage
|
|
Packit |
5c3484 |
can probably be obtained by building with
|
|
Packit |
5c3484 |
--enable-alloca=malloc-reentrant (or
|
|
Packit |
5c3484 |
notreentrant ). Arranging the default to be
|
|
Packit |
5c3484 |
alloca only on blocks up to a certain size and
|
|
Packit |
5c3484 |
malloc thereafter might be a better approach and would have
|
|
Packit |
5c3484 |
the advantage of not having calculations limited by available stack.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Actually recovering from stack overflow is of course another problem. It
|
|
Packit |
5c3484 |
might be possible to catch a SIGSEGV in the stack redzone
|
|
Packit |
5c3484 |
and do something in a sigaltstack , on systems which have
|
|
Packit |
5c3484 |
that, but recovery might otherwise not be possible. This is worth
|
|
Packit |
5c3484 |
bearing in mind because there's no point worrying about tight and careful
|
|
Packit |
5c3484 |
out-of-memory recovery if an out-of-stack is fatal.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Operand overflow is another exception to be addressed. It's easy for
|
|
Packit |
5c3484 |
instance to ask mpz_pow_ui for a result bigger than an
|
|
Packit |
5c3484 |
mpz_t can possibly represent. Currently overflows in limb
|
|
Packit |
5c3484 |
or byte count calculations will go undetected. Often they'll still end
|
|
Packit |
5c3484 |
up asking the memory functions for blocks bigger than available memory,
|
|
Packit |
5c3484 |
but that's by no means certain and results are unpredictable in general.
|
|
Packit |
5c3484 |
It'd be desirable to tighten up such size calculations. Probably only
|
|
Packit |
5c3484 |
selected routines would need checks, if it's assumed say that no input
|
|
Packit |
5c3484 |
will be more than half of all memory and hence size additions like say
|
|
Packit |
5c3484 |
mpz_mul won't overflow.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Performance Tool
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
It'd be nice to have some sort of tool for getting an overview of
|
|
Packit |
5c3484 |
performance. Clearly a great many things could be done, but some primary
|
|
Packit |
5c3484 |
uses would be,
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Checking speed variations between compilers.
|
|
Packit |
5c3484 |
Checking relative performance between systems or CPUs.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
A combination of measuring some fundamental routines and some
|
|
Packit |
5c3484 |
representative application routines might satisfy these.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
The tune/time.c routines would be the easiest way to get good accurate
|
|
Packit |
5c3484 |
measurements on lots of different systems. The high level
|
|
Packit |
5c3484 |
speed_measure may or may not suit, but the basic
|
|
Packit |
5c3484 |
speed_starttime and speed_endtime would cover
|
|
Packit |
5c3484 |
lots of portability and accuracy questions.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Using restrict
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
There might be some value in judicious use of C99 style
|
|
Packit |
5c3484 |
restrict on various pointers, but this would need some
|
|
Packit |
5c3484 |
careful thought about what it implies for the various operand overlaps
|
|
Packit |
5c3484 |
permitted in GMP.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Rumour has it some pre-C99 compilers had restrict , but
|
|
Packit |
5c3484 |
expressing tighter (or perhaps looser) requirements. Might be worth
|
|
Packit |
5c3484 |
investigating that before using restrict unconditionally.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Loops are presumably where the greatest benefit would be had, by allowing
|
|
Packit |
5c3484 |
the compiler to advance reads ahead of writes, perhaps as part of loop
|
|
Packit |
5c3484 |
unrolling. However critical loops are generally coded in assembler, so
|
|
Packit |
5c3484 |
there might not be very much to gain. And on Cray systems the explicit
|
|
Packit |
5c3484 |
use of _Pragma gives an equivalent effect.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
One thing to note is that Microsoft C headers (on ia64 at least) contain
|
|
Packit |
5c3484 |
__declspec(restrict) , so a #define of
|
|
Packit |
5c3484 |
restrict should be avoided. It might be wisest to setup a
|
|
Packit |
5c3484 |
gmp_restrict .
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Prime Testing
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
GMP is not really a number theory library and probably shouldn't have
|
|
Packit |
5c3484 |
large amounts of code dedicated to sophisticated prime testing
|
|
Packit |
5c3484 |
algorithms, but basic things well-implemented would suit. Tests offering
|
|
Packit |
5c3484 |
certainty are probably all too big or too slow (or both!) to justify
|
|
Packit |
5c3484 |
inclusion in the main library. Demo programs showing some possibilities
|
|
Packit |
5c3484 |
would be good though.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
The present "repetitions" argument to mpz_probab_prime_p is
|
|
Packit |
5c3484 |
rather specific to the Miller-Rabin tests of the current implementation.
|
|
Packit |
5c3484 |
Better would be some sort of parameter asking perhaps for a maximum
|
|
Packit |
5c3484 |
chance 1/2^x of a probable prime in fact being composite. If
|
|
Packit |
5c3484 |
applications follow the advice that the present reps gives 1/4^reps
|
|
Packit |
5c3484 |
chance then perhaps such a change is unnecessary, but an explicitly
|
|
Packit |
5c3484 |
described 1/2^x would allow for changes in the implementation or even for
|
|
Packit |
5c3484 |
new proofs about the theory.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
mpz_probab_prime_p always initializes a new
|
|
Packit |
5c3484 |
gmp_randstate_t for randomized tests, which unfortunately
|
|
Packit |
5c3484 |
means it's not really very random and in particular always runs the same
|
|
Packit |
5c3484 |
tests for a given input. Perhaps a new interface could accept an rstate
|
|
Packit |
5c3484 |
to use, so successive tests could increase confidence in the result.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
mpn_mod_34lsub1 is an obvious and easy improvement to the
|
|
Packit |
5c3484 |
trial divisions. And since the various prime factors are constants, the
|
|
Packit |
5c3484 |
remainder can be tested with something like
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
#define MP_LIMB_DIVISIBLE_7_P(n) \
|
|
Packit |
5c3484 |
((n) * MODLIMB_INVERSE_7 <= MP_LIMB_T_MAX/7)
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Which would help compilers that don't know how to optimize divisions by
|
|
Packit |
5c3484 |
constants, and is even an improvement on current gcc 3.2 code. This
|
|
Packit |
5c3484 |
technique works for any modulus, see Granlund and Montgomery "Division by
|
|
Packit |
5c3484 |
Invariant Integers" section 9.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
The trial divisions are done with primes generated and grouped at
|
|
Packit |
5c3484 |
runtime. This could instead be a table of data, with pre-calculated
|
|
Packit |
5c3484 |
inverses too. Storing deltas, ie. amounts to add, rather than actual
|
|
Packit |
5c3484 |
primes would save space. udiv_qrnnd_preinv style inverses
|
|
Packit |
5c3484 |
can be made to exist by adding dummy factors of 2 if necessary. Some
|
|
Packit |
5c3484 |
thought needs to be given as to how big such a table should be, based on
|
|
Packit |
5c3484 |
how much dividing would be profitable for what sort of size inputs. The
|
|
Packit |
5c3484 |
data could be shared by the perfect power testing.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Jason Moxham points out that if a sqrt(-1) mod N exists then any factor
|
|
Packit |
5c3484 |
of N must be == 1 mod 4, saving half the work in trial dividing. (If
|
|
Packit |
5c3484 |
x^2==-1 mod N then for a prime factor p we have x^2==-1 mod p and so the
|
|
Packit |
5c3484 |
jacobi symbol (-1/p)=1. But also (-1/p)=(-1)^((p-1)/2), hence must have
|
|
Packit |
5c3484 |
p==1 mod 4.) But knowing whether sqrt(-1) mod N exists is not too easy.
|
|
Packit |
5c3484 |
A strong pseudoprime test can reveal one, so perhaps such a test could be
|
|
Packit |
5c3484 |
inserted part way though the dividing.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Jon Grantham "Frobenius Pseudoprimes" (www.pseudoprime.com) describes a
|
|
Packit |
5c3484 |
quadratic pseudoprime test taking about 3x longer than a plain test, but
|
|
Packit |
5c3484 |
with only a 1/7710 chance of error (whereas 3 plain Miller-Rabin tests
|
|
Packit |
5c3484 |
would offer only (1/4)^3 == 1/64). Such a test needs completely random
|
|
Packit |
5c3484 |
parameters to satisfy the theory, though single-limb values would run
|
|
Packit |
5c3484 |
faster. It's probably best to do at least one plain Miller-Rabin before
|
|
Packit |
5c3484 |
any quadratic tests, since that can identify composites in less total
|
|
Packit |
5c3484 |
time.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Some thought needs to be given to the structure of which tests (trial
|
|
Packit |
5c3484 |
division, Miller-Rabin, quadratic) and how many are done, based on what
|
|
Packit |
5c3484 |
sort of inputs we expect, with a view to minimizing average time.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
It might be a good idea to break out subroutines for the various tests,
|
|
Packit |
5c3484 |
so that an application can combine them in ways it prefers, if sensible
|
|
Packit |
5c3484 |
defaults in mpz_probab_prime_p don't suit. In particular
|
|
Packit |
5c3484 |
this would let applications skip tests it knew would be unprofitable,
|
|
Packit |
5c3484 |
like trial dividing when an input is already known to have no small
|
|
Packit |
5c3484 |
factors.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
For small inputs, combinations of theory and explicit search make it
|
|
Packit |
5c3484 |
relatively easy to offer certainty. For instance numbers up to 2^32
|
|
Packit |
5c3484 |
could be handled with a strong pseudoprime test and table lookup. But
|
|
Packit |
5c3484 |
it's rather doubtful whether a smallnum prime test belongs in a bignum
|
|
Packit |
5c3484 |
library. Perhaps if it had other internal uses.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
An mpz_nthprime might be cute, but is almost certainly
|
|
Packit |
5c3484 |
impractical for anything but small n.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Intra-Library Calls
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
On various systems, calls within libgmp still go through the PLT, TOC or
|
|
Packit |
5c3484 |
other mechanism, which makes the code bigger and slower than it needs to
|
|
Packit |
5c3484 |
be.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
The theory would be to have all GMP intra-library calls resolved directly
|
|
Packit |
5c3484 |
to the routines in the library. An application wouldn't be able to
|
|
Packit |
5c3484 |
replace a routine, the way it can normally, but there seems no good
|
|
Packit |
5c3484 |
reason to do that, in normal circumstances.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
The visibility attribute in recent gcc is good for this,
|
|
Packit |
5c3484 |
because it lets gcc omit unnecessary GOT pointer setups or whatever if it
|
|
Packit |
5c3484 |
finds all calls are local and there's no global data references.
|
|
Packit |
5c3484 |
Documented entrypoints would be protected , and purely
|
|
Packit |
5c3484 |
internal things not wanted by test programs or anything can be
|
|
Packit |
5c3484 |
internal .
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Unfortunately, on i386 it seems protected ends up causing
|
|
Packit |
5c3484 |
text segment relocations within libgmp.so, meaning the library code can't
|
|
Packit |
5c3484 |
be shared between processes, defeating the purpose of a shared library.
|
|
Packit |
5c3484 |
Perhaps this is just a gremlin in binutils (debian packaged
|
|
Packit |
5c3484 |
2.13.90.0.16-1).
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
The linker can be told directly (with a link script, or options) to do
|
|
Packit |
5c3484 |
the same sort of thing. This doesn't change the code emitted by gcc of
|
|
Packit |
5c3484 |
course, but it does mean calls are resolved directly to their targets,
|
|
Packit |
5c3484 |
avoiding a PLT entry.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Keeping symbols private to libgmp.so is probably a good thing in general
|
|
Packit |
5c3484 |
too, to stop anyone even attempting to access them. But some
|
|
Packit |
5c3484 |
undocumented things will need or want to be kept visible, for use by
|
|
Packit |
5c3484 |
mpfr, or the test and tune programs. Libtool has a standard option for
|
|
Packit |
5c3484 |
selecting public symbols (used now for libmp).
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Math functions for the mpf layer
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Implement the functions of math.h for the GMP mpf layer! Check the book
|
|
Packit |
5c3484 |
"Pi and the AGM" by Borwein and Borwein for ideas how to do this. These
|
|
Packit |
5c3484 |
functions are desirable: acos, acosh, asin, asinh, atan, atanh, atan2,
|
|
Packit |
5c3484 |
cos, cosh, exp, log, log10, pow, sin, sinh, tan, tanh.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Note that the mpfr functions already
|
|
Packit |
5c3484 |
provide these functions, and that we usually recommend new programs to use
|
|
Packit |
5c3484 |
mpfr instead of mpf.
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
</body>
|
|
Packit |
5c3484 |
</html>
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
|
|
Packit |
5c3484 |
Local variables:
|
|
Packit |
5c3484 |
eval: (add-hook 'write-file-hooks 'time-stamp)
|
|
Packit |
5c3484 |
time-stamp-start: "This file current as of "
|
|
Packit |
5c3484 |
time-stamp-format: "%:d %3b %:y"
|
|
Packit |
5c3484 |
time-stamp-end: "\\."
|
|
Packit |
5c3484 |
time-stamp-line-limit: 50
|
|
Packit |
5c3484 |
End:
|
|
Packit |
5c3484 |
-->
|