Copyright 1996, 2001 Free Software Foundation, Inc.

This file is part of the GNU MP Library.

The GNU MP Library is free software; you can redistribute it and/or modify
it under the terms of either:

  * the GNU Lesser General Public License as published by the Free
    Software Foundation; either version 3 of the License, or (at your
    option) any later version.

or

  * the GNU General Public License as published by the Free Software
    Foundation; either version 2 of the License, or (at your option) any
    later version.

or both in parallel, as here.

The GNU MP Library is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
for more details.

You should have received copies of the GNU General Public License and the
GNU Lesser General Public License along with the GNU MP Library.  If not,
see https://www.gnu.org/licenses/.





This directory contains mpn functions for various SPARC chips.  Code that
runs only on version 8 SPARC implementations is in the v8 subdirectory.

RELEVANT OPTIMIZATION ISSUES

  Load and Store timing

On most early SPARC implementations, the ST instruction takes multiple
cycles, while an STD takes just a single cycle more than an ST.  For the CPUs
in SPARCstation I and II, the times are 3 and 4 cycles, respectively.
Therefore, combining two ST instructions into an STD when possible is a
significant optimization.
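
As a rough illustration of the saving, the following C sketch tallies store
cycles for n words using the 3- and 4-cycle figures quoted above for the
SPARCstation I and II CPUs.  The function names are hypothetical, and the
model assumes that every adjacent pair of stores can actually be merged (STD
needs a doubleword-aligned address and an even/odd register pair).  It counts
nothing but the stores themselves.

```c
#include <assert.h>

/* Cycle counts quoted above for the SPARCstation I and II CPUs. */
enum { ST_CYCLES = 3, STD_CYCLES = 4 };

/* Cycles spent storing n words using only single-word ST. */
static unsigned long st_only_cycles(unsigned long n)
{
    return n * ST_CYCLES;
}

/* Cycles spent storing n words when adjacent pairs are merged into STD.
   An odd leftover word still needs one ST. */
static unsigned long std_paired_cycles(unsigned long n)
{
    return (n / 2) * STD_CYCLES + (n % 2) * ST_CYCLES;
}

/* Cycles saved by pairing.  Pairing never costs more in this model,
   because one STD (4 cycles) replaces two STs (6 cycles). */
static unsigned long cycles_saved(unsigned long n)
{
    assert(st_only_cycles(n) >= std_paired_cycles(n));
    return st_only_cycles(n) - std_paired_cycles(n);
}
```

Under these assumptions, storing 8 words costs 24 cycles with ST alone and
16 cycles with STD pairs, a saving of one cycle per word.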

Later SPARC implementations have single-cycle ST.

On SuperSPARC, we can perform just one memory instruction per cycle, even
though up to two integer instructions can be executed in its pipeline.  For
programs that perform so many memory operations that there are not enough
non-memory operations to issue in parallel with all memory operations, using
LDD and STD when possible helps.
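
The effect can be sketched by counting memory instructions: with one memory
instruction per cycle, that count is a lower bound on the cycle count.  The C
fragment below is a simplified model, not a description of any loop in this
directory.  The function names and the example loop shape (two source limbs
read and one result limb written per limb processed) are assumptions made for
illustration.

```c
#include <assert.h>

/* Memory instructions needed to process n limbs with single-word LD/ST,
   given how many loads and stores each limb requires. */
static unsigned long mem_ops_single(unsigned long n,
                                    unsigned loads_per_limb,
                                    unsigned stores_per_limb)
{
    return n * (loads_per_limb + stores_per_limb);
}

/* Same count when every access uses LDD/STD, each moving two limbs.
   For simplicity n must be even. */
static unsigned long mem_ops_double(unsigned long n,
                                    unsigned loads_per_limb,
                                    unsigned stores_per_limb)
{
    assert(n % 2 == 0);
    return (n / 2) * (loads_per_limb + stores_per_limb);
}
```

For 8 limbs with two loads and one store per limb, the single-word version
needs 24 memory instructions and the doubleword version 12, so the bound
drops from 3 to 1.5 cycles/limb.  Real loops are also limited by their
non-memory instructions and by alignment, which this model ignores.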

UltraSPARC-1/2 have very slow integer multiplication.  In the v9 subdirectory,
we therefore use floating-point multiplication.
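
The following C sketch shows one way floating-point hardware can produce an
exact integer product.  It is not a transcription of the v9 code; the function
name and the choice of 16-bit pieces are assumptions for illustration.  One
operand is split into 16-bit halves so that each partial product stays below
2^48 and is therefore represented exactly in the 53-bit significand of an
IEEE double.

```c
#include <assert.h>
#include <stdint.h>

/* 32x32 -> 64-bit product computed with double-precision multiplies. */
static uint64_t mul32_via_double(uint32_t a, uint32_t b)
{
    double da = (double) a;                      /* exact: a < 2^32 */
    double p_lo = da * (double) (b & 0xffffu);   /* a * low 16 bits of b */
    double p_hi = da * (double) (b >> 16);       /* a * high 16 bits of b */

    /* Both partial products are below 2^48, so no rounding occurred. */
    assert(p_lo < (double) (1ULL << 48));
    assert(p_hi < (double) (1ULL << 48));

    /* Recombine in integer arithmetic: a*b = p_lo + p_hi * 2^16. */
    return (uint64_t) p_lo + ((uint64_t) p_hi << 16);
}
```

The recombination cannot overflow, since the sum equals the true product,
which is below 2^64.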

STATUS

1. On a SuperSPARC, mpn_lshift and mpn_rshift run at 3 cycles/limb, or 2.5
   cycles/limb asymptotically.  We could optimize speed for special counts
   by using ADDXCC.

2. On a SuperSPARC, mpn_add_n and mpn_sub_n run at 2.5 cycles/limb, or 2
   cycles/limb asymptotically.

3. mpn_mul_1 runs at what is believed to be optimal speed.

4. On a SuperSPARC, mpn_addmul_1 and mpn_submul_1 could both be improved by a
   cycle by avoiding one of the add instructions.  See a29k/addmul_1.

The speed of the code for other SPARC implementations is uncertain.
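
The ADDXCC remark in item 1 rests on the fact that shifting left by one bit
is the same as adding a number to itself while propagating the carry from
limb to limb.  The C sketch below illustrates this for 32-bit limbs stored
least significant first.  The function name and interface are hypothetical
and do not mirror mpn_lshift; the loop body corresponds to what a single
add-with-carry instruction would do per limb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shift the n-limb number at src left by one bit, writing n limbs to dst
   and returning the bit shifted out of the top limb.  dst may equal src. */
static uint32_t lshift1_by_add(uint32_t *dst, const uint32_t *src, size_t n)
{
    uint32_t carry = 0;

    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++) {
        uint32_t x = src[i];
        dst[i] = x + x + carry;   /* wraps modulo 2^32 */
        carry = x >> 31;          /* carry out of x + x + carry */
    }
    return carry;
}
```

The carry out is simply the top bit of the source limb: x + x is even, so
adding the incoming carry can never cause a second overflow.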