=pod

=head1 NAME

OPENSSL_ia32cap - the x86[_64] processor capabilities vector

=head1 SYNOPSIS

 env OPENSSL_ia32cap=... <application>

=head1 DESCRIPTION

OpenSSL supports a range of x86[_64] instruction set extensions. These
extensions are denoted by individual bits in the capability vector
returned by the processor in the EDX:ECX register pair after executing
the CPUID instruction with EAX=1 as input (see Intel Application Note
#241618). This vector is copied to memory upon toolkit initialization
and is used to choose between different code paths to provide optimal
performance across a wide range of processors. At the time of this
writing the following bits are significant:

=over 4

=item bit #4 denoting presence of Time-Stamp Counter;

=item bit #19 denoting availability of CLFLUSH instruction;

=item bit #20, reserved by Intel, is used to choose among RC4 code paths;

=item bit #23 denoting MMX support;

=item bit #24, FXSR bit, denoting availability of XMM registers;

=item bit #25 denoting SSE support;

=item bit #26 denoting SSE2 support;

=item bit #28 denoting Hyperthreading, which is used to distinguish
cores with shared cache;

=item bit #30, reserved by Intel, denotes specifically Intel CPUs;

=item bit #33 denoting availability of PCLMULQDQ instruction;

=item bit #41 denoting SSSE3, Supplemental SSE3, support;

=item bit #43 denoting AMD XOP support (forced to zero on non-AMD CPUs);

=item bit #54 denoting availability of MOVBE instruction;

=item bit #57 denoting AES-NI instruction set extension;

=item bit #58, XSAVE bit, lack of which in combination with MOVBE is used
to identify Atom Silvermont core;

=item bit #59, OSXSAVE bit, denoting availability of YMM registers;

=item bit #60 denoting AVX extension;

=item bit #62 denoting availability of RDRAND instruction;

=back

For example, in a 32-bit application context, clearing bit #26 at run-time
disables the high-performance SSE2 code present in the crypto library, while
clearing bit #24 disables the SSE2 code that operates on the 128-bit XMM
register bank. You might have to do the latter if the target OpenSSL
application is executed on an SSE2-capable CPU, but under the control of an
OS that does not enable XMM registers. Historically, the address of the
capability vector copy was exposed to applications through
OPENSSL_ia32cap_loc(), but this is no longer the case. Now the only way to
affect the capability detection is to set the OPENSSL_ia32cap environment
variable prior to starting the target application. To give a specific
example, on an Intel P4 processor, 'env OPENSSL_ia32cap=0x16980010
apps/openssl', or better yet 'env OPENSSL_ia32cap=~0x1000000 apps/openssl',
would achieve the desired effect of clearing bit #24. Alternatively, you
can reconfigure the toolkit with the no-sse2 option and recompile.
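
The '~' masks quoted above follow directly from the bit numbers in the
list. The following is a minimal shell sketch of that arithmetic, not part
of OpenSSL: the function name ia32cap_clear_mask is hypothetical, and the
sketch assumes a shell with 64-bit integer arithmetic and bit numbers in
the range 0..62.

```shell
#!/bin/sh
# Hypothetical helper (not an OpenSSL tool): print the "~0x..." value that
# clears the given bits of the first capability word.
ia32cap_clear_mask() {
    mask=0
    for bit in "$@"; do
        # Set the bit that is to be cleared in the capability vector.
        mask=$(( mask | (1 << bit) ))
    done
    printf '~0x%x\n' "$mask"
}

ia32cap_clear_mask 24    # FXSR / XMM registers: prints ~0x1000000
ia32cap_clear_mask 26    # SSE2:                 prints ~0x4000000
ia32cap_clear_mask 28    # Hyperthreading bit:   prints ~0x10000000
```

The printed value can be passed on as in 'env
OPENSSL_ia32cap="~0x1000000" apps/openssl'. Quoting the value guards
against shells that attempt tilde expansion on it.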

Less intuitive is clearing bit #28, or ~0x10000000 in "environment
variable" terms. This bit is not copied from the CPUID output verbatim,
but is adjusted to reflect whether or not the data cache is actually
shared between logical cores. This in turn affects the decision on whether
or not expensive countermeasures against cache-timing attacks are applied,
most notably in the AES assembler module.

The capability vector is further extended with the EBX value (and, for
bits #64+32 and above, the ECX value) returned by CPUID with EAX=7 and
ECX=0 as input. The following bits are significant:

=over 4

=item bit #64+3 denoting availability of BMI1 instructions, e.g. ANDN;

=item bit #64+5 denoting availability of AVX2 instructions;

=item bit #64+8 denoting availability of BMI2 instructions, e.g. MULX
and RORX;

=item bit #64+16 denoting availability of AVX512F extension;

=item bit #64+18 denoting availability of RDSEED instruction;

=item bit #64+19 denoting availability of ADCX and ADOX instructions;

=item bit #64+21 denoting availability of VPMADD52[LH]UQ instructions,
a.k.a. AVX512IFMA extension;

=item bit #64+29 denoting availability of SHA extension;

=item bit #64+30 denoting availability of AVX512BW extension;

=item bit #64+31 denoting availability of AVX512VL extension;

=item bit #64+41 denoting availability of VAES extension;

=item bit #64+42 denoting availability of VPCLMULQDQ extension;

=back

To control this extended capability word, use ':' as a delimiter when
setting the OPENSSL_ia32cap environment variable. For example, assigning
':~0x20' would disable the AVX2 code paths, and ':0' would disable all
post-AVX extensions.
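
As an illustration of the ':' syntax, the following shell sketch builds a
value for the extended word from the N in "bit #64+N". It is not part of
OpenSSL: the function name ia32cap_ext_clear_mask is hypothetical, and
the sketch assumes a shell with 64-bit integer arithmetic.

```shell
#!/bin/sh
# Hypothetical helper (not an OpenSSL tool): print a ":~0x..." value that
# clears the given bits of the extended capability word and leaves the
# first word as detected. Each argument is the N in "bit #64+N".
ia32cap_ext_clear_mask() {
    mask=0
    for offset in "$@"; do
        mask=$(( mask | (1 << offset) ))
    done
    printf ':~0x%x\n' "$mask"
}

ia32cap_ext_clear_mask 5          # AVX2: prints :~0x20
ia32cap_ext_clear_mask 16 30 31   # AVX512F, AVX512BW, AVX512VL: prints :~0xc0010000
```

The first call reproduces the ':~0x20' example above. Whether clearing a
given combination of bits is enough to steer the library away from a
particular code path depends on how the assembler modules test the
vector, so the second call should be read as an arithmetic example only.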

Note that whether or not some of the newest extension code paths are
actually assembled depends on the version of the assembler in use. The base
minimum of AES-NI/PCLMULQDQ, SSSE3 and SHA extension code paths is always
assembled. Apart from that, the minimum assembler version requirements are
summarized in the table below:

   Extension   | GNU as | nasm   | llvm
   ------------+--------+--------+--------
   AVX         | 2.19   | 2.09   | 3.0
   AVX2        | 2.22   | 2.10   | 3.1
   ADCX/ADOX   | 2.23   | 2.10   | 3.3
   AVX512      | 2.25   | 2.11.8 | see NOTES
   AVX512IFMA  | 2.26   | 2.11.8 | see NOTES
   VAES        | 2.30   | 2.13.3 |
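
The GNU as column of the table can be checked against an installed
assembler version with a small script. The following is a rough sketch
only: the function names version_ge, min_gnu_as and as_supports are
hypothetical, the version string is assumed to consist of dot-separated
numbers, and obtaining it (for example from the output of 'as --version')
is left to the caller.

```shell
#!/bin/sh
# version_ge A B: succeed if dotted numeric version A is >= B.
version_ge() {
    a=$1 b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        fa=${a%%.*}; fb=${b%%.*}               # leading fields
        [ "$a" = "$fa" ] && a= || a=${a#*.}    # drop the field just read
        [ "$b" = "$fb" ] && b= || b=${b#*.}
        fa=${fa:-0}; fb=${fb:-0}               # missing fields count as 0
        [ "$fa" -gt "$fb" ] && return 0
        [ "$fa" -lt "$fb" ] && return 1
    done
    return 0
}

# min_gnu_as EXTENSION: minimum GNU as version, copied from the table.
min_gnu_as() {
    case $1 in
        AVX)        echo 2.19 ;;
        AVX2)       echo 2.22 ;;
        ADCX/ADOX)  echo 2.23 ;;
        AVX512)     echo 2.25 ;;
        AVX512IFMA) echo 2.26 ;;
        VAES)       echo 2.30 ;;
        *)          return 1 ;;
    esac
}

# as_supports EXTENSION VERSION: succeed if VERSION meets the minimum;
# returns 2 for an extension that is not in the table.
as_supports() {
    min=$(min_gnu_as "$1") || return 2
    version_ge "$2" "$min"
}

if as_supports AVX2 2.25; then
    echo "AVX2 code paths can be assembled"
fi
```

Fields are compared numerically rather than as strings, so that 2.9 sorts
before 2.10.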

=head1 NOTES

Even though AVX512 support was implemented in llvm 3.6, compilation of
the assembly modules apparently requires an explicit -march flag. But then
the compiler generates processor-specific code, which contradicts the very
idea of run-time switching between code paths that the variable in question
facilitates. Until this limitation is lifted, it is possible to work around
the problem by making the build procedure use the following script:

   #!/bin/sh
   exec clang -no-integrated-as "$@"

instead of the real clang. In that case it doesn't matter which clang
version is used, as it is the GNU assembler version that will be checked.

=head1 RETURN VALUES

Not available.

=head1 COPYRIGHT

Copyright 2004-2018 The OpenSSL Project Authors. All Rights Reserved.

Licensed under the OpenSSL license (the "License").  You may not use
this file except in compliance with the License.  You can obtain a copy
in the file LICENSE in the source distribution or at
L<https://www.openssl.org/source/license.html>.

=cut