libhugetlbfs HOWTO
==================

Author: David Gibson <dwg@au1.ibm.com>, Adam Litke <agl@us.ibm.com>, and others
Last updated: December 07, 2011

Introduction
============

In Linux(TM), access to hugepages is provided through a virtual file
system, "hugetlbfs".  The libhugetlbfs library interface works with
hugetlbfs to provide more convenient, specific application-level
services.  In particular, libhugetlbfs has three main functions:

	* library functions
libhugetlbfs provides functions that allow an application to
explicitly allocate and use hugepages more easily than it could by
directly accessing the hugetlbfs filesystem.

	* hugepage malloc()
libhugetlbfs can be used to make an existing application use hugepages
for all its malloc() calls.  This works on an existing (dynamically
linked) application binary without modification.

	* hugepage text/data/BSS
libhugetlbfs, in conjunction with the included special linker scripts,
can be used to make an application which will store its executable
text, its initialized data or BSS, or all of the above in hugepages.
This requires relinking an application, but does not require
source-level modifications.

This HOWTO explains how to use the libhugetlbfs library.  It is for
application developers or system administrators who wish to use any of
the above functions.

The libhugetlbfs library is a focal point to simplify and standardise
the use of the kernel API.

Prerequisites
=============

Hardware prerequisites
----------------------

You will need a CPU with some sort of hugepage support, which is
handled by your kernel.  This covers recent x86, AMD64, 64-bit
PowerPC(R) (POWER4, PPC970 and later), and IBM System z CPUs.

Currently, only x86, AMD64 and PowerPC are fully supported by
libhugetlbfs. IA64 and Sparc64 have a working malloc, and SH64
should as well, but it has not been tested. IA64, Sparc64, and SH64
do not support segment remapping at this time. IBM System z supports
malloc and also segment remapping with --hugetlbfs-align.

Kernel prerequisites
--------------------

To use all the features of libhugetlbfs you will need a 2.6.16 or
later kernel.  Many things will work with earlier kernels, but they
have important bugs and missing features.  The later sections of the
HOWTO assume a 2.6.16 or later kernel.  The kernel must also have
hugepages enabled, that is to say the CONFIG_HUGETLB_PAGE and
CONFIG_HUGETLBFS options must be switched on.

To check if hugetlbfs is enabled, use one of the following methods:

  * (Preferred) Use "grep hugetlbfs /proc/filesystems" to see if
    hugetlbfs is a supported file system.
  * On kernels which support /proc/config.gz (for example SLES10
    kernels), you can search for the CONFIG_HUGETLB_PAGE and
    CONFIG_HUGETLBFS options in /proc/config.gz
  * Finally, attempt to mount hugetlbfs. If it works, the required
    hugepage support is enabled.
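
The preferred check can be wrapped in a small shell function.  This is
a minimal sketch and not part of libhugetlbfs: the function name
hugetlbfs_supported and its optional file argument (useful for checking
a saved copy of /proc/filesystems) are assumptions made for
illustration.

```shell
# Hypothetical helper (not shipped with libhugetlbfs).
# Succeeds if "hugetlbfs" is listed in the given filesystems file,
# which defaults to /proc/filesystems.
hugetlbfs_supported() {
    fs_list=${1:-/proc/filesystems}
    # Lines look like "nodev<TAB>hugetlbfs"; compare the last field exactly.
    awk '$NF == "hugetlbfs" { found = 1 } END { exit !found }' "$fs_list"
}

if hugetlbfs_supported; then
    echo "hugetlbfs is supported by this kernel"
else
    echo "hugetlbfs is not available"
fi
```

Matching the whole field rather than a substring avoids false positives
from any other entry whose name merely contains "hugetlbfs".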

Any kernel which meets the above test (even old ones) should support
at least basic libhugetlbfs functions, although old kernels may have
serious bugs.

The MAP_PRIVATE flag instructs the kernel to return a memory area that
is private to the requesting process.  To use MAP_PRIVATE mappings,
libhugetlbfs's automatic malloc() (morecore) feature, or the hugepage
text, data, or BSS features, you will need a kernel with hugepage
Copy-on-Write (CoW) support.  The 2.6.16 kernel has this.

PowerPC note: The malloc()/morecore features will generate warnings if
used on PowerPC chips with a kernel where hugepage mappings don't
respect the mmap() hint address (the "hint address" is the first
parameter to mmap(), when MAP_FIXED is not specified; the kernel is
not required to mmap() at this address, but should do so when
possible).  2.6.16 and later kernels do honor the hint address.
Hugepage malloc()/morecore should still work on kernels that do not
honor the hint address, but the size of the hugepage heap will be
limited (to around 256M for 32-bit and 1TB for 64-bit).

The 2.6.27 kernel introduced support for multiple huge page sizes for
systems with the appropriate hardware support.  Unless specifically
requested, libhugetlbfs will continue to use the default huge page size.
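
On kernels with multiple huge page size support, each size has a
directory named hugepages-<size>kB under /sys/kernel/mm/hugepages.  The
sketch below lists the sizes found there.  The function name
list_hugepage_sizes_kb and its optional directory argument are
hypothetical and exist only for this example.

```shell
# Hypothetical helper: print one supported huge page size (in kB) per line.
# The optional argument overrides the sysfs directory.
list_hugepage_sizes_kb() {
    dir=${1:-/sys/kernel/mm/hugepages}
    for d in "$dir"/hugepages-*kB; do
        [ -d "$d" ] || continue      # glob did not match: nothing to list
        name=${d##*/hugepages-}      # e.g. "2048kB"
        echo "${name%kB}"            # e.g. "2048"
    done
}

list_hugepage_sizes_kb
```

If nothing is printed, the kernel either predates 2.6.27 or has no
hugepage support enabled.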

Toolchain prerequisites
-----------------------

The library uses a number of GNU-specific features, so you will need to use
both gcc and GNU binutils.  For PowerPC and AMD64 systems you will need a
"biarch" compiler, which can build both 32-bit and 64-bit binaries.  To use
hugepage text and data segments, GNU binutils version 2.17 (or later) is
recommended.  Older versions will work with restricted functionality.

Configuration prerequisites
---------------------------

Direct access to the hugepage pool has been deprecated in favor of the
hugeadm utility.  This utility can be used for finding the available
hugepage pools and adjusting their minimum and maximum sizes depending
on kernel support.

To list all available hugepage pools and their current min and max values:
	hugeadm --pool-list

To set the 2MB pool minimum to 10 pages:
	hugeadm --pool-pages-min 2MB:10

Note that the max pool size will be adjusted to keep the same number of
overcommit pages available if the kernel support is available when min
pages are adjusted.

To add 15 pages to the maximum for 2MB pages:
	hugeadm --pool-pages-max 2MB:+15

For more information see man 8 hugeadm.

The raw kernel interfaces (as described below) are still available.

In kernels before 2.6.24, hugepages must be allocated at boot-time via
the hugepages= command-line parameter or at run-time via the
/proc/sys/vm/nr_hugepages sysctl. If memory is restricted on the system,
boot-time allocation is recommended. Hugepages so allocated will be in
the static hugepage pool.

In kernels starting with 2.6.24, the hugepage pool can grow on demand.
To use this feature, set /proc/sys/vm/nr_overcommit_hugepages to the
maximum size of the hugepage pool. No hugepages need to be allocated
via /proc/sys/vm/nr_hugepages or hugepages= in this case. Hugepages so
allocated will be in the dynamic hugepage pool.
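
Setting the dynamic pool limit through the raw interface might look
like the following sketch.  The function name set_overcommit_hugepages
and its optional second argument (an alternative file to write to) are
hypothetical; writing to the real sysctl file requires root.

```shell
# Hypothetical helper: allow the dynamic pool to grow up to $1 huge pages.
# The optional second argument overrides the sysctl file.
set_overcommit_hugepages() {
    max_pages=$1
    ctl=${2:-/proc/sys/vm/nr_overcommit_hugepages}
    case $max_pages in
        ''|*[!0-9]*) echo "not a page count: $max_pages" >&2; return 1 ;;
    esac
    echo "$max_pages" > "$ctl"
}

# Typically run as root, for example:
#   set_overcommit_hugepages 64
```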

For the running of the libhugetlbfs testsuite (see below), allocating 25
static hugepages is recommended. Due to memory restrictions, the number
of hugepages requested may not be allocated if the allocation is
attempted at run-time. Users should verify the actual number of
hugepages allocated by:

       hugeadm --pool-list

or

       grep HugePages_Total /proc/meminfo

With 25 hugepages allocated, most tests should succeed. However, with
smaller hugepage sizes, many more hugepages may be necessary.
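
The verification can be automated by comparing HugePages_Total with
the number requested.  This is a sketch under assumptions: the function
name have_hugepages and its optional file argument are made up for this
example.

```shell
# Hypothetical helper: succeed if at least $1 huge pages are in the pool.
# The optional second argument overrides /proc/meminfo.
have_hugepages() {
    wanted=$1
    meminfo=${2:-/proc/meminfo}
    total=$(awk '$1 == "HugePages_Total:" { print $2 }' "$meminfo")
    echo "requested $wanted, HugePages_Total is ${total:-unknown}"
    [ "${total:-0}" -ge "$wanted" ]
}

have_hugepages 25 || echo "fewer than 25 hugepages are allocated" >&2
```

HugePages_Total counts pages of the default huge page size only, so
pools of other sizes are better inspected with hugeadm --pool-list.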

To use libhugetlbfs features, as well as to run the testsuite, hugetlbfs
must be mounted.  Each hugetlbfs mount point is associated with a page
size.  To choose the size, use the pagesize mount option.  If this option
is omitted, the default huge page size will be used.

To mount the default huge page size:

       mkdir -p /mnt/hugetlbfs
       mount -t hugetlbfs none /mnt/hugetlbfs

To mount 64KB pages (assuming hardware support):

       mkdir -p /mnt/hugetlbfs-64K
       mount -t hugetlbfs none -opagesize=64k /mnt/hugetlbfs-64K

If hugepages should be available to non-root users, the permissions on
the mountpoint need to be set appropriately.
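
One way to set those permissions is at mount time, using the gid= and
mode= mount options that hugetlbfs accepts.  The sketch below only
prints the command so it can be reviewed before being run as root.  The
function name hugetlbfs_mount_cmd, the numeric group ID and the mode
1770 are illustrative assumptions, not recommendations from the
libhugetlbfs authors.

```shell
# Hypothetical helper: print a mount command that gives one group access
# to the mount point.  Nothing is mounted by this function.
hugetlbfs_mount_cmd() {
    mountpoint=$1
    gid=$2       # numeric group ID that should be able to use hugepages
    # gid= and mode= set the group and permissions of the root directory
    # of the mount (1770: sticky bit, full access for owner and group).
    echo "mount -t hugetlbfs -o gid=$gid,mode=1770 none $mountpoint"
}

hugetlbfs_mount_cmd /mnt/hugetlbfs "$(id -g)"
```

Changing ownership and mode of the mounted directory with chown and
chmod afterwards is an equally valid alternative.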

Installation
============

1. Type "make" to build the library

This will create "obj32" and/or "obj64" under the top level
libhugetlbfs directory, and build, respectively, 32-bit and 64-bit
shared and static versions (as applicable) of the library into each
directory.  This will also build (but not run) the testsuite.

On i386 systems, only the 32-bit library will be built.  On PowerPC
and AMD64 systems, both 32-bit and 64-bit versions will be built (the
32-bit AMD64 version is identical to the i386 version).

2. Run the testsuite with "make check"

Running the testsuite is a good idea to ensure that the library is
working properly, and is quite quick (under 3 minutes on a 2GHz Apple
G5).  "make func" will run just the functionality tests (a subset of
"make check") rather than the stress tests, which is much quicker.
The testsuite contains tests both for the library's features and for
the underlying kernel hugepage functionality.

NOTE: The testsuite must be run as the root user.

WARNING: The testsuite contains testcases explicitly designed to test
for a number of hugepage related kernel bugs uncovered during the
library's development.  Some of these testcases WILL CRASH HARD a
kernel without the relevant fixes.  2.6.16 contains all such fixes for
all testcases included as of this writing.

3. (Optional) Install to system paths with "make install"

This will install the library images to the system lib/lib32/lib64 as
appropriate, the helper utilities and the manual pages.  By default
it will install under /usr/local.  To put it somewhere else use
PREFIX=/path/to/install on the make command line.  For example:

	make install PREFIX=/opt/hugetlbfs
will install under /opt/hugetlbfs.

"make install" will also install the linker scripts and wrapper for ld
used for hugepage text/data/BSS (see below for details).

Alternatively, you can use the library from the directory in which it
was built, using the LD_LIBRARY_PATH environment variable.

To install only the library with linker scripts, the manual pages, or the
helper utilities separately, use the install-libs, install-man and install-bin
targets respectively. This can be useful when you wish to install the
utilities but not override the distribution-supported version of
libhugetlbfs, for example.

Usage
=====

Using hugepages for malloc() (morecore)
---------------------------------------

This feature allows an existing (dynamically linked) binary executable
to use hugepages for all its malloc() calls.  To run a program using
the automatic hugepage malloc() feature, you must set several
environment variables:

1. Set LD_PRELOAD=libhugetlbfs.so
  This tells the dynamic linker to load the libhugetlbfs shared
  library, even though the program wasn't originally linked against it.

  Note: If the program is linked against libhugetlbfs, preloading the
        library may lead to application crashes. You should skip this
        step in that case.

2. Set LD_LIBRARY_PATH to the directory containing libhugetlbfs.so
  This is only necessary if you haven't installed libhugetlbfs.so to a
  system default path.  If you set LD_LIBRARY_PATH, make sure the
  directory referenced contains the right version of the library
  (32-bit or 64-bit) as appropriate to the binary you want to run.

3. Set HUGETLB_MORECORE
  This enables the hugepage malloc() feature, instructing libhugetlbfs
  to override libc's normal morecore() function with a hugepage
  version and use it for malloc().  From this point all malloc()s
  should come from hugepage memory until it runs out.  This option can
  be specified in three ways:

  To use the default huge page size:
       HUGETLB_MORECORE=yes

  To use a specific huge page size:
       HUGETLB_MORECORE=<pagesize>

  To use Transparent Huge Pages (THP):
       HUGETLB_MORECORE=thp

  Note: This option requires a kernel that supports Transparent Huge Pages.

Usually it's preferable to set these environment variables on the
command line of the program you wish to run, rather than using
"export", because you'll only want to enable the hugepage malloc() for
particular programs, not everything.

Examples:

If you've installed libhugetlbfs in the default place (under
/usr/local) which is in the system library search path use:
  $ LD_PRELOAD=libhugetlbfs.so HUGETLB_MORECORE=yes <your app command line>

If you have built libhugetlbfs in ~/libhugetlbfs and haven't installed
it yet, the following would work for a 64-bit program:

  $ LD_PRELOAD=libhugetlbfs.so LD_LIBRARY_PATH=~/libhugetlbfs/obj64 \
	HUGETLB_MORECORE=yes <your app command line>
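
If the same settings are needed often, a small wrapper keeps them
scoped to a single command.  This is a minimal sketch; the function
name with_hugepage_malloc is hypothetical, and it assumes
libhugetlbfs.so can be found in the library search path.

```shell
# Hypothetical wrapper: run one command with hugepage malloc() enabled.
# Usage: with_hugepage_malloc <yes|thp|pagesize> <command> [args...]
with_hugepage_malloc() {
    morecore=$1
    shift
    # env sets the variables for this command only; the environment of
    # the calling shell is left untouched.
    env LD_PRELOAD=libhugetlbfs.so HUGETLB_MORECORE="$morecore" "$@"
}

# Example:
#   with_hugepage_malloc yes ./myapp --some-option
```

Using env rather than "export" follows the advice above: only the
wrapped program sees the variables.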

Under some circumstances, you might want to specify the address where
the hugepage heap is located.  You can do this by setting the
HUGETLB_MORECORE_HEAPBASE environment variable to the heap address in
hexadecimal.  NOTE: this will not work on PowerPC systems with old kernels
which don't respect the hugepage hint address; see Kernel Prerequisites
above.  Also note that this option is ignored for THP morecore.

By default, the hugepage heap begins at roughly the same place a
normal page heap would, rounded up by an amount determined by your
platform.  For 32-bit PowerPC binaries the normal page heap address is
rounded up to a multiple of 256MB (that is, putting it in the next MMU
segment); for 64-bit PowerPC binaries the address is rounded up to a
multiple of 1TB.  On all other platforms the address is rounded up to
the size of a hugepage.
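
The rounding described above is ordinary round-up-to-a-multiple
arithmetic.  The sketch below only illustrates that arithmetic; the
start address 0x10010000 is invented for the example, and the address
libhugetlbfs actually chooses at run time may differ.

```shell
# Round address $1 up to the next multiple of alignment $2.  Both are
# integers; shell arithmetic accepts 0x-prefixed hexadecimal.
round_up() {
    printf '0x%x\n' $(( ($1 + $2 - 1) / $2 * $2 ))
}

round_up 0x10010000 $((256 * 1024 * 1024))   # 32-bit PowerPC: 256MB segment
round_up 0x10010000 $((1 << 40))             # 64-bit PowerPC: 1TB
round_up 0x10010000 $((2 * 1024 * 1024))     # elsewhere, assuming 2MB hugepages
```

An address that is already a multiple of the alignment is returned
unchanged.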

By default, the hugepage heap will be prefaulted by libhugetlbfs to
guarantee enough hugepages exist and are reserved for the application
(if this was not done, applications could receive a SIGKILL signal if
hugepages needed for the heap are used by another application before
they are faulted in). This leads to local-node allocations when no
memory policy is in place for hugepages. Therefore, it is recommended to
use

  $ numactl --interleave=all <your app command line>

to recover some of the performance lost to local-node allocations on
large NUMA systems. This can still result in poor performance for those
applications which carefully place their threads on particular nodes
(such as by using OpenMP). In that case, thread-local allocation is
preferred so users should select a memory policy that corresponds to
the run-time behavior of the process' CPU usage. Users can specify
HUGETLB_NO_PREFAULT to prevent the prefaulting of hugepages and instead
rely on run-time faulting of hugepages.  NOTE: specifying
HUGETLB_NO_PREFAULT on a system where hugepages are available to and
used by many processes can result in some applications receiving SIGKILL,
so its use is not recommended in high-availability or production
environments.

By default, the hugepage heap does not shrink.  To enable hugepage heap
shrinking, set HUGETLB_MORECORE_SHRINK=yes.  NB: We have been seeing some
unexpected behavior from glibc's malloc when this is enabled.

Using hugepage shared memory
----------------------------

Hugepages are used for shared memory segments if the SHM_HUGETLB flag is
set when calling shmget() and the pool is large enough. For hugepage-unaware
applications, libhugetlbfs overrides shmget() and adds the SHM_HUGETLB flag
if the environment variable HUGETLB_SHM is set to "yes". The steps to use
hugepages with applications not linked to libhugetlbfs are similar to
morecore except for step 3.

1. Set LD_PRELOAD=libhugetlbfs.so
  This tells the dynamic linker to load the libhugetlbfs shared
  library, even though the program wasn't originally linked against it.

  Note: If the program is linked against libhugetlbfs, preloading the
        library may lead to application crashes. You should skip this
        step in that case.

2. Set LD_LIBRARY_PATH to the directory containing libhugetlbfs.so
  This is only necessary if you haven't installed libhugetlbfs.so to a
  system default path.  If you set LD_LIBRARY_PATH, make sure the
  directory referenced contains the right version of the library
  (32-bit or 64-bit) as appropriate to the binary you want to run.

3. Set HUGETLB_SHM=yes
   The shmget() call is overridden whether the application is linked
   against libhugetlbfs or the library is preloaded. When this environment
   variable is set, the SHM_HUGETLB flag is added to the call and the size
   parameter is aligned to back the shared memory segment with huge pages.
   In the event hugepages cannot be used, small pages will be used instead
   and a warning will be printed to explain the failure.

   Note: It is not possible to select any huge page size other than the
         system default for this option.  If the kernel supports multiple
         huge page sizes, the size used for shared memory can be changed by
         altering the default huge page size via the default_hugepagesz
         kernel boot parameter.
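
Since shared memory always uses the system default huge page size, it
can help to check what that default is.  The kernel reports it on the
"Hugepagesize:" line of /proc/meminfo.  The helper below is a sketch;
the function name default_hugepage_kb and its optional file argument
are assumptions for illustration.

```shell
# Hypothetical helper: print the default huge page size in kB, taken from
# the "Hugepagesize:" line.  The optional argument overrides /proc/meminfo.
default_hugepage_kb() {
    awk '$1 == "Hugepagesize:" { print $2 }' "${1:-/proc/meminfo}"
}

default_hugepage_kb || echo "could not read /proc/meminfo" >&2
```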

Using hugepage text, data, or BSS
---------------------------------

To use the hugepage text, data, or BSS segments feature, you need to specially
link your application.  How this is done depends on the version of GNU ld.  To
support ld versions older than 2.17, libhugetlbfs provides custom linker
scripts that must be used to achieve the required binary layout.  With version
2.17 or later, the system default linker scripts should be used.

To link an application for hugepages, you should use the ld.hugetlbfs
script included with libhugetlbfs in place of your normal linker.  Without any
special options this will simply invoke GNU ld with the same parameters.  When
it is invoked with options detailed in the following sections, ld.hugetlbfs
will call the system linker with all of the options necessary to link for
hugepages.  If a custom linker script is required, it will also be selected.

If you installed ld.hugetlbfs using "make install", or if you run it
from the place where you built libhugetlbfs, it should automatically
be able to find the libhugetlbfs linker scripts.  Otherwise you may
need to explicitly instruct it where to find the scripts with the
option:
	--hugetlbfs-script-path=/path/to/scripts
(The linker scripts are in the ldscripts/ subdirectory of the
libhugetlbfs source tree).

	Linking the application with binutils-2.17 or later:
	----------------------------------------------------

This method will use the system default linker scripts.  Only one linker option
is required to prepare the application for hugepages:

	--hugetlbfs-align

will instruct ld.hugetlbfs to call GNU ld with two options that increase the
alignment of the resulting binary.  For reference, the options passed to ld are:

	-z common-page-size=<value>	and
	-z max-page-size=<value>

	Linking the application with binutils-2.16 or older:
	----------------------------------------------------

To link a program with a custom linker script, one of the following linker
options should be specified:

	--hugetlbfs-link=B

will link the application to store BSS data (only) into hugepages

	--hugetlbfs-link=BDT

will link the application to store text, initialized data and BSS data
into hugepages.

These are the only two available options when using custom linker scripts.

	A note about the custom libhugetlbfs linker scripts:
	----------------------------------------------------

Linker scripts are usually distributed with GNU binutils and they may contain a
partial implementation of new linker features.  As binutils evolves, the linker
scripts supplied with previous versions become obsolete and are upgraded.

Libhugetlbfs distributes one set of linker scripts that must work across
several Linux distributions and binutils versions.  This has worked well for
some time but binutils-2.17 (including some late 2.16 builds) has made changes
that are impossible to accommodate without breaking the libhugetlbfs linker
scripts for older versions of binutils.  This is why the linker scripts (and
the --hugetlbfs-link ld.hugetlbfs option) have been deprecated for binutils >=
2.17 configurations.

If you are using a late 2.16 binutils version (such as 2.16.91) and are
experiencing problems with huge page text, data, and bss, you can check
binutils for the incompatibility with the following command:

	ld --verbose | grep SPECIAL

If any matches are returned, then the libhugetlbfs linker scripts may not work
correctly.  In this case you should upgrade to binutils >= 2.17 and use the
--hugetlbfs-align linking method.
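
The same check can be used in a build script by reading the default
linker script from standard input.  This is a sketch only; the function
name check_ld_script and its messages are hypothetical.

```shell
# Hypothetical helper: read a default linker script on standard input and
# report whether it contains the SPECIAL keyword mentioned above.
check_ld_script() {
    if grep -q SPECIAL; then
        echo "SPECIAL found: the custom linker scripts may not work"
        return 1
    fi
    echo "no SPECIAL keyword found"
}

# Typical use:
#   ld --verbose | check_ld_script
```

A non-zero exit status suggests using the --hugetlbfs-align method with
binutils >= 2.17 instead of the custom linker scripts.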

	Linking via gcc:
	----------------

In many cases it's normal to link an application by invoking gcc,
which will then invoke the linker with appropriate options, rather
than invoking ld directly.  In such cases it's usually best to
convince gcc to invoke the ld.hugetlbfs script instead of the system
linker, rather than modifying your build procedure to invoke
ld.hugetlbfs directly; compilers often add special libraries
or other linker options which can be fiddly to reproduce by hand.
To make this easier, 'make install' will install ld.hugetlbfs into
$PREFIX/share/libhugetlbfs and create an 'ld' symlink to it.

Then with gcc, you invoke it as a linker with two options:

	-B $PREFIX/share/libhugetlbfs

This option tells gcc to look in a non-standard location for the
linker, thus finding our script rather than the normal linker. This
can optionally be set in the CFLAGS environment variable.

	-Wl,--hugetlbfs-align
OR	-Wl,--hugetlbfs-link=B
OR	-Wl,--hugetlbfs-link=BDT

This option instructs gcc to pass the option after the comma down to the
linker, thus invoking the special behaviour of the ld.hugetlbfs script. This
can optionally be set in the LDFLAGS environment variable.

If you use a compiler other than gcc, you will need to consult its
documentation to see how to convince it to invoke ld.hugetlbfs in
place of the system linker.
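
Putting the two gcc options together, a build script could generate
them from the install prefix and the chosen link method.  The sketch
below is illustrative only: the function name hugetlbfs_gcc_flags and
the mode names it accepts (align, B, BDT) are assumptions, and myapp is
a placeholder.

```shell
# Hypothetical helper: print the extra gcc options for a given install
# prefix and link mode (align, B or BDT).
hugetlbfs_gcc_flags() {
    prefix=$1
    mode=$2
    case $mode in
        align) opt=--hugetlbfs-align ;;
        B|BDT) opt=--hugetlbfs-link=$mode ;;
        *)     echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    echo "-B $prefix/share/libhugetlbfs -Wl,$opt"
}

# Example: gcc -o myapp myapp.o $(hugetlbfs_gcc_flags /usr/local align)
hugetlbfs_gcc_flags /usr/local align
```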

	Running the application:
	------------------------

The specially-linked application needs the libhugetlbfs library, so
you might need to set the LD_LIBRARY_PATH environment variable so the
application can locate libhugetlbfs.so.  Depending on the method used to link
the application, the HUGETLB_ELFMAP environment variable can be used to control
how hugepages will be used.

	When using --hugetlbfs-link:
	----------------------------

The custom linker script determines which segments may be remapped into
hugepages and this remapping will occur by default.  The following setting will
disable remapping entirely:

	HUGETLB_ELFMAP=no
Packit Service b439df
Packit Service b439df
	When using --hugetlbfs-align:
Packit Service b439df
	-----------------------------
Packit Service b439df
Packit Service b439df
This method of linking an application permits greater flexibility at runtime.
Packit Service b439df
Using HUGETLB_ELFMAP, it is possible to control which program segments are
Packit Service b439df
placed in hugepages.  The following four settings will cause the indicated
Packit Service b439df
segments to be placed in hugepages:
Packit Service b439df
Packit Service b439df
	HUGETLB_ELFMAP=R	Read-only segments (text)
Packit Service b439df
	HUGETLB_ELFMAP=W	Writable segments (data/BSS)
Packit Service b439df
	HUGETLB_ELFMAP=RW	All segments (text/data/BSS)
Packit Service b439df
	HUGETLB_ELFMAP=no	No segments

It is possible to select specific huge page sizes for read-only and writable
segments by using the following advanced syntax:

	HUGETLB_ELFMAP=[R[=<pagesize>]]:[W[=<pagesize>]]

For example:

	Place read-only segments into 64k pages and writable segments into
	16M pages:
	HUGETLB_ELFMAP=R=64k:W=16M

	Use the default for read-only segments, 1G pages for writable segments:
	HUGETLB_ELFMAP=R:W=1G

	Use 16M pages for writable segments only:
	HUGETLB_ELFMAP=W=16M
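
The advanced syntax can be assembled mechanically.  The helper below is a
sketch only: elfmap_value is a hypothetical function name, and it simply
reproduces the three example values above from a segment selection and
optional page sizes.

```shell
# Build a HUGETLB_ELFMAP value from a segment selection and optional sizes.
#   $1  segments to remap: R, W or RW
#   $2  page size for read-only segments, or "" for the default
#   $3  page size for writable segments, or "" for the default
elfmap_value() {
    segs="$1" rsize="${2:-}" wsize="${3:-}"
    # Without explicit sizes the plain R/W/RW form is enough.
    if [ -z "$rsize" ] && [ -z "$wsize" ]; then
        echo "$segs"
        return 0
    fi
    r="" w=""
    case "$segs" in *R*) r="R${rsize:+=$rsize}" ;; esac
    case "$segs" in *W*) w="W${wsize:+=$wsize}" ;; esac
    # Join the two parts with ':' only when both are present.
    echo "$r${r:+${w:+:}}$w"
}

elfmap_value RW 64k 16M    # prints R=64k:W=16M
elfmap_value RW "" 1G      # prints R:W=1G
elfmap_value W "" 16M      # prints W=16M
```

A program linked with --hugetlbfs-align could then be started with, for
example, HUGETLB_ELFMAP=$(elfmap_value RW 64k 16M) ./myapp, where myapp is
a placeholder name.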

	Default remapping behavior:
	---------------------------

If --hugetlbfs-link was used to link an application, the chosen remapping mode
is saved in the binary and becomes the default behavior.  Setting
HUGETLB_ELFMAP=no will disable all remapping and is the only way to modify the
default behavior.

For applications linked with --hugetlbfs-align, the default behavior is to not
remap any segments into huge pages.  To set or display the default remapping
mode for a binary, the included hugeedit command can be used:

hugeedit [options] target-executable
   options:
   --text,--data	Remap the specified segment into huge pages by default
   --disable		Do not remap any segments by default

When target-executable is the only argument, hugeedit will display the default
remapping mode without making any modifications.

When a binary is remapped according to its default remapping policy, the
system default huge page size will be used.

	Environment variables:
	----------------------

There are a number of private environment variables which can affect
libhugetlbfs:

	HUGETLB_DEFAULT_PAGE_SIZE
		Override the system default huge page size for all uses
		except hugetlb-backed shared memory

	HUGETLB_RESTRICT_EXE
		By default, libhugetlbfs will act on any program that it
		is loaded with, either via LD_PRELOAD or by explicitly
		linking with -lhugetlbfs.

		There are situations in which it is desirable to restrict
		libhugetlbfs' actions to specific programs.  For example,
		some ISV applications are wrapped in a series of scripts
		that invoke bash, python, and/or perl.  It is more
		convenient to set the environment variables related
		to libhugetlbfs before invoking the wrapper scripts,
		yet this has the unintended and undesirable consequence
		of causing the script interpreters to use and consume
		hugepages.  There is no obvious benefit to causing the
		script interpreters to use hugepages, and there is a
		clear disadvantage: fewer hugepages are available to
		the actual application.

		To address this scenario, set HUGETLB_RESTRICT_EXE to a
		colon-separated list of programs to which the other
		libhugetlbfs environment variables should apply.  (If
		not set, libhugetlbfs will attempt to apply the requested
		actions to all programs.)  For example,

		    HUGETLB_RESTRICT_EXE="hpcc:long_hpcc"

		will restrict libhugetlbfs' actions to programs named
		/home/fred/hpcc and /bench/long_hpcc but not /usr/hpcc_no.

	HUGETLB_ELFMAP
		Control or disable segment remapping (see above)

	HUGETLB_MINIMAL_COPY
		If equal to "no", the entire segment will be copied;
		otherwise, only the necessary parts will be, which can
		be much more efficient (default)

	HUGETLB_FORCE_ELFMAP
		Explained in "Partial segment remapping"

	HUGETLB_MORECORE
	HUGETLB_MORECORE_HEAPBASE
	HUGETLB_NO_PREFAULT
		Explained in "Using hugepages for malloc()
		(morecore)"

	HUGETLB_VERBOSE
		Specify the verbosity level of debugging output from 1
		to 99 (default is 1)

	HUGETLB_PATH
		Specify the path to the hugetlbfs mount point

	HUGETLB_SHARE
		Explained in "Sharing remapped segments"

	HUGETLB_DEBUG
		Set to 1 if an application segfaults. Gives very detailed output
		and runs extra diagnostics.
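
The HUGETLB_RESTRICT_EXE example above implies that programs are selected
by file name, regardless of directory.  The sketch below illustrates that
rule in shell.  It is not the library's code: restrict_exe_matches is a
hypothetical helper, and matching on the basename is an assumption drawn
from the example.

```shell
# Succeed if the basename of program path $2 appears in the
# colon-separated list $1.  Illustration only, not the library's code.
restrict_exe_matches() {
    name=$(basename "$2")
    case ":$1:" in
        *":$name:"*) return 0 ;;
        *) return 1 ;;
    esac
}

HUGETLB_RESTRICT_EXE="hpcc:long_hpcc"
for prog in /home/fred/hpcc /bench/long_hpcc /usr/hpcc_no; do
    if restrict_exe_matches "$HUGETLB_RESTRICT_EXE" "$prog"; then
        echo "$prog: libhugetlbfs settings apply"
    else
        echo "$prog: libhugetlbfs settings ignored"
    fi
done
```

Wrapping the list in colons before matching keeps hpcc from matching
hpcc_no, which is the distinction the example draws.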

	Sharing remapped segments:
	--------------------------

By default, libhugetlbfs uses anonymous, unlinked hugetlbfs files
to store remapped program segment data.  This means that if the same
program is started multiple times using hugepage segments, multiple
huge pages will be used to store the same program data.

To reduce this waste, libhugetlbfs can be instructed to allow
sharing segments between multiple invocations of a program.  To do
this, the HUGETLB_SHARE variable must be set for all the
processes in question.  This variable has two possible values:
	anything but 1: the default, indicates no segments should be shared
	1: indicates that read-only segments (i.e. the program text,
	   in most cases) should be shared; read-write segments (data
	   and bss) will not be shared.

If the HUGETLB_MINIMAL_COPY variable is set for any program using
shared segments, it must be set to the same value for all invocations
of that program.

Segment sharing is implemented by creating persistent files, containing
the necessary segment data, in a hugetlbfs filesystem.  By default, these
files are stored in a subdirectory of the first located hugetlbfs
filesystem, named 'elflink-uid-XXX' where XXX is the uid of the
process using sharing.  This directory must be owned by the uid in
question, and have mode 0700.  If it doesn't exist, libhugetlbfs will
create it automatically.  This means that (by default) separate
invocations of the same program by different users will not share huge
pages.

The location for storing the hugetlbfs page files can be changed by
setting the HUGETLB_SHARE_PATH environment variable.  If set, this
variable must contain the path of an accessible, already created
directory located in a hugetlbfs filesystem.  The owner and mode of
this directory are not checked, so this method can be used to allow
processes of multiple uids to share huge pages.  IMPORTANT SECURITY
NOTE: any process sharing hugepages can insert arbitrary executable
code into any other process sharing hugepages in the same directory.
Therefore, when using HUGETLB_SHARE_PATH, the directory created *must*
allow access only to a set of uids who are mutually trusted.
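
Because libhugetlbfs does not check the owner or mode of this directory,
the permissions have to be set up by hand.  The following sketch shows one
way to do that.  The make_share_dir function, the /mnt/hugetlbfs mount
point and the hugeshare group are all hypothetical names chosen for
illustration.

```shell
# Create (if needed) a directory for shared segment files and restrict
# access to its owner, or to its owner plus one trusted group.
#   $1  directory path; in real use it must lie inside a hugetlbfs mount
#   $2  optional group whose members are mutually trusted
make_share_dir() {
    dir="$1" group="${2:-}"
    mkdir -p "$dir" || return 1
    if [ -n "$group" ]; then
        chgrp "$group" "$dir" || return 1
        chmod 0770 "$dir"
    else
        chmod 0700 "$dir"
    fi
}

# Hypothetical usage, with an assumed mount point and group name:
#   make_share_dir /mnt/hugetlbfs/shared hugeshare
#   export HUGETLB_SHARE=1 HUGETLB_SHARE_PATH=/mnt/hugetlbfs/shared
```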

The files created in hugetlbfs for sharing are persistent, and must be
manually deleted to free the hugepages in question.  Future versions
of libhugetlbfs should include tools and scripts to automate this
cleanup.
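
Until such tools exist, the manual cleanup can be scripted.  The sketch
below removes the calling user's files from the default 'elflink-uid-XXX'
directory.  clean_elflink_files is a hypothetical helper, and the mount
point passed to it is whatever hugetlbfs mount the system actually uses.

```shell
# Remove the current user's persistent shared-segment files.
#   $1  hugetlbfs mount point that holds the elflink-uid-XXX directory
clean_elflink_files() {
    dir="$1/elflink-uid-$(id -u)"
    # Nothing to do if sharing was never used from this uid.
    [ -d "$dir" ] || return 0
    # Delete the files but keep the directory for later runs.
    find "$dir" -type f -exec rm -f {} +
}

# Hypothetical usage, with an assumed mount point:
#   clean_elflink_files /mnt/hugetlbfs
```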

	Partial segment remapping
	-------------------------

libhugetlbfs has limited support for remapping a normal, non-relinked
binary's data, text and BSS into hugepages. To enable this feature,
HUGETLB_FORCE_ELFMAP must be set to "yes".

Partial segment remapping is not guaranteed to work. Most importantly, a
binary's segments must be large enough even when not relinked by
libhugetlbfs:

	architecture	address		minimum segment size
	------------	-------		--------------------
	i386, x86_64	all		hugepage size
	ppc32		all		256M
	ppc64		0-4G		256M
	ppc64		4G-1T		1020G
	ppc64		1T+		1T

The raw size, though, is not sufficient to indicate if the code will
succeed, due to alignment. Since the binary is not relinked, however,
this is relatively straightforward to 'test and see'.

NOTE: You must use LD_PRELOAD to load libhugetlbfs.so when using
partial remapping.
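
A 'test and see' run can be wrapped as in the sketch below, which sets
only the two variables this section names.  run_partial_remap is a
hypothetical function, myapp is a placeholder, and any further variables
(for example HUGETLB_ELFMAP) are left to the caller.

```shell
# Run a non-relinked program with partial segment remapping requested.
# libhugetlbfs.so must be locatable by the dynamic linker.
run_partial_remap() {
    LD_PRELOAD=libhugetlbfs.so HUGETLB_FORCE_ELFMAP=yes "$@"
}

# Hypothetical usage:
#   run_partial_remap ./myapp
```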


Examples
========

Example 1:  Application Developer
---------------------------------

To have a program use hugepages, complete the following steps:

1. Make sure you are working with kernel 2.6.16 or greater.

2. Modify the build procedure so your application is linked against
libhugetlbfs.

For the remapping, you link against the library with the appropriate
linker script (if necessary or desired).  Linking against the library
should result in transparent usage of hugepages.

Example 2:  End Users and System Administrators
-----------------------------------------------

To have an application use libhugetlbfs, complete the following steps:

1. Make sure you are using kernel 2.6.16 or greater.

2. Make sure the dynamic linker can find the library, for example by
setting the LD_LIBRARY_PATH environment variable. You might need to set
other environment variables, including LD_PRELOAD as described above.


Troubleshooting
===============

The library has a certain amount of debugging code built in, which can
be controlled with the environment variable HUGETLB_VERBOSE.  By
default the debug level is "1" which means the library will only print
relatively serious error messages.  Setting HUGETLB_VERBOSE=2 or
higher will enable more debug messages (at present 2 is the highest
debug level, but that may change).  Setting HUGETLB_VERBOSE=0 will
silence the library completely, even in the case of errors - the only
exception is in cases where the library has to abort(), which can
happen if something goes wrong in the middle of unmapping and
remapping segments for the text/data/bss feature.

If an application fails to run, set the environment variable HUGETLB_DEBUG
to 1. This causes additional diagnostics to be run. This information should
be included when sending bug reports to the libhugetlbfs team.
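
To collect this information for a bug report, both variables can be set
for one run and the output saved to a file.  The sketch below assumes the
library writes its messages to standard error.  run_with_diagnostics is a
hypothetical helper, and myapp and hugetlb.log are placeholder names.

```shell
# Run a command with maximum libhugetlbfs verbosity and extra diagnostics,
# saving standard error to a log file.
#   $1   log file to write
#   $2.. command to run and its arguments
run_with_diagnostics() {
    logfile="$1"
    shift
    HUGETLB_VERBOSE=99 HUGETLB_DEBUG=1 "$@" 2>"$logfile"
}

# Hypothetical usage:
#   run_with_diagnostics hugetlb.log ./myapp
```

The wrapper returns the exit status of the command it ran, so a failing
application is still reported as failing.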

Specific Scenarios:
-------------------

ISSUE:	When using the --hugetlbfs-align or -zmax-page-size link options, the
	linker complains about truncated relocations and the build fails.

TRY:	Build the program with the --relax linker option.  Either add
	-Wl,--relax to CFLAGS or --relax to LDFLAGS.

ISSUE:	When using the xB linker script with a 32 bit binary on an x86 host with
	NX support enabled, the binary segfaults.

TRY:	Recompile with the --hugetlbfs-align option to use the new relinking
	method, or boot your kernel with noexec32=off.


Trademarks
==========

This work represents the view of the author and does not necessarily
represent the view of IBM.

PowerPC is a registered trademark of International Business Machines
Corporation in the United States, other countries, or both.  Linux is
a trademark of Linus Torvalds in the United States, other countries,
or both.