Threads
=======

Wait a minute. Why are we on threads? Aren't event loops supposed to be **the
way** to do *web-scale programming*? Well... no. Threads are still the medium in
which processors do their jobs. Threads are therefore mighty useful sometimes, even
though you might have to wade through various synchronization primitives.

libuv uses threads internally to fake the asynchronous nature of system calls
that would otherwise block. libuv also uses threads to allow you, the
application, to perform a task asynchronously that is actually blocking, by
spawning a thread and collecting the result when it is done.

Today there are two predominant thread libraries: the Windows threads
implementation and POSIX's :man:`pthreads(7)`. libuv's thread API is analogous to
the pthreads API and often has similar semantics.

A notable aspect of libuv's thread facilities is that they form a self-contained
section within libuv. Whereas other features intimately depend on the event
loop and callback principles, threads are completely agnostic: they block as
required, signal errors directly via return values, and, as shown in the
:ref:`first example <thread-create-example>`, don't even require a running
event loop.

libuv's thread API is also very limited, since the semantics and syntax of
threads differ between platforms, with different levels of completeness.

This chapter makes the following assumption: **There is only one event loop,
running in one thread (the main thread)**. No other thread interacts
with the event loop (except using ``uv_async_send``).

Core thread operations
----------------------

There isn't much here: you start a thread using ``uv_thread_create()`` and
wait for it to finish using ``uv_thread_join()``.

.. _thread-create-example:

.. rubric:: thread-create/main.c
.. literalinclude:: ../../code/thread-create/main.c
    :linenos:
    :lines: 26-36
    :emphasize-lines: 3-7

.. tip::

    ``uv_thread_t`` is just an alias for ``pthread_t`` on Unix, but this is an
    implementation detail; avoid depending on it always being true.

The second parameter is the function which will serve as the entry point for
the thread; the last parameter is a ``void *`` argument which can be used to pass
custom parameters to the thread. The function ``hare`` will now run in a separate
thread, scheduled pre-emptively by the operating system:

.. rubric:: thread-create/main.c
.. literalinclude:: ../../code/thread-create/main.c
    :linenos:
    :lines: 6-14
    :emphasize-lines: 2

Unlike ``pthread_join()``, which allows the target thread to pass back a value to
the calling thread using a second parameter, ``uv_thread_join()`` does not. To
send values use :ref:`inter-thread-communication`.

Synchronization Primitives
--------------------------

This section is purposely spartan. This book is not about threads, so I only
catalogue any surprises in the libuv APIs here. For the rest you can look at
the :man:`pthreads(7)` man pages.

Mutexes
~~~~~~~

The mutex functions are a **direct** map to the pthread equivalents.

.. rubric:: libuv mutex functions
.. code-block:: c

    int uv_mutex_init(uv_mutex_t* handle);
    int uv_mutex_init_recursive(uv_mutex_t* handle);
    void uv_mutex_destroy(uv_mutex_t* handle);
    void uv_mutex_lock(uv_mutex_t* handle);
    int uv_mutex_trylock(uv_mutex_t* handle);
    void uv_mutex_unlock(uv_mutex_t* handle);

The ``uv_mutex_init()``, ``uv_mutex_init_recursive()`` and ``uv_mutex_trylock()``
functions will return 0 on success, and an error code otherwise.

If libuv has been compiled with debugging enabled, ``uv_mutex_destroy()``,
``uv_mutex_lock()`` and ``uv_mutex_unlock()`` will ``abort()`` on error.
Similarly, ``uv_mutex_trylock()`` will abort if the error is anything *other
than* ``EAGAIN`` or ``EBUSY``.

Recursive mutexes are supported, but you should not rely on them. Also, they
should not be used with ``uv_cond_t`` variables.

The default BSD mutex implementation will raise an error if a thread which has
locked a mutex attempts to lock it again. For example, a construct like::

    uv_mutex_init(a_mutex);
    uv_mutex_lock(a_mutex);
    uv_thread_create(thread_id, entry, (void *)a_mutex);
    uv_mutex_lock(a_mutex);
    // more things here

can be used to wait until another thread initializes some stuff and then
unlocks ``a_mutex``, but it will crash your program if in debug mode. Note
that ``uv_mutex_lock()`` returns ``void``, so the second call has no way to
report the error through a return value.

.. note::

    Mutexes on Windows are always recursive.

Locks
~~~~~

Read-write locks are a more granular access mechanism. Two readers can access
shared memory at the same time. A writer may not acquire the lock when it is
held by a reader. A reader or writer may not acquire a lock when a writer is
holding it. Read-write locks are frequently used in databases. Here is a toy
example.

.. rubric:: locks/main.c - simple rwlocks
.. literalinclude:: ../../code/locks/main.c
    :linenos:
    :emphasize-lines: 13,16,27,31,42,55

Run this and observe how the readers will sometimes overlap. In case of
multiple writers, schedulers will usually give them higher priority, so if you
add two writers, you'll see that both writers tend to finish first before the
readers get a chance again.

We also use barriers in the above example so that the main thread can wait for
all readers and writers to indicate they have ended.

Others
~~~~~~

libuv also supports semaphores_, `condition variables`_ and barriers_ with APIs
very similar to their pthread counterparts.

.. _semaphores: https://en.wikipedia.org/wiki/Semaphore_(programming)
.. _condition variables: https://en.wikipedia.org/wiki/Monitor_(synchronization)#Condition_variables_2
.. _barriers: https://en.wikipedia.org/wiki/Barrier_(computer_science)

In addition, libuv provides a convenience function ``uv_once()``. Multiple
threads can attempt to call ``uv_once()`` with a given guard and a function
pointer, but **only the first one will win; the function will be called once and
only once**::

    /* Initialize guard */
    static uv_once_t once_only = UV_ONCE_INIT;

    int i = 0;

    void increment(void) {
        i++;
    }

    /* Thread entry points take a void * argument; uv_once() takes the
       address of the guard. */
    void thread1(void *arg) {
        /* ... work */
        uv_once(&once_only, increment);
    }

    void thread2(void *arg) {
        /* ... work */
        uv_once(&once_only, increment);
    }

    int main(void) {
        /* ... spawn threads */
    }

After all threads are done, ``i == 1``.

.. _libuv-work-queue:

From v0.11.11 onwards, libuv also provides a ``uv_key_t`` struct and an api_ for
thread-local storage.

.. _api: http://docs.libuv.org/en/v1.x/threading.html#thread-local-storage

libuv work queue
----------------

``uv_queue_work()`` is a convenience function that allows an application to run
a task in a separate thread, and have a callback that is triggered when the
task is done. A seemingly simple function, what makes ``uv_queue_work()``
tempting is that it allows potentially any third-party library to be used
with the event-loop paradigm. When you use event loops, it is *imperative to
make sure that no function which runs periodically in the loop thread blocks
when performing I/O or is a serious CPU hog*, because this means that the loop
slows down and events are not being handled at full capacity.

However, a lot of existing code out there features blocking functions (for example
a routine which performs I/O under the hood) that are meant to be used with
threads if you want responsiveness (the classic 'one thread per client' server
model), and getting them to play with an event loop library generally involves
rolling your own system of running the task in a separate thread. libuv just
provides a convenient abstraction for this.

Here is a simple example inspired by `node.js is cancer`_. We are going to
calculate Fibonacci numbers, sleeping a bit along the way, but run it in
a separate thread so that the blocking and CPU-bound task does not prevent the
event loop from performing other activities.

.. rubric:: queue-work/main.c - lazy fibonacci
.. literalinclude:: ../../code/queue-work/main.c
    :linenos:
    :lines: 17-29

The actual task function is simple; nothing in it shows that it is going to be
run in a separate thread. The ``uv_work_t`` structure is the clue. You can pass
arbitrary data through it using the ``void* data`` field and use it to
communicate to and from the thread. But be sure you are using proper locks if
you are changing things while both threads may be running.

The trigger is ``uv_queue_work``:

.. rubric:: queue-work/main.c
.. literalinclude:: ../../code/queue-work/main.c
    :linenos:
    :lines: 31-44
    :emphasize-lines: 10

The thread function will be launched in a separate thread and passed the
``uv_work_t`` structure. Once the function returns, the *after* function
will be called on the thread the event loop is running in. It will be passed
the same structure.

For writing wrappers to blocking libraries, a common :ref:`pattern <baton>`
is to use a baton to exchange data.

Since libuv version 0.9.4 an additional function, ``uv_cancel()``, is
available. This allows you to cancel tasks on the libuv work queue. Only tasks
that *are yet to be started* can be cancelled. If a task has *already started
executing, or it has finished executing*, ``uv_cancel()`` **will fail**.

``uv_cancel()`` is useful to clean up pending tasks if the user requests
termination. For example, a music player may queue up multiple directories to
be scanned for audio files. If the user terminates the program, it should quit
quickly and not wait until all pending requests are run.

Let's modify the Fibonacci example to demonstrate ``uv_cancel()``. We first set
up a signal handler for termination.

.. rubric:: queue-cancel/main.c
.. literalinclude:: ../../code/queue-cancel/main.c
    :linenos:
    :lines: 43-

When the user triggers the signal by pressing ``Ctrl+C`` we call
``uv_cancel()`` on all the workers. ``uv_cancel()`` returns ``0`` only for
tasks it managed to cancel; for those that are already executing or finished
it returns an error code (``UV_EBUSY``).

.. rubric:: queue-cancel/main.c
.. literalinclude:: ../../code/queue-cancel/main.c
    :linenos:
    :lines: 33-41
    :emphasize-lines: 6

For tasks that do get cancelled successfully, the *after* function is called
with ``status`` set to ``UV_ECANCELED``.

.. rubric:: queue-cancel/main.c
.. literalinclude:: ../../code/queue-cancel/main.c
    :linenos:
    :lines: 28-31
    :emphasize-lines: 2

``uv_cancel()`` can also be used with ``uv_fs_t`` and ``uv_getaddrinfo_t``
requests. For the filesystem family of functions, ``uv_fs_t.result`` will be
set to ``UV_ECANCELED``.

.. tip::

    A well designed program would have a way to terminate long running workers
    that have already started executing. Such a worker could periodically check
    for a variable that only the main thread sets to signal termination.

.. _inter-thread-communication:

Inter-thread communication
--------------------------

Sometimes you want various threads to actually send each other messages *while*
they are running. For example, you might be running some long duration task in
a separate thread (perhaps using ``uv_queue_work``) but want to notify progress
to the main thread. Here is a simple example of a download manager
informing the user of the status of running downloads.

.. rubric:: progress/main.c
.. literalinclude:: ../../code/progress/main.c
    :linenos:
    :lines: 7-8,35-
    :emphasize-lines: 2,11

The async thread communication works *on loops*, so although any thread can be
the message sender, only threads with libuv loops can be receivers (or rather
the loop is the receiver). libuv will invoke the callback (``print_progress``)
with the async watcher whenever it receives a message.

.. warning::

    It is important to realize that since the message send is *async*, the callback
    may be invoked immediately after ``uv_async_send`` is called in another
    thread, or it may be invoked after some time. libuv may also combine
    multiple calls to ``uv_async_send`` and invoke your callback only once. The
    only guarantee that libuv makes is that the callback function is called *at
    least once* after the call to ``uv_async_send``. If you have no pending
    calls to ``uv_async_send``, the callback won't be called. If you make two
    or more calls, and libuv hasn't had a chance to run the callback yet, it
    *may* invoke your callback *only once* for the multiple invocations of
    ``uv_async_send``. Your callback will never be called twice for just one
    event.

.. rubric:: progress/main.c
.. literalinclude:: ../../code/progress/main.c
    :linenos:
    :lines: 10-24
    :emphasize-lines: 7-8

In the download function, we modify the progress indicator and queue the message
for delivery with ``uv_async_send``. Remember: ``uv_async_send`` is also
non-blocking and will return immediately.

.. rubric:: progress/main.c
.. literalinclude:: ../../code/progress/main.c
    :linenos:
    :lines: 31-34

The callback is a standard libuv pattern, extracting the data from the watcher.

Finally, it is important to remember to clean up the watcher.

.. rubric:: progress/main.c
.. literalinclude:: ../../code/progress/main.c
    :linenos:
    :lines: 26-29
    :emphasize-lines: 3

After this example, which showed the abuse of the ``data`` field, bnoordhuis_
pointed out that using the ``data`` field is not thread safe, and
``uv_async_send()`` is actually only meant to wake up the event loop. Use
a mutex or rwlock to ensure accesses are performed in the right order.

.. note::

    Mutexes and rwlocks **DO NOT** work inside a signal handler, whereas
    ``uv_async_send`` does.

One use case where ``uv_async_send`` is required is when interoperating with
libraries that require thread affinity for their functionality. For example, in
node.js, a v8 engine instance, its contexts and its objects are bound to the
thread that the v8 instance was started in. Interacting with v8 data structures
from another thread can lead to undefined results. Now consider some node.js
module which binds a third party library. It may go something like this:

1. In node, the third party library is set up with a JavaScript callback to be
   invoked for more information::

    var lib = require('lib');
    lib.on_progress(function() {
        console.log("Progress");
    });

    lib.do();

    // do other stuff

2. ``lib.do`` is supposed to be non-blocking but the third party lib is
   blocking, so the binding uses ``uv_queue_work``.

3. The actual work being done in a separate thread wants to invoke the progress
   callback, but cannot directly call into v8 to interact with JavaScript. So
   it uses ``uv_async_send``.

4. The async callback, invoked in the main loop thread, which is the v8 thread,
   then interacts with v8 to invoke the JavaScript callback.

----

.. _node.js is cancer: http://widgetsandshit.com/teddziuba/2011/10/node-js-is-cancer.html
.. _bnoordhuis: https://github.com/bnoordhuis