|
Packit |
4e8bc4 |
Multithreading in memcached *was* originally simple:
|
|
Packit |
4e8bc4 |
|
|
Packit |
4e8bc4 |
- One listener thread
|
|
Packit |
4e8bc4 |
- N "event worker" threads
|
|
Packit |
4e8bc4 |
- Some misc background threads
|
|
Packit |
4e8bc4 |
|
|
Packit |
4e8bc4 |
Each worker thread is assigned connections, and runs its own epoll loop. The
|
|
Packit |
4e8bc4 |
central hash table, LRU lists, and some statistics counters are covered by
|
|
Packit |
4e8bc4 |
global locks. Protocol parsing, data transfer happens in threads. Data lookups
|
|
Packit |
4e8bc4 |
and modifications happen under central locks.
|
|
Packit |
4e8bc4 |
|
|
Packit |
4e8bc4 |
THIS HAS CHANGED!
|
|
Packit |
4e8bc4 |
|
|
Packit |
4e8bc4 |
- A secondary small hash table of locks is used to lock an item by its hash
|
|
Packit |
4e8bc4 |
value. This prevents multiple threads from acting on the same item at the
|
|
Packit |
4e8bc4 |
same time.
|
|
Packit |
4e8bc4 |
- This secondary hash table is mapped to the central hash tables buckets. This
|
|
Packit |
4e8bc4 |
allows multiple threads to access the hash table in parallel. Only one
|
|
Packit |
4e8bc4 |
thread may read or write against a particular hash table bucket.
|
|
Packit |
4e8bc4 |
- atomic refcounts per item are used to manage garbage collection and
|
|
Packit |
4e8bc4 |
mutability.
|
|
Packit |
4e8bc4 |
|
|
Packit |
4e8bc4 |
- When pulling an item off of the LRU tail for eviction or re-allocation, the
|
|
Packit |
4e8bc4 |
system must attempt to lock the item's bucket, which is done with a trylock
|
|
Packit |
4e8bc4 |
to avoid deadlocks. If a bucket is in use (and not by that thread) it will
|
|
Packit |
4e8bc4 |
walk up the LRU a little in an attempt to fetch a non-busy item.
|
|
Packit |
4e8bc4 |
|
|
Packit |
4e8bc4 |
- Each LRU (and sub-LRU's in newer modes) has an independent lock.
|
|
Packit |
4e8bc4 |
|
|
Packit |
4e8bc4 |
- Raw accesses to the slab class are protected by a global slabs_lock. This
|
|
Packit |
4e8bc4 |
is a short lock which covers pushing and popping free memory.
|
|
Packit |
4e8bc4 |
|
|
Packit |
4e8bc4 |
- item_lock must be held while modifying an item.
|
|
Packit |
4e8bc4 |
- slabs_lock must be held while modifying the ITEM_SLABBED flag bit within an item.
|
|
Packit |
4e8bc4 |
- ITEM_LINKED must not be set before an item has a key copied into it.
|
|
Packit |
4e8bc4 |
- items without ITEM_SLABBED set cannot have their memory zeroed out.
|
|
Packit |
4e8bc4 |
|
|
Packit |
4e8bc4 |
LOCK ORDERS:
|
|
Packit |
4e8bc4 |
|
|
Packit |
4e8bc4 |
(incomplete as of writing, sorry):
|
|
Packit |
4e8bc4 |
|
|
Packit |
4e8bc4 |
item_lock -> lru_lock -> slabs_lock
|
|
Packit |
4e8bc4 |
|
|
Packit |
4e8bc4 |
lru_lock -> item_trylock
|
|
Packit |
4e8bc4 |
|
|
Packit |
4e8bc4 |
Various stats_locks should never have other locks as dependencies.
|
|
Packit |
4e8bc4 |
|
|
Packit |
4e8bc4 |
Various locks exist for background threads. They can be used to pause the
|
|
Packit |
4e8bc4 |
thread execution or update settings while the threads are idle. They may call
|
|
Packit |
4e8bc4 |
item or lru locks.
|
|
Packit |
4e8bc4 |
|
|
Packit |
4e8bc4 |
A low priority issue:
|
|
Packit |
4e8bc4 |
|
|
Packit |
4e8bc4 |
- If you remove the per-thread stats lock, CPU usage goes down by less than a
|
|
Packit |
4e8bc4 |
point of a percent, and it does not improve scalability.
|
|
Packit |
4e8bc4 |
- In my testing, the remaining global STATS_LOCK calls never seem to collide.
|
|
Packit |
4e8bc4 |
|
|
Packit |
4e8bc4 |
Yes, more stats can be moved to threads, and those locks can actually be
|
|
Packit |
4e8bc4 |
removed entirely on x86-64 systems. However my tests haven't shown that as
|
|
Packit |
4e8bc4 |
beneficial so far, so I've prioritized other work.
|