A nice overview of threading primitives in Chromium was posted in the Chromium Advent Calendar. Just as Chromium/Blink maintain their own threading primitives, the threading primitives in WebKit have also changed considerably since the fork. In this post, I briefly introduce the threading primitives in WebKit.
WTF::Thread
In the old days, we had ThreadIdentifier, which was actually a uint32_t identifier for a thread.
We had an internal hash table mapping each ThreadIdentifier to a PlatformThread such as pthread_t.
createThread(String name, Function) -> ThreadIdentifier
waitForCompletion(ThreadIdentifier) -> bool
However, this design had several problems.
First, it had very limited extensibility: attaching additional information to a thread is not as easy as extending a class.
Second, the interface required looking up the corresponding thread handle in the hash table every time we called a threading operation.
Finally, and worst of all, we could not know whether a given ThreadIdentifier was still in use.
We could not manage the lifetime of the holder of the thread (here, a ThreadIdentifier), because it is a plain integer that can be copied anywhere.
This means that ThreadIdentifier values had to increase monotonically and could never be reused.
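To make the problem concrete, here is a minimal sketch of an identifier-based API in standard C++. It is not WebKit's old code: IdThreadTable and staleIdentifierIsRejected are hypothetical names, the two member functions are only modeled on the signatures above, and std::thread stands in for PlatformThread.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <mutex>
#include <thread>
#include <unordered_map>
#include <utility>

using ThreadIdentifier = uint32_t;

// Hypothetical stand-in for the old global table.
class IdThreadTable {
public:
    ThreadIdentifier createThread(std::function<void()> entry)
    {
        std::lock_guard<std::mutex> locker(m_lock);
        ThreadIdentifier id = ++m_lastIdentifier; // never reused, only grows
        m_threads.emplace(id, std::thread(std::move(entry)));
        return id;
    }

    // Every operation pays for a hash lookup under a lock.
    bool waitForCompletion(ThreadIdentifier id)
    {
        std::thread thread;
        {
            std::lock_guard<std::mutex> locker(m_lock);
            auto it = m_threads.find(id);
            if (it == m_threads.end())
                return false; // unknown or already joined: the caller cannot tell which
            thread = std::move(it->second);
            m_threads.erase(it);
        }
        thread.join();
        return true;
    }

private:
    std::mutex m_lock;
    std::unordered_map<ThreadIdentifier, std::thread> m_threads;
    ThreadIdentifier m_lastIdentifier { 0 };
};

// Returns true when a stale copy of an identifier is rejected instead of
// joining an unrelated thread.
bool staleIdentifierIsRejected()
{
    IdThreadTable table;
    ThreadIdentifier first = table.createThread([] { });
    ThreadIdentifier copy = first; // a plain integer: the table cannot see this copy
    bool joined = table.waitForCompletion(first);
    ThreadIdentifier second = table.createThread([] { });
    bool staleRejected = !table.waitForCompletion(copy); // safe only because second != copy
    bool secondJoined = table.waitForCompletion(second);
    return joined && staleRejected && secondJoined && second != copy;
}
```

If the table recycled identifiers, `second` could equal `copy`, and the stale call would silently join the wrong thread.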
The Thread class in WTF (of course, WTF stands for Web Template Framework) is a brand new abstraction over native threads that solves the above problems.
It is a ref-counted object: Thread::create(...) -> Ref<Thread> creates it, and it offers threading operations as member functions.
For example, thread->waitForCompletion() is the join function in WebKit threading.
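The shape of this API can be approximated in standard C++. The sketch below is an analog, not WTF code: SketchThread and runAndJoin are hypothetical names, std::shared_ptr stands in for Ref<Thread>, and std::thread stands in for the native thread.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <thread>
#include <utility>

// Hypothetical analog of a ref-counted thread object.
class SketchThread {
public:
    static std::shared_ptr<SketchThread> create(std::string name, std::function<void()> entry)
    {
        std::shared_ptr<SketchThread> thread(new SketchThread(std::move(name)));
        thread->m_handle = std::thread(std::move(entry));
        return thread;
    }

    ~SketchThread()
    {
        if (m_handle.joinable())
            m_handle.detach(); // nobody waited; let the thread finish on its own
    }

    // The join operation is a member function: no identifier lookup is needed.
    void waitForCompletion()
    {
        if (m_handle.joinable())
            m_handle.join();
    }

    const std::string& name() const { return m_name; }

private:
    explicit SketchThread(std::string name)
        : m_name(std::move(name))
    {
    }

    std::string m_name; // extra per-thread data is just another member
    std::thread m_handle;
};

int runAndJoin()
{
    int result = 0;
    auto thread = SketchThread::create("Worker", [&result] { result = 42; });
    thread->waitForCompletion(); // after the join, the write to result is visible here
    return result;
}
```

Attaching more information to a thread now means adding a member, which is the extensibility the identifier design lacked.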
This Thread class is portable. It just works (TM) on macOS, Linux (and UNIX environments including FreeBSD), and Windows.
This portability is important for building advanced features on top of the Thread abstraction.
There is a one-to-one correspondence between a native thread and a Thread object.
This ref-counted object is held in thread-local storage (TLS) and retained while the thread is running.
We can get the current thread from TLS by calling Thread::current() -> Thread&.
So, for example, checking whether a given Thread is the current one is a simple pointer comparison: thread == &Thread::current().
A user of the thread can also retain the Thread to perform threading operations on it.
Since Thread is ref-counted, it is destroyed when nobody retains it.
That is, Thread is destructed once (1) no user retains it and (2) the thread itself has finished.
By introducing Thread, (1) we can easily attach any information we want to it.
Moreover, (2) we can manage the lifetime of the holder of the thread through the ref-counted Thread object.
We can destroy Thread when it is no longer used.
With ThreadIdentifier, we could not recycle an unused ThreadIdentifier, since it might still be held somewhere.
Note that Thread::current() works even if we call it from a thread that was not created by WebKit (a.k.a. an external thread).
In that case, a Thread is created and stored in TLS. When the thread finishes, this TLS slot and the Thread it holds are automatically destroyed.
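The lazy TLS lookup described above can be sketched with C++ thread_local. This is only an illustration under assumptions: ThreadRecord, currentRecord, and isCurrent are hypothetical names, and std::shared_ptr stands in for WTF's ref-counting.

```cpp
#include <cassert>
#include <memory>
#include <thread>

// Hypothetical per-thread record.
struct ThreadRecord {
    std::thread::id id = std::this_thread::get_id();
};

// Returns the calling thread's record, creating it lazily so that threads
// not started through this API (external threads) work too.
ThreadRecord& currentRecord()
{
    // The thread_local shared_ptr is the TLS reference; it is released at thread exit.
    thread_local std::shared_ptr<ThreadRecord> record = std::make_shared<ThreadRecord>();
    return *record;
}

bool isCurrent(const ThreadRecord* candidate)
{
    return candidate == &currentRecord(); // pointer comparison, as in the text
}

bool externalThreadGetsItsOwnRecord()
{
    ThreadRecord* mainRecord = &currentRecord();
    bool sameOnMain = isCurrent(mainRecord);
    bool differentOnOther = false;
    std::thread external([&] {
        // This thread never registered itself, yet currentRecord() still works.
        differentOnOther = !isCurrent(mainRecord) && isCurrent(&currentRecord());
    });
    external.join();
    return sameOnMain && differentOnOther;
}
```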
Advanced features of Thread
One interesting aspect of our Thread is that it has a bunch of advanced features that are not typically offered by standard libraries such as C++ std::thread.
This stems from the fact that our WTF library is tightly coupled with JavaScriptCore (JSC).
Thread offers the advanced features that JSC needs.
For example, Thread::suspend() -> Expected<void, PlatformSuspendError> is a platform-independent way to suspend a thread.
It is portable (working on macOS, Linux, and Windows) and is used for stop-the-world in garbage collection (GC) [1].
Here is a list of such advanced features in our Thread ;). They are the building blocks of the GC in JSC.
Thread::suspend() -> Expected<void, PlatformSuspendError>
Thread::resume() -> void
Thread::getRegisters(PlatformRegisters&) -> size_t
Thread::stack() -> const StackBounds&
In POSIX environments, we have a further feature.
Thread::signal(int) -> bool
While macOS and Windows have platform APIs to suspend and resume threads, Linux does not. There, we implement suspend and resume using POSIX signals and a semaphore, which is a typical way to implement a stop-the-world GC operation.
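The signal-plus-semaphore technique can be sketched as follows. This is a simplified illustration, not WebKit's implementation. It assumes SIGUSR1 for suspend and SIGUSR2 for resume, a single target thread at a time, and a platform with unnamed POSIX semaphores such as Linux. All function and variable names are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cerrno>
#include <csignal>
#include <pthread.h>
#include <semaphore.h>
#include <unistd.h>

static sem_t g_ack; // posted by the target: once when parked, once when running again
static std::atomic<bool> g_shouldPark { false };
static std::atomic<unsigned long> g_counter { 0 };
static std::atomic<bool> g_stop { false };

static void resumeHandler(int) { } // only needs to interrupt sigsuspend()

static void suspendHandler(int)
{
    int savedErrno = errno;
    sem_post(&g_ack); // async-signal-safe: tell the suspender we are parked
    sigset_t waitMask;
    sigfillset(&waitMask);
    sigdelset(&waitMask, SIGUSR2); // only the resume signal may wake us
    while (g_shouldPark.load())
        sigsuspend(&waitMask);
    sem_post(&g_ack); // tell the resumer we are running again
    errno = savedErrno;
}

static void installHandlers()
{
    sem_init(&g_ack, 0, 0);
    struct sigaction action {};
    sigfillset(&action.sa_mask); // keeps SIGUSR2 pending until sigsuspend() unblocks it
    action.sa_flags = SA_RESTART;
    action.sa_handler = suspendHandler;
    sigaction(SIGUSR1, &action, nullptr);
    action.sa_handler = resumeHandler;
    sigaction(SIGUSR2, &action, nullptr);
}

static void waitForAck()
{
    while (sem_wait(&g_ack) == -1 && errno == EINTR) { }
}

void suspendThread(pthread_t target)
{
    g_shouldPark.store(true);
    pthread_kill(target, SIGUSR1);
    waitForAck(); // returns once the target sits inside its signal handler
}

void resumeThread(pthread_t target)
{
    g_shouldPark.store(false);
    pthread_kill(target, SIGUSR2);
    waitForAck();
}

static void* spin(void*)
{
    while (!g_stop.load())
        g_counter.fetch_add(1);
    return nullptr;
}

// Returns true if the worker made no progress while it was suspended.
bool counterFreezesWhileSuspended()
{
    installHandlers();
    pthread_t worker;
    pthread_create(&worker, nullptr, spin, nullptr);
    suspendThread(worker);
    unsigned long before = g_counter.load();
    usleep(20 * 1000);
    unsigned long after = g_counter.load();
    resumeThread(worker);
    g_stop.store(true);
    pthread_join(worker, nullptr);
    return before == after;
}
```

The semaphore is what turns an asynchronous signal into a synchronous operation: suspendThread returns only after the target has acknowledged from inside its handler, so the caller can then inspect the parked thread safely.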
Locking
I will not say much about locking in this post, since there is a very nice blog post about it on webkit.org.
WTF offers WTF::Lock and WTF::Condition.
WTF::ThreadGroup
Grouping live threads is useful. Consider a multi-threaded environment in which various threads take the lock of one VM, run JS, and release the lock. When GC happens, the conservative GC needs to scan the stacks and registers of the live threads that have touched this VM.
ThreadGroup offers exactly this feature. We can add a Thread to a ThreadGroup.
When a Thread finishes its execution, it cooperatively removes itself from the ThreadGroups it was added to.
If you take the lock of a ThreadGroup, all the threads in this ThreadGroup are kept alive until the lock is released.
We can iterate over the live threads in a ThreadGroup and suspend each of them to perform stop-the-world.
While the concept of ThreadGroup is simple, its implementation is a bit tricky.
A Thread can finish concurrently and be removed from a ThreadGroup.
Any thread can add any Thread to multiple ThreadGroups at any time.
If you are interested in the implementation, you can look into the change.
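As a rough illustration of why the bookkeeping is tricky, here is a minimal sketch of one possible locking scheme. It is not the WTF implementation. GroupSketch, MemberThread, and liveAfterOneExit are hypothetical names, and the lock order (thread mutex first, then group mutex) is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <unordered_set>
#include <vector>

class GroupSketch;

// Hypothetical stand-in for Thread: remembers which groups it joined.
class MemberThread {
public:
    void didExit(); // called by the thread itself when it finishes

private:
    friend class GroupSketch;
    std::mutex m_mutex;
    bool m_exited { false };
    std::vector<std::weak_ptr<GroupSketch>> m_groups;
};

class GroupSketch : public std::enable_shared_from_this<GroupSketch> {
public:
    static std::shared_ptr<GroupSketch> create() { return std::shared_ptr<GroupSketch>(new GroupSketch); }

    // Lock order: thread mutex first, then group mutex.
    bool add(MemberThread& thread)
    {
        std::lock_guard<std::mutex> threadLocker(thread.m_mutex);
        if (thread.m_exited)
            return false; // too late: the thread already finished
        std::lock_guard<std::mutex> groupLocker(m_mutex);
        if (m_threads.insert(&thread).second)
            thread.m_groups.push_back(shared_from_this());
        return true;
    }

    std::mutex& mutex() { return m_mutex; }

    // Call only while holding mutex(); members cannot disappear meanwhile.
    std::size_t liveCount() const { return m_threads.size(); }

private:
    friend class MemberThread;
    GroupSketch() = default;
    std::mutex m_mutex;
    std::unordered_set<MemberThread*> m_threads;
};

void MemberThread::didExit()
{
    std::vector<std::weak_ptr<GroupSketch>> groups;
    {
        std::lock_guard<std::mutex> locker(m_mutex);
        m_exited = true;
        groups.swap(m_groups);
    }
    for (auto& weakGroup : groups) {
        if (auto group = weakGroup.lock()) {
            // Blocks while someone holds the group lock, so the thread stays alive for them.
            std::lock_guard<std::mutex> groupLocker(group->m_mutex);
            group->m_threads.erase(this);
        }
    }
}

std::size_t liveAfterOneExit()
{
    auto group = GroupSketch::create();
    MemberThread a, b;
    group->add(a);
    group->add(b);
    a.didExit(); // a removes itself cooperatively
    std::size_t live;
    {
        std::lock_guard<std::mutex> locker(group->mutex());
        live = group->liveCount();
    }
    b.didExit();
    return live;
}
```

The thread holds only weak references to its groups, so a group can be destroyed independently of the threads that were added to it.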
WTF::WorkQueue and WTF::AutomaticThread
WorkQueue is a simple abstraction. We can put a task (a Function) into the queue, and a thread running inside the WorkQueue polls the queue and runs the tasks.
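A queue of this kind can be sketched in standard C++ as below. WorkQueueSketch and its dispatch member are hypothetical names, and std::function stands in for WTF's Function. The real WorkQueue is platform-specific and differs in detail.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

class WorkQueueSketch {
public:
    WorkQueueSketch()
        : m_thread([this] { run(); })
    {
    }

    ~WorkQueueSketch()
    {
        {
            std::lock_guard<std::mutex> locker(m_mutex);
            m_done = true;
        }
        m_condition.notify_one();
        m_thread.join(); // the worker drains the remaining tasks first
    }

    void dispatch(std::function<void()> task)
    {
        {
            std::lock_guard<std::mutex> locker(m_mutex);
            m_tasks.push_back(std::move(task));
        }
        m_condition.notify_one();
    }

private:
    void run()
    {
        std::unique_lock<std::mutex> locker(m_mutex);
        for (;;) {
            m_condition.wait(locker, [this] { return m_done || !m_tasks.empty(); });
            if (m_tasks.empty())
                return; // m_done is set and nothing is left to run
            auto task = std::move(m_tasks.front());
            m_tasks.pop_front();
            locker.unlock();
            task(); // run outside the lock so dispatch() is never blocked by a task
            locker.lock();
        }
    }

    std::mutex m_mutex;
    std::condition_variable m_condition;
    std::deque<std::function<void()>> m_tasks;
    bool m_done { false };
    std::thread m_thread; // declared last so the other members exist before run() starts
};

int sumViaQueue()
{
    int sum = 0;
    {
        WorkQueueSketch queue;
        for (int i = 1; i <= 4; ++i)
            queue.dispatch([&sum, i] { sum += i; }); // tasks run one at a time, in order
    } // the destructor drains the queue and joins the worker
    return sum;
}
```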
There is also an abstraction similar to WorkQueue: AutomaticThread.
This is a very fancy feature used in JSC. AutomaticThread can poll tasks and run them.
While WorkQueue can take any function as its task, AutomaticThread implements the body of the task in a virtual member function, but the semantics are very similar.
The difference is that the underlying Thread is automatically destroyed when AutomaticThread has been idle for more than 10 seconds.
Reducing the number of threads significantly reduces the memory consumption of the browser.
An outstanding example is malloc. Recent malloc libraries use TLS to achieve high performance in multi-threaded environments.
For example, various malloc implementations (including bmalloc in WebKit) keep a synchronization-free cache in TLS to speed up the fast path.
This cache remains until the thread is destroyed!
AutomaticThread is mainly used for the concurrent JIT compiler threads in JSC.
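The idle-timeout behavior can be sketched with a timed condition wait. This is an illustration only. AutoWorkerSketch is a hypothetical name, the task body is passed as a std::function rather than a virtual member to keep the sketch short, and the timeout is 50 milliseconds instead of 10 seconds so the demo finishes quickly.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

class AutoWorkerSketch {
public:
    AutoWorkerSketch(std::chrono::milliseconds idleTimeout, std::function<void(int)> body)
        : m_idleTimeout(idleTimeout)
        , m_body(std::move(body))
    {
    }

    ~AutoWorkerSketch()
    {
        {
            std::lock_guard<std::mutex> locker(m_mutex);
            m_shutdown = true;
        }
        m_condition.notify_all();
        if (m_thread.joinable())
            m_thread.join(); // the worker drains pending items before it returns
    }

    void enqueue(int item)
    {
        std::lock_guard<std::mutex> locker(m_mutex);
        m_items.push_back(item);
        if (!m_running) {
            if (m_thread.joinable())
                m_thread.join(); // the old worker already left run(); this only reaps it
            m_running = true;
            ++m_starts;
            m_thread = std::thread([this] { run(); });
        }
        m_condition.notify_one();
    }

    bool hasThread()
    {
        std::lock_guard<std::mutex> locker(m_mutex);
        return m_running;
    }

    int threadStarts()
    {
        std::lock_guard<std::mutex> locker(m_mutex);
        return m_starts;
    }

private:
    void run()
    {
        std::unique_lock<std::mutex> locker(m_mutex);
        for (;;) {
            bool woke = m_condition.wait_for(locker, m_idleTimeout,
                [this] { return m_shutdown || !m_items.empty(); });
            if (!woke || m_items.empty()) {
                m_running = false; // idle for too long, or shutting down with nothing left
                return;
            }
            int item = m_items.front();
            m_items.pop_front();
            locker.unlock();
            m_body(item);
            locker.lock();
        }
    }

    const std::chrono::milliseconds m_idleTimeout;
    std::function<void(int)> m_body;
    std::mutex m_mutex;
    std::condition_variable m_condition;
    std::deque<int> m_items;
    bool m_running { false };
    bool m_shutdown { false };
    int m_starts { 0 };
    std::thread m_thread;
};

// Returns true if the worker thread went away while idle and a new one was started on demand.
bool restartsAfterIdle()
{
    int sum = 0;
    int starts = 0;
    {
        AutoWorkerSketch worker(std::chrono::milliseconds(50), [&sum](int item) { sum += item; });
        worker.enqueue(1);
        while (worker.hasThread())
            std::this_thread::sleep_for(std::chrono::milliseconds(5)); // wait for the idle exit
        worker.enqueue(2); // starts a second thread
        starts = worker.threadStarts();
    }
    return starts == 2 && sum == 3;
}
```

Because the thread really exits when idle, any per-thread state it owns, such as a malloc cache in TLS, is released as well.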
WTF::ParallelHelperPool
ParallelHelperPool is an interesting thread pool that is intended to be shared by multiple parallel tasks.
Suppose we have a task that can be executed in parallel, e.g. GC’s parallel marking.
We set this task on the pool, and the pool runs it in parallel.
As noted in the code, this abstraction is suitable for the case where there are multiple concurrent tasks that may all want parallelism.
Since the threads are managed by AutomaticThread, they are automatically destroyed if the pool is not used.
Currently it is used for GC’s parallel marking (and, recently, these threads also perform parallel marking constraint solving).
WTF::ThreadMessage
ThreadMessage is a feature that executes a lambda while a specified thread is suspended.
It is built on Thread::suspend and Thread::resume.
Since suspend and resume are portable, the ThreadMessage feature is also portable.
By using sendMessage, we can modify some data while a thread is suspended.
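The control flow of such a message can be sketched generically. The code below does not use the WTF signature: sendMessageSketch and LoggingThread are hypothetical names, and the fake thread only logs calls so that the ordering is observable.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a suspendable thread; it only records what happened.
struct LoggingThread {
    std::vector<std::string> log;
    bool suspend()
    {
        log.push_back("suspend");
        return true;
    }
    void resume() { log.push_back("resume"); }
};

// Runs `message` while `thread` is suspended and always resumes it afterwards.
template<typename Thread, typename Function>
bool sendMessageSketch(Thread& thread, Function&& message)
{
    if (!thread.suspend())
        return false; // e.g. the thread has already exited
    struct Resumer {
        Thread& thread;
        ~Resumer() { thread.resume(); } // resume even if message() throws
    } resumer { thread };
    message();
    return true;
}

std::vector<std::string> messageRunsBetweenSuspendAndResume()
{
    LoggingThread thread;
    sendMessageSketch(thread, [&thread] { thread.log.push_back("message"); });
    return thread.log;
}
```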
In WebKit, it is used to insert a trap into a running VM (called a VM trap). One example is terminating a running VM without introducing runtime cost. You may have seen a dialog like “Your JS takes too much time. Do you want to stop it?”.
In JSC, we have the check_trap bytecode. In the optimizing compiler, it just emits a nop.
When we would like to terminate a running VM, we sendMessage to the thread running this VM.
While the thread is suspended, we replace the JIT-generated nop with hlt on x86, and then resume the thread.
When the resumed thread hits this hlt, it raises a fault signal, which we handle in our signal handler.
In the signal handler, we throw an uncatchable JS exception for VM termination, and the VM is terminated.
Since ordinary optimized JIT code just executes a nop, this feature introduces no runtime cost.
WTF::ThreadSpecific
WTF::ThreadSpecific<> offers a portable abstraction of TLS over POSIX and Windows.
TLS is storage for per-thread data. It helps achieve high performance in multi-threaded environments, since accessing data in TLS does not require synchronization.
WTF::ThreadSpecific<> uses pthread_getspecific and pthread_setspecific in POSIX environments.
On Windows, we use fiber local storage (FLS) under the hood.
One interesting feature of TLS is FAST_TLS on Darwin.
It is a platform-provided, system-reserved TLS slot such as __PTK_FRAMEWORK_JAVASCRIPTCORE_KEY3.
You can find these slots in the system library header.
They are intended to be used by system libraries.
For these slots, you can use _pthread_getspecific_direct(KEY), which compiles to a memory access through a segment register plus an offset, so it is quite fast.
It is a nice example of co-designing the platform and system libraries.
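For the POSIX side, a wrapper over pthread keys can be sketched as follows. ThreadSpecificSketch is a hypothetical and much simplified analog, not the WTF class. It never deletes its key, and it does not cover Windows or the Darwin fast slots.

```cpp
#include <cassert>
#include <pthread.h>
#include <thread>

template<typename T>
class ThreadSpecificSketch {
public:
    ThreadSpecificSketch()
    {
        int error = pthread_key_create(&m_key, destroy);
        assert(!error);
        (void)error;
    }

    ThreadSpecificSketch(const ThreadSpecificSketch&) = delete;
    ThreadSpecificSketch& operator=(const ThreadSpecificSketch&) = delete;

    // Lazily creates the calling thread's value on first access; no locking is needed.
    T& operator*()
    {
        void* slot = pthread_getspecific(m_key);
        if (!slot) {
            slot = new T();
            pthread_setspecific(m_key, slot);
        }
        return *static_cast<T*>(slot);
    }

private:
    // Called by the pthread library at thread exit for each non-null slot.
    static void destroy(void* slot) { delete static_cast<T*>(slot); }

    pthread_key_t m_key;
};

// Returns true if a second thread sees its own fresh value instead of the main thread's.
bool otherThreadSeesOwnSlot()
{
    static ThreadSpecificSketch<int> counter;
    *counter = 7; // the calling thread's slot
    int seenByOther = -1;
    std::thread other([&] {
        seenByOther = *counter; // a value-initialized int for this thread
        *counter = 99; // does not affect the caller's slot
    });
    other.join();
    return seenByOther == 0 && *counter == 7;
}
```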
Summary
WebKit has the WTF utility library, which offers various fancy threading primitives.
As we encourage more parallelism in WebKit, we will add more features to WTF.
[1] This stop-the-world functionality is also used to implement the sampling profiler in JSC. A sampling profiler thread periodically stops the JS VM thread, retrieves execution context data including stack traces, and resumes the thread.