Thread Scheduling
The scheduling of threads has three levels: scheduling between threads, the mapping
of threads to kernel-level (light-weight) processes, and the mapping of processes to
processors. The first is controlled by pthread library calls, the second is sometimes
partly controllable through the thread library, and the third is up to the OS. The
second level is called kernel scheduling here, and is dealt with first.
Kernel scheduling
Some OSes have something called a light-weight process (LWP), which uses
fewer resources than a full-blown process. Another term, which may be more useful
conceptually, is kernel threads: threads that are provided by the OS
and onto which regular threads are mapped. This was done on SGI machines, where LWPs
are called sprocs. Although you can fire up LWPs in much the same way you
can start up and control regular processes (via fork() and similar calls), we will
regard them as kernel resources here.
There are three basic models of mapping threads to LWPs:
- Many-to-one: Multiple threads in user space take turns sharing
a single LWP. That is, the kernel provides only one LWP to each
process, and that process's threads must share the resource. This
means thread creation and synchronization are fast, but if any
thread makes a blocking OS call (for I/O, etc.) then every
thread is blocked. Running the code on a multiprocessor gains you
nothing.
- One-to-one: One LWP is allocated for each thread. This allows
multiprocessing of threads (if LWPs get mapped to different
processors), and a thread can block in the OS
while other threads continue running. The problem is that
thread creation is expensive, requiring system calls.
Win32 threads did this ... can you see why?
- M-to-N: Multiple user threads are mapped to a smaller or equal
number of LWPs. The OS can create more LWPs if it sees a
large number of threads being spawned, or it can simply run
a fixed number at all times.
The M-to-N scheme is the most common. Variants (called two-level) allow
a user to impose a one-to-one binding for certain threads, that is,
to specify that a single LWP is to be assigned to a particular thread.
Scheduling Scope
If the M-to-N model is used, then scheduling of threads is in "process contention
scope" - threads within a process compete with each other, and all the scheduling of
those threads is local to the process. So the threads library determines which thread
will be scheduled onto an LWP.
The one-to-one model has "system contention scope" - the kernel is responsible for
scheduling threads. The thread library may provide hints (for example, when a thread
is blocked waiting on a mutex, the library may let the kernel know it is doing so).
Win32 threads use system contention scope.
Thread States
Threads can be in the following states:
- Active: the thread is running on an LWP. This does not mean
it is actually running on a processor, however.
- Runnable: the thread is ready to run and is waiting to be
assigned to an LWP (either one freed up by another thread,
or one newly created).
- Blocked: the thread is sleeping, waiting for example on a mutex.
There are some possible intermediate states which do not seriously affect
parallel programming. One is the zombie state: the thread
has finished and is waiting for its resources to be collected. Using
detached threads prevents zombies.
Thread Priorities
For parallel computing, don't mess with this. You will most likely
end up shooting yourself in the foot.
Last Modified: Tue 06 Feb 2018, 03:50 PM