The thread switching latency is the time needed by the operating system to switch the CPU to run another thread. In some operating systems running on some hardware, switching between threads belonging to the same process is much faster than switching to a thread from different process (because threads from different processes require a more complicated process context switch).
During the thread switching, the CPU must save all of the current thread's execution data to memory, and load another thread's execution data from memory. Since reading and writing from memory requires some latency, we can define the thread switching latency as the amount of time needed to complete this operation. There is no virtual memory switching required in a thread switch, unlike a process switch.
Related pages