The series so far:
- The Unconventional Guide to Introduction to Thread
- Why Do People Think Dedicated CLR Threads is a Good Idea?
- How Not Knowing Thread Members and Execution State Makes You a Rookie
- Doing Thread Scheduling and Priority the Right Way
- How to Start Using CLR's Thread Pool
- What Wikipedia Can't Tell You About Thread Execution Contexts
- The Insider's Guide to Cooperative Cancellation and Timeout
Why Does Windows Support Processes?
Early computers had just a single thread of execution. Imagine an application on early (16-bit) Windows that enters an infinite loop or corrupts OS data: the whole machine goes down with it.
To prevent such disasters, the OS was redesigned to run each instance of an application in its own process. A process is just a collection of resources used by a single application instance.
Each process gets its own virtual address space that holds its code and data. This isolation keeps the system secure: a process cannot corrupt the OS or another application.
But what if an application enters an infinite loop? What about the CPU?
If there is only one CPU, it executes the infinite loop and cannot run anything else. The data stays safe, but the whole system still hangs.
Threads were the answer
- A thread is a Windows concept whose job is to virtualise the CPU
- Windows gives each process its own thread
- A thread functions much like a CPU
Voila! Threads save the day
In the scenario above, if application code enters an infinite loop, the process associated with that code freezes, but other processes (which have their own threads) are not frozen; they keep running!
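The same effect can be seen in miniature inside one process. The sketch below uses Python rather than C#, purely for illustration, and the function names `runaway` and `responsive_work` are made up for this example. One thread spins in a loop that never finishes on its own, yet the main thread still gets CPU time and completes its work.

```python
import threading
import time

def runaway(stop: threading.Event) -> None:
    """Simulates buggy code stuck in a loop (stoppable only so the demo can exit)."""
    while not stop.is_set():
        pass  # burn CPU without ever yielding voluntarily

def responsive_work() -> int:
    """Work done on another thread while the runaway loop is spinning."""
    total = 0
    for i in range(5):
        total += i
        time.sleep(0.01)  # pretend to handle a user event
    return total

stop = threading.Event()
spinner = threading.Thread(target=runaway, args=(stop,), daemon=True)
spinner.start()

result = responsive_work()           # still makes progress
spinner_alive = spinner.is_alive()   # the loop is still spinning at this point

stop.set()                           # end the demo cleanly
spinner.join()
print(result, spinner_alive)
```

The spinning thread is preempted periodically, which is what keeps the rest of the program responsive.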
Like every virtualisation mechanism, threads have space and time overheads.
AWESOMENESS of Threads
- Responsiveness - Windows stays responsive even while long-running applications execute
- Better Control - The user can forcibly kill an unresponsive application (Task Manager)
Threads also have space and time overheads:
- Space Overhead - Memory Consumption
- Time Overhead - Runtime Execution Performance
Aspects for understanding Thread Objects
Let's look at some common aspects that are useful for understanding thread objects.
1. The Kernel is a bridge between the application and the hardware
2. Interactions with OS
- The system keeps all the information required for thread execution and scheduling inside the thread kernel object.
- The thread kernel object is the only handle through which the operating system accesses information about the thread and uses it for thread execution and scheduling.
3. Kernel Stack Operations
- The application code passes arguments to a kernel-mode function in the operating system using the kernel-mode stack.
- For security reasons, Windows copies any arguments passed from user-mode code to the kernel from the thread’s user-mode stack to the thread’s kernel-mode stack.
- Once copied, the kernel can verify the arguments’ values. Because application code can’t access the kernel-mode stack, the application can’t modify the values after validation, and the kernel code can safely operate on them.
- Also, the kernel calls methods within itself and uses the kernel-mode stack to pass its arguments, to store a function’s local variables, and to store return addresses.
- The kernel-mode stack is 12 KB when running on a 32-bit Windows system and 24 KB when running on a 64-bit Windows system.
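The copy-then-validate idea can be sketched with a toy model. This is not how Windows is implemented; it is a Python illustration in which `kernel_call`, `user_stack`, and `tamper` are hypothetical names. The "kernel" copies the arguments, validates its private copy, and is unaffected when the caller changes the original afterwards.

```python
from typing import Callable, List

def kernel_call(user_args: List[int], after_validation: Callable[[], None]) -> int:
    """Toy system call: copy the arguments, validate the copy, operate on the copy."""
    kernel_args = list(user_args)            # copy onto the "kernel-mode stack"
    if any(a < 0 for a in kernel_args):      # validate the private copy
        raise ValueError("negative argument rejected")
    after_validation()                       # stands in for user code racing the kernel
    return sum(kernel_args)                  # the kernel only ever reads its own copy

user_stack = [1, 2, 3]

def tamper() -> None:
    """User-mode code modifies its own buffer after validation has passed."""
    user_stack[0] = -100

total = kernel_call(user_stack, tamper)
print(total, user_stack)
```

Had the kernel operated on `user_stack` directly, the tampered value would have slipped past validation.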
4. DLL Thread-Attach and Thread-Detach Notifications
These notifications let DLLs perform special per-thread initialization and clean-up. Only unmanaged DLLs receive them; managed DLLs do not.
Disable Thread Library Calls
- DLLs produced by C# and most other managed programming languages do not have a DllMain in them at all, so they do not receive the DLL_THREAD_ATTACH and DLL_THREAD_DETACH notifications, which improves performance.
- Besides, unmanaged DLLs can opt out of these notifications by calling the Win32 DisableThreadLibraryCalls function.
5. What Are Kernel Objects
- The Kernel needs to maintain a lot of data about numerous resources such as processes, threads, and files. For that it uses “Kernel Data Structures”, which are known as Kernel objects.
- Each Kernel object is merely a memory block allocated by the Kernel and is accessible only to the Kernel.
- The Kernel creates and manipulates several types of kernel objects, such as process objects, thread objects, event objects, file objects, file-mapping objects, I/O completion port objects, job objects, mutex objects, pipe objects, semaphore objects, etc.
Before digging deeper into overheads, let's first understand what forms a thread.
Every thread has one of each of the following components:
- Thread Object - Stores the thread context, including the values of the CPU registers
- User-mode Stack - Stores local variables, arguments passed to methods, and the address to return to when the current method finishes (1 MB)
- Kernel-mode Stack - Used when application code passes arguments to a kernel-mode function in the operating system (12 KB on 32-bit Windows, 24 KB on 64-bit Windows)
- Thread Environment Block - contains information about exception handling and thread local storage
ThreadLocalStorage: This field contains the thread-specific data.
ExceptionList: This field contains the Exception Handlers List used by SEH (Microsoft Structured Exception Handling)
ExceptionCode: This field contains the last exception code generated by the Thread.
LastErrorValue: This field contains the last DLL Error Value for the Thread.
CountOwnedCriticalSections: This field counts the number of Critical Sections (a Synchronization mechanism) that the Thread owns.
IsImpersonating: This field is a flag on whether the Thread is doing any impersonation.
ImpersonationLocale: This field contains the locale ID that the Thread is impersonating.
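To get a feel for the space overhead, here is a small back-of-the-envelope calculation in Python. It uses only the stack sizes quoted above (1 MB user-mode stack, 12 KB or 24 KB kernel-mode stack) and ignores the thread kernel object and the Thread Environment Block, so it is a rough lower bound rather than a measurement. The helper name `stack_overhead` is made up for this sketch.

```python
KB = 1024
MB = 1024 * KB

USER_STACK = 1 * MB                          # user-mode stack per thread
KERNEL_STACK = {32: 12 * KB, 64: 24 * KB}    # kernel-mode stack by Windows bitness

def stack_overhead(threads: int, bits: int = 32) -> int:
    """Bytes of stack space for `threads` threads (stacks only)."""
    return threads * (USER_STACK + KERNEL_STACK[bits])

for n in (1, 100, 1000):
    print(f"{n:>5} threads: {stack_overhead(n) / MB:.2f} MB (32-bit), "
          f"{stack_overhead(n, 64) / MB:.2f} MB (64-bit)")
```

Even under these simplified assumptions, a thousand threads account for roughly a gigabyte of stack space.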
Context switching is essential for a robust and responsive OS, but it also brings a performance hit. The previous thread's code and data reside in the CPU caches, so after a switch the new thread must fetch its own code and data from RAM, which adds latency.
- A computer with one CPU can do only one thing at a time. Therefore, Windows has to share the actual CPU hardware among all the threads (logical CPUs) that are sitting around in the system.
- At any given moment in time, Windows assigns one thread to a CPU. That thread is allowed to run for a time-slice (quantum). When the time-slice expires, Windows context switches to another thread.
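- Every context switch requires Windows to perform the following actions:
- Save the values in the CPU's registers to the currently running thread's context
- Select the next thread to execute. If that thread is owned by another process, Windows must also switch the virtual address space seen by the CPU before it starts running any code.
- Load the CPU registers with the values from the selected thread's context
Note: After the context switch, the CPU executes the selected thread until its time-slice expires, and then another context switch occurs. Windows performs a context switch approximately every 30 ms, and each one is pure overhead: it costs memory and performance without doing any application work.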
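The cycle of save, select, and load can be sketched as a toy round-robin scheduler. This is a simplified Python model, not Windows' actual scheduler: the `Thread` class, the single `pc` register, and the `quantum` of 3 steps are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class Thread:
    name: str
    process: str
    work_left: int
    context: dict = field(default_factory=lambda: {"pc": 0})  # saved registers

def run(threads, quantum=3):
    cpu = {"pc": 0}                # the one real set of registers
    address_space = None           # process whose memory the CPU currently sees
    ready = deque(threads)
    log = []
    while ready:
        t = ready.popleft()                    # select the next thread
        if t.process != address_space:         # owned by another process?
            address_space = t.process          # switch virtual address space
            log.append(f"switch address space -> {t.process}")
        cpu.update(t.context)                  # load registers from its context
        steps = min(quantum, t.work_left)      # run until the time-slice expires
        cpu["pc"] += steps
        t.work_left -= steps
        t.context = dict(cpu)                  # save registers back to its context
        log.append(f"{t.name} ran {steps}, pc={cpu['pc']}")
        if t.work_left:
            ready.append(t)                    # not finished: back of the queue
    return log

log = run([Thread("A", "P1", 4), Thread("B", "P2", 2)])
print("\n".join(log))
```

Thread A resumes exactly where it left off because its `pc` value was saved before B ran and restored afterwards.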
Things to consider before using Threads
- Code considerations
- Design code to avoid Context Switching
- A thread may finish, or end its time-slice early
- Garbage Collection of Threads
- CLR suspends all threads
- Walk their stacks to find roots
- Mark objects in heap
- Walk the stacks again to update roots to objects that were moved
- Slow Debugging Experience with Threads
- Considerations on number of threads
- Best performance when the number of threads equals the number of CPUs
- More threads means more context switching
- Windows prefers reliability & responsiveness over speed & performance.
- Multiple CPUs can actually run multiple threads by assigning one thread to each CPU.
- Task Manager:
- Analyse CPU in Task Manager
- What is the number of threads per process?
- Notice the number of threads
- What is the number of threads for Visual Studio?
- Open a new Notepad window and observe its threads
- What is the change in the thread count when the Open dialog box is used in Notepad?
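Two of the points above, matching the thread count to the CPU count and watching the thread count change, can be tried programmatically. The Python sketch below is only an illustration: `square` is a placeholder workload, and `threading.active_count()` counts threads created through Python's `threading` module, so the numbers will not match the OS-level count that Task Manager shows.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor

def square(n: int) -> int:
    """Placeholder workload."""
    return n * n

cpu_count = os.cpu_count() or 1          # fall back to 1 if it cannot be determined
before = threading.active_count()

# Cap the worker threads at the number of CPUs.
with ThreadPoolExecutor(max_workers=cpu_count) as pool:
    results = list(pool.map(square, range(8)))
    during = threading.active_count()    # worker threads are alive here

after = threading.active_count()         # workers are joined when the block exits
print(f"CPUs={cpu_count} threads before={before} during={during} after={after}")
```

The count rises while the pool is active and returns to its starting value afterwards, much like the thread count of Notepad while its Open dialog box is in use.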