Q&A: How do you Select Thread Priorities to Improve Performance?

7 June 2021 by Phillip Johnston • Last updated 14 October 2021

Intended Audience: People who are interested in adjusting thread priorities to improve their system’s performance and/or responsiveness.

We’ve fielded multiple questions recently that follow this general pattern:

The number of threads in my system is growing, and finding the best priority settings to optimize performance using trial and error is too cumbersome. How should I go about properly selecting thread priorities?

Thread prioritization is always a prickly discussion point, because I actually think that changing thread priorities is rarely the right answer for improving a system’s overall performance.

What priority actually lets us control is the determinism of a task’s response latency. When we change a task’s priority, we change its worst-case response latency: raising its priority relative to other threads makes that latency more predictable (deterministic).

Multithreaded systems are extremely complex, and they don’t often behave as we might expect just from trying to reason about the code and our priority selections. We might draw out timing and sequence diagrams that show how threads interact, but these only document what we think the program should be doing, not what it’s actually doing in reality.

Although threads seem to be a small step from sequential computation, in fact, they represent a huge step. They discard the most essential and appealing properties of sequential computation: understandability, predictability, and determinism. […] Sutter and Larus observe “humans are quickly overwhelmed by concurrency and find it much more difficult to reason about concurrent than sequential code. Even careful people miss possible interleavings among even simple collections of partially ordered operations.”
(From The Problem with Threads.)

When we change priority, we can impact our system in unexpected ways. A high-priority task can easily “starve” a lower-priority task, preventing it from running because the higher-priority task is repeatedly ready to run first. We can also significantly increase the total number of context switches that occur, reducing our effective processing throughput. Reducing the response latency for one task often comes at the cost of overall processing throughput for the other tasks in our system. This reduction in throughput may have more of an impact on your program’s overall performance than you might expect.

Alternatives to Thread Priority Changes

Our rule of thumb is that you should only change priorities when you have specific latency/determinism requirements for a given thread. Sometimes this is necessary. However, I’d also like for you to consider a few other details first.

Profile Your System

A major question comes to mind whenever I’m asked about selecting thread priorities: are you sure that priorities are really the biggest performance factor for your system? If the answer is “yes”, I usually follow this up with: how do you know? Often, there isn’t a good answer, just a general feeling that priorities are a problem.

In practice, we must follow one of the indispensable rules of debugging: quit thinking and look.

Note: Sometimes the answer is “I have hard real-time requirements”, in which case jump to the RMS section.

We need to actually observe the system while running to figure out where the bottlenecks are. Your performance problems might be caused by a single long-running high-priority task with a logic error, a resource deadlock, or an interrupt storm. We might even find that having threads with different priorities is actually hurting performance due to the sheer number of context switches that are occurring.

If you want to profile your system, Segger’s Ozone is a great RTOS-aware tool that can help you see how your threads are actually behaving. When you look at what’s actually going on in your system, you might find that the problem is unrelated to thread priorities altogether. You may also find that you’re taking a significant performance hit due to frequent context switches caused by preemption.

If you can’t use a tool like Ozone, you could instrument statistical performance analysis in your program, allowing your system to run for a set amount of time and then displaying metrics regarding how often each thread runs, how long threads run for, the number of context switches that occurred, etc. No matter how you tackle it, the important thing is that guessing is insufficient: we need to use data to identify where a performance bottleneck is actually occurring.
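For example, FreeRTOS exposes uxTaskGetSystemState() for exactly this kind of sampling. Below is a minimal sketch of a periodic statistics thread built on it; the thread name, MAX_TASKS limit, and 10-second interval are illustrative assumptions, and the run-time counters require configUSE_TRACE_FACILITY and configGENERATE_RUN_TIME_STATS to be enabled in FreeRTOSConfig.h:

```c
/* A sketch of periodic run-time statistics sampling with FreeRTOS.
 * Requires configUSE_TRACE_FACILITY and configGENERATE_RUN_TIME_STATS
 * in FreeRTOSConfig.h; MAX_TASKS and the 10 s interval are arbitrary. */
#include <stdio.h>
#include "FreeRTOS.h"
#include "task.h"

#define MAX_TASKS 16

static void stats_thread(void *params)
{
    (void)params;
    static TaskStatus_t status[MAX_TASKS];

    for(;;)
    {
        uint32_t total_runtime = 0;
        UBaseType_t count = uxTaskGetSystemState(status, MAX_TASKS,
                                                 &total_runtime);

        total_runtime /= 100U; /* For percentage calculations below. */

        printf("%-16s %5s %8s\n", "Task", "Prio", "CPU%");
        for(UBaseType_t i = 0; i < count; i++)
        {
            /* ulRunTimeCounter accumulates the time this task has
             * spent in the Running state. */
            uint32_t pct = (total_runtime > 0U)
                ? (status[i].ulRunTimeCounter / total_runtime)
                : 0U;
            printf("%-16s %5lu %7lu%%\n",
                   status[i].pcTaskName,
                   (unsigned long)status[i].uxCurrentPriority,
                   (unsigned long)pct);
        }

        vTaskDelay(pdMS_TO_TICKS(10000)); /* Sample every 10 seconds. */
    }
}
```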

Try to Keep Threads At the Same Priority Level

After you have profiling set up, you might run an experiment: set as many threads as possible to the same priority level. For threads which have explicit latency requirements (e.g., a button handler or another type of user input), adjust the priority to be higher than the baseline priority level, but keep other threads the same. Remember, priorities control determinism of worst-case latency of response, so ideally you only need to adjust the priority when you need to adjust that factor!
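As a sketch of what that experiment might look like with FreeRTOS (the task functions, names, and stack sizes below are placeholders, not code from a real system):

```c
/* A sketch of the "common baseline" experiment with FreeRTOS. The task
 * functions, names, and stack sizes are placeholders for illustration. */
#include "FreeRTOS.h"
#include "task.h"

/* Implemented elsewhere in this hypothetical system. */
void sensor_task(void *params);
void logging_task(void *params);
void comms_task(void *params);
void button_task(void *params);

#define BASELINE_PRIORITY         (tskIDLE_PRIORITY + 1)
#define LATENCY_CRITICAL_PRIORITY (BASELINE_PRIORITY + 1)

void create_system_tasks(void)
{
    /* Most threads share the baseline, so they never preempt each other. */
    xTaskCreate(sensor_task,  "sensors", 256, NULL, BASELINE_PRIORITY, NULL);
    xTaskCreate(logging_task, "logging", 256, NULL, BASELINE_PRIORITY, NULL);
    xTaskCreate(comms_task,   "comms",   256, NULL, BASELINE_PRIORITY, NULL);

    /* Only the button handler has an explicit latency requirement,
     * so only it runs above the baseline. */
    xTaskCreate(button_task, "button", 256, NULL,
                LATENCY_CRITICAL_PRIORITY, NULL);
}
```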

When your tasks run at the same priority, you avoid the overhead of context switches, which can be quite expensive depending on how many occur while your system is running. Increasing overall program throughput (by letting operations complete without interruption) may also improve the perceived responsiveness of your program.

Having threads at the same priority doesn’t necessarily mean that they will starve each other: time slicing is also an option in most RTOSes, which ensures that threads of equal priority get a “fair share” of the processor. Even without time slicing, run-to-completion type scheduling for tasks that share a priority is a viable strategy for many systems!
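In FreeRTOS, for instance, both behaviors come down to two options in FreeRTOSConfig.h:

```c
/* FreeRTOSConfig.h excerpt: with both options enabled, tasks that share
 * a priority level are round-robin time-sliced on each tick interrupt.
 * Set configUSE_TIME_SLICING to 0 to let an equal-priority task keep the
 * processor until it blocks or yields (run-to-completion style). */
#define configUSE_PREEMPTION    1
#define configUSE_TIME_SLICING  1
```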

Incorporate Dispatch Queues and Active Objects

Because threads can be so hard to reason about, we do well to reduce the threading surface in our program by making use of constructs like active objects and dispatch queues to reduce the number of raw threads in our system.

Dispatch queues are especially helpful for eliminating a number of “small” threads that have a singular purpose, such as threads that respond to a button interrupt or a thread that periodically reads a gas gauge to update a battery charge percentage in a UI. Instead of having dedicated threads for these types of operations, an interrupt, timer event, or another thread can instead “dispatch” a function onto the general queue. Instead of managing priorities for each of these threads, we have simplified things so that we only need to manage a single priority for the dispatch queue itself. Of course, dispatch queues can also be assigned a higher priority than the standard operating level for low-latency operations (such as responding to a button press), and we often do this for interrupt-related events in our systems.
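To make the pattern concrete, here is a heavily simplified dispatch queue sketch built on a FreeRTOS queue. This is not the implementation from our dispatch queue articles; the names (dispatch_job_t, dispatch_async(), and so on) are invented for illustration:

```c
/* A minimal dispatch queue sketch on top of a FreeRTOS queue. All names
 * are illustrative, not a real API. */
#include "FreeRTOS.h"
#include "queue.h"
#include "task.h"

typedef struct
{
    void (*work)(void *); /* Function to run on the dispatch thread. */
    void *context;        /* Argument handed to the function. */
} dispatch_job_t;

static QueueHandle_t dispatch_queue;

/* Called from other threads (or via xQueueSendFromISR from an interrupt)
 * to hand work to the single dispatch thread. */
BaseType_t dispatch_async(void (*work)(void *), void *context)
{
    dispatch_job_t job = {work, context};
    return xQueueSend(dispatch_queue, &job, 0);
}

/* The one thread that replaces many small single-purpose threads;
 * its priority is the only one we need to manage. */
static void dispatch_thread(void *params)
{
    (void)params;
    dispatch_job_t job;

    for(;;)
    {
        if(xQueueReceive(dispatch_queue, &job, portMAX_DELAY) == pdTRUE)
        {
            job.work(job.context);
        }
    }
}

void dispatch_init(UBaseType_t priority)
{
    dispatch_queue = xQueueCreate(16, sizeof(dispatch_job_t));
    xTaskCreate(dispatch_thread, "dispatch", 512, NULL, priority, NULL);
}
```

With this in place, a timer callback or another thread calls dispatch_async(read_gas_gauge, NULL) instead of owning a dedicated thread, and the only priority we manage is the one passed to dispatch_init().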

An active object (AO) is essentially an object with its own thread and a queue of operations. AOs don’t eliminate threads, but they do help us “hide” the threading details and write our code in such a way that we aren’t worrying about handling manual details such as mutual exclusion. Whenever we call an API for an active object, we actually enqueue a message or event onto the internal queue, and the internal thread will pop from the queue and handle the events in the order they come in. Because we are no longer sharing memory between tasks but instead operating via message passing, we significantly reduce the occurrence of deadlocks due to poor handling of threading constructs (e.g., mutexes).
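A minimal active object sketch in the same spirit might look like the following; the battery-gauge theme and every name here are illustrative assumptions:

```c
/* A minimal active object sketch with FreeRTOS. The public API only posts
 * events; a private thread owns all state, so callers never need a mutex. */
#include "FreeRTOS.h"
#include "queue.h"
#include "task.h"

typedef enum { EVENT_GAUGE_SAMPLE, EVENT_REPORT_CHARGE } event_type_t;

typedef struct
{
    event_type_t type;
    int payload;
} event_t;

static QueueHandle_t battery_queue; /* The AO's private event queue. */

/* Public API: enqueue a message rather than touching shared state. */
void battery_post(event_type_t type, int payload)
{
    event_t evt = {type, payload};
    xQueueSend(battery_queue, &evt, portMAX_DELAY);
}

/* Private thread: handles events strictly in arrival order, so the
 * charge_percent state needs no locking at all. */
static void battery_thread(void *params)
{
    (void)params;
    int charge_percent = 0; /* State owned exclusively by this thread. */
    event_t evt;

    for(;;)
    {
        if(xQueueReceive(battery_queue, &evt, portMAX_DELAY) == pdTRUE)
        {
            switch(evt.type)
            {
                case EVENT_GAUGE_SAMPLE:
                    charge_percent = evt.payload;
                    break;
                case EVENT_REPORT_CHARGE:
                    /* e.g., forward charge_percent to the UI. */
                    break;
            }
        }
    }
}

void battery_ao_init(UBaseType_t priority)
{
    battery_queue = xQueueCreate(8, sizeof(event_t));
    xTaskCreate(battery_thread, "battery", 512, NULL, priority, NULL);
}
```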

Note: Our dispatch queue articles are provided in the Further Reading section below, and active object references can be found in the corresponding glossary entry.

Use a Scheduling Methodology to Select Priorities

Of course, you may find that you do indeed need to explicitly set your task priorities in order to meet your latency or scheduling requirements. For this, a method such as Rate Monotonic Analysis (RMA), Deadline Monotonic Analysis (DMA), or Earliest Deadline First (EDF) is the right tool for the job. These are mathematical approaches for guaranteeing that our tasks will meet their stated deadlines. If you want to learn more about rate monotonic analysis, I recommend starting with Phil Koopman’s overview and the Wikipedia article.
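To give a flavor of RMA, the classic Liu & Layland test says a set of n periodic tasks is schedulable under rate monotonic priorities (shorter period = higher priority) if the total utilization U = Σ(C_i / T_i) stays at or below n(2^(1/n) − 1). The test is sufficient but not necessary: exceeding the bound doesn’t prove the set unschedulable. Here is a small sketch with made-up task parameters:

```c
/* A sketch of the Liu & Layland rate monotonic schedulability test:
 * schedulable if U = sum(C_i / T_i) <= n * (2^(1/n) - 1).
 * The task parameters below are invented for illustration. */
#include <math.h>
#include <stdbool.h>
#include <stdio.h>

typedef struct
{
    double execution_time; /* C_i: worst-case execution time */
    double period;         /* T_i: period (assumed equal to the deadline) */
} task_t;

bool rms_schedulable(const task_t *tasks, int n)
{
    double utilization = 0.0;
    for(int i = 0; i < n; i++)
    {
        utilization += tasks[i].execution_time / tasks[i].period;
    }

    double bound = n * (pow(2.0, 1.0 / n) - 1.0);
    printf("U = %.3f, bound = %.3f\n", utilization, bound);
    return utilization <= bound;
}

int main(void)
{
    /* Shorter period => higher priority under RMS. */
    task_t tasks[] = {
        {1.0, 4.0},  /* 25% utilization */
        {2.0, 8.0},  /* 25% utilization */
        {3.0, 16.0}, /* ~19% utilization */
    };
    int n = (int)(sizeof(tasks) / sizeof(tasks[0]));

    /* U = 0.6875 <= 3 * (2^(1/3) - 1) ~= 0.7798, so this set passes. */
    printf("%s\n", rms_schedulable(tasks, n)
                       ? "schedulable"
                       : "not proven schedulable");
    return 0;
}
```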

Conclusion

In practice, I find that changing thread priorities is generally not a helpful practice unless you know it is the right tool for solving a particular latency or deadline problem that you have observed in your system. We are often better off keeping most threads at a common priority level. We also improve our ability to reason about our system by reducing the number of threads (e.g., using dispatch queues) and avoiding direct use of threading primitives (e.g., using active objects). If we truly need to set priorities for our threads, guesswork is insufficient: we need to make measurements of our system and use a scheduling methodology such as Rate Monotonic Analysis.

Further Reading:
