Never Feel Insecure About Thread Safety Again

I used to feel very insecure when it came to the topic of thread safety. A thread, by itself, is already an intimidating concept. (This article hopefully can help make the concept of threads more approachable.) Adding the word “safety” to “thread” does not help. It sounds like an even more daunting topic and leaves one feeling a bit, well, unsafe. As if talking about thread safety were not enough, people begin to throw around words like “race condition”, “critical section”, “atomicity”, “immutability”, “mutual exclusion”, “locks”, “lock-free data structures”, and so on, in the context of thread safety.

Anyone in search of an understanding of thread safety deserves better explanations. Thus, I am dedicating this writing to those who have striven to understand, yet may still be confused about, a topic that once triggered much anxiety in me: thread safety.

Although the code in this writing is in Java, the concepts apply to other programming languages as well, and anyone who has some experience in object-oriented programming should be able to understand most of it.

An understanding of thread safety begins with a glimpse of how unsafe threads can be.

Imagine two threads that both have access to an integer counter. One thread increments the counter in a loop, while the other thread decrements the counter in a loop. Both threads operate on the counter the same number of times. If the counter starts at zero, one would expect that after both threads are done, the counter would be zero again.
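A minimal sketch of this scenario might look like the following. The class name UnsafeCounter, the run() helper and the loop count of 100,000 are my own assumptions for illustration.

```java
class UnsafeCounter {
    static final int ITERATIONS = 100_000; // assumed loop count
    static int counter = 0;                // shared and unprotected

    // Runs one incrementing and one decrementing thread, returns the final value.
    static int run() {
        counter = 0;
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < ITERATIONS; i++) counter++;
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < ITERATIONS; i++) counter--;
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter;
    }

    public static void main(String[] args) {
        System.out.println(run()); // may or may not print 0
    }
}
```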

In the unpredictable world of concurrent programming, the counter may or may not end up at zero. Code like this is said to be thread unsafe precisely because its result is not deterministic. There would be no concern about thread safety if the result were always zero.

The culprit that makes the code thread unsafe lies in two seemingly innocent statements: counter++ and counter--. Do not be deceived by their simple appearance. When these two statements are compiled into Java bytecode (or machine code for natively compiled languages such as C++), each of them becomes multiple instructions. Here is what counter++ looks like when compiled into Java bytecode:
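As a rough sketch, assuming counter is a static int field, the instruction sequence can be annotated next to the statement itself. The class name IncrementBytecode is hypothetical, and real javap -c output shows numeric constant-pool references (such as #7) instead of the field name.

```java
class IncrementBytecode {
    static int counter = 0;

    static void increment() {
        counter++;
        // After compiling, "javap -c IncrementBytecode" shows roughly:
        //   getstatic counter   // push the current value of the field
        //   iconst_1            // push the constant 1
        //   iadd                // pop both values, push their sum
        //   putstatic counter   // pop the sum and store it in the field
    }

    public static void main(String[] args) {
        increment();
        System.out.println(counter); // prints 1
    }
}
```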

counter-- looks similar except that iadd is changed to isub. When the Java Virtual Machine (JVM) and the underlying processor execute these instructions, instructions from different threads can be interleaved, because the operating system tries to execute instruction sequences for different threads concurrently. Here is a possible interleaving in the case of the counter example:
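One such interleaving can be replayed deterministically in a single thread, with two local variables standing in for the two threads' operand stacks. This is only an illustrative sketch: the names InterleavingReplay, t1Stack and t2Stack are made up, and no real threads are involved.

```java
class InterleavingReplay {
    static int counter = 0;

    // Replays one unlucky schedule step by step and returns the final value.
    static int replay() {
        counter = 0;
        int t1Stack = counter;  // t1: getstatic        (loads 0)
        t1Stack = t1Stack + 1;  // t1: iconst_1, iadd   (1 on t1's stack)
        int t2Stack = counter;  // t2: getstatic        (still loads 0)
        t2Stack = t2Stack - 1;  // t2: iconst_1, isub   (-1 on t2's stack)
        counter = t1Stack;      // t1: putstatic        (counter becomes 1)
        counter = t2Stack;      // t2: putstatic        (counter becomes -1)
        return counter;
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints -1
    }
}
```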

Suppose the value of counter starts at 0. After thread t1 loads the value of counter and adds 1 to it, just before thread t1 is about to set the value of counter to 1, thread t2 gets its turn to run. It loads the value of counter (which is still 0 since t1 has not changed it yet), calculates counter-1 and pushes the result -1 onto its own stack (note that each thread has its own stack). At exactly this moment, the scheduler decides that it is now thread t1’s turn to run, and t1 executes its putstatic instruction, which sets the value of counter to 1. After that, it is thread t2’s turn, and its putstatic instruction sets the value of counter to -1. This is the defining moment of violating thread safety. After thread t1 has incremented the counter and thread t2 has decremented it, the counter should still be 0! In this case, however, the value of counter mysteriously becomes -1!

Whenever there are multiple threads, the exact order of thread execution is entirely up to the scheduler, and programmers have little control over it. This means that there is no way to predict how the machine instructions will be interleaved. This condition, in which the program output or end state depends on an uncontrollable order of execution, is called a race condition.

The presence of concurrency alone does not necessarily cause race conditions. Independent threads, each minding its own business and sharing no data, can peacefully coexist without raising concerns about thread safety. Problems start to arise when they start to share data. In real life, sharing is often associated with positive feelings. Not in the world of concurrent programming! One of the root causes of violating thread safety in the counter example is that both thread t1 and thread t2 manipulate a shared variable, counter. In fact, if both threads only read counter, there will not be any problem at all. But if one or both threads start to write to the shared variable, race conditions start to appear.

Why is sharing bad, then, when a thread is only trying to make changes to common resources? Well, the truth is that a thread can make changes (write) to common data, as long as it does so in an atomic way, isolated from other threads. A section of code that can cause race conditions and must be executed in a mutually exclusive way is called a critical section. In the counter example, counter++ and counter-- are both critical sections that must be executed atomically in order to ensure thread safety.

Every approach that can achieve thread safety directly or indirectly targets the problem of sharing and strives for atomicity in critical sections. The solutions for thread safety can be roughly divided into the following four categories.

1. No Sharing

To prevent sharing, each thread can make sure that all the data it changes is local to that thread. Since each thread has its own stack, other threads cannot access its local data. In the spirit of no sharing, here is how to make the two threads in the earlier example do their independent counting and aggregate their results when they are done:
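The following is a minimal sketch of that idea, assuming each thread counts in its own local variable and hands its result back through a FutureTask. The class name ConfinedCounters and the loop count are assumptions for illustration.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

class ConfinedCounters {
    static final int ITERATIONS = 100_000; // assumed loop count

    static int run() {
        // Each task counts in a variable that lives only on its own thread's stack.
        FutureTask<Integer> up = new FutureTask<>(() -> {
            int local = 0;
            for (int i = 0; i < ITERATIONS; i++) local++;
            return local;
        });
        FutureTask<Integer> down = new FutureTask<>(() -> {
            int local = 0;
            for (int i = 0; i < ITERATIONS; i++) local--;
            return local;
        });
        new Thread(up).start();
        new Thread(down).start();
        try {
            // get() blocks until the corresponding thread has finished.
            return up.get() + down.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 0
    }
}
```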

If you feel perplexed by the use of FutureTask, here it is in a nutshell: a FutureTask object can be passed as an argument to the Thread constructor, so that the result of the thread run can be obtained by calling the get() method of FutureTask. Here both threads operate on their own local counter variable, making their counting operations isolated from each other. Thus, thread safety is achieved through localization and confinement.

2. Mutual Exclusion

If sharing is unavoidable, then the shared resources have to be accessed in a mutually exclusive way to ensure thread safety. Mutual exclusion is usually achieved by using some type of lock. Before entering the critical section, a lock needs to be acquired (locked). Upon exiting the critical section, the same lock needs to be released (unlocked). At any point in time, only one thread can hold the lock and thus be inside the critical section, accessing the shared data.
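To make the acquire and release steps visible, here is a small sketch that uses an explicit lock object, java.util.concurrent.locks.ReentrantLock. This class is my choice for illustration and is not what the rest of this article uses. The names ExplicitLockCounter and add are hypothetical.

```java
import java.util.concurrent.locks.ReentrantLock;

class ExplicitLockCounter {
    static final int ITERATIONS = 100_000; // assumed loop count
    static final ReentrantLock lock = new ReentrantLock();
    static int counter = 0;

    static void add(int delta) {
        lock.lock();           // acquire before entering the critical section
        try {
            counter += delta;  // critical section
        } finally {
            lock.unlock();     // always release, even if an exception is thrown
        }
    }

    static int run() {
        counter = 0;
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < ITERATIONS; i++) add(1);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < ITERATIONS; i++) add(-1);
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 0
    }
}
```

Releasing the lock in a finally block is the usual idiom, so that an exception inside the critical section cannot leave the lock held forever.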

In Java, the most common type of lock is the intrinsic lock on an object, also known as the monitor lock. Locking and unlocking are achieved implicitly by wrapping the critical section inside a synchronized block. The monitor lock is acquired before entering the synchronized block and released upon exiting it. If another thread tries to enter a synchronized block guarded by the same monitor lock, it has to wait until the lock is released and then reenter the competition for acquiring the lock. Here is how to use synchronized blocks to ensure thread safety in the earlier example of counter:
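A minimal sketch, assuming a dedicated lock object named obj whose monitor guards every access to counter. The class name SynchronizedCounter and the loop count are assumptions.

```java
class SynchronizedCounter {
    static final int ITERATIONS = 100_000;   // assumed loop count
    static final Object obj = new Object();  // its monitor lock guards counter
    static int counter = 0;

    static int run() {
        counter = 0;
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < ITERATIONS; i++) {
                synchronized (obj) { counter++; }
            }
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < ITERATIONS; i++) {
                synchronized (obj) { counter--; }
            }
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 0
    }
}
```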

Note how counter++ and counter-- are protected by the same monitor lock, the one associated with the obj object.

3. Atomic Variables

Atomic variables are often said to be a lock-free way to achieve thread safety. Operations that access and update atomic variables are guaranteed to be atomic because they rely on special compare-and-swap (CAS) machine instructions, exposed in Java as compareAndSet. Atomic variables are often more performant than mutual exclusion locks because a thread that loses a race simply retries instead of being blocked. When using locks under contention, threads may have to be switched between the running state and the blocked state, which involves suspending a thread and saving its status before resuming another thread. In a nutshell, context switching is expensive! Thus, atomic variables are often harnessed in performance-sensitive multi-threaded applications.
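To give a feel for how CAS is typically used, here is a sketch of a retry loop written by hand on top of AtomicInteger.compareAndSet. The helper addWithCas and the class name CasLoopSketch are hypothetical, and library methods such as incrementAndGet already do the equivalent for you.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasLoopSketch {
    static final int ITERATIONS = 100_000; // assumed loop count

    // Hypothetical helper: atomically adds delta using an explicit CAS retry loop.
    static int addWithCas(AtomicInteger value, int delta) {
        while (true) {
            int current = value.get();
            int next = current + delta;
            // Succeeds only if no other thread changed the value in between.
            if (value.compareAndSet(current, next)) {
                return next;
            }
            // Otherwise another thread won the race: reload and try again.
        }
    }

    static int run() {
        AtomicInteger counter = new AtomicInteger(0);
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < ITERATIONS; i++) addWithCas(counter, 1);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < ITERATIONS; i++) addWithCas(counter, -1);
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 0
    }
}
```

Note that no thread ever blocks here. A thread that loses the race just loops once more with the fresh value.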

In Java, the java.util.concurrent.atomic package provides a set of atomic variable types. Other programming languages such as C++ and Go also provide similar constructs. Here is how to use AtomicInteger in the earlier example of counter:
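A minimal sketch, assuming the counter is created with the no-argument AtomicInteger constructor. The class name AtomicCounter and the loop count are assumptions.

```java
import java.util.concurrent.atomic.AtomicInteger;

class AtomicCounter {
    static final int ITERATIONS = 100_000; // assumed loop count
    static final AtomicInteger counter = new AtomicInteger(); // initial value 0

    static int run() {
        counter.set(0);
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < ITERATIONS; i++) counter.incrementAndGet();
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < ITERATIONS; i++) counter.decrementAndGet();
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 0
    }
}
```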

Note that the variable counter is initialized to 0. The methods incrementAndGet and decrementAndGet are both atomic operations: each one executes in its entirety as a single indivisible step.

4. Immutability

An object being immutable means that it cannot be changed after it is constructed. Immutable objects are thread safe because no thread can change their internal state after construction. All threads can only perform read-only operations on immutable objects.

In Java, some library classes such as String and the primitive wrapper classes (Integer, Long, Float, Boolean, etc.) are immutable. However, there is one big caveat when relying on objects of these immutable classes as a guarantee of thread safety. For example, here is the wrong way to achieve thread safety:
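A sketch of this mistake, assuming the only change from the original unsafe example is that the counter is declared as an Integer. The class name BoxedCounterWrong is hypothetical.

```java
class BoxedCounterWrong {
    static final int ITERATIONS = 100_000; // assumed loop count
    static Integer counter = 0;            // boxed type, but the reference is still mutable

    static int run() {
        counter = 0;
        Thread t1 = new Thread(() -> {
            // counter++ unboxes, adds 1, boxes, then reassigns the shared reference.
            for (int i = 0; i < ITERATIONS; i++) counter++;
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < ITERATIONS; i++) counter--;
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter;
    }

    public static void main(String[] args) {
        System.out.println(run()); // still may or may not print 0
    }
}
```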

As you can see, I have changed the first line, making counter a boxed type. Since Integer is an immutable type, it is easy to conclude that counter is immutable too, and thus to erroneously think that changing counter from the int type to the Integer type would make the code thread safe. This incorrect thinking stems from a failure to realize that an Integer object is immutable, but a reference to it is not! counter is a reference to an Integer object. To make the reference immutable, we have to declare it as final. If counter is final, counter++ and counter-- are no longer allowed. So we assign counter to local variables of the threads and use FutureTask to obtain the counting result from each thread:
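A minimal sketch of that approach, assuming each thread copies the final counter into a local variable and returns its own result. The class name FinalBoxedCounter and the loop count are assumptions.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

class FinalBoxedCounter {
    static final int ITERATIONS = 100_000; // assumed loop count
    static final Integer counter = 0;      // immutable object behind a final reference

    static int run() {
        FutureTask<Integer> up = new FutureTask<>(() -> {
            int local = counter; // read-only access to the shared value
            for (int i = 0; i < ITERATIONS; i++) local++;
            return local;
        });
        FutureTask<Integer> down = new FutureTask<>(() -> {
            int local = counter;
            for (int i = 0; i < ITERATIONS; i++) local--;
            return local;
        });
        new Thread(up).start();
        new Thread(down).start();
        try {
            // Aggregate the two independent results once both threads are done.
            return up.get() + down.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 0
    }
}
```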

As long as references to immutable objects are made final, one can rest assured that there will be no thread safety issues on these immutable objects.

If you have read this far, I hope you have a clearer picture of thread safety now. Thread safety is crucial in concurrent programming, and thread safety errors are hard to reproduce and debug. Fortunately, a variety of tools and data structures exist to help us ensure thread safety, so we can enjoy the benefits of better performance and resource utilization provided by concurrent programs.
