What Is Memory Churn and How to Avoid It in Java

How do I become proficient with functional programming in Java

Introduction

Modern hardware is very good at predicting the next instructions to be executed. This kind of prediction allows CPUs to do more work in less time. One of the strategies that hardware relies on is data locality. This means that when CPU requests data from main memory, it not only retrieves the requested data, but the hardware also retrieves the data that is stored in proximity to the requested data.

On the software side, the Java compiler is heavily optimized to assist the hardware in these optimizations. Despite these optimizations, there are cases where none of them can help us. If we as programmers don't address this issue, it can lead to degraded application performance.

Before I continue, let me tell you this:

Don't optimize your application prematurely!

If you don't have performance issues with your application, you don't need to optimize. However, it pays to know what can affect the performance of your application and how you can be more careful and write better performing code.

What Is Memory Churn?

Memory churn refers to the continuous and repetitive process of creating, deleting, and reallocating memory in a computer system. It is the rate at which memory is allocated and deallocated over a certain period of time, usually measured in cycles or seconds.

Memory churn can have an impact on the overall performance of a computer system. When memory is constantly allocated and deallocated, it can lead to fragmentation of the memory space, which can result in slower performance and increased resource usage.

One common example of memory churn is in software applications that allocate and deallocate memory frequently, such as video games or web browsers. These applications can cause significant memory churn, leading to slower performance and increased resource usage over time. Overall, minimizing memory churn is important for maintaining optimal performance and resource usage in computer systems. This can be achieved through various strategies, such as efficient memory management and reducing unnecessary memory allocations and deallocations.

The JVM runs garbage collection periodically, either when it can, because the program threads are waiting for some external event, or when it needs to, because it's run out of memory for creating new objects. Despite the automatic nature of the process, it's important to understand that it's going on, because it can be a significant part of the overhead of Java programs.

Example of Memory Churn

One common example of memory churn that can occur in applications is when data is stored in a hash map with a composite key consisting of two combined strings. Let me show you an example:

Let's say that we want to fetch a person data from in-memory Person repository. Contents of the PersonRepository and the Person class are defined as following:

 1@Value
 2public class Person {
 3    String firstName;
 4    String lastName;
 5    int age;
 6}
 7
 8public class PersonRepository {
 9
10    Map<String, Person> personNameMap = new HashMap<>();
11
12    public Person findPersonByFullName(String firstName, String lastName) {
13        return personNameMap.get(firstName + lastName);
14    }
15    /*...*/
16}

Note: In the examples in this article, I used Lombok annotations to reduce the need for writing boilerplate code. You can find more details about it on the following link.

On first look, this code looks good to most of the developers. But, there is one performance issue that can arise if there is a massive load on the findPersonByFullName method. To understand the problem, let's check how the + operator for String concatenation works in Java. In Java 8, this concatenation is implemented using StringBuilder class. So the code for string concatenation would be implemented as follows:

 1public class Person {
 2    /*...*/
 3    public Person findPersonByFullName(String firstName, String lastName) {
 4        return personNameMap1.get(new StringBuilder().append(firstName).append(lastName).toString());
 5    }
 6}
 7
 8public class StringBuilder extends AbstractStringBuilder {
 9    /*...*/
10
11    /**
12     * Constructs a string builder with no characters in it and an
13     * initial capacity of 16 characters.
14     */
15    @IntrinsicCandidate
16    public StringBuilder() {
17        super(16);
18    }
19    /*...*/
20}
21
22public class AbstractStringBuilder {
23    /*...*/
24    /**
25     * Creates an AbstractStringBuilder of the specified capacity.
26     */
27    AbstractStringBuilder(int capacity) {
28        if (COMPACT_STRINGS) {
29            value = new byte[capacity];
30            coder = LATIN1;
31        } else {
32            value = StringUTF16.newBytesFor(capacity);
33            coder = UTF16;
34        }
35    }
36    /*...*/
37}
38
39public class String /*...*/ {
40    /*...*/
41    /*
42     * Package private constructor. Trailing Void argument is there for
43     * disambiguating it against other (public) constructors.
44     */
45    String(AbstractStringBuilder asb, Void sig) {
46        byte[] val = asb.getValue();
47        int length = asb.length();
48        if (asb.isLatin1()) {
49            this.coder = LATIN1;
50            this.value = Arrays.copyOfRange(val, 0, length);
51        } else {
52            // only try to compress val if some characters were deleted.
53            if (COMPACT_STRINGS && asb.maybeLatin1) {
54                byte[] buf = StringUTF16.compress(val, 0, length);
55                if (buf != null) {
56                    this.coder = LATIN1;
57                    this.value = buf;
58                    return;
59                }
60            }
61            this.coder = UTF16;
62            this.value = Arrays.copyOfRange(val, 0, length << 1);
63        }
64    }
65    /*...*/
66}

By examining the StringBuilder parameterless constructor, it becomes apparent that its character array is initialized with a default size of 16. If the concatenation's result exceeds this limit, a new array must be created and initialized. And all the data must be copied over to the new array. Finally, when the toString method is called, the array is once again copied to create the resulting string.

The + operator has much better implementation in Java 17. StringConcatFactory.makeConcatWithConstants method is used for strings concatenation. It further calls the method StringConcatHelper.simpleConcat for the special case of two strings concatenation. This method is the fastest implementation for the String concatenation that I have seen till now in Java. It is optimized to create just one character array for all the concatenations. And not only that, it uses the same character array to create a new instance of String class. This can be done only in jdk code, since the constructor for String class that accepts character array is package private. Here is the content of simpleConcat method.

 1/** JDK 18 code
 2 * 
 3 *
 4 * Perform a simple concatenation between two objects. Added for startup
 5 * performance, but also demonstrates the code that would be emitted by
 6 * {@code java.lang.invoke.StringConcatFactory$MethodHandleInlineCopyStrategy}
 7 * for two Object arguments.
 8 *
 9 * @param first         first argument
10 * @param second        second argument
11 * @return String       resulting string
12 */
13@ForceInline
14static String simpleConcat(Object first, Object second) {
15        String s1 = stringOf(first);
16        String s2 = stringOf(second);
17        if (s1.isEmpty()) {
18        // newly created string required, see JLS 15.18.1
19        return new String(s2);
20        }
21        if (s2.isEmpty()) {
22        // newly created string required, see JLS 15.18.1
23        return new String(s1);
24        }
25        // start "mixing" in length and coder or arguments, order is not
26        // important
27        long indexCoder = mix(initialCoder(), s1);
28        indexCoder = mix(indexCoder, s2);
29        byte[] buf = newArray(indexCoder);
30        // prepend each argument in reverse order, since we prepending
31        // from the end of the byte array
32        indexCoder = prepend(indexCoder, buf, s2);
33        indexCoder = prepend(indexCoder, buf, s1);
34        return newString(buf, indexCoder);
35        }

From the above description, it can be inferred that the example code generates a huge number of objects (specially in java 8 example). Garbage collector needs to clean up those objects at some point in time. The greater the number of memory allocations made by an application, the more the garbage collector is required to work in order to clean them up. Furthermore, these objects happen to be arrays. It means that objects are memory heavy. This can lead to additional strain on the garbage collector. As a result, memory churn may occur.

How Can We Fix It?

The solution would be to reduce the number of allocations. For example, we can consider creating a pool of objects that can be reused (something like thread pools). The proposed solution wouldn't work on this example, because we don't know what instances of the String object will be created.

What Can We Do Then?

If we must allocate memory, then we should check if we can perform smaller memory allocations. We can create a new class that will contain references to existing strings. The class must implement equals and hashCode methods. Requested behaviour will be the same, but the amount of allocated memory will be significantly smaller. We can call this class PersonPk and implement it as follows:

1@Value
2public class PersonPk {
3    private final String firstName;
4    private final String lastName;
5}

By creating new objects with references to existing strings, we can reduce the burden on the garbage collector. This should improve performance.

In the end, I compared the performance of both methods and recorded the following results:

 1Java 8
 2Benchmark                                 (nameLength)   Mode  Cnt     Score      Error  Units
 3MemoryChurnBench.getFromRepoNewObject                5  thrpt    5  6366.565 ±  342.554  ops/s
 4MemoryChurnBench.getFromRepoNewObject               10  thrpt    5  6365.775 ±  331.102  ops/s
 5MemoryChurnBench.getFromRepoNewObject              100  thrpt    5  6388.885 ±   44.873  ops/s
 6MemoryChurnBench.getFromRepoNewObject             1000  thrpt    5  6246.701 ±  531.240  ops/s
 7MemoryChurnBench.getFromRepoNewObject            10000  thrpt    5  6564.168 ± 1421.583  ops/s
 8MemoryChurnBench.getFromRepoStringConcat             5  thrpt    5  3151.173 ±   39.861  ops/s
 9MemoryChurnBench.getFromRepoStringConcat            10  thrpt    5  2531.838 ±   89.890  ops/s
10MemoryChurnBench.getFromRepoStringConcat           100  thrpt    5   391.318 ±    0.190  ops/s
11MemoryChurnBench.getFromRepoStringConcat          1000  thrpt    5    35.975 ±    0.039  ops/s
12MemoryChurnBench.getFromRepoStringConcat         10000  thrpt    5     3.762 ±    0.001  ops/s
13
14Java 17
15Benchmark                                 (nameLength)   Mode  Cnt      Score      Error  Units
16MemoryChurnBench.getFromRepoNewObject                5  thrpt    5  10286.933 ±  709.515  ops/s
17MemoryChurnBench.getFromRepoNewObject               10  thrpt    5  10687.213 ±  167.468  ops/s
18MemoryChurnBench.getFromRepoNewObject              100  thrpt    5  10675.250 ± 1194.855  ops/s
19MemoryChurnBench.getFromRepoNewObject             1000  thrpt    5  11001.058 ± 1460.443  ops/s
20MemoryChurnBench.getFromRepoNewObject            10000  thrpt    5  10464.001 ±   80.074  ops/s
21MemoryChurnBench.getFromRepoStringConcat             5  thrpt    5   3036.421 ±  524.678  ops/s
22MemoryChurnBench.getFromRepoStringConcat            10  thrpt    5   3133.770 ±  401.871  ops/s
23MemoryChurnBench.getFromRepoStringConcat           100  thrpt    5   2291.642 ±  122.767  ops/s
24MemoryChurnBench.getFromRepoStringConcat          1000  thrpt    5    238.308 ±    1.260  ops/s
25MemoryChurnBench.getFromRepoStringConcat         10000  thrpt    5     28.638 ±    0.278  ops/s

Conclusion

From the results above, we can draw the following conclusions:

  1. The bigger the strings that are used for concatenation, the bigger the memory churn is.
  2. Java 17 has an order of magnitude better performance for string concatenation compared to java 8.
  3. Optimized version of code, without string concatenation, performs much better than string concatenation. This statement is true for all versions of Java.
  4. Java 17 has much performs better when it comes to garbage collection.

For more stuff like this, you can follow me on Twitter, LinkedIn, or visit my website.

Appendix: Java Code Used for Benchmarking

At the end, here is the code that I used to perform benchmarks:

 1@Fork(warmups = 0, value = 1)
 2@BenchmarkMode(Mode.Throughput)
 3@OutputTimeUnit(TimeUnit.SECONDS)
 4@Measurement(time = 10, iterations = 5)
 5@Warmup(iterations = 5, time = 1)
 6public class MemoryChurnBench {
 7    private static final int TOTAL_NO_ITEMS = 10_000;
 8
 9    @State(Scope.Benchmark)
10    public static class InputParams {
11        PersonRepository personRepository = new PersonRepository();
12        @Param({"5", "10", "100", "1000", "10000"})
13        private int nameLength;
14
15        List<Pair<String, String>> searchPersonNamesList;
16
17        public InputParams() {
18        }
19
20        @Setup(Level.Trial)
21        public void createRandomList() {
22            searchPersonNamesList = new ArrayList<>(TOTAL_NO_ITEMS);
23            Set<String> generated = new HashSet<>();
24            for (int i = 0; i < TOTAL_NO_ITEMS; i++) {
25                String firstName = getUniqueFirstName(generated);
26                String lastName = getUniqueLastName(generated);
27                searchPersonNamesList.add(Pair.of(firstName, lastName));
28            }
29        }
30
31        private String getUniqueLastName(Set<String> generated) {
32            String firstName;
33            do {
34                firstName = getLastName(nameLength);
35            }
36            while (generated.contains(firstName));
37            return firstName;
38        }
39
40        private String getUniqueFirstName(Set<String> generated) {
41            String firstName;
42            do {
43                firstName = getFirstName(nameLength);
44            }
45            while (generated.contains(firstName));
46            return firstName;
47        }
48
49        public String getFirstName(int i) {
50            return getString("F:", i);
51        }
52
53        private static String getString(String prefix, int length) {
54            return prefix + RandomStringUtils.random(length, true, true);
55        }
56
57        public String getLastName(int i) {
58            return getString("L:", i);
59        }
60
61        public void setNameLength(int nameLength) {
62            this.nameLength = nameLength;
63        }
64    }
65
66    @Benchmark
67    public void getFromRepoStringConcat(InputParams params, Blackhole b) {
68        for (int i = 0; i < TOTAL_NO_ITEMS; i++) {
69            Pair<String, String> personFullName = params
70                    .searchPersonNamesList
71                    .get(i);
72            Person repositoryPerson = params
73                    .personRepository
74                    .findPerson1(personFullName.left(), personFullName.right());
75            b.consume(repositoryPerson);
76        }
77    }
78
79    @Benchmark
80    public void getFromRepoNewObject(InputParams params, Blackhole b) {
81        for (int i = 0; i < TOTAL_NO_ITEMS; i++) {
82            Pair<String, String> personFullName = params
83                    .searchPersonNamesList
84                    .get(i);
85            Person repositoryPerson = params
86                    .personRepository
87                    .findPerson2(personFullName.left(), personFullName.right());
88            b.consume(repositoryPerson);
89        }
90    }
91}
comments powered by Disqus