What is the difference between value-types and reference-types in C#

Fri, Nov 26, 2021

This is the question that comes up in every technical interview we conduct at our company. Although it is very basic, surprisingly few people understand what it’s really about. In this post I’ll try to explain the matter and also give some tips on how to make use of it.

The key to understanding this question is memory layout.

How a program’s memory is organized

When a program starts running, the operating system allocates a virtual memory space for it. A full explanation of virtual memory is beyond the scope of this article, but in short it consists of all the addresses that the program can possibly use. On a 64-bit system those addresses nominally go from 0 to 2^64 - 1. A simplified scheme of this memory space is shown in the following picture:

Memory layout

Going from the lowest addresses, we see the following sections of the memory:

Text section. This is where the executable code of the program is stored. This section is fixed in size and immutable.
Data section. This is where the global and static variables are. It’s also fixed in size, but is writable at runtime.
Heap. This is the biggest and the most flexible section. It holds the data that the program generates at runtime and can expand if needed. In C# the programmer doesn’t need to worry about freeing the memory on the heap, since the garbage collector takes care of it. Note, however, that allocating memory on the heap is relatively expensive compared to the stack, and the garbage collector may pause the program’s threads while it cleans up the unused memory. That’s why, if performance is a concern, the heap should be used carefully.
Stack. This is the section used for local variables, function arguments, return addresses and a few other things. It works like the stack data structure, so allocations and deallocations always happen at the top of the stack. Because of this, these operations are extremely cheap: they amount to moving the stack pointer, which the CPU supports directly. No garbage collector is needed here either. But compared to the heap, the stack is small, so we cannot store large amounts of data on it. Another limitation is that the amount of stack memory a function needs generally has to be known at compile time, which is why the stack is not suitable for storing things like dynamic arrays or lists.

The two sections relevant to our topic are the stack and the heap. Now that we understand what they are and how they differ, we can explore the other part of the question: reference-types and value-types.

What are reference-types and value-types

In C#, not all types are created equal. Consider this example:

void ExampleMethod()
{
	int a = 5;
	int[] arr = new int[10];
}

Where in memory do these variables live? Or rather, where do they hold their value?

Well, in the case of int a it’s simple. The variable and its value are the same thing. Once we declare the variable a, 4 bytes get allocated on the stack and the value of 5 gets written there. Because of this, int is called a “value-type”. Other value-types include the other primitive types, like float, bool, long and char. All structs are value-types too: things like KeyValuePair, DateTime and IntPtr, as well as all custom-written structs.

With int[] things look a bit different, though. For the variable arr, memory also gets allocated on the stack, but not for the entire array. What is allocated on the stack is only a reference: an 8-byte ¹ integer address of some location on the heap. The array itself is allocated on the heap, at the location the reference points to. And here we have it: a reference-type! Other reference-types are all class-based types (like List<T>, FileStream, Thread), built-in arrays and the string type. A very important thing to keep in mind about reference-types: every time you create an instance of a reference-type with the new keyword, you allocate memory on the heap. Which, as we discussed in the previous section, is expensive and produces garbage.
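One observable consequence of this layout is how assignment behaves. The following is a minimal sketch (the class and method names are made up for illustration) that copies an int and an int[] and then modifies the copies:

```csharp
using System;

static class CopySemanticsDemo
{
    // Copies an int, changes the copy, and returns the original.
    public static int CopyValue()
    {
        int a = 5;
        int b = a;   // the 4-byte value itself is copied
        b = 10;      // only b changes
        return a;
    }

    // Copies an array reference, writes through the copy, and returns
    // what the original variable now sees.
    public static int CopyReference()
    {
        int[] arr = new int[10];
        int[] alias = arr;   // only the reference is copied; no new array is created
        alias[0] = 10;       // writes into the single array that lives on the heap
        return arr[0];
    }

    static void Main()
    {
        Console.WriteLine(CopyValue());      // prints 5
        Console.WriteLine(CopyReference());  // prints 10
    }
}
```

Both variables arr and alias hold the same address, which is why a write through one is visible through the other.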

It’s worth noting that when you make a new class, it becomes a reference-type; when you make a new struct, it becomes a value-type. So take into consideration the use-case of the type you’re creating. If it’s meant as a plain data object with a few value-type fields, then it is a good candidate for a value-type itself, so make it a struct. If the type is meant to perform logic on the data it holds, make it a reference-type by using the class keyword.
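To make the choice more concrete, here is a small sketch with the same two fields declared once as a struct and once as a class. PointStruct, PointClass and MoveRight are hypothetical names used only for this illustration:

```csharp
using System;

// Hypothetical plain-data type: two value-type fields, no behavior.
struct PointStruct
{
    public int X;
    public int Y;
}

// The same data declared as a class, i.e. a reference-type.
class PointClass
{
    public int X;
    public int Y;
}

static class StructVsClassDemo
{
    // Receives a copy of the struct; the caller's variable is untouched.
    static void MoveRight(PointStruct p) { p.X += 1; }

    // Receives a copy of the reference; caller and callee share one heap object.
    static void MoveRight(PointClass p) { p.X += 1; }

    public static int MoveStruct()
    {
        PointStruct s = new PointStruct { X = 1, Y = 2 };
        MoveRight(s);
        return s.X;
    }

    public static int MoveClass()
    {
        PointClass c = new PointClass { X = 1, Y = 2 };
        MoveRight(c);
        return c.X;
    }

    static void Main()
    {
        Console.WriteLine(MoveStruct()); // prints 1
        Console.WriteLine(MoveClass());  // prints 2
    }
}
```

The struct version behaves like an int: passing it around copies the data. That is usually what you want from a small plain data object, and usually not what you want from an object that owns state and logic.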

So to sum it up, when we ask the question in the title at an interview, we expect an answer somewhere along the lines of “Reference-type objects get allocated on the heap, value-type objects get allocated on the stack”. Once this single sentence is said, we usually politely stop the candidate and move on to the next question. If a person formulates their answer like this, it means that they understand the difference between the two kinds of types, the two memory sections, and which objects get allocated where. This is a very important thing to be aware of when writing code that needs to be fast and garbage-free.
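If you want to see the difference rather than take it on faith, one option is to ask the runtime how many bytes the current thread has allocated on the managed heap. The sketch below assumes a recent .NET version where GC.GetAllocatedBytesForCurrentThread is available; the type and method names are hypothetical, and the exact numbers depend on the runtime, so none are shown here:

```csharp
using System;

struct PointStruct { public int X; public int Y; }
class PointClass { public int X; public int Y; }

static class AllocationDemo
{
    // Results are stored in static fields so the JIT cannot optimize the work away.
    public static PointClass KeepAlive;
    public static int Sink;

    // Managed-heap bytes allocated while creating a struct as a local variable.
    public static long MeasureStruct()
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        PointStruct p = new PointStruct { X = 1, Y = 2 };
        Sink = p.X + p.Y;
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    // Managed-heap bytes allocated while creating a class instance.
    public static long MeasureClass()
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        KeepAlive = new PointClass { X = 1, Y = 2 };
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    static void Main()
    {
        Console.WriteLine($"struct local:   {MeasureStruct()} bytes");
        Console.WriteLine($"class instance: {MeasureClass()} bytes");
    }
}
```

The class instance should report a non-zero number, because the object itself plus its header has to live on the heap. The struct local is expected to report no heap allocation at all.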

Bonus: Boxing

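There is a caveat to the saying that value-types are allocated on the stack. It holds for local variables and arguments; a value-type that is a field of a class or an element of an array is stored inside that heap object. Beyond that, in some situations simply using an instance of a value-type can trigger a heap allocation of its own. This process is called “boxing”. Consider an example: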

interface IPrint 
{
	void Print();
}

struct A : IPrint { ... }

void DoPrint(IPrint p) { ... }

A a = new A();

DoPrint(a);

In C# it’s possible to pass a function an argument whose type derives from, or implements, the parameter type in the function declaration. In the example, DoPrint expects a parameter of type IPrint, but we pass it an object of type A. That’s fine because A implements IPrint. Note, however, that A is a value-type, since it’s a struct. DoPrint, on the other hand, expects a reference, since interface types are reference-types. The C# compiler resolves this mismatch by emitting a boxing instruction: at runtime a new object is allocated on the heap, the value of a is copied into it, and a reference to that object is passed to the function. This way our value-type object gets “boxed” into a reference-type object, and creating this box causes a heap allocation.
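Here is a runnable variation of the example. The bodies of A and DoPrint are elided in the snippet above, so the ones below are assumptions made for illustration: Print just increments a counter, which makes it possible to observe that the box is a separate copy.

```csharp
using System;

interface IPrint
{
    void Print();
}

// Hypothetical body: counts how many times Print was called on this value.
struct A : IPrint
{
    public int PrintCount;
    public void Print() { PrintCount++; }
}

static class BoxingDemo
{
    static void DoPrint(IPrint p) { p.Print(); }

    // Returns the counter of the original struct after passing it to DoPrint.
    public static int CountAfterDoPrint()
    {
        A a = new A();
        DoPrint(a);           // a is boxed: its value is copied into a new heap object
        return a.PrintCount;  // the copy inside the box was incremented, not a
    }

    // Boxing the same value twice yields two distinct heap objects.
    public static bool SameBox()
    {
        A a = new A();
        IPrint first = a;     // first box
        IPrint second = a;    // second box
        return object.ReferenceEquals(first, second);
    }

    static void Main()
    {
        Console.WriteLine(CountAfterDoPrint()); // prints 0
        Console.WriteLine(SameBox());           // prints False
    }
}
```

Besides the allocation, this shows a second reason to be careful with boxing: changes made through the interface reference never reach the original value.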

Another instance of boxing that occurs frequently is passing a value-type object to a function that expects a parameter of type object. A very common example of this is string.Format: passing an instance of a struct to it will trigger boxing.
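A minimal sketch of the string.Format case, using the string.Format(string, object) overload; the class and method names are made up for the example:

```csharp
using System;

static class FormatBoxingDemo
{
    public static string WithBoxing(int value)
    {
        // The parameter type is object, so the int is boxed before the call.
        return string.Format("Value: {0}", value);
    }

    public static string WithoutBoxing(int value)
    {
        // ToString() is called directly on the int. The resulting string is
        // already a reference-type, so no box is created for the argument.
        return string.Format("Value: {0}", value.ToString());
    }

    static void Main()
    {
        Console.WriteLine(WithBoxing(42));     // prints Value: 42
        Console.WriteLine(WithoutBoxing(42));  // prints Value: 42
    }
}
```

Note that both versions still allocate strings on the heap. Calling ToString() up front only removes the extra box, so it matters mostly in hot paths.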

In general, objects generated by boxing are very small, and one or two of them won’t hurt performance much. But you need to be careful in contexts where boxing happens frequently, for example inside a loop. In such cases these small allocations add up and may cause problems.
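One common way to avoid boxing in such hot paths is to make the function generic and constrain the type parameter to the interface, so that the struct is passed as itself. The sketch below compares the two approaches in a loop. DoPrintGeneric, the static Calls counter and the measuring methods are assumptions added for this illustration, and the measurement relies on GC.GetAllocatedBytesForCurrentThread being available in your .NET version:

```csharp
using System;
using System.Runtime.CompilerServices;

interface IPrint
{
    void Print();
}

struct A : IPrint
{
    public static int Calls;          // hypothetical stand-in for real printing
    public void Print() { Calls++; }
}

static class LoopBoxingDemo
{
    // Interface-typed parameter: every call with a struct argument boxes it.
    // NoInlining keeps the JIT from inlining the call and removing the box.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static void DoPrint(IPrint p) { p.Print(); }

    // Generic parameter constrained to IPrint: T is A itself, so no box is needed.
    static void DoPrintGeneric<T>(T p) where T : IPrint { p.Print(); }

    public static long BytesWithBoxing(int iterations)
    {
        A a = new A();
        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++) DoPrint(a);
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static long BytesWithGenerics(int iterations)
    {
        A a = new A();
        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++) DoPrintGeneric(a);
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    static void Main()
    {
        Console.WriteLine($"interface parameter: {BytesWithBoxing(1000)} bytes");
        Console.WriteLine($"generic parameter:   {BytesWithGenerics(1000)} bytes");
    }
}
```

The exact byte counts depend on the runtime, but the interface version should grow with the number of iterations while the generic version should stay flat.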

I hope that this article has helped a few people understand what value-types and reference-types are. And perhaps also to successfully pass the interview at our company :)

¹ Assuming the program runs on a 64-bit system. On 32-bit systems references take only 4 bytes.