Interestingly enough, I think “abandoned memory” would be a more fitting name. But there seem to be some historical reasons why it’s called “leaking”.
Anyway, today we will be taking a quick look at memory and resource leaks: what they are, why they happen and how to, hopefully, prevent them.
As usual, the bite-sized graphic comes first and a more detailed explanation follows.
Firstly, I want to share some findings on why it’s called “memory leak” at all. Rather than paraphrasing, I’ll just copy paste the original answer from StackOverflow:
There are two theories about where the term "leak" and other liquid based analogies for memory originated.
In the ENIAC, and in very old systems, memory was stored in circulating mercury, something called a mercury delay line. Information was represented as a charge in a portion of mercury that circulated around tubes. The charge was read when that part of the mercury passed by detectors (BTW, this is why ENIAC could only use half of the memory... see below).
In this case, a memory leak is quite literal, and fairly hazardous.
The other theory comes from the mainframe days. Memory was shared between any running jobs and was called a "pool" of memory after things like motor pools, secretarial pools, and similar. When a job (program) caused memory to become inaccessible, that memory "leaked" out of the pool.
I guess that makes sense, although “abandoned memory” still makes way more sense to me, but I digress.
So what is a memory leak, really? It’s simply memory that you allocate on the Heap but never release. In terms of the Beach analogy from the previous article, it would be like constantly bringing more stuff with you and piling it up, but never taking any of it back. Or something like hoarding - you keep putting stuff on top of existing stuff without ever cleaning up. In simple terms - you keep allocating MORE Heap memory, but you never free up the bits you don’t need any more.
Here’s the simplest example:
class Foo
{
    int a;
};
int main()
{
    for (int i = 0; i < 1000; i++) {
        Foo* f = new Foo();
    }
}
For people coming from Managed Languages, this could be rather confusing. Where’s the problem?
The problem, just like the devil, is in the details :) So what the code above does is the following:
1. “Foo* f” allocates a variable named “f” and places it on the Stack. This variable is of a pointer type, and it points to an object of type “Foo”. On a 64-bit platform the size of this variable is 64 bits (8 bytes), and it stays at the same address on the Stack, occupying the same size, for as long as it is in scope.
2. “new Foo” allocates at least 32 bits (4 bytes) on the Heap, which is the size of the Foo class on typical platforms - it has a single INT field, and INT is usually 32 bits in size. But here’s the devil and his details - if you remember the article I wrote on the keyword “new”, it allocates memory every time you call it. So every time you call “new Foo()”, you are asking for another 32 bits from the Heap.
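If you want to check those sizes on your own machine rather than take my word for it, here is a minimal sketch. The helper names (pointer_bits, foo_bits, check_sizes) are made up for this illustration, and I declare Foo as a struct here only to keep compilers from warning about an unused private field. The actual numbers depend on your platform, which is why I'm not printing any "expected" values.

```cpp
#include <cassert>   // assert
#include <climits>   // CHAR_BIT: number of bits in a byte
#include <cstddef>   // std::size_t

// Same layout as the Foo class above: a single int member.
struct Foo
{
    int a;
};

// Size of the pointer variable itself (the part that lives on the Stack), in bits.
constexpr std::size_t pointer_bits() { return sizeof(Foo*) * CHAR_BIT; }

// Size of one Foo object (the minimum every "new Foo()" has to provide), in bits.
constexpr std::size_t foo_bits() { return sizeof(Foo) * CHAR_BIT; }

// Sanity check: a Foo can never be smaller than the int it contains.
inline void check_sizes()
{
    assert(foo_bits() >= sizeof(int) * CHAR_BIT);
}
```

Keep in mind that the allocator usually hands out a bit more than sizeof(Foo) per allocation because of bookkeeping and alignment, so the real cost of each leaked object tends to be higher than these numbers suggest.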
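The problem is in step #2 - if you keep asking for new memory, but never release what you don’t need any more, your process just keeps growing. On top of that, the pointer “f” goes out of scope at the end of every iteration, so the address of the Foo it pointed to is lost and there is no way left to free it. And yes, what this means is that you’d be “leaking” the memory :) Keep doing it for long enough and you will eventually exhaust the available memory, at which point allocations start failing and your process (and potentially the whole machine) slows to a crawl or crashes. Not good.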
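The most direct fix for the loop is to pair every “new” with a “delete” before the pointer is lost. Below is a rough sketch of that. CountedFoo and allocate_and_release are names I invented for this illustration; the static counter is only there so we can observe how many objects are alive, it is not something you would normally ship.

```cpp
#include <cassert>

// Hypothetical stand-in for Foo that counts how many instances are alive.
struct CountedFoo
{
    static int live;               // number of currently allocated instances
    int a = 0;
    CountedFoo()  { ++live; }
    ~CountedFoo() { --live; }
};
int CountedFoo::live = 0;

// Allocates `iterations` objects one after another, releasing each one
// before the next allocation. Returns how many objects are still alive.
int allocate_and_release(int iterations)
{
    for (int i = 0; i < iterations; i++) {
        CountedFoo* f = new CountedFoo();  // heap allocation
        assert(CountedFoo::live == 1);     // only one object alive at a time
        delete f;                          // hand the memory back; drop this line and the loop leaks
    }
    return CountedFoo::live;
}
```

Calling allocate_and_release(1000) leaves nothing behind on the Heap. The catch is that you have to remember the “delete” on every single path out of the scope, including early returns and exceptions, which is exactly where manual cleanup tends to fall apart.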
Again, this might be a complete brain-fuck for anyone coming from languages with Garbage Collectors. And technically, all that GCs do is track the unused bits in the Heap and free them up. Well, they do it on a way more sophisticated scale, but in a nutshell, that’s what they do.
In order to tackle this, Bjarne Stroustrup came up with a concept called RAII (Resource Acquisition Is Initialization). Now mind you, and with all due respect to the author himself, I find this name to be as ugly as HATEOAS. Two super-important concepts wrapped in completely and utterly meaningless names, making people either ignore them or pull their hair out trying to understand the meaning.
RAII is a VERY SIMPLE idea, and I really love how Microsoft’s Docs explain it:
Modern C++ avoids using heap memory as much as possible by declaring objects on the stack. When a resource is too large for the stack, then it should be owned by an object. As the object gets initialized, it acquires the resource it owns. The object is then responsible for releasing the resource in its destructor. The owning object itself is declared on the stack. The principle that objects own resources is also known as "resource acquisition is initialization," or RAII.
When a resource-owning stack object goes out of scope, its destructor is automatically invoked. In this way, garbage collection in C++ is closely related to object lifetime, and is deterministic. A resource is always released at a known point in the program, which you can control. Only deterministic destructors like those in C++ can handle memory and non-memory resources equally.
Source: Microsoft Docs
Put in simple words, what it means is - instead of juggling raw pointers, wrap them in a damn class whose instances are placed on the Stack, and define a destructor that frees the memory. Once the object goes out of scope and its destructor gets called, you are guaranteed that the memory is freed.
Here’s a simple stupid example:
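class SmartFoo
{
public:
    Foo* f;

    // Acquire the resource when the object is created.
    SmartFoo()
    {
        f = new Foo();
    }

    // Release it when the object goes out of scope.
    ~SmartFoo()
    {
        delete f;
    }

    // Copying would leave two objects deleting the same Foo, so forbid it.
    SmartFoo(const SmartFoo&) = delete;
    SmartFoo& operator=(const SmartFoo&) = delete;
};

int main()
{
    for (int i = 0; i < 3000; i++) {
        SmartFoo f; // destroyed at the end of every iteration
    }
}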
In comparison to the previous piece of code, this time the variable “f” on the Stack is a whole SmartFoo object rather than a raw pointer, and whenever it goes out of scope (i.e. at the end of every loop iteration) its destructor gets called and deletes the Foo it owns. We are literally using the Stack as a crutch to ensure that we wipe our resources out once they go out of scope. Would it surprise you to learn that this is pretty much exactly what Smart Pointers do?
You might argue that my examples are too simple, and I’d agree. But my idea was to show you HOW leaks happen - they happen if you allocate bits on the Heap but never free them up. Those bits will live on the Heap for as long as your process lives!
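To make the Smart Pointer comparison concrete, here is a small sketch of the same loop using std::unique_ptr from the standard <memory> header. TrackedFoo and loop_with_unique_ptr are hypothetical names for this illustration, and the live counter exists only so the effect is observable.

```cpp
#include <cassert>
#include <memory>   // std::unique_ptr

// Hypothetical stand-in for Foo that counts how many instances are alive.
struct TrackedFoo
{
    static int live;
    int a = 0;
    TrackedFoo()  { ++live; }
    ~TrackedFoo() { --live; }
};
int TrackedFoo::live = 0;

// Same loop as before, but ownership is handed to a std::unique_ptr.
// Returns how many objects are still alive once the loop is done.
int loop_with_unique_ptr(int iterations)
{
    for (int i = 0; i < iterations; i++) {
        std::unique_ptr<TrackedFoo> f(new TrackedFoo());
        assert(TrackedFoo::live == 1);
    }   // f goes out of scope here; its destructor deletes the TrackedFoo
    return TrackedFoo::live;
}
```

There is no “delete” anywhere in that code, and nothing is left alive after the loop. From C++14 onwards you would normally write std::make_unique<TrackedFoo>() instead of spelling out “new”; I used the constructor form only so the sketch also compiles as C++11.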
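The same trick works for the “resource” half of the title: anything that has to be handed back (file handles, locks, connections) can be wrapped the same way. Here is a minimal sketch under made-up assumptions: SlotPool and SlotLease are hypothetical classes standing in for whatever limited resource you are dealing with, not a real API.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical fixed-size pool of some non-memory resource.
class SlotPool
{
public:
    explicit SlotPool(int capacity) : free_(capacity) {}
    int free_slots() const { return free_; }
private:
    friend class SlotLease;
    int free_;
};

// RAII lease: takes one slot in the constructor, returns it in the destructor.
class SlotLease
{
public:
    explicit SlotLease(SlotPool& pool) : pool_(pool)
    {
        if (pool_.free_ == 0) throw std::runtime_error("pool exhausted");
        --pool_.free_;
    }
    ~SlotLease() { ++pool_.free_; }

    // One lease owns exactly one slot, so copying is not allowed.
    SlotLease(const SlotLease&) = delete;
    SlotLease& operator=(const SlotLease&) = delete;
private:
    SlotPool& pool_;
};

// Takes a slot, then fails half-way through with an exception.
// Returns the number of free slots after the exception has been handled.
int free_slots_after_failure(SlotPool& pool)
{
    try {
        SlotLease lease(pool);
        throw std::runtime_error("something went wrong");
    } catch (const std::runtime_error&) {
        // lease's destructor already ran while the stack was unwinding
    }
    return pool.free_slots();
}
```

Even though the function bails out with an exception, the slot still goes back to the pool, because the destructor runs while the stack unwinds. With manual cleanup that slot would have “leaked” out of the pool, much like in the mainframe story above.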
And that’s about all I had to share for today :) Do let me know if you have some crazy stories with memory leaks that were hard to trace!
Next time I’m going to talk a bit about Pointers & References, and then I’ll start exploring the world of Windows’ Heap Allocations - HeapAlloc and VirtualAlloc APIs.
Until then!
Other articles from the C++ Memory Management series: