Pointers & References - our school-day boogeymen

Chronicles of Memory Management - Part 9

Feb 17, 2023

Hi there!

I had to go a bit slower this week as I’m officially on vacation and actually spending some time on the mountain with my kid and wife :)

Regardless, the C++ story has to keep flowing, especially when I want to talk about the stuff that made me hate C and C++ for quite a while. I even joked the other day on Twitter that it took me 20+ years to feel comfortable enough to write about Pointers.

As usual, the bitesized graphic first and then the more detailed version will follow!


Interestingly enough, I also spent quite some time thinking about the simplest way to talk about pointers. Sure, everybody “gets them” and yes, they are “just pointers to a memory location”, but I guarantee you that 5 out of 10 people aren’t quite comfortable explaining them. So I decided to take another approach.

Let’s start with the thing that confuses most people: the notation itself, *pointer and &reference. The first question is WHY the notation is confusing. I find it confusing simply because it deviates from what we are used to seeing. Normally, variable names have letters and sometimes numbers (e.g. fullName or streetAddress12). But they never have anything like an asterisk or an ampersand. So I argue it’s simply unusual and hence “hard” to grasp.

But here’s the thing: do you know what the square root of -1 is? It’s i. Yep. And why i? Well, it stands for “imaginary”; mathematicians needed a symbol for a brand new concept, so they opted for a single letter. And frankly, if you’re just a rookie mathematician, seeing i really makes you pause for a second, trying to make sense of it alongside the other numbers. But once you become seasoned, it’s just like any other symbol. Hell, even NUMBERS are symbols, as far as mathematicians are concerned.

So why am I saying this? Because, from my POV, pointers and references are hard to grasp just because they are not that common. You are used to seeing normal variable names all the time, so once you see an asterisk or an ampersand, well, it makes you pause for a second, just like i does among regular numbers. Trust me, it just requires some getting used to!

Here’s how you should think of it. Hell, here’s how to think about ANY local variable: think of it as “this is what is on the Stack”. When you encounter a regular variable name, what’s on the Stack is a full-blown structure that can range from 8 bits (e.g. a char), through 32 bits (e.g. an int on most platforms), all the way to some class that could take up to 1MB (1MB because the default Stack size on Windows is 1MB!). But the point is: when you see a “regular” variable name, it’s a “full data structure” on your Stack.

char foo; // This is 8 bits on your Stack
int bar; // This is 32 bits on your Stack

class Doughnut {
  int a;    // 32 bits
  int b;    // 32 bits
  double c; // 64 bits
};

Doughnut cookie; // 128 bits on your Stack

You following me so far? It’s really simple: a regular variable is a full-blown data structure on your Stack and that’s about it.
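If you'd rather see the numbers than take my word for it, here's a minimal sketch that asks the compiler directly. The helper `bits_of` and the function `print_sizes` are hypothetical names made up for this illustration, and `Doughnut` is declared as a struct only so the members can be initialized easily. The exact numbers depend on your platform and compiler, so treat the figures in the comments above as typical rather than guaranteed.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdio>

// Same members as the Doughnut above (a struct, so the members are public).
struct Doughnut {
    int a;
    int b;
    double c;
};

// Hypothetical helper: size of any object in bits on the current platform.
template <typename T>
constexpr std::size_t bits_of(const T& object) {
    return sizeof(object) * CHAR_BIT;
}

// Prints how much room each local variable takes up.
void print_sizes() {
    char foo = 'x';
    int bar = 12;
    Doughnut cookie{1, 2, 3.0};

    // sizeof(char) is 1 by definition, so a char is exactly CHAR_BIT bits.
    assert(bits_of(foo) == CHAR_BIT);

    std::printf("char foo:        %zu bits\n", bits_of(foo));
    std::printf("int bar:         %zu bits\n", bits_of(bar));
    std::printf("Doughnut cookie: %zu bits\n", bits_of(cookie));
}
```

Calling `print_sizes()` from `main` on a typical 64-bit desktop will most likely report 8, 32 and 128 bits, but the compiler is free to pick other sizes or add padding.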

Here’s the simple magic now: when you see an asterisk (e.g. int* foo), what you have on the Stack is a POINTER TO ANOTHER PLACE that contains your data structure. It’s like i really, just another way of representing some additional concept.

int* a; // This is 64 bits on your Stack (on a 64-bit system) which stores some memory address where you can find an "integer"

double* b; // This is 64 bits on your Stack which keeps some memory address where you can find a "double"

Doughnut* cookie; // This is 64 bits on your Stack which keeps some memory address where you can find a "Doughnut"

See where I’m going with this? It’s all about thinking about “what’s on your Stack”. Is it the “full data structure” or just a location where you can find the full data structure?
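To make the "location vs. full data structure" idea a bit more tangible, here's a small sketch. The names `write_through` and `print_pointer_demo` are hypothetical and exist only for this example. It prints the size of the pointers next to the size of the thing they point to, and then changes a variable by going through its address.

```cpp
#include <cassert>
#include <cstdio>

struct Doughnut {
    int a;
    int b;
    double c;
};

// Hypothetical helper: follows the pointer and overwrites the int it points to.
void write_through(int* target, int value) {
    assert(target != nullptr);  // a null pointer points at nothing
    *target = value;
}

void print_pointer_demo() {
    int number = 7;
    Doughnut cookie{1, 2, 3.0};

    int* a = &number;       // a holds the address of number
    Doughnut* c = &cookie;  // c holds the address of cookie

    // The pointers only hold an address, however big the pointed-to object is.
    std::printf("sizeof(int*)      = %zu bytes\n", sizeof(a));
    std::printf("sizeof(Doughnut*) = %zu bytes\n", sizeof(c));
    std::printf("sizeof(Doughnut)  = %zu bytes\n", sizeof(cookie));

    write_through(a, 8);  // changes number, not the pointer
    std::printf("number is now %d\n", number);
    std::printf("cookie.b read through the pointer: %d\n", c->b);
}
```

On a 64-bit build both pointers will most likely come out as 8 bytes, while the Doughnut itself is bigger. That is the whole trick: the pointer only stores where to look.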

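What about “references” (e.g. &addressLine1)? Kind of a similar story, except this time what you have is a SHORTCUT, a “hard link” of a kind to an ACTUAL memory address. It’s literally a way to pass an actual memory address to wherever you want to pass it. And in contrast to regular variables and pointers, think of them as NOT being objects of their own on the Stack. (Strictly speaking, the C++ standard leaves it unspecified whether a reference needs storage, and compilers often implement one as a hidden pointer, for example when you pass it to a function.) They are SHORTCUTS, a way to communicate memory addresses.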

int a = 32; // Assume memory address of this var is 0x0001

int* b = &a; // We store 0x0001 as value of pointer variable "b" on Stack

int& c = a; // c is a shortcut, a hard link to 0x0001. Think of it as another name for "a" rather than a new variable on the Stack (the compiler may still use a hidden pointer behind the scenes). It's a way of passing a memory address around
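Here's a rough sketch of what that "shortcut" behaviour looks like in practice. The function names (`add_one_by_reference`, `add_one_by_pointer`, `same_object`, `print_reference_demo`) are hypothetical and only serve this illustration. The point is that taking the address of a reference gives you the address of the original variable, and that a function taking a reference works directly on the caller's variable.

```cpp
#include <cassert>
#include <cstdio>

// 'value' is another name for the caller's int, so the caller sees the change.
void add_one_by_reference(int& value) {
    value += 1;
}

// Same effect with a pointer: the caller passes the address explicitly.
void add_one_by_pointer(int* value) {
    if (value != nullptr) {
        *value += 1;
    }
}

// True when both references name the very same object in memory.
bool same_object(const int& x, const int& y) {
    return &x == &y;
}

void print_reference_demo() {
    int a = 32;
    int* b = &a;  // b stores the address of a
    int& c = a;   // c is another name for a

    assert(same_object(a, c));  // &c is the address of a, not of some new object

    add_one_by_reference(c);  // a becomes 33
    add_one_by_pointer(b);    // a becomes 34
    std::printf("a = %d, c = %d, *b = %d\n", a, c, *b);
}
```

Notice that the call site of `add_one_by_reference` has no ampersand or asterisk at all, which is exactly why references are handy for passing things around.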

If you find all of this confusing - that’s totally OK. It takes time to get used to really. My advice for you is - take your time and always think in terms of “notation”. It’s just a way to write down certain things.

P.S. Here’s something scary: int** foo. As if a single asterisk wasn’t scary enough, right? What about int*** foo? Is your mind starting to boil already? Luckily, as I said before, it’s REALLY simple. All you need to think about is “what’s inside the variable on the Stack”. In the int** foo case, what you have on the Stack is a pointer variable that contains the address of a memory place that contains another memory address, and that one actually holds the integer. Nothing spectacular, just two jumps to get to the integer. Similarly, int*** foo is just three jumps to get to the integer, that’s all. You can keep going for as long as you want, but I guess there are hardly any valid use-cases for triple jumps :)
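If you want to count the jumps yourself, here's a minimal sketch. The names `read_two_jumps`, `retarget` and `print_jumps` are hypothetical, invented for this example. It also hints at one common reason to use int** at all: letting a function change where the caller's pointer points.

```cpp
#include <cassert>
#include <cstdio>

// Two jumps: pp -> a pointer -> the int itself.
int read_two_jumps(int** pp) {
    assert(pp != nullptr && *pp != nullptr);
    return **pp;
}

// One jump and a write: changes where the caller's pointer points.
void retarget(int** pp, int* new_target) {
    assert(pp != nullptr);
    *pp = new_target;
}

void print_jumps() {
    int x = 1;
    int y = 2;
    int* p = &x;      // one jump to reach x
    int** pp = &p;    // two jumps to reach x
    int*** ppp = &pp; // three jumps to reach x

    std::printf("two jumps:   %d\n", read_two_jumps(pp));
    std::printf("three jumps: %d\n", ***ppp);

    retarget(pp, &y);  // p now points at y, x is untouched
    std::printf("after retarget, two jumps: %d\n", read_two_jumps(pp));
}
```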

And that’s about it for today! Next time I’m going to talk a bit more about VirtualAlloc and HeapAlloc (i.e. Windows’ Native API for allocating Heap memory). Until then, do let me know if you find this article useful!

Cheers!

Mixa


Bitesized Engineering
