Virtual Addressing and Why Every Process Sees the Same Memory Space
Chronicles of Memory Management - Part 12
This is one of those articles that I didn’t plan on writing. Seriously. My plan was to cover Heaps, Heap API and then write about Virtual functions and Virtual Memory allocation. But then I realized I never explained what Virtual Memory and Virtual Addressing are. Bummer!
As usual, first the bite-sized infographic and then the more detailed explanation below!
Enjoy!
A’ight, I hope you found that useful. If not - well, at least I know I gave my best.
The truth is that I lied to you before. I lied when I said that the memory addresses you get back from the OS are RAM addresses. They are not. And some of you probably figured that I’m a liar and you stopped reading. That’s OK. Thing is - I think this was a WHITE LIE. A lie told with a greater purpose: making the content simpler to digest at the moment. But now I’m giving you the full picture (and I swear that was the only lie I told!).
Thing is - you never ever get to see the actual physical RAM addresses. Never. Ever. Hell, I actually went above and beyond to figure out if there’s even a way to do so! No matter how privileged your process is, you ALWAYS see Virtual Address space. The actual conversion from Virtual to Physical happens underneath you: the CPU’s memory management unit does it, using page tables that the OS Kernel maintains!
But why is that? Well, I guess I explained it in the infographic above, but I’ll repeat it for those with a short attention span:
A process can ask for more memory than there is Physical RAM - this isn’t that big of a problem in these days of massive servers with hundreds of gigabytes of RAM. At least not big enough of a problem to bother you. But to get an actual picture, I want to take you back to the 32-bit Windows era. The era when Windows supported a maximum of 4 gigs of RAM (simply due to the number of bits available for addressing!).
During the 4 gigs and 32-bit era, Windows would usually split your RAM into two halves - User Space and System Space. And it’d split it in a 50:50 ratio. So your program, along with all other programs out there, gets 2 gigs of RAM and Windows uses the other 2 gigs for, you know, the Kernel, Drivers and all the other things going on under the hood.
The question then becomes: how the heck do you start a Chrome browser with 2 gigs of RAM, right? Not to mention Chrome and, god forbid, a Calculator. The answer is simple - you don’t. Or at least - you couldn’t. And that sucked because some people really wanted to use Chrome.
So how do you solve this unsolvable problem? Well, you lie. Just like I lied a bit when I told you that addresses reference physical RAM, Windows lies a little bit too and allows you to ask for more memory than it actually has. And this white lie of Windows is called - Virtual Addressing.
Windows will allow you to allocate, say, 5 gigs of RAM in order to fire up two Chrome tabs (e.g. for GMail and LinkedIn) and it will happily provide you with memory space. But here’s the deal - some of those bits will end up on your hard drive (a process called “Paging Out”; more about it in future articles). So you keep working with these Virtual Addresses, thinking you have 5 gigs at your disposal, but what happens in the background is that the Operating System keeps juggling those bits back & forth between actual RAM and your Hard Drive. Cool stuff!
You can have contiguous ranges of addresses - objectively speaking, there is practically NO way you’d get a large contiguous range of addresses from RAM, simply due to fragmentation that happens over time. So if you were working with ACTUAL physical addresses - that would be a no-go. Forget about integer arrays where the next element is 4 bytes away from the current one. Ain’t gonna work.
But when you use made-up addresses, you can do whatever the heck you want! So you ask for 2 gigs of RAM, and you get a 2 gigs contiguous block back. In the background it’s 99% NOT contiguous and some of it is likely sitting on your hard drive, but that’s somebody else’s problem!
Stricter process isolation - just imagine a scenario where each process could write to any PHYSICAL memory address. You don’t have to be a security expert to see that a malicious process would try and scan the whole RAM immediately, right? I mean, sure, you could task your OS to track what belongs to whom, but once you end up in shared memory waters, shit becomes serious. And why bother?
If you provide each process with a virtual address space starting at 0x000 and going all the way to 0xFFF…FFFF, all you need to do is track the mappings of each process itself. Easy peasy (in a way!).
Finally, and I know I haven’t introduced it officially yet (and I’ll do that in the next articles!) but there are TWO modes of operation for each process in Windows. One is “User Mode” and the other is “Kernel Mode”. User Mode is where all your user-started processes execute, whereas Kernel Mode is where all your drivers and Windows itself operate. Obviously, that’s an oversimplification, but let’s assume it to be like that for now.
The difference between these two (again, an over-simplification!) is that each Process running in User Mode sees only its own Virtual Address space. Whichever memory location you ask for from inside the User process, Windows checks whether you allocated that area, and if you did - you get it back. If not - well, segmentation fault (an “access violation” in Windows terms). Oh, and by the way - there’s NO way for a user process to even ask for Virtual Addresses in Kernel space!
On the other hand, processes running in System Mode (Kernel Mode, whatever) actually see a SINGLE & UNIFORM Virtual Address space. They can ask for ANY address which, when mapped, could belong either to a process running in User Mode or to bits allocated by something else running in System Mode. Point being - anything running in System Mode can see and modify ANYTHING in the RAM!
And that’d be about it for today :) I hope you found that useful, and if you did - I’d appreciate it if you shared it with your network!
Next time I will be talking about Virtual Memory functions and VirtualAlloc(). Until then!
Other articles from the C++ Memory Management series: