Hi there! 👋
It’s Mixa again with one more article that digs into Container Images (also the last one, I promise!). This time I want to talk about something that I found fascinating the moment I figured it out - Image Layers. Let me give you the bite-sized picture first and then I’ll add some additional details ;)
So, what’s the deal with all of it? If you recall that Containers are like The Truman Show, you know that from inside the container it looks like you have your own File System. A fully functional File System. And actually you do! That File System is built using a technique called Union Mount.
In a nutshell, Union Mount is a really simple idea. This is an obvious oversimplification but think of it as combining multiple files & folders (i.e. taking a union of them) and creating a unified file system out of them. So, take Windows OS Files + Some Random App that you install + Your app’s bits and the end result would be what you see inside of a container.
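To make the "union" part a bit more tangible, here is a tiny Python sketch. It is just a toy model and not how any real runtime does it: each layer is a plain dict from path to content, and all layer names and paths are made up for illustration.

```python
# Toy model: a layer is a dict mapping a file path to its content.
# All names and paths here are hypothetical, for illustration only.
base_os = {
    "/Windows/System32/kernel32.dll": "os bits",
    "/config.ini": "os default",
}
random_app = {"/dotnet/dotnet.exe": "runtime bits"}
my_app = {
    "/app/app.dll": "my app bits",
    "/config.ini": "app override",
}


def union(layers):
    """Merge layers bottom to top; a path in a higher layer hides the same path below."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


# What the container "sees" as its file system.
rootfs = union([base_os, random_app, my_app])
for path in sorted(rootfs):
    print(path, "->", rootfs[path])
```

Note that the order matters: "/config.ini" exists in two layers, and the one from the topmost layer wins.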
Now, a reasonable question to ask would be - why? Why would you do that? There are a couple of reasons:
Caching & Layer reuse - if you containerize ten different apps, and all of them are based on the same Windows base image, it makes sense to reuse those files instead of copying them every time.
Ephemerality - it’s exactly this layering technique that allows containers to be ephemeral. Before unionizing your layers into a single file system, the Container Runtime will add one more layer on top and use that one for writing. The image layers below it stay read-only. This way, whatever you write inside the container stays on this top layer, which is simply thrown away when the container is removed.
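Both reasons can be sketched in a few lines of Python. Again, this is a toy model under my own assumptions (layers are dicts, and `start_container` is a hypothetical helper, not a real runtime API). It leans on `collections.ChainMap`, which reads through a stack of dicts but sends every write to the first one.

```python
from collections import ChainMap

# Read-only image layers, bottom first. Paths are made up for illustration.
image_layers = [
    {"/os/kernel.bin": "os bits"},
    {"/app/app.dll": "app bits"},
]


def start_container(layers):
    """Hypothetical helper: stack a fresh, empty writable layer on top of shared layers."""
    writable = {}
    # ChainMap searches its maps left to right, so the topmost layer goes first.
    return ChainMap(writable, *reversed(layers))


first = start_container(image_layers)
first["/app/log.txt"] = "written at runtime"
first["/app/app.dll"] = "patched"  # lands in the writable layer, not in the image

# A second container reuses the very same image layers and sees none of the writes.
second = start_container(image_layers)
print(first["/app/app.dll"])     # the first container sees its own change
print(second["/app/app.dll"])    # the image layer itself is untouched
print("/app/log.txt" in second)  # the runtime write never reached the image
```

Dropping `first` throws away its writable dict and nothing else, which is the whole trick behind ephemerality in this model.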
Cool thing about all of this is that the OCI Image Spec clearly defines HOW this should work (i.e. how Adding, Removing & Combining of files should work). In a nutshell - to add a file, you simply include it in any of the layers you want to combine. Removal is a bit trickier: a lower layer can’t be changed, so a higher layer has to record the deletion using something called Whiteouts. It sounds mystic but it’s really simple - the REMOVAL of a file is represented as the ADDITION of an empty file whose name is the original file name with a ‘.wh.’ prefix. For example, removing app/debug.log means adding app/.wh.debug.log.
Another thing worth mentioning is content addressability. In the previous article where I discussed the content of exported image, I pasted this screenshot:
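Here is how applying layers with whiteouts could look in Python. Treat it as a minimal sketch under my own assumptions: layers are dicts from path to content, `apply_layer` is a hypothetical helper, and special cases from the spec such as opaque whiteouts are not handled at all.

```python
import posixpath

WHITEOUT_PREFIX = ".wh."


def apply_layer(rootfs, layer):
    """Apply one layer (dict of path -> content) on top of rootfs, in place."""
    # Whiteouts only hide entries coming from lower layers, so handle them first.
    for path in layer:
        directory, name = posixpath.split(path)
        if name.startswith(WHITEOUT_PREFIX):
            target = posixpath.join(directory, name[len(WHITEOUT_PREFIX):])
            # Remove the target and, if it was a directory, everything under it.
            for existing in list(rootfs):
                if existing == target or existing.startswith(target + "/"):
                    del rootfs[existing]
    # Then add the regular files; the whiteout markers themselves never show up.
    for path, content in layer.items():
        if not posixpath.basename(path).startswith(WHITEOUT_PREFIX):
            rootfs[path] = content
    return rootfs


layers = [
    {"/app/app.dll": "v1", "/app/debug.log": "noise", "/tmp/cache/a": "x"},
    {"/app/.wh.debug.log": "", "/.wh.tmp": ""},  # "delete" a file and a directory
]

rootfs = {}
for layer in layers:
    apply_layer(rootfs, layer)

print(sorted(rootfs))  # prints ['/app/app.dll']
```

The first layer still physically contains debug.log, which is why deleting a file in a later layer does not make the image any smaller.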
This blobs directory holds the layers among other things (in this case it’s just one layer - the biggest file that you see) and this layer is actually the “base layer” (or “base image” or whatever you want to call the very first layer) and that’s the Windows OS.
But where do these weird hash-looking names come from? Turns out they ARE hashes. Digests, to be more precise. All content that you can find inside Container Images (and again, if you missed the previous article on Image Content, do make sure to check it out) follows the Content-addressable storage (CAS) principle. This principle dictates that any content should be named after a hash of its bytes. Hence running sha256 against the layer blob that holds the Nanoserver version of Windows will produce the 25550… hex output, and that’s what uniquely identifies the content inside. Again, this is something that is clearly defined in the OCI Image Specification under the Descriptors section.
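The CAS idea fits into a few lines of Python. This is only an illustrative sketch: the in-memory `store` dict and the `put` helper are my own stand-ins for the blobs directory, and the sample bytes are obviously not a real layer.

```python
import hashlib

# Hypothetical in-memory stand-in for the blobs directory: digest -> bytes.
store = {}


def digest(data: bytes) -> str:
    """Name content after the SHA-256 hash of its bytes, in 'sha256:<hex>' form."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def put(data: bytes) -> str:
    """Store a blob under its digest and return that digest."""
    name = digest(data)
    store[name] = data  # identical content always lands under the same name
    return name


layer_bytes = b"pretend these are the bytes of a layer tarball"
first = put(layer_bytes)
second = put(layer_bytes)  # same bytes, same name, nothing stored twice

print(first)
print(first == second, len(store))

# Anyone holding the digest can verify that the content was not tampered with.
assert digest(store[first]) == first
```

A nice side effect is that two images sharing a base layer end up pointing at the same blob name, which is what makes the layer reuse from earlier cheap to detect.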
Finally, if you are wondering which components actually do these Union Mounts, it turns out that Docker, at least, has them under “Storage Drivers”. For Linux, “overlay2” is used by default, whereas for Windows it turns out to be something called “windowsfilter”.
And that’d be about it for today. Next time I’m going to focus a bit more on answering the “Why use Containers?” and then will likely proceed to discuss some intricacies of how Docker fits into this world.
Until then, if you liked this article or learned something new, you would help me tremendously if you share it with others :) Obviously, you can also subscribe if you haven’t already.
Other articles in the Container series:
Deep-dive into exported Container Image content (Part 9 of Container series)
What's inside the Container Image? (Part 8 of Container series)
What is Container Network Interface (CNI)? (Part 7 of Container series)