Hey hey! 👋
You know what’s interesting? I initially created THREE infographics on this topic, and that’s because I figured there’s a lot to unfuddle here. But as luck would have it, I didn’t have that much time to wrap them up, so I let them air-dry for a couple of days.
Well, during this morning’s shower (yes, most of these are ‘created’ during shower-time), the thought hit me - why the heck would anyone want to read THREE graphics on Strongly-named assemblies? It can and should be compressed into a single one, and if anyone wants to explore more than that - they’ll be equipped with the knowledge to do so.
So yeah, lo and behold, I bring you - ONE image :) And as usual - the graphic goes first and then more details will follow:
Alright, cool! I hope that gave you an idea?
You know, in my opinion, Strongly-named is just a mouthful for a very simple concept - a file that is (digitally) signed. And yeah, Weakly-named is not an official name, but I “stole” it from CLR via C# (amazing book, btw!).
So what’s the whole deal? Well, the deal is really simple - anyone could make a malicious assembly, fill it up with a bunch of malware, name it System.dll and find a way to store it in a shared location on your system where the .NET runtime would pick it up. Well, anyone could do this IF there was no way to verify whether the DLL is legit. Luckily, there is such a way, and it’s something that has been well known for a while now - Digital Signing.
Let’s for a moment assume that you are Microsoft and you want to ship a new version of System.dll. I want to emphasize that, even though I currently am employed by Microsoft, my day to day job has NOTHING to do with .NET framework development. I work in Azure SQL MI. This scenario is 100% made up in my head in order to showcase a theoretical process:
First things first —> I’d update the code with whatever I wanted to do in System.dll and I’d build a new version of it. At this point I have a “weakly-named” assembly - it’s just a DLL without any info on the identity and ownership that could be validated.
Next, I’d take all the bytes that make up my DLL and run them through the SHA256 hashing algorithm. You can think of it as
hash = SHA256(file_content)
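To make the hashing step a bit more tangible, here is a minimal Python sketch. The byte strings are just stand-ins for real DLL content and the helper name `sha256_hex` is made up for this example. It shows that the digest has the same length whether the input is two bytes or a few megabytes:

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()

tiny_file = b"hi"                    # 2 bytes (stand-in for a file)
large_file = b"\x00" * 5_000_000     # about 5 MB of zero bytes

print(len(sha256_hex(tiny_file)))    # 64 hex characters = 32 bytes
print(len(sha256_hex(large_file)))   # 64 again, regardless of input size
```

Changing even a single byte of the input gives a completely different digest, which is what makes the hash useful as a fingerprint of the file.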
Now that I have a fixed-size hash (hash functions ALWAYS produce fixed-size output, no matter how big the input is), the next thing I do is take a Private Key that only a couple of key keepers in Microsoft have access to. It’s named “private” for a reason - access to it is severely restricted. What I do is encrypt the file’s hash with this private key. Think of it as doing
ENCRYPT(file_content_hash, private_key)
The output of the previous operation is what is called a DIGITAL SIGNATURE - it’s a hash of my file’s content, encrypted with the key that only I know about. Next, I add this digital signature to the assembly as well.
Finally, I add Microsoft’s PUBLIC KEY alongside this signature as well. Public Key is public for a reason - anybody can have it!
At this point, I embedded the digital signature and public key, and alongside other info in this file (name of assembly, version, culture, etc.) this makes the assembly STRONGLY-Named, meaning - anybody can verify that this was really created by Microsoft.
But the logical question to ask is - HOW can anybody verify it? And the answer is simple - by using the Public Key.
The thing is, if you don’t know how Private/Public key crypto works, this all sounds like a bunch of complicated crap. Let me assure you that it’s REALLY SIMPLE and I’ll spend the whole next article explaining this topic.
For now, think of it this way - you have a Private and a Public key. The former you must keep super-secret, and the latter you can print on your wall if you want. But here’s the trick - whatever you ENCRYPT with the Private Key can only be DECRYPTED with the Public Key! You CAN NOT decrypt it with the Private Key. It works one way only!
The same is true for the Public Key - if you encrypt something with the Public Key, the only way to decrypt it is by using the matching Private Key :) It’s the beauty of math at its finest.
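If you want to see that one-way behaviour with your own eyes, here is a small Python sketch using textbook RSA. Big assumption: this is a TOY key pair built from two well-known Mersenne primes so that the example is deterministic and needs no extra libraries. It is not secure and it is not how real keys are generated. The names `apply_key`, `N`, `E` and `D` are made up for this illustration, and `pow(E, -1, ...)` needs Python 3.8 or newer.

```python
# Toy RSA key pair. NOT secure: the primes are publicly known Mersenne primes,
# picked only to keep the example deterministic and dependency-free.
P = 2**521 - 1
Q = 2**607 - 1
N = P * Q                           # modulus, part of both keys
E = 65537                           # public exponent  -> "Public Key"
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent -> "Private Key"

def apply_key(value: int, exponent: int) -> int:
    """Textbook RSA: the same modular power is used in both directions."""
    return pow(value, exponent, N)

message = 123456789

# Lock with the Private Key: only the Public Key gets the message back.
locked = apply_key(message, D)
print(apply_key(locked, E) == message)   # True
print(apply_key(locked, D) == message)   # False

# Lock with the Public Key: only the Private Key gets the message back.
locked = apply_key(message, E)
print(apply_key(locked, D) == message)   # True
print(apply_key(locked, E) == message)   # False
```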
So what this all means is that, if you want to verify the genuineness of System.dll, you’d take the file’s bytes and run them through the SHA256 hashing algorithm. Then you’d take the digital signature (which, if you remember, is the file’s original hash encrypted with Microsoft’s Private Key), decrypt it with the public key that is embedded in the assembly, and compare the two hashes. If the hashes are IDENTICAL - then the file is genuine and hasn’t been tampered with. If there is a mismatch - the file has been tampered with!
In terms of pseudo-code, that would be:
# On Microsoft side
SIGNATURE = ENCRYPT(SHA256(FILE_CONTENT), PRIVATE_KEY)
EMBED(SIGNATURE in DLL)
EMBED(PUBLIC_KEY in DLL)
# On Client side
FILE_HASH = SHA256(FILE_CONTENT)
ORIGINAL_HASH = DECRYPT(SIGNATURE, PUBLIC_KEY)
IF FILE_HASH == ORIGINAL_HASH : file is Genuine
ELSE : file has been tampered with
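Here is roughly the same flow as a runnable Python sketch. Same caveat as any toy crypto: the key pair below is built from publicly known Mersenne primes and uses textbook RSA without padding, so treat it as an illustration of the idea only, not as what the .NET tooling actually does. The function names and the sample bytes are made up, and the signature is kept next to the file rather than embedded in it to keep things short.

```python
import hashlib

# Toy RSA key pair (NOT secure, for illustration only; needs Python 3.8+).
P, Q = 2**521 - 1, 2**607 - 1
N = P * Q
PUBLIC_EXPONENT = 65537
PRIVATE_EXPONENT = pow(PUBLIC_EXPONENT, -1, (P - 1) * (Q - 1))

def file_hash(content: bytes) -> int:
    """SHA-256 of the content, as an integer so RSA can operate on it."""
    return int.from_bytes(hashlib.sha256(content).digest(), "big")

def sign(content: bytes) -> int:
    """Publisher side: encrypt the file's hash with the private key."""
    return pow(file_hash(content), PRIVATE_EXPONENT, N)

def is_genuine(content: bytes, signature: int) -> bool:
    """Client side: recover the original hash with the public key, then compare."""
    original_hash = pow(signature, PUBLIC_EXPONENT, N)
    return original_hash == file_hash(content)

dll_bytes = b"pretend these are the bytes of System.dll"
signature = sign(dll_bytes)

print(is_genuine(dll_bytes, signature))                  # True
print(is_genuine(dll_bytes + b" + malware", signature))  # False
```

The second check fails because the tampered bytes hash to a different value than the one locked inside the signature, and without the private key nobody can produce a new signature that matches.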
But how do you know if the public key is genuine, right? Well, that’s where Certificate Authorities come into play. They are middlemen, trusted by everyone, whose only job is to stay safe and make guarantees about who owns which public key. Strictly speaking, the check described above only proves that the file matches the public key embedded in it; tying that key to a real-world owner is a separate step. But that’s a story for the next time. For now just take my word that there is a simple way to validate whether a Public Key belongs to Microsoft, and that is done by asking the “Trusted Authority”.
Long story short - the combo of these four (Assembly Name, Version, Culture, Public Key) is what uniquely identifies an assembly and makes it a Strongly-named one. Strongly-named as in - it can be UNIQUELY identified. Remove the public key from the equation and you have a weakly-named assembly.
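One way to picture that four-part identity is as a simple value type. This is a hypothetical Python model, not anything from the .NET API: the class name, the version string and the three-byte "keys" are placeholders. Two assemblies that agree on name, version and culture still count as different identities once their public keys differ:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AssemblyIdentity:
    name: str
    version: str
    culture: str
    public_key: bytes = b""   # empty means "weakly named" in this sketch

    @property
    def is_strongly_named(self) -> bool:
        return len(self.public_key) > 0

# Placeholder values, purely for illustration.
genuine = AssemblyIdentity("System", "4.0.0.0", "neutral", b"\x01\x02\x03")
impostor = AssemblyIdentity("System", "4.0.0.0", "neutral", b"\x09\x09\x09")
unsigned = AssemblyIdentity("System", "4.0.0.0", "neutral")

print(genuine == impostor)           # False: same name, different public key
print(genuine.is_strongly_named)     # True
print(unsigned.is_strongly_named)    # False
```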
Last but not least - GAC. As you likely know, the Global Assembly Cache (GAC) is a place where you store DLLs that you want to share. And that’s where all of .NET’s DLLs reside. You can, of course, add your own assemblies there, but there’s a hard requirement that any assembly going into the GAC must be strongly-named. This makes sense because the GAC has the HIGHEST priority when the CLR starts searching for an assembly to load. If the assembly is in the GAC - everything else will be ignored, and this is by design.
And that’s about it for today :) Next time I’m going to talk a bit about Cryptography and Private/Public keys and how they work. After that I’m likely moving either towards Assembly Metadata Tables (incredibly cool topic btw!) or towards the Stacks and Heaps in .NET. Until then, if you haven’t already, do consider subscribing as there’s a lot more cool stuff to come!
Thanks for reading!