Hey hey!
You know what's interesting? I initially created THREE infographics on this topic, because I figured there's a lot to unfuddle here. But as luck would have it, I didn't have much time to wrap them up, so I let them air-dry for a couple of days.
Well, during this morning's shower (yes, most of these are "created" during shower time), the thought hit me - why the heck would anyone want to read THREE graphics on strongly-named assemblies? It can and should be compressed into a single one, and if anyone wants to explore more than that - they'll be equipped with the knowledge to do so.
So yeah, lo and behold, I bring you - ONE image :) And as usual - the graphic goes first and then more details will follow:
Alright, cool! I hope that gave you an idea?
You know, in my opinion, "strongly-named" is just a mouthful for a very simple concept - a file that is (digitally) signed. And yeah, "weakly-named" is not an official name, but I "stole" it from CLR via C# (amazing book, btw!).
So what's the whole deal? Well, the deal is really simple - anyone could make a malicious assembly, fill it up with a bunch of malware, name it System.dll and find a way to store it in a shared location on your system where the .NET runtime would pick it up. Or rather, anyone could do this IF there was no way to verify whether the DLL is legit. Luckily, there is such a way, and it has been well known for a while now - digital signing.
Let's for a moment assume that you are Microsoft and you want to ship a new version of System.dll. I want to emphasize that, even though I am currently employed by Microsoft, my day-to-day job has NOTHING to do with .NET Framework development. I work in Azure SQL MI. This scenario is 100% made up in my head in order to showcase a theoretical process:
First things first -> I'd update the code with whatever I wanted to change in System.dll and I'd build a new version of it. At this point I have a "weakly-named" assembly - it's just a DLL without any info on identity and ownership that could be validated.
Next, I'd take all the bytes that make up my DLL and run them through the SHA256 hashing algorithm. You can think of it as:
hash = SHA256(file_content)
Now that I have a fixed-size hash (hash functions ALWAYS produce fixed-size output, no matter how big the input is), the next thing I do is take a Private Key that only a couple of key keepers at Microsoft have access to. It's named "private" for a reason - access to it is severely restricted. What I do is encrypt the file's hash with this private key. Think of it as doing:
ENCRYPT(file_content_hash, private_key)
The output of the previous operation is what is called a DIGITAL SIGNATURE - it's a hash of my file's content, encrypted with the key that only I know about. Next, I add this digital signature to the assembly as well.
Finally, I add Microsoft's PUBLIC KEY alongside this signature. The Public Key is public for a reason - anybody can have it!
At this point, I've embedded the digital signature and the public key, and together with the other info in this file (name of the assembly, version, culture, etc.) this makes the assembly STRONGLY-named, meaning anybody can verify that it was really created by Microsoft.
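To make the hashing step from the walkthrough a bit more tangible, here is a minimal Python sketch. The byte strings are made-up stand-ins for real DLL contents (an assumption for illustration only); the point is just that the digest has the same size for any input, and that changing a single byte changes it.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()

# Made-up stand-ins for DLL contents of very different sizes.
tiny_file = b"MZ"
big_file = b"MZ" + b"\x00" * 1_000_000

# Same as big_file, except for one changed byte.
tampered_file = b"MX" + b"\x00" * 1_000_000

print(len(sha256_hex(tiny_file)))   # 64 hex characters = 32 bytes
print(len(sha256_hex(big_file)))    # 64 hex characters again
print(sha256_hex(big_file) == sha256_hex(tampered_file))  # False
```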
But the logical question to ask is - HOW can anybody verify it? And the answer is simple - by using the Public Key.
The thing is, if you don't know how Private/Public key crypto works, this all sounds like a bunch of complicated crap. Let me assure you that it's REALLY SIMPLE and I'll spend the whole next article explaining this topic.
For now, think of it this way - you have a Private and a Public key. The former you must keep super-secret, and the latter you can print on your wall if you want. But here's the trick - whatever you ENCODE with the Private Key can only be DECODED with the Public Key! You CAN NOT decode it with the Private Key. It works one way only!
The same is true for the Public Key - if you encode something with the Public Key, the only way to decode it is by using the matching Private Key :) It's the beauty of math at its finest.
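If you want to see that two-way trick with actual numbers, here is a rough Python sketch using textbook RSA with two tiny primes. The primes, the exponent and the message are all my own picks for illustration (real keys use enormous primes plus padding), so treat it as a toy, not as how .NET does it.

```python
# Toy key pair built from two tiny primes. Real keys use primes that are
# hundreds of digits long; these numbers are for illustration only.
p, q = 61, 53
n = p * q                    # 3233, shared by both keys
phi = (p - 1) * (q - 1)      # 3120
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent (needs Python 3.8+)

message = 65                 # any number smaller than n

# Encode with the private key, decode with the public key.
sealed = pow(message, d, n)
opened = pow(sealed, e, n)

# Applying the private key a second time does NOT give the message back.
wrong = pow(sealed, d, n)

# The other direction: encode with the public key, decode with the private key.
locked = pow(message, e, n)
unlocked = pow(locked, d, n)

print(opened == message, wrong == message, unlocked == message)  # True False True
```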
So what this all means is that, if you want to verify the genuineness of System.dll, you'd take the file's bytes and run them through the SHA256 hashing algorithm. Then you'd take the digital signature (which, if you remember, is the file's hash encoded with Microsoft's Private Key), decode it with the public key that is embedded in the assembly, and compare the two. If the hashes are IDENTICAL - the file is genuine and hasn't been tampered with. If there is a mismatch - the file has been tampered with!
In terms of pseudo-code, that would be:
# On Microsoft side
SIGNATURE = ENCRYPT(SHA256(FILE_CONTENT), PRIVATE_KEY)
EMBED(SIGNATURE in DLL)
EMBED(PUBLIC_KEY in DLL)
# On Client side
FILE_HASH = SHA256(FILE_CONTENT)
ORIGINAL_HASH = DECRYPT(SIGNATURE, PUBLIC_KEY)
IF FILE_HASH == ORIGINAL_HASH : file is Genuine
ELSE : file has been tampered with
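The pseudo-code above can be turned into a small runnable Python sketch. Big caveat: the key pair is a toy one that I build from two Mersenne primes just to stay inside the standard library, the hash is squeezed to fit that small modulus, and there is no padding. The names `sign`, `verify` and `file_hash` are mine, not any real .NET API, so read it as an illustration of the flow and nothing more.

```python
import hashlib

# Toy key pair from two well-known Mersenne primes. An assumption made only
# to keep the sketch dependency-free; far too small for real-world use.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (Python 3.8+)

def file_hash(content: bytes) -> int:
    """SHA-256 of the content as an integer, reduced to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(content).digest(), "big") % N

def sign(content: bytes) -> int:
    """Publisher side: 'encrypt' the hash with the private key."""
    return pow(file_hash(content), D, N)

def verify(content: bytes, signature: int) -> bool:
    """Client side: 'decode' the signature with the public key and compare."""
    return pow(signature, E, N) == file_hash(content)

dll_bytes = b"pretend these are the bytes of System.dll"
signature = sign(dll_bytes)

print(verify(dll_bytes, signature))                  # True
print(verify(dll_bytes + b" + malware", signature))  # False
```

Notice that the client side never touches `D`. Everything it needs (the content, the signature and the public half of the key) travels with the file.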
But how do you know if the public key is genuine, right? Well, that's where Certificate Authorities come into play. They are middlemen, trusted by everyone, whose only job is to stay safe and make guarantees about who owns which public key. But that's a story for next time. For now just take my word that there is a simple way to validate whether a Public Key belongs to Microsoft, and that is done by asking the "Trusted Authority".
Long story short - the combo of these four (Assembly Name, Version, Culture, Public Key) is what uniquely identifies an assembly and makes it a strongly-named one. Strongly-named as in - it can be UNIQUELY identified. Remove the public key from the equation and you have a weakly-named assembly.
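One way to picture that four-part identity is as a tiny value type. This Python sketch is purely a mental model: the class name, the field names and the placeholder key bytes are all made up by me and do not mirror any real .NET type.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class AssemblyIdentity:
    name: str
    version: str
    culture: str
    public_key: Optional[bytes] = None   # None models a weakly-named assembly

    @property
    def is_strongly_named(self) -> bool:
        return self.public_key is not None

# Placeholder key bytes, not real keys.
genuine = AssemblyIdentity("System", "4.0.0.0", "neutral", b"publisher-key")
impostor = AssemblyIdentity("System", "4.0.0.0", "neutral", b"attacker-key")
weak = AssemblyIdentity("System", "4.0.0.0", "neutral")

print(genuine == impostor)      # False: same name, different public key
print(weak.is_strongly_named)   # False
```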
Last but not least - GAC. As you likely know, the Global Assembly Cache (GAC) is a place where you store DLLs that you want to share. And that's where all of .NET's DLLs reside. You can, of course, add your own assemblies there, but there's a hard requirement that any assembly going into the GAC must be strongly-named. This makes sense because the GAC has the HIGHEST priority when the CLR starts searching for an assembly to load. If an assembly is in the GAC - everything else will be ignored, and this is by design.
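Here is a toy Python model of the two GAC rules described above: only strongly-named assemblies get in, and the GAC wins over everything else at lookup time. `ToyGac`, `resolve` and the file paths are hypothetical names of mine, and the real CLR lookup has far more steps than this sketch suggests.

```python
from typing import Dict, Optional, Tuple

# (name, version, culture, public_key); public_key is None when weakly named.
Identity = Tuple[str, str, str, Optional[bytes]]

class ToyGac:
    def __init__(self) -> None:
        self._store: Dict[Identity, str] = {}

    def install(self, identity: Identity, path: str) -> None:
        # Hard requirement: weakly-named assemblies are rejected.
        if identity[3] is None:
            raise ValueError("only strongly-named assemblies can go into the GAC")
        self._store[identity] = path

    def find(self, identity: Identity) -> Optional[str]:
        return self._store.get(identity)

def resolve(identity: Identity, gac: ToyGac, app_folder: Dict[Identity, str]) -> Optional[str]:
    """Check the GAC first and fall back to the app folder only on a miss."""
    return gac.find(identity) or app_folder.get(identity)

system_id: Identity = ("System", "4.0.0.0", "neutral", b"publisher-key")
local_id: Identity = ("MyLib", "1.0.0.0", "neutral", None)

gac = ToyGac()
gac.install(system_id, "gac/System.dll")
app_folder = {system_id: "app/System.dll", local_id: "app/MyLib.dll"}

print(resolve(system_id, gac, app_folder))  # gac/System.dll
print(resolve(local_id, gac, app_folder))   # app/MyLib.dll
```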
And that's about it for today :) Next time I'm going to talk a bit about Cryptography and Private/Public keys and how they work. After that I'm likely moving either towards Assembly Metadata Tables (incredibly cool topic btw!) or towards the Stacks and Heaps in .NET. Until then, if you haven't already, do consider subscribing as there's a lot more cool stuff to come!
Thanks for reading!