The dreaded B-Tree ...
Chronicles of DB Indexes - Part 3
Oh the dreaded B-Tree index. Everyone who created a table at least ONCE in they life surely saw the "Index Type" where by default the "B-Tree" is selected. Some might even have clicked that dropdown and saw that there's also a Hash index; maybe something else as well. But I'm sure almost nobody (myself included) never even cared to check what B-Trees are. We assumed they are Binary Tree, I guess. Oh the mistake!
It actually took me a while to digest everything about B-Trees. It doesn't help that, you know, there are B-Trees, B+Trees, and BTrees. The two latter ones sometimes being referred to as "B+-trees" and "B-trees". Yikes!
So it took me a while to digest the WTFs. And even funnier - once you do digest, the first question becomes - well why the heck don't we simply use Binary Trees instead?!
There's a lot to this story. BUT, and hear me out, the story is actually quite amazing and makes a lot of sense! And I'll dig into all the details next week, so be patient. I have to create that story for you first :)
So, the B-Trees. First of - they are called "B Trees"! They are NOT abbreviation but variation of Binary Tree. A variation that was literally ARCHITECTED for being stored and used on HDD (and we'll talk more about this next week!).
Binary Trees are quite think and they grow tall. But B-Trees try to pack as much data as possible so they grow wide (thick, if you will!). And all of this because you want to have as few disk I/Os as possible!
Finally, when it comes to SQL Server, B-Trees pack up to 8KBs of data. And yes, that is so because SQL Server's page size is 8KBs. So you want to have one node per page really :)
When it comes to Searching - the process is the same as with Binary Trees - you start from root, compare your values and progress left or right. Rinse&repeat until you reach destination (or leaf and figure out there's no destination available!). However, their main advantage is that even for huge tables - they might have up to 2-3 levels of depth max! And that's quite in contrast to Binary Trees that grow in depth almost linearly as your data does!
All in all, quite a cool topic to be explored, especially once I give you some bits about how HDD works (hint: it's like a crane trying to move cargo containers around; it's bulky and slow!). But I'll save that for next week!
Thanks for reading Bitesized Engineering! Subscribe for free to receive new posts and support my work.