The Big-O-Notation - Part #3Filed Under: Weekly Tuesday Dose of goodness
- The Big-O-Notation - Part #1
- The Big-O-Notation - Part #2
- The Big-O-Notation - Part #3
Hi all,
Today I’m going to wrap up the Big-O-Notation with a summary of how to use, when to use and what to look out for in each of the data structures using the Big-O-Notation.
This, therefore will be a short article compared to the previous articles.
If you can’t stand the long and wordy explanations of the previous 2 articles, then this article will be much more useful and direct for your needs.
To know more - read on…
Right, let’s cut the chase. Here’s the table that shows the performance of each of the basic data structures.
| Data Structure / Operations | Insertion | Arbitrary Access | Arbitrary Removal | Iteration to Next element |
| Array (or std::vector<T>) | O(c) | O(c) | O(n) | O(c) |
| Linked List | O(c) | O(n) | O(n) | O(c) |
| Binary Tree | O(log 2 n) | O(log 2 n) | O(log 2 n) | O(log 2 n) |
Knowing this isn’t enough. The essence of applying the right data structure based on the Big-O-Notation is in the technical requirements it’s used for. Take a look.
| Data Structure / Technical Requirements | Index Access | Key-Value Access | Sorted List | Ordered(in Sequence) List | Internal List Integrity* | Removal During Iteration |
| Array | Good | Bad | Manual | Possible, with some efforts | N/A | Bad |
| Linked List | Bad (but it’s the only way) | Bad | Bad | Good | Good | Good |
| Binary Tree | Bad | Good | Good | Internally impossible, iteratively possible | Good | Good |
* List Integrity refers to the internal node structure. This doesn’t apply to arrays.
Next, I have a few examples on how this table can apply to game development.
| Technical Game Requirements | Objects requires identifier? | Recommended Data Structure To Be Used |
| Projectile list | No | Linked List |
| Fixed size access buffer | Yes, index-based | Vector/Array |
| Sorting Z-depth buffer for rendering | No | Binary Tree, Key is float, Value is buffer |
| Retrieving an entity by name | Yes, string-based | Binary Tree, Key is string, Value is entity object |
| Known size, requires quick direct access | Yes, index-based | Vector/Array |
| Unknown size, a lot of insertions, a lot of iterations and a lot of iterations by removal | Yes, index-based but not necessary | Linked List |
| Random insertions at random timings, sorting not required, uses a string as a key | Yes, string-based | Binary Tree |
| Event list with quick insertion, quick access, and quick removal during iteration | Yes, string-based | Consider using BOTH Linked List and Binary Tree to store the same set of data. Careful manipulation required for synchronization. |
Perhaps there’s at least one more data structure which one might have heard out there but never mentioned here. That is - Hash Map.
Hash Map
What’s a hash map? First of all, they’re two separate components working together to form a data structure. A map in our case is also known as a binary tree. A hash on the other hand, is an algorithm that derives at an index.
A hash map also has a size known as buckets. This is not to be understood as the data structure size. To understand what’s buckets, we need to see the entire composition of a hash map.
Hash Map = Hash Algorithm + An Array of Binary Trees
The term bucket comes from the array of binary trees. The bucket size specifies how big the array should be when it’s created initially. Why? So that the hash algorithm is likely to assign different objects to different a bucket index instead of dumping everything into a single bucket, which is a single binary tree.
The strategy here is to make the hash algorithm derive an index based on the key component. Typically, a hash algorithm itself contains at least 2 parts.
1) The unique hash provider
2) The hash index provider
Who provides the unqiue hash ID? The objects themselves! This is the reason why in Java, you’ll see methods like .getHashCode() from any objects. As to how the hash code is being calculated, it depends very greatly on the contents of the object itself. A simple example would be “ABCDE” means 65+66+67+68+69 = 335. (ASCII sum of all the characters). The same “ECDBA” will also yield the same hash code but it’s ok, the chances of having the same hash code is relatively negligble.
So, the next step would be to pass this hash code into the hashing algorithm. The simplest form of hashing would be n % bucket size. Let’s assume that the bucket size is 10 and the hash code is 335. The final resting place of this object will be in [5].
Do note, the modulus operation is by nature, deriving a number that is 1 below the maximum number, therefore automatically ensuring that the bucket index is never more than or equals the bucket size or lesser than 0.
Advantages/Disadvantages
Obviously, with such an approach, one is able to reduce the O (log 2 N ) delay time to a lower N instead for large number of objects.
However, the same approach has a lot of constant O ( c ) costs and may not be efficient enough to justify its usage for a small group of objects.
Hash Maps are good for objects that are plentiful in numbers, ie, 100,000 objects and they’re being named exclusively to represent themselves. The main objective is merely to reduce the amount of search traversals required. The price is that the hash algorithm as a whole must be fast enough in the first place to offset the traversal delays. Otherwise, it’ll not be worthwhile to use Hash Map afterall.
I hope this has been a good ride for many.
From next week onwards, I’ll focus on how to get things done with Strides Octopus by replacing articles with tutorials. Do voice out your opinions should you want to see more articles being posted instead.
Till then, I’m signing off now!
Regards,
Jeremy
- Permalink
- Admin
- 22 Sep 2009 11:53 PM
- Comments (2)
September 6th, 2010 at 9:08 pm
Just thought i’d comment and say neat design, did you code it yourself? Looksexcellent.
September 23rd, 2011 at 9:14 am
found your site on bing today and really liked it. i bookmarked it and will be back to check it out some more later.