If you are a self-taught programmer you hit a point in your career where Big-O Notation bites you. Either in a programming interview or in a given problem in the work place like N+1 SQL queries. In falling into this trap if you will, I’ve taken some strides into trying to grapple with the beast (Big-Oh), without having to delve into the intricacies of the math involved. This is my journey and a dissemination of information which I hope could help you too, dear reader.
So the first question, what is Big-O Notation:
Basically it’s used to determine how well a computer algorithm scales as the data increases. More explicitly it measures the efficiency of an algorithm based on the time it takes for the algorithm to run as a function of the input size.
Big-O notation has a symbol notation, listed below in tabular form:
Notation |
Name |
O(1) |
constant |
O(log(n)) |
logarithmic |
O((log(n))c) |
polylogarithmic |
O(n) |
linear |
O(n 2 ) |
quadratic |
O(n c ) |
polynomial |
O(c n ) |
exponential |
Don’t be thrown off just yet, just get your eye used to the format.
O(1) is said to be, Order of 1.
O(n) is said to be, Order of N and so on.
What is the benefit to knowing Big-O Notation?
We just mentioned two reasons at the start of this discussion, but a driving factor is being able to compare algorithms and decide on which one is more efficient and thus the best to implement. Big-O Notation gives us some basis for measuring the efficiency.
Apply Big-O Notation with some loose rules of thumb:
- Big-O Notation cares about the worst-case scenario (in which the algorithm is run).
- Single values or single iterations of a value over a worst-case scenario often don’t have that much of a bearing to consider
- Get to the heart of the complexity (either a comparison or loops), then use that to determine worst-case
- n could be the actual input, or the size of the input
- When comparing algorithms, throw-away the constants (or things that don’t matter over time)
- Big-O Notation is concerned with three ideas. The number of steps converted to a formula, then only the highest power of n is used to represent the entire algorithm
- Determine what you trying to optimise for, for example; time or memory.
- Time can also be considered as the number of steps it takes to complete a problem
- When we are trying to find the complexity of a function/procedure/algorithm, we are not interested in the exact number of operations that are being performed. Instead we are interested in the relation of the number of operations to the problem size.
Working examples:
O(1) – execution time of the algorithm is constant.
FIELD_LIMIT = 24 def valid_length?(input_array) input_array.length == FIELD_LIMIT end
In the example, you can see that despite that the input size of the array may be, the operation is independent, only focusing on the length of the input, as opposed to it’s actual contents.
O(n) – execution tile will grow linearly depending on the input size.
def dump_it(customers_array) for customer_name in customers_array do puts "#{customer_name}" end end
In this example, you can see that there is a direct relationship to the input and the number of steps or rather the time taken. This growth is considered linear. If the input is an array of 100 elements, then the algorithm (however simple) will print out 100 lines of output. If the input is an array of 100 000 elements, then so will the output match.
*Hint – there is only one iteration
O(n^2) – for each increment of the input size, the time of the algorithm will double
def print_chessboard(size) for i in (1..size) for j in (1..size) print '*' end puts end end
In this example, you can see there is a nested loop used to print out just the general form of a chessboard (no indicators for black or white). If the size of chessboard to be printed out is 2, then there will be 4 asterisks (*). If the size is then 6, then there will be 36 asterisks. If the size requested of board is 8, then there will be 64 asterisks. The pattern should be clear here of n^2. That is, 2^2 = 4. 6^2 = 36. 8^2 = 64
O(2n) – for each increment of the input size, the time of the algorithm will grow times two
def dump_it(customers_array) for customer in customers_array do puts "#{customer[:name]}" end for customer in customers_array do puts "#{customer[:country]}" end end
Lastly in this example, we see two separate loops, one for the printing out of customer names, and the other for printing out customer locations (which city they are in).
The one thing about Big-O Notation is that it takes some time to get, and as such there may have been things repeated with intent. But if this has not assisted, here are some really good links:
- History and a more mathematical perspective of Big-O
- Brief overview of Big-O from a Rubyist
- Programmer Interview Big-O Notation
- Big-O explained beautifully
- Beginners Guide To Big-O
- StackOverflow Explanation
- Very handy common cases
- Big-O cheatsheet
- Ruby Algorithms and Data Structures