If you are a self-taught programmer, you hit a point in your career where Big-O notation bites you: either in a programming interview, or in a problem in the workplace like N+1 SQL queries. Having fallen into this trap, if you will, I’ve taken some strides toward grappling with the beast (Big-O) without having to delve into the intricacies of the math involved. This is my journey and a distillation of information which I hope could help you too, dear reader.

So, the first question: what is Big-O notation?

Basically, it’s used to determine how well a computer algorithm scales as the data increases. More precisely, it measures the efficiency of an algorithm in terms of the time it takes to run as a function of the input size.

Big-O notation uses a standard set of symbols, listed below in tabular form:

Notation        Name
O(1)            constant
O(log n)        logarithmic
O((log n)^c)    polylogarithmic
O(n)            linear
O(n^2)          quadratic
O(n^c)          polynomial
O(c^n)          exponential

Don’t be thrown off just yet; just get your eye used to the format.

O(1) is read as “Order of 1”.

O(n) is read as “Order of N”, and so on.

What is the benefit to knowing Big-O Notation?

We just mentioned two reasons at the start of this discussion, but a driving factor is being able to compare algorithms and decide which one is more efficient, and thus the best to implement. Big-O notation gives us a basis for measuring that efficiency.

Apply Big-O Notation with some loose rules of thumb:

  1. Big-O Notation cares about the worst-case scenario in which the algorithm is run.
  2. Single values, or single iterations over a value, usually have little bearing on the worst case.
  3. Get to the heart of the complexity (a comparison or a loop), then use that to determine the worst case.
  4. n could be the actual input, or the size of the input.
  5. When comparing algorithms, throw away the constants (things that don’t matter as the input grows).
  6. Big-O Notation boils down to this: convert the number of steps into a formula, then use only the highest power of n to represent the entire algorithm.
  7. Determine what you are trying to optimise for; for example, time or memory.
  8. Time can also be thought of as the number of steps it takes to complete a problem.
  9. When we are trying to find the complexity of a function/procedure/algorithm, we are not interested in the exact number of operations being performed. Instead we are interested in how the number of operations relates to the problem size.
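
To make rules 5 and 6 concrete, here is a small sketch of my own (the step-counting helper is illustrative, not from the original post) that tallies the steps of an algorithm making two passes over its input:

```ruby
# Count the "steps" of an algorithm that makes two passes over the
# input plus one extra constant-time operation: 2n + 1 steps in total.
def count_steps(items)
  steps = 0
  items.each { steps += 1 } # first pass: n steps
  items.each { steps += 1 } # second pass: n steps
  steps += 1                # one constant-time step
  steps
end

puts count_steps(Array.new(10))  # 2*10 + 1 = 21
puts count_steps(Array.new(100)) # 2*100 + 1 = 201
```

The formula is 2n + 1, but after throwing away the constants (the 2 and the + 1), only the highest power of n remains: the algorithm is O(n).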

Working examples:

O(1) – execution time of the algorithm is constant.

FIELD_LIMIT = 24

def valid_length?(input_array)
  input_array.length == FIELD_LIMIT
end

In the example, you can see that whatever the input size of the array may be, the operation is independent of it, focusing only on the length of the input as opposed to its actual contents.

O(n) – execution time will grow linearly with the input size.

def dump_it(customers_array)
  for customer_name in customers_array do    
    puts "#{customer_name}"
  end
end

In this example, you can see that there is a direct relationship between the input size and the number of steps, or rather the time taken. This growth is considered linear. If the input is an array of 100 elements, then the algorithm (however simple) will print out 100 lines of output; if the input is an array of 100 000 elements, then the output will match in kind.

*Hint – there is only a single, un-nested loop.

O(n^2) – the execution time grows quadratically: doubling the input size quadruples the work.

def print_chessboard(size)
  for i in (1..size)
    for j in (1..size)
      print '*'
    end
    puts
  end
end

In this example, you can see a nested loop used to print out just the general form of a chessboard (no indicators for black or white). If the requested board size is 2, there will be 4 asterisks (*); if it is 6, there will be 36; if it is 8, there will be 64. The pattern of n^2 should be clear: 2^2 = 4, 6^2 = 36, 8^2 = 64.

O(2n) – the algorithm makes two full passes over the input, so the number of steps is 2n. Note that, per rule of thumb 5, the constant 2 is thrown away: O(2n) simplifies to O(n), and the growth is still linear.

def dump_it(customers_array)
  for customer in customers_array do    
    puts "#{customer[:name]}"
  end

  for customer in customers_array do
    puts "#{customer[:country]}"
  end
end

Lastly, in this example we see two separate loops: one for printing out customer names, and the other for printing out customer countries. Each loop touches all n customers, giving 2n steps in total.
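
For contrast, a genuinely exponential algorithm, O(c^n) from the table above, really does multiply its work with each increment of n. A classic illustration (my own example, not from the original post) is the naive recursive Fibonacci, where each call spawns two more calls, so the call tree grows exponentially, bounded by O(2^n):

```ruby
# Naive recursive Fibonacci: each call branches into two more calls,
# so the total work grows exponentially with n.
def fib(n)
  return n if n < 2
  fib(n - 1) + fib(n - 2)
end

puts fib(10) # 55
```

Compare that with the two sequential loops above: sequential loops add their step counts (n + n), while recursion that branches multiplies them.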

The one thing about Big-O notation is that it takes some time to sink in, and as such some things here have been repeated with intent. But if this has not assisted, here are some really good links:

  1. History and a more mathematical perspective of Big-O
  2. Brief overview of Big-O from a Rubyist
  3. Programmer Interview Big-O Notation
  4. Big-O explained beautifully
  5. Beginners Guide To Big-O
  6. StackOverflow Explanation
  7. Very handy common cases
  8. Big-O cheatsheet
  9. Ruby Algorithms and Data Structures

Stack concept

Firstly, before delving into the code, let’s get a handle on the definition of a stack.

According to a well-known Stack Overflow answer (linked in the credits below):

“The stack is the memory set aside as scratch space for a thread of execution. When a function is called, a block is reserved on the top of the stack for local variables and some bookkeeping data. When that function returns, the block becomes unused and can be used the next time a function is called. The stack is always reserved in a LIFO (last in first out) order; the most recently reserved block is always the next block to be freed. This makes it really simple to keep track of the stack; freeing a block from the stack is nothing more than adjusting one pointer.”

That quote describes the call stack, a region of memory, but the same LIFO principle defines the stack data structure. A stack has some attributes that help clarify its purpose, and perhaps its definition. Consider a stack quite literally as taking bricks and placing one on top of another. Coming back to this pile of bricks, the most efficient way to get the next brick is to reach for the top-most one. So too with a stack: it is always concerned with the top. Whether adding or removing, all interactions happen at the top. This is called LIFO – last in, first out: the last brick in is the first brick out. In stack terminology, removing a brick is called popping (POP), and adding a brick is called pushing (PUSH).

So a stack has the following attributes:

  • LIFO – last in first out
  • Push – to add something onto the stack
  • Pop – to remove something from the stack
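
In Ruby, these attributes map directly onto Array, which can serve as a quick stack on its own (a sketch of my own, ahead of the linked-list version built below):

```ruby
# An Array as a stack: Array#push adds to the end (the "top"),
# Array#pop removes from the end. Last in, first out.
stack = []
stack.push('brick 1')
stack.push('brick 2')
puts stack.pop # "brick 2" comes off first: last in, first out
puts stack.pop # "brick 1"
```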

To dig a little deeper and bring out more about this peculiar thing called a stack, let us compare a stack to toddler stacking cups. Yes, you’ve seen those. They look like this.

Stacking cups

A stack is simply the toddler placing these cups one on top of another. The cups slot into one another, each providing a wide rim for the next cup to slide onto and position itself. Now bear with me: let’s call each cup (like the green one above) a Node. Why a node? Consider the attributes of a node in a tree. A node (cup) has a colour of its own (a value), but also links to the next cup below it. Hold on a minute, this inter-linking should raise an alarm. Yes, well done! It introduces another important concept behind a stack: the Linked List.

What is a Linked List?

A linked list is, as we described above, a chain of simple objects (we’ll call them Nodes), each of which has its own value or data, plus a pointer to the next Node in the list.

In the image below, the first pair of joined blocks is a node: the first block holds the value 12, and the second block is a pointer, holding the location of the next node. But, and I say but, when you follow that pointer, you don’t get back the pointer itself, but the actual node to which it’s linked. This is the beauty!

Linked list

How does a linked list then relate back to a stack? A linked list is the individual cups (blue, green, red, yellow above), and placing one on top of another (linking them, with access only ever at the top) is what we call a “Stack”.

So what does the code look like?

A Linked List example:

class Node
  attr_accessor :value, :next_node

  def initialize(value, next_node)
    @value = value
    @next_node = next_node
  end
end

The code above simply provides us the mechanism for a linked list. Now it’s interesting to note that Ruby doesn’t have a Linked List data type (as far as I know), because the functionality is built into Array. We get similar behaviour using Array#shift (pop) and Array#unshift (push), the difference being that these operate at the front (the bottom) of the array rather than the top; Array#push and Array#pop operate at the end. The advantage to this is that you get linked-list functionality with the hybrid data access of an Array. By that I mean, an Array is great for random access to large data, while a linked list is purposeful when manipulating, inserting, deleting or re-ordering elements.
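
A quick sketch of that Array behaviour (my own example), treating the front of the array as the “top” of the list:

```ruby
# Array#unshift adds to the front; Array#shift removes from the front.
# Used together they give LIFO behaviour at the front of the array.
list = [:green, :red, :yellow]
list.unshift(:blue) # "push": blue becomes the new front
puts list.inspect   # [:blue, :green, :red, :yellow]
puts list.shift     # "pop": blue is the first one back out
puts list.inspect   # [:green, :red, :yellow]
```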

A Stack example:

class Stack
  def initialize
    @current = nil
  end

  def pop
    raise 'Stack is empty' if is_empty?
    top_of_stack = @current
    @current = top_of_stack.next_node
    top_of_stack.value 
  end

  def push(value)
    @current = Node.new(value, @current)
  end

  def is_empty?
    @current.nil?
  end
end

This is the output from interacting with the code in IRB:

Stack irb2
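
If the screenshot doesn’t render for you, here is a sketch of an equivalent session, combining the Node and Stack classes from above:

```ruby
class Node
  attr_accessor :value, :next_node

  def initialize(value, next_node)
    @value = value
    @next_node = next_node
  end
end

class Stack
  def initialize
    @current = nil
  end

  def pop
    raise 'Stack is empty' if is_empty?
    top_of_stack = @current
    @current = top_of_stack.next_node
    top_of_stack.value
  end

  def push(value)
    @current = Node.new(value, @current)
  end

  def is_empty?
    @current.nil?
  end
end

stack = Stack.new
stack.push(:blue)       # blue cup goes down first
stack.push(:green)      # green cup stacked on top
puts stack.pop          # green: last cup on is the first cup off
puts stack.pop          # blue
puts stack.is_empty?    # true
```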

Credits:

I’ve regurgitated all this 😉

  1. Ruby Missing Datastructure
  2. Stack vs Heap (stack overflow)
  3. A Stack in Ruby using Linked Lists
  4. What is a Stack Data Structure (youtube)

DotNet Foundation


Microsoft are doing some amazing things these days, none more so than open-sourcing software and adopting the development processes of the open-source communities. The introduction of the dnvm, dnu and dnx command line utilities is amazing, and a reflection of what we have seen in the Ruby and Python communities for the longest time. I’ve been about 2 years away from the Microsoft world, embedded in the world of Ruby, and there is so much to be said about that language and its friendliness towards developers, but perhaps in another post. For today, let us turn our attention to some of the command line utilities that are now available on your Mac (or OS X) to begin your .NET journey. These utilities are available on Linux and Windows as well, so you should have a similar experience.

Let’s begin.

DNX – The .NET Execution Environment

 “The .NET Execution Environment (DNX) is a software development kit (SDK) and runtime environment that has everything you need to build and run .NET applications for Windows, Mac and Linux. It provides a host process, CLR hosting logic and managed entry point discovery. DNX was built for running cross-platform ASP.NET Web applications, but it can run other types of .NET applications, too, such as cross-platform console apps.”

I think the explanation from the ASP.NET website is pretty clear, but DNX on the command-line more specifically will grant you the ability to “bootstrap” your application for self-hosting. That’s pretty impressive, given that historically one would have to get IIS up and running. Below you can see that I have fired up my application and it is now running on port 5000. 

DNX up and running

The CoreCLR is the .NET execution engine that performs functions like garbage collection and compilation to machine code. The GitHub repo for the CoreCLR describes its function as: “This repo contains the .NET Core runtime, called CoreCLR, and the base library, called mscorlib. It includes the garbage collector, JIT compiler, base .NET data types and many low-level classes.”

CoreFX is the library implementation for .NET Core (that is the cross-platform implementation of .NET) that is primarily driven by ASP.NET 5. 

So DNX, by way of its definition as an SDK (as stated above), contains the CoreCLR and base parts of CoreFX.

DNU – The .NET Development Utility

DNU is used to manage the dependencies, or packages, required for your project. DNU reads the project.json file included in your project and manages its dependencies, from the runtime version to required packages, all via NuGet. With it, you can list packages, install/uninstall packages, publish the application for deployment and restore packages. Very handy!

Building with DNU


DNVM – The .NET Version Manager

 “DNVM lets you list the different DNX versions and flavors on your machine, install new ones and switch the active DNX”.

DNVM is similar to Ruby’s RVM, that is, it allows you to download different versions of the .NET execution environment, manage them and set the default (as per below).


dnvm list


All in all, .NET has come to Mac and it’s ABOUT TIME 😉

PS. TO GET STARTED. Also please be advised ASP.NET 5 has been renamed to ASP.NET Core 1.0

Further reading and references:

DNVM, DNX and DNU – Understanding the RunTime

What are DNX, DNVM and DNU and other ASP.NET 5 components

Getting to the Crux of DNVM, DNU and DNX in ASP.NET 5


Problem:

There are certain aspects of programming that lie dormant or untouched for some time, problem areas that evade your daily grasp until a suite of side effects in your environment appears with inexplicable randomness. One of these problems is concurrent file access, that is, shared files.

I saw the worst of this come to light one day, when a partially migrated legacy application was running concurrently with a new one (let’s call it version 1). The aspect that was missed was that, while locking was happening within each individual application, both applications were manipulating the same file. That is, the maintainers and developers did not foresee that although individual application threads were protected from concurrent access, across applications there was no such fail-safe.

So to better understand the problem, let’s try out some scenarios.

Does File Locking Work Across Applications?

A common misconception is that an exclusive lock will lock across applications (i.e. globally across the OS). This is not true for processes that do not participate in the locking. One might assume an operating-system-level mandatory lock would do so, but imagine the sequence of issues that would then ensue, including starvation and deadlocks. Here is a simple proof of concept:

Snip #1:

filename = "virtualdomains"
File.open(filename, 'r') do |f|
  success = f.flock(File::LOCK_EX)
  puts "Do I have a lock? #{success == 0 ? true : false}" 
  puts "Holding lock..."
  sleep(500)
end


In the snippet above, I am placing an exclusive lock on the “virtualdomains” file (File::LOCK_EX) and thereafter calling sleep. While that is running, I open another terminal and run the following piece of code:

Snip #2

echo "Hello World" >> virtualdomains

That echo has absolutely no problem appending to our exclusively locked virtualdomains file, proving to us that file system locks (at least flock on Linux) are advisory: they only constrain processes that also ask for the lock, not ones that ignore it. But that is not a problem, it’s the design. Remember, an operating system is not a database; it is not managing records or rows, nor, generally speaking, should it.

Another valuable lesson here is that opening a file with the mode ‘w’ or ‘w+’ will truncate it before the file can even be written to.
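
A quick sketch of that truncation behaviour (using a throwaway temp file of my own, not the virtualdomains file):

```ruby
require 'tempfile'

tmp = Tempfile.new('truncation-demo')
File.write(tmp.path, "precious data\n")

# Merely opening with 'w' truncates, even though we never write a byte.
File.open(tmp.path, 'w') { }

puts File.size(tmp.path) # 0: the contents are gone
```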

Does File Locking Work Within the Same Application Across Threads?

If, however, you want to lock within an application, then it’s that much easier – provided, again, you can be sure that nothing external to the application (or process) will be manipulating your resource at a similar time.

Snip #3

filename = "virtualdomains"
File.open(filename, 'r+') do |f|
  f.flock(File::LOCK_EX) # take the exclusive lock before touching the file
  old_pos = 0
  f.each do |line|
    f.pos = old_pos
    # note: an in-place rewrite like this is only safe when the
    # replacement text is the same length as the original
    f.print line.gsub(/^dane/, "#dane")
    old_pos = f.pos
  end
  sleep(500)
end

In the snippet above, if we execute this code and then run a second copy from another terminal, we immediately see that our exclusive lock now holds: the second copy blocks on flock until the sleep has expired and the first releases the lock.

An important lesson here is that Ruby provides LOCK_NB (non-blocking), which makes flock return false in that scenario instead of blocking, as demonstrated below.

Snip #4

def read_to_write(filename)
  wrote_contents = true
  tries = 3
  File.open(filename, 'r+') do |file|
    begin
      # LOCK_NB makes flock return false instead of blocking
      locked = file.flock(File::LOCK_NB | File::LOCK_EX)
      raise "Failed to get exclusive lock on #{filename}" if !locked
      entries = File.readlines(filename).collect(&:strip)
      entries.collect! { |entry| entry.gsub(/^my$/, "#my") }
      file << entries.join("\n")
    rescue StandardError
      retry unless (tries -= 1).zero?
      wrote_contents = false
    end
  end
  wrote_contents
end

How Does File Locking Work With Regard to Reads?

As a last point of discussion, it’s important to realise that no matter the lock or how you implement it (r+ mode -> read then write), you cannot block a read from a process that does not participate in the locking. As such, one should be wary of assuming a file’s state before a write, unless a proper concurrency/locking mechanism, honoured by every party, is at play.
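
Reads can, however, be coordinated cooperatively: if every reader takes a shared lock (File::LOCK_SH) and every writer an exclusive one, the OS will serialise them. A sketch of the reader side (the helper name is my own, not from the snippets above):

```ruby
# Cooperative reader: many readers may hold LOCK_SH at once,
# but all of them exclude any writer holding LOCK_EX.
def safe_read(filename)
  File.open(filename, 'r') do |f|
    f.flock(File::LOCK_SH) # blocks while a writer holds LOCK_EX
    f.read
  end # lock is released when the file is closed
end
```

This only helps if the writer also calls flock; a writer that skips it walks straight past the lock, exactly as Snip #2 showed.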