Data and Decision Science with Machine Learning Algorithms

Data science is closely related to decision science, and machine learning algorithms bridge the gap between the two. How does one relate to the other?

In this section we’ll try to understand how it happens.

In data science, decision making depends on effective calculation.

That is why we cannot build sound data science models without learning machine learning algorithms.

Let’s try to make it simple. 

What is the shortest path from point A to point B? Since there are several routes, knowing the shortest route makes a big difference. 

A big transport company that sends products from one continent to another depends on making the correct decision here.

A wrong decision makes the company incur a loss. On the other hand, a correct decision makes its revenue grow.

Only effective calculation leads us to the correct decision. Right? 
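To make the route example concrete, here is a minimal sketch of how such a calculation could look. It uses Dijkstra's algorithm, which is our choice for illustration and is not named in this article. The route map, the place names and the distances are all invented.

```python
import heapq

def shortest_distance(graph, start, goal):
    """Return the smallest total distance from start to goal, or None if unreachable."""
    best = {start: 0}          # best known distance to each place so far
    queue = [(0, start)]       # (distance travelled, place), smallest first
    while queue:
        distance, node = heapq.heappop(queue)
        if node == goal:
            return distance
        if distance > best.get(node, float("inf")):
            continue           # a shorter way to this place was already found
        for neighbour, weight in graph.get(node, {}).items():
            candidate = distance + weight
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return None

# Hypothetical route map: the distances are invented for illustration.
routes = {
    "A": {"C": 4, "D": 2},
    "C": {"B": 5},
    "D": {"C": 1, "B": 8},
}

print(shortest_distance(routes, "A", "B"))  # prints 8 (A -> D -> C -> B)
```

Notice that the direct-looking routes A -> C -> B (9) and A -> D -> B (10) both lose to the longer-looking A -> D -> C -> B (8). Only the calculation reveals that.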

Before computer science or Python came along, mathematicians worked on the same principle, but in a different way.

Now as time goes on, the task becomes difficult. 

Why?

Because the volume of data grows.

As a result, we need to find more accurate algorithms that take fewer system resources.

Let’s consider the Sieve of Eratosthenes. It’s an ancient algorithm attributed to the Greek mathematician Eratosthenes, who lived almost 2,300 years ago.

What does it do? Most importantly, what role does it play in data or decision science?

Why are machine learning algorithms important in data and decision science?

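In short, the sieve of Eratosthenes is one of the oldest known algorithms. It is not a machine learning algorithm in the modern sense, because it does not learn from data, but it shows the kind of systematic calculation that machine learning algorithms build on.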

Why?

Because it leads us to a correct decision when we handle a large volume of data.

It’s an algorithm to find all prime numbers in a range of numbers. For example, between 2 and 20 it will separate out all the composite numbers and pick up only the prime numbers.

Do we find any similarity with finding the shortest route when there are many routes?

In a sense, when we find the shortest route among many routes, we follow the same idea: we systematically rule out the candidates that cannot be the answer.

If you’re an absolute beginner, let’s first see what a prime number is.

A prime number is a natural number greater than 1 that has exactly two distinct natural divisors: 1 and itself.

Consider this example. 2 is a prime number, because it has exactly two divisors: 1 and 2.

In the same way, 11 is a prime number, because it has exactly two divisors or factors: 1 and 11.
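The definition can be checked directly in code. The following is a minimal sketch that simply counts divisors. The names count_divisors and is_prime are chosen here for illustration and do not come from any library. It is deliberately slow and is only meant to mirror the definition.

```python
def count_divisors(number):
    """Count the natural numbers that divide `number` exactly."""
    return sum(1 for candidate in range(1, number + 1) if number % candidate == 0)

def is_prime(number):
    """A number is prime when it has exactly two distinct divisors."""
    return count_divisors(number) == 2

for number in (1, 2, 11, 12):
    print(number, count_divisors(number), is_prime(number))
```

Here 1 has only one divisor, so it is not prime. 12 has six divisors (1, 2, 3, 4, 6 and 12), so it is composite.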

To make a long story short, let’s try the code.

Here is a 2,300-year-old algorithm in modern Python. 🙂

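def sieve_of_eratosthenes(range_of_numbers):
    # Assume every number from 0 to range_of_numbers is prime until proven otherwise.
    prime_array = [True for any_number in range(range_of_numbers + 1)]
    # 0 and 1 are not prime numbers.
    prime_array[0] = False
    prime_array[1] = False
    starting_number = 2
    while starting_number * starting_number <= range_of_numbers:
        if prime_array[starting_number]:
            # Mark every multiple of the current prime as composite.
            for any_number in range(starting_number * 2, range_of_numbers + 1, starting_number):
                prime_array[any_number] = False
        # Move on to the next number whether or not the current one was prime.
        starting_number += 1

    # After the marking is finished, print the numbers that were never marked.
    for starting_number in range(range_of_numbers + 1):
        if prime_array[starting_number]:
            print(starting_number)

sieve_of_eratosthenes(20)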

Let’s look at the output of the above code.

As we’ve passed 20, the range is 2 to 20.

The ancient algorithm keeps on separating out the composite numbers, which have more than two divisors or factors.

Finally it picks up only the prime numbers.

2
3
5
7
11
13
17
19

As we progress, we’ll learn more such algorithms. Above all, if we don’t learn to write algorithms on our own, we cannot understand how algorithm libraries work.

Most importantly, we cannot choose the correct machine learning algorithms to make the correct decision.

Let’s try to understand the above algorithm in natural language.

That will explain the steps.

1. First we need to create a list of consecutive integers from 2 through a certain number like 20, as we have seen in the above code: (2, 3, 4, …, 20); we start from 2, because 2 is the smallest prime number.

2. Therefore, we can initialize a variable like this: starting_number = 2

3. Now we mark the multiples of ‘starting_number’ in the list by counting in increments of ‘starting_number’ from (2 * starting_number) up to 20, like this: (2 * starting_number), (3 * starting_number), (4 * starting_number), and so on.

4. Needless to say, these multiples can never be prime, because each of them has more than two factors.

5. Next, we find the first number greater than ‘starting_number’ that is not marked; if there is no such number, we stop. Otherwise, we let ‘starting_number’ equal that number, which is the next prime, and repeat from step 3.

6. When the algorithm ends, the numbers up to 20 that remain unmarked in the list are all the primes.
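The steps above can also be traced in code. The following is a minimal sketch, not a replacement for the function shown earlier. The name sieve_with_trace and the trace dictionary are our own additions for illustration. Like the earlier code, it stops as soon as the square of the starting number exceeds the limit, because by then every composite number has already been marked.

```python
def sieve_with_trace(limit):
    """Return (primes, trace); trace maps each prime to the multiples it newly marked."""
    is_marked = [False] * (limit + 1)      # step 1: nothing is marked at first
    trace = {}
    starting_number = 2                    # step 2: start from the smallest prime
    while starting_number * starting_number <= limit:
        if not is_marked[starting_number]:
            newly_marked = []
            # step 3: walk through the multiples in increments of starting_number
            for multiple in range(2 * starting_number, limit + 1, starting_number):
                if not is_marked[multiple]:
                    is_marked[multiple] = True
                    newly_marked.append(multiple)
            trace[starting_number] = newly_marked
        starting_number += 1               # step 5: move on to the next candidate
    # step 6: whatever is still unmarked is prime
    primes = [number for number in range(2, limit + 1) if not is_marked[number]]
    return primes, trace

primes, trace = sieve_with_trace(20)
print(trace)
print(primes)
```

For 20, the trace shows that 2 marks 4, 6, 8, 10, 12, 14, 16, 18 and 20, and that 3 only adds 9 and 15, because its other multiples were already marked.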

However, the classical way of using the sieve of Eratosthenes is not the fastest way to find the primes.

We’ll discuss this topic later.
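As a small preview, one widely used refinement is to start marking from the square of the starting number instead of from its double, since the smaller multiples have already been marked by smaller primes. The sketch below assumes that refinement. The name sieve_from_square is chosen here for illustration, and the function returns a list instead of printing.

```python
def sieve_from_square(limit):
    """Return the primes up to `limit`, marking multiples from the square onward."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    starting_number = 2
    while starting_number * starting_number <= limit:
        if is_prime[starting_number]:
            # Multiples below the square were already marked by smaller primes.
            for multiple in range(starting_number * starting_number, limit + 1, starting_number):
                is_prime[multiple] = False
        starting_number += 1
    return [number for number, flag in enumerate(is_prime) if flag]

print(sieve_from_square(20))
```

It produces the same primes as before while doing less repeated marking.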
