We can sort any array with the NumPy sort function. It is usually faster and easier than writing the sorting algorithm ourselves in Python.
We’ll see that in a minute.
But before that, we need to revisit one key concept in Python and in algorithms.
Every algorithm comes with a cost, because we can write the same instructions in many ways.
For example, if we tell somebody to go from point A to point B, there might be three or four routes.
However, only one of them is the shortest and quickest. No doubt, that is the best route to save time. Right?
In our context, we can assume that there are three or four algorithms that take us from point A to point B.
However, one of them is the best. It saves our time and, in the language of computer science, it uses the system resources most efficiently.
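To make the idea of cost concrete, here is a minimal sketch that counts how many comparisons two different sorting algorithms need for the same list. The two helper functions, count_comparisons_bubble and count_comparisons_merge, are hypothetical names written only for this illustration. Bubble sort and merge sort are simply two well-known routes from the same point A to the same point B.

```python
def count_comparisons_bubble(values):
    """Sort a copy with bubble sort; return (sorted_list, comparisons)."""
    items = list(values)
    comparisons = 0
    n = len(items)
    for i in range(n - 1):
        for j in range(n - 1 - i):
            comparisons += 1
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
    return items, comparisons


def count_comparisons_merge(values):
    """Sort a copy with merge sort; return (sorted_list, comparisons)."""
    items = list(values)
    if len(items) <= 1:
        return items, 0
    mid = len(items) // 2
    left, left_count = count_comparisons_merge(items[:mid])
    right, right_count = count_comparisons_merge(items[mid:])
    merged = []
    comparisons = left_count + right_count
    i = j = 0
    # Merge the two sorted halves, counting each element comparison
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, comparisons


numbers = [100, 45, 1, 8, 47895, 5, 56, 23, 0, 89]
bubble_sorted, bubble_cost = count_comparisons_bubble(numbers)
merge_sorted, merge_cost = count_comparisons_merge(numbers)
print(f"Bubble sort comparisons: {bubble_cost}")
print(f"Merge sort comparisons: {merge_cost}")
```

Both functions return the same sorted list, but the merge sort gets there with fewer comparisons. The gap grows quickly as the list gets longer.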
Let’s look at a complete example. The code below arranges an array in ascending order.
In short, we have an array with arbitrary numbers, and our algorithm, a quicksort, rearranges it so that it runs from the smallest value to the largest.
def arrange_in_ascending_order(generic_array, starting_number, ending_number):
    # Partition step: the first element of the range acts as the pivot value
    searching_index = generic_array[starting_number]
    low_index = starting_number + 1
    high_index = ending_number
    while True:
        # Move high_index left past values that already belong right of the pivot
        while low_index <= high_index and generic_array[high_index] >= searching_index:
            high_index = high_index - 1
        # Move low_index right past values that already belong left of the pivot
        while low_index <= high_index and generic_array[low_index] <= searching_index:
            low_index = low_index + 1
        if low_index <= high_index:
            # Both values are on the wrong side, so swap them
            generic_array[low_index], generic_array[high_index] = generic_array[high_index], generic_array[low_index]
        else:
            break
    # Place the pivot in its final sorted position
    generic_array[starting_number], generic_array[high_index] = generic_array[high_index], generic_array[starting_number]
    return high_index


def quickSort(generic_array_of_numbers, starting_number, ending_number):
    # A range with zero or one element is already sorted
    if starting_number >= ending_number:
        return
    partitioning_index = arrange_in_ascending_order(generic_array_of_numbers, starting_number, ending_number)
    # Sort the parts to the left and to the right of the pivot
    quickSort(generic_array_of_numbers, starting_number, partitioning_index - 1)
    quickSort(generic_array_of_numbers, partitioning_index + 1, ending_number)


generic_array_of_numbers = [100, 45, 1, 8, 47895, 5, 56, 23, 0, 89]
quickSort(generic_array_of_numbers, 0, len(generic_array_of_numbers) - 1)
print("The above random generic_array of numbers in ascending order: " + str(generic_array_of_numbers))
Let’s run the code and see the output.
The above random generic_array of numbers in ascending order: [0, 1, 5, 8, 23, 45, 56, 89, 100, 47895]
Why NumPy array sort is better
Now, as data scientists, whenever we want to sort a big volume of numerical data, should we write such an algorithm ourselves? Or should we take the help of the NumPy sort function?
Let’s import the NumPy package and sort the same numbers with np.sort.
import numpy as np
generic_array_of_numbers = [100, 45, 1, 8, 47895, 5, 56, 23, 0, 89]
np_array = np.array(generic_array_of_numbers)
sorting_np_array = np.sort(np_array, axis=None)
print(f'Sorting by NumPy sort method: {sorting_np_array}')
Sorting by NumPy sort method: [    0     1     5     8    23    45    56    89   100 47895]
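As a short sketch of how the NumPy sorting API behaves: np.sort returns a sorted copy and leaves the original array alone, while the ndarray.sort method sorts the array in place and returns None. The variable names below are chosen only for this illustration.

```python
import numpy as np

np_array = np.array([100, 45, 1, 8, 47895, 5, 56, 23, 0, 89])

# np.sort returns a new sorted array; np_array keeps its original order
sorted_copy = np.sort(np_array)
order_after_np_sort = np_array.tolist()

# Reversing the sorted copy gives descending order
descending = sorted_copy[::-1]

# ndarray.sort sorts the array itself and returns None
result_of_in_place_sort = np_array.sort()

print(f"Sorted copy: {sorted_copy}")
print(f"Descending: {descending}")
print(f"Original array after in-place sort: {np_array}")
```

Which one to pick is a design choice: np.sort is safer when the original order is still needed, and the in-place method avoids allocating a second array.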
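We can see the difference: a single call to np.sort replaces the two functions we wrote by hand.
Certainly NumPy makes our lives easier. Doesn’t it?
Not only that, NumPy also saves system resources, because its sorting routines run in compiled code rather than in the Python interpreter.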
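One way to check the claim about resources is to time both approaches on a larger list. The sketch below is only a rough measurement, not a rigorous benchmark. The pure_python_quicksort helper is a simplified quicksort written for this comparison (it builds new lists instead of sorting in place), and the list size and repeat count are arbitrary assumptions. The actual numbers depend on the machine.

```python
import random
import timeit

import numpy as np


def pure_python_quicksort(values):
    """Return a new sorted list using a simple, not in-place, quicksort."""
    if len(values) <= 1:
        return list(values)
    pivot = values[0]
    smaller = [v for v in values[1:] if v < pivot]
    larger_or_equal = [v for v in values[1:] if v >= pivot]
    return pure_python_quicksort(smaller) + [pivot] + pure_python_quicksort(larger_or_equal)


random.seed(0)  # fixed seed so every run sorts the same numbers
numbers = [random.randint(0, 1_000_000) for _ in range(20_000)]
np_numbers = np.array(numbers)

# Time five runs of each approach on the same data
python_seconds = timeit.timeit(lambda: pure_python_quicksort(numbers), number=5)
numpy_seconds = timeit.timeit(lambda: np.sort(np_numbers), number=5)

print(f"Pure Python quicksort: {python_seconds:.4f} s")
print(f"np.sort: {numpy_seconds:.4f} s")
```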
Therefore, as students of data science, we’ll always welcome NumPy whenever we get the chance.