Pandas sort by column name

Sorting a Pandas DataFrame by column name is one of the basic steps in data analysis. It is a good starting point for beginners, so let us walk through it to get the idea.

Think about the data structure as a spreadsheet where we have multiple rows and columns. Right?

Now we can use Pandas to handle a large amount of data because this library offers highly performant data manipulation capabilities.

Let us start by importing the Pandas package first. After that we will read a CSV file.

import pandas as pd

df = pd.read_csv('https://raw.githubusercontent.com/sanjibsinha/Machine-Learning-Primer/main/world_internet_user.csv', encoding='unicode_escape', engine='python')
df.head()

As a result, we can see the first five rows of the DataFrame. Based on these rows, we can try out the sort methods.


# sorting the 'Country' Series in ascending order (it will always return a Series)
df.Country.sort_values().head()

# output
1        Afganistan
2           Albania
3           Algeria
4    American Samoa
5           Andorra
Name: Country, dtype: object

There are several ways we can sort the values and analyze the data. 

We can sort a pandas DataFrame by the values of one or more columns. The result depends on which parameters we pass.

By default, Pandas sorts in ascending order, because the ascending parameter is True. But we can change it. Right?

# we can sort in descending order instead
df.Country.sort_values(ascending=False).head()

# output
0              _World
242          Zimbabwe
241            Zambia
240             Yemen
239    Western Sahara
Name: Country, dtype: object

We can also sort a DataFrame in place by setting the “inplace” argument to True. In that case the method modifies the original object and returns None.
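To make the in-place behaviour concrete, here is a minimal sketch. It uses a small toy DataFrame with made-up values (an assumption for illustration, not rows from the CSV file above), so it can be run on its own.

```python
import pandas as pd

# Toy data with made-up values, only for illustration
toy = pd.DataFrame({
    'Country': ['Zambia', 'Albania', 'Mexico'],
    'Internet Users': [300, 100, 200],
})

# With inplace=True the call reorders toy itself and returns None
result = toy.sort_values('Country', inplace=True)

print(result)                   # None
print(toy['Country'].tolist())  # ['Albania', 'Mexico', 'Zambia']
```

Because the call returns None, we should not assign its result back to the variable when we use inplace=True.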

Since we have already loaded the data, we can change the parameters and compare the output.


# sort the entire DataFrame by the 'Internet Users' Series (it always returns a DataFrame)
df.sort_values('Internet Users').head()

As a result, the rows are ordered from the smallest number of internet users to the largest.

By the way, a DataFrame represents a data structure with labeled axes for both rows and columns. 

This means we can sort a DataFrame by row or column values as well as by row or column index.
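Sorting by index, rather than by values, can be sketched with the sort_index() method. The toy DataFrame below uses made-up values and a shuffled index (both are assumptions for illustration only).

```python
import pandas as pd

# Toy data with made-up values and a deliberately shuffled row index
toy = pd.DataFrame(
    {'Internet Users': [300, 100, 200],
     'Country': ['Zambia', 'Albania', 'Mexico']},
    index=[2, 0, 1],
)

# axis=0 (the default) orders the rows by their index labels
by_row_label = toy.sort_index()

# axis=1 orders the columns by their names
by_column_name = toy.sort_index(axis=1)

print(by_row_label.index.tolist())      # [0, 1, 2]
print(by_column_name.columns.tolist())  # ['Country', 'Internet Users']
```

So sort_values() looks at the data inside the DataFrame, while sort_index() looks only at the row or column labels.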

# sort the DataFrame first by 'Internet Users', then by 'Country' (both in ascending order)
df.sort_values(['Internet Users', 'Country']).head()

In the above example we have used two columns; however, we can use a single column also.

Sorting our DataFrame on a single column is simpler than sorting on more than one column.

We use the sort_values() method. 

As we said, by default, this will return a new DataFrame sorted in ascending order. 

Unless we pass inplace=True, it never modifies the original DataFrame.

df.sort_values(by=['Country'])

We can also use more than one column in the same way.

df.sort_values(by=['Internet Users', '% of Population'])

#output

We can take a look at the table.

Pandas sort
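The two-column sort above uses ascending order for both columns. If we want a different order per column, the ascending parameter also accepts a list with one value per column. Here is a minimal sketch on a toy DataFrame; the country labels and numbers are made up for illustration and are not taken from the CSV file.

```python
import pandas as pd

# Toy data with made-up values; the column names follow the ones used above
toy = pd.DataFrame({
    'Country': ['A', 'B', 'C', 'D'],
    'Internet Users': [100, 300, 300, 200],
    '% of Population': [50.0, 20.0, 80.0, 40.0],
})

# 'Internet Users' in descending order; ties are broken by
# '% of Population' in ascending order
mixed = toy.sort_values(
    by=['Internet Users', '% of Population'],
    ascending=[False, True],
)

print(mixed['Country'].tolist())  # ['B', 'C', 'D', 'A']
```

The list passed to ascending must have the same length as the list passed to by.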

For the full code please visit the respective branches of the GitHub Repository.
