How to Rename Column Names in Pandas?


Problem Formulation

How do you replace the existing column names of a DataFrame with new ones?

Here’s an example using the following DataFrame:

   Col_A  Col_B  Col_C
0      1      3      5
1      2      4      6

You want to rename the column names ['Col_A', 'Col_B', 'Col_C'] to ['a', 'b', 'c'] so that the resulting DataFrame is:

   a  b  c
0  1  3  5
1  2  4  6

Method 1: Changing the DataFrame.columns Attribute

Given a list of strings with the new column names, you can change the column names of a DataFrame by assigning the list to the attribute df.columns using df.columns = <new column names>. The list must contain exactly one name per column.

Here’s how you’d solve the example given above:

>>> df.columns = ['a', 'b', 'c']
>>> df
   a  b  c
0  1  3  5
1  2  4  6

For easy copy and paste, here’s the full source code to change the column names in an existing DataFrame:

import pandas as pd

df = pd.DataFrame({'Col_A': [1, 2],
                   'Col_B': [3, 4],
                   'Col_C': [5, 6]})
print(df)
'''
   Col_A  Col_B  Col_C
0      1      3      5
1      2      4      6
'''

df.columns = ['a', 'b', 'c']
print(df)
'''
   a  b  c
0  1  3  5
1  2  4  6
'''
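One caveat is worth illustrating: the assigned list has to match the number of columns. The following is a minimal sketch using the example DataFrame from this section. The variable name mismatch_error is only an illustrative choice.

```python
import pandas as pd

df = pd.DataFrame({'Col_A': [1, 2],
                   'Col_B': [3, 4],
                   'Col_C': [5, 6]})

# Assigning two names to a three-column DataFrame is rejected.
mismatch_error = None
try:
    df.columns = ['a', 'b']
except ValueError as err:
    mismatch_error = err

print(type(mismatch_error).__name__)  # ValueError
print(list(df.columns))               # ['Col_A', 'Col_B', 'Col_C']
```

The failed assignment leaves the original column names in place, so the DataFrame stays usable after the error.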

Method 2: Renaming Specific Columns with DataFrame.rename()

To rename only a subset of the columns, use the DataFrame.rename() method and pass a dictionary of {old: new} mappings, such as {'old_1': 'new_1', 'old_2': 'new_2', ...}, as the columns argument. Columns that do not appear in the dictionary keep their names.

  • df.rename(columns = {'old_1': 'new_1', 'old_2': 'new_2', ...}, inplace=True) to modify the original DataFrame in place, or
  • df = df.rename(columns = {'old_1': 'new_1', 'old_2': 'new_2', ...}) to create a new DataFrame and assign the result to the original variable df.
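The difference between the two variants can be sketched as follows. This is a small illustration under the assumption of the same example DataFrame. The variable name renamed is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'Col_A': [1, 2],
                   'Col_B': [3, 4],
                   'Col_C': [5, 6]})

# Without inplace=True, rename() returns a new DataFrame
# and leaves the original column names untouched.
renamed = df.rename(columns={'Col_A': 'a'})
original_after_copy = list(df.columns)
print(original_after_copy)    # ['Col_A', 'Col_B', 'Col_C']
print(list(renamed.columns))  # ['a', 'Col_B', 'Col_C']

# With inplace=True, the original DataFrame itself is modified.
df.rename(columns={'Col_A': 'a'}, inplace=True)
print(list(df.columns))       # ['a', 'Col_B', 'Col_C']
```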

Here’s a practical example:

import pandas as pd

df = pd.DataFrame({'Col_A': [1, 2],
                   'Col_B': [3, 4],
                   'Col_C': [5, 6]})
print(df)
'''
   Col_A  Col_B  Col_C
0      1      3      5
1      2      4      6
'''

df.rename(columns = {'Col_A': 'a', 'Col_C': 'c'}, inplace=True)
print(df)
'''
   a  Col_B  c
0  1      3  5
1  2      4  6
'''
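One behavior that can surprise you: by default, rename() silently ignores dictionary keys that do not match any column. The sketch below shows how the errors='raise' argument of rename() turns such a typo into an exception. The column name 'Col_X' is a deliberately nonexistent name used for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Col_A': [1, 2],
                   'Col_B': [3, 4],
                   'Col_C': [5, 6]})

# Default behavior: the unknown key 'Col_X' is ignored without warning.
unchanged = df.rename(columns={'Col_X': 'x'})
print(list(unchanged.columns))  # ['Col_A', 'Col_B', 'Col_C']

# With errors='raise', the unknown key triggers a KeyError instead.
raised = None
try:
    df.rename(columns={'Col_X': 'x'}, errors='raise')
except KeyError as err:
    raised = err

print(type(raised).__name__)  # KeyError
```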

The rename() method also accepts a function, which is applied to each column name to compute the new name:

import pandas as pd

df = pd.DataFrame({'Col_A': [1, 2],
                   'Col_B': [3, 4],
                   'Col_C': [5, 6]})
print(df)
'''
   Col_A  Col_B  Col_C
0      1      3      5
1      2      4      6
'''

df = df.rename(columns = lambda x: x[-1].lower())
print(df)
'''
   a  b  c
0  1  3  5
1  2  4  6
'''

If you need a refresher on lambda functions, feel free to check out the following article.

Related Tutorial: Python Lambda Functions
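A lambda is not required. Any callable that maps an old name to a new name works. Here is a minimal sketch with a named function and with the built-in str.lower. The helper name shorten is hypothetical and only serves this example.

```python
import pandas as pd

df = pd.DataFrame({'Col_A': [1, 2],
                   'Col_B': [3, 4],
                   'Col_C': [5, 6]})

def shorten(name):
    # Illustrative helper: drop the 'Col_' prefix and lowercase the rest.
    return name.replace('Col_', '').lower()

short = df.rename(columns=shorten)
print(list(short.columns))  # ['a', 'b', 'c']

# Built-in string methods can be passed directly as well.
lowered = df.rename(columns=str.lower)
print(list(lowered.columns))  # ['col_a', 'col_b', 'col_c']
```

A named function is easier to reuse and to test than a lambda when the renaming rule grows beyond a single expression.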

Method 3: Reassign Column Headers using DataFrame.set_axis()

Use df = df.set_axis(new_col_names, axis=1) to replace all column headers at once. The method returns a new DataFrame with the replaced headers and leaves the original untouched, so assign the result back to df if you want to overwrite it. Older pandas versions also accepted inplace=True here, but that parameter was deprecated in pandas 1.5 and removed in pandas 2.0.

import pandas as pd

df = pd.DataFrame({'Col_A': [1, 2],
                   'Col_B': [3, 4],
                   'Col_C': [5, 6]})
print(df)
'''
   Col_A  Col_B  Col_C
0      1      3      5
1      2      4      6
'''

df = df.set_axis(['a', 'b', 'c'], axis=1)
print(df)
'''
   a  b  c
0  1  3  5
1  2  4  6
'''
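Because set_axis() returns a DataFrame when called without inplace, it fits naturally into a method chain. The following sketch assumes the same example data. The total column and the variable name result are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Col_A': [1, 2],
                   'Col_B': [3, 4],
                   'Col_C': [5, 6]})

# Rename all columns and add a derived column in one chained expression.
result = (
    df.set_axis(['a', 'b', 'c'], axis=1)
      .assign(total=lambda d: d['a'] + d['b'] + d['c'])
)

print(list(result.columns))     # ['a', 'b', 'c', 'total']
print(result['total'].tolist()) # [9, 12]
print(list(df.columns))         # ['Col_A', 'Col_B', 'Col_C']
```

The original df keeps its column names, since nothing is assigned back to it.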

Summary

There are three main ways to rename the columns ['Col_A', 'Col_B', 'Col_C'] to ['a', 'b', 'c'] in a given Pandas DataFrame:

  1. df.columns = ['a', 'b', 'c']
  2. df.rename(columns = {'Col_A': 'a', 'Col_B': 'b', 'Col_C': 'c'}, inplace=True)
  3. df = df.set_axis(['a', 'b', 'c'], axis=1)

Only the second method can rename a subset of the columns; the other two require a complete list of names.


