Python Pandas pivot() Function | Finxter

0
87
Python Pandas pivot() Function | Finxter


Syntax

pandas.pivot(data, index=None, columns=None, values=None)

Return Value: The return value for the pivot() function is a reshaped DataFrame organized by index/column values.

Background

Direct quote from the Pandas Documentation website:

/ column values. This function does not support data aggregation. If there are multiple values, they will result in a Multi-Index in the columns.

This article delves into each parameter for this function separately.

Getting Started

Remember to add the Required Starter Code to the top of each code snippet. This snippet will allow the code in this article to run error-free.

The data structure used in this article is a Dictionary of Lists made up of Keys (column names) and associated Values (list).

Required Starter Code:

import pandas as pd

staff = {'FName':  ['Clare', 'Micah', 'Ben', 'Mac', 'Emma'], 
         'EID':    [100, 101, 102, 103, 104], 
         'Job':    ['Designer I', 'Data Scientist', 'Developer', 'Designer II', 'Manager'],
         'Age':    [19, 23, 21, 27, 36],
         'Salary': [87343, 94123, 96543, 65232, 102375]}

The “data” Parameter

The pivot() function data parameter is a DataFrame. This parameter can be one of the following data types or another data type that converts to a DataFrame:

If the DataFrame is empty, the following output will display:

df = pd.DataFrame()
print(df)

Output

Empty DataFrame
Columns: []
Index: []

If the DataFrame contains staff, the output will be similar to the table below. Formatting will vary and depend on the IDE used to run the code.

df = pd.DataFrame(staff)
print(df)

Output

  FName  EID Job Age Salary
0 Clare 100 Designer I 19 87343
1 Micah 101 Data Scientist 23 94123
2 Ben 102 Developer 21 96543
3 Mac 103 Designer II 27 65232
4 Emma 104 Manager 36 102375

Note: For all examples in this article, use the staff DataFrame.

The “index” Parameter

The pivot() method’s index parameter is not required. If used, these column(s) create a new index. If empty, the default index numbering system will display. The parameter can be one of the following data types:

  • string
  • object
  • list of strings
  • list of index names

Note: All data assigned to the index parameter must contain unique values. If not, a ValueError occurs.

In this example, the index parameter is the EID and Job columns.  The column parameter is required. If empty, the entire DataFrame will display. If missing, a TypeError will occur.

df_index = df.pivot(index=['EID', 'Job'], columns=[])
print(df_index)

Output:

For this code example, the index parameter contains a list of column names. They display on the left in the same order as entered in the list: not the original data structure order.

staff = {'FName':  ['Clare', 'Micah', 'Ben', 'Mac', 'Emma'], 
         'EID':    [100, 101, 102, 103, 104], 
         'Job':    ['Designer I', 'Data Scientist', 'Developer', 'Designer II', 'Manager'],
         'Age':    [19, 23, 21, 27, 36],
         'Salary': [87343, 94123, 96543, 65232, 102375]}

The remaining columns with the relevant data will display (FName, Age, Salary) on the right.

EID Job FName  Age Salary
100 Designer I Clare 19 87343
101 Data Scientist Micah 23 94123
102 Developer Ben 21 96543
103 Designer II Mac 27 65232
104 Manager Emma 36 102375

The “columns” Parameter

The Pandas pivot() method’s columns parameter is required. The parameter can be one of the following data types:

  • string
  • object
  • list of strings
  • list of index names

If empty (as shown above), the entire DataFrame displays. If missing, a TypeError occurs.

df_cols = df.pivot(index=['EID', 'Job', 'Salary'], columns=['FName'])
print(df_cols)

Output:

In this example, the index columns will display on the left side. The order is the same as that listed in the above parameter.

The list passed to the columns parameter (FNames) displays the text as a sub-heading. The remaining sub-headings are the values from that column.

The Age column is the only column omitted from both parameters. This column becomes the main table heading.

The Age data from the original data structure will display as floats.

The “values” Parameter

This function’s value parameter is not required. The parameter can be one of the following data types:

  • string
  • object
  • list of the previous

This parameter uses column(s) for populating DataFrame values. If this parameter is empty, all remaining columns will display.

In this example, the Salary data from the original data structure will display as floats.

df_values = df.pivot(index=['EID', 'Job', 'Age'], columns=['FName'], values=['Salary'])
print(df_values)

The DataFrame:



Source link

Leave a reply

Please enter your comment!
Please enter your name here