Python Pandas Groupby function agg Series GroupbyObject

Group By Function

This is a quick look at the pandas groupby function, a powerful and useful tool for summarizing data by category. We will take a simple look at it here.

Credits to Data School, creator of the Python course materials.

Let's import a sample dataset.
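As a minimal sketch of the loading step: the snippet below assumes the data lives in a CSV file, and the file name `drinks.csv` and the variable name `drinks` are hypothetical. So that it runs without a download, it re-creates only the five sample rows shown in this post.

```python
import pandas as pd

# With the full CSV on disk you would load it like this
# ("drinks.csv" is a hypothetical path, not taken from the post):
# drinks = pd.read_csv("drinks.csv")

# Stand-in: the five sample rows shown in this post
drinks = pd.DataFrame({
    "country": ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola"],
    "beer_servings": [0, 89, 25, 245, 217],
    "spirit_servings": [0, 132, 0, 138, 57],
    "wine_servings": [0, 54, 14, 312, 45],
    "total_litres_of_pure_alcohol": [0.0, 4.9, 0.7, 12.4, 5.9],
    "continent": ["Asia", "Europe", "Africa", "Europe", "Africa"],
})

# Peek at the first rows and the overall shape
print(drinks.head())
print(drinks.shape)
```

The later sketches in this post reuse the same five rows, so their numbers differ from the full-dataset outputs.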

In [18]:

 

In [19]:

 

In [20]:

 

Out[20]:
       country  beer_servings  spirit_servings  wine_servings  total_litres_of_pure_alcohol  continent
0  Afghanistan              0                0              0                           0.0       Asia
1      Albania             89              132             54                           4.9     Europe
2      Algeria             25                0             14                           0.7     Africa
3      Andorra            245              138            312                          12.4     Europe
4       Angola            217               57             45                           5.9     Africa

What is the average beer servings across ALL countries?
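One way to answer this, sketched on the five sample rows only (the variable name `drinks` is an assumption), is to select the column and call `mean()` on it:

```python
import pandas as pd

# Five sample rows from this post (not the full dataset)
drinks = pd.DataFrame({
    "country": ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola"],
    "beer_servings": [0, 89, 25, 245, 217],
    "continent": ["Asia", "Europe", "Africa", "Europe", "Africa"],
})

# Mean of one column across every row, ignoring continent
overall_mean = drinks["beer_servings"].mean()
print(overall_mean)

# describe() reports the mean alongside count, min, max and quartiles
print(drinks["beer_servings"].describe())
```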

In [21]:

 

Out[21]:
In [22]:

 

Out[22]:
In [23]:

 

What is the average beer servings by continent?
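A minimal sketch of the per-continent average, again on the five sample rows and with `drinks` as an assumed variable name: group by the `continent` column, pick `beer_servings`, then take the mean of each group.

```python
import pandas as pd

# Five sample rows from this post (not the full dataset)
drinks = pd.DataFrame({
    "country": ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola"],
    "beer_servings": [0, 89, 25, 245, 217],
    "continent": ["Asia", "Europe", "Africa", "Europe", "Africa"],
})

# One mean per continent; the result is a Series indexed by continent
beer_by_continent = drinks.groupby("continent")["beer_servings"].mean()
print(beer_by_continent)
```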

In [24]:

 

Out[24]:

Let's filter the drinks to a single continent, e.g. Africa, and then get its mean.
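The filter can be sketched with a boolean mask (sample rows only; `drinks` is an assumed variable name). The result should match the Africa entry of the grouped mean, which is a handy sanity check.

```python
import pandas as pd

# Five sample rows from this post (not the full dataset)
drinks = pd.DataFrame({
    "country": ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola"],
    "beer_servings": [0, 89, 25, 245, 217],
    "continent": ["Asia", "Europe", "Africa", "Europe", "Africa"],
})

# Keep only the African rows, then average their beer servings
africa = drinks[drinks["continent"] == "Africa"]
africa_mean = africa["beer_servings"].mean()
print(africa_mean)

# Same number as the Africa entry of the grouped result
grouped_mean = drinks.groupby("continent")["beer_servings"].mean()["Africa"]
```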

In [25]:

 

Out[25]:

Let's find the maximum beer_servings by continent.
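Swapping `mean()` for `max()` gives the per-continent maximum. A sketch on the sample rows, with `drinks` as an assumed variable name:

```python
import pandas as pd

# Five sample rows from this post (not the full dataset)
drinks = pd.DataFrame({
    "country": ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola"],
    "beer_servings": [0, 89, 25, 245, 217],
    "continent": ["Asia", "Europe", "Africa", "Europe", "Africa"],
})

# Largest beer_servings value within each continent
max_beer = drinks.groupby("continent")["beer_servings"].max()
print(max_beer)
```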

In [26]:

 

Out[26]:

There is a powerful ‘agg’ function which allows us to specify multiple functions at one time by passing them as a list.
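A sketch of what such a call might look like, using the four function names that appear as column headers in the output (sample rows only; `drinks` is an assumed variable name):

```python
import pandas as pd

# Five sample rows from this post (not the full dataset)
drinks = pd.DataFrame({
    "country": ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola"],
    "beer_servings": [0, 89, 25, 245, 217],
    "continent": ["Asia", "Europe", "Africa", "Europe", "Africa"],
})

# Each function name in the list becomes one column of the result
summary = drinks.groupby("continent")["beer_servings"].agg(
    ["count", "min", "max", "mean"]
)
print(summary)
```

The result is a DataFrame with one row per continent and one column per aggregation.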

In [27]:

 

Out[27]:
               count  min  max        mean
continent
Africa            53    0  376   61.471698
Asia              44    0  247   37.045455
Europe            45    0  361  193.777778
North America     23    1  285  145.434783
Oceania           16    0  306   89.687500
South America     12   93  333  175.083333

You can also make calculations across all the numerical columns at one time by not selecting any specific column to use for calculation.
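A sketch on the sample rows, with `drinks` as an assumed variable name. Passing `numeric_only=True` is a choice made here: recent pandas versions raise an error when asked to average a text column such as `country`, so the flag restricts the mean to the numeric columns.

```python
import pandas as pd

# Five sample rows from this post (not the full dataset)
drinks = pd.DataFrame({
    "country": ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola"],
    "beer_servings": [0, 89, 25, 245, 217],
    "spirit_servings": [0, 132, 0, 138, 57],
    "wine_servings": [0, 54, 14, 312, 45],
    "total_litres_of_pure_alcohol": [0.0, 4.9, 0.7, 12.4, 5.9],
    "continent": ["Asia", "Europe", "Africa", "Europe", "Africa"],
})

# No column selected: every numeric column is averaged per continent
means = drinks.groupby("continent").mean(numeric_only=True)
print(means)
```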

In [28]:

 

Out[28]:
               beer_servings  spirit_servings  wine_servings  total_litres_of_pure_alcohol
continent
Africa             61.471698        16.339623      16.264151                      3.007547
Asia               37.045455        60.840909       9.068182                      2.170455
Europe            193.777778       132.555556     142.222222                      8.617778
North America     145.434783       165.739130      24.521739                      5.995652
Oceania            89.687500        58.437500      35.625000                      3.381250
South America     175.083333       114.750000      62.416667                      6.308333

We can visualize this information in a simple plot.
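A possible sketch of the plot, assuming matplotlib is installed. The bar chart is an assumption, since the post does not say which kind of plot was drawn, and `drinks` is an assumed variable name. The `Agg` backend is selected only so the snippet runs as a plain script without a display; in a notebook that line can be dropped.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import pandas as pd

# Numeric columns of the five sample rows from this post
drinks = pd.DataFrame({
    "beer_servings": [0, 89, 25, 245, 217],
    "spirit_servings": [0, 132, 0, 138, 57],
    "wine_servings": [0, 54, 14, 312, 45],
    "total_litres_of_pure_alcohol": [0.0, 4.9, 0.7, 12.4, 5.9],
    "continent": ["Asia", "Europe", "Africa", "Europe", "Africa"],
})

# One cluster of bars per continent, one bar per numeric column
means = drinks.groupby("continent").mean(numeric_only=True)
ax = means.plot(kind="bar")
ax.set_ylabel("average per country")
```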

In [29]:

 

Out[29]:

You can also retrieve how many times each continent appears.
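Two common ways to count rows per continent, sketched on the sample rows (`drinks` is an assumed variable name, and the post does not say which call it used):

```python
import pandas as pd

# Five sample rows from this post (not the full dataset)
drinks = pd.DataFrame({
    "country": ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola"],
    "continent": ["Asia", "Europe", "Africa", "Europe", "Africa"],
})

# value_counts() sorts by frequency, most common first
counts = drinks["continent"].value_counts()
print(counts)

# groupby(...).size() gives the same numbers, sorted by continent name
sizes = drinks.groupby("continent").size()
print(sizes)
```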

In [32]:

 

Out[32]:

Group By SPLITS the dataframe into groups, each identified by its own key. Functions can then be applied to each individual group, to a selection of groups, or to all groups at once.
We can run our analysis on the groups and afterwards combine the results back into a dataframe!
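To make the split, apply and combine steps concrete, here is a hand-rolled sketch on the sample rows (`drinks` and the other variable names are assumptions). It is for illustration only; `groupby` does the same work in one line.

```python
import pandas as pd

# Five sample rows from this post (not the full dataset)
drinks = pd.DataFrame({
    "country": ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola"],
    "beer_servings": [0, 89, 25, 245, 217],
    "continent": ["Asia", "Europe", "Africa", "Europe", "Africa"],
})

pieces = {}
for continent in drinks["continent"].unique():
    subset = drinks[drinks["continent"] == continent]   # split
    pieces[continent] = subset["beer_servings"].mean()  # apply

# combine the per-group results into one Series
manual = pd.Series(pieces, name="beer_servings").sort_index()

# The one-line groupby equivalent
auto = drinks.groupby("continent")["beer_servings"].mean()
print(manual)
print(auto)
```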

Let's create a groupby object.

In [33]:

 

Let's check its type.
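A sketch of creating the object and inspecting it, on the sample rows. The names `drinks` and `grouped` are assumptions. Nothing is computed at this point; the object only records how the rows are to be split.

```python
import pandas as pd

# Five sample rows from this post (not the full dataset)
drinks = pd.DataFrame({
    "country": ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola"],
    "beer_servings": [0, 89, 25, 245, 217],
    "continent": ["Asia", "Europe", "Africa", "Europe", "Africa"],
})

# Store the grouping without aggregating anything yet
grouped = drinks.groupby("continent")

# The class is DataFrameGroupBy, not DataFrame
print(type(grouped))

# Number of distinct groups found
print(grouped.ngroups)
```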

In [34]:

 

Out[34]:

The groupby object is iterable, and each split object (a sub-dataframe) produced by the groupby function has its respective key.
Let's iterate through this grouped object.
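A sketch of the iteration on the sample rows (`drinks` and `grouped` are assumed names). Each step yields a pair: the group key and the sub-dataframe holding that group's rows.

```python
import pandas as pd

# Five sample rows from this post (not the full dataset)
drinks = pd.DataFrame({
    "country": ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola"],
    "beer_servings": [0, 89, 25, 245, 217],
    "continent": ["Asia", "Europe", "Africa", "Europe", "Africa"],
})

grouped = drinks.groupby("continent")

# Each iteration yields (key, sub-dataframe)
keys = []
for continent, group in grouped:
    keys.append(continent)
    print(continent)
    print(group)

# A single group can also be fetched directly by its key
europe = grouped.get_group("Europe")
```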

In [35]:

 

 
