
Creating Graphs For The Data – Data Management and Visualization

Background of the Dataset CSV file Used:

The background of the dataset CSV file was explained extensively in the week 2 assignment. Rather than repeat everything here, please check the background information in my previous assignment, which can be accessed at this link: http://adabadata.tumblr.com/ or, on Tumblr, found among the past posts as the one with the title

(PYTHON PROGRAM For The Research Topic Association Of The Literacy Rate And Life Expectancy & Association Of The Literacy Rate And Income Per Person: The Case of Ghana)

For easy access, though, I will post the link to the actual dataset CSV used for this project here again.

The gapminder_ghana_updated.csv dataset for this project can be viewed and downloaded here:

https://drive.google.com/file/d/0B2KfPRxy4ootbzl5N0g1dUtIVzA/view?pref=2&pli=1

See the screenshot here for a guide (http://prntscr.com/9gctxn)

 

Creating Graphs For My Data

As instructed in the assignment, I will be continuing with the program I have successfully run.
That will be the program written in week 2.

MY PYTHON PROGRAM
CODE: 

 

The Univariate graph for quantitative variable – incomeperperson

please view the graph at this link

https://drive.google.com/file/d/0B2KfPRxy4ootaHoxWWpDTnRjZ1E/view?usp=sharing

This graph is bimodal, with its highest peak in the incomeperperson range of 500 to 1000.
The second highest peak is in the incomeperperson range of 2000 to 2500. The distribution appears to be skewed to the right, as there are higher frequencies at the lower incomeperperson levels than at the higher incomeperperson levels.
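As a rough illustration of how the peaks of a histogram can be read off numerically, here is a minimal sketch that bins values into equal-width ranges and reports the bins that are taller than both neighbours. The function names `bin_counts` and `peak_bins` are my own, and the `sample_income` values are made up for the example; they are not taken from the Gapminder file.

```python
def bin_counts(values, bin_width):
    """Count values per equal-width bin, keyed by the bin's lower edge.

    Empty bins between the smallest and largest value are kept with a
    count of 0 so that neighbouring keys are always adjacent bins.
    """
    indexes = [int(v // bin_width) for v in values]
    counts = {i * bin_width: 0 for i in range(min(indexes), max(indexes) + 1)}
    for i in indexes:
        counts[i * bin_width] += 1
    return counts


def peak_bins(counts):
    """Return the lower edges of bins that are taller than both neighbours."""
    edges = list(counts)
    peaks = []
    for pos, edge in enumerate(edges):
        left = counts[edges[pos - 1]] if pos > 0 else 0
        right = counts[edges[pos + 1]] if pos < len(edges) - 1 else 0
        if counts[edge] > left and counts[edge] > right:
            peaks.append(edge)
    return peaks


# Hypothetical incomeperperson values, NOT the real Ghana data.
sample_income = [600, 700, 800, 900, 1200, 1600, 2100, 2200, 2300, 2800]

counts = bin_counts(sample_income, 500)
peaks = peak_bins(counts)
print(counts)
print("Peak bins start at:", peaks)
```

Two separated peak bins is what "bimodal" means for a histogram. Note that the result depends on the bin width chosen, so a different width can merge or split peaks.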

 

 

The Univariate graph for quantitative variable – lifeexpectancy

please view the graph at this link

https://drive.google.com/file/d/0B2KfPRxy4ootTjNnMDhDLTVTTFk/view?usp=sharing

 

This graph is bimodal, with its highest peak at a lifeexpectancy between 24 and 30 years. The second highest peak is at a lifeexpectancy of 60 to 66 years.

It appears to be skewed to the right, as there is a very high frequency at the lower number of years compared with the higher number of years.
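A skew judged by eye from a histogram can be cross-checked with two simple numbers: whether the mean sits above the median, and the sign of the moment-based skewness. The sketch below uses only the standard library. `sample_skewness` is a helper name I chose, and `sample_years` is a made-up list shaped like the description above (many low values, a few high ones), not the actual lifeexpectancy column.

```python
import statistics


def sample_skewness(values):
    """Moment-based skewness; a positive value suggests a longer right tail."""
    mean = statistics.fmean(values)
    spread = statistics.pstdev(values)  # population standard deviation
    return sum(((v - mean) / spread) ** 3 for v in values) / len(values)


# Hypothetical lifeexpectancy values in years, NOT the real Ghana data.
sample_years = [25, 26, 27, 28, 29, 30, 45, 62, 64]

mean_years = statistics.fmean(sample_years)
median_years = statistics.median(sample_years)
skew = sample_skewness(sample_years)

print("mean above median:", mean_years > median_years)
print("skewness is positive:", skew > 0)
```

For a strongly bimodal variable a single skewness figure can be misleading, so it is best read alongside the histogram rather than instead of it.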

 

The Univariate graph for quantitative variable – literacyrate

please view the graph at this link

https://drive.google.com/file/d/0B2KfPRxy4ootUEdGU28wQXl6cVE/view?usp=sharing

This graph is uniform because it has no mode, that is, no value around which the distribution is concentrated. It is uniform not because all the literacyrate values gathered over the entire 216 years are around the same figure, but because only 2 literacyrate values were present in the Gapminder data for the country under discussion, Ghana.

Hence there are a great many NaN values (years without any literacyrate data collected), leaving me with only 2 values to analyse: literacyrates of 57.897473 and 71.497075. This makes the literacyrate data quite limited.
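The effect of the missing values can be sketched by counting how many observations survive once NaN entries are dropped. The list below is a stand-in for the literacyrate column: the length of 216 and the two values come from the text above, while the positions of the two values in the list are arbitrary placeholders, since the years they belong to are not stated here.

```python
import math

TOTAL_YEARS = 216  # number of yearly rows mentioned in the post

# Stand-in for the literacyrate column: NaN everywhere except two entries.
literacyrate = [math.nan] * TOTAL_YEARS
literacyrate[200] = 57.897473  # position is an arbitrary placeholder
literacyrate[210] = 71.497075  # position is an arbitrary placeholder

# Keep only the entries that actually hold a number.
observed = [value for value in literacyrate if not math.isnan(value)]
missing = len(literacyrate) - len(observed)

print("observed values:", observed)
print("missing values:", missing)
```

With only two observed values, a histogram can show at most two bars of height 1, which is why it looks flat.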

 

Scatterplot for the relationship between Literacy Rate and Life Expectancy of Ghana

please view the graph at this link

https://drive.google.com/file/d/0B2KfPRxy4ootRXR2NkJvS3BUR1E/view?usp=sharing

From the scatterplot it can be seen that the higher the literacyrate, the higher the lifeexpectancy of the people of Ghana, and the lower the literacyrate, the lower the lifeexpectancy. This suggests a positive relationship between the literacyrate and the lifeexpectancy of the people of Ghana. However, it must be mentioned that the Gapminder data for the literacyrate is limited, as it is recorded for only 2 different years. This does not give us enough data to conclude definitely that there is a positive relationship between the literacyrate and the lifeexpectancy of the people of Ghana, even though the 2 literacyrate values and the over 200 lifeexpectancy values generating the scatterplot above depict one.

 

Showing The Line Of Best Fit on the Scatterplot for the relationship between Literacy Rate and Life Expectancy of Ghana

please view the graph at this link

https://drive.google.com/file/d/0B2KfPRxy4ootRU1Ma2x4S28wTjA/view?usp=sharing

The line of best fit in the scatterplot suggests there is a positive relationship between the literacyrate and the lifeexpectancy of the people of Ghana. However, as mentioned above, because the Gapminder data on the literacyrate is limited, I cannot definitely conclude that there is a positive relationship between the literacyrate and the lifeexpectancy of the people of Ghana, even though the scatterplot suggests one.
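The caution above can be made concrete. Assuming the years with a missing literacyrate are dropped before the line is fitted (which is how plotting tools commonly treat NaN pairs), only two points remain, and a least-squares line through two points always fits them exactly whatever their direction. The sketch below is a plain-Python least-squares fit. The two literacyrate values are the ones quoted earlier, but both lifeexpectancy lists are hypothetical, chosen only to show a rising and a falling case.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def largest_residual(xs, ys):
    """Largest vertical distance between any point and the fitted line."""
    slope, intercept = least_squares_line(xs, ys)
    return max(abs(y - (slope * x + intercept)) for x, y in zip(xs, ys))


literacyrate = [57.897473, 71.497075]  # the two values quoted in the post
rising_years = [58.0, 61.0]            # hypothetical lifeexpectancy values
falling_years = [61.0, 58.0]           # hypothetical, same values reversed

slope_up, _ = least_squares_line(literacyrate, rising_years)
slope_down, _ = least_squares_line(literacyrate, falling_years)

print("rising case slope is positive:", slope_up > 0)
print("falling case slope is negative:", slope_down < 0)
print("largest residuals:",
      largest_residual(literacyrate, rising_years),
      largest_residual(literacyrate, falling_years))
```

In both cases the residuals are effectively zero, so a tidy-looking line through two points says nothing about how well a line would describe a fuller dataset.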

 

Scatterplot for the relationship between Literacy Rate and Income Per Person of Ghana

please view the graph at this link

https://drive.google.com/file/d/0B2KfPRxy4ootMWtRYVBNdVQ5c0k/view?usp=sharing

 

The analysis here is similar to the one for the relationship between the literacyrate and the lifeexpectancy of the people of Ghana.

From the scatterplot it can be seen that the higher the literacyrate, the higher the incomeperperson of the people of Ghana, and the lower the literacyrate, the lower the incomeperperson. This suggests a positive relationship between the literacyrate and the incomeperperson of the people of Ghana. However, it must be mentioned that the Gapminder data for the literacyrate is limited, as it is recorded for only 2 different years. This does not give us enough data to conclude definitely that there is a positive relationship between the literacyrate and the incomeperperson of the people of Ghana, even though the 2 literacyrate values and the over 200 incomeperperson values generating the scatterplot above depict one.

 

Showing The Line Of Best Fit on the Scatterplot for the relationship between Literacy Rate and Income Per Person of Ghana

please view the graph at this link

https://drive.google.com/file/d/0B2KfPRxy4ootM25rZE4zV05yS3M/view?usp=sharing

 

The line of best fit in the scatterplot suggests there is a positive relationship between the literacyrate and the incomeperperson of the people of Ghana. However, as mentioned above, because the Gapminder data on the literacyrate is limited, I cannot definitely conclude that there is a positive relationship between the literacyrate and the incomeperperson of the people of Ghana, even though the scatterplot suggests one.
