I worked with a World Bank dataset provided by the course instructors.

The research question was:

Prediction of Adjusted Net National Income Per Capita of Countries

Brief Introduction to the Research Question

The purpose of this project was to identify the best predictors of Adjusted Net National Income Per Capita of countries from multiple World Bank development indicators, such as Exports of Goods and Services, Food Production Index, Foreign Direct Investment (Net Inflows), Forest Area, GDP at Market Prices, and GDP Growth.

At the end of the project I reached the following conclusions:

Brief Overview Of Key Findings And Implications

This project used lasso regression analysis to identify the best predictors of Adjusted Net National Income Per Capita (Current US$) from multiple World Bank development indicators for N = 248 countries and regions of the world in the year 2012. The data are made up of national, regional and global estimates. The Adjusted Net National Income Per Capita for this period ranged from 192.66 (Current US$) to 67688.51 (Current US$), indicating considerable variability across countries and regions of the world for that year.
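To illustrate how lasso regression picks out predictors, here is a minimal sketch using coordinate descent with soft-thresholding. It is not the code used in this project: the data are randomly generated, and the names `strong`, `weak`, `noise_only` and the penalty `alpha=0.4` are made up for the example. The point is only that the L1 penalty shrinks the coefficient of an uninformative variable to exactly zero while keeping the informative ones.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 if it is within the threshold."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def standardize(column):
    """Rescale a column to mean 0 and (population) standard deviation 1."""
    n = len(column)
    mean = sum(column) / n
    std = (sum((v - mean) ** 2 for v in column) / n) ** 0.5
    return [(v - mean) / std for v in column]


def lasso_coordinate_descent(columns, y, alpha, n_iter=100):
    """Minimise (1/(2n)) * sum((y - Xb)^2) + alpha * sum(|b|), one coefficient at a time."""
    n = len(y)
    coefs = [0.0] * len(columns)
    residual = list(y)  # equals y - Xb, and b starts at zero
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            # Covariance of feature j with the residual that ignores feature j
            rho = sum(x * (r + coefs[j] * x) for x, r in zip(col, residual)) / n
            z = sum(x * x for x in col) / n
            new_coef = soft_threshold(rho, alpha) / z
            delta = new_coef - coefs[j]
            if delta:
                residual = [r - delta * x for r, x in zip(residual, col)]
            coefs[j] = new_coef
    return coefs


# Synthetic data (hypothetical): the target depends strongly on one variable,
# weakly on a second, and not at all on the third.
random.seed(0)
n = 200
strong = [random.gauss(0, 1) for _ in range(n)]
weak = [random.gauss(0, 1) for _ in range(n)]
noise_only = [random.gauss(0, 1) for _ in range(n)]
target = [5.0 * s + 1.0 * w + random.gauss(0, 1) for s, w in zip(strong, weak)]

columns = [standardize(c) for c in (strong, weak, noise_only)]
mean_y = sum(target) / n
y_centered = [t - mean_y for t in target]

coefs = lasso_coordinate_descent(columns, y_centered, alpha=0.4)
for name, coef in zip(("strong", "weak", "noise_only"), coefs):
    print(name, round(coef, 3))
```

In practice a library routine with a cross-validated penalty would normally replace the hand-written loop; the sketch only shows the mechanism.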

 

The prediction accuracy of the model was 0.995739422921 (99.6%) on the training dataset and 0.99525790594 (99.5%) when run on the test dataset. Hence, the algorithm helped to identify the best predictors of Adjusted Net National Income Per Capita (Current US$) of countries for the year 2012, given the indicators available in this World Bank dataset.
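Assuming the accuracy figures above are R-squared values (the proportion of variance in the response explained by the model, which is what a regression "score" usually reports), the calculation can be sketched as follows. The numbers in the example are invented for illustration and are not taken from the World Bank dataset.

```python
def r_squared(actual, predicted):
    """Return 1 - SS_res / SS_tot for paired actual and predicted values."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical values, not project data
example_actual = [1.0, 2.0, 3.0, 4.0]
example_predicted = [1.1, 1.9, 3.2, 3.8]

example_r2 = r_squared(example_actual, example_predicted)
print(round(example_r2, 4))
```

The same function would be applied once to the training set and once to the test set to obtain the two figures being compared.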

 

There was a significant increase in the MSE when the lasso regression model fitted on the training set was used to predict the Adjusted Net National Income Per Capita (Current US$) in the test data set. This suggests that the predictive accuracy of the model may not be very stable on future datasets, so it should be examined further using other analytic methods such as multiple regression.
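The comparison described above can be sketched like this: compute the mean squared error (MSE) separately on the training and test predictions and look at how much larger the test error is. The values below are invented for illustration; the project's actual MSE figures are in the full report.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical values, not project data
train_actual = [10.0, 20.0, 30.0]
train_predicted = [11.0, 19.0, 31.0]
test_actual = [15.0, 25.0, 35.0]
test_predicted = [18.0, 22.0, 38.0]

train_mse = mean_squared_error(train_actual, train_predicted)
test_mse = mean_squared_error(test_actual, test_predicted)

# A ratio well above 1 means the model fits unseen data noticeably worse
mse_ratio = test_mse / train_mse
print(train_mse, test_mse, mse_ratio)
```

Note that MSE is measured in squared units of the response, so a large jump in MSE can coexist with a high R-squared when the response itself varies over a very wide range.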

 

The Pearson correlation “r” values and associated p-values revealed that all the retained variables were significantly associated with Adjusted Net National Income Per Capita (Current US$), as can be seen in Table 2 above. From the scatter plots, the Pearson “r” values and the multivariate analysis, it can be seen that GDP PER CAPITA (CURRENT US$) has the strongest relationship with Adjusted Net National Income Per Capita (Current US$), followed by HEALTH EXPENDITURE PER CAPITA (CURRENT US$).
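For reference, the Pearson "r" between one predictor and the response can be computed as in the sketch below. The two short lists are made-up numbers, not values from the dataset, and the p-value is left out because it needs a t-distribution (a statistics library would normally supply both).

```python
def pearson_r(x, y):
    """Pearson correlation: covariance of x and y divided by the product of their spreads."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (spread_x * spread_y)


# Hypothetical predictor and response values, not project data
predictor = [1.0, 2.0, 3.0, 4.0, 5.0]
response = [2.0, 4.0, 5.0, 4.0, 5.0]

example_r = pearson_r(predictor, response)
print(round(example_r, 4))
```

Values of r close to +1 or -1 indicate a strong linear relationship; the sign gives its direction.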

 

This means that countries and regions should invest in increasing their GDP PER CAPITA (CURRENT US$) and should also make more effort and allocate more resources to HEALTH EXPENDITURE PER CAPITA (CURRENT US$): the more these two variables increase, the more likely their Adjusted Net National Income Per Capita (Current US$) is to increase. The following variables also had positive coefficients, meaning that an increase in them is likely to lead to an increase in the Adjusted Net National Income Per Capita (Current US$) of countries and regions:

  • SECURE INTERNET SERVERS (PER 1 MILLION PEOPLE)
  • FIXED BROADBAND SUBSCRIPTIONS (PER 100 PEOPLE)
  • ADJUSTED NET NATIONAL INCOME (CURRENT US$)
  • SURVIVAL TO AGE 65, MALE (% OF COHORT)
  • POPULATION AGES 65 AND ABOVE (% OF TOTAL)

However, resources are generally scarce globally. If countries and regions have only a few resources at their disposal, it would be advisable to allocate them to GDP PER CAPITA (CURRENT US$) and HEALTH EXPENDITURE PER CAPITA (CURRENT US$), because these were the two variables with the strongest positive association with Adjusted Net National Income Per Capita (Current US$).

 

On the other hand, it is advisable for countries and regions to reduce their

  • ADJUSTED SAVINGS: NATURAL RESOURCES DEPLETION (% OF GNI),
  • RURAL POPULATION (% OF TOTAL POPULATION),
  • ADJUSTED SAVINGS: CARBON DIOXIDE DAMAGE (% OF GNI) and
  • EXPORTS OF GOODS AND SERVICES (% OF GDP)

as a decrease in these variables tends to increase their Adjusted Net National Income Per Capita (Current US$); these factors are negatively associated with it.

Access the full report as PDF here:

Prediction of Adjusted Net National Income Per Capita of Countries

Find my full Python source code here:

Prediction of Adjusted Net National Income Per Capita of Countries PYTHON SOURCE CODE
