import sqlalchemy
import pyodbc
from pandas import DataFrame
from bokeh.plotting import figure, output_file, show
import pandas as pd

df = DataFrame


## connect to SqlServer Database and get information. 
try:

    import urllib
    params = urllib.quote_plus("DRIVER={SQL Server Native Client 11.0};SERVER=.\MSSQLSERVER_ENT;DATABASE=Antrak;Trusted_Connection=yes;")

    engine = sqlalchemy.create_engine('mssql+pyodbc:///?odbc_connect=%s"' % params)
    resoverall = engine.execute("SELECT top 10 * FROM FlightBookingsMain")


    df = df(resoverall.fetchall())
    df.columns = resoverall.keys()

    
except (RuntimeError, TypeError, NameError):
    print('Error in Conneccting')
    print(RuntimeError, TypeError, NameError)
finally:
    print("connected")

connected

#lets look at the head of the dataframe
df.head()

#lets define the bokeh figure, and declare the x axis as type datetime

p = figure(width=800, height=350, x_axis_type="datetime")

#lots plot both a circle and line graph
p.circle(df['Booking Date'], df['AFP'], size=4, color='darkgrey', alpha=0.2, legend='close')
p.line(df['Booking Date'], df['Total Cost'], color='navy', legend='avg')

<bokeh.models.renderers.GlyphRenderer at 0xc431ac8>

# show the results
output_file('linegraph.html')
show(p)

# bokeh server plotting
#sample server plotting code from bokeh

from bokeh.plotting import figure, show, output_server

p = figure(title="Server Plot")
p.circle([1, 2, 3], [4, 5, 6])

output_server("hover")

show(p)

I came across this brilliant analyis by pbpython. and decided to play around with it on this data. you can see the full code and explanation on his website: http://pbpython.com/pandas-pivot-table-explained.html. Though I have added just a little bit of some extra exploration, the key and major analysis are from pbpython so you can check him out.

For convenience sake, let’s define the SEX column as a category and set the order we want to view.

This isn’t strictly required but helps us keep the order we want as we work through analyzing the data.

df['Sex'] = df['Sex'].astype('category')
df['Sex'].cat.set_categories(['m','f'], inplace=True)

The simplest pivot table must have a dataframe and an index . In this case, let’s use the Name as our index.

df['Booking Date'] = pd.to_datetime(df['Booking Date'])

#get year from date

df['year'] = df['Booking Date'].apply(lambda x: x.year)

#get month from date

df['month'] = df['Booking Date'].apply(lambda x: x.month)

df.head()

#get all data less than a particular date
df[df['Booking Date']<'2014-10-28']

year_plot = pd.pivot_table(df, index=['year'])
year_plot

p.line(df['year'], df['Total Cost'], color='darkgrey', alpha=0.2, legend='year plot')
output_file('linegraph.html')
show(p)

pv_result = pd.pivot_table(df, index=['Last Name'])

type(pv_result)
pv_result

You can have multiple indexes as well. In fact, most of the pivot_table args can take multiple values via a list.

pd.pivot_table(df, index=['Last Name','Booking Date','Departure Date'])

For this purpose, the Flight Booking ID, Insurance and Postcode columns aren’t really useful. Let’s remove it by explicitly defining the columns we care about using the values field.

pd.pivot_table(df, index=['Last Name'], values=['Total Cost'])

The price column automatically averages the data but we can do a count or a sum. Adding them is simple using aggfunc and np.sum .

pd.pivot_table(df, index=['Last Name'], values=['Total Cost'], aggfunc=np.sum)

aggfunc can take a list of functions. Let’s try a mean using the numpy mean function and len to get a count.

pd.pivot_table(df, index=['Last Name'], values=['Total Cost'], aggfunc =[np.sum,len])

I think one of the confusing points with the pivot_table is the use of columns and values . Remember, columns are optional – they provide an additional way to segment the actual values you care about. The aggregation functions are applied to the values you list.

	First Name	Last Name	Sex	Phone	Fax	Email Address	Booking Date	Departure Date	Arrival Date	Address1	…	Number of Passengers	Total Cost	Currency Numeric Code	Company website	Flight ID Number	Flight Booking ID	Insurance	AFP	Flight Name	Age
0	Ellis	Harris	m	077 5665 3370	077 5665 3373	Ellis.E.Harris@mailinator.com	2012-12-18	2011-07-16	2000-08-23	63 Hull Road	…	2	983	710	http://qvc.com	4111113236285560	1	16	62	ZAR	62
1	Aaron	Field	m	079 6068 9016	079 6068 9015	Aaron.A.Field@mailinator.com	2014-10-28	2005-02-26	2000-07-30	42 Russell Rd	…	9	2452	710	http://msnbc.com	4111110402728960	2	83	38	ZAR	44
2	Archie	Kaur	m	070 3056 9804	070 3056 9809	Archie.M.Kaur@spambob.com	2003-11-17	2000-12-05	2007-03-07	10 Southend Avenue	…	8	1710	710	http://ap.org	4111110879255550	3	54	58	ZAR	60
3	Bradley	Middleton	m	070 0765 7814	070 0765 7816	Bradley.A.Middleton@dodgeit.com	2013-07-18	2011-02-26	2015-10-25	32 Pier Road	…	6	518	710	http://tripadvisor.com	4111111855573320	4	72	43	ZAR	66
4	Evan	Reeves	m	070 8266 4451	070 8266 4458	Evan.A.Reeves@dodgeit.com	2006-07-20	2006-06-20	2001-02-25	75 Moulton Road	…	7	1943	710	http://aliexpress.com	4111113022178850	5	68	52	ZAR	38

	First Name	Last Name	Sex	Phone	Fax	Email Address	Booking Date	Departure Date	Arrival Date	Address1	…	Currency Numeric Code	Company website	Flight ID Number	Flight Booking ID	Insurance	AFP	Flight Name	Age	year	month
0	Ellis	Harris	NaN	077 5665 3370	077 5665 3373	Ellis.E.Harris@mailinator.com	2012-12-18	2011-07-16	2000-08-23	63 Hull Road	…	710	http://qvc.com	4111113236285560	1	16	62	ZAR	62	2012	12
1	Aaron	Field	NaN	079 6068 9016	079 6068 9015	Aaron.A.Field@mailinator.com	2014-10-28	2005-02-26	2000-07-30	42 Russell Rd	…	710	http://msnbc.com	4111110402728960	2	83	38	ZAR	44	2014	10
2	Archie	Kaur	NaN	070 3056 9804	070 3056 9809	Archie.M.Kaur@spambob.com	2003-11-17	2000-12-05	2007-03-07	10 Southend Avenue	…	710	http://ap.org	4111110879255550	3	54	58	ZAR	60	2003	11
3	Bradley	Middleton	NaN	070 0765 7814	070 0765 7816	Bradley.A.Middleton@dodgeit.com	2013-07-18	2011-02-26	2015-10-25	32 Pier Road	…	710	http://tripadvisor.com	4111111855573320	4	72	43	ZAR	66	2013	7
4	Evan	Reeves	NaN	070 8266 4451	070 8266 4458	Evan.A.Reeves@dodgeit.com	2006-07-20	2006-06-20	2001-02-25	75 Moulton Road	…	710	http://aliexpress.com	4111113022178850	5	68	52	ZAR	38	2006	7

	First Name	Last Name	Sex	Phone	Fax	Email Address	Booking Date	Departure Date	Arrival Date	Address1	…	Currency Numeric Code	Company website	Flight ID Number	Flight Booking ID	Insurance	AFP	Flight Name	Age	year	month
0	Ellis	Harris	NaN	077 5665 3370	077 5665 3373	Ellis.E.Harris@mailinator.com	2012-12-18	2011-07-16	2000-08-23	63 Hull Road	…	710	http://qvc.com	4111113236285560	1	16	62	ZAR	62	2012	12
2	Archie	Kaur	NaN	070 3056 9804	070 3056 9809	Archie.M.Kaur@spambob.com	2003-11-17	2000-12-05	2007-03-07	10 Southend Avenue	…	710	http://ap.org	4111110879255550	3	54	58	ZAR	60	2003	11
3	Bradley	Middleton	NaN	070 0765 7814	070 0765 7816	Bradley.A.Middleton@dodgeit.com	2013-07-18	2011-02-26	2015-10-25	32 Pier Road	…	710	http://tripadvisor.com	4111111855573320	4	72	43	ZAR	66	2013	7
4	Evan	Reeves	NaN	070 8266 4451	070 8266 4458	Evan.A.Reeves@dodgeit.com	2006-07-20	2006-06-20	2001-02-25	75 Moulton Road	…	710	http://aliexpress.com	4111113022178850	5	68	52	ZAR	38	2006	7
6	Harley	Carroll	NaN	077 5239 3650	077 5239 3659	Harley.F.Carroll@dodgeit.com	2008-12-18	2007-09-28	2004-10-05	71 Stone St	…	710	http://fixya.com	4111111662317450	7	83	79	ZAR	60	2008	12
7	Christopher	Gibson	NaN	077 3549 4164	077 3549 4169	Christopher.M.Gibson@mailinator.com	2008-01-28	2011-05-04	2006-01-19	78 Bootham Terrace	…	710	http://vertitechnologygroup.com	4111112798242170	8	75	57	ZAR	60	2008	1
8	Isobel	Holden	NaN	079 2946 6947	079 2946 6946	Isobel.J.Holden@mailinator.com	2001-12-15	2009-05-25	2007-04-20	71 Front St	…	710	http://rr.com	4111114862347230	9	91	99	ZAR	27	2001	12
9	Alfie	Bolton	NaN	079 4864 1114	079 4864 1112	Alfie.K.Bolton@mailinator.com	2006-05-05	2002-11-29	2012-08-06	57 Ockham Road	…	710	http://nfl.com	4111111081803270	10	83	31	ZAR	30	2006	5

	AFP	Age	Flight Booking ID	Insurance	Number of Passengers	Postcode	Total Cost	month
year
2001	99.0	27.0	9.0	91.0	6.0	0.0	978.0	12.0
2003	58.0	60.0	3.0	54.0	8.0	0.0	1710.0	11.0
2006	41.5	34.0	7.5	75.5	7.0	0.0	2346.5	6.0
2008	68.0	60.0	7.5	79.0	8.5	0.0	2751.5	6.5
2012	62.0	62.0	1.0	16.0	2.0	0.0	983.0	12.0
2013	43.0	66.0	4.0	72.0	6.0	0.0	518.0	7.0
2014	38.0	44.0	2.0	83.0	9.0	0.0	2452.0	10.0
2015	14.0	46.0	6.0	55.0	6.0	0.0	1271.0	7.0

	AFP	Age	Flight Booking ID	Insurance	Number of Passengers	Postcode	Total Cost	month	year
Last Name
Bolton	31.0	30.0	10.0	83.0	7.0	0.0	2750.0	5.0	2006.0
Carroll	79.0	60.0	7.0	83.0	10.0	0.0	3869.0	12.0	2008.0
Field	38.0	44.0	2.0	83.0	9.0	0.0	2452.0	10.0	2014.0
Gibson	57.0	60.0	8.0	75.0	7.0	0.0	1634.0	1.0	2008.0
Harris	62.0	62.0	1.0	16.0	2.0	0.0	983.0	12.0	2012.0
Holden	99.0	27.0	9.0	91.0	6.0	0.0	978.0	12.0	2001.0
Kaur	36.0	53.0	4.5	54.5	7.0	0.0	1490.5	9.0	2009.0
Middleton	43.0	66.0	4.0	72.0	6.0	0.0	518.0	7.0	2013.0
Reeves	52.0	38.0	5.0	68.0	7.0	0.0	1943.0	7.0	2006.0

Python Bokeh plotting Data Exploration Visualization And Pivot Tables Analysis

Python iloc, loc, ix Data Retrieving Selection Functions

ImportError: No module named ‘scikits’ – Python Fix Jupyter Notebook

Save Python Pivot Table in Excel Sheets ExcelWriter

Create Batch Dot Bat File to Run Your Python Script With Windows Scheduler

ImportError: No module named learn.base Crab Python Fix

Leave a Reply Cancel reply

DataPandas LTS

EXPLORE DataPandas

ImportAnt link

GET IN TOUCH

© 2024 DataPandas

People Who Read The Above Post Also Read This:

Similar Posts