Pandas iloc, loc, and ix functions are very powerful ways to quickly select data from your dataframe. Today , we take a quick look at these 3 functions.

Credits to Data School, you can check him out in Youtube 

In [1]:
import pandas as pd

 

In [2]:
#lets get this public dataset and play with it
ufo = pd.read_csv('http://bit.ly/uforeports')

 

In [3]:
#lets examine the head
ufo.head()

 

Out[3]:
City Colors Reported Shape Reported State Time
0 Ithaca NaN TRIANGLE NY 6/1/1930 22:00
1 Willingboro NaN OTHER NJ 6/30/1930 20:00
2 Holyoke NaN OVAL CO 2/15/1931 14:00
3 Abilene NaN DISK KS 6/1/1931 13:00
4 New York Worlds Fair NaN LIGHT NY 4/18/1933 19:00
In [4]:
#lets work with loc. it is for selecting rows and columns by label
#lets get first row and all columns
ufo.loc[0, : ]

 

Out[4]:
City                       Ithaca
Colors Reported               NaN
Shape Reported           TRIANGLE
State                          NY
Time               6/1/1930 22:00
Name: 0, dtype: object

that returns a pandas series object

In [5]:
#lets get first 3 rows and all columns, by passing the rows as lists
ufo.loc[[0,1,2], : ]

 

Out[5]:
City Colors Reported Shape Reported State Time
0 Ithaca NaN TRIANGLE NY 6/1/1930 22:00
1 Willingboro NaN OTHER NJ 6/30/1930 20:00
2 Holyoke NaN OVAL CO 2/15/1931 14:00
In [9]:
#we can also use a slice to get all the rows we want and ALL columns
ufo.loc[0:2, : ]    # note that the last row number '2' in the slice is also outputed

 

Out[9]:
City Colors Reported Shape Reported State Time
0 Ithaca NaN TRIANGLE NY 6/1/1930 22:00
1 Willingboro NaN OTHER NJ 6/30/1930 20:00
2 Holyoke NaN OVAL CO 2/15/1931 14:00
In [11]:
#you can also do it this way, but for legibility it is normally not ideal
# selects rows 0 -2 and all columns
ufo.loc[0:2]

 

Out[11]:
City Colors Reported Shape Reported State Time
0 Ithaca NaN TRIANGLE NY 6/1/1930 22:00
1 Willingboro NaN OTHER NJ 6/30/1930 20:00
2 Holyoke NaN OVAL CO 2/15/1931 14:00
In [14]:
#lets do some column selections
ufo.loc[ : ,'City']

 

Out[14]:
0                      Ithaca
1                 Willingboro
2                     Holyoke
3                     Abilene
4        New York Worlds Fair
5                 Valley City
6                 Crater Lake
7                        Alma
8                     Eklutna
9                     Hubbard
10                    Fontana
11                   Waterloo
12                     Belton
13                     Keokuk
14                  Ludington
15                Forest Home
16                Los Angeles
17                  Hapeville
18                     Oneida
19                 Bering Sea
20                   Nebraska
21                        NaN
22                        NaN
23                  Owensboro
24                 Wilderness
25                  San Diego
26                 Wilderness
27                     Clovis
28                 Los Alamos
29               Ft. Duschene
                 ...         
18211                 Holyoke
18212                  Carson
18213                Pasadena
18214                  Austin
18215                El Campo
18216            Garden Grove
18217           Berthoud Pass
18218              Sisterdale
18219            Garden Grove
18220             Shasta Lake
18221                Franklin
18222          Albrightsville
18223              Greenville
18224                 Eufaula
18225             Simi Valley
18226           San Francisco
18227           San Francisco
18228              Kingsville
18229                 Chicago
18230             Pismo Beach
18231             Pismo Beach
18232                    Lodi
18233               Anchorage
18234                Capitola
18235          Fountain Hills
18236              Grant Park
18237             Spirit Lake
18238             Eagle River
18239             Eagle River
18240                    Ybor
Name: City, dtype: object

Lets get multiple columns by passing a list

In [19]:
ufo.loc[ : , ['City','State'] ]

 

Out[19]:
City State
0 Ithaca NY
1 Willingboro NJ
2 Holyoke CO
3 Abilene KS
4 New York Worlds Fair NY
5 Valley City ND
6 Crater Lake CA
7 Alma MI
8 Eklutna AK
9 Hubbard OR
10 Fontana CA
11 Waterloo AL
12 Belton SC
13 Keokuk IA
14 Ludington MI
15 Forest Home CA
16 Los Angeles CA
17 Hapeville GA
18 Oneida TN
19 Bering Sea AK
20 Nebraska NE
21 NaN LA
22 NaN LA
23 Owensboro KY
24 Wilderness WV
25 San Diego CA
26 Wilderness WV
27 Clovis NM
28 Los Alamos NM
29 Ft. Duschene UT
18211 Holyoke MA
18212 Carson CA
18213 Pasadena CA
18214 Austin TX
18215 El Campo TX
18216 Garden Grove CA
18217 Berthoud Pass CO
18218 Sisterdale TX
18219 Garden Grove CA
18220 Shasta Lake CA
18221 Franklin NH
18222 Albrightsville PA
18223 Greenville SC
18224 Eufaula OK
18225 Simi Valley CA
18226 San Francisco CA
18227 San Francisco CA
18228 Kingsville TX
18229 Chicago IL
18230 Pismo Beach CA
18231 Pismo Beach CA
18232 Lodi WI
18233 Anchorage AK
18234 Capitola CA
18235 Fountain Hills AZ
18236 Grant Park IL
18237 Spirit Lake IA
18238 Eagle River WI
18239 Eagle River WI
18240 Ybor FL

18241 rows × 2 columns

We can also specify a range

In [16]:
ufo.loc[ : , 'City':'State']

 

Out[16]:
City Colors Reported Shape Reported State
0 Ithaca NaN TRIANGLE NY
1 Willingboro NaN OTHER NJ
2 Holyoke NaN OVAL CO
3 Abilene NaN DISK KS
4 New York Worlds Fair NaN LIGHT NY
5 Valley City NaN DISK ND
6 Crater Lake NaN CIRCLE CA
7 Alma NaN DISK MI
8 Eklutna NaN CIGAR AK
9 Hubbard NaN CYLINDER OR
10 Fontana NaN LIGHT CA
11 Waterloo NaN FIREBALL AL
12 Belton RED SPHERE SC
13 Keokuk NaN OVAL IA
14 Ludington NaN DISK MI
15 Forest Home NaN CIRCLE CA
16 Los Angeles NaN NaN CA
17 Hapeville NaN NaN GA
18 Oneida NaN RECTANGLE TN
19 Bering Sea RED OTHER AK
20 Nebraska NaN DISK NE
21 NaN NaN NaN LA
22 NaN NaN LIGHT LA
23 Owensboro NaN RECTANGLE KY
24 Wilderness NaN DISK WV
25 San Diego NaN CIGAR CA
26 Wilderness NaN DISK WV
27 Clovis NaN DISK NM
28 Los Alamos NaN DISK NM
29 Ft. Duschene NaN DISK UT
18211 Holyoke NaN DIAMOND MA
18212 Carson NaN DISK CA
18213 Pasadena GREEN FIREBALL CA
18214 Austin NaN FORMATION TX
18215 El Campo NaN OTHER TX
18216 Garden Grove ORANGE LIGHT CA
18217 Berthoud Pass NaN TRIANGLE CO
18218 Sisterdale NaN DIAMOND TX
18219 Garden Grove NaN CHEVRON CA
18220 Shasta Lake BLUE DISK CA
18221 Franklin NaN DISK NH
18222 Albrightsville NaN OTHER PA
18223 Greenville NaN NaN SC
18224 Eufaula NaN DISK OK
18225 Simi Valley NaN FORMATION CA
18226 San Francisco NaN FORMATION CA
18227 San Francisco NaN TRIANGLE CA
18228 Kingsville NaN LIGHT TX
18229 Chicago NaN DISK IL
18230 Pismo Beach NaN OVAL CA
18231 Pismo Beach NaN OVAL CA
18232 Lodi NaN NaN WI
18233 Anchorage RED VARIOUS AK
18234 Capitola NaN TRIANGLE CA
18235 Fountain Hills NaN NaN AZ
18236 Grant Park NaN TRIANGLE IL
18237 Spirit Lake NaN DISK IA
18238 Eagle River NaN NaN WI
18239 Eagle River RED LIGHT WI
18240 Ybor NaN OVAL FL

18241 rows × 4 columns

We can also combine the row and column selections

In [17]:
ufo.loc[0:2, 'City':'State']

 

Out[17]:
City Colors Reported Shape Reported State
0 Ithaca NaN TRIANGLE NY
1 Willingboro NaN OTHER NJ
2 Holyoke NaN OVAL CO

We can also simply drop the time column to achieve the same result

In [18]:
ufo.head(3).drop('Time', axis=1)

 

Out[18]:
City Colors Reported Shape Reported State
0 Ithaca NaN TRIANGLE NY
1 Willingboro NaN OTHER NJ
2 Holyoke NaN OVAL CO

You can also select data using boolean statemet. example lets select where city = OAKLAND

In [21]:
ufo[ufo.City=='Oakland']

 

Out[21]:
City Colors Reported Shape Reported State Time
1694 Oakland NaN CIGAR CA 7/21/1968 14:00
2144 Oakland NaN DISK CA 8/19/1971 0:00
4686 Oakland NaN LIGHT MD 6/1/1982 0:00
7293 Oakland NaN LIGHT CA 3/28/1994 17:00
8488 Oakland NaN NaN CA 8/10/1995 21:45
8768 Oakland NaN NaN CA 10/10/1995 22:40
10816 Oakland NaN LIGHT OR 10/1/1997 21:30
10948 Oakland NaN DISK CA 11/14/1997 19:55
11045 Oakland NaN TRIANGLE CA 12/10/1997 1:30
12322 Oakland NaN FIREBALL CA 10/9/1998 19:40
12941 Oakland NaN CYLINDER CA 1/23/1999 21:30
16803 Oakland NaN TRIANGLE MD 7/4/2000 23:00
17322 Oakland NaN CYLINDER CA 9/1/2000 21:35

We can do same with loc

In [22]:
ufo.loc[ufo.City=='Oakland', : ]

 

Out[22]:
City Colors Reported Shape Reported State Time
1694 Oakland NaN CIGAR CA 7/21/1968 14:00
2144 Oakland NaN DISK CA 8/19/1971 0:00
4686 Oakland NaN LIGHT MD 6/1/1982 0:00
7293 Oakland NaN LIGHT CA 3/28/1994 17:00
8488 Oakland NaN NaN CA 8/10/1995 21:45
8768 Oakland NaN NaN CA 10/10/1995 22:40
10816 Oakland NaN LIGHT OR 10/1/1997 21:30
10948 Oakland NaN DISK CA 11/14/1997 19:55
11045 Oakland NaN TRIANGLE CA 12/10/1997 1:30
12322 Oakland NaN FIREBALL CA 10/9/1998 19:40
12941 Oakland NaN CYLINDER CA 1/23/1999 21:30
16803 Oakland NaN TRIANGLE MD 7/4/2000 23:00
17322 Oakland NaN CYLINDER CA 9/1/2000 21:35

We can also select specific rows with the boolean expression

In [24]:
ufo.loc[ufo.City=='Oakland', 'State']

 

Out[24]:
1694     CA
2144     CA
4686     MD
7293     CA
8488     CA
8768     CA
10816    OR
10948    CA
11045    CA
12322    CA
12941    CA
16803    MD
17322    CA
Name: State, dtype: object

Now lets tate a look at it ‘iloc’
iloc is for selecting rows and columns by integer position. that is what the ‘i’ stands for

Lets select all rows and columns in integer positions 0 and 3

In [25]:
ufo.iloc[ : , [0, 3]]

 

Out[25]:
City State
0 Ithaca NY
1 Willingboro NJ
2 Holyoke CO
3 Abilene KS
4 New York Worlds Fair NY
5 Valley City ND
6 Crater Lake CA
7 Alma MI
8 Eklutna AK
9 Hubbard OR
10 Fontana CA
11 Waterloo AL
12 Belton SC
13 Keokuk IA
14 Ludington MI
15 Forest Home CA
16 Los Angeles CA
17 Hapeville GA
18 Oneida TN
19 Bering Sea AK
20 Nebraska NE
21 NaN LA
22 NaN LA
23 Owensboro KY
24 Wilderness WV
25 San Diego CA
26 Wilderness WV
27 Clovis NM
28 Los Alamos NM
29 Ft. Duschene UT
18211 Holyoke MA
18212 Carson CA
18213 Pasadena CA
18214 Austin TX
18215 El Campo TX
18216 Garden Grove CA
18217 Berthoud Pass CO
18218 Sisterdale TX
18219 Garden Grove CA
18220 Shasta Lake CA
18221 Franklin NH
18222 Albrightsville PA
18223 Greenville SC
18224 Eufaula OK
18225 Simi Valley CA
18226 San Francisco CA
18227 San Francisco CA
18228 Kingsville TX
18229 Chicago IL
18230 Pismo Beach CA
18231 Pismo Beach CA
18232 Lodi WI
18233 Anchorage AK
18234 Capitola CA
18235 Fountain Hills AZ
18236 Grant Park IL
18237 Spirit Lake IA
18238 Eagle River WI
18239 Eagle River WI
18240 Ybor FL

18241 rows × 2 columns

We can also do a slice range of numbers. When you do a slice range of numbers, the selected data does not include the last number
in the slice

In [26]:
ufo.iloc[ : , 0:4]

 

Out[26]:
City Colors Reported Shape Reported State
0 Ithaca NaN TRIANGLE NY
1 Willingboro NaN OTHER NJ
2 Holyoke NaN OVAL CO
3 Abilene NaN DISK KS
4 New York Worlds Fair NaN LIGHT NY
5 Valley City NaN DISK ND
6 Crater Lake NaN CIRCLE CA
7 Alma NaN DISK MI
8 Eklutna NaN CIGAR AK
9 Hubbard NaN CYLINDER OR
10 Fontana NaN LIGHT CA
11 Waterloo NaN FIREBALL AL
12 Belton RED SPHERE SC
13 Keokuk NaN OVAL IA
14 Ludington NaN DISK MI
15 Forest Home NaN CIRCLE CA
16 Los Angeles NaN NaN CA
17 Hapeville NaN NaN GA
18 Oneida NaN RECTANGLE TN
19 Bering Sea RED OTHER AK
20 Nebraska NaN DISK NE
21 NaN NaN NaN LA
22 NaN NaN LIGHT LA
23 Owensboro NaN RECTANGLE KY
24 Wilderness NaN DISK WV
25 San Diego NaN CIGAR CA
26 Wilderness NaN DISK WV
27 Clovis NaN DISK NM
28 Los Alamos NaN DISK NM
29 Ft. Duschene NaN DISK UT
18211 Holyoke NaN DIAMOND MA
18212 Carson NaN DISK CA
18213 Pasadena GREEN FIREBALL CA
18214 Austin NaN FORMATION TX
18215 El Campo NaN OTHER TX
18216 Garden Grove ORANGE LIGHT CA
18217 Berthoud Pass NaN TRIANGLE CO
18218 Sisterdale NaN DIAMOND TX
18219 Garden Grove NaN CHEVRON CA
18220 Shasta Lake BLUE DISK CA
18221 Franklin NaN DISK NH
18222 Albrightsville NaN OTHER PA
18223 Greenville NaN NaN SC
18224 Eufaula NaN DISK OK
18225 Simi Valley NaN FORMATION CA
18226 San Francisco NaN FORMATION CA
18227 San Francisco NaN TRIANGLE CA
18228 Kingsville NaN LIGHT TX
18229 Chicago NaN DISK IL
18230 Pismo Beach NaN OVAL CA
18231 Pismo Beach NaN OVAL CA
18232 Lodi NaN NaN WI
18233 Anchorage RED VARIOUS AK
18234 Capitola NaN TRIANGLE CA
18235 Fountain Hills NaN NaN AZ
18236 Grant Park NaN TRIANGLE IL
18237 Spirit Lake NaN DISK IA
18238 Eagle River NaN NaN WI
18239 Eagle River RED LIGHT WI
18240 Ybor NaN OVAL FL

18241 rows × 4 columns

the returned data does not include the data at integer number 4, so it only returns 0,1,2,3 integer positions

lets select a set of rows and all columns

In [28]:
ufo.iloc[0:3 , : ]

 

Out[28]:
City Colors Reported Shape Reported State Time
0 Ithaca NaN TRIANGLE NY 6/1/1930 22:00
1 Willingboro NaN OTHER NJ 6/30/1930 20:00
2 Holyoke NaN OVAL CO 2/15/1931 14:00
In [ ]:

Now lets take a look at ix. it allows you to mix labels and integers. Tis kind of blend between loc and iloc

lets grab some new public data

In [31]:
drinks = pd.read_csv('http://bit.ly/drinksbycountry', index_col='country') #lets set the index to country

 

In [32]:
drinks.head()

 

Out[32]:
beer_servings spirit_servings wine_servings total_litres_of_pure_alcohol continent
country
Afghanistan 0 0 0 0.0 Asia
Albania 89 132 54 4.9 Europe
Algeria 25 0 14 0.7 Africa
Andorra 245 138 312 12.4 Europe
Angola 217 57 45 5.9 Africa

Lets select the first figure under beer servings for albania

In [33]:
drinks.ix['Albania',0]

 

Out[33]:
89

we can also do the same thing like this way

In [34]:
drinks.ix[ 1 , 'beer_servings']

 

Out[34]:
89

lets select with slice range

In [35]:
drinks.ix['Albania':'Andorra', 0:2]

 

Out[35]:
beer_servings spirit_servings
country
Albania 89 132
Algeria 25 0
Andorra 245 138

important thing about ix. if you pass it slice range numbers and you have integer indexes or integer columns it will treat the passed
slice range as integers positions and returns all data inclusive of the both numbers in the slice. However if your columns or index has
labels instead of numbers and you pass a slice range number, ix will treat it as labels and we will return data not including the last number
in the slice

In [36]:
#example
ufo.ix[0:2, 0:2]

 

Out[36]:
City Colors Reported
0 Ithaca NaN
1 Willingboro NaN
2 Holyoke NaN

we have labels in our columns so ix treated the slice range as labels and returned index positions 0,1
but we had numbers in our index and ix treated them as integer positions and returns all data from 0,1 and 2

In [ ]:

 

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *