Pandas iloc, loc, and ix functions are very powerful ways to quickly select data from your dataframe. Today , we take a quick look at these 3 functions.
Credits to Data School, you can check him out in Youtube
Out[3]:
|
City |
Colors Reported |
Shape Reported |
State |
Time |
0 |
Ithaca |
NaN |
TRIANGLE |
NY |
6/1/1930 22:00 |
1 |
Willingboro |
NaN |
OTHER |
NJ |
6/30/1930 20:00 |
2 |
Holyoke |
NaN |
OVAL |
CO |
2/15/1931 14:00 |
3 |
Abilene |
NaN |
DISK |
KS |
6/1/1931 13:00 |
4 |
New York Worlds Fair |
NaN |
LIGHT |
NY |
4/18/1933 19:00 |
Out[4]:
City Ithaca
Colors Reported NaN
Shape Reported TRIANGLE
State NY
Time 6/1/1930 22:00
Name: 0, dtype: object
that returns a pandas series object
Out[5]:
|
City |
Colors Reported |
Shape Reported |
State |
Time |
0 |
Ithaca |
NaN |
TRIANGLE |
NY |
6/1/1930 22:00 |
1 |
Willingboro |
NaN |
OTHER |
NJ |
6/30/1930 20:00 |
2 |
Holyoke |
NaN |
OVAL |
CO |
2/15/1931 14:00 |
Out[9]:
|
City |
Colors Reported |
Shape Reported |
State |
Time |
0 |
Ithaca |
NaN |
TRIANGLE |
NY |
6/1/1930 22:00 |
1 |
Willingboro |
NaN |
OTHER |
NJ |
6/30/1930 20:00 |
2 |
Holyoke |
NaN |
OVAL |
CO |
2/15/1931 14:00 |
Out[11]:
|
City |
Colors Reported |
Shape Reported |
State |
Time |
0 |
Ithaca |
NaN |
TRIANGLE |
NY |
6/1/1930 22:00 |
1 |
Willingboro |
NaN |
OTHER |
NJ |
6/30/1930 20:00 |
2 |
Holyoke |
NaN |
OVAL |
CO |
2/15/1931 14:00 |
Out[14]:
0 Ithaca
1 Willingboro
2 Holyoke
3 Abilene
4 New York Worlds Fair
5 Valley City
6 Crater Lake
7 Alma
8 Eklutna
9 Hubbard
10 Fontana
11 Waterloo
12 Belton
13 Keokuk
14 Ludington
15 Forest Home
16 Los Angeles
17 Hapeville
18 Oneida
19 Bering Sea
20 Nebraska
21 NaN
22 NaN
23 Owensboro
24 Wilderness
25 San Diego
26 Wilderness
27 Clovis
28 Los Alamos
29 Ft. Duschene
...
18211 Holyoke
18212 Carson
18213 Pasadena
18214 Austin
18215 El Campo
18216 Garden Grove
18217 Berthoud Pass
18218 Sisterdale
18219 Garden Grove
18220 Shasta Lake
18221 Franklin
18222 Albrightsville
18223 Greenville
18224 Eufaula
18225 Simi Valley
18226 San Francisco
18227 San Francisco
18228 Kingsville
18229 Chicago
18230 Pismo Beach
18231 Pismo Beach
18232 Lodi
18233 Anchorage
18234 Capitola
18235 Fountain Hills
18236 Grant Park
18237 Spirit Lake
18238 Eagle River
18239 Eagle River
18240 Ybor
Name: City, dtype: object
Lets get multiple columns by passing a list
Out[19]:
|
City |
State |
0 |
Ithaca |
NY |
1 |
Willingboro |
NJ |
2 |
Holyoke |
CO |
3 |
Abilene |
KS |
4 |
New York Worlds Fair |
NY |
5 |
Valley City |
ND |
6 |
Crater Lake |
CA |
7 |
Alma |
MI |
8 |
Eklutna |
AK |
9 |
Hubbard |
OR |
10 |
Fontana |
CA |
11 |
Waterloo |
AL |
12 |
Belton |
SC |
13 |
Keokuk |
IA |
14 |
Ludington |
MI |
15 |
Forest Home |
CA |
16 |
Los Angeles |
CA |
17 |
Hapeville |
GA |
18 |
Oneida |
TN |
19 |
Bering Sea |
AK |
20 |
Nebraska |
NE |
21 |
NaN |
LA |
22 |
NaN |
LA |
23 |
Owensboro |
KY |
24 |
Wilderness |
WV |
25 |
San Diego |
CA |
26 |
Wilderness |
WV |
27 |
Clovis |
NM |
28 |
Los Alamos |
NM |
29 |
Ft. Duschene |
UT |
… |
… |
… |
18211 |
Holyoke |
MA |
18212 |
Carson |
CA |
18213 |
Pasadena |
CA |
18214 |
Austin |
TX |
18215 |
El Campo |
TX |
18216 |
Garden Grove |
CA |
18217 |
Berthoud Pass |
CO |
18218 |
Sisterdale |
TX |
18219 |
Garden Grove |
CA |
18220 |
Shasta Lake |
CA |
18221 |
Franklin |
NH |
18222 |
Albrightsville |
PA |
18223 |
Greenville |
SC |
18224 |
Eufaula |
OK |
18225 |
Simi Valley |
CA |
18226 |
San Francisco |
CA |
18227 |
San Francisco |
CA |
18228 |
Kingsville |
TX |
18229 |
Chicago |
IL |
18230 |
Pismo Beach |
CA |
18231 |
Pismo Beach |
CA |
18232 |
Lodi |
WI |
18233 |
Anchorage |
AK |
18234 |
Capitola |
CA |
18235 |
Fountain Hills |
AZ |
18236 |
Grant Park |
IL |
18237 |
Spirit Lake |
IA |
18238 |
Eagle River |
WI |
18239 |
Eagle River |
WI |
18240 |
Ybor |
FL |
18241 rows × 2 columns
We can also specify a range
Out[16]:
|
City |
Colors Reported |
Shape Reported |
State |
0 |
Ithaca |
NaN |
TRIANGLE |
NY |
1 |
Willingboro |
NaN |
OTHER |
NJ |
2 |
Holyoke |
NaN |
OVAL |
CO |
3 |
Abilene |
NaN |
DISK |
KS |
4 |
New York Worlds Fair |
NaN |
LIGHT |
NY |
5 |
Valley City |
NaN |
DISK |
ND |
6 |
Crater Lake |
NaN |
CIRCLE |
CA |
7 |
Alma |
NaN |
DISK |
MI |
8 |
Eklutna |
NaN |
CIGAR |
AK |
9 |
Hubbard |
NaN |
CYLINDER |
OR |
10 |
Fontana |
NaN |
LIGHT |
CA |
11 |
Waterloo |
NaN |
FIREBALL |
AL |
12 |
Belton |
RED |
SPHERE |
SC |
13 |
Keokuk |
NaN |
OVAL |
IA |
14 |
Ludington |
NaN |
DISK |
MI |
15 |
Forest Home |
NaN |
CIRCLE |
CA |
16 |
Los Angeles |
NaN |
NaN |
CA |
17 |
Hapeville |
NaN |
NaN |
GA |
18 |
Oneida |
NaN |
RECTANGLE |
TN |
19 |
Bering Sea |
RED |
OTHER |
AK |
20 |
Nebraska |
NaN |
DISK |
NE |
21 |
NaN |
NaN |
NaN |
LA |
22 |
NaN |
NaN |
LIGHT |
LA |
23 |
Owensboro |
NaN |
RECTANGLE |
KY |
24 |
Wilderness |
NaN |
DISK |
WV |
25 |
San Diego |
NaN |
CIGAR |
CA |
26 |
Wilderness |
NaN |
DISK |
WV |
27 |
Clovis |
NaN |
DISK |
NM |
28 |
Los Alamos |
NaN |
DISK |
NM |
29 |
Ft. Duschene |
NaN |
DISK |
UT |
… |
… |
… |
… |
… |
18211 |
Holyoke |
NaN |
DIAMOND |
MA |
18212 |
Carson |
NaN |
DISK |
CA |
18213 |
Pasadena |
GREEN |
FIREBALL |
CA |
18214 |
Austin |
NaN |
FORMATION |
TX |
18215 |
El Campo |
NaN |
OTHER |
TX |
18216 |
Garden Grove |
ORANGE |
LIGHT |
CA |
18217 |
Berthoud Pass |
NaN |
TRIANGLE |
CO |
18218 |
Sisterdale |
NaN |
DIAMOND |
TX |
18219 |
Garden Grove |
NaN |
CHEVRON |
CA |
18220 |
Shasta Lake |
BLUE |
DISK |
CA |
18221 |
Franklin |
NaN |
DISK |
NH |
18222 |
Albrightsville |
NaN |
OTHER |
PA |
18223 |
Greenville |
NaN |
NaN |
SC |
18224 |
Eufaula |
NaN |
DISK |
OK |
18225 |
Simi Valley |
NaN |
FORMATION |
CA |
18226 |
San Francisco |
NaN |
FORMATION |
CA |
18227 |
San Francisco |
NaN |
TRIANGLE |
CA |
18228 |
Kingsville |
NaN |
LIGHT |
TX |
18229 |
Chicago |
NaN |
DISK |
IL |
18230 |
Pismo Beach |
NaN |
OVAL |
CA |
18231 |
Pismo Beach |
NaN |
OVAL |
CA |
18232 |
Lodi |
NaN |
NaN |
WI |
18233 |
Anchorage |
RED |
VARIOUS |
AK |
18234 |
Capitola |
NaN |
TRIANGLE |
CA |
18235 |
Fountain Hills |
NaN |
NaN |
AZ |
18236 |
Grant Park |
NaN |
TRIANGLE |
IL |
18237 |
Spirit Lake |
NaN |
DISK |
IA |
18238 |
Eagle River |
NaN |
NaN |
WI |
18239 |
Eagle River |
RED |
LIGHT |
WI |
18240 |
Ybor |
NaN |
OVAL |
FL |
18241 rows × 4 columns
We can also combine the row and column selections
Out[17]:
|
City |
Colors Reported |
Shape Reported |
State |
0 |
Ithaca |
NaN |
TRIANGLE |
NY |
1 |
Willingboro |
NaN |
OTHER |
NJ |
2 |
Holyoke |
NaN |
OVAL |
CO |
We can also simply drop the time column to achieve the same result
Out[18]:
|
City |
Colors Reported |
Shape Reported |
State |
0 |
Ithaca |
NaN |
TRIANGLE |
NY |
1 |
Willingboro |
NaN |
OTHER |
NJ |
2 |
Holyoke |
NaN |
OVAL |
CO |
You can also select data using boolean statemet. example lets select where city = OAKLAND
Out[21]:
|
City |
Colors Reported |
Shape Reported |
State |
Time |
1694 |
Oakland |
NaN |
CIGAR |
CA |
7/21/1968 14:00 |
2144 |
Oakland |
NaN |
DISK |
CA |
8/19/1971 0:00 |
4686 |
Oakland |
NaN |
LIGHT |
MD |
6/1/1982 0:00 |
7293 |
Oakland |
NaN |
LIGHT |
CA |
3/28/1994 17:00 |
8488 |
Oakland |
NaN |
NaN |
CA |
8/10/1995 21:45 |
8768 |
Oakland |
NaN |
NaN |
CA |
10/10/1995 22:40 |
10816 |
Oakland |
NaN |
LIGHT |
OR |
10/1/1997 21:30 |
10948 |
Oakland |
NaN |
DISK |
CA |
11/14/1997 19:55 |
11045 |
Oakland |
NaN |
TRIANGLE |
CA |
12/10/1997 1:30 |
12322 |
Oakland |
NaN |
FIREBALL |
CA |
10/9/1998 19:40 |
12941 |
Oakland |
NaN |
CYLINDER |
CA |
1/23/1999 21:30 |
16803 |
Oakland |
NaN |
TRIANGLE |
MD |
7/4/2000 23:00 |
17322 |
Oakland |
NaN |
CYLINDER |
CA |
9/1/2000 21:35 |
Out[22]:
|
City |
Colors Reported |
Shape Reported |
State |
Time |
1694 |
Oakland |
NaN |
CIGAR |
CA |
7/21/1968 14:00 |
2144 |
Oakland |
NaN |
DISK |
CA |
8/19/1971 0:00 |
4686 |
Oakland |
NaN |
LIGHT |
MD |
6/1/1982 0:00 |
7293 |
Oakland |
NaN |
LIGHT |
CA |
3/28/1994 17:00 |
8488 |
Oakland |
NaN |
NaN |
CA |
8/10/1995 21:45 |
8768 |
Oakland |
NaN |
NaN |
CA |
10/10/1995 22:40 |
10816 |
Oakland |
NaN |
LIGHT |
OR |
10/1/1997 21:30 |
10948 |
Oakland |
NaN |
DISK |
CA |
11/14/1997 19:55 |
11045 |
Oakland |
NaN |
TRIANGLE |
CA |
12/10/1997 1:30 |
12322 |
Oakland |
NaN |
FIREBALL |
CA |
10/9/1998 19:40 |
12941 |
Oakland |
NaN |
CYLINDER |
CA |
1/23/1999 21:30 |
16803 |
Oakland |
NaN |
TRIANGLE |
MD |
7/4/2000 23:00 |
17322 |
Oakland |
NaN |
CYLINDER |
CA |
9/1/2000 21:35 |
We can also select specific rows with the boolean expression
Out[24]:
1694 CA
2144 CA
4686 MD
7293 CA
8488 CA
8768 CA
10816 OR
10948 CA
11045 CA
12322 CA
12941 CA
16803 MD
17322 CA
Name: State, dtype: object
Now lets tate a look at it ‘iloc’
iloc is for selecting rows and columns by integer position. that is what the ‘i’ stands for
Lets select all rows and columns in integer positions 0 and 3
Out[25]:
|
City |
State |
0 |
Ithaca |
NY |
1 |
Willingboro |
NJ |
2 |
Holyoke |
CO |
3 |
Abilene |
KS |
4 |
New York Worlds Fair |
NY |
5 |
Valley City |
ND |
6 |
Crater Lake |
CA |
7 |
Alma |
MI |
8 |
Eklutna |
AK |
9 |
Hubbard |
OR |
10 |
Fontana |
CA |
11 |
Waterloo |
AL |
12 |
Belton |
SC |
13 |
Keokuk |
IA |
14 |
Ludington |
MI |
15 |
Forest Home |
CA |
16 |
Los Angeles |
CA |
17 |
Hapeville |
GA |
18 |
Oneida |
TN |
19 |
Bering Sea |
AK |
20 |
Nebraska |
NE |
21 |
NaN |
LA |
22 |
NaN |
LA |
23 |
Owensboro |
KY |
24 |
Wilderness |
WV |
25 |
San Diego |
CA |
26 |
Wilderness |
WV |
27 |
Clovis |
NM |
28 |
Los Alamos |
NM |
29 |
Ft. Duschene |
UT |
… |
… |
… |
18211 |
Holyoke |
MA |
18212 |
Carson |
CA |
18213 |
Pasadena |
CA |
18214 |
Austin |
TX |
18215 |
El Campo |
TX |
18216 |
Garden Grove |
CA |
18217 |
Berthoud Pass |
CO |
18218 |
Sisterdale |
TX |
18219 |
Garden Grove |
CA |
18220 |
Shasta Lake |
CA |
18221 |
Franklin |
NH |
18222 |
Albrightsville |
PA |
18223 |
Greenville |
SC |
18224 |
Eufaula |
OK |
18225 |
Simi Valley |
CA |
18226 |
San Francisco |
CA |
18227 |
San Francisco |
CA |
18228 |
Kingsville |
TX |
18229 |
Chicago |
IL |
18230 |
Pismo Beach |
CA |
18231 |
Pismo Beach |
CA |
18232 |
Lodi |
WI |
18233 |
Anchorage |
AK |
18234 |
Capitola |
CA |
18235 |
Fountain Hills |
AZ |
18236 |
Grant Park |
IL |
18237 |
Spirit Lake |
IA |
18238 |
Eagle River |
WI |
18239 |
Eagle River |
WI |
18240 |
Ybor |
FL |
18241 rows × 2 columns
We can also do a slice range of numbers. When you do a slice range of numbers, the selected data does not include the last number
in the slice
Out[26]:
|
City |
Colors Reported |
Shape Reported |
State |
0 |
Ithaca |
NaN |
TRIANGLE |
NY |
1 |
Willingboro |
NaN |
OTHER |
NJ |
2 |
Holyoke |
NaN |
OVAL |
CO |
3 |
Abilene |
NaN |
DISK |
KS |
4 |
New York Worlds Fair |
NaN |
LIGHT |
NY |
5 |
Valley City |
NaN |
DISK |
ND |
6 |
Crater Lake |
NaN |
CIRCLE |
CA |
7 |
Alma |
NaN |
DISK |
MI |
8 |
Eklutna |
NaN |
CIGAR |
AK |
9 |
Hubbard |
NaN |
CYLINDER |
OR |
10 |
Fontana |
NaN |
LIGHT |
CA |
11 |
Waterloo |
NaN |
FIREBALL |
AL |
12 |
Belton |
RED |
SPHERE |
SC |
13 |
Keokuk |
NaN |
OVAL |
IA |
14 |
Ludington |
NaN |
DISK |
MI |
15 |
Forest Home |
NaN |
CIRCLE |
CA |
16 |
Los Angeles |
NaN |
NaN |
CA |
17 |
Hapeville |
NaN |
NaN |
GA |
18 |
Oneida |
NaN |
RECTANGLE |
TN |
19 |
Bering Sea |
RED |
OTHER |
AK |
20 |
Nebraska |
NaN |
DISK |
NE |
21 |
NaN |
NaN |
NaN |
LA |
22 |
NaN |
NaN |
LIGHT |
LA |
23 |
Owensboro |
NaN |
RECTANGLE |
KY |
24 |
Wilderness |
NaN |
DISK |
WV |
25 |
San Diego |
NaN |
CIGAR |
CA |
26 |
Wilderness |
NaN |
DISK |
WV |
27 |
Clovis |
NaN |
DISK |
NM |
28 |
Los Alamos |
NaN |
DISK |
NM |
29 |
Ft. Duschene |
NaN |
DISK |
UT |
… |
… |
… |
… |
… |
18211 |
Holyoke |
NaN |
DIAMOND |
MA |
18212 |
Carson |
NaN |
DISK |
CA |
18213 |
Pasadena |
GREEN |
FIREBALL |
CA |
18214 |
Austin |
NaN |
FORMATION |
TX |
18215 |
El Campo |
NaN |
OTHER |
TX |
18216 |
Garden Grove |
ORANGE |
LIGHT |
CA |
18217 |
Berthoud Pass |
NaN |
TRIANGLE |
CO |
18218 |
Sisterdale |
NaN |
DIAMOND |
TX |
18219 |
Garden Grove |
NaN |
CHEVRON |
CA |
18220 |
Shasta Lake |
BLUE |
DISK |
CA |
18221 |
Franklin |
NaN |
DISK |
NH |
18222 |
Albrightsville |
NaN |
OTHER |
PA |
18223 |
Greenville |
NaN |
NaN |
SC |
18224 |
Eufaula |
NaN |
DISK |
OK |
18225 |
Simi Valley |
NaN |
FORMATION |
CA |
18226 |
San Francisco |
NaN |
FORMATION |
CA |
18227 |
San Francisco |
NaN |
TRIANGLE |
CA |
18228 |
Kingsville |
NaN |
LIGHT |
TX |
18229 |
Chicago |
NaN |
DISK |
IL |
18230 |
Pismo Beach |
NaN |
OVAL |
CA |
18231 |
Pismo Beach |
NaN |
OVAL |
CA |
18232 |
Lodi |
NaN |
NaN |
WI |
18233 |
Anchorage |
RED |
VARIOUS |
AK |
18234 |
Capitola |
NaN |
TRIANGLE |
CA |
18235 |
Fountain Hills |
NaN |
NaN |
AZ |
18236 |
Grant Park |
NaN |
TRIANGLE |
IL |
18237 |
Spirit Lake |
NaN |
DISK |
IA |
18238 |
Eagle River |
NaN |
NaN |
WI |
18239 |
Eagle River |
RED |
LIGHT |
WI |
18240 |
Ybor |
NaN |
OVAL |
FL |
18241 rows × 4 columns
the returned data does not include the data at integer number 4, so it only returns 0,1,2,3 integer positions
lets select a set of rows and all columns
Out[28]:
|
City |
Colors Reported |
Shape Reported |
State |
Time |
0 |
Ithaca |
NaN |
TRIANGLE |
NY |
6/1/1930 22:00 |
1 |
Willingboro |
NaN |
OTHER |
NJ |
6/30/1930 20:00 |
2 |
Holyoke |
NaN |
OVAL |
CO |
2/15/1931 14:00 |
Now lets take a look at ix. it allows you to mix labels and integers. Tis kind of blend between loc and iloc
lets grab some new public data
Out[32]:
|
beer_servings |
spirit_servings |
wine_servings |
total_litres_of_pure_alcohol |
continent |
country |
|
|
|
|
|
Afghanistan |
0 |
0 |
0 |
0.0 |
Asia |
Albania |
89 |
132 |
54 |
4.9 |
Europe |
Algeria |
25 |
0 |
14 |
0.7 |
Africa |
Andorra |
245 |
138 |
312 |
12.4 |
Europe |
Angola |
217 |
57 |
45 |
5.9 |
Africa |
Lets select the first figure under beer servings for albania
we can also do the same thing like this way
lets select with slice range
Out[35]:
|
beer_servings |
spirit_servings |
country |
|
|
Albania |
89 |
132 |
Algeria |
25 |
0 |
Andorra |
245 |
138 |
important thing about ix. if you pass it slice range numbers and you have integer indexes or integer columns it will treat the passed
slice range as integers positions and returns all data inclusive of the both numbers in the slice. However if your columns or index has
labels instead of numbers and you pass a slice range number, ix will treat it as labels and we will return data not including the last number
in the slice
Out[36]:
|
City |
Colors Reported |
0 |
Ithaca |
NaN |
1 |
Willingboro |
NaN |
2 |
Holyoke |
NaN |
we have labels in our columns so ix treated the slice range as labels and returned index positions 0,1
but we had numbers in our index and ix treated them as integer positions and returns all data from 0,1 and 2