Visualizing Geospatial Data

In this second chapter, you will be introduced to the basics of visualizing geospatial data.

Loading Geospatial Data

To try out some of the basic methods for visualizing geospatial data, let’s load two shapefiles, one containing Germany’s geography and the other containing Berlin’s geography:

# import pandas, matplotlib, and seaborn
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# import geopandas
import geopandas as gpd
# load first DataFrame: Germany with its federal states
germany = gpd.read_file('Data/Cleansed_Data/Germany.shp')
germany.head()

# slice Berlin
berlin_in_germany = germany.iloc[2]
berlin_in_germany.geometry
# load second DataFrame: Berlin with its districts
berlin_districs = gpd.read_file('Data/Cleansed_Data/Berlin_Districts.shp')
berlin_districs
# check crs
berlin_districs.geometry.crs
Out[4]: {'init': 'epsg:4326'}
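# reproject to a CRS that uses metres; to_crs() returns a new GeoDataFrame,
# so store the result (the original stays in longitude/latitude for later plots)
berlin_districs_m = berlin_districs.to_crs(epsg=3857)
berlin_districs_m.head(2)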
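
To make the switch from degrees to metres less abstract, here is a minimal sketch of the spherical Web Mercator formulas behind EPSG:3857. The function names are illustrative and not part of GeoPandas; in practice to_crs() hands this work to its projection backend. The sample point uses the Brandenburg Gate coordinates that appear later in this tutorial.

```python
import math

# Sphere radius used by EPSG:3857 (the WGS84 semi-major axis, in metres)
EARTH_RADIUS_M = 6378137.0


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project a longitude/latitude pair (degrees) to Web Mercator metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


def web_mercator_to_lonlat(x, y):
    """Inverse projection: Web Mercator metres back to degrees."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2)
    return lon, lat


# Project a point in central Berlin and convert it back again
x, y = lonlat_to_web_mercator(13.377704, 52.516275)
lon, lat = web_mercator_to_lonlat(x, y)
```

The projected values are in the millions because they are metres measured from the equator and the prime meridian, which is why coordinates look so different after the CRS change.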

Let’s explore different methods of plotting this GeoDataFrame:

Plot with Uniform Color

berlin_districs.plot(color='lightsteelblue', figsize=(10,10))
sns.despine(top=True, right=True, left=True, bottom=True);

Plot with Adjusted Colormap

We can also adjust the color based on specific attributes’ values by using the column keyword:

berlin_districs.plot(column='POPULATION', cmap='Blues', figsize=(10,10))
sns.despine(top=True, right=True, left=True, bottom=True);

You can also style plot legends by passing a dictionary of keywords to the legend_kwds argument of the .plot() method. Here you can see an example:

leg_kwds={'title':'District Name', 
          'loc': 'upper left', 
          'bbox_to_anchor':(1, 1.03), 
          'ncol':2}

berlin_districs.plot(column='DISTRICT', cmap='Dark2', figsize=(10,10),
               legend=True, legend_kwds=leg_kwds)

plt.axis('off');

Colormaps

Choosing the right colormap requires a bit of thought. Here are some ideas:

  • When you map regions without a quantitative relationship (for example, district names), a qualitative colormap is the correct choice.
  • Diverging colormaps are best when the values have a meaningful midpoint and you want to highlight deviations from it in either direction.
  • If you map regions with a quantitative relationship and do not want to focus on one particular range within your data, a sequential colormap is best.
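
As a rough illustration of the three families, the sketch below looks up one Matplotlib colormap of each kind and, for the diverging case, centers it on a reference value with TwoSlopeNorm. The sample values and the choice of 0 as the midpoint are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.colors import TwoSlopeNorm

qualitative = plt.get_cmap("Dark2")  # unordered categories, e.g. district names
sequential = plt.get_cmap("Blues")   # ordered values running from low to high
diverging = plt.get_cmap("RdBu")     # values above and below a meaningful midpoint

# Hypothetical data: population change in percent, where 0 is the natural midpoint
changes = [-4.0, -1.0, 0.0, 2.5, 10.0]
norm = TwoSlopeNorm(vmin=min(changes), vcenter=0.0, vmax=max(changes))

# Map every value to an RGBA color; 0 lands on the center of the colormap
colors = diverging(norm(changes))
```

Any of these colormap names can be passed as the cmap argument of .plot(), as in the examples above.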

Multi-Layer Plots

We also might want to combine multiple layers of geometries into a single plot. For example, we could combine

  • a scatterplot of the Airbnb data containing the latitude and longitude of each Airbnb apartment
  • with the polygon geometries of the berlin_districs GeoDataFrame

to see where in Berlin the apartments are located.

To add an additional layer to an existing plot, we can use the ax keyword of the .plot() method:

# read the airbnb csv file
airbnb = pd.read_csv('Data/airbnb_listings_berlin.csv')

# prepare plot
fig, ax = plt.subplots(figsize=(12, 6))

# plot the map of Berlin
berlin_districs.plot(ax=ax, cmap='Set3')

# plot the airbnb data
ax.plot(airbnb.longitude, airbnb.latitude, 'o', markersize=0.3, color='navy')

# remove the axis
ax.set_axis_off();
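
Layers only line up when they share a coordinate reference system. The sketch below shows one way to turn longitude/latitude columns into point geometries and reproject them to the CRS of a polygon layer. It assumes a recent GeoPandas version, and the two listings and the single rectangular "district" are invented stand-ins for the Airbnb and district data.

```python
import pandas as pd
import geopandas as gpd
from shapely.geometry import Polygon

# Hypothetical stand-in for the Airbnb table (two made-up listings)
listings = pd.DataFrame({
    "longitude": [13.40, 13.45],
    "latitude": [52.52, 52.50],
})

# Build point geometries from the coordinate columns (x = longitude, y = latitude)
listings_gdf = gpd.GeoDataFrame(
    listings,
    geometry=gpd.points_from_xy(listings.longitude, listings.latitude),
    crs="EPSG:4326",
)

# Hypothetical stand-in for a district polygon, also in EPSG:4326
district = gpd.GeoDataFrame(
    {"DISTRICT": ["Example"]},
    geometry=[Polygon([(13.3, 52.4), (13.5, 52.4), (13.5, 52.6), (13.3, 52.6)])],
    crs="EPSG:4326",
)

# Reproject the points to whatever CRS the polygon layer uses before overlaying
listings_aligned = listings_gdf.to_crs(district.crs)

# Check which points fall inside the polygon
inside = listings_aligned.within(district.geometry.iloc[0])
```

With both layers in the same CRS, district.plot(ax=ax) followed by listings_aligned.plot(ax=ax, markersize=1) would draw them on one set of axes.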

Visualizing a Spatial Variation

Earlier, we colored the districts by the raw POPULATION column. But we often want to show the spatial variation of a derived variable and color the polygons accordingly. Now we will visualize the spatial variation of the population density in Berlin. For this purpose, we will first calculate the population density by dividing the population by the area, and add the result to the GeoDataFrame as a new column.

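(Note: The area is given in square meters, so you will need to multiply the result by 10 ** 6 to get inhabitants per square kilometer. The area must be computed in a CRS that uses meters, such as EPSG:3857, not in degrees.)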

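# add a population density column (inhabitants per square kilometre);
# the area is computed on a copy reprojected to metres (EPSG:3857)
area_m2 = berlin_districs.to_crs(epsg=3857).geometry.area
berlin_districs['POPULATION_DENSITY'] = berlin_districs['POPULATION'] / area_m2 * 10**6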

# make a plot of the districts colored by the population density
fig, ax = plt.subplots(figsize=(12, 6))
berlin_districs.plot(ax=ax, column='POPULATION_DENSITY', legend=True, cmap='Reds')
ax.set_title('Population Density in Berlin Districts')
ax.set_axis_off();
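
The unit conversion in the density calculation can be checked with plain arithmetic. The sketch below uses a made-up district (the population and area are illustrative, not taken from the data) and also estimates how much Web Mercator inflates areas at Berlin's latitude, which is worth keeping in mind when reading absolute density values.

```python
import math


def density_per_km2(population, area_m2):
    """Inhabitants per square kilometre, given an area in square metres."""
    return population / area_m2 * 10**6


# Hypothetical district: 380,000 inhabitants on 39.5 square kilometres
density = density_per_km2(380_000, 39.5 * 10**6)

# Web Mercator (EPSG:3857) stretches lengths by roughly 1 / cos(latitude),
# so areas are inflated by about 1 / cos(latitude) ** 2
berlin_latitude = 52.5
area_inflation = 1 / math.cos(math.radians(berlin_latitude)) ** 2
```

Because every district sits at almost the same latitude, the inflation affects all of them nearly equally, so the relative pattern on the map is preserved. If absolute densities matter, an equal-area CRS gives more faithful areas; EPSG:3035 is one commonly used option for Europe (a suggestion here, not part of the original workflow).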

Street Maps with Folium

Wouldn’t it be more useful to add a street map for context? To do so, we’ll turn to the Folium package, which is a Python library for creating and styling interactive maps.

import folium

Folium Plots without GeoPandas

To create a map, you pass your starting coordinates as location to the folium.Map constructor. It’s important to know that Folium wants a list with latitude first!

You can set an initial zoom level when you construct a map with the zoom_start argument. The higher the number, the closer your map will zoom into your starting coordinate pair.

Then use display (available in Jupyter notebooks) to show the map:

# construct a map centered at the Brandenburg Gate in Berlin
brandenburg_gate = folium.Map(location=[52.516275,13.377704], zoom_start=16)

# display the map
display(brandenburg_gate)

If you look closely, you can actually see a tiny representation of the Brandenburg Gate in the center of the map.

Folium Plots with GeoPandas

Now we want to build the map location from a Point geometry stored in a GeoSeries of a GeoDataFrame, and then draw a district polygon on top of the map.

# access the Berlin Mitte district in the center of Berlin
berlin_mitte = berlin_districs.iloc[9]
berlin_mitte
# create center column from the centroid method for all federal states in Germany ...
germany['center'] = germany.geometry.centroid
germany.head()
# ... and slice the centroid of Berlin and save the variable as point
point = germany.center[2]

# check its type
type(point)
Out[19]: shapely.geometry.point.Point

Note that we need to reverse the order of the coordinate pair for Folium, so that latitude comes first! By moving the y value of point to the first position and the x value to the second, we create a Folium location called folium_location:

# reverse the order for folium location array
folium_location = [point.y, point.x]

# check the reversed order
print(point)
print(folium_location)
POINT (13.40878247769892 52.4993362056014)
[52.499336205601395, 13.408782477698923]

We now add a polygon to the folium.Map object, using folium_location (the centroid of Berlin) as the location:

# construct a folium map for Berlin from folium_location: using the centroid of Berlin
berlin_map = folium.Map(location=folium_location, zoom_start=12)

# draw our Mitte district
folium.GeoJson(berlin_mitte.geometry).add_to(berlin_map)

# display the folium map
display(berlin_map)

. . . . . . . . . . . . . .

I hope this little tutorial helps you work with geospatial data! If you think I’ve missed something or have made an error somewhere, please let me know: bb@brittabettendorf.com