Constructing a Geospatial GeoDataFrame from a DataFrame

In this – rather short – chapter you will learn how to create a GeoDataFrame from a DataFrame.

Requirements

You can construct a GeoDataFrame from a DataFrame as long as you have the required pieces in place:

  • a geometry column and
  • the Coordinate Reference System (CRS).

To create a geometry column, first build a representation of the geometry (for a point, a pair of coordinates) and then pass it to the matching constructor from the geometry module of the Shapely package, such as Point. Shapely is a Python package that provides methods for creating and working with points, lines and polygons.

# import pandas and matplotlib
import pandas as pd
import matplotlib.pyplot as plt

# import geospatial libraries
import geopandas as gpd
from shapely.geometry import Point
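As a minimal sketch of the Shapely constructors mentioned above, the snippet below builds one geometry of each kind. The coordinates are made up for illustration and are not taken from the restaurant dataset.

```python
from shapely.geometry import Point, LineString, Polygon

# Illustrative coordinates only (not from the restaurant dataset)
point = Point(1.0, 2.0)                              # a single x/y location
line = LineString([(0, 0), (3, 4)])                  # a line with two vertices
polygon = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])  # a unit square

print(point.x, point.y)   # coordinates of the point
print(line.length)        # planar length of the line
print(polygon.area)       # planar area of the polygon
```

In this chapter only the Point constructor is needed, since every restaurant is a single location.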

Let’s use a dataset with restaurants in Berlin:

restaurants = pd.read_csv('Data/Cleansed_Data/Berlin_Restaurants')
restaurants.head(2)

Creating a Geometry

Next, let’s create a Point Geometry Series.

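Option 1 applies a lambda function row by row: it combines longitude and latitude into a tuple and then constructs a Point geometry from that tuple. Option 2 builds the same points with zip and a list comprehension. It returns a plain Python list rather than a Series, which the GeoDataFrame constructor accepts just as well: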

## create a point geometry Series

# option 1
geometry = restaurants.apply(lambda x: Point((x.lng, x.lat)), axis=1)

# option 2
geometry = [Point(xy) for xy in zip(restaurants['lng'], restaurants['lat'])]
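GeoPandas also provides a helper, gpd.points_from_xy, that builds the same points without an explicit loop. The sketch below uses a small stand-in DataFrame with made-up rows. Only the column names lng and lat are taken from the dataset above.

```python
import pandas as pd
import geopandas as gpd

# Stand-in for the restaurants DataFrame; the rows are made up
sample = pd.DataFrame({
    "name": ["Cafe A", "Cafe B"],
    "lng": [13.40, 13.45],
    "lat": [52.52, 52.50],
})

# option 3: x (longitude) first, then y (latitude)
geometry = gpd.points_from_xy(sample["lng"], sample["lat"])

print(len(geometry))                  # one point per row
print(geometry[0].x, geometry[0].y)   # coordinates of the first point
```

The result is an array of Shapely points that can be passed as the geometry argument in the same way as the list or Series from options 1 and 2.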

Now that we have our point geometries, we can turn the DataFrame into a GeoDataFrame.

Creating a GeoDataFrame

To construct a GeoDataFrame, we use the GeoDataFrame constructor, passing to it

  • the restaurants DataFrame,
  • the crs to use and
  • the geometry to use.

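Here we create an object called crs and set it to EPSG:4326 (WGS 84, i.e. longitude and latitude in decimal degrees). We specify the geometry series we just created as the new GeoDataFrame geometry column. The older {'init': 'epsg:4326'} dictionary form is deprecated in recent versions of GeoPandas and pyproj, so we pass the authority string instead:

crs = 'EPSG:4326'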
restaurants_geodf = gpd.GeoDataFrame(restaurants, crs=crs, geometry=geometry)
restaurants_geodf[10:16]

# checking the dataframe's type
type(restaurants_geodf)
Out[6]: geopandas.geodataframe.GeoDataFrame

Comparing both dataframes, we see that they are almost identical. The only differences are that the datatype has changed from a DataFrame to a GeoDataFrame, and the geometry column has been added.
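To make that comparison concrete, here is a small self-contained sketch. The DataFrame df and its rows are made up and stand in for the restaurants data. The check simply lists the columns that exist only in the GeoDataFrame.

```python
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point

# Stand-in DataFrame with made-up rows
df = pd.DataFrame({
    "name": ["Cafe A", "Cafe B"],
    "lng": [13.40, 13.45],
    "lat": [52.52, 52.50],
})
original_columns = list(df.columns)

geometry = [Point(xy) for xy in zip(df["lng"], df["lat"])]
geodf = gpd.GeoDataFrame(df.copy(), crs="EPSG:4326", geometry=geometry)

# Columns that exist only in the GeoDataFrame
new_columns = [c for c in geodf.columns if c not in original_columns]
print(new_columns)
print(isinstance(geodf, pd.DataFrame))   # a GeoDataFrame is also a DataFrame
print(geodf.crs)
```

Since GeoDataFrame is a subclass of the pandas DataFrame, the usual pandas operations keep working on it.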

Converting the CRS

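Notice that the GeoDataFrame’s geometry is expressed in decimal degrees, i.e. as angles of longitude and latitude rather than as distances. As in the first tutorial, to work in meters we can convert the geometry using the .to_crs() method.

Let’s convert the CRS to EPSG:3857 (Web Mercator), whose coordinates are in meters. Keep in mind that Web Mercator stretches distances away from the equator, so lengths measured at Berlin’s latitude come out larger than the true ground distances.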

Note that the original latitude and longitude columns remain in decimal degree units – .to_crs() only changes the geometry column.

# convert geometry from decimal degrees to meters
# (reprojecting the whole GeoDataFrame keeps its CRS metadata in sync)
restaurants_geodf = restaurants_geodf.to_crs(epsg=3857)
restaurants_geodf[10:16]

Comparing this dataframe with the one before, one can see that the latitude and longitude columns are exactly the same; only the geometry data has changed.
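The same behaviour can be reproduced on a tiny example. The sketch below assumes a one-row stand-in GeoDataFrame with made-up coordinates. It reprojects the frame and shows that the lng and lat columns are left alone while the geometry changes.

```python
import pandas as pd
import geopandas as gpd

# Stand-in GeoDataFrame with one made-up row in EPSG:4326
df = pd.DataFrame({"lng": [13.40], "lat": [52.52]})
geodf = gpd.GeoDataFrame(
    df, geometry=gpd.points_from_xy(df["lng"], df["lat"]), crs="EPSG:4326"
)

# Reproject; to_crs returns a new GeoDataFrame and leaves geodf untouched
geodf_m = geodf.to_crs(epsg=3857)

print(geodf_m[["lng", "lat"]])     # still decimal degrees
print(geodf_m.geometry.iloc[0])    # now Web Mercator coordinates in meters
print(geodf_m.crs)
```

Because to_crs returns a new object, the result has to be assigned to a variable (or back to the original name) for the conversion to take effect.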

Accessing the Geometry

Let’s extract individual values from the geometry column using the .loc indexer of a dataframe:

kuchen_kaiser = restaurants_geodf.loc[10, 'geometry']
tee_tea = restaurants_geodf.loc[12, 'geometry']
die_henne = restaurants_geodf.loc[14, 'geometry']

If we print this value, we can see that it’s a Point geometry:

print(kuchen_kaiser)
POINT (1493519.066988837 6891513.247496091)

And when checking the type of this value, we see it’s a Shapely Point object:

type(kuchen_kaiser)
Out[10]: shapely.geometry.point.Point

The geometry column in a GeoDataFrame thus consists of Shapely objects!
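Because each value is a Shapely object, Shapely's attributes and methods can be used on it directly. The sketch below uses two made-up points with projected coordinates in meters instead of the actual restaurant locations.

```python
from shapely.geometry import Point

# Made-up projected coordinates in meters (not real restaurant locations)
place_a = Point(1493500.0, 6891500.0)
place_b = Point(1493800.0, 6891900.0)

print(place_a.x, place_a.y)        # individual coordinates
print(place_a.geom_type)           # name of the geometry type
print(place_a.distance(place_b))   # planar distance in the units of the CRS
```

Note that distance is a plain planar calculation on the coordinates, so its result is only as meaningful as the CRS the points are stored in.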

Creating a Geometry Manually

Geometries can also be created manually. Here we create a Point geometry for the Brandenburg Gate with coordinates 13.377704 (longitude) and 52.516275 (latitude):

# Shapely order: x = longitude, y = latitude
brandenburg_gate = Point(13.377704, 52.516275)
print(brandenburg_gate)
POINT (13.377704 52.516275)

Always keep in mind that the longitude is limited to a range of -180° to 180°, while the latitude is limited to a range of -90° to 90°.
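One way to enforce these ranges is a small wrapper around the Point constructor. The helper lonlat_point below is hypothetical (it is not part of Shapely or GeoPandas) and is only a sketch of the idea.

```python
from shapely.geometry import Point

def lonlat_point(lon, lat):
    """Hypothetical helper: build a Point after checking coordinate ranges."""
    if not -180 <= lon <= 180:
        raise ValueError(f"longitude out of range: {lon}")
    if not -90 <= lat <= 90:
        raise ValueError(f"latitude out of range: {lat}")
    return Point(lon, lat)   # Shapely expects x (lon) first, then y (lat)

brandenburg_gate = lonlat_point(13.377704, 52.516275)
print(brandenburg_gate)

# A latitude above 90 is rejected
try:
    lonlat_point(13.377704, 152.516275)
except ValueError as err:
    print(err)
```

Such a check only catches values that are impossible. For Berlin, accidentally swapped coordinates (52.516275, 13.377704) still fall inside both ranges, so the argument order has to be verified separately.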

. . . . . . . . . . . . . .

I hope this little tutorial helps to work with geospatial data! If you think I’ve missed something or have made an error somewhere, please let me know: bb@brittabettendorf.com