import pandas as pd
import geopandas as gp
import mapclassify
import folium
Open Ottawa
Data Provider
Open Ottawa is an open-access data portal that publishes datasets from municipal data sources. Users can access datasets from a variety of municipal departments including Parks, Public Health, Water Treatment, and many others. All data from this portal is subject to Open Data Licence Version 2.0. Information released under this licence may be copied, modified, published, adapted, or otherwise used for any lawful purpose, as long as an attribution to the information provider is included. When using information from multiple providers, the following attribution must be used: Contains information licensed under the Open Government Licence – City of Ottawa.
Plotting Geographical Data Points on Open Street Maps
library(leaflet)
library(sf)
library(tidyverse)
library(OpenStreetMap)
A word about retrieving Open Ottawa Data.
Open Ottawa datasets are available to download as CSVs. However, the download URL is generated on demand and expires within a few days, sometimes within hours. For this reason, when revisiting a dataset, you may have to navigate to the dataset’s “About” page and regenerate the download link.
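Because the link is short-lived, one option is to keep a local copy and download only when that copy is missing. The following is a minimal sketch, not part of the original workflow: `fetch_if_missing` is a hypothetical helper, and the URL in the usage comment is a placeholder for whatever link the “About” page generates.

```python
from pathlib import Path
from urllib.request import urlretrieve


def fetch_if_missing(url: str, dest: str) -> Path:
    """Download `url` to `dest` only when no local copy exists yet."""
    path = Path(dest)
    if not path.exists():
        # The portal's link expires, so this step only succeeds while the
        # freshly generated URL is still valid.
        urlretrieve(url, path)
    return path


# Hypothetical usage: paste the regenerated link in place of the placeholder.
# csv_path = fetch_if_missing("<regenerated download URL>", "311opendata_currentyear")
```

Once the file is on disk, later runs skip the download entirely, so an expired link only matters when you want fresh data.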
Service Requests
#https://open.ottawa.ca/datasets/8a5030af268a4a3485b72356dd7dfa85/about
service = pd.read_csv("311opendata_currentyear")
<string>:4: DtypeWarning: Columns (0,9) have mixed types. Specify dtype option on import or set low_memory=False.
service.head()
Service Request ID | Numéro de demande ... Channel | Voie de service
0 202457000007 ... Phone
1 202457000012 ... Walk-In
2 202457000119 ... Voice In
3 202457000145 ... Walk-In
4 202457000177 ... Voice In
[5 rows x 11 columns]
service.columns = service.columns.str.split("|").str[0].str.rstrip()
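The portal's headers are bilingual, in the form "English | Français", and the line above keeps only the English half. A small self-contained sketch of the same idea follows; the `demo` frame is made up for illustration and reuses three header names from the dataset.

```python
import pandas as pd

# Illustrative empty frame that mimics the bilingual "English | Français" headers.
demo = pd.DataFrame(columns=["Status | État", "Ward | Quartier", "Latitude | Latitude"])

# Keep the text before the first "|" and trim the trailing space.
demo.columns = demo.columns.str.split("|").str[0].str.rstrip()

print(list(demo.columns))  # ['Status', 'Ward', 'Latitude']
```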
#https://open.ottawa.ca/datasets/8a5030af268a4a3485b72356dd7dfa85/about
csv_path <- "311opendata_currentyear"
service <- read_csv(csv_path)
Rows: 72419 Columns: 11
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (10): Service Request ID | Numéro de demande, Status | État, Type | Typ...
date (1): Opened Date | Date d'ouverture
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(service)
# A tibble: 6 × 11
Service Request ID | Nu…¹ `Status | État` `Type | Type` Description | Descri…²
<chr> <chr> <chr> <chr>
1 202457000007 Resolved Water and th… Water Hydrants - Hydr…
2 202457000012 Resolved Parking Cont… No Parking | Stationn…
3 202457000119 Resolved Parking Cont… Overtime Parking | Co…
4 202457000145 Resolved Parking Cont… Overtime Parking | Co…
5 202457000177 Active Roads and Tr… Infrastructure - Asse…
6 202457000224 Active Roads and Tr… Road Maintenance - Ca…
# ℹ abbreviated names: ¹`Service Request ID | Numéro de demande`,
# ²`Description | Description`
# ℹ 7 more variables: `Opened Date | Date d'ouverture` <date>,
# `Closed Date | Date de fermeture` <chr>, `Address | Adresse` <chr>,
# `Latitude | Latitude` <chr>, `Longitude | Longitude` <chr>,
# `Ward | Quartier` <chr>, `Channel | Voie de service` <chr>
colnames(service) <- sub(" \\|.*", "", colnames(service))
service <- service |> rename_all(make.names)
Some service requests do not include location data. Let’s filter those out.
service_locations = service[~service["Latitude"].str.contains("N")]
service_locations.tail()
Service Request ID Status ... Ward Channel
68707 202400455741 Open ... 23 WEB
68708 202400455956 Open ... 14 WEB
68709 202400456910 Closed ... 17 Voice In
68710 202400457011 Closed ... 2 WEB
68711 202400457015 Open ... 19 WEB
[5 rows x 11 columns]
parkservice <- service |> filter(!str_detect(Latitude, 'N')) |>
  filter(str_detect(Description, "Park")) |>
  st_as_sf(coords = c("Longitude", "Latitude")) |>
  st_coordinates()
service_locations <- service |> filter(!str_detect(Latitude, 'N'))
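The filters above rely on missing coordinates containing the letter N. An alternative, shown here only as a sketch and not the approach used in this post, is to coerce the coordinate columns to numbers and drop any row that fails to parse. The sample rows are invented for illustration.

```python
import pandas as pd

# Invented sample rows; coordinates are stored as text, as in the real file.
sample = pd.DataFrame({
    "Service Request ID": ["A1", "A2", "A3"],
    "Latitude": ["45.4215", "NA", "45.3876"],
    "Longitude": ["-75.6972", "NA", "-75.6960"],
})

# Non-numeric text becomes NaN instead of raising an error.
coords = sample[["Latitude", "Longitude"]].apply(pd.to_numeric, errors="coerce")

# Keep only rows where both coordinates parsed as numbers.
located = sample[coords.notna().all(axis=1)]
print(len(located))  # 2
```

This catches any non-numeric placeholder, not only those that happen to contain an N.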
After initially plotting all points, I noticed a service request apparently placed from the ocean off the coast of Africa, clearly a data entry error. Ottawa sits at roughly 45°N, so it’s worth keeping only latitudes that contain 45.
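A string match on 45 is a loose test: it also passes a latitude such as 12.45, and it never checks longitude. As a stricter alternative to the filter used below, the following sketch compares numeric coordinates against a bounding box. The box limits are rough assumptions for the Ottawa area, not official city bounds, and the sample rows are invented.

```python
import pandas as pd

# Rough, assumed bounding box around Ottawa (degrees); adjust as needed.
LAT_MIN, LAT_MAX = 44.9, 45.6
LON_MIN, LON_MAX = -76.4, -75.2

# Invented rows: a plausible Ottawa point, a point at (0, 0), and a point
# whose latitude contains "45" only after the decimal point.
points = pd.DataFrame({
    "Latitude": ["45.4215", "0.0", "12.45"],
    "Longitude": ["-75.6972", "0.0", "-75.7"],
})

lat = pd.to_numeric(points["Latitude"], errors="coerce")
lon = pd.to_numeric(points["Longitude"], errors="coerce")

# True only where both coordinates fall inside the box.
in_box = lat.between(LAT_MIN, LAT_MAX) & lon.between(LON_MIN, LON_MAX)
print(points[in_box])
```

Only the first row survives; the third row would have passed a plain text match on 45.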
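# EPSG:4326 is WGS 84 latitude/longitude, the usual CRS for raw lat/lon values
service_geo = gp.GeoDataFrame(
    service_locations,
    geometry=gp.points_from_xy(service_locations["Longitude"],
                               service_locations["Latitude"]),
    crs="EPSG:4326")
# Keep rows whose latitude text contains "45", then open an interactive map
service_geo[service_geo["Latitude"].str.contains("45")].explore()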