Here is a brief introduction to drawing maps and working with map coordinates in R.
This page uses maps in rnaturalearth
and uses the sf
package to handle spatial data. The previous standard rgdal
was retired in 2023. Here is an introduction to sf
objects. Here is a longer explanation.
The maps use latitude and longitude coordinates by default (= coordinate reference system WGS84; its EPSG code is 4326). Greenwich is the 0 longitude point (prime meridian). The same coordinate reference system is used by GPS.
library(ggplot2)
library(ggrepel)
library(sf)
library(rnaturalearth)
library(rnaturalearthdata)
library(measurements)
Extract preexisting maps from rnaturalearth
. Three scales are available: 110 (“small”), 50 (“medium”), or 10 (“large”). 110 is the coarsest scale, and 10 is the finest scale.
In ggplot()
, theme_bw()
includes grid lines on the map whereas theme_classic()
leaves them out.
coord_sf()
adds numbered axes for latitude and longitude.
By default, the international date line at -180/180 degrees forms the left and right edges of this plot.
To include country borders, use ne_countries()
instead of ne_coastline()
.
world <- ne_coastline(scale = "medium", returnclass = "sf")
ggplot(world) +
geom_sf(lwd = 0.2, fill = "white") +
coord_sf(expand = FALSE) +
theme_bw()
Make a map of an area defined by latitude and longitude. For example, the Western hemisphere north of the Tropic of Capricorn.
Here I’m showing the use of ne_countries()
to include country borders.
world <- ne_countries(scale = "medium", returnclass = "sf")
ggplot(world) +
geom_sf(lwd = 0.2) +
scale_x_continuous(limits = c(-179, -40)) +
scale_y_continuous(limits = c(23.4, 90)) +
coord_sf(expand = FALSE) +
theme_classic()
africa <- ne_countries(scale = "large", returnclass = "sf", continent = "Africa")
ggplot(africa) +
geom_sf(lwd = 0.2) +
theme_classic()
To see all the continents available, use the following.
z <- ne_countries(scale = 110, returnclass = "sf", type = "countries")
unique(z$continent)
canada <- ne_countries(country = "Canada", scale = "large", returnclass = "sf")
ggplot(canada) +
geom_sf(lwd = 0.2) +
theme_classic()
To see all the countries available, enter the following.
z <- ne_countries(scale = 110, returnclass = "sf", type = "countries")
unique(z$admin)
canada <- ne_states(country = "Canada", returnclass = "sf")
bc <- canada[canada$name == "British Columbia", ]
ggplot(bc) +
geom_sf(lwd = 0.2) +
theme_classic()
usa <- ne_states(country = "United States of America", returnclass = "sf")
ca <- usa[usa$name == "California", ]
ggplot(ca) +
geom_sf(lwd = 0.2) +
theme_bw()
Use st_transform()
to convert map coordinates to R’s equal area projection, which retains the approximate relative size of geographic areas.
Use scale_x_continuous()
and scale_y_continuous()
to control the axis ticks, which might otherwise tend to be sparse.
canadaEA <- st_transform(canada, crs = "+proj=eqearth +wktext")
ggplot(canadaEA) +
geom_sf(lwd = 0.2) +
scale_x_continuous(breaks = seq(-120, 60, by = 20)) +
theme_classic()
By default, the world map is centred at Greenwich (longitude 0 degrees) and the edge of maps occurs at the international date line in the mid-Pacific (-180/180 degrees). Re-centering a map from a Pacific perspective (by added 360 to all negative longitudes) doesn’t work in all cases. Map components (polygons) are drawn from one side to the other, causing horizontal lines to appear. Run the following to see what I mean.
# Not run
world = ne_coastline(scale = "medium", returnclass = "sf")
world_2 = st_shift_longitude(world)
ggplot(world_2) +
geom_sf(lwd = 0.2, fill = "white") +
coord_sf(expand = FALSE) +
theme_classic()
The best solution is to change the coordinate reference system so that the meridian is set at a new location (e.g., to the international date line at 180 degrees).
The following map of the North Pacific splits the map components (polygons) at the new meridian before projecting to a new re-centered coordinate system. Code is based on this page.
world = ne_coastline(scale = "medium", returnclass = "sf")
# Decide new meridian
meridian_2 <- -180
# Split world at new meridian
world_2 <- st_break_antimeridian(world, lon_0 = meridian_2)
# Warning: attribute variables are assumed to be spatially constant throughout all geometries
# Now reproject to new meridian center
world_3 <- st_transform(world_2,
crs = paste("+proj=latlong +datum=WGS84 +lon_0=", meridian_2) )
ggplot(world_3) +
geom_sf(lwd = 0.2, fill = "white") +
# coord_sf(xlim = c(-100, 100), ylim = c(0, 90), expand = FALSE) +
coord_sf(expand = FALSE) +
scale_x_continuous(limits = c(-100, 100)) +
scale_y_continuous(limits = c(0, 90)) +
theme_bw()
As an example, we’ll plot the ignition locations of all recorded forest fires in British Columbia in 1990. The data were downloaded from the Canadian National Fire DataBase.
BCfires1990 <- read.csv("https://www.zoology.ubc.ca/~schluter/R/csv/BCfires1990.csv")
head(BCfires1990)
## SRC_AGENCY FIRE_ID FIRENAME LATITUDE LONGITUDE YEAR MONTH DAY REP_DATE ATTK_DATE OUT_DATE DECADE SIZE_HA CAUSE PROTZONE
## 1 BC 1990-R90003 NA 59.929 -128.200 1990 3 15 1990-03-15 NA NA 1990-1999 0.1 H NA
## 2 BC 1990-R90004 NA 59.911 -128.431 1990 5 2 1990-05-02 NA NA 1990-1999 0.1 H NA
## 3 BC 1990-R90005 NA 59.920 -128.484 1990 6 16 1990-06-16 NA NA 1990-1999 NA H NA
## 4 BC 1990-R90012 NA 59.848 -128.168 1990 7 26 1990-07-26 NA NA 1990-1999 1.0 H NA
## 5 BC 1990-R90014 NA 59.661 -128.718 1990 7 30 1990-07-30 NA NA 1990-1999 0.1 L NA
## 6 BC 1990-R90027 NA 59.321 -129.056 1990 9 18 1990-09-18 NA NA 1990-1999 0.2 H NA
## FIRE_TYPE MORE_INFO CFS_REF_ID CFS_NOTE1 CFS_NOTE2 ACQ_DATE SRC_AGY2 ECOZONE ECOZ_REF ECOZ_NAME ECOZ_NOM
## 1 Fire NA BC-1990-1990-R90003 NA NA 2020-05-05 BC 12 12 Boreal Cordillera CordillCre boreale
## 2 Fire NA BC-1990-1990-R90004 NA NA 2020-05-05 BC 12 12 Boreal Cordillera CordillCre boreale
## 3 Fire NA BC-1990-1990-R90005 NA NA 2020-05-05 BC 12 12 Boreal Cordillera CordillCre boreale
## 4 Fire NA BC-1990-1990-R90012 NA NA 2020-05-05 BC 12 12 Boreal Cordillera CordillCre boreale
## 5 Fire NA BC-1990-1990-R90014 NA NA 2020-05-05 BC 12 12 Boreal Cordillera CordillCre boreale
## 6 Fire NA BC-1990-1990-R90027 NA NA 2020-05-05 BC 12 12 Boreal Cordillera CordillCre boreale
Extract the latitude and longitude of every fire and place in a data frame. Then use geom_point()
to add these points to a map of the province.
# Data frame of x and y coordinates of all 1990 fires
dat <- data.frame(longitude = BCfires1990$LONGITUDE,
latitude = BCfires1990$LATITUDE)
# Obtain the map of BC from rnaturalearth
canada <- ne_states(country = "Canada", returnclass = "sf")
bc <- canada[canada$name == "British Columbia", ]
# Make the plot with fire points
ggplot(bc) +
geom_sf(lwd = 0.2) +
geom_point(data = dat, aes(x = longitude, y = latitude),
size = 0.2, color = "firebrick") +
theme_classic()
You can see that some of the points lie outside the province, indicating errors in the database. For now, delete the points that are out of bounds.
First, convert the data frame of points to an sf
object (4326 is the EPSG code for WGS84, which uses latitude/longitude). Then identify the points that are within bounds (indicated by an sfdat$in_bounds
value of 1).
sfdat <- st_as_sf(dat, coords = c("longitude", "latitude"), crs = 4326)
head(sfdat)
## Simple feature collection with 6 features and 0 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: -129.056 ymin: 59.321 xmax: -128.168 ymax: 59.929
## Geodetic CRS: WGS 84
## geometry
## 1 POINT (-128.2 59.929)
## 2 POINT (-128.431 59.911)
## 3 POINT (-128.484 59.92)
## 4 POINT (-128.168 59.848)
## 5 POINT (-128.718 59.661)
## 6 POINT (-129.056 59.321)
sfdat$in_bounds = lengths(st_within(sfdat, bc))
table(sfdat$in_bounds)
##
## 0 1
## 71 3184
Finally, drop the out-of-bounds points and replot.
BCfires1990_fixed <- BCfires1990[sfdat$in_bounds == 1, ]
dat <- data.frame(longitude = BCfires1990_fixed$LONGITUDE,
latitude = BCfires1990_fixed$LATITUDE)
ggplot(bc) +
geom_sf(lwd = 0.2) +
geom_point(data = dat, aes(x = longitude, y = latitude),
size = 0.2, color = "firebrick") +
theme_classic()
Add site labels and data points to the map.
The following example file has sampling latitude and longitude of locations where marine threespine stickleback were collected.
mydata <- read.csv("https://www.zoology.ubc.ca/~schluter/R/csv/sampleSites.csv")
head(mydata)
## Site ocean latitude longitude N
## 1 Bering Sea Pacific 58.0890 -166.6430 1
## 2 Barrow, Alaska Pacific 71.3350 -156.6060 1
## 3 Seward, Alaksa Pacific 60.1240 -149.3950 1
## 4 Rabbit Slough Pacific 61.5132 -149.3454 43
## 5 Mud Lake Pacific 61.4975 -149.2590 51
## 6 Auke Bay Pacific 58.3841 -134.6461 4
The function geom_text_repel()
reduces overlap between labels but it also drops labels if crowding is too great. In this case you’ll need to zoom in to a finer scale to see them all.
Plot the locations and names of sites on a world map.
world = ne_coastline(scale = "medium", returnclass = "sf")
ggplot(world) +
geom_sf(lwd = 0.2) +
geom_point(data = mydata, aes(x = longitude, y = latitude),
size = 0.4, color = "firebrick") +
geom_text_repel(data = mydata,
aes(x = longitude, y = latitude, label = Site),
size = 1.5, max.overlaps = 20) +
coord_sf(expand = FALSE) +
theme_classic()
The second example zooms in to include only latitudes and longitudes that cover the eastern Pacific Ocean.
world = ne_coastline(scale = "medium", returnclass = "sf")
ggplot(world) +
geom_sf(lwd = 0.2) +
geom_point(data = mydata, aes(x = longitude, y = latitude),
size = 0.4, color = "firebrick") +
geom_text_repel(data = mydata,
aes(x = longitude, y = latitude, label = Site),
size = 1.5, max.overlaps = 50) +
coord_sf(expand = FALSE) +
scale_x_continuous(limits = c(-179, -110)) +
scale_y_continuous(limits = c(30, 80)) +
theme_classic()
Range data can be in the form of point occurrences or as range map shapefiles.
The Global Biodiversity Information Facility provides geographical records of occurrence of species drawn from various sources ranging from museum collections to smartphone identifications. Access to the dataset is enabled using the rgbif
package. to download more than 100K records you’ll need to register (free) at the GBIF site.
I downloaded more than 200K occurrences of the threespine stickleback with the following steps.
First, find the taxonKey
values (also called usageKey
– confusing) for the species. The name_backbone()
command found 5 keys, of which one was associated with the vast majority of occurrences. I used that one only. This did not find all the taxonKey
values because of alternate names and synonyms in the literature. The rgbif vignettes introduces this and other ways to find occurrence keys.
library(rgbif)
# Find taxonKey (= usageKey) associated with the species name
backbone <- name_backbone(name = "Gasterosteus aculeatus", verbose = TRUE)
backbone$usageKey
## 4286327 11306040 4352560 9166461 4286348
# Number of occurrences associated with each taxonKey
sapply(backbone$usageKey, function(x){occ_count(taxonKey = x)})
## 208608 3 5 10 0
The next step is to download the records. Here is the code I used – I saved the results to a file that you can download.
# For now, download only occurrences associated with the first backbone$usageKey,
# since it is associated with the vast majority of records.
# I used the USERNAME, PASSWORD, and EMAIL associated with my GBIF account.
occ_download(pred_in("taxonKey", backbone$usageKey[1] ),
format = "SIMPLE_CSV", user = "USERNAME",
pwd = "PASSWORD", email = "EMAIL")
# Import the downloaded data -- DOWNLOAD KEY is provided by occ_download()
d <- occ_download_get('DOWNLOAD KEY') %>% occ_download_import()
# Save a subset of variables to a .csv file
write.csv(d[, c("species", "verbatimScientificName", "individualCount",
"decimalLatitude", "decimalLongitude", "eventDate")],
"~/Desktop/rgbif_stickleback_occurrences.csv", row.names = FALSE)
The file can be downloaded here.
stickleData <-
read.csv("https://www.zoology.ubc.ca/~schluter/R/csv/rgbif_stickleback_occurrences.csv")
head(stickleData)
## species verbatimScientificName individualCount decimalLatitude decimalLongitude eventDate
## 1 Gasterosteus aculeatus Gasterosteus aculeatus Linnaeus, 1758 NA 48.66301 6.668630 2010-06-15
## 2 Gasterosteus aculeatus Gasterosteus aculeatus NA 53.51938 -3.011979 1992-07-14
## 3 Gasterosteus aculeatus Gasterosteus aculeatus NA 53.37632 -2.918391 2011-05-19
## 4 Gasterosteus aculeatus Gasterosteus aculeatus NA 53.41370 -2.708566 2019-04-29
## 5 Gasterosteus aculeatus Gasterosteus aculeatus NA 53.43168 -2.708865 2004-02-05
## 6 Gasterosteus aculeatus Gasterosteus aculeatus NA 53.39474 -2.858634 2017-03-27
Extract the latitude and longitude of every record and place in a data frame. Then use geom_point()
to plot these points on a map. Each record is here given a single point regardless of how many individuals of the species were observed.
# Data frame of latitude and longitude
dat <- data.frame(longitude = stickleData$decimalLongitude,
latitude = stickleData$decimalLatitude)
# Obtain the map of the world from rnaturalearth
world <- ne_coastline(scale = "medium", returnclass = "sf")
Finally, plot the occurrences on the world map. Some of the occurrences are in unlikely locations (e.g., Africa). This site gives some tips on how to clean species occurrence data.
ggplot(world) +
geom_sf(lwd = 0.2) +
geom_point(data = dat, aes(x = longitude, y = latitude),
size = 0.2, color = "firebrick") +
coord_sf(expand = FALSE) +
theme_classic()
Range maps for given species are often available online as a database containing a shapefile. For example, I downloaded the range map for the Patagonian rockfish from IUCN\(^1\) and overlay it on a map of South America.
First, read the shapefile into R. A shapefile is a folder of files (with extensions like .shp
, .shx
, .dbf
) that together define geographic features. To read the shapefile you can give st_read()
the name of the whole folder. Or, you can give it the name of the .shp
file as long as the associated files are in the same folder.
rockfish_shapefile <- st_read("csv/IUCN_Patagonian_Rockfish")
# or
rockfish_shapefile <- st_read("csv/IUCN_Patagonian_Rockfish/data_0.shp")
Next, overlay the shapefile on your map.
samerica <- ne_countries(scale = "large", returnclass = "sf", continent = "South America")
ggplot(samerica) +
geom_sf(data = rockfish_shapefile, fill = "goldenrod1", color = "goldenrod1") +
geom_sf(lwd = 0.2) +
theme_classic()
Different programs and agencies might use different coordinate systems for locating points on a map, but you can convert between them. The default coordinate reference system of rnaturalearth
is WGS84 (EPSG code 4326) in decimal degrees.
Units of latitude and longitude might be given in degrees, minutes, and seconds, or in degrees decimal minutes instead of decimal degrees. I use the measurements
package to convert to decimal degrees.
For example, a spot along the shoreline of Paxton Lake, BC is located at latitude 49\(\circ\)42’23.298”N and longitude 124\(\huge\circ\)31’19.236”W in degrees, minutes and seconds. Convert to decimal degrees as follows (degrees S or W are entered as negative numbers).
conv_unit(c("49 42 23.298", "-124 31 19.236"), from = "deg_min_sec", to = "dec_deg")
## [1] "49.7064716666667" "-124.52201"
Similarly, the coordinates 49\(\circ\)42.3883N’, 124\(\circ\)31.3206W in degrees decimal minutes can be converted to decimal degrees as follows.
conv_unit(c("49 42.3883", "-124 31.3206"), from = "deg_dec_min", to ="dec_deg")
## [1] "49.7064716666667" "-124.52201"
The BC government uses UTM (Universal Transverse Mercator) coordinates for the latitude and longitude of lakes (EPSG code is prefix 326 with 2-digit zone number appended). Coordinates are referred to as Eastings and Northings and they come with a specific zone number. Zone 10 is for most of the Pacific Northwest (EPSG code 32610). Parts of Vancouver Island are Zone 09 (EPSG code 32609).
For example, here are the BC government UTM coordinates for the point locations of Paxton and Priest lakes on Texada Island. Convert to WGS84 as follows.
# Put points into a data frame
x <- data.frame(easting = c(390347,387959), northing = c(5507797, 5511150),
row.names = c("Paxton", "Priest"))
x
## easting northing
## Paxton 390347 5507797
## Priest 387959 5511150
# Convert data frame to an sf object with UTM coordinates
x.sf <- st_as_sf(x, coords = c("easting", "northing"))
x.sf <- st_set_crs(x.sf, value = 32610)
# Convert coordinates to WGS84 (EPSG code 4326)
y <- st_transform(x.sf, crs = st_crs(4326))
y
## Simple feature collection with 2 features and 0 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: -124.5551 ymin: 49.71269 xmax: -124.521 ymax: 49.74239
## Geodetic CRS: WGS 84
## geometry
## 1 POINT (-124.521 49.71269)
## 2 POINT (-124.5551 49.74239)
\(^1\)IUCN (International Union for Conservation of Nature) 2023. Sebastes oculatus. The IUCN Red List of Threatened Species Version 2023-1. https://www.iucnredlist.org. Downloaded on 17 December 2023.
© 2009-2024 Dolph Schluter