Data Science,
Séance 6 : Données spatiales

Etienne Côme

21 novembre 2018

Packages historique

  • rgdal: interface entre R and GDAL (Geospatial Data Abstraction Library) and PROJ4 librarie: raster / vector

  • sp: classes d’objets spatiaux pour R. (S4)

  • rgeos: interface entre R et GEOS (Geometry Engine - Open Source) library: area, perimeter, distances, dissolve, buffer, overlap, union, contains…

Toujours utilisé

Simple Features for R

  • sf Website: Simple Features for R Octobre 2016

  • sp, rgeos and rgdal tout dans le même package

  • Plus simple

  • Tidy data: compatible dplyr et autre tidyverse

Simple Features for R

sf objects data structure:

format sf

Lire des données

library(sf)
mtq <- st_read("data/mtq/martinique.shp")
class(mtq)
## [1] "sf"         "data.frame"
str(mtq)
## Classes 'sf' and 'data.frame':   34 obs. of  24 variables:
##  $ INSEE_COM: Factor w/ 34 levels "97201","97202",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ STATUT   : Factor w/ 3 levels "Commune simple",..: 1 1 1 1 1 1 1 1 2 1 ...
##  $ LIBGEO   : Factor w/ 34 levels "Basse-Pointe",..: 9 22 1 11 3 12 4 5 6 13 ...
##  $ P13_POP  : num  1830 3929 3565 3742 4464 ...
##  $ C13_POP  : num  1482 3190 2983 3157 3513 ...
##  $ C13_CS1  : num  9.78 97.43 39.51 80.13 36.14 ...
##  $ C13_CS2  : num  48.9 170.5 98.8 192.3 172.7 ...
##  $ C13_CS3  : num  9.78 109.61 43.46 84.14 349.33 ...
##  $ C13_CS4  : num  103 240 182 341 518 ...
##  $ C13_CS5  : num  274 560 569 533 675 ...
##  $ C13_CS6  : num  289 386 565 260 317 ...
##  $ C13_CS7  : num  430 747 941 990 735 ...
##  $ C13_CS8  : num  318 881 545 677 711 ...
##  $ P08_POP  : num  1691 3826 3804 3760 4515 ...
##  $ C08_POP  : num  1347 3068 3054 3039 3454 ...
##  $ C08_CS1  : num  31.4 49 44.5 88.9 32.8 ...
##  $ C08_CS2  : num  43.2 144 106.8 129.4 155.7 ...
##  $ C08_CS3  : num  11.8 65.3 27.7 153.6 262.2 ...
##  $ C08_CS4  : num  145 216 186 408 615 ...
##  $ C08_CS5  : num  224 600 448 509 729 ...
##  $ C08_CS6  : num  251 459 620 271 332 ...
##  $ C08_CS7  : num  381 559 882 832 549 ...
##  $ C08_CS8  : num  259 975 739 646 778 ...
##  $ geometry :sfc_POLYGON of length 34; first list element: List of 1
##   ..$ : num [1:55, 1:2] 699261 699226 699016 698406 698001 ...
##   ..- attr(*, "class")= chr  "XY" "POLYGON" "sfg"
##  - attr(*, "sf_column")= chr "geometry"
##  - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
##   ..- attr(*, "names")= chr  "INSEE_COM" "STATUT" "LIBGEO" "P13_POP" ...

Changer la projection ou passer en lat/long

mtq_rp = st_transform(mtq,4326)
mtq_rp
## Simple feature collection with 34 features and 23 fields
## geometry type:  POLYGON
## dimension:      XY
## bbox:           xmin: -61.2291 ymin: 14.39456 xmax: -60.8095 ymax: 14.8781
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
## First 10 features:
##    INSEE_COM               STATUT            LIBGEO P13_POP   C13_POP
## 1      97201       Commune simple L'Ajoupa-Bouillon    1830  1481.801
## 2      97202       Commune simple Les Anses-d'Arlet    3929  3190.115
## 3      97203       Commune simple      Basse-Pointe    3565  2983.215
## 4      97204       Commune simple         Le Carbet    3742  3157.044
## 5      97205       Commune simple       Case-Pilote    4464  3513.386
## 6      97206       Commune simple        Le Diamant    6063  4766.876
## 7      97207       Commune simple             Ducos   17051 14032.139
## 8      97208       Commune simple Fonds-Saint-Denis     813   684.000
## 9      97209 Préfecture de région    Fort-de-France   84174 68712.330
## 10     97210       Commune simple       Le François   18225 14959.798
##       C13_CS1    C13_CS2     C13_CS3   C13_CS4    C13_CS5   C13_CS6
## 1    9.780866   48.90433    9.780866  102.6991   273.8642  288.5355
## 2   97.433459  170.50855  109.612642  239.5239   560.2424  385.6741
## 3   39.510829   98.77707   43.461911  181.7498   568.9559  565.0048
## 4   80.133275  192.31986   84.139939  340.5664   532.8800  260.4331
## 5   36.137682  172.65782  349.330931  517.9734   674.5701  317.2085
## 6   28.374581  251.31772  397.190830  802.5953   835.0234  506.6890
## 7   85.296622  614.27717  772.462370 2016.3061  2716.9554 1548.7246
## 8   32.000000   16.00000   12.000000   68.0000   112.0000  100.0000
## 9   86.572838 2720.48912 4000.386802 8407.4236 13799.2254 7309.1357
## 10 137.873479  817.80956  544.951619 1470.5494  2966.3701 2115.8299
##       C13_CS7    C13_CS8 P08_POP   C08_POP   C08_CS1    C08_CS2    C08_CS3
## 1    430.3581   317.8781    1691  1346.519  31.40569   43.18282   11.77713
## 2    746.5743   880.5453    3826  3067.742  49.00453  143.95079   65.33937
## 3    940.5055   545.2494    3804  3054.108  44.51803  106.84327   27.70011
## 4    989.5457   677.0259    3760  3038.656  88.93759  129.36377  153.61948
## 5    734.7995   710.7078    4515  3453.848  32.77765  155.69385  262.22121
## 6    956.6287   989.0568    5850  4569.721  33.41790  329.82460  338.35626
## 7   2936.4613  3341.6553   16433 13441.442 106.52838  563.35180  629.34551
## 8    224.0000   120.0000     873   708.000  16.00000   16.00000   16.00000
## 9  16184.2574 16204.8388   89000 71566.825 119.06082 2480.10528 3976.89825
## 10  3577.5536  3328.8607   19189 15209.492 132.46653  666.44376  614.97272
##      C08_CS4    C08_CS5   C08_CS6    C08_CS7    C08_CS8
## 1   145.2513   223.7655  251.2455   380.7940   259.0969
## 2   216.3534   600.3054  459.4174   558.8020   974.5694
## 3   186.0292   448.1481  620.2845   881.5853   738.9993
## 4   408.2622   509.2011  270.7288   832.0621   646.4813
## 5   614.5810   729.2055  331.8737   549.0257   778.4692
## 6   676.7125   760.2573  563.9271   743.5483  1123.6770
## 7  1785.4747  2786.1717 1549.6171  2378.2314  3642.7213
## 8    48.0000   104.0000  124.0000   200.0000   184.0000
## 9  8630.6045 15437.6017 7964.5133 15996.5794 16961.4616
## 10 1359.9284  3079.8702 2219.3613  3277.7336  3858.7154
##                          geometry
## 1  POLYGON ((-61.14848 14.8059...
## 2  POLYGON ((-61.0533 14.45579...
## 3  POLYGON ((-61.0846 14.85312...
## 4  POLYGON ((-61.16765 14.6753...
## 5  POLYGON ((-61.11754 14.6275...
## 6  POLYGON ((-61.0533 14.45579...
## 7  POLYGON ((-60.954 14.55508,...
## 8  POLYGON ((-61.11316 14.7019...
## 9  POLYGON ((-61.0392 14.64185...
## 10 POLYGON ((-60.89389 14.6500...

Système de coordonées

Les projections/système de coordonées sont répertoriées grâce à un code le code epsg :

Préservations ? angles, aires, disstance locales, …

Affichage

plot(mtq)

Affichage, seulement la géométrie

plot(st_geometry(mtq))

Centroids

mtq_c <- st_centroid(mtq)
plot(st_geometry(mtq))
plot(st_geometry(mtq_c), add=TRUE, cex=1.2, col="red", pch=20)

Distances

mat <- st_distance(x=mtq_c,y=mtq_c)
mat[1:5,1:5]
## Units: m
##           [,1]     [,2]      [,3]      [,4]      [,5]
## [1,]     0.000 35297.56  3091.501 12131.617 17136.310
## [2,] 35297.557     0.00 38332.602 25518.913 18605.249
## [3,]  3091.501 38332.60     0.000 15094.702 20226.198
## [4,] 12131.617 25518.91 15094.702     0.000  7177.011
## [5,] 17136.310 18605.25 20226.198  7177.011     0.000

Polygons Aggregation, union :

mtq_u <- st_union(mtq)
plot(st_geometry(mtq), col="lightblue")
plot(st_geometry(mtq_u), add=T, lwd=2, border = "red")

Polygons Aggregation,

avec une variable de groupe :

library(dplyr)
mtq_u2 <- mtq %>% 
  group_by(STATUT) %>% 
  summarize(P13_POP=sum(P13_POP))
plot(st_geometry(mtq), col="lightblue")
plot(st_geometry(mtq_u2), add=T, lwd=2, border = "red", col=NA)

Buffer

mtq_b <- st_buffer(x = mtq_u, dist = 5000)
plot(st_geometry(mtq), col="lightblue")
plot(st_geometry(mtq_u), add=T, lwd=2)
plot(st_geometry(mtq_b), add=T, lwd=2, border = "red")

Polygon Intersection

m <- rbind(c(700015,1624212), c(700015,1641586), c(719127,1641586), 
           c(719127,1624212), c(700015,1624212))
p <- st_sf(st_sfc(st_polygon(list(m))), crs = st_crs(mtq))
plot(st_geometry(mtq))
plot(p, border="red", lwd=2, add=T)

Polygon Intersection

mtq_z <- st_intersection(x = mtq, y = p)
plot(st_geometry(mtq))
plot(st_geometry(mtq_z), col="red", border="green", add=T)

Voronoi Polygons

google: “st_voronoi R sf” (https://github.com/r-spatial/sf/issues/474 & https://stackoverflow.com/questions/45719790/create-voronoi-polygon-with-simple-feature-in-r)

mtq_v <- st_voronoi(x = st_union(mtq_c))
mtq_v <- st_intersection(st_cast(mtq_v), st_union(mtq))
mtq_v <- st_join(x = st_sf(mtq_v), y = mtq_c, join=st_intersects)
mtq_v <- st_cast(mtq_v, "MULTIPOLYGON")
plot(st_geometry(mtq_v), col='lightblue')

Ecrire des données

system("rm data/voronoi.geojson")
write_sf(mtq_v,"./data/voronoi.geojson",driver="GeoJSON")

Cartographie

Carte plus complexes

library(sf)
# Import geo layers
# Communes of Seine Maritime
sm <- st_read(dsn = "data/seine_maritime.geojson", 
              stringsAsFactors = F, quiet=TRUE)
# French departements
dep <- st_read(dsn = "data/dep.geojson", 
               stringsAsFactors = F, quiet=TRUE)
# change projection (lambert93)
sm <- st_transform(sm, 2154)
dep <- st_transform(dep, 2154)
# Import dataset  
csp <- read.csv("data/data_seine_maritime.csv")
# merge geolayer and dataset
sm <- merge(sm, csp, by="INSEE_COM", all.x=TRUE)

Carte simple symboles proportionels

library(cartography)
## Loading required package: sp
plot(st_geometry(sm))
propSymbolsLayer(sm, var = "act")
## Warning in st_centroid.sf(x = x, of_largest_polygon =
## max(sf::st_is(sf::st_as_sf(x), : st_centroid assumes attributes are
## constant over geometries of x
title("Active Population")

Carte simple symboles proportionels

Carte symboles proportionels

# Custom map of active population
par(mar=c(0.2,0.2,1.4,0.2))
bb <- st_bbox(sm)
# the bbox is used to center the map on the Seine Maritime depatement
plot(st_geometry(dep), col = "ivory", border="ivory3",  bg="azure", 
     xlim = bb[c(1,3)], ylim =  bb[c(2,4)])
plot(st_geometry(sm), col="cornsilk2", border = NA, lwd = 0.5, add=T)
propSymbolsLayer(sm, var = "act", col="darkblue", inches = 0.6, 
                 border = "white", lwd=0.7, symbols = "square",
                 legend.style = "e", legend.pos="topleft",
                 legend.title.txt = "Labor Force\n(2014)", 
                 legend.values.rnd = 0)
## Warning in st_centroid.sf(x = x, of_largest_polygon =
## max(sf::st_is(sf::st_as_sf(x), : st_centroid assumes attributes are
## constant over geometries of x
# Scale Bar
barscale(size = 10)
# North Arrow
north(pos = "topright", col = "darkblue")
# Full layout
layoutLayer(title = "Workforce in Seine-Maritime", 
            sources = "Insee, 2018", author = "Kim & Tim, 2018", 
            col = "darkblue", coltitle = "white", tabtitle = TRUE, 
            frame = TRUE, scale = NULL, north = FALSE)
 
title("Active Population")

Carte symboles proportionels

Carte qualitative

#To display qualitative data modalities
mod <- c("agr", "art", "cad", "int", "emp", "ouv")
# labels in the legedn
modlab <- c("Agriculteurs", "Artisans","Cadres", 
            "Prof. Inter.", "Employés", "Ouvriers")
# colors
cols <- c("#e3b4a2", "#a2d5d6", "#debbd4", 
          "#b5dab6", "#afc2e3", "#e9e2c1")
par(mar=c(0.2,0.2,1.4,0.2))
plot(st_geometry(dep), col = "ivory", border="ivory3", bg="azure",
     xlim = bb[c(1,3)], ylim =  bb[c(2,4)])
typoLayer(sm, var = "cat", 
          border = "ivory", lwd = 0.5, 
          legend.values.order = mod,
          col = cols,
          add=TRUE, legend.pos = "n")
# functions are dedicated to legend display
legendTypo(title.txt = "Dominant Socio-Professional\nCategory", 
           col = cols, 
           categ = modlab, 
           nodata = F)
barscale(size = 10)
north(pos = "topright", col = "darkblue")
layoutLayer(title = "Workforce Distribution in Seine-Maritime", 
            sources = "Insee, 2018", author = "Kim & Tim, 2018", 
            col = "darkblue", coltitle = "white", tabtitle = TRUE,
            frame = TRUE, scale = NULL, north = FALSE)

Carte qualitative

Choroplethe

# Compute the share of "managers" in the active population
sm$pcad <- 100 * sm$cad / sm$act

# The getBreaks() function is used to classify the variable
bks <- getBreaks(v = sm$pcad, method = "quantile", nclass = 6)

# The carto.pal() function give access to various palettes
cols <- carto.pal("green.pal", 3,"wine.pal",3)

# Create the map
par(mar=c(0.2,0.2,1.4,0.2))
plot(st_geometry(dep), col = "ivory", border="ivory3", bg="azure",
     xlim = bb[c(1,3)], ylim =  bb[c(2,4)])
choroLayer(sm, var = "pcad", breaks = bks, 
           col = cols, border = "grey80", 
           legend.values.rnd = 1,
           lwd = 0.4, legend.pos = "topleft", 
           legend.title.txt = "Share of managers (%)", add=T)
# Add a layout
layoutLayer(title = "Managers", 
            sources = "Insee, 2018", author = "Kim & Tim, 2018", 
            theme = "green.pal", 
            col = "darkred", coltitle = "white", 
            tabtitle = TRUE, 
            frame = TRUE, scale = 10)
north(pos = "topright")

Choroplethe

Carte interactives avec Leaflet

library(leaflet)
sm.center.4326 = sm %>% st_centroid()%>% st_transform(4326)
leaflet(data = sm.center.4326 ) %>% 
  addTiles() %>% 
  addCircleMarkers(radius =~ sqrt(act/100)*1.5,
                   fillColor = "#449944",
                   stroke=FALSE,
                   fillOpacity = 1,
                   popup = ~paste(LIBELLE,":",act,"actifs"))

https://rstudio.github.io/leaflet/

  • addMarkers
  • addCircleMarkers
  • addPolygons
  • …..

Carte interactives avec Leaflet

Exercice

Charger les contours des départements français contenus dans le répertoire exo6_dep

Calculer à partir des données communales contenues dans le fichier exo6_data.csv des taux de naissances par départements.

Joindre les deux tables et vérifier ques des données sont associés à chaque départements.

Exporter les données en geojson

Réaliser une carte coroplèthe avec ces données.

Exercice

Calculer et afficher les voronois associés aux stations velib de new-york.

Utiliser pour cela la fonction st_as_sf avec l’option coords.

Exercice

Faire une carte interactive des monuments historiques à paris

Faire une carte en symbol proportionel avec le nombre de monuments par iris

Les contours des iris sont disponnibles dans le répertoire “data/CONTOURS-IRIS/”

Les données sont disponibles dans le répertoire “data/monuments_paris.geojson”

Exercice

Faire une carte en symboles proportionels du nombre de vélos dans les stations vélib à NewYork.

Utiliser un fond de carte open street map avec la library cartography

Sources

  • Thimothé Giraud / Kim antunez